Showing posts with label Infiniband. Show all posts

INFINIBAND: How to upgrade Firmware version on Infiniband (IB) Switches?

Below are steps for upgrading Firmware version on Infiniband Switches.

1. Login to IB switch as root user and then switch to ILOM prompt by running "spsh" command.

2. Run the load command with the -source option as follows to load the Firmware version package.
-> load -source <URL to Firmware version package>
The URL can be HTTP or FTP and must be reachable from the IB Switch. Once the load command is run, the Firmware upgrade for the Switch will begin.

Wait for firmware upgrade to complete. Switch will reboot as part of Firmware upgrade.

3. Log back in to the switch and run the below commands to verify that the upgrade succeeded.
version
fwverify
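
The URL rule in step 2 can be checked before typing the command at the ILOM prompt. Below is a minimal Python sketch, assuming only what the post states (the URL must be HTTP or FTP). The function name build_load_command and the example host and file name are hypothetical, and the sketch only builds the command text; it does not contact the switch.

```python
from urllib.parse import urlparse

# Schemes named in the post; anything else is rejected by this sketch.
ALLOWED_SCHEMES = {"http", "ftp"}


def build_load_command(url):
    """Return the ILOM 'load -source' line for a firmware package URL.

    Hypothetical helper: it validates the URL shape and formats the
    command string, nothing more.
    """
    parts = urlparse(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES or not parts.netloc:
        raise ValueError(f"expected an http:// or ftp:// URL, got {url!r}")
    return f"load -source {url}"


# Example with a made-up host and package name:
print(build_load_command("http://fileserver.example.com/fw/package.pkg"))
```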

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

EXALOGIC: INFINIBAND: Command to Check Symbol Errors on IB Ports on Compute Nodes (How To Doc)

Below command can be used on Exalogic Compute Nodes to check for Symbol Errors on their IB Ports.
cat /sys/class/infiniband/mlx4_0/ports/*/counters/symbol_error
Below is example snippet of above command. In this case there are zero symbol errors on both the IB ports.
# cat /sys/class/infiniband/mlx4_0/ports/*/counters/symbol_error 
0 
0 
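
The same check can be scripted so that each value is labelled with its port. Below is a minimal Python sketch, assuming the sysfs layout shown in the command above (ports/<port>/counters/symbol_error under the device directory). The function name read_symbol_errors is hypothetical.

```python
from pathlib import Path


def read_symbol_errors(device_dir):
    """Return {port: symbol_error count} for one HCA directory.

    device_dir is expected to look like /sys/class/infiniband/mlx4_0,
    the device used in the post; other device names are an assumption.
    """
    counts = {}
    for counter in sorted(Path(device_dir).glob("ports/*/counters/symbol_error")):
        port = counter.parts[-3]  # .../ports/<port>/counters/symbol_error
        counts[port] = int(counter.read_text().strip())
    return counts


# Typical use on a Compute Node:
#   for port, count in read_symbol_errors("/sys/class/infiniband/mlx4_0").items():
#       print(f"port {port}: symbol_error={count}")
```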

Products to which Article Applies
Exalogic racks using Infiniband Switches
Article Author: Tarun Boyella

EXALOGIC: INFINIBAND: "perfquery" command to check Symbol Errors and Other details of Compute Nodes Ports (How To Doc)

Following perfquery command can be executed on Compute node to check the Symbol Errors and Other details of Compute Nodes Ports.
perfquery <Baselid> <Port>
We can get Base lid and Port details from "ibstat" command of the Compute Node. Below is example snippet of "ibstat" command.
CA 'mlx4_0' 
CA type: MT26428 
Number of ports: 2 
Firmware version: 2.11.2010 
Hardware version: b0 
Node GUID: XXXXXXXXXXXXX 
System image GUID: XXXXXXXXXXXXX 

Port 1: 
State: Active 
Physical state: LinkUp 
Rate: 40 
Base lid: 7 
LMC: 0 
SM lid: 41 
Capability mask: 0x02510868 
Port GUID: XXXXXXXXXXXXX 
Link layer: IB 

Port 2: 
State: Active 
Physical state: LinkUp 
Rate: 40 
Base lid: 9 
LMC: 0 
SM lid: 41 
Capability mask: 0x02510868 
Port GUID: XXXXXXXXXXXXX 
Link layer: IB 
From above output we can see the Base lid for Ports 1 and 2 of the Compute Node. Now if we want to run perfquery on Port 1, which has Base lid 7, the command will look as follows:

perfquery 7 1
Below is example snippet of above command. If SymbolErrorCounter keeps increasing between successive runs of the command, it indicates that there is an issue with the Compute Node IB Port (bad cable, loose cable or other issues).
 # Port counters: Lid 7 port 1 (CapMask: 0x1400) 
PortSelect:......................1 
CounterSelect:...................0x0000 
SymbolErrorCounter:..............0 
LinkErrorRecoveryCounter:........0 
LinkDownedCounter:...............0 
PortRcvErrors:...................0 
PortRcvRemotePhysicalErrors:.....0 
PortRcvSwitchRelayErrors:........0 
PortXmitDiscards:................2 
PortXmitConstraintErrors:........0 
PortRcvConstraintErrors:.........0 
CounterSelect2:..................0x00 
LocalLinkIntegrityErrors:........0 
ExcessiveBufferOverrunErrors:....0 
VL15Dropped:.....................0 
PortXmitData:....................4294967295 
PortRcvData:.....................4294967295 
PortXmitPkts:....................4294967295 
PortRcvPkts:.....................1088593887 
PortXmitWait:....................65418428 
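
The perfquery output above can be turned into numbers for scripted checks. Below is a minimal Python sketch, assuming the "Name:....value" layout shown in the snippet. The names parse_perfquery, nonzero and pegged are hypothetical. The sample also shows several counters at 4294967295, which is the largest 32-bit value (2^32 - 1); the sketch lists such counters separately, on the assumption that a value at that ceiling has stopped counting and should not be read as an exact total.

```python
import re

# One counter per line: a name, a colon, a run of dots, then the value.
LINE = re.compile(r"^(\w+):\.+(\S+)$")
MAX_32BIT = 0xFFFFFFFF  # 4294967295


def parse_perfquery(text):
    """Return {counter name: int value} from perfquery text output."""
    counters = {}
    for raw in text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # header or blank line
        value = m.group(2)
        base = 16 if value.lower().startswith("0x") else 10
        counters[m.group(1)] = int(value, base)
    return counters


def nonzero(counters, names):
    """Return the subset of the named counters that are above zero."""
    return {n: counters[n] for n in names if counters.get(n, 0) > 0}


def pegged(counters):
    """Return counter names sitting at the 32-bit maximum."""
    return [n for n, v in counters.items() if v == MAX_32BIT]


SAMPLE = """\
# Port counters: Lid 7 port 1 (CapMask: 0x1400)
PortSelect:......................1
CounterSelect:...................0x0000
SymbolErrorCounter:..............0
PortXmitDiscards:................2
PortXmitData:....................4294967295
PortRcvPkts:.....................1088593887
"""

counters = parse_perfquery(SAMPLE)
print(nonzero(counters, ["SymbolErrorCounter", "PortXmitDiscards"]))
print(pegged(counters))
```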

Products to which Article Applies
Exalogic racks using Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: How to check Node, System Image and Port GUID of IB Switch? (ibstat)

"ibstat" command can be used on IB Switch to check Node, System Image and Port GUID of IB Switch.

Below is example snippet of "ibstat" command.
Switch 'is4_0' 
Switch type: MT48436 
Number of ports: 0 
Firmware version: 7.4.3002 
Hardware version: a1 
Node GUID: 0x0010e0620028c0a0 
System image GUID: 0x0010e0620028c0a3 
Port 0: 
State: Active 
Physical state: LinkUp 
Rate: 40 
Base lid: 12 
LMC: 0 
SM lid: 27 
Capability mask: 0x4250084a 
Port GUID: 0x0010e0620028c0a0 
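
The GUIDs can also be pulled out of the ibstat text programmatically. Below is a minimal Python sketch, assuming the "Key: value" layout shown in the snippet above, with each port section introduced by a "Port N:" line. The function name parse_ibstat is hypothetical.

```python
def parse_ibstat(text):
    """Split ibstat text into (device fields, {port number: port fields})."""
    device = {}
    ports = {}
    current = device
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # e.g. the "Switch 'is4_0'" heading or a blank line
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key.startswith("Port ") and not value:
            # "Port 0:" opens a new port section
            current = ports.setdefault(int(key.split()[1]), {})
        elif value:
            current[key] = value
    return device, ports


SAMPLE = """\
Switch 'is4_0'
Switch type: MT48436
Number of ports: 0
Firmware version: 7.4.3002
Hardware version: a1
Node GUID: 0x0010e0620028c0a0
System image GUID: 0x0010e0620028c0a3
Port 0:
State: Active
Physical state: LinkUp
Rate: 40
Base lid: 12
LMC: 0
SM lid: 27
Capability mask: 0x4250084a
Port GUID: 0x0010e0620028c0a0
"""

device, ports = parse_ibstat(SAMPLE)
print(device["Node GUID"], device["System image GUID"], ports[0]["Port GUID"])
```

The same function can be applied to the Compute Node ibstat output, where the "Base lid" of each port is the value passed to perfquery.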

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

EXALOGIC: INFINIBAND: How to correlate which Physical Connector Ports of IB Switch are connected to which Compute Nodes?

On IB Switches typically Switch ports will have physical name for connector ports which is different than the Switch Port numbers seen in "ibnetdiscover" command. Names for Switch Ports will be 1A, 1B ... 14A, 14B ... etc. These names can be mapped to Switch Port numbers by looking at "listlinkup" command output.

For e.g. if you want to track the Switch connector port 13A below is how we can do it.

Run "listlinkup" command to see which Switch Port the Switch Connector port 13A maps to. For e.g. the below line from listlinkup example output shows that connector port 13A corresponds to Switch Port 9.
 Connector 13A Present <-> Switch Port 9 up (Enabled) 
Now run "ibnetdiscover" command. In the below example output, the number in square brackets after the Switch GUID is the Switch Port, so Switch Port 9 of both Switches is connected to Compute node 6 (el01cn06).

 Ca 2 "H-XXXXXXXXX" # "el01cn06 EL-C  XX.XX.XX.XX HCA-1" 
[2](ZZZZZZZZZZZZZZ) "S-00XXXXXXXXXX0a0"[9] # lid 25 lmc 0 "SUN IB QDR GW switch el01gw02 XX.XX.XX.XX leaf:2" lid 11 4xQDR 
[1](ZZZZZZZZZZZZZZ) "S-00XXXXXXXXXX0a0"[9] # lid 24 lmc 0 "SUN IB QDR GW switch el01gw01 XX.XX.XX.XX leaf:1" lid 41 4xQDR  
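
The first lookup, from connector name to Switch Port number, can be sketched in Python. This is a minimal sketch that only understands listlinkup lines of the exact form quoted above ("Connector 13A Present <-> Switch Port 9 up (Enabled)"); any other line shapes in the real output are simply skipped. The function name connector_map is hypothetical.

```python
import re

# Matches lines such as:
#   Connector 13A Present <-> Switch Port 9 up (Enabled)
LINK = re.compile(
    r"Connector\s+(\S+)\s+Present\s+<->\s+Switch Port\s+(\d+)\s+(\w+)"
)


def connector_map(listlinkup_text):
    """Return {connector name: (switch port number, link state)}."""
    return {
        m.group(1): (int(m.group(2)), m.group(3))
        for m in LINK.finditer(listlinkup_text)
    }


sample = " Connector 13A Present <-> Switch Port 9 up (Enabled) "
print(connector_map(sample))
```

The Switch Port number returned here is the value to look for in square brackets in the ibnetdiscover output.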



Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: "getportcounters" Command to check if there are any Issues on Ethernet ETH Connector Ports of IB Switch.

Below "getportcounters" Command can be used to check if there are any Issues on Ethernet ETH Connector Ports of IB Switch.
getportcounters <Connector Port>
For example if you want to check the counters on port 0A-ETH-1 above command looks as follows:
getportcounters 0A-ETH-1
Below is sample output of above command. If we see RX CRC count increasing then it indicates a problem with connector port (either bad cable, loose cabling or issues with Switch connector port or external switch to which Switch connector port is cabled).

Port counters for connector 0A-ETH-1 Bridge-0 port Bridge-0-2 
RX bytes.........................4867164034 
RX packets.......................42669859 
RX Jumbo packets.................0 
RX unicast packets...............28356686 
RX multicast packets.............12729499 
RX broadcast packets.............1583674 
RX no buffer.....................0 
RX CRC...........................0 
RX runt..........................0 
RX errors........................0 
TX bytes.........................52228156 
TX packets.......................126667 
TX Jumbo packets.................80133903301347 
TX unicast packets...............116537 
TX multicast packets.............10105 
TX broadcast packets.............25 
TX errors........................0 
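
Since the sign of trouble is a counter that increases, two outputs taken some time apart have to be compared. Below is a minimal Python sketch of that comparison, assuming the "name.....value" layout shown above. The names parse_counters and rising are hypothetical, and the "later" sample values are made up for illustration.

```python
import re

# A counter name (may contain spaces), a run of dots, then a number.
ROW = re.compile(r"^(.+?)\.{2,}(\d+)$")


def parse_counters(text):
    """Return {counter name: int} from getportcounters text output."""
    counters = {}
    for raw in text.splitlines():
        m = ROW.match(raw.strip())
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


def rising(before, after, names=("RX CRC", "RX errors", "TX errors")):
    """Return {name: increase} for watched counters that went up."""
    return {
        n: after.get(n, 0) - before.get(n, 0)
        for n in names
        if after.get(n, 0) > before.get(n, 0)
    }


earlier = parse_counters("RX CRC...........................0\nRX errors........................0")
later = parse_counters("RX CRC...........................12\nRX errors........................0")
print(rising(earlier, later))
```

The same two functions can be used on the Switch Port form of the output by passing names such as ("SymbolErrors", "LinkDowned").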

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: "getportcounters" Command to check if there are any Symbol Errors, Link Issues on Switch Ports of IB Switch.

Below "getportcounters" Command can be used to check if there are any Symbol Errors, Link Issues on Switch Ports of IB Switch.
getportcounters <Switch Port>
For example if you want to check the counters on port 6 above command looks as follows:
getportcounters 6
Below is sample output of above command. If we see SymbolErrors increasing then that indicates there is an issue with the Switch port (loose cable, bad cable, bad switch port or something else).
SymbolErrors.....................0 
LinkRecovers.....................0 
LinkDowned.......................0 
RcvErrors........................0 
RcvRemotePhysErrors..............0 
RcvSwRelayErrors.................297 
XmtDiscards......................6 
XmtConstraintErrors..............0 
RcvConstraintErrors..............0 
LinkIntegrityErrors..............0 
ExcBufOverrunErrors..............0 
VL15Dropped......................0 
XmtData..........................4294967295 
RcvData..........................4294967295 
XmtPkts..........................717430186 
RcvPkts..........................634187063 
XmtWait..........................1338332 

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: "getportstatus" command to check status and details (MTU, Speed) of Ethernet ETH connector port on IB Switches (How To Doc)

Following getportstatus command can be used to check status and details (MTU, Speed) of Ethernet ETH connector ports on IB Switches.
getportstatus <Connector Port>
If you want to check status of 0A-ETH-1, command looks like follows:
getportstatus 0A-ETH-1
Below is example output of above command.
Adminstate.......................Enabled 
State............................Up 
Link state.......................Up 
Protocol.........................Ethernet 
Link Mode........................XFI 
Speed............................10Gb/s 
MTU..............................9600 
Tx pause.........................Global 
Rx pause.........................Global 

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: "showgwports" command to check status of Ethernet ETH connector ports on IB Switches (How To Doc)

"showgwports" command can be used to check status of Ethernet ETH connector ports on IB Switches.

Below is example output of showgwports command for reference.

In below output 0A-ETH-1 is Enabled, which means it is meant to be connected, but its link state is Down (likely due to a port issue on the upstream data center switch). 0A-ETH-3 is Disabled, which means that it is not connected to an upstream switch at all. 0A-ETH-2 is Enabled and its link state is Up, which means that it is operational.
Port      Bridge      Adminstate Link  State       MTU  TxPause  RxPause 
------------------------------------------------------------------------- 
0A-ETH-1  Bridge-0-2  Enabled    Down  Reset       9600 Global   Global 
0A-ETH-2  Bridge-0-2  Enabled    Up    Up          9600 Global   Global 
0A-ETH-3  Bridge-0-1  Disabled   Down  Reset       9600 Global   Global 
0A-ETH-4  Bridge-0-1  Enabled    Down  Reset       9600 Global   Global 
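
The reading of the table described above can be sketched in Python. This is a minimal sketch, assuming the eight whitespace-separated columns shown in the example and connector names shaped like 0A-ETH-1. The names parse_showgwports and classify are hypothetical, and the three labels are just the three cases from the explanation above.

```python
import re

FIELDS = ("port", "bridge", "adminstate", "link", "state", "mtu", "txpause", "rxpause")
PORT_NAME = re.compile(r"^\d+[AB]-ETH-\d+$")  # assumed connector name shape


def parse_showgwports(text):
    """Return one dict per data row of showgwports output."""
    rows = []
    for line in text.splitlines():
        cells = line.split()
        if len(cells) == len(FIELDS) and PORT_NAME.match(cells[0]):
            rows.append(dict(zip(FIELDS, cells)))
    return rows


def classify(row):
    """Label a row using the three cases described in the post."""
    if row["adminstate"] != "Enabled":
        return "disabled"
    if row["link"] == "Up":
        return "operational"
    return "enabled, link down"


SAMPLE = """\
Port      Bridge      Adminstate Link  State       MTU  TxPause  RxPause
-------------------------------------------------------------------------
0A-ETH-1  Bridge-0-2  Enabled    Down  Reset       9600 Global   Global
0A-ETH-2  Bridge-0-2  Enabled    Up    Up          9600 Global   Global
0A-ETH-3  Bridge-0-1  Disabled   Down  Reset       9600 Global   Global
0A-ETH-4  Bridge-0-1  Enabled    Down  Reset       9600 Global   Global
"""

for row in parse_showgwports(SAMPLE):
    print(row["port"], classify(row))
```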

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: How to enable Ethernet ETH Connector Ports on IB Switch?

Below command can be used to enable Ethernet ETH Connector Ports on IB Switch.
enablegwport <Connector Port>
For example if connector port 0A-ETH-3 has to be enabled, command will look as follows:
enablegwport 0A-ETH-3

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: How to disable Ethernet ETH Connector Ports on IB Switch?

Below command can be used to disable Ethernet ETH Connector Ports on IB Switch.
disablegwport <Connector Port>
For example if connector port 0A-ETH-3 has to be disabled, command will look as follows:
disablegwport 0A-ETH-3

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: How to Move (Failover) SM Master (Subnet Manager Master) From One IB Switch to Other IB Switch?

Follow below steps for doing SM Master failover.

1. Verify that SM is enabled with correct SM settings on two or more IB Gateway Leaf Switches (depending on the configuration you have). For validating if SM is running following command can be executed on all the IB Switches.
service opensmd status
For validating the current SM settings following command can be executed.

setsmpriority list
2. Identify the Switch running the SM Master. For this run below command on any of the IB Switches.
getmaster
3. Login to IB Switch running SM Master (identified in above step 2) and disable SM using below command.
disablesm
After running above command SM Master will move to one of the other Switches which has SM enabled.

4. Verify that the SM master has moved to different Switch by running "getmaster" command on one of the IB Switches.

5. Enable the SM again on the Switch which was running the SM master before (identified in above step 2). For this run below command.
enablesm

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: smpartition command to abort the changes done to partitions on IB Switch (How To Doc)

Below smpartition abort command can be used to abort any changes made to the partition.
smpartition abort
For making changes to partitions we first run "smpartition start", make changes to partitions using smpartition commands and then finally commit the changes using "smpartition commit" command. If for some reason the changes made to partitions have to be aborted, above "smpartition abort" command can be executed instead of "smpartition commit" command.


Products to which Article Applies
Infiniband Switches


Article Author: Tarun Boyella

INFINIBAND: How to create VLAN on IB Switch?

Below command can be executed to create VLAN on IB Switch.
createvlan <Connector Port> -VLAN <VLAN ID> -PKEY <Partition Key> 
For e.g. if you want to create VLAN 370 on Connector Port 0A-ETH-1 with Partition Key 8007, your command will look as follows:
createvlan 0A-ETH-1 -VLAN 370 -PKEY 0x0007  
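
In InfiniBand a Partition Key is a 16-bit value whose most significant bit is the membership bit (set for full membership, clear for limited membership), so 0x8007 and 0x0007 refer to the same partition number with a different membership bit. Below is a minimal Python sketch of that relationship. The function name split_pkey is hypothetical, and the sketch does not say which form a given switch command expects.

```python
MEMBERSHIP_BIT = 0x8000  # top bit of the 16-bit P_Key
BASE_MASK = 0x7FFF       # remaining 15 bits identify the partition


def split_pkey(pkey):
    """Return (base partition number, full_membership flag) for a P_Key."""
    if not 0 <= pkey <= 0xFFFF:
        raise ValueError("P_Key must fit in 16 bits")
    return pkey & BASE_MASK, bool(pkey & MEMBERSHIP_BIT)


for value in (0x8007, 0x0007):
    base, full = split_pkey(value)
    print(f"0x{value:04x}: base=0x{base:04x} full_member={full}")
```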
Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: How to delete VLAN on IB Switch?

Below command can be executed to delete VLAN on IB Switch.
deletevlan <Connector Port on which VLAN exists> -vlan <VLAN ID>
For e.g. if you want to delete VLAN 370 which is on Connector Port 0A-ETH-1, your command will look as follows:
deletevlan 0A-ETH-1 -vlan 370
Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

EXALOGIC: INFINIBAND: How to check if all Exalogic Compute Nodes are connected correctly to both the IB Switches

For checking if all Exalogic Compute Nodes are connected correctly to both the IB Switches, run below command on one of the IB Switches in the Fabric.
ibnetdiscover | grep -A 2 "Ca" | egrep -i -A 2 "<node 1>|<node 2>|.....|<node N>"
For e.g. if you have four compute nodes in the rack with names el01cn01, el01cn02, el01cn03 and el01cn04 your command will look as follows:
ibnetdiscover | grep -A 2 "Ca" | egrep -i -A 2 "el01cn01|el01cn02|el01cn03|el01cn04" 
Output will be in below format
Ca 2 "H-00XXXXXXX" # "<CN/SN Hostname> EL-C <CN/SN IP> HCA-1" 
[1](XXXXXX) "S-00XXXXXX"[<Switch Port>] # lid <lid #> lmc 0 "SUN IB QDR GW switch <Switch 1> <Switch 1 IP>" lid <lid #> 4xQDR 
[2](XXXXXX) "S-00XXXXXX"[<Switch Port>] # lid <lid #> lmc 0 "SUN IB QDR GW switch <Switch 2> <Switch 2 IP>" lid <lid #> 4xQDR  
Below is sample output which shows el01cn01 connected to two IB Switches el01gw01 & el01gw02 correctly.
Ca 2 "H-00XXXXXX128bf08" # "el01cn01 EL-C XX.XX.XX.XX HCA-1" 
[1](XXXXXX128bf09) "S-00XXXXXXXXbac0a0"[6] # lid 5 lmc 0 "SUN IB QDR GW switch el01gw01 XX.XX.XX.XX" lid 41 4xQDR 
[2](XXXXXX128bf0a) "S-00XXXXXXX728c0a0"[6] # lid 6 lmc 0 "SUN IB QDR GW switch el01gw02 XX.XX.XX.XX" lid 11 4xQDR   
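
For a rack with many nodes, the check can be automated instead of reading the output by eye. Below is a minimal Python sketch, assuming the ibnetdiscover layout shown above: a "Ca" line whose comment starts with the hostname, followed by one line per HCA port that names the Switch in quotes. The names node_links and badly_cabled are hypothetical, and el01cn02 in the example is deliberately absent from the sample text.

```python
import re

# Ca 2 "H-..." # "el01cn01 EL-C <IP> HCA-1"
CA = re.compile(r'^Ca\s+\d+\s+"[^"]*"\s+#\s+"(\S+)')
# [1](<port guid>) "S-..."[6] # lid 5 lmc 0 "SUN IB QDR GW switch el01gw01 <IP>" ...
LINK = re.compile(r'^\[(\d+)\]\([^)]*\)\s+"S-[^"]*"\[(\d+)\]\s+#.*?"([^"]*)"')


def node_links(text):
    """Return {hostname: [(hca port, switch port, switch description), ...]}."""
    links = {}
    host = None
    for raw in text.splitlines():
        line = raw.strip()
        ca = CA.match(line)
        if ca:
            host = ca.group(1)
            links[host] = []
            continue
        link = LINK.match(line)
        if link and host is not None:
            links[host].append((int(link.group(1)), int(link.group(2)), link.group(3)))
        elif not line:
            host = None  # a blank line ends the current Ca block
    return links


def badly_cabled(links, expected_nodes, switches_required=2):
    """Return expected nodes seen on fewer than switches_required Switches."""
    bad = []
    for node in expected_nodes:
        switches = {desc for _, _, desc in links.get(node, [])}
        if len(switches) < switches_required:
            bad.append(node)
    return bad


SAMPLE = '''\
Ca 2 "H-00XXXXXX128bf08" # "el01cn01 EL-C XX.XX.XX.XX HCA-1"
[1](XXXXXX128bf09) "S-00XXXXXXXXbac0a0"[6] # lid 5 lmc 0 "SUN IB QDR GW switch el01gw01 XX.XX.XX.XX" lid 41 4xQDR
[2](XXXXXX128bf0a) "S-00XXXXXXX728c0a0"[6] # lid 6 lmc 0 "SUN IB QDR GW switch el01gw02 XX.XX.XX.XX" lid 11 4xQDR
'''

print(badly_cabled(node_links(SAMPLE), ["el01cn01", "el01cn02"]))
```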

Products to which Article Applies
Exalogic racks using Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: How to Enable Particular Switch Port or ETH Connector Port on IB Switch?

Below command can be used to enable particular Switch Port or ETH Connector Port on IB Switch.

enableswitchport <Switch Port or ETH connector Port>
For e.g. if you want to enable Port 6 of IB Switch, command will look as follows
enableswitchport 6
For e.g. if you want to enable Connector Port 0A-ETH-1 of IB Switch, command will look as follows
enableswitchport 0A-ETH-1


Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: How to Get status of Particular Switch Port or ETH Connector Port on IB Switch?

Below command can be used to get status of particular Switch Port or ETH Connector Port on IB Switch.

getportstatus <Switch Port or ETH connector Port>
For e.g. if you want to get status of Port 6 of IB Switch, command will look as follows
getportstatus 6
For e.g. if you want to get status of Connector Port 0A-ETH-1 of IB Switch, command will look as follows
getportstatus 0A-ETH-1


Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: How to disable Particular Switch Port or ETH Connector Port on IB Switch?

Below command can be used to disable particular Switch Port or ETH Connector Port on IB Switch.
disableswitchport <Switch Port or ETH connector Port>
For e.g. if you want to disable Port 6 of IB Switch, command will look as follows
disableswitchport 6
For e.g. if you want to disable Connector Port 0A-ETH-1 of IB Switch, command will look as follows
disableswitchport 0A-ETH-1


Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella

INFINIBAND: How to check SM Priorities on all the IB Switches in the Fabric?

Following are steps to check SM priorities on all the IB Switches in Fabric.

1. Run "ibswitches" command on any one of the IB Switches to list the Switch names and their GUIDs.

2. Run the "ibdiagnet" command on any of the IB Switches. This command will run for a few seconds. Wait for command execution to complete.

3. On IB Switch where you ran ibdiagnet command, cat below file.
cat /tmp/ibdiagnet.sm
/tmp/ibdiagnet.sm file shows the SM state and priority for each Switch GUID. You can map the GUID to the Switch name from the output of "ibswitches" in above step 1. Below is example snippet of /tmp/ibdiagnet.sm file.
-I--------------------------------------------------- 
-I- Summary Fabric SM-state-priority 
-I--------------------------------------------------- 
  SM - master 
    The Local Device : Port=0 lid=0x0029 guid=0xZZZZZZbea4bac0a0 dev=48438  
    priority:14 
  SM - standby 
    Port=27 lid=0x000b guid=0xXXXXXX620728c0a0 dev=48438  priority:5 
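
The summary can be reduced to a list of (state, GUID, priority) entries for easier comparison with the "ibswitches" output. Below is a minimal Python sketch, assuming the layout shown in the snippet above, where "priority:" may appear either on the same line as "guid=" or on the line after it. The function name parse_sm_summary is hypothetical.

```python
import re


def parse_sm_summary(text):
    """Return a list of {'state', 'guid', 'priority'} dicts from ibdiagnet.sm text."""
    entries = []
    state = None
    guid = None
    for line in text.splitlines():
        heading = re.search(r"SM - (\w+)", line)
        if heading:
            state = heading.group(1)  # e.g. "master" or "standby"
            guid = None
            continue
        g = re.search(r"guid=(\S+)", line)
        if g:
            guid = g.group(1)
        p = re.search(r"priority:(\d+)", line)
        if p and guid is not None:
            entries.append({"state": state, "guid": guid, "priority": int(p.group(1))})
            guid = None
    return entries


SAMPLE = """\
-I---------------------------------------------------
-I- Summary Fabric SM-state-priority
-I---------------------------------------------------
  SM - master
    The Local Device : Port=0 lid=0x0029 guid=0xZZZZZZbea4bac0a0 dev=48438
    priority:14
  SM - standby
    Port=27 lid=0x000b guid=0xXXXXXX620728c0a0 dev=48438  priority:5
"""

for entry in parse_sm_summary(SAMPLE):
    print(entry["state"], entry["guid"], entry["priority"])
```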

Products to which Article Applies
Infiniband Switches

Article Author: Tarun Boyella