On Tue, 28 Apr 2009 20:27:36 -0700 Ira Weiny <[email protected]> wrote:
> Sasha, Hal,
>
> I have some hardware on which the following query does not work.
>
> 18:40:54 > ./smpquery -c nodeinfo 243 0,1
> ibwarn: [22072] mad_rpc: _do_madrpc failed; dport (Lid 243 DR path slid 148; dlid 65535; 0,1)
> ./smpquery: iberror: failed: operation nodeinfo: node info query failed
>
> from the node I am running on.
>
> 20:08:46 > ibstat
> CA 'mlx4_0'
>     CA type: MT25418
>     Number of ports: 2
>     Firmware version: 2.6.0
>     Hardware version: a0
>     Node GUID: 0x0002c9020025feb4
>     System image GUID: 0x0002c9020025feb7
>     Port 1:
>         State: Active
>         Physical state: LinkUp
>         Rate: 10
>         Base lid: 148
>         LMC: 2
>         SM lid: 148
>         Capability mask: 0x0251086a
>         Port GUID: 0x0002c9020025feb5
> [snip]
>
> 19:12:10 > hostname
> hype137
>
>
> A query on the LID alone returns this.
>
> 18:41:20 > ./smpquery nodeinfo 243
> # Node info: Lid 243
> [snip]
> NodeType:........................Switch
> NumPorts:........................24
> SystemGuid:......................0x0008f10400400e69
> Guid:............................0x0008f10400400e69
> PortGuid:........................0x0008f10400400e69
> [snip]
>
> And iblinkinfo is.
>
> 18:41:26 > iblinkinfo.pl -S 0x0008f10400400e69
> Switch 0x0008f10400400e69 ISR9288 Voltaire sFB-12D:
> 243 1[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 646 10[ ] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> [snip]
>
>
> It looks like combined routing is not working at all except for this one
> query. (LID 37 is the switch which is connected to the HCA I am running
> on.)
>
> 18:53:18 > ./smpquery -c portinfo 37 0,1
> # Port info: Lid 37 DR path slid 148; dlid 65535; 0,1 port 0
> Mkey:............................0x0000000000000000
> GidPrefix:.......................0xfe80000000000000
> Lid:.............................148
> SMLid:...........................148
> [snip]
>
> All other combined routing queries I try fail. And even this one above is
> wrong. It is returning the data on port 6 not 1. Look at the output from the
> local switch.
>
> 19:12:00 > iblinkinfo.pl -R -S 0x000b8cffff004663
> Switch 0x000b8cffff004663 MT47396 Infiniscale-III Mellanox Technologies:
> 37  1[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 108 1[ ] "hype132" ( )
> 37  2[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 528 1[ ] "hype133" ( )
> 37  3[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 296 1[ ] "hype134" ( )
> 37  4[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 92 1[ ] "hype135" ( )
> 37  5[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 144 1[ ] "hype136" ( )
>
> This is what is connected to LID 148...
> 37  6[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 148 1[ ] "hype137" ( )
>
> 37  7[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 540 1[ ] "hype138" ( )
> 37  8[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 212 1[ ] "hype139" ( )
> 37  9[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 532 1[ ] "hype140" ( )
> 37 10[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 60 1[ ] "hype141" ( )
> 37 11[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 192 1[ ] "hype142" ( )
> 37 12[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 312 1[ ] "hype143" ( )
> 37 13[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 647 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 14[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 641 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 15[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 643 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 16[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 653 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 17[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 637 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 18[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 610 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 19[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 655 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 20[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 645 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 21[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 635 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 22[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 651 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 23[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 639 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
> 37 24[ ] ==( 4X 2.5 Gbps Active / LinkUp)==> 649 13[12] "ISR9288/ISR9096 Voltaire sLB-24D" ( )
>
> Any idea what is going on? These were all run with a smpquery built from the
> current master tree.
>
> On my little test system this seems to work just fine... But not on this
> system. Did some older hardware not support combined DR routing?

Actually I take this back. It seems an older version of smpquery works but not
this newer one. So I don't think this is a hardware issue. :-(

20:54:47 > ./smpquery -c nodeinfo 14 0,10
ibwarn: [21947] _do_madrpc: send failed; Invalid argument
ibwarn: [21947] mad_rpc: _do_madrpc failed; dport (Lid 14 DR path slid 4; dlid 65535; 0,10)
./smpquery: iberror: failed: operation nodeinfo: node info query failed

20:54:52 > ./smpquery -V
./smpquery BUILD VERSION: 1.5.1_76524e3_dirty Build date: Apr 28 2009 20:47:10

20:54:55 > smpquery -c nodeinfo 14 0,10
# Node info: Lid 14 DR path 0,10
BaseVers:........................1
ClassVers:.......................1
NodeType:........................Switch
NumPorts:........................24
SystemGuid:......................0x0008f10400411b19
Guid:............................0x0008f10400411b18
PortGuid:........................0x0008f10400411b18
PartCap:.........................8
DevId:...........................0x5a30
Revision:........................0x000001a1
LocalPort:.......................24
VendorId:........................0x0008f1

20:54:59 > smpquery -V
smpquery BUILD VERSION: 1.3.6 Build date: Oct 13 2008 12:20:42

Ira
