Re: Errors from ibchecknet

2010-09-08 Thread Chuck Hartley
We found the problem. There is an onboard mthca DDR interface and an mlx4 QDR add-in card. When the system comes up, it finds the onboard HCA first and makes that ib0. The ifconfig output shows that interface ib0 is up even though the actual IB state is down/polling. We disabled the
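
The mismatch described above (a network interface that reports "up" while the underlying IB port is down/polling) can be checked directly per port. The following is only a rough sketch, not something from the thread: it assumes the usual Linux sysfs layout, where each port under /sys/class/infiniband/<device>/ports/<n>/ has a `state` file (for example "4: ACTIVE") and a `phys_state` file (for example "2: Polling"). The function names are hypothetical.

```python
from pathlib import Path


def parse_state(text):
    """Split a sysfs state string such as '4: ACTIVE' into (code, name)."""
    code, _, name = text.strip().partition(":")
    return int(code), name.strip()


def is_usable(state_text, phys_state_text):
    """Return True only if the port is logically ACTIVE and physically LinkUp."""
    _, state = parse_state(state_text)
    _, phys = parse_state(phys_state_text)
    return state.upper() == "ACTIVE" and phys.upper() == "LINKUP"


def report(sysfs_root="/sys/class/infiniband"):
    """List (device, port, usable) for every port found under sysfs_root.

    The default path is an assumption about the host; an empty list is
    returned if it does not exist.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    rows = []
    for dev in sorted(root.iterdir()):
        ports_dir = dev / "ports"
        if not ports_dir.is_dir():
            continue
        for port in sorted(ports_dir.iterdir()):
            state = (port / "state").read_text()
            phys = (port / "phys_state").read_text()
            rows.append((dev.name, port.name, is_usable(state, phys)))
    return rows


if __name__ == "__main__":
    for dev, port, ok in report():
        print(f"{dev} port {port}: {'usable' if ok else 'NOT usable'}")
```

On a host like the one described, a port on the onboard HCA would be expected to show up as not usable even though ifconfig lists ib0 as up.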

Re: Errors from ibchecknet

2010-09-07 Thread Ira Weiny
On Fri, 3 Sep 2010 14:04:37 -0700 Chuck Hartley wrote: > I checked another working fabric here and also see the same warnings, > so it looks like the warnings are not really a problem. Yes, I think you should treat those as warnings, not errors. > > Well, I assume that it is just IPoIB that isn

Re: Errors from ibchecknet

2010-09-03 Thread Chuck Hartley
I checked another working fabric here and also see the same warnings, so it looks like the warnings are not really a problem. Well, I assume that it is just IPoIB that isn't working. Since ibping works, I believe that says the IB part is ok. Of course, I can't run any of the perftools since they

Re: Errors from ibchecknet

2010-09-02 Thread Hal Rosenstock
On Thu, Sep 2, 2010 at 4:16 PM, Ira Weiny wrote: > On Thu, 2 Sep 2010 11:11:13 -0700 > Chuck Hartley wrote: > >> Sure, here is the output: >> Note this is with the switch we swapped in, so the port numbers don't >> match the ibchecknet output in the original message. >> >> # ibstat >> CA 'mlx4_0'

Re: Errors from ibchecknet

2010-09-02 Thread Ira Weiny
On Thu, 2 Sep 2010 11:11:13 -0700 Chuck Hartley wrote: > Sure, here is the output: > Note this is with the switch we swapped in, so the port numbers don't > match the ibchecknet output in the original message. > > # ibstat > CA 'mlx4_0' > CA type: MT26428 > Number of ports: 2 >

Re: Errors from ibchecknet

2010-09-02 Thread Chuck Hartley
BTW, I am able to communicate between nodes via 'ibping'. That is the only test program I found that will work without needing a host IP. On Thu, Sep 2, 2010 at 12:03 PM, Ira Weiny wrote: > On Thu, 2 Sep 2010 06:56:50 -0700 > Chuck Hartley wrote: > >> We swapped in a different switch and see

Re: Errors from ibchecknet

2010-09-02 Thread Chuck Hartley
Sure, here is the output: Note this is with the switch we swapped in, so the port numbers don't match the ibchecknet output in the original message. # ibstat CA 'mlx4_0' CA type: MT26428 Number of ports: 2 Firmware version: 2.6.0 Hardware version: a0 Node GU

Re: Errors from ibchecknet

2010-09-02 Thread Ira Weiny
On Thu, 2 Sep 2010 06:56:50 -0700 Chuck Hartley wrote: > We swapped in a different switch and see the same errors. The opensm > logfile does not show any errors: Could you run "ibstat" on the node with OpenSM running? And "iblinkinfo" on the same node? Send that output. Ira > >

Re: Errors from ibchecknet

2010-09-02 Thread Chuck Hartley
We swapped in a different switch and see the same errors. The opensm logfile does not show any errors: - OpenSM 3.3.5 Command Line Arguments: Daemon mode Log File: /var/log/opensm.log - OpenSM 3.3.5

Re: Errors from ibchecknet

2010-09-02 Thread Hal Rosenstock
On Thu, Sep 2, 2010 at 8:34 AM, Chuck Hartley wrote: > Hello, > > We installed 1.5.1 and are having problems getting the IB fabric > working. ibv_devinfo shows the HCA ports are ok and ibdiagnet reports > no errors. However, ibchecknet shows that the switch ports are not > being configured. We h

Errors from ibchecknet

2010-09-02 Thread Chuck Hartley
Hello, We installed 1.5.1 and are having problems getting the IB fabric working. ibv_devinfo shows the HCA ports are ok and ibdiagnet reports no errors. However, ibchecknet shows that the switch ports are not being configured. We have never seen this before and are at a loss as to where the prob