It's not so much pinging every 10 seconds as expecting a response within 10 seconds (Clive, correct me if I'm wrong).
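
To make the distinction concrete, here is a minimal sketch of the "response within the timeout" rule. It is only an illustrative model, not the Cisco SM implementation; the function and constant names are hypothetical, and the 10-second figure is the default quoted in this thread.

```python
# Illustrative model only: the names below are hypothetical, not Cisco SM code.
DEFAULT_NODE_TIMEOUT_S = 10.0  # default node-timeout quoted in the thread


def is_out_of_service(last_response_s: float, now_s: float,
                      node_timeout_s: float = DEFAULT_NODE_TIMEOUT_S) -> bool:
    """Return True only if the node has been silent for longer than the timeout.

    A busy node that still answers within the window stays in service,
    no matter how often it is queried.
    """
    return (now_s - last_response_s) > node_timeout_s


# A node that answered 4 s ago is fine; one silent for 12 s is dropped,
# unless the timeout has been raised (30 s here is an arbitrary example).
print(is_out_of_service(0.0, 4.0))
print(is_out_of_service(0.0, 12.0))
print(is_out_of_service(0.0, 12.0, node_timeout_s=30.0))
```

In this model, raising the timeout (option 2) and making the host answer sooner (option 1) both keep the silent interval under the limit, which is why only one of them should be needed.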
You only need to do 1) or 2), not both. Cisco configures 1) in the OFED binary RPMs we release at http://www.cisco.com/cgi-bin/tablebuild.pl/sfs-linux. I prefer to have the host be more responsive.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

> -----Original Message-----
> From: Koen Segers [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, May 22, 2007 3:35 PM
> To: Scott Weitzenkamp (sweitzen)
> Cc: Shirley Ma; Ami Perlmutter; [email protected]; [EMAIL PROTECTED]
> Subject: RE: [ofa-general] GPFS node loses IB-connection
>
> If I understand it right, the switch is actually polling (=pinging) the
> interfaces every 10s. This means that when the interface is handling
> other traffic, the poll can fail and the port could be considered out of
> service. My question is then: "How can the timeout be reached while
> packets are being sent/received to/from the interface?"
>
> Anyway, what timeout value would you recommend for us? And why?
>
> To recapitulate, these are the actions I'll take tomorrow:
> 1) change the MAD niceness of the servers
> 2) change the timeout on the switches
>
> Are these changes sufficient for the HCAs to keep their ports in
> PORT_ACTIVE state?
>
> Regards,
>
> Koen
>
> On Tue, 2007-05-22 at 12:59 -0700, Scott Weitzenkamp (sweitzen) wrote:
> > Yes, you can tune it. Here's an example via the switch CLI:
> >
> > SFS-7000D(config)# ib sm subnet-prefix fe:80:00:00:00:00:00:00 node-timeout <value>
> >
> > The default is 10 seconds; it can be configured up to 2000 seconds.
> > If a HCA is completely unresponsive for longer than the node-timeout
> > value, then we consider that HCA out of service.
> >
> > Scott Weitzenkamp
> > SQA and Release Manager
> > Server Virtualization Business Unit
> > Cisco Systems
> >
> > ______________________________________________________________
> > From: Shirley Ma [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, May 22, 2007 11:30 AM
> > To: [EMAIL PROTECTED]
> > Cc: Ami Perlmutter; [email protected]; [EMAIL PROTECTED]; Scott Weitzenkamp (sweitzen)
> > Subject: RE: [ofa-general] GPFS node loses IB-connection
> >
> > Koen,
> >
> > So it is most likely you hit the same bug as 229 (Scott pointed out
> > earlier). The same workaround might work for you by renicing ib_mad
> > as Scott suggested.
> >
> > I think this should be a SM query timeout tunable value in Cisco SM.
> > Am I right, Scott?
> >
> > Thanks
> > Shirley Ma
> >
> > Koen Segers <[EMAIL PROTECTED]>
> > 05/22/07 11:14 AM
> > Please respond to [EMAIL PROTECTED]
> >
> > To: Shirley Ma/Beaverton/[EMAIL PROTECTED]
> > cc: Ami Perlmutter <[EMAIL PROTECTED]>, [email protected], [EMAIL PROTECTED]
> > Subject: RE: [ofa-general] GPFS node loses IB-connection
> >
> > Hi,
> >
> > It is the Cisco SM.
> >
> > SFS-7000P> show version
> >
> > ================================================================================
> > System Version Information
> > ================================================================================
> > system-version : SFS-7000P TopspinOS 2.9.0 releng #147 10/25/2006 02:01:32
> > contact : [EMAIL PROTECTED]
> > name : SFS-7000P
> > location : 170 West Tasman Drive, San Jose, CA 95134
> > up-time : 11(d):7(h):49(m):3(s)
> > last-change : none
> > last-config-save : none
> > action : none
> > result : none
> > oper-mode : normal
> >
> > There is also a command that gives the SM version, but I can't find it
> > right now.
> >
> > On Tue, 2007-05-22 at 09:45 -0700, Shirley Ma wrote:
> > > Hello Koen,
> > >
> > > From the switch log, it looks like an SM issue to me. The node was
> > > kicked out of the membership. Which SM are you using in your fabric?
> > >
> > > Thanks
> > > Shirley Ma
> >
> > *** Disclaimer ***
> >
> > Vlaamse Radio- en Televisieomroep
> > Auguste Reyerslaan 52, 1043 Brussel
> >
> > nv van publiek recht
> > BTW BE 0244.142.664
> > RPR Brussel
> > http://www.vrt.be/disclaimer

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
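
For readers who want to try the ib_mad renicing workaround discussed in this thread, the following is a rough sketch of how it might be scripted on a Linux host. Several things here are assumptions and not taken from the thread: that the MAD kernel threads have command names beginning with `ib_mad`, that `/proc/<pid>/comm` is available, and the niceness value itself, which nobody in the thread specifies. The helper names are hypothetical.

```python
import os
from pathlib import Path
from typing import List


def find_threads(prefix: str = "ib_mad", proc_root: str = "/proc") -> List[int]:
    """Return the PIDs whose command name starts with `prefix`.

    Assumption: the MAD kernel threads show up with names like "ib_mad1".
    """
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # process exited, or comm is not readable
        if comm.startswith(prefix):
            pids.append(int(entry.name))
    return sorted(pids)


def renice(pids: List[int], niceness: int) -> None:
    """Set the nice value of each PID; lowering it normally requires root."""
    for pid in pids:
        os.setpriority(os.PRIO_PROCESS, pid, niceness)


# Example (run as root). The value -5 is a placeholder, not a recommendation
# from the thread:
#     renice(find_threads(), -5)
```

The lookup is kept separate from the priority change so that the list of matched threads can be checked before anything is modified.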
