No C code changes, just a few config file changes (RENICE_IB_MAD=yes in openib.conf, memlock in /etc/security/limits.conf, fix /etc/hosts on SLES10 for bug 267, etc.).
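As a rough sketch of what the first two of those changes might look like (the /etc/infiniband/openib.conf path and the "unlimited" memlock values are assumptions based on a typical OFED install, not a listing of what the Cisco RPMs actually write):

```
# /etc/infiniband/openib.conf  (path assumed)
# Renice the ib_mad kernel thread so MADs are answered promptly under load
RENICE_IB_MAD=yes

# /etc/security/limits.conf
# Illustrative values only: allow all users to lock memory without a cap
*    soft    memlock    unlimited
*    hard    memlock    unlimited
```

The /etc/hosts fix for bug 267 is left out here because its exact form depends on the host.
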
Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

> -----Original Message-----
> From: SEGERS Koen [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, May 23, 2007 6:48 AM
> To: Scott Weitzenkamp (sweitzen); Clive Hall (clivhall)
> Cc: Shirley Ma; Ami Perlmutter; [email protected]; [EMAIL PROTECTED]
> Subject: RE: [ofa-general] GPFS node loses IB-connection
>
> So far, all tests seem to work.
>
> Thanks for the help!
>
> Scott,
> Are there more bugfixes that Cisco includes in its RPMs?
>
> Greetz
>
> Koen
>
> -----Original Message-----
> From: Scott Weitzenkamp (sweitzen) [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, May 23, 2007 0:39
> To: SEGERS Koen; Scott Weitzenkamp (sweitzen); Clive Hall (clivhall)
> Cc: Shirley Ma; Ami Perlmutter; [email protected]; [EMAIL PROTECTED]
> Subject: RE: [ofa-general] GPFS node loses IB-connection
>
> It's not so much pinging every 10 seconds as expecting a response within
> 10 seconds (Clive, correct me if I'm wrong).
>
> You only need to do 1) or 2), not both. Cisco configures 1) in the OFED
> binary RPMs we release at
> http://www.cisco.com/cgi-bin/tablebuild.pl/sfs-linux. I prefer to have
> the host be more responsive.
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>
> > -----Original Message-----
> > From: Koen Segers [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, May 22, 2007 3:35 PM
> > To: Scott Weitzenkamp (sweitzen)
> > Cc: Shirley Ma; Ami Perlmutter; [email protected]; [EMAIL PROTECTED]
> > Subject: RE: [ofa-general] GPFS node loses IB-connection
> >
> > If I understand it right, the switch is actually polling (= pinging) the
> > interfaces every 10 s. This means that when the interface is handling
> > other traffic, the poll can fail and the port could be considered out of
> > service.
> > My question is then: "How can the timeout be reached while packets are
> > being sent/received to/from the interface?"
> >
> > Anyway, what timeout value would you recommend for us? And why?
> >
> > To recapitulate, these are the actions I'll take tomorrow:
> > 1) change the MAD niceness of the servers
> > 2) change the timeout on the switches
> >
> > Are these changes sufficient for the HCAs to keep their ports in
> > PORT_ACTIVE state?
> >
> > Regards,
> >
> > Koen
> >
> > On Tue, 2007-05-22 at 12:59 -0700, Scott Weitzenkamp (sweitzen) wrote:
> > > Yes, you can tune it. Here's an example via the switch CLI:
> > >
> > > SFS-7000D(config)# ib sm subnet-prefix fe:80:00:00:00:00:00:00 node-timeout <value>
> > >
> > > The default is 10 seconds; it can be configured up to 2000 seconds.
> > > If an HCA is completely unresponsive for longer than the node-timeout
> > > value, then we consider that HCA out of service.
> > >
> > > Scott Weitzenkamp
> > > SQA and Release Manager
> > > Server Virtualization Business Unit
> > > Cisco Systems
> > >
> > > ______________________________________________________________
> > > From: Shirley Ma [mailto:[EMAIL PROTECTED]
> > > Sent: Tuesday, May 22, 2007 11:30 AM
> > > To: [EMAIL PROTECTED]
> > > Cc: Ami Perlmutter; [email protected]; [EMAIL PROTECTED]; Scott Weitzenkamp (sweitzen)
> > > Subject: RE: [ofa-general] GPFS node loses IB-connection
> > >
> > > Koen,
> > >
> > > So it is most likely that you hit the same bug as 229 (which Scott
> > > pointed out earlier). The same workaround, renicing ib_mad as Scott
> > > suggested, might work for you.
> > >
> > > I think this should be an SM query timeout tunable value in the
> > > Cisco SM. Am I right, Scott?
> > >
> > > Thanks
> > > Shirley Ma
> > >
> > > Koen Segers <[EMAIL PROTECTED]>
> > > 05/22/07 11:14 AM
> > > Please respond to [EMAIL PROTECTED]
> > >
> > > To: Shirley Ma/Beaverton/[EMAIL PROTECTED]
> > > cc: Ami Perlmutter <[EMAIL PROTECTED]>, [email protected], [EMAIL PROTECTED]
> > > Subject: RE: [ofa-general] GPFS node loses IB-connection
> > >
> > > Hi,
> > >
> > > It is the Cisco SM.
> > >
> > > SFS-7000P> show version
> > >
> > > ================================================================================
> > > System Version Information
> > > ================================================================================
> > > system-version   : SFS-7000P TopspinOS 2.9.0 releng #147 10/25/2006 02:01:32
> > > contact          : [EMAIL PROTECTED]
> > > name             : SFS-7000P
> > > location         : 170 West Tasman Drive, San Jose, CA 95134
> > > up-time          : 11(d):7(h):49(m):3(s)
> > > last-change      : none
> > > last-config-save : none
> > > action           : none
> > > result           : none
> > > oper-mode        : normal
> > >
> > > There is also a command that gives the SM version, but I can't find it
> > > right now.
> > >
> > > On Tue, 2007-05-22 at 09:45 -0700, Shirley Ma wrote:
> > > > Hello Koen,
> > > >
> > > > From the switch log, it looks like an SM issue to me. The node was
> > > > kicked out of the membership. Which SM are you using in your
> > > > fabric?
> > > >
> > > > Thanks
> > > > Shirley Ma
> > >
> > > *** Disclaimer ***
> > >
> > > Vlaamse Radio- en Televisieomroep
> > > Auguste Reyerslaan 52, 1043 Brussel
> > >
> > > nv van publiek recht
> > > BTW BE 0244.142.664
> > > RPR Brussel
> > > http://www.vrt.be/disclaimer

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
