We are not using IPoIB. We only use SDP, but IPoIB is compiled in just in case 
we need it (when SDP is not sufficient).
All interfaces are given an IPv4 address, so I guess the messages aren't 
harmful.
 
Thanks!
 
Koen

________________________________

From: Clive Hall (clivhall) [mailto:[EMAIL PROTECTED]
Sent: Thu 24/05/2007 22:37
To: Shirley Ma; SEGERS Koen
CC: [EMAIL PROTECTED]; [email protected]
Subject: RE: [ofa-general] GPFS node loses IB-connection


Those particular log messages are just informational messages.  They're logged 
when multicast groups are created (when the first group member joins) and when 
multicast groups are deleted (when the last group member leaves).
 
As Shirley said, if you're not using IPv6 anyway then those messages aren't 
harmful.  Even if you are using IPv6 it will quite possibly still be fine, 
although I don't know why hosts would be leaving/rejoining the multicast groups.
 
Clive.
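
As a rough way to see how often each group is created and deleted, the switch log lines quoted further down in this thread could be tallied with a short script. This is only a sketch: the regular expression is an assumption based on the message format shown in this thread, and `count_mc_traps` and `sample_log` are hypothetical names.

```python
import re
from collections import Counter

# Assumed pattern, derived from the Topspin SM trap lines quoted in this thread.
TRAP_RE = re.compile(r"Generate SM (CREATE|DELETE)_MC_GROUP trap for GID=(\S+)")


def count_mc_traps(lines):
    """Return a Counter keyed by (gid, action) for every multicast trap line."""
    counts = Counter()
    for line in lines:
        match = TRAP_RE.search(line)
        if match:
            action, gid = match.groups()
            counts[(gid, action)] += 1
    return counts


# Sample input copied from the thread; "xx" and "yy" are the placeholders used there.
sample_log = [
    "Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM DELETE_MC_GROUP trap for GID=xx",
    "Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM CREATE_MC_GROUP trap for GID=xx",
    "Topspin-120sc ib_sm.x[618]: %IB-6-INFO: Configuration caused by multicast membership change",
    "Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM DELETE_MC_GROUP trap for GID=yy",
]

for (gid, action), n in sorted(count_mc_traps(sample_log).items()):
    print(gid, action, n)
```

A group whose CREATE and DELETE counts keep growing together would point at a host that repeatedly leaves and rejoins that group.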
 



________________________________

        From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shirley Ma
        Sent: Thursday, May 24, 2007 11:16 AM
        To: SEGERS Koen
        Cc: [EMAIL PROTECTED]; [email protected]
        Subject: RE: [ofa-general] GPFS node loses IB-connection
        
        

        Koen,
        
        Are you using IPv6? If not, then this is not harmful. If you don't use 
it, you can simply disable loading the IPv6 module on your nodes when rebooting.
        
        Thanks
        Shirley Ma
        IBM Linux Technology Center
        15300 SW Koll Parkway
        Beaverton, OR 97006-6063
        Phone(Fax): (503) 578-7638
        
        
        From: "SEGERS Koen" <[EMAIL PROTECTED]>
        Sent by: [EMAIL PROTECTED]
        Date: 05/24/07 11:03 AM
        To: "Scott Weitzenkamp (sweitzen)" <[EMAIL PROTECTED]>, "Hal Rosenstock" <[EMAIL PROTECTED]>
        cc: [EMAIL PROTECTED], [email protected]
        Subject: RE: [ofa-general] GPFS node loses IB-connection

        After changing the switch timeout value, we never got the error again. 
Yesterday, we started a 24h stress test. This test was successful. I think we 
can conclude that the problem is fixed now.
        
        But there is a strange message in the logs of the switch: 

        Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM DELETE_MC_GROUP trap for GID=xx
        Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM CREATE_MC_GROUP trap for GID=xx
        Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM DELETE_MC_GROUP trap for GID=xx
        Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM CREATE_MC_GROUP trap for GID=xx
        Topspin-120sc ib_sm.x[618]: %IB-6-INFO: Configuration caused by multicast membership change
        Topspin-120sc ib_sm.x[618]: %IB-6-INFO: Configuration caused by multicast membership change
        Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM DELETE_MC_GROUP trap for GID=yy
        Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM CREATE_MC_GROUP trap for GID=yy
        Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM DELETE_MC_GROUP trap for GID=yy
        Topspin-120sc ib_sm.x[632]: %IB-6-INFO: Generate SM CREATE_MC_GROUP trap for GID=yy
        Topspin-120sc ib_sm.x[618]: %IB-6-INFO: Configuration caused by multicast membership change
        Topspin-120sc ib_sm.x[618]: %IB-6-INFO: Configuration caused by multicast membership change

        Here xx and yy are GIDs such as ff:12:60:1b:ff:ff:00:00:00:00:00:01:ff:05:87:d9. 
They change to different GIDs in the next group of log entries, each belonging 
to one of the IB ports of the server HCAs. 
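
The example GID above can be read as an IPoIB multicast GID. Assuming the RFC 4391 layout (0xFF, a flags/scope byte, a 16-bit signature of 0x401B for IPv4 or 0x601B for IPv6, the P_Key, then an 80-bit group ID), a small sketch like the following recovers the IP multicast group. `decode_ipoib_mgid` is a hypothetical helper name, and for IPv6 the flags nibble of the reconstructed address is assumed to be zero.

```python
import ipaddress


def decode_ipoib_mgid(mgid_text):
    """Decode a colon-separated IPoIB MGID into (family, pkey, ip_group)."""
    raw = bytes(int(part, 16) for part in mgid_text.split(":"))
    if len(raw) != 16 or raw[0] != 0xFF:
        raise ValueError("not a 16-byte multicast GID")

    scope = raw[1] & 0x0F                        # low nibble of the second byte
    signature = int.from_bytes(raw[2:4], "big")  # 0x401B = IPv4, 0x601B = IPv6
    pkey = int.from_bytes(raw[4:6], "big")

    if signature == 0x401B:
        # IPv4: only the low 28 bits of the group address are carried, so the
        # 224.0.0.0/4 prefix is restored here (the broadcast group also maps
        # into this range).
        low28 = int.from_bytes(raw[12:16], "big") & 0x0FFFFFFF
        return "IPv4", pkey, ipaddress.IPv4Address(0xE0000000 | low28)
    if signature == 0x601B:
        # IPv6: the last 80 bits of the group address are carried; the
        # ff0<scope> prefix is rebuilt with flags assumed to be zero.
        low80 = int.from_bytes(raw[6:16], "big")
        return "IPv6", pkey, ipaddress.IPv6Address(((0xFF00 | scope) << 112) | low80)
    raise ValueError("unknown IPoIB signature 0x%04x" % signature)


family, pkey, group = decode_ipoib_mgid(
    "ff:12:60:1b:ff:ff:00:00:00:00:00:01:ff:05:87:d9"
)
print(family, hex(pkey), group)  # IPv6 0xffff ff02::1:ff05:87d9
```

Under these assumptions the quoted GID maps to an IPv6 solicited-node multicast group (ff02::1:ffXX:XXXX), which would be consistent with the IPv6 module being loaded on the hosts.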

        These entries appear every few minutes (not at a regular interval). Is 
there a Cisco manual available somewhere that describes or explains these 
messages? Or can anyone explain what is happening, and whether this can harm a 
setup that doesn't use multicast? 

        Greetz 

        Koen 

        
        
________________________________

        From: Scott Weitzenkamp (sweitzen) [mailto:[EMAIL PROTECTED]
        Sent: Wed 23/05/2007 17:40
        To: SEGERS Koen; Hal Rosenstock
        CC: Clive Hall (clivhall); [EMAIL PROTECTED]; [email protected]
        Subject: RE: [ofa-general] GPFS node loses IB-connection
        

        Try 20 seconds, I'm curious if you are barely crossing the 10-second
        threshold.
        
        Scott
        
        > -----Original Message-----
        > From: SEGERS Koen [mailto:[EMAIL PROTECTED] <mailto:[EMAIL 
PROTECTED]> ]
        > Sent: Wednesday, May 23, 2007 8:39 AM
        > To: Scott Weitzenkamp (sweitzen); Hal Rosenstock
        > Cc: Clive Hall (clivhall);
        > [EMAIL PROTECTED]; [email protected]
        > Subject: RE: [ofa-general] GPFS node loses IB-connection
        >
        > What value would you recommend then?
        >
        > Koen
        >
        > -----Original Message-----
        > From: Scott Weitzenkamp (sweitzen) [mailto:[EMAIL PROTECTED] 
<mailto:[EMAIL PROTECTED]> ]
        > Sent: Wednesday, 23 May 2007 17:38
        > To: SEGERS Koen; Hal Rosenstock
        > CC: Clive Hall (clivhall); [EMAIL PROTECTED];
        > [email protected]
        > Subject: RE: [ofa-general] GPFS node loses IB-connection
        >
        > The boot time of the host doesn't matter for this timeout. While the
        > host is booting, the IB link is down anyway.
        >
        > Scott
        >
        > > -----Original Message-----
        > > From: SEGERS Koen [mailto:[EMAIL PROTECTED] <mailto:[EMAIL 
PROTECTED]> ]
        > > Sent: Wednesday, May 23, 2007 8:20 AM
        > > To: Hal Rosenstock; Scott Weitzenkamp (sweitzen)
        > > Cc: Clive Hall (clivhall);
        > > [EMAIL PROTECTED]; [email protected]
        > > Subject: RE: [ofa-general] GPFS node loses IB-connection
        > >
        > > After a whole day of stress testing with the MAD renicing turned
        > > on, we got the error once. So I think I should also raise the
        > > timeout on the switch.
        > >
        > > It takes about 2 minutes to boot the system. Do you agree that
        > > this is a good value for the timeout?
        > >
        > > Scott,
        > > Can you explain the memlock problem to me?
        > >
        > > I saw that the SLES10 bug is only an issue in MVAPICH. Since we
        > > didn't install this, the bug does not affect us. This is correct,
        > > isn't it?
        > >
        > > Greetz
        > >
        > > Koen
        > >
        > > -----Original Message-----
        > > From: Hal Rosenstock [mailto:[EMAIL PROTECTED] <mailto:[EMAIL 
PROTECTED]> ]
        > > Sent: Wednesday, 23 May 2007 16:12
        > > To: Scott Weitzenkamp (sweitzen)
        > > CC: SEGERS Koen; Clive Hall (clivhall);
        > > [EMAIL PROTECTED]; [email protected]
        > > Subject: RE: [ofa-general] GPFS node loses IB-connection
        > >
        > > On Wed, 2007-05-23 at 09:51, Scott Weitzenkamp (sweitzen) wrote:
        > > > No C code changes, just a few config file changes
        > > > (RENICE_IB_MAD=yes in openib.conf,
        > >
        > > Does the host really not respond to MAD requests for over 10
        > > seconds in some cases?
        > >
        > > -- Hal
        > >
        > > > memlock in /etc/security/limits.conf, fix /etc/hosts on
        > > > SLES10 for bug 267, etc.).
        > > >
        > > > Scott Weitzenkamp
        > > > SQA and Release Manager
        > > > Server Virtualization Business Unit
        > > > Cisco Systems
        > > > 
        > > >
        > > > > -----Original Message-----
        > > > > From: SEGERS Koen [mailto:[EMAIL PROTECTED] <mailto:[EMAIL 
PROTECTED]> ]
        > > > > Sent: Wednesday, May 23, 2007 6:48 AM
        > > > > To: Scott Weitzenkamp (sweitzen); Clive Hall (clivhall)
        > > > > Cc: Shirley Ma; Ami Perlmutter;
        > > > > [email protected];
        > > [EMAIL PROTECTED]
        > > > > Subject: RE: [ofa-general] GPFS node loses IB-connection
        > > > >
        > > > > So far, all tests seem to work.
        > > > >
        > > > > Thanks for the help!
        > > > >
        > > > > Scott,
        > > > > Are there more bug fixes that Cisco includes in its RPMs?
        > > > >
        > > > > Greetz
        > > > >
        > > > > Koen
        > > > >
        > > > > -----Original Message-----
        > > > > From: Scott Weitzenkamp (sweitzen) [mailto:[EMAIL PROTECTED] 
<mailto:[EMAIL PROTECTED]> ]
        > > > > Sent: Wednesday, 23 May 2007 0:39
        > > > > To: SEGERS Koen; Scott Weitzenkamp (sweitzen); Clive Hall
        > > > > (clivhall)
        > > > > CC: Shirley Ma; Ami Perlmutter; [email protected];
        > > > > [EMAIL PROTECTED]
        > > > > Subject: RE: [ofa-general] GPFS node loses IB-connection
        > > > >
        > > > > It's not so much pinging every 10 seconds as expecting a
        > > > > response within 10 seconds (Clive, correct me if I'm wrong).
        > > > >
        > > > > You only need to do 1) or 2), not both. Cisco configures 1)
        > > > > in the OFED binary RPMs we release at
        > > > > http://www.cisco.com/cgi-bin/tablebuild.pl/sfs-linux . I
        > > > > prefer to have the host be more responsive.
        > > > >
        > > > >
        > > > > Scott Weitzenkamp
        > > > > SQA and Release Manager
        > > > > Server Virtualization Business Unit
        > > > > Cisco Systems
        > > > > 
        > > > >
        > > > > > -----Original Message-----
        > > > > > From: Koen Segers [mailto:[EMAIL PROTECTED] <mailto:[EMAIL 
PROTECTED]> ]
        > > > > > Sent: Tuesday, May 22, 2007 3:35 PM
        > > > > > To: Scott Weitzenkamp (sweitzen)
        > > > > > Cc: Shirley Ma; Ami Perlmutter;
        > > > > > [email protected];
        > > [EMAIL PROTECTED]
        > > > > > Subject: RE: [ofa-general] GPFS node loses IB-connection
        > > > > >
        > > > > > If I understand it right, the switch is actually polling
        > > > > > (=pinging) the interfaces every 10s. This means that when the
        > > > > > interface is handling other traffic, the poll can fail and the
        > > > > > port could be considered out of service. My question is then:
        > > > > > "How can the timeout be reached while packets are being
        > > > > > sent/received to/from the interface?"
        > > > > >
        > > > > > Anyway, what timeout value would you recommend for us? And why?
        > > > > >
        > > > > > To recapitulate: these are the actions I'll take tomorrow
        > > > > > 1) change the MAD niceness of the servers
        > > > > > 2) change the timeout on the switches
        > > > > >
        > > > > > Are these changes sufficient for the HCAs to keep their ports in
        > > > > > PORT_ACTIVE state?
        > > > > >
        > > > > > Regards,
        > > > > >
        > > > > > Koen
        > > > > >
        > > > > > On Tue, 2007-05-22 at 12:59 -0700, Scott Weitzenkamp (sweitzen) wrote:
        > > > > > > Yes, you can tune it. Here's an example via the switch CLI:
        > > > > > > 
        > > > > > > SFS-7000D(config)# ib sm subnet-prefix fe:80:00:00:00:00:00:00 node-timeout <value>
        > > > > > >
        > > > > > > The default is 10 seconds, it can be configured up to 2000 seconds.
        > > > > > > If a HCA is completely unresponsive for longer than the node-timeout
        > > > > > > value, then we consider that HCA out of service.
        > > > > > > 
        > > > > > > Scott Weitzenkamp
        > > > > > > SQA and Release Manager
        > > > > > > Server Virtualization Business Unit
        > > > > > > Cisco Systems
        > > > > > > 
        > > > > > >
        > > > > > > 
        > > > > > > 
        > > > > > > ______________________________________________________________
        > > > > > > From: Shirley Ma [mailto:[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> ]
        > > > > > > Sent: Tuesday, May 22, 2007 11:30 AM
        > > > > > > To: [EMAIL PROTECTED]
        > > > > > > Cc: Ami Perlmutter; [email protected];
        > > > > > > [EMAIL PROTECTED]; Scott Weitzenkamp (sweitzen)
        > > > > > > Subject: RE: [ofa-general] GPFS node loses IB-connection
        > > > > > > 
        > > > > > > 
        > > > > > > 
        > > > > > > Koen,
        > > > > > > 
        > > > > > > So it is most likely you hit the same bug as 229 (Scott
        > > > > > > pointed out earlier). The same workaround might work for you
        > > > > > > by renicing ib_mad as Scott suggested.
        > > > > > > 
        > > > > > > I think this should be a SM query timeout tunable value in
        > > > > > > Cisco SM. Am I right, Scott?
        > > > > > > 
        > > > > > > Thanks
        > > > > > > Shirley Ma
        > > > > > > 
        > > > > > > 
        > > > > > > From: Koen Segers <[EMAIL PROTECTED]>
        > > > > > > Date: 05/22/07 11:14 AM
        > > > > > > Please respond to: [EMAIL PROTECTED]
        > > > > > > To: Shirley Ma/Beaverton/[EMAIL PROTECTED]
        > > > > > > cc: Ami Perlmutter <[EMAIL PROTECTED]>, [email protected], [EMAIL PROTECTED]
        > > > > > > Subject: RE: [ofa-general] GPFS node loses IB-connection
        > > > > > > 
        > > > > > > 
        > > > > > > Hi,
        > > > > > > 
        > > > > > > It is the Cisco SM.
        > > > > > > 
        > > > > > > SFS-7000P> show version
        > > > > > > 
        > > > > > > ================================================================================
        > > > > > > System Version Information
        > > > > > > ================================================================================
        > > > > > > system-version : SFS-7000P TopspinOS 2.9.0 releng #147 10/25/2006 02:01:32
        > > > > > > contact : [EMAIL PROTECTED]
        > > > > > > name : SFS-7000P
        > > > > > > location : 170 West Tasman Drive, San Jose, CA 95134
        > > > > > > up-time : 11(d):7(h):49(m):3(s)
        > > > > > > last-change : none
        > > > > > > last-config-save : none
        > > > > > > action : none
        > > > > > > result : none
        > > > > > > oper-mode : normal
        > > > > > > 
        > > > > > > There is also a command that gives the SM version, but I can't
        > > > > > > find it right now.
        > > > > > > 
        > > > > > > On Tue, 2007-05-22 at 09:45 -0700, Shirley Ma wrote:
        > > > > > > > Hello Koen,
        > > > > > > >
        > > > > > > > From the switch log, it looks like an SM issue to me.
        > > > > > > > The node was kicked out of the membership. Which SM are
        > > > > > > > you using in your fabric?
        > > > > > > >
        > > > > > > > Thanks
        > > > > > > > Shirley Ma
        > > > > > > >
        > > > > > > *** Disclaimer ***
        > > > > > > 
        > > > > > > Vlaamse Radio- en Televisieomroep
        > > > > > > Auguste Reyerslaan 52, 1043 Brussel
        > > > > > > 
        > > > > > > nv van publiek recht
        > > > > > > BTW BE 0244.142.664
        > > > > > > RPR Brussel
        > > > > > > http://www.vrt.be/disclaimer <http://www.vrt.be/disclaimer> 
        > > > > > > 
        > > > > > > 
        > > > > > > 
        > > > > > > 
        > > > > > > 
        > > > _______________________________________________
        > > > general mailing list
        > > > [email protected]
        > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general 
<http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general> 
        > > >
        > > > To unsubscribe, please visit
        > > http://openib.org/mailman/listinfo/openib-general 
<http://openib.org/mailman/listinfo/openib-general> 
        > >


        


