Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-06-06 Thread YongHyeon PYUN
On Mon, Jun 03, 2013 at 09:25:33AM +0300, Daniel Braniss wrote:
> > On Fri, May 31, 2013 at 08:24:47AM +0300, Daniel Braniss wrote:
> > > > On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote:
> > > > > > --/04w6evG8XlLl3ft
> > > > > > Content-Type: text/x-diff; charset=us-ascii
> > > > > > Content-Disposition: attachment; filename="bge.media_sts.diff"
> > > > > > 
> > > > > > Index: sys/dev/bge/if_bge.c
> > > > > > ===
> > > > > > --- sys/dev/bge/if_bge.c(revision 251021)
> > > > > > +++ sys/dev/bge/if_bge.c(working copy)
> > > > > > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct 
> > > > > > ifmediar
> > > > > >  
> > > > > > BGE_LOCK(sc);
> > > > > >  
> > > > > > +   if ((ifp->if_flags & IFF_UP) == 0) {
> > > > > > +   BGE_UNLOCK(sc);
> > > > > > +   return;
> > > > > > +   }
> > > > > > if (sc->bge_flags & BGE_FLAG_TBI) {
> > > > > > ifmr->ifm_status = IFM_AVALID;
> > > > > > ifmr->ifm_active = IFM_ETHER;
> > > > > > 
> > > > > > --/04w6evG8XlLl3ft--
> > > > > after 18hs, the logs are empty!
> > > > > it seems the patch fixes the problem.
> > > > > 
> > > > > now maybe it's time to hunt for who is randomly calling for 
> > > > > bge_ifmedia_sts
> > > > > ...
> > > > 
> > > > It could be any number of daemons that query interface state such as an
> > > > SNMP server, ladvd, etc.
> > > > 
> > > > If you wanted help you could modify the patch so that it does something 
> > > > like 
> > > > this:
> > > > 
> > > #include 
> > > > if (/* test for IFF_UP */) {
> > > > BGE_UNLOCK(sc);
> > > > if_printf(ifp, "state queried on down interface by pid 
> > > > %d (%s)",
> > > --|
> > >  add a \n
> > > > curthread->td_proc->p_pid, 
> > > > curthread->td_proc->p_comm);
> > > > return;
> > > > }
> > > > 
> > > > -- 
> > > > John Baldwin
> > > snmpd call this several times a second, (difficult to measeure since 
> > > sysolog 
> > > just says
> > >last message repeated 22 times
> > > in any case, the DOWN/UP appears once every few hours, oh well.
> > > I have now stopped the snmpd daemon, maybe there is someone else ...
> > 
> > I have no idea why snmpd wants to know media status for interfaces
> > that are put into down state. The media status resolved after
> > bringing up the interface may be different one that was seen
> > before.
> > The patch also makes dhclient think driver got a valid link
> > regardless of link establishment. I guess that wouldn't be
> > issue though. I'll commit the patch after some more testing.
> > 
> > Thanks for reporting and testing!
> >
> no problem!
> 
> after more than 3 days, there were no more 'reports', so snmpd was the 
> culprit.
> the snmpd we use is from ports, i'll try and see waht's going on ...
> 

FYI: Committed in r251481.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-06-02 Thread Daniel Braniss
> On Fri, May 31, 2013 at 08:24:47AM +0300, Daniel Braniss wrote:
> > > On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote:
> > > > > --/04w6evG8XlLl3ft
> > > > > Content-Type: text/x-diff; charset=us-ascii
> > > > > Content-Disposition: attachment; filename="bge.media_sts.diff"
> > > > > 
> > > > > Index: sys/dev/bge/if_bge.c
> > > > > ===
> > > > > --- sys/dev/bge/if_bge.c  (revision 251021)
> > > > > +++ sys/dev/bge/if_bge.c  (working copy)
> > > > > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct 
> > > > > ifmediar
> > > > >  
> > > > >   BGE_LOCK(sc);
> > > > >  
> > > > > + if ((ifp->if_flags & IFF_UP) == 0) {
> > > > > + BGE_UNLOCK(sc);
> > > > > + return;
> > > > > + }
> > > > >   if (sc->bge_flags & BGE_FLAG_TBI) {
> > > > >   ifmr->ifm_status = IFM_AVALID;
> > > > >   ifmr->ifm_active = IFM_ETHER;
> > > > > 
> > > > > --/04w6evG8XlLl3ft--
> > > > after 18hs, the logs are empty!
> > > > it seems the patch fixes the problem.
> > > > 
> > > > now maybe it's time to hunt for who is randomly calling for 
> > > > bge_ifmedia_sts
> > > > ...
> > > 
> > > It could be any number of daemons that query interface state such as an
> > > SNMP server, ladvd, etc.
> > > 
> > > If you wanted help you could modify the patch so that it does something 
> > > like 
> > > this:
> > > 
> > #include 
> > >   if (/* test for IFF_UP */) {
> > >   BGE_UNLOCK(sc);
> > >   if_printf(ifp, "state queried on down interface by pid %d (%s)",
> > --|
> >  add a \n
> > >   curthread->td_proc->p_pid, curthread->td_proc->p_comm);
> > >   return;
> > >   }
> > > 
> > > -- 
> > > John Baldwin
> > snmpd call this several times a second, (difficult to measeure since 
> > sysolog 
> > just says
> >  last message repeated 22 times
> > in any case, the DOWN/UP appears once every few hours, oh well.
> > I have now stopped the snmpd daemon, maybe there is someone else ...
> 
> I have no idea why snmpd wants to know media status for interfaces
> that are put into down state. The media status resolved after
> bringing up the interface may be different one that was seen
> before.
> The patch also makes dhclient think driver got a valid link
> regardless of link establishment. I guess that wouldn't be
> issue though. I'll commit the patch after some more testing.
> 
> Thanks for reporting and testing!
>
no problem!

after more than 3 days, there were no more 'reports', so snmpd was the culprit.
the snmpd we use is from ports, i'll try and see what's going on ...

thanks
danny

> > 
> > thanks,
> > danny
> > 
> > 




Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-30 Thread YongHyeon PYUN
On Fri, May 31, 2013 at 08:24:47AM +0300, Daniel Braniss wrote:
> > On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote:
> > > > --/04w6evG8XlLl3ft
> > > > Content-Type: text/x-diff; charset=us-ascii
> > > > Content-Disposition: attachment; filename="bge.media_sts.diff"
> > > > 
> > > > Index: sys/dev/bge/if_bge.c
> > > > ===
> > > > --- sys/dev/bge/if_bge.c(revision 251021)
> > > > +++ sys/dev/bge/if_bge.c(working copy)
> > > > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct 
> > > > ifmediar
> > > >  
> > > > BGE_LOCK(sc);
> > > >  
> > > > +   if ((ifp->if_flags & IFF_UP) == 0) {
> > > > +   BGE_UNLOCK(sc);
> > > > +   return;
> > > > +   }
> > > > if (sc->bge_flags & BGE_FLAG_TBI) {
> > > > ifmr->ifm_status = IFM_AVALID;
> > > > ifmr->ifm_active = IFM_ETHER;
> > > > 
> > > > --/04w6evG8XlLl3ft--
> > > after 18hs, the logs are empty!
> > > it seems the patch fixes the problem.
> > > 
> > > now maybe it's time to hunt for who is randomly calling for 
> > > bge_ifmedia_sts
> > > ...
> > 
> > It could be any number of daemons that query interface state such as an
> > SNMP server, ladvd, etc.
> > 
> > If you wanted help you could modify the patch so that it does something 
> > like 
> > this:
> > 
> #include 
> > if (/* test for IFF_UP */) {
> > BGE_UNLOCK(sc);
> > if_printf(ifp, "state queried on down interface by pid %d (%s)",
> --|
>  add a \n
> > curthread->td_proc->p_pid, curthread->td_proc->p_comm);
> > return;
> > }
> > 
> > -- 
> > John Baldwin
> snmpd call this several times a second, (difficult to measeure since sysolog 
> just says
>last message repeated 22 times
> in any case, the DOWN/UP appears once every few hours, oh well.
> I have now stopped the snmpd daemon, maybe there is someone else ...

I have no idea why snmpd wants to know the media status of interfaces
that have been put into the down state. The media status resolved after
bringing up the interface may be different from the one that was seen
before.
The patch also makes dhclient think the driver got a valid link
regardless of link establishment. I guess that wouldn't be an
issue though. I'll commit the patch after some more testing.

Thanks for reporting and testing!

> 
> thanks,
>   danny
> 
> 


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-30 Thread Daniel Braniss
> On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote:
> > > --/04w6evG8XlLl3ft
> > > Content-Type: text/x-diff; charset=us-ascii
> > > Content-Disposition: attachment; filename="bge.media_sts.diff"
> > > 
> > > Index: sys/dev/bge/if_bge.c
> > > ===
> > > --- sys/dev/bge/if_bge.c  (revision 251021)
> > > +++ sys/dev/bge/if_bge.c  (working copy)
> > > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar
> > >  
> > >   BGE_LOCK(sc);
> > >  
> > > + if ((ifp->if_flags & IFF_UP) == 0) {
> > > + BGE_UNLOCK(sc);
> > > + return;
> > > + }
> > >   if (sc->bge_flags & BGE_FLAG_TBI) {
> > >   ifmr->ifm_status = IFM_AVALID;
> > >   ifmr->ifm_active = IFM_ETHER;
> > > 
> > > --/04w6evG8XlLl3ft--
> > after 18hs, the logs are empty!
> > it seems the patch fixes the problem.
> > 
> > now maybe it's time to hunt for who is randomly calling for bge_ifmedia_sts
> > ...
> 
> It could be any number of daemons that query interface state such as an
> SNMP server, ladvd, etc.
> 
> If you wanted help you could modify the patch so that it does something like 
> this:
> 
#include 
>   if (/* test for IFF_UP */) {
>   BGE_UNLOCK(sc);
>   if_printf(ifp, "state queried on down interface by pid %d (%s)",
--|
 add a \n
>   curthread->td_proc->p_pid, curthread->td_proc->p_comm);
>   return;
>   }
> 
> -- 
> John Baldwin
snmpd calls this several times a second (difficult to measure, since syslog
just says
 last message repeated 22 times).
In any case, the DOWN/UP appears once every few hours, oh well.
I have now stopped the snmpd daemon; maybe there is someone else ...

thanks,
danny




Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-30 Thread John Baldwin
On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote:
> > --/04w6evG8XlLl3ft
> > Content-Type: text/x-diff; charset=us-ascii
> > Content-Disposition: attachment; filename="bge.media_sts.diff"
> > 
> > Index: sys/dev/bge/if_bge.c
> > ===
> > --- sys/dev/bge/if_bge.c(revision 251021)
> > +++ sys/dev/bge/if_bge.c(working copy)
> > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar
> >  
> > BGE_LOCK(sc);
> >  
> > +   if ((ifp->if_flags & IFF_UP) == 0) {
> > +   BGE_UNLOCK(sc);
> > +   return;
> > +   }
> > if (sc->bge_flags & BGE_FLAG_TBI) {
> > ifmr->ifm_status = IFM_AVALID;
> > ifmr->ifm_active = IFM_ETHER;
> > 
> > --/04w6evG8XlLl3ft--
> after 18hs, the logs are empty!
> it seems the patch fixes the problem.
> 
> now maybe it's time to hunt for who is randomly calling for bge_ifmedia_sts
> ...

It could be any number of daemons that query interface state such as an
SNMP server, ladvd, etc.

If you wanted help you could modify the patch so that it does something like 
this:

	if (/* test for IFF_UP */) {
		BGE_UNLOCK(sc);
		if_printf(ifp, "state queried on down interface by pid %d (%s)",
		    curthread->td_proc->p_pid, curthread->td_proc->p_comm);
		return;
	}

-- 
John Baldwin


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-29 Thread Daniel Braniss
> 
> --/04w6evG8XlLl3ft
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
> 
> On Tue, May 28, 2013 at 09:55:24AM +0300, Daniel Braniss wrote:
> > > On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote:
> > > > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
> > > > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > > > > > > > hi, after upgrading to 9.1-stable, this particular hardware - 
> > > > > > > > SunFire X2200,
> > > > > > > 
> > > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' 
> > > > > > > output.
> > > > > > > 
> > > > > > 
> > > > > > bge0:  > > > > > 0x009003> mem 
> > > > > > 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on 
> > > > > > pci6
> > > > > > bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 
> > > > > > MHz
> > > > > > miibus2:  on bge0
> > > > > > brgphy0:  PHY 1 on miibus2
> > > > > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 
> > > > > > 1000baseT, 
> > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, 
> > > > > > auto-flow
> > > > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd
> > > > > > bge1:  > > > > > 0x009003> mem 
> > > > > > 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on 
> > > > > > pci6
> > > > > > bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 
> > > > > > MHz
> > > > > > miibus3:  on bge1
> > > > > > brgphy1:  PHY 1 on miibus3
> > > > > > brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 
> > > > > > 1000baseT, 
> > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, 
> > > > > > auto-flow
> > > > > > bge1: Ethernet address: 00:1b:24:5d:5b:be
> > > > > > 
> > > > > > sf-10> ifconfig bge1
> > > > > > bge1: flags=8802 metric 0 mtu 1500
> > > > > > 
> > > > > > options=8009b > > > > > TE>
> > > > > > ether 00:1b:24:5d:5b:be
> > > > > > nd6 options=21
> > > > > > media: Ethernet autoselect (100baseTX )
> > > > > > status: active
> > > > > > 
> > > > > 
> > > > > Because bge1 is not UP, I wonder how you get link UP/DOWN events.
> > > > > Do you have some network script run by cron?
> > > > 
> > > > no scripts.
> > > > this port is shared with the ILO/IPMI, and back in March you fixed a 
> > > > problem
> > > > that it was hanging soon after it was initialized by the driver,
> > > > (r248226 - but I'm not sure if it was ever MFC'ed).
> > > 
> > > It was MFCed.
> > > 
> > > > Initialy I thought it could be caused by connections to it from other
> > > > hosts (either via the web, or ssh) so I killed them, but it didn't help.
> > > > without that patch the connection fails, and I don't see any DOWN/UP.
> > > 
> > > Could you check how many number of interrupts you get from bge1?
> > > Ideally you shouldn't get any interrupts for bge1.
> > 
> > it's not even mentioned :-)
> > sf-04> vmstat -i
> > interrupt                total    rate
> > irq3: uart1                964       0
> > irq4: uart0                  6       0
> > irq14: ata0             227354       0
> > irq17: bge0            1021981       2
> > irq21: ohci0                28       0
> > irq22: ehci0                 2       0
> > irq23: atapci1          293228       0
> > cpu0:timer           383244076    1124
> > cpu1:timer             2225144       6
> > cpu2:timer             2056087       6
> > cpu3:timer             2093943       6
> > Total                391162813    1147
> > 
> 
> Then the only way link UP/DOWN event could be generated for DOWN
> interface would be invocation of media status query
> (i.e. ifconfig -a) triggered by an external application.  Most
> drivers I touched check IFF_UP flag before poking media status
> register. However I'm not sure you're seeing this issue because you
> do not use any network script run by cron.
> Anyway, try attached patch and let me know whether it makes any
> difference.
> 
> > > 
> > > > 
> > > > > 
> > > > > > > > is toggeling bge1 DOWN/UP every few hours, this port is being 
> > > > > > > > used by the ILO.
> > > > > > > > To check, I upgraded another identical host, and the same 
> > > > > > > > problem appears. 
> > > > > > > 
> > > > > > > What is the last known working revision?
> > > > > > 
> > > > > > I have no idea, but I have older versions, and ill start from the 
> > > > > > oldets 
> > > > > > (9.1-prerelease), but
> > > > > > it will take time, since it takes hours till it happens.
> > > > > > 
> > > > > 
> > > > > ok.
> > > > 
> > > > 
> > 
> > 
> 
> --/04w6evG8XlLl3ft
> Content-Type: text/x-diff; charset=us-ascii
> Content-Disposition: attachment; filename="bge.media_sts.diff"
> 
> Index: sys/dev/bge/if_bge.c
> ===
> --- sys/dev/bge/if_bge.c 

Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-29 Thread YongHyeon PYUN
On Tue, May 28, 2013 at 09:55:24AM +0300, Daniel Braniss wrote:
> > On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote:
> > > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
> > > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > > > > > > hi, after upgrading to 9.1-stable, this particular hardware - 
> > > > > > > SunFire X2200,
> > > > > > 
> > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
> > > > > > 
> > > > > 
> > > > > bge0:  > > > > 0x009003> mem 
> > > > > 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on 
> > > > > pci6
> > > > > bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > > > miibus2:  on bge0
> > > > > brgphy0:  PHY 1 on miibus2
> > > > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd
> > > > > bge1:  > > > > 0x009003> mem 
> > > > > 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on 
> > > > > pci6
> > > > > bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > > > miibus3:  on bge1
> > > > > brgphy1:  PHY 1 on miibus3
> > > > > brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > > > bge1: Ethernet address: 00:1b:24:5d:5b:be
> > > > > 
> > > > > sf-10> ifconfig bge1
> > > > > bge1: flags=8802 metric 0 mtu 1500
> > > > > 
> > > > > options=8009b > > > > TE>
> > > > > ether 00:1b:24:5d:5b:be
> > > > > nd6 options=21
> > > > > media: Ethernet autoselect (100baseTX )
> > > > > status: active
> > > > > 
> > > > 
> > > > Because bge1 is not UP, I wonder how you get link UP/DOWN events.
> > > > Do you have some network script run by cron?
> > > 
> > > no scripts.
> > > this port is shared with the ILO/IPMI, and back in March you fixed a 
> > > problem
> > > that it was hanging soon after it was initialized by the driver,
> > > (r248226 - but I'm not sure if it was ever MFC'ed).
> > 
> > It was MFCed.
> > 
> > > Initialy I thought it could be caused by connections to it from other
> > > hosts (either via the web, or ssh) so I killed them, but it didn't help.
> > > without that patch the connection fails, and I don't see any DOWN/UP.
> > 
> > Could you check how many number of interrupts you get from bge1?
> > Ideally you shouldn't get any interrupts for bge1.
> 
> it's not even mentioned :-)
> sf-04> vmstat -i
> interrupt                total    rate
> irq3: uart1                964       0
> irq4: uart0                  6       0
> irq14: ata0             227354       0
> irq17: bge0            1021981       2
> irq21: ohci0                28       0
> irq22: ehci0                 2       0
> irq23: atapci1          293228       0
> cpu0:timer           383244076    1124
> cpu1:timer             2225144       6
> cpu2:timer             2056087       6
> cpu3:timer             2093943       6
> Total                391162813    1147
> 

Then the only way a link UP/DOWN event could be generated for a DOWN
interface would be the invocation of a media status query
(e.g. ifconfig -a) triggered by an external application.  Most
drivers I touched check the IFF_UP flag before poking the media status
register. However, I'm not sure this is what you're seeing, because you
do not use any network script run by cron.
Anyway, try the attached patch and let me know whether it makes any
difference.

> > 
> > > 
> > > > 
> > > > > > > is toggeling bge1 DOWN/UP every few hours, this port is being 
> > > > > > > used by the ILO.
> > > > > > > To check, I upgraded another identical host, and the same problem 
> > > > > > > appears. 
> > > > > > 
> > > > > > What is the last known working revision?
> > > > > 
> > > > > I have no idea, but I have older versions, and ill start from the 
> > > > > oldets 
> > > > > (9.1-prerelease), but
> > > > > it will take time, since it takes hours till it happens.
> > > > > 
> > > > 
> > > > ok.
> > > 
> > > 
> 
> 
Index: sys/dev/bge/if_bge.c
===
--- sys/dev/bge/if_bge.c	(revision 251021)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar
 
 	BGE_LOCK(sc);
 
+	if ((ifp->if_flags & IFF_UP) == 0) {
+		BGE_UNLOCK(sc);
+		return;
+	}
 	if (sc->bge_flags & BGE_FLAG_TBI) {
 		ifmr->ifm_status = IFM_AVALID;
 		ifmr->ifm_active = IFM_ETHER;

Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-28 Thread Jeremy Chadwick
On Tue, May 28, 2013 at 10:57:22AM +0300, Daniel Braniss wrote:
> 
> [...]
> > 1. r248226 in head was MFC'd to stable/9 as r248858.  Validation:
> > 
> > http://svnweb.freebsd.org/base/stable/9/sys/dev/bge/if_bge.c?view=log
> > 
> > So the answer: whether or not you have that MFC in stable/9 depends on
> > what SVN rev your kernel is.
> 
> I do a svnsync then I convert to mercurial so from the svn logs I see that
> the highest rev number is 250960.
> 
> [...]
> > 
> > That "piggybacking" crap never should have been invented.  All it has
> > done is cause problems for every OS I know of (including Windows) since
> > its inception, and is also exactly why today almost all vendors I've
> > seen provide a dedicated NIC and RJ45 port for the iLO/IPMI interface.
> > It's admission the "piggybacking" method doesn't work.  And may it rot
> > in hell for all I care, while simultaneously feeling very sorry for
> > those who have to suffer/deal with it.
> > 
> > This is just another reason why I've always been very picky about what
> > hardware I'd buy for server deployments.  Vendors never actually
> > disclose this crap until you've shelled out money for the hardware, by
> > which point it's too late and you're suffering.  Really great model --
> > for the pocketbook.  :/
> > 
> 
> I couldn't agree more!
> 
> [...]
> 
> in the case of the SunFire X2200, it has 4 bge ports, the
> 2nd, bge1, is only used by the ilo, it's not enabled (UP'ed),
> it doesn't have an interrupt assigned, it's, as far as I can tell,
> just anoying to have the DOWN/UP messages - unless something more sinester
> is lurking.

Does output from "ps -auxH | grep kernel/bge" show anything for
bge1?

What about "vmstat -i -a" (you might be surprised about the -a flag and
what shows up compared to just using -i).  Gut feeling says it will show
up there.  (See vmstat(8) for what -a does)

Possibly interrupt generation isn't what's "triggering" the bge(4)
device to see link going up/down; maybe this is done via some memory
mapped I/O, which would explain why "vmstat -i" shows nothing for bge1
(no interrupts ever generated).

That doesn't change the fact that the driver still is being told via
some means that link is going up/down.

Just a general FYI (probably not relevant here too much, but I often
have to point it out for younger SAs (not saying anyone here is one,
but the list is archived...)): there is a very distinct difference
between a link being physically up/down vs. administratively up/down.

With *IX ifconfig, the social assumption is that there's a 1:1
correlation between those (especially with Ethernet devices), when in
reality it depends on the device driver and all subsystems in between.
I remember quite clearly on some OSes (can't remember if BSD or Linux or
Solaris) where "ifconfig xxx down" on certain devices would still result
in packets being passed across xxx.  This used to shock me when I was
younger, but nowadays doesn't because I have a better understanding of
why.

ifconfig is just a generic tool that interfaces with a lot of things and
tries to do too much, in my opinion.  On BSD we tend to cram as much
crap into ifconfig as humanly possible, while on other OSes separate
per-device tools/utilities have been developed to segregate the
intended behaviours/desires.

-- 
| Jeremy Chadwick                                  j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-28 Thread Daniel Braniss

[...]
> 1. r248226 in head was MFC'd to stable/9 as r248858.  Validation:
> 
> http://svnweb.freebsd.org/base/stable/9/sys/dev/bge/if_bge.c?view=log
> 
> So the answer: whether or not you have that MFC in stable/9 depends on
> what SVN rev your kernel is.

I do a svnsync then I convert to mercurial so from the svn logs I see that
the highest rev number is 250960.

[...]
> 
> That "piggybacking" crap never should have been invented.  All it has
> done is cause problems for every OS I know of (including Windows) since
> its inception, and is also exactly why today almost all vendors I've
> seen provide a dedicated NIC and RJ45 port for the iLO/IPMI interface.
> It's admission the "piggybacking" method doesn't work.  And may it rot
> in hell for all I care, while simultaneously feeling very sorry for
> those who have to suffer/deal with it.
> 
> This is just another reason why I've always been very picky about what
> hardware I'd buy for server deployments.  Vendors never actually
> disclose this crap until you've shelled out money for the hardware, by
> which point it's too late and you're suffering.  Really great model --
> for the pocketbook.  :/
> 

I couldn't agree more!

[...]

in the case of the SunFire X2200, it has 4 bge ports; the
2nd, bge1, is only used by the ilo. it's not enabled (UP'ed) and
it doesn't have an interrupt assigned, so as far as I can tell it's
just annoying to have the DOWN/UP messages - unless something more sinister
is lurking.

thanks,
danny




Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-27 Thread Jeremy Chadwick
On Mon, May 27, 2013 at 11:49:31PM -0700, Jeremy Chadwick wrote:
> Other question: is there any correlation between the amount of time that
> goes by between events with, say, ARP/MAC address expiry in "arp -a"?  I
> mention this because I know some of the ASF methods have historically
> shown two MAC addresses on the same physif, and I can see how this might
> confuse some stacks.

Never mind -- I thought about this more, and it's irrelevant.

-- 
| Jeremy Chadwick                                  j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-27 Thread Daniel Braniss
> On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote:
> > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
> > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > > > > > hi, after upgrading to 9.1-stable, this particular hardware - 
> > > > > > SunFire X2200,
> > > > > 
> > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
> > > > > 
> > > > 
> > > > bge0:  > > > 0x009003> mem 
> > > > 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6
> > > > bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > > miibus2:  on bge0
> > > > brgphy0:  PHY 1 on miibus2
> > > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd
> > > > bge1:  > > > 0x009003> mem 
> > > > 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6
> > > > bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > > miibus3:  on bge1
> > > > brgphy1:  PHY 1 on miibus3
> > > > brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > > bge1: Ethernet address: 00:1b:24:5d:5b:be
> > > > 
> > > > sf-10> ifconfig bge1
> > > > bge1: flags=8802 metric 0 mtu 1500
> > > > 
> > > > options=8009b > > > TE>
> > > > ether 00:1b:24:5d:5b:be
> > > > nd6 options=21
> > > > media: Ethernet autoselect (100baseTX )
> > > > status: active
> > > > 
> > > 
> > > Because bge1 is not UP, I wonder how you get link UP/DOWN events.
> > > Do you have some network script run by cron?
> > 
> > no scripts.
> > this port is shared with the ILO/IPMI, and back in March you fixed a problem
> > that it was hanging soon after it was initialized by the driver,
> > (r248226 - but I'm not sure if it was ever MFC'ed).
> 
> It was MFCed.
> 
> > Initialy I thought it could be caused by connections to it from other
> > hosts (either via the web, or ssh) so I killed them, but it didn't help.
> > without that patch the connection fails, and I don't see any DOWN/UP.
> 
> Could you check how many number of interrupts you get from bge1?
> Ideally you shouldn't get any interrupts for bge1.

it's not even mentioned :-)
sf-04> vmstat -i
interrupt                          total       rate
irq3: uart1                          964          0
irq4: uart0                            6          0
irq14: ata0                       227354          0
irq17: bge0                      1021981          2
irq21: ohci0                          28          0
irq22: ehci0                           2          0
irq23: atapci1                    293228          0
cpu0:timer                     383244076       1124
cpu1:timer                       2225144          6
cpu2:timer                       2056087          6
cpu3:timer                       2093943          6
Total                          391162813       1147
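
A minimal sketch of the same check in script form, not something posted in the thread. The helper name has_irq_line is hypothetical, and it only assumes that `vmstat -i` prints one whitespace-separated line per interrupt source, as in the output above. It succeeds only when some line names the given device, so a missing bge1 shows up as a non-zero exit status.

```sh
#!/bin/sh
# has_irq_line DEVICE
# Reads `vmstat -i` style text on stdin and succeeds only if one of the
# whitespace-separated fields is exactly DEVICE (for example "bge1").
has_irq_line() {
    awk -v dev="$1" '
        { for (i = 1; i <= NF; i++) if ($i == dev) found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Usage on the host itself:
#   vmstat -i | has_irq_line bge1 || echo "no interrupt line for bge1"
```

Matching on a whole field rather than a substring keeps bge1 from matching a hypothetical bge10.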

> 
> > 
> > > 
> > > > > > is toggeling bge1 DOWN/UP every few hours, this port is being used 
> > > > > > by the ILO.
> > > > > > To check, I upgraded another identical host, and the same problem 
> > > > > > appears. 
> > > > > 
> > > > > What is the last known working revision?
> > > > 
> > > > I have no idea, but I have older versions, and ill start from the 
> > > > oldets 
> > > > (9.1-prerelease), but
> > > > it will take time, since it takes hours till it happens.
> > > > 
> > > 
> > > ok.
> > 
> > 




Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-27 Thread Jeremy Chadwick
On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote:
> > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
> > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire 
> > > > > X2200,
> > > > 
> > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
> > > > 
> > > 
> > > bge0:  > > 0x009003> mem 
> > > 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6
> > > bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > miibus2:  on bge0
> > > brgphy0:  PHY 1 on miibus2
> > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > bge0: Ethernet address: 00:1b:24:5d:5b:bd
> > > bge1:  > > 0x009003> mem 
> > > 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6
> > > bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > miibus3:  on bge1
> > > brgphy1:  PHY 1 on miibus3
> > > brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > bge1: Ethernet address: 00:1b:24:5d:5b:be
> > > 
> > > sf-10> ifconfig bge1
> > > bge1: flags=8802 metric 0 mtu 1500
> > > 
> > > options=8009b > > TE>
> > > ether 00:1b:24:5d:5b:be
> > > nd6 options=21
> > > media: Ethernet autoselect (100baseTX )
> > > status: active
> > > 
> > 
> > Because bge1 is not UP, I wonder how you get link UP/DOWN events.
> > Do you have some network script run by cron?
> 
> no scripts.
> this port is shared with the ILO/IPMI, and back in March you fixed a problem
> that it was hanging soon after it was initialized by the driver,
> (r248226 - but I'm not sure if it was ever MFC'ed).
> Initialy I thought it could be caused by connections to it from other
> hosts (either via the web, or ssh) so I killed them, but it didn't help.
> without that patch the connection fails, and I don't see any DOWN/UP.

Two things:

1. r248226 in head was MFC'd to stable/9 as r248858.  Validation:

http://svnweb.freebsd.org/base/stable/9/sys/dev/bge/if_bge.c?view=log

So the answer: whether or not you have that MFC in stable/9 depends on
what SVN rev your kernel is.

2. Is there some way to verify that the ASF/iLO/IPMI bits (i.e. the IPMI
firmware itself) are not shutting down bge1's PHY intentionally?  Unless
the IPMI module chooses to log something useful (e.g. "I'm doing this"),
I'm not sure how you'd figure that out.
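
For point 1, a rough sketch of the comparison (the helper name rev_at_least is made up for illustration). It relies on SVN revision numbers being repository-wide, so a stable/9 tree at revision N contains every stable/9 commit numbered N or lower. r250960 is the revision Daniel reports elsewhere in this thread.

```sh
#!/bin/sh
# rev_at_least HAVE WANT
# Succeeds if SVN revision HAVE is at or after WANT.
# Both arguments may be written with or without the leading "r".
rev_at_least() {
    have=${1#r}   # strip a leading "r" if present
    want=${2#r}
    [ "$have" -ge "$want" ]
}

# Example: does a stable/9 tree at r250960 contain the MFC r248858?
if rev_at_least r250960 r248858; then
    echo "r248858 is included"
fi
```

This only answers "is the commit in the tree"; it says nothing about whether the running kernel was actually built from that tree.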

Other question: is there any correlation between the amount of time that
goes by between events with, say, ARP/MAC address expiry in "arp -a"?  I
mention this because I know some of the ASF methods have historically
shown two MAC addresses on the same physif, and I can see how this might
confuse some stacks.


That "piggybacking" crap never should have been invented.  All it has
done is cause problems for every OS I know of (including Windows) since
its inception, and is also exactly why today almost all vendors I've
seen provide a dedicated NIC and RJ45 port for the iLO/IPMI interface.
It's an admission that the "piggybacking" method doesn't work.  And may it rot
in hell for all I care, while simultaneously feeling very sorry for
those who have to suffer/deal with it.

This is just another reason why I've always been very picky about what
hardware I'd buy for server deployments.  Vendors never actually
disclose this crap until you've shelled out money for the hardware, by
which point it's too late and you're suffering.  Really great model --
for the pocketbook.  :/


-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                 http://jdc.koitsu.org/ |
| Mountain View, CA, US                                             |
| Making life hard for others since 1977.              PGP 4BD6C0CB |


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-27 Thread YongHyeon PYUN
On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote:
> > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
> > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire 
> > > > > X2200,
> > > > 
> > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
> > > > 
> > > 
> > > bge0:  > > 0x009003> mem 
> > > 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6
> > > bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > miibus2:  on bge0
> > > brgphy0:  PHY 1 on miibus2
> > > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > bge0: Ethernet address: 00:1b:24:5d:5b:bd
> > > bge1:  > > 0x009003> mem 
> > > 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6
> > > bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > miibus3:  on bge1
> > > brgphy1:  PHY 1 on miibus3
> > > brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > bge1: Ethernet address: 00:1b:24:5d:5b:be
> > > 
> > > sf-10> ifconfig bge1
> > > bge1: flags=8802 metric 0 mtu 1500
> > > 
> > > options=8009b > > TE>
> > > ether 00:1b:24:5d:5b:be
> > > nd6 options=21
> > > media: Ethernet autoselect (100baseTX )
> > > status: active
> > > 
> > 
> > Because bge1 is not UP, I wonder how you get link UP/DOWN events.
> > Do you have some network script run by cron?
> 
> no scripts.
> this port is shared with the ILO/IPMI, and back in March you fixed a problem
> that it was hanging soon after it was initialized by the driver,
> (r248226 - but I'm not sure if it was ever MFC'ed).

It was MFCed.

> Initialy I thought it could be caused by connections to it from other
> hosts (either via the web, or ssh) so I killed them, but it didn't help.
> without that patch the connection fails, and I don't see any DOWN/UP.

Could you check how many interrupts you get from bge1?
Ideally you shouldn't get any interrupts for bge1.

> 
> > 
> > > > > is toggeling bge1 DOWN/UP every few hours, this port is being used by 
> > > > > the ILO.
> > > > > To check, I upgraded another identical host, and the same problem 
> > > > > appears. 
> > > > 
> > > > What is the last known working revision?
> > > 
> > > I have no idea, but I have older versions, and ill start from the oldets 
> > > (9.1-prerelease), but
> > > it will take time, since it takes hours till it happens.
> > > 
> > 
> > ok.
> 
> 


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-27 Thread Daniel Braniss
...
> There are ways you can speed up the replication time. I tend to flood a server
> with TCP while I've heard of it happening under UDP flood too.
> 
> Here's a nice way to flood a server with TCP (assuming you have SSH access to
> the system via keys):
> 
> sh -c 'while :;do dd if=/dev/urandom of=/dev/stdout bs=1m count=1024 | ssh HOST2KILL /sbin/md5; done'
> 
> Run that about 16 times in separate screen sessions from various other hosts on
> your network, taking care to replace "HOST2KILL" with the hostname or IP of the
> box with the SunFire X2200.
> 
> Let that run for a while, and then when you think you've had a reset (if you
> weren't standing there watching for one)…
> 
> grep 'bge.*DOWN' /var/log/messages
> 
> On a system that has booted and stayed up-and-running, there shouldn't be any
> messages like this:
> 
> bge0: link state changed to DOWN
> 
> When you actually get this message (if your experience is like ours), you'll be
> down for 90 seconds while the NIC resets.
> 
> However, since you say you have some older 9.1 releases… I'd start by first
> trying to bring the replication time of the problem down by using TCP and/or UDP
> floods. That way you'll be able to test for resolution of the problem as you
> progress up to stable/9 (where the problem should be fixed by the aforementioned
> SVN revisions -- specific to your hardware).
...
> any ideas?
> 
> 
> Well, you say the connection is OK… so it doesn't sound like a full reset as it
> was in our case (we have a different chipset).
> 
> But I agree that a log full of those would be annoying.
> 
> Try getting up to stable/9 in its current state (note: stable/8 also has all the
> aforementioned revisions too).
> --
> Devin

Hi Devin,
the kernel is pretty new, actually last Friday's, and the svn says
it's r250960.

the bge1 port is not UP, it's shared with the onboard BMC/ILO/IPMI thingy.
connecting to it via ssh gets me into its ILO manager:
...
Sun(TM) Embedded Lights Out Manager

Copyright 2004-2006 Sun Microsystems, Inc. All rights reserved.

Version 3.23
...
and so typing
start AgentInfo/console
I can get to the 'serial' console.

cheers, and thanks,
danny




Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-27 Thread Daniel Braniss
> On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
> > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire 
> > > > X2200,
> > > 
> > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
> > > 
> > 
> > bge0:  
> > mem 
> > 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6
> > bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > miibus2:  on bge0
> > brgphy0:  PHY 1 on miibus2
> > brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > bge0: Ethernet address: 00:1b:24:5d:5b:bd
> > bge1:  
> > mem 
> > 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6
> > bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > miibus3:  on bge1
> > brgphy1:  PHY 1 on miibus3
> > brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > bge1: Ethernet address: 00:1b:24:5d:5b:be
> > 
> > sf-10> ifconfig bge1
> > bge1: flags=8802 metric 0 mtu 1500
> > 
> > options=8009b > TE>
> > ether 00:1b:24:5d:5b:be
> > nd6 options=21
> > media: Ethernet autoselect (100baseTX )
> > status: active
> > 
> 
> Because bge1 is not UP, I wonder how you get link UP/DOWN events.
> Do you have some network script run by cron?

no scripts.
this port is shared with the ILO/IPMI, and back in March you fixed a problem
where it was hanging soon after it was initialized by the driver
(r248226 - but I'm not sure if it was ever MFC'ed).
Initially I thought it could be caused by connections to it from other
hosts (either via the web, or ssh) so I killed them, but it didn't help.
without that patch the connection fails, and I don't see any DOWN/UP.

> 
> > > > is toggeling bge1 DOWN/UP every few hours, this port is being used by 
> > > > the ILO.
> > > > To check, I upgraded another identical host, and the same problem 
> > > > appears. 
> > > 
> > > What is the last known working revision?
> > 
> > I have no idea, but I have older versions, and ill start from the oldets 
> > (9.1-prerelease), but
> > it will take time, since it takes hours till it happens.
> > 
> 
> ok.




Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-27 Thread YongHyeon PYUN
On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
> > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire 
> > > X2200,
> > 
> > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
> > 
> 
> bge0:  
> mem 
> 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6
> bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> miibus2:  on bge0
> brgphy0:  PHY 1 on miibus2
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge0: Ethernet address: 00:1b:24:5d:5b:bd
> bge1:  
> mem 
> 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6
> bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> miibus3:  on bge1
> brgphy1:  PHY 1 on miibus3
> brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
> 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge1: Ethernet address: 00:1b:24:5d:5b:be
> 
> sf-10> ifconfig bge1
> bge1: flags=8802 metric 0 mtu 1500
> 
> options=8009b TE>
> ether 00:1b:24:5d:5b:be
> nd6 options=21
> media: Ethernet autoselect (100baseTX )
> status: active
> 

Because bge1 is not UP, I wonder how you get link UP/DOWN events.
Do you have some network script run by cron?
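
The "not UP" reading comes from the flags word in the ifconfig output: 8802 is hexadecimal, and IFF_UP is the lowest bit (0x1 in FreeBSD's net/if.h), which is clear here. A small sketch of that check, with a hypothetical helper name:

```sh
#!/bin/sh
# iface_is_up FLAGS
# FLAGS is the hex word ifconfig prints after "flags=", without a 0x prefix.
# Succeeds if IFF_UP (0x1) is set in it.
iface_is_up() {
    [ $(( 0x$1 & 0x1 )) -ne 0 ]
}

# Usage (bge1 in this thread reports flags=8802):
#   iface_is_up 8802 || echo "bge1 is administratively down"
```

The remaining bits in 8802 are BROADCAST (0x2), SIMPLEX (0x800) and MULTICAST (0x8000), none of which say anything about link state.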

> > > is toggeling bge1 DOWN/UP every few hours, this port is being used by the 
> > > ILO.
> > > To check, I upgraded another identical host, and the same problem 
> > > appears. 
> > 
> > What is the last known working revision?
> 
> I have no idea, but I have older versions, and ill start from the oldets 
> (9.1-prerelease), but
> it will take time, since it takes hours till it happens.
> 

ok.


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-27 Thread Teske, Devin

On May 27, 2013, at 12:59 AM, Daniel Braniss wrote:

On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,


If you're truly running stable/9, and it's up-to-date, you should already have
SVN revisions 248858 and 250650, both of which have significant impact for
(a) the SunFire X2200 (r248858) and (b) the DOWN/UP problem (r250650).


Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.


bge0:  mem
0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6
bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus2:  on bge0
brgphy0:  PHY 1 on miibus2
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:1b:24:5d:5b:bd
bge1:  mem
0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6
bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus3:  on bge1
brgphy1:  PHY 1 on miibus3
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 00:1b:24:5d:5b:be

sf-10> ifconfig bge1
bge1: flags=8802 metric 0 mtu 1500
   options=8009b
   ether 00:1b:24:5d:5b:be
   nd6 options=21
   media: Ethernet autoselect (100baseTX )
   status: active


Saw similar things happening over here with a different Broadcom chipset, and
the above revisions helped significantly (URLs below):

http://svnweb.freebsd.org/base?view=revision&revision=248858
http://svnweb.freebsd.org/base?view=revision&revision=250650



is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO.
To check, I upgraded another identical host, and the same problem appears.

What is the last known working revision?

I have no idea, but I have older versions, and ill start from the oldets
(9.1-prerelease), but
it will take time, since it takes hours till it happens.


There are ways you can speed up the replication time. I tend to flood a server
with TCP while I've heard of it happening under UDP flood too.

Here's a nice way to flood a server with TCP (assuming you have SSH access to
the system via keys):

sh -c 'while :;do dd if=/dev/urandom of=/dev/stdout bs=1m count=1024 | ssh HOST2KILL /sbin/md5; done'

Run that about 16 times in separate screen sessions from various other hosts on
your network, taking care to replace "HOST2KILL" with the hostname or IP of the
box with the SunFire X2200.

Let that run for a while, and then when you think you've had a reset (if you
weren't standing there watching for one)…

grep 'bge.*DOWN' /var/log/messages

On a system that has booted and stayed up-and-running, there shouldn't be any
messages like this:

bge0: link state changed to DOWN

When you actually get this message (if your experience is like ours), you'll be
down for 90 seconds while the NIC resets.

However, since you say you have some older 9.1 releases… I'd start by first
trying to bring the replication time of the problem down by using TCP and/or UDP
floods. That way you'll be able to test for resolution of the problem as you
progress up to stable/9 (where the problem should be fixed by the aforementioned
SVN revisions -- specific to your hardware).
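
If several bge ports are involved, the grep above can be extended to a per-interface tally. This is only a sketch: the function name count_link_down is hypothetical, and it assumes the kernel message text is exactly "bgeN: link state changed to DOWN" as shown in this thread.

```sh
#!/bin/sh
# count_link_down
# Reads syslog text on stdin and prints "<interface> <count>" for every
# bge interface that logged a "link state changed to DOWN" message.
count_link_down() {
    awk '/bge[0-9]+: link state changed to DOWN/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^bge[0-9]+:$/) {
                name = $i
                sub(/:$/, "", name)
                count[name]++
            }
        }
    }
    END { for (name in count) print name, count[name] }' | sort
}

# Usage on the host being tested:
#   count_link_down < /var/log/messages
```

Comparing the tally before and after a flood run gives a quick yes/no on whether the run triggered any resets.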




There
is not correlation with time, since they happend at totaly different times.
I rebooted both hosts at almost the same time.
one host :
uptime: 5:24PM  up  6:15, 0 users, load averages: 0.00, 0.00, 0.00
May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN
May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP
May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN
May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP

and
uptime: 5:24PM  up  6:14, 0 users, load averages: 0.00, 0.00, 0.00

May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN
May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP

this is not serious, the ilo (ssh) connection is ok, but it's anoying, we have
more
than 10 of this hosts, and if I upgrade all of them, the logs will fill up
with this :-)

any ideas?


Well, you say the connection is OK… so it doesn't sound like a full reset as it
was in our case (we have a different chipset).

But I agree that a log full of those would be annoying.

Try getting up to stable/9 in its current state (note: stable/8 also has all the
aforementioned revisions too).
--
Devin


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-27 Thread Daniel Braniss
> On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,
> 
> Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
> 

bge0:  mem 
0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6
bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus2:  on bge0
brgphy0:  PHY 1 on miibus2
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:1b:24:5d:5b:bd
bge1:  mem 
0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6
bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus3:  on bge1
brgphy1:  PHY 1 on miibus3
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 00:1b:24:5d:5b:be

sf-10> ifconfig bge1
bge1: flags=8802 metric 0 mtu 1500
options=8009b
ether 00:1b:24:5d:5b:be
nd6 options=21
media: Ethernet autoselect (100baseTX )
status: active

> > is toggeling bge1 DOWN/UP every few hours, this port is being used by the 
> > ILO.
> > To check, I upgraded another identical host, and the same problem appears. 
> 
> What is the last known working revision?

I have no idea, but I have older versions, and I'll start from the oldest
(9.1-prerelease), but it will take time, since it takes hours till it happens.

> 
> > There
> > is not correlation with time, since they happend at totaly different times.
> > I rebooted both hosts at almost the same time.
> > one host :
> > uptime: 5:24PM  up  6:15, 0 users, load averages: 0.00, 0.00, 0.00
> > May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN
> > May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP
> > May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN
> > May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP
> > 
> > and
> > uptime: 5:24PM  up  6:14, 0 users, load averages: 0.00, 0.00, 0.00
> > 
> > May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN
> > May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP
> > 
> > this is not serious, the ilo (ssh) connection is ok, but it's anoying, we 
> > have 
> > more
> > than 10 of this hosts, and if I upgrade all of them, the logs will fill up
> > with this :-)
> > 
> > any ideas?
> > 
> > cheers,
> > danny




Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-26 Thread YongHyeon PYUN
On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,

Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.

> is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO.
> To check, I upgraded another identical host, and the same problem appears. 

What is the last known working revision?

> There
> is not correlation with time, since they happend at totaly different times.
> I rebooted both hosts at almost the same time.
> one host :
> uptime: 5:24PM  up  6:15, 0 users, load averages: 0.00, 0.00, 0.00
> May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN
> May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP
> May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN
> May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP
> 
> and
> uptime: 5:24PM  up  6:14, 0 users, load averages: 0.00, 0.00, 0.00
> 
> May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN
> May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP
> 
> this is not serious, the ilo (ssh) connection is ok, but it's anoying, we 
> have 
> more
> than 10 of this hosts, and if I upgrade all of them, the logs will fill up
> with this :-)
> 
> any ideas?
> 
> cheers,
>   danny


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-25 Thread Daniel Braniss
> Is this your bug?
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=171121


no, this bge is only used for the ilo, it happens even if it's idling, i.e. no
active connection.
it is also very erratic, it happens at random intervals, from 3 to 10 hours,
and the down/up 'hiccup' lasts from less than a sec to about 3 sec.

thanks,
danny





Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-25 Thread Daniel Braniss
> Daniel Braniss wrote:
> > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,
> > is toggeling bge1 DOWN/UP every few hours, this port is being used by the 
> > ILO.
> > To check, I upgraded another identical host, and the same problem appears.
> > There
> > is not correlation with time, since they happend at totaly different times.
> > I rebooted both hosts at almost the same time.
> > one host :
> > uptime: 5:24PM  up  6:15, 0 users, load averages: 0.00, 0.00, 0.00
> > May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN
> > May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP
> > May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN
> > May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP
> >
> > and
> > uptime: 5:24PM  up  6:14, 0 users, load averages: 0.00, 0.00, 0.00
> >
> > May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN
> > May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP
> >
> > this is not serious, the ilo (ssh) connection is ok, but it's anoying, we 
> > have
> > more
> > than 10 of this hosts, and if I upgrade all of them, the logs will fill up
> > with this :-)
> 
> What revision are you running?
> 
Friday morning's, probably r250960 (I run svnsync then convert to hg :-)

> There was problem report at February
> http://lists.freebsd.org/pipermail/freebsd-net/2013-February/034715.html
> http://lists.freebsd.org/pipermail/freebsd-net/2013-March/034778.html
> 
> I provided access to Yongari to our Sun Fire X2100 M2 (bge 5715C) and he 
> fixed the problem. (in revision r248226, I don't know if it was MFCed)
> 
> http://lists.freebsd.org/pipermail/freebsd-net/2013-March/034922.html
> 
> Miroslav Lachman

well, it seems different, it just goes down, then a few secs later goes up.
thanks,
danny




Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-24 Thread Miroslav Lachman

Daniel Braniss wrote:

hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,
is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO.
To check, I upgraded another identical host, and the same problem appears.
There
is not correlation with time, since they happend at totaly different times.
I rebooted both hosts at almost the same time.
one host :
uptime: 5:24PM  up  6:15, 0 users, load averages: 0.00, 0.00, 0.00
May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN
May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP
May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN
May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP

and
uptime: 5:24PM  up  6:14, 0 users, load averages: 0.00, 0.00, 0.00

May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN
May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP

this is not serious, the ilo (ssh) connection is ok, but it's anoying, we have
more
than 10 of this hosts, and if I upgrade all of them, the logs will fill up
with this :-)


What revision are you running?

There was a problem report in February:
http://lists.freebsd.org/pipermail/freebsd-net/2013-February/034715.html
http://lists.freebsd.org/pipermail/freebsd-net/2013-March/034778.html

I gave Yongari access to our Sun Fire X2100 M2 (bge 5715C) and he
fixed the problem (in revision r248226; I don't know if it was MFCed).


http://lists.freebsd.org/pipermail/freebsd-net/2013-March/034922.html

Miroslav Lachman


Re: SunFire X2200 ilo's bge1 DOWN/UP

2013-05-24 Thread Mark Felder

Is this your bug?

http://www.freebsd.org/cgi/query-pr.cgi?pr=171121


SunFire X2200 ilo's bge1 DOWN/UP

2013-05-24 Thread Daniel Braniss
hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,
is toggling bge1 DOWN/UP every few hours, this port is being used by the ILO.
To check, I upgraded another identical host, and the same problem appears.
There is no correlation with time, since they happened at totally different
times. I rebooted both hosts at almost the same time.
one host :
uptime: 5:24PM  up  6:15, 0 users, load averages: 0.00, 0.00, 0.00
May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN
May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP
May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN
May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP

and
uptime: 5:24PM  up  6:14, 0 users, load averages: 0.00, 0.00, 0.00

May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN
May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP

this is not serious, the ilo (ssh) connection is ok, but it's annoying, we have
more than 10 of these hosts, and if I upgrade all of them, the logs will fill up
with this :-)
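
To put numbers on how long each DOWN/UP flap lasts, the log lines above can be paired up mechanically. A rough sketch, assuming the standard syslog layout shown (month, day, time, host, "kernel:", interface); the function name flap_durations is hypothetical, and a flap that spans midnight is not handled.

```sh
#!/bin/sh
# flap_durations
# Reads syslog text on stdin, pairs each "link state changed to DOWN" with
# the next "UP" for the same host and interface, and prints
# "<host> <interface> <seconds>s" for every pair.
flap_durations() {
    awk '
    function secs(t,    p) {
        split(t, p, ":")
        return p[1] * 3600 + p[2] * 60 + p[3]
    }
    /link state changed to (DOWN|UP)/ {
        ifname = $6
        sub(/:$/, "", ifname)
        key = $4 " " ifname
        if ($NF == "DOWN") {
            down_at[key] = secs($3)
            pending[key] = 1
        } else if (pending[key]) {
            print key, (secs($3) - down_at[key]) "s"
            pending[key] = 0
        }
    }'
}

# Usage:
#   flap_durations < /var/log/messages
```

For the six lines quoted above this gives two 3-second flaps on sf-04 and one 0-second flap on sf-10.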

any ideas?

cheers,
danny

