Re: bce(4) panics, 9.2rc1 [redux]

2013-07-31 Thread Hiroki Sato
[Added yougari@ and davidch@ to the To:/Cc: list] I confirmed that my issue reported on -current@ is due to the bxe(4) driver (BCM57711). If it is disabled, shutdown works fine without NMI. Also, I received several reports about the same box that NMI occurred even on bge(4) (BCM5717)

Re: bce(4) panics, 9.2rc1 [redux]

2013-07-31 Thread Yonghyeon PYUN
On Wed, Jul 31, 2013 at 03:54:06PM +0900, Hiroki Sato wrote: [Added yougari@ and davidch@ to the To:/Cc: list] I confirmed that my issue reported on -current@ is due to the bxe(4) driver (BCM57711). If it is disabled, shutdown works fine without NMI. Also, I received several reports

Re: bce(4) panics, 9.2rc1 [redux]

2013-07-31 Thread Hiroki Sato
Yonghyeon PYUN pyu...@gmail.com wrote in 20130731074341.gc1...@michelle.cdnetworks.com: On Wed, Jul 31, 2013 at 03:54:06PM +0900, Hiroki Sato wrote: [Added yougari@ and davidch@ to the To:/Cc: list] I confirmed that my issue reported on -current@ is due to the bxe(4) driver

Re: bce(4) panics, 9.2rc1 [redux]

2013-07-30 Thread Sean Bruno
http://svnweb.freebsd.org/base?view=revision&revision=236216 Ok, confirmed after ~50 reboots. There is a timing problem in this revision that I don't fully understand. Adding printf's inside bce_reset() will cause the existing code to succeed, and sometimes the existing code in this

Re: bce(4) panics, 9.2rc1 [redux]

2013-07-29 Thread Sean Bruno
On Wed, 2013-07-24 at 14:07 -0700, Sean Bruno wrote: Running 9.2 in production load mail servers. We're hitting the watchdog message and crashing with the stable/9 version. We're reverting the change from 2 weeks ago and seeing if it still happens. We didn't see this from stable/9 from

Re: bce(4) panics, 9.2rc1 [redux]

2013-07-29 Thread Barney Cordoba
From: Sean Bruno sean_br...@yahoo.com To: freebsd-net@freebsd.org freebsd-net@freebsd.org Sent: Monday, July 29, 2013 8:56 PM Subject: Re: bce(4) panics, 9.2rc1 [redux] On Wed, 2013-07-24 at 14:07 -0700, Sean Bruno wrote: Running 9.2 in production load

Re: bce(4) panics, 9.2rc1, IPMI related?

2013-07-26 Thread Sean Bruno
bce0: Broadcom NetXtreme II BCM5716 1000Base-T (C0) mem 0xda00-0xdbff irq 36 at device 0.0 on pci1 miibus0: MII bus on bce0 brgphy0: BCM5709 10/100/1000baseT PHY PHY 1 on miibus0 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master,

Re: bce(4) panics, 9.2rc1

2013-07-25 Thread Sean Bruno
On Wed, 2013-07-24 at 14:23 -0700, Sean Bruno wrote: On Wed, 2013-07-24 at 14:07 -0700, Sean Bruno wrote: Running 9.2 in production load mail servers. We're hitting the watchdog message and crashing with the stable/9 version. We're reverting the change from 2 weeks ago and seeing if it

bce(4) panics, 9.2rc1

2013-07-24 Thread Sean Bruno
Running 9.2 in production load mail servers. We're hitting the watchdog message and crashing with the stable/9 version. We're reverting the change from 2 weeks ago and seeing if it still happens. We didn't see this from stable/9 from about a month ago. Sean ref:

Re: bce(4) panics, 9.2rc1

2013-07-24 Thread Sean Bruno
On Wed, 2013-07-24 at 14:07 -0700, Sean Bruno wrote: Running 9.2 in production load mail servers. We're hitting the watchdog message and crashing with the stable/9 version. We're reverting the change from 2 weeks ago and seeing if it still happens. We didn't see this from stable/9 from about

Re: bce(4) panics, 9.2rc1

2013-07-24 Thread hiren panchasara
On Wed, Jul 24, 2013 at 2:23 PM, Sean Bruno sean_br...@yahoo.com wrote: On Wed, 2013-07-24 at 14:07 -0700, Sean Bruno wrote: Running 9.2 in production load mail servers. We're hitting the watchdog message and crashing with the stable/9 version. We're reverting the change from 2 weeks ago and

Re: bce(4) panics, 9.2rc1

2013-07-24 Thread Steven Hartland
- Original Message - From: Sean Bruno sean_br...@yahoo.com As a guess, it's likely the interrupt handler is triggering while the watchdog timeout handler is re-initialising the card, so you get inconsistent state resulting in the crash. Info from /var/crash should help determine the cause and