On 11/1/12 2:25 AM, Eugene Grosbein wrote:
On 31.10.2012 23:58, Charles Owens wrote:
Hello,

We're seeing boot-time panics in about 4% of cases when upgrading from
FreeBSD 8.1 to 8.3-RELEASE (i386).  This problem is subtle enough that
it escaped detection during our regular testing cycle... now with over
100 systems upgraded we're convinced there's a real issue.  Our kernel
config is essentially PAE (i.e. static modules ... with a few drivers
added/removed).  The hardware is Intel Server System SR1625UR.

This appears to match a finding discussed in these threads, having to do
with timing of initialization of the igb(4)-based NICs (if I'm
understanding it properly):

http://lists.freebsd.org/pipermail/freebsd-stable/2011-May/062596.html
http://lists.freebsd.org/pipermail/freebsd-stable/2011-June/062949.html
http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063867.html
http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063958.html


These threads include some potential patches and possibility of
commit/MFC... but it isn't clear that there was ever final resolution
(and MFC to 8-stable).  I've cc'd a few folks from back then.

A real challenge here is the frequency of occurrence. As mentioned, it
only hits a fraction of our systems.  When it _does_ hit, the system
may enter a reboot loop for days and then mysteriously break out of
it... and thereafter seem to work fine.

I'd be very grateful for any help.  Some questions:

   * Was there ever a final "blessed" patch?
       o if so, will it apply to RELENG_8_3?
   * Is there anything that could be said that might help us with
     reproducing-the-problem / testing / validating-a-fix?


Panic message is --

panic: m_getzone: m_getjcl: invalid cluster type
cpuid = 0
KDB: stack backtrace:
#0 0xc059c717 at kdb_backtrace+0x47
#1 0xc056caf7 at panic+0x117
#2 0xc03c979e at igb_refresh_mbufs+0x25e
#3 0xc03c9f98 at igb_rxeof+0x638
#4 0xc03ca135 at igb_msix_que+0x105
#5 0xc0541e2b at intr_event_execute_handlers+0x13b
#6 0xc05434eb at ithread_loop+0x6b
#7 0xc053efb7 at fork_exit+0x97
#8 0xc0806744 at fork_trampoline+0x8
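
For readers unfamiliar with this panic string: m_getjcl() asks m_getzone() to map the requested cluster size to an allocator zone, and any size other than the four standard cluster sizes ends in the panic above. The following is a minimal user-space sketch of that size check, not the kernel source. The enum and function names are hypothetical, the constants are the usual i386 values, and how igb_refresh_mbufs came to pass an unexpected size is not established in this message.

```c
#include <assert.h>

/* Standard mbuf cluster sizes on FreeBSD/i386 (PAGE_SIZE is 4096 there). */
#define MCLBYTES     2048
#define MJUMPAGESIZE 4096
#define MJUM9BYTES   (9 * 1024)
#define MJUM16BYTES  (16 * 1024)

/* Hypothetical stand-in for the kernel's allocator zone pointers. */
enum zone_model {
	ZONE_INVALID,
	ZONE_CLUST,
	ZONE_JUMBOP,
	ZONE_JUMBO9,
	ZONE_JUMBO16
};

/*
 * Model of the size-to-zone lookup. Where this model returns
 * ZONE_INVALID, the real kernel calls panic() with the
 * "invalid cluster type" message.
 */
enum zone_model
zone_for_size_model(int size)
{
	switch (size) {
	case MCLBYTES:
		return ZONE_CLUST;
	case MJUMPAGESIZE:
		return ZONE_JUMBOP;
	case MJUM9BYTES:
		return ZONE_JUMBO9;
	case MJUM16BYTES:
		return ZONE_JUMBO16;
	default:
		return ZONE_INVALID;
	}
}

/* Small self-check: a zero (e.g. not yet initialized) size is rejected. */
void
zone_model_self_check(void)
{
	assert(zone_for_size_model(MCLBYTES) == ZONE_CLUST);
	assert(zone_for_size_model(0) == ZONE_INVALID);
}
```

In this model, any caller that supplies a size outside the four listed values, including zero, reaches the default branch, which corresponds to the panic path in the backtrace.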

Thanks very much,

Charles
Take a look at http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/172113
It contains the fix, and a follow-up message there describes a simple
workaround that does not involve any patching.

Eugene Grosbein


Eugene, thanks very much for the pointer. This is definitely what we were looking for! -- Charles


Charles Owens
Great Bay Software, Inc.


_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
