2011/8/12 Andrew Boyer <abo...@averesystems.com>:
> Re: panic: bufwrite: buffer is not busy??? (originally on freebsd-net)
> Re: debugging frequent kernel panics on 8.2-RELEASE (originally on 
> freebsd-stable)
> Re: System hang in USB umass module while processing panic  (originally on 
> freebsd-usb)
>
> Hello Andriy and Hans,
>
> Sorry for tying in so many discussions on this topic, but I think I have an 
> explanation for the problems we have been reporting* with hanging coredumps 
> on multicore systems on 8.2-RELEASE, and it has implications for Andriy's 
> proposed scheduler patch** and for USB.
>
> In today's 8.X and 9.X branches, nothing that I can find stops the other CPUs 
> when the kernel panics, but many parts of the locking code get disabled (grep 
> on 'panicstr').  The 'bufwrite: buffer is not busy???' panic is caused by the 
> syncer encountering an error.  If that happens when it's on the dumping CPU 
> everything hangs.  If it's running on a different CPU, it will be blocked and 
> hidden by the panic_cpu spinlock in panic(), and the dump continues, polling 
> every attached keyboard for a Ctrl-C.
>
> But, the new 8.X USB stack relies on multithreading.  (The new stack is the 
> variable that broke coredumps for us in the 7.1->8.2 transition, I think.)  
> SVN 224223 fixes a hang that would happen when dumpsys() polls the USB 
> keyboard (IPMI KVM, in our case).  That helps, but it only gets as far as 
> usb_process(), where it hangs in a loop around a cv_wait() call.  This is 
> easy to reproduce by adding code to the watchdog to break into the debugger 
> if panicstr is set.
>
> I am experimenting with Andriy's patch** to stop the scheduler and it seems 
> to be most of the way there, stopping the CPUs and disabling the rest of 
> locking.  There are a few places that still reference panicstr, but that's 
> minor.  These are the changes I made to the patch:
>  * Changed ukbd_do_poll() to return immediately if SCHEDULER_STOPPED() is 
> true, so that we don't hang up in USB.  ukbd_yield() locks up in 
> DROP_GIANT(), and if you skip ukbd_yield(), usbd_transfer_poll() locks up 
> trying to drop mutexes.
>  * Changed the call to spinlock_enter() back to critical_enter(), so that 
> interrupts stay enabled and the hardclock still functions.

Which spinlock_enter() are you referring to here?
I think that having fast interrupt handlers running during
panic/shutdown is something we should avoid like hell.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
