Re: SCHED_ASSERT_UNLOCKED is considered harmful in the _kernel_lock()

Mark Kettenis Mon, 08 Nov 2010 03:04:17 -0800

> Date: Sun, 7 Nov 2010 16:01:05 -0800
> From: Philip Guenther <[email protected]>
> 
> On Sunday, November 7, 2010, Mark Kettenis <[email protected]> wrote:
> >> Date: Fri, 5 Nov 2010 17:52:23 +0100
> >> From: Mike Belopuhov <[email protected]>
> >
> > Mike, you might want to take a look at PR 6508.  I think the
> > "sched_lock" panic:
> >
> >> ddb{0}> show panic
> >> kernel diagnostic assertion "__mp_lock_held(&sched_lock) == 0" failed:
> >
> > is actually a side effect of trapping in the middle of a context
> > switch when we're doing the sched_lock/kernel_lock dance.  In PR 6508
> > it is almost certainly a page fault that happened because we
> > overflowed the stack.  That also might be the cause of your panic.  At
> > least judging from the traceback, your stack is seriously hosed:
> 
> Since i386 and amd64 put the kernel stack above the PCB, stack overrun
> means the PCB has already been overwritten.  At that point, trying to
> save the process context will probably blow up trying to follow some
> pointer therein.
> 
> I had chatted some with Theo about putting a guard page below the
> kernel stack to catch this sort of thing.  Would want to move the PCB
> to above the stack at the same time to save most of a page.  Would the
> result help isolate these problems enough to be worth the effort?


If you're considering changing the way we allocate the PCB, we should
look into moving it off the stack altogether and allocating it from a
pool.  That would make life simpler on hppa, since pools are mapped
1:1, which means issues with non-equivalent aliases go away.
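
To make the layout argument concrete, here is a rough user-space sketch
of the two arrangements discussed in this thread.  It is only a model,
not code from the tree: struct pcb, struct karea, AREA_SIZE, the
karea_*() helpers and pcb_pool_get() are all made-up names, and the
array-backed "pool" merely stands in for a real kernel pool.  In the
inline layout the PCB sits at the lowest addresses with the stack
growing down towards it, so the first byte of overrun lands in the PCB.
In the pooled layout the PCB lives elsewhere, and running off the bottom
of the area is where a guard page would trap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AREA_SIZE 128		/* hypothetical size of the per-process area */
#define PCB_POOL_SIZE 4		/* hypothetical number of pooled PCBs */

/* Hypothetical, heavily simplified PCB. */
struct pcb {
	unsigned long magic;
	unsigned long saved_sp;
};

/* Model of the kernel stack area; sp is an index that grows towards 0. */
struct karea {
	unsigned char bytes[AREA_SIZE];
	size_t sp;
};

/* Inline layout: PCB copied to the lowest addresses, stack above it. */
static void
karea_init_inline(struct karea *k, const struct pcb *p)
{
	memset(k->bytes, 0, sizeof(k->bytes));
	memcpy(k->bytes, p, sizeof(*p));
	k->sp = AREA_SIZE;
}

/* Pooled layout: the whole area is stack, the PCB lives elsewhere. */
static void
karea_init_pooled(struct karea *k)
{
	memset(k->bytes, 0, sizeof(k->bytes));
	k->sp = AREA_SIZE;
}

/*
 * Push n bytes with no check against the PCB, as a real stack would.
 * Returns the number of bytes written; fewer than n means we reached
 * the bottom of the area, which is where a guard page would fault.
 */
static size_t
karea_push(struct karea *k, size_t n)
{
	size_t done = 0;

	while (done < n && k->sp > 0) {
		k->bytes[--k->sp] = 0xAA;
		done++;
	}
	return done;
}

/* Check whether the inline PCB still matches what was stored. */
static int
karea_inline_pcb_intact(const struct karea *k, const struct pcb *expected)
{
	return memcmp(k->bytes, expected, sizeof(*expected)) == 0;
}

/* Stand-in for a pool allocator: hands out PCBs from a static array. */
static struct pcb pcb_pool[PCB_POOL_SIZE];
static size_t pcb_pool_used;

static struct pcb *
pcb_pool_get(void)
{
	if (pcb_pool_used >= PCB_POOL_SIZE)
		return NULL;
	return &pcb_pool[pcb_pool_used++];
}
```

In this model, pushing exactly AREA_SIZE - sizeof(struct pcb) bytes
leaves the inline PCB untouched and one more byte corrupts it, while a
PCB obtained from pcb_pool_get() is unaffected however far the stack
runs.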
