I just spent a little over a day debugging a stack overflow problem in
mdb itself.  The cause was fairly simple -- I'd added a new dcmd, and
one of its functions had a structure on the stack that turned out to
be unexpectedly _huge_ (512K+) -- but the symptoms were misleading.
I saw panics that looked like this:

kmdb ABORT: "../common/umem.c", line 1264: assertion failed: sp->slab_cache == cp
Debugger aborted
Program terminated
{2} ok boot

It turns out that allocating big things on the stack inside mdb can be
somewhat toxic.

I fixed my problem by allocating the offending structure with
mdb_alloc, but that raises a question: are there other instances of this
problem hiding in here?  Could this be near the root of weird problems
like CR 6766866?
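
For illustration, here is a minimal sketch of the kind of change I mean.
It is not the actual dcmd: big_state_t and walk_thing() are hypothetical
names, and calloc/free stand in for the mdb module allocator so that the
fragment builds outside of mdb.  In a real module the pair would be
mdb_zalloc(sizeof (*bs), UM_SLEEP) and mdb_free(bs, sizeof (*bs)).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the oversized structure (512K+). */
typedef struct big_state {
	char bs_buf[512 * 1024];
} big_state_t;

/*
 * Hypothetical dcmd-style helper.  The structure used to be an auto
 * variable ("big_state_t bs;"), which put 512K on the stack.  Here it
 * is allocated from the heap instead and released before returning.
 * calloc/free are stand-ins (an assumption for this sketch); inside an
 * mdb module use mdb_zalloc/mdb_free.
 */
static int
walk_thing(const char *name)
{
	big_state_t *bs = calloc(1, sizeof (*bs));
	int len;

	if (bs == NULL)
		return (-1);

	/* Use the buffer exactly as before, just through a pointer. */
	(void) snprintf(bs->bs_buf, sizeof (bs->bs_buf), "%s", name);
	len = (int)strlen(bs->bs_buf);

	free(bs);
	return (len);
}
```

With this layout the function's stack frame holds only a pointer and an
int, regardless of how large big_state_t grows.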

It seems to me that the compiler must (obviously) know how much
storage it's reserving for auto variables.  Is there any way to find
this out and enforce a limit?  That wouldn't fix the problem of
nesting too deeply (or just recursing), but it'd at least catch
obvious blunders before they turn into lengthy trials.

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 35 Network Drive        71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677
