I just spent a little over a day debugging a stack overflow problem in mdb itself. It turned out to be a fairly simple problem -- I'd added a new dcmd, and one of the functions had a structure on the stack that turned out to be unexpectedly _huge_ (512K+) -- but the symptoms of the problem were fairly misleading and unexpected. I saw panics that looked like this:
kmdb ABORT: "../common/umem.c", line 1264: assertion failed: sp->slab_cache == cp
Debugger aborted
Program terminated
{2} ok boot

It turns out that allocating big things on the stack inside mdb can be somewhat toxic.

I fixed my problem by allocating the offending structure with mdb_alloc, but that raises a question: are there other instances of this problem hiding in here? Could this be near the root of weird problems like CR 6766866?

It seems to me that the compiler must (obviously) know how much storage it's reserving for auto variables. Is there any way to find this out and enforce a limit? That wouldn't fix the problem of nesting too deeply (or just recursing), but it'd at least catch obvious blunders before they turn into lengthy trials.

-- 
James Carlson, Solaris Networking <james.d.carlson at sun.com>
Sun Microsystems / 35 Network Drive 71.232W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.496N Fax +1 781 442 1677