On Apr 16, 2012, at 7:24 AM, Tom Browder wrote:

> On Sun, Apr 15, 2012 at 13:50, Tom Browder <[email protected]> wrote:
>> On Sun, Apr 15, 2012 at 08:34, Christopher Sean Morrison <[email protected]> 
>> wrote:
>> ...
>>> There are certainly places and situations where a bomb
>>> might not be caught (such as during the 'catch' or outside the try/catch).
> 
> I've found one possible cause of the problem.  In
> db_tree.s:db_free_tree there is a comment and line:
> 
>    /*
>     * Before recursion, smash the magic number, so that if another
>     * thread tries to free this same tree, they will fail.
>     */
>    tp->magic = (uint32_t)-3;          /* special bad flag */
> 
> and that "special bad flag" looks suspiciously like the
> "Unknown_Magic" number I see in the crash message:
> 
>  ERROR: bad pointer 0x812390: s/b union tree(x91191191),
>  was Unknown_Magic(xfffffffd),
>  file /usr/src/tbrowde/brlcad-svn/brlcad/trunk/src/librt/db_tree.c, line 1458

Excellent, that does narrow down the problem.  The special bad flag basically 
set up a bad-bookkeeping tripwire, and something tripped on it.  Someone's 
accessing an already-freed tree pointer.

> That occurs when trying to free a tree in a catch block.  When I
> commented out the delete tree line in g-nff the crash stopped and we
> got a graceful end.

It could be that the tree was already freed and there are lingering pointers 
that didn't get cleaned up properly or the free-tree block/function has a bug 
causing a double-delete.

> 1.  Shouldn't the special flag be generalized and get a bu macro?

I don't think so.  Only the "real" magic numbers need to go in headers.  If 
it's *anything* other than the one defined for that struct type, it's not valid.

> 2.  Shouldn't a test for the "free flag" be built into the
> db_free_tree function to return silently if the tree is already freed?

That's a possible solution but doesn't sound like the right one.  If a tree 
pointer was freed, it shouldn' t be referenced any longer.  That's the problem.

> 3.  Is there a way to force single-threading for debugging?

The converters are all single-threaded.  They say "This routine must be 
prepared to run in parallel." but they don't actually run in parallel.

> I've checked the other culprits in the TODO list I added and the same
> kind of situation is happening.  I made a local change to db_tree to
> add the test I mentioned in question 2 above and the bomb problems
> went away (except for g-iges which now has another problem but with
> another bad tree magic number [maybe other "special" flags floating
> around?]).
> 
> I also got a successful full regression test afterwards.   Obviously
> the test has wide-spread ramifications so it may not be the right
> thing to do, but it looks promising.

It's still an improvement, so you could go ahead and commit it.  That might 
make the problem more obvious, but it'll probably require walking through the 
state of the tree prior to and during delete.  Either way, delete shouldn't 
bomb.  The generalized fix is probably to not do a bomb check but only free if 
magic == known magic.

Cheers!
Sean


------------------------------------------------------------------------------
For Developers, A Lot Can Happen In A Second.
Boundary is the first to Know...and Tell You.
Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
http://p.sf.net/sfu/Boundary-d2dvs2
_______________________________________________
BRL-CAD Developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/brlcad-devel

Reply via email to