Hi Charles,

> I haven't been able to get the record feature of gdb to do much of
> anything useful, because it keeps stopping on an unsupported instruction
> (0xc5).  I don't know if this is related to the crashing or not.  :-/

I spent a good 10 hours on this and have narrowed things down to 
weirdness as well.  Using the technique I described, I traced the 
segfault into malloc.c.  I had to do a bunch of reading to understand 
malloc's data structures before I could see the problem, and when I 
finally did, it didn't get me any closer to the cause.

When malloc() recycles a free chunk that is bigger than the request, it 
splits the chunk: one part is given to the program, and the remainder is 
put back into a 'bin', a circular doubly-linked list, for later use.  I 
could step back and forth across this statement here:

             remainder->bk = bck;

'bck' has a valid non-NULL value, yet after the assignment remainder->bk 
reads back as NULL.  This isn't an optimizer thing, since the value 
*does* change there, and it stays NULL until later in the program, when 
it causes the segfault.

My only thought about the cause is a bad interaction between gdb record 
and multithreading, which would put me back at square one.  I suppose 
your 'unsupported instruction' problem could be caused by the same kind 
of corrupt pointers into code, but it looks like the record feature 
would have the same limitations.

I did find an old patch to enable gdb record with threads, but I've 
already got my hands full resurrecting old patches.  ;)  Unless anyone 
has some other good idea, I think I hit a dead end with this one.

> I was, however, able to turn up what might be a useful clue using
> electric fence.  I used the .gdbinit example from the debian package,
> which sets two environment variables prior to running rtapi_app:
>
>    set environment EF_PROTECT_BELOW 1
>    set environment LD_PRELOAD /usr/lib/libefence.so.0.0
>
> When run this way, the application *ALWAYS* segfaults on line 131 of
> linux_rtapi.c: the call to rtapi_print_msg in the
> rtapi_reset_pagefault_count function (although it is the vfprintf in
> default_rtapi_msg_handler that is actually causing the segfault).
>
> I still don't know _why_ this is happening, but it feels like at least
> the root issue may be getting closer to the surface.

Ahhh, just in the nick of time to save me from the pit of despair, a 
fresh path.  Thank you Charles!

        John

_______________________________________________
Emc-developers mailing list
Emc-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/emc-developers
