-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 4/28/2012 1:53 PM, John Morris wrote:
<snip>
> From the debugging side, neither of my tacks have gotten very far.
>  I've never done anything terribly difficult with C, gdb or 
> assembly before, so I've been reading docs at every step.


I'm in the same boat: I don't typically code on Linux or debug with
gdb.  Normally, I'm writing VHDL (for Altera FPGAs) and debugging via
a few blinking LEDs (and by staring at the code for a long time! :).

> The two tacks:
> 
> - Use the debugger to understand why test_and_set_bit returns 0. I 
> don't know what the 'tsbbl' instruction is, and haven't figured
> out how to examine memory yet.
> 
> - Find some way to log function calls and argument values as the 
> execution progresses.  I hope to (dis)confirm that the pre-problem
>  and problem versions are taking the same path up to the point 
> where the problem version starts spinning.  Stepping through with 
> gdb is, of course, too painful.  I read that valgrind might be
> able to help with this, but it doesn't seem to be a common usage. 
> Charles, have you heard of using valgrind for that purpose?
> 


I wasn't able to get anywhere easily with gdb, so I switched to the
GUI front-end program nemiver.  It's been pretty easy to use so far,
but I'm sure I'm missing out on some of what gdb is capable of.  It is
reasonably easy to monitor memory and the call stack (at least most of
the time...I've had times where the call stack appears empty even
though there should be something there).

As for valgrind, the on-line manual (not the man page) describes a few
tools (Helgrind and DRD) geared toward multi-threaded code, to find
potential race conditions and deadlocks.  I haven't tried running any
of these yet, as they appear to assume you are using the standard
library mutex functions, while the linuxcnc code in question uses
custom locking functions (which you can 'teach' valgrind about, but
that is beyond my current skill level).

Also, I doubt there is a serious problem with the locking code, as it
is the same code that is used successfully without the linux-rtapi
patch.  Instead, it seems like there is some form of memory corruption
happening, which is why I pulled out valgrind to begin with (or at
least that's what googling "crash in malloc" leads me to believe).  I
suspect the change in how message strings are being handled has caused
some portion of memory to get corrupted or overwritten, but I still
don't understand the code well enough to start figuring out exactly
what's going wrong.  I think I need to step back and start looking at
the program flow instead of the diffs between versions, and see when
anything related to messaging might be running and causing a problem.

- -- 
Charles Steinkuehler
char...@steinkuehler.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk+cVtAACgkQLywbqEHdNFxhbwCgpVT0jUjE0Ze5wum2Tr88gMI/
6KYAn04mouf8z4tiJtwTWdRLdp8s89js
=EJZ0
-----END PGP SIGNATURE-----

_______________________________________________
Emc-developers mailing list
Emc-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/emc-developers
