-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 4/28/2012 1:53 PM, John Morris wrote:
<snip>
> From the debugging side, neither of my tacks have gotten very far.
> I've never done anything terribly difficult with C, gdb or
> assembly before, so I've been reading docs at every step.

I'm in the same boat: I don't typically code on Linux or debug with
gdb. Normally, I'm writing VHDL (for Altera FPGAs) and debugging via a
few blinking LEDs (and by staring at the code for a long time! :).

> The two tacks:
>
> - Use the debugger to understand why test_and_set_bit returns 0. I
>   don't know what the 'tsbbl' instruction is, and haven't figured
>   out how to examine memory yet.
>
> - Find some way to log function calls and argument values as the
>   execution progresses. I hope to (dis)confirm that the pre-problem
>   and problem versions are taking the same path up to the point
>   where the problem version starts spinning. Stepping through with
>   gdb is, of course, too painful. I read that valgrind might be
>   able to help with this, but it doesn't seem to be a common usage.
>   Charles, have you heard of using valgrind for that purpose?
>

I wasn't able to get anywhere easily with gdb, so I switched to the
GUI front-end program nemiver. It's been pretty easy to use so far,
but I'm sure I'm missing out on some of what gdb is capable of. It is
reasonably easy to monitor memory and the call stack (at least most of
the time...I've had times where the call stack appears empty even
though there should be something there).

As for valgrind, the online manual (not the man page) describes a few
tools aimed at multi-threaded code that look for potential race
conditions and deadlocks. I haven't tried running any of these yet, as
they appear to assume you are using the standard library mutex
functions, while the linuxcnc code in question uses custom locking
functions (which you can 'teach' valgrind about, but that is beyond my
current skill level). Also, I doubt there is a serious problem with
the locking code, as it is the same code that is used successfully
without the linux-rtapi patch.
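
On the test_and_set_bit question, here is a minimal sketch of how a
test-and-set spinlock usually behaves. This is NOT the linuxcnc code:
the sketch_* names are made up for illustration, it uses the GCC/Clang
__atomic builtins instead of inline assembly, and it assumes the
conventional semantics where the function returns the bit's previous
value. Under that assumption, a return of 0 means the bit was clear
and the caller now holds the lock, so a spin only makes sense while
the return value is nonzero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a test_and_set_bit-style primitive:
 * atomically set bit `nr` in *word and return its previous value. */
static inline int sketch_test_and_set_bit(int nr, unsigned long *word)
{
    unsigned long mask = 1UL << nr;
    unsigned long old = __atomic_fetch_or(word, mask, __ATOMIC_SEQ_CST);
    return (old & mask) != 0;
}

/* Try once to take the lock kept in bit 0; true means we now own it. */
static inline bool sketch_mutex_try(unsigned long *mutex)
{
    return sketch_test_and_set_bit(0, mutex) == 0;
}

/* Spin until the lock is free. If bit 0 is never cleared (for example
 * because the lock word was overwritten), this loop never exits. */
static inline void sketch_mutex_get(unsigned long *mutex)
{
    while (sketch_test_and_set_bit(0, mutex) != 0) {
        /* busy-wait */
    }
}

/* Release the lock by clearing bit 0. */
static inline void sketch_mutex_give(unsigned long *mutex)
{
    unsigned long old = __atomic_fetch_and(mutex, ~1UL, __ATOMIC_SEQ_CST);
    assert(old & 1UL); /* releasing a lock that is not held is a bug */
    (void)old;
}
```

If the real code follows the same convention, a thread stuck in the
equivalent of sketch_mutex_get would point at a lock word whose bit 0
is set and never cleared, which is worth checking by examining that
word in the debugger.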

Instead, it seems like there is some form of memory corruption
happening, which is why I pulled out valgrind to begin with (or at
least that's what googling "crash in malloc" leads me to believe). I
suspect the change in how message strings are handled has caused some
portion of memory to get corrupted or overwritten, but I still don't
understand the code well enough to figure out exactly what's going
wrong. I think I need to step back and look at the program flow
instead of the diffs between versions, and see when anything related
to messaging might be running and causing a problem.

- -- 
Charles Steinkuehler
char...@steinkuehler.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk+cVtAACgkQLywbqEHdNFxhbwCgpVT0jUjE0Ze5wum2Tr88gMI/
6KYAn04mouf8z4tiJtwTWdRLdp8s89js
=EJZ0
-----END PGP SIGNATURE-----

_______________________________________________
Emc-developers mailing list
Emc-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/emc-developers