On 4/28/2012 2:53 PM, John Morris wrote:
> Hi Charles,
>
>> I have been unable to figure out the problem, but I can replicate your
>> results here. The previous commit works, while this one fails.
>>
>> I did try running under valgrind and with MALLOC_CHECK_ set to 3, but
>> didn't get anything useful (to my eyes, anyway). Interestingly, I did
>> seem to get slightly different behavior when running under valgrind
>> with the memorytest, but nothing that really points a finger at what
>> might be wrong. It's possible there's some timing related problem,
>> but I suspect what's happening is valgrind is changing the system
>> behavior with respect to the task that crashes.
>>
>> Mostly I just wanted to report that I can replicate John's results on
>> Debian, even though I haven't made any other progress.
> Interesting tests. Here's something new I'm having trouble explaining.
> Maybe someone else can figure out what's going on.
>
> I've been running gdb on halcmd. While running latency-test, make a
> copy of the /tmp/hal.lat.foo directory into /tmp/hal.lat, and stop the
> test. Then:
>
>     . scripts/env-environment
>     cd /tmp/hal.lat
>     halcmd -f lat.hal
>     # if you're running the pre-problem version, stop the test and
>     halcmd unload all
>     # if not, go kill -9 the mess
>
> This works nicely to reproduce the problem, except it does something
> weird for me:
>
> Run this in the pre-problem version. Should work well.
>
> Then fix your PATH to point at the problem version.
>
>     halcmd -f lat.hal
>
> Now go clean up the mess.
>
> Fix your PATH again to point at the pre-problem version.
>
>     halcmd -f lat.hal
>
> Stops working for me. I'm probably missing something obvious, but I've
> carefully compared the environments and ps lists before/after every run
> and there are no differences that seem important. If I kill the shell
> and start over, the pre-problem version will begin working again.
>
> From the debugging side, neither of my tacks have gotten very far.
> I've never done anything terribly difficult with C, gdb or assembly
> before, so I've been reading docs at every step. The two tacks:
>
> - Use the debugger to understand why test_and_set_bit returns 0. I
> don't know what the 'tsbbl' instruction is, and haven't figured out how
> to examine memory yet.

While I don't know the details of a tsbbl in this context, in general,
test and set instructions are used to implement mutual exclusion. So
typically, a test and set instruction will set a flag (or return a
value) corresponding to the current state of (in this case) a bit AND
then set the bit. This is done *atomically*. That simply means that it
cannot be interrupted, and that two processors running it on the same
bit at the same time cannot both see the bit as clear. (Remember, this
stuff is supposed to work in a multi-processor, multi-core system.)
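To make the idea concrete, here is a minimal sketch in portable C11.
This is NOT the code from the tree being debugged; the names
sketch_test_and_set_bit, sketch_clear_bit and sketch_lock are made up
for illustration, and I'm assuming C11 <stdatomic.h> is available. It
only shows the general pattern: fetch the old value and set the bit in
one indivisible step, then report whether the bit was already set.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper (not the project's implementation): atomically set
 * bit `nr` in `*word` and return true if that bit was already set. */
bool sketch_test_and_set_bit(unsigned nr, atomic_ulong *word)
{
    unsigned long mask = 1UL << nr;
    /* atomic_fetch_or returns the value the word held BEFORE the OR,
     * and the read-modify-write happens as one indivisible operation. */
    unsigned long old = atomic_fetch_or(word, mask);
    return (old & mask) != 0;
}

/* Hypothetical helper: atomically clear bit `nr` (release the flag). */
void sketch_clear_bit(unsigned nr, atomic_ulong *word)
{
    atomic_fetch_and(word, ~(1UL << nr));
}

/* Hypothetical spin lock built on bit 0: keep retrying while the bit
 * was already set, i.e. while somebody else holds the lock. */
void sketch_lock(atomic_ulong *word)
{
    while (sketch_test_and_set_bit(0, word)) {
        /* spin until the holder clears the bit */
    }
}
```

In a scheme like this, a return of 0 means the bit was clear beforehand,
so the caller now owns it; a caller that keeps getting non-zero back
will spin until someone calls the clear function.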
In a brief Google search for the tsbbl instruction, I found some
references that seem to imply that the 386 doesn't have a single
dedicated instruction for this. Instead, it is implemented by putting a
"lock prefix" in front of some other instruction. Someone who does 386
assembly might be able to tell you what the previous sentence actually
means. I'm just quoting some stuff I saw on the net.

Ken

>
> - Find some way to log function calls and argument values as the
> execution progresses. I hope to (dis)confirm that the pre-problem and
> problem versions are taking the same path up to the point where the
> problem version starts spinning. Stepping through with gdb is, of
> course, too painful. I read that valgrind might be able to help with
> this, but it doesn't seem to be a common usage. Charles, have you heard
> of using valgrind for that purpose?
>
> That's all I've got.
>
> John

_______________________________________________
Emc-developers mailing list
Emc-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/emc-developers