Hi Charles,

Are you checking for spurious wakeups in your threads? That is, using a
while loop and a custom flag, like:

    pthread_mutex_lock(&t_lock);
    while (!t_cond_flag) {
        err = pthread_cond_wait(&t_cond, &t_lock);
        if (err) {
            /* whatever */
        }
    }
    /* do work */
    t_cond_flag = 0;
    pthread_mutex_unlock(&t_lock);

The signalling thread has to set t_cond_flag while holding t_lock, before
it calls pthread_cond_signal.

If you're not checking for these "spurious" wakeups, pthread_cond_wait can
return zero without a signal actually having been sent. The mutex is still
re-locked when it returns, so the unlock itself is fine, but without the
loop the thread would go ahead and do its work on data that isn't ready,
which would lead to very undefined behavior. Not sure if this can happen
with Linux. The Solaris man page warns about it, as does Stevens.

Bill

----- Original Message -----
From: "Charles Lockhart" <[EMAIL PROTECTED]>
To: "LUAU" <[EMAIL PROTECTED]>
Sent: Sunday, January 05, 2003 8:07 AM
Subject: [luau] kicking a threads butt into action

> I've got an application that consists of three threads:
>
> Thread 1 services a device on the PCI bus, basically reading data from
> the device into buffers in memory.  After it's filled a buffer, it calls
> pthread_cond_signal to start up thread 2, then it goes to sleep for
> greater than a set period.  This thread seems to be working fine.
>
> Thread 2, when woken up, confirms that the data in the buffers is ready
> to process, processes the data into a different set of buffers, calls
> pthread_cond_signal to wake up thread 3, checks to see if there is any
> more work to be done, then if not, goes back to sleep.  This thread also
> seems to be working fine.
>
> Thread 3, when woken up, writes the data to a file, checks to see if
> there's any more work to be done, if not goes back to sleep.  This
> thread is causing me problems.  Every once in a while it just stops
> working.  I've verified that thread 2's calling of pthread_cond_signal
> isn't returning an error.  Also, if I throw in printf's for debugging
> (into thread 3), the problem seems to go away, though for some reason I
> have this idea that maybe it just starts happening less often, but that
> could be the paranoia talking.
>
> So, like, this thread, thread 3, just up and quits doing anything.
> If it was just taking a long time to do the stuff, I could probably figure
> it out, but it just seems to not be doing anything at all.  Do I need to
> schedule this sucker myself, or what?  Anybody?
>
> Uh, this is RH7.2, tested running both kernel 2.4.18-3 and
> 2.4.18-preemptive, the vanilla kernel with the RML preemptive patch.
>
> Thanks,
>
> -Charles
>
> _______________________________________________
> LUAU mailing list
> [EMAIL PROTECTED]
> http://videl.ics.hawaii.edu/mailman/listinfo/luau