On 2017-02-25 17:44-0800 Alan W. Irwin wrote:

> However, I certainly agree mutual use of the same resource (shared
> memory) is a tricky world. And now that you have encouraged me to
> think about races, I discovered there is indeed a race condition that
> could explain this bug. I have now worked around that race (commit
> 4e6932e) and please see that commit message for more commentary
> concerning this type of race. Assuming I really did understand this
> race, I am virtually positive my simple crude fix will deal with it
> without any noticeable reduction in speed. However, time will tell about
> that.
Hi Phil:

I think the 10 ms sleep used in the above crude workaround would likely always work going forward, because it would be pretty unusual for the OS scheduler not to give a process access to the CPU for the time it takes to execute roughly 10 million instructions. Nevertheless, that argument does depend on processor speed and on assumptions about scheduler details. So, having thought a lot more about this, I would far prefer to avoid sleep workarounds for race conditions, not only on these grounds but also simply as a matter of good IPC style.

Therefore, I plan to turn the current two-semaphore approach into a three-semaphore approach. m_wsem and m_rsem will continue to be used for the details of a complete transfer of an array of bytes, but an additional m_tsem semaphore (where "t" stands for transfer) will be used so that only one such transfer of bytes can be done at a given time. As far as I can tell, this change means I can completely drop the moveBytesReaderReversed variant of moveBytesWriter and the moveBytesWriterReversed variant of moveBytesReader, which is a really nice simplification.

Furthermore, I plan to rename moveBytesWriter to transmitBytes and moveBytesReader to receiveBytes. Both transmitBytes and receiveBytes will be used by either -dev wxwidgets or wxPLViewer as needed, depending simply on the direction of data flow. The additional m_tsem semaphore will be initialized to 1; transmitBytes will start by calling sem_wait on that semaphore and will end by calling sem_post on that semaphore.
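To make the idea concrete, here is a rough sketch of how such a three-semaphore transfer could look. It is only an illustration under stated assumptions, not the PLplot code: the Area struct, the tiny CHUNK size, the use of two threads in one process instead of two processes sharing memory, and the exact handshake (the transmitter posts m_wsem after writing a chunk and then waits on m_rsem for the receiver's acknowledgement, with both initialized to 0) are all assumptions made for the example. Only the names m_wsem, m_rsem, m_tsem, transmitBytes and receiveBytes, and the rule that m_tsem starts at 1 and brackets transmitBytes, come from the description above.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <string.h>

#define CHUNK   8   /* deliberately tiny so each message needs several chunks */
#define MSG_LEN 20

/* Hypothetical stand-in for the shared memory area. */
typedef struct {
    sem_t m_wsem;   /* posted by the transmitter after it writes a chunk */
    sem_t m_rsem;   /* posted by the receiver after it reads a chunk */
    sem_t m_tsem;   /* held for the duration of one whole transfer */
    char  buffer[CHUNK];
} Area;

static void areaInit(Area *a)
{
    /* pshared = 0: shared between threads of one process (demo only). */
    int rc = sem_init(&a->m_wsem, 0, 0);
    rc |= sem_init(&a->m_rsem, 0, 0);
    rc |= sem_init(&a->m_tsem, 0, 1);   /* 1: one transfer at a time */
    assert(rc == 0);
    (void) rc;
}

static void transmitBytes(Area *a, const char *src, size_t n)
{
    sem_wait(&a->m_tsem);               /* block while another transfer runs */
    while (n > 0) {
        size_t len = n < CHUNK ? n : CHUNK;
        memcpy(a->buffer, src, len);
        sem_post(&a->m_wsem);           /* chunk is ready to be read */
        sem_wait(&a->m_rsem);           /* wait until the receiver has read it */
        src += len;
        n -= len;
    }
    sem_post(&a->m_tsem);               /* every chunk has been acknowledged */
}

static void receiveBytes(Area *a, char *dest, size_t n)
{
    while (n > 0) {
        size_t len = n < CHUNK ? n : CHUNK;
        sem_wait(&a->m_wsem);           /* wait for a chunk to be written */
        memcpy(dest, a->buffer, len);
        sem_post(&a->m_rsem);           /* acknowledge the chunk */
        dest += len;
        n -= len;
    }
}

static const char request[MSG_LEN] = "data from viewer";
static const char reply[MSG_LEN]   = "data from device";

typedef struct { Area *area; char got[MSG_LEN]; } Side;

static void *viewerSide(void *arg)      /* plays the role of wxPLViewer */
{
    Side *s = arg;
    transmitBytes(s->area, request, MSG_LEN);
    receiveBytes(s->area, s->got, MSG_LEN);
    return NULL;
}

static void *deviceSide(void *arg)      /* plays the role of -dev wxwidgets */
{
    Side *s = arg;
    receiveBytes(s->area, s->got, MSG_LEN);
    transmitBytes(s->area, reply, MSG_LEN);
    return NULL;
}

/* Runs one send-and-reply exchange; returns 0 if both messages arrive intact. */
int roundTripDemo(void)
{
    Area area;
    Side viewer = { &area, {0} }, device = { &area, {0} };
    pthread_t tv, td;

    areaInit(&area);
    pthread_create(&tv, NULL, viewerSide, &viewer);
    pthread_create(&td, NULL, deviceSide, &device);
    pthread_join(tv, NULL);
    pthread_join(td, NULL);
    return memcmp(device.got, request, MSG_LEN) != 0
        || memcmp(viewer.got, reply, MSG_LEN) != 0;
}
```

Calling roundTripDemo() from a main function (compiled with -pthread where the platform needs it) exercises one exchange in each direction. In this sketch, without m_tsem the device side could start its reply right after posting its last acknowledgement on m_rsem and then consume that very post itself in transmitBytes; waiting on m_tsem first rules that out, because the viewer side releases m_tsem only after it has consumed the acknowledgement.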
That simple change means that if wxPLViewer uses transmitBytes to send data that is received by -dev wxwidgets with receiveBytes, and -dev wxwidgets then follows up by calling transmitBytes to send data back that is received by wxPLViewer with a call to receiveBytes, that second use of transmitBytes will be halted by the sem_wait until the first use of transmitBytes is entirely completed. In other words, a call to transmitBytes by either side of the IPC connection cannot possibly race with a previous call to that routine by either side.

Anyhow, I like this pure semaphore way to avoid the race condition much more than the 10 ms sleep, and I hope to get it completely implemented tomorrow.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and
Astronomy, University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________