to establish a better common understanding, here are some diagrams:

1. the two threads, and the state they share: 
http://static.mah.priv.at/public/stepgen-sharedstate.png . The dashed box 
contains the state which is critical with respect to atomic update and 
consumption.

2. the intended sequencing of events between those threads: 
http://static.mah.priv.at/public/stepgen-normal.png

3. --> under the assumption of arbitrary scheduling  <-- (i.e. not as the code 
stands now), this is one of several possible scenarios where the shared state 
from 1) becomes inconsistent, because target_addval and deltalim, as used by 
make-pulses, do not originate from the same generation of an update-freq run: 
http://static.mah.priv.at/public/stepgen-haywire.png

4. the interaction model I propose, whose correctness is invariant with 
respect to the relative scheduling of the two threads: 
http://static.mah.priv.at/public/stepgen-atomicupdate.png - the key differences 
are: the production of an update tuple {target_addval, deltalim} by 
update-freq and its consumption by make-pulses are both atomic; and the result 
of make-pulses consuming a tuple is held in thread-local storage, as opposed 
to global shared state where it is subject to unsynchronized modification. A 
minimal code sketch of one possible realization follows this list.
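
To make 4) concrete, here is a minimal sketch of one way the atomic 
produce/consume could be realized: a seqlock-style generation counter using 
C11 atomics. This is only an illustration of the idea, not necessarily the 
mechanism in the diagram, and all identifiers (publish_update, consume_update, 
shared_target_addval, shared_deltalim) are placeholders, not the actual 
stepgen names:

#include <stdatomic.h>

/* hypothetical names -- not the actual stepgen identifiers */
static _Atomic long shared_target_addval;
static _Atomic long shared_deltalim;
static atomic_uint seq;   /* even: tuple consistent, odd: update in progress */

/* update-freq side: publish one generation of {target_addval, deltalim} */
void publish_update(long target_addval, long deltalim)
{
    atomic_fetch_add(&seq, 1);                 /* seq becomes odd: writer active */
    atomic_store(&shared_target_addval, target_addval);
    atomic_store(&shared_deltalim, deltalim);
    atomic_fetch_add(&seq, 1);                 /* seq becomes even: tuple complete */
}

/* make-pulses side: copy one consistent generation into local variables */
void consume_update(long *target_addval, long *deltalim)
{
    unsigned before, after;
    do {
        before = atomic_load(&seq);
        *target_addval = atomic_load(&shared_target_addval);
        *deltalim      = atomic_load(&shared_deltalim);
        after = atomic_load(&seq);
    } while ((before & 1u) || before != after); /* retry if a write overlapped */
}

make-pulses would call consume_update() once per invocation and work only on 
the local copies for the rest of that run; any state derived from them stays 
local to make-pulses instead of living in the shared block. The memory 
orderings are left at the seq_cst defaults for clarity; a real implementation 
would use whatever primitives the RT environment at hand provides.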


It is important to understand one key aspect here: scheduling policy is what 
currently ensures that make-pulses runs uninterrupted, i.e. to completion 
without being preempted by update-freq. For you hardware types, this is 
technically 'IRQ disabled', but that is not possible in the environment at 
hand, so scheduling policy is used instead. For you software types, it is a 
critical region.
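
To make concrete what that critical region is protecting against (this is the 
scenario of diagram 3), here is a small, deliberately unsynchronized demo. It 
uses ordinary pthreads rather than RT threads, and the names, values and 
iteration count are made up; it only shows that a reader which is allowed to 
interleave arbitrarily with the writer can observe target_addval and deltalim 
from two different generations:

/* build: gcc -O2 -pthread torn-tuple-demo.c -o torn-tuple-demo
 * The shared variables are deliberately left unsynchronized, because
 * that unsynchronized access is exactly what is being demonstrated. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static volatile long shared_target_addval = 0;
static volatile long shared_deltalim = 0;
static volatile bool done = false;

/* stand-in for update-freq: writes generation i into both fields */
static void *producer(void *arg)
{
    (void)arg;
    for (long i = 1; i <= 10 * 1000 * 1000; i++) {
        shared_target_addval = i;   /* a reader scheduled between these two */
        shared_deltalim = i;        /* stores sees a mixed-generation tuple */
    }
    done = true;
    return NULL;
}

/* stand-in for make-pulses: checks that both fields belong to one generation */
static void *consumer(void *arg)
{
    long torn = 0;
    (void)arg;
    while (!done) {
        long t = shared_target_addval;
        long d = shared_deltalim;
        if (t != d)
            torn++;                 /* tuple did not come from one generation */
    }
    printf("inconsistent tuples observed: %ld\n", torn);
    return NULL;
}

int main(void)
{
    pthread_t prod, cons;
    pthread_create(&cons, NULL, consumer, NULL);
    pthread_create(&prod, NULL, producer, NULL);
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return 0;
}

Whether and how often the mismatch shows up depends on the machine and on 
timing, but nothing rules it out; the point of 4) is precisely to remove the 
dependency on 'the consumer is never interleaved with the producer'.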



Also, we need to separate two aspects here: the 'performance' and the 
correctness of a parallel program. As far as I am concerned, we are 
exclusively discussing a correctness issue here.

I can already hear the argument 'but your solution will be slower'. Beyond my 
answer that measuring is preferable to handwaving, it is important to keep in 
mind:

A synchronization model which does not rely on a particular scheduling policy 
for correctness is bound to perform better on the hardware that is around the 
corner, and both Charles and JMK have already alluded to that. But more 
importantly, the current approach will likely not be possible at all in the 
general case of several cores, which may not share a scheduling policy to 
start with. In that case the performance argument is completely dead, because 
correctness cannot be guaranteed in the first place.


- Michael

