"Fence-Free Work Stealing on Bounded TSO Processors" ASPLOS '14
https://www.cs.technion.ac.il/people/mad/online-publications/asplos2014-ffwsq.pdf

Abstract
Work stealing is the method of choice for load balancing
in task parallel programming languages and frameworks.
Yet despite considerable effort invested in optimizing work
stealing task queues, existing algorithms issue a costly memory
fence when removing a task, and these fences are believed
to be necessary for correctness.
This paper refutes this belief, demonstrating work stealing
algorithms in which a worker does not issue a memory
fence for microarchitectures with a bounded total store ordering
(TSO) memory model. Bounded TSO is a novel restriction
of TSO – capturing mainstream x86 and SPARC
TSO processors – that bounds the number of stores a load
can be reordered with.
Our algorithms eliminate the memory fence penalty, improving
the running time of a suite of parallel benchmarks
on modern x86 multicore processors by 7%􀀀11% on average
(and up to 23%), compared to the Cilk and Chase-Lev
work stealing queues.

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"Scalable Synchronization Algorithms" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/lock-free/CALV1V4-F-HdC4qpq%2BRkK4sfetD8CGpRNQQG%3DWzAV%2ByfE7EeUmg%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to