On 3/18/2011 7:33 PM, Jonas Drewsen wrote:
Not in Cilk style. Everything just goes to a shared queue. In theory
this could be a bottleneck in the micro parallelism case. However, some
experimentation I did early in the design convinced me that in practice
there's so much other overhead involved in moving work from one
processor to another (cache misses, needing to wake up a thread, etc.)
that, in cases where a shared queue might be a bottleneck, the
parallelism is probably too fine-grained anyhow.
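
To make the granularity point concrete, here is a minimal sketch of a single shared queue feeding several workers. It is written in Python rather than D and is not taken from std.parallelism; `parallel_map`, `work_unit_size` and `num_workers` are hypothetical names chosen for illustration. Each queue entry covers a slice of the input, so the number of trips through the shared queue shrinks as the work unit grows.

```python
import queue
import threading


def parallel_map(fn, items, work_unit_size=100, num_workers=4):
    """Apply fn to every item, handing out work through one shared queue.

    Returns (results, units_enqueued) so the caller can see how many
    queue entries the chosen work unit size produced.
    """
    items = list(items)
    results = [None] * len(items)
    shared = queue.Queue()  # the single queue every worker pulls from

    # Each entry is a (start, stop) slice rather than a single item, so the
    # per-entry cost (locking, waking a thread) is spread over many items.
    for start in range(0, len(items), work_unit_size):
        shared.put((start, min(start + work_unit_size, len(items))))
    units_enqueued = shared.qsize()  # exact here: no worker has started yet

    def worker():
        while True:
            try:
                start, stop = shared.get_nowait()
            except queue.Empty:
                return  # no work left
            for i in range(start, stop):
                results[i] = fn(items[i])

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, units_enqueued
```

With 1000 items, a work unit size of 100 means 10 queue operations, while a work unit size of 1 means 1000. The second case is the one where contention on the shared queue could start to matter, and also the one where the per-item work is most likely too small to be worth moving to another core. std.parallelism exposes a comparable knob as the work unit size argument of `parallel`.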

I guess that work stealing could be implemented without changing the
current interface if evidence shows up that would favor work stealing?

Yes, this would be possible. However, in my experience super fine-grained parallelism is almost never needed to take full advantage of whatever hardware you're running on. Therefore, I'm hesitant to add complexity to std.parallelism to support super fine-grained parallelism, at least without strong justification in terms of real-world use cases. The one thing work stealing (and improvements to the queue in general) has going for it is that it would only make the implementation more complex, not the interface.
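
A rough illustration of that last point, sketched in Python rather than D: the caller-facing function stays the same while the queue behind it changes. Nothing here is std.parallelism code; `SharedQueue`, `StealingQueues` and `run_all` are hypothetical names, and the stealing variant is deliberately simplified (all tasks are submitted before the workers start, and tasks do not spawn further tasks).

```python
import collections
import threading


class SharedQueue:
    """One lock-protected FIFO shared by all workers."""

    def __init__(self, num_workers):
        self._lock = threading.Lock()
        self._tasks = collections.deque()

    def push(self, task):
        with self._lock:
            self._tasks.append(task)

    def pop(self, worker_id):
        with self._lock:
            return self._tasks.popleft() if self._tasks else None


class StealingQueues:
    """One deque per worker; an idle worker steals from another's deque."""

    def __init__(self, num_workers):
        self._locks = [threading.Lock() for _ in range(num_workers)]
        self._deques = [collections.deque() for _ in range(num_workers)]
        self._next = 0

    def push(self, task):
        # Round-robin placement; called only from the submitting thread.
        i = self._next
        self._next = (i + 1) % len(self._deques)
        with self._locks[i]:
            self._deques[i].append(task)

    def pop(self, worker_id):
        n = len(self._deques)
        # Take the newest task from the worker's own deque first, then try
        # to steal the oldest task from each of the other deques in turn.
        for offset in range(n):
            victim = (worker_id + offset) % n
            with self._locks[victim]:
                d = self._deques[victim]
                if d:
                    return d.pop() if offset == 0 else d.popleft()
        return None


def run_all(tasks, num_workers=4, queue_type=SharedQueue):
    """Run every task once; the signature does not depend on queue_type."""
    q = queue_type(num_workers)
    for t in tasks:
        q.push(t)
    results = []
    results_lock = threading.Lock()

    def worker(worker_id):
        while True:
            task = q.pop(worker_id)
            if task is None:
                return  # every queue was empty, and no new tasks can arrive
            value = task()
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_all(tasks)` and `run_all(tasks, queue_type=StealingQueues)` runs the same set of tasks and returns the same set of values (in an unspecified order). All of the extra locking and victim-selection logic lives in the queue class, which is the sense in which the added complexity stays in the implementation.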
