On 3/20/2011 10:44 PM, Michel Fortin wrote:

I don't see a problem with the above. The array elements you modify are
passed through parallel's opApply, which can easily check whether it's
safe to pass them by ref to different threads (by checking the
element's size) and allow or disallow the operation accordingly.

It could even do a clever trick to make it safe to pass things such as
elements of an array of bytes by ref (by coalescing the loop iterations
for all bytes sharing the same word into one task). That might not work
for ranges that are not arrays, however.

That said, feel free to suggest more problematic examples.


Ok, I completely agree in principle, though I question whether it's worth actually implementing something like this, especially until we get some kind of support for shared delegates.
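
For concreteness, here's a rough sketch of what the coalescing trick might look like (parallelOverBytes is a hypothetical helper, and a real implementation would also need to handle arrays that don't start on a word boundary):

import std.parallelism;
import std.range : chunks;

// Hypothetical sketch: iterate over a byte array in word-sized chunks
// so that no two tasks ever write to bytes sharing the same machine
// word.  Assumes data is word-aligned.
void parallelOverBytes(ubyte[] data, void delegate(ref ubyte) fun) {
    foreach (chunk; taskPool.parallel(data.chunks(size_t.sizeof))) {
        foreach (ref b; chunk)
            fun(b);  // all bytes of a given word are handled by one task
    }
}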

Also, your example can be trivially modified to be safe.

import std.parallelism, std.stdio;

void main() {
    int sum = 0;
    foreach (int value; taskPool.parallel([0, 2, 3, 6, 1, 4, 6, 3, 3, 3, 6])) {
        synchronized sum += value;  // serialize only the update of sum
    }
    writeln(sum);
}

In this case that kills all parallelism, but in more realistic cases I
use this pattern often. I find it very common to have an expensive
loop body that can be performed in parallel, except for a tiny portion
that must update a shared data structure. I'm aware that it might be
possible, in theory, to write this more formally using reduce() or
something similar. However:

1. If the portion of the loop that deals with shared data is very
small (and therefore the serialization caused by the synchronized
block is negligible), it's often more efficient to keep only one data
structure in memory and update it concurrently, rather than use the
stronger isolation between threads that reduce() provides (see the
sketch after this list) and have to maintain one data structure for
each thread.

2. In my experience synchronizing on a small portion of the loop body
works very well in practice. My general philosophy is that, in a
library like this, dangerous but useful constructs must be supported
and treated as innocent until proven guilty, not the other way round.
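
For reference, a minimal sketch of the reduce() alternative mentioned in point 1, using std.parallelism's taskPool.reduce (each worker folds its share of the range into a private partial sum, so the loop body needs no synchronization):

import std.parallelism, std.stdio;

void main() {
    auto data = [0, 2, 3, 6, 1, 4, 6, 3, 3, 3, 6];
    // 0 is the seed.  Each worker thread accumulates a private partial
    // sum, and the partial results are combined at the end.
    auto sum = taskPool.reduce!"a + b"(0, data);
    writeln(sum);
}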

Your second example is not really a good justification of anything. I'll
refer you to how synchronized classes work. It was decided that
synchronized in a class protects everything that is directly stored in
the class. Anything behind an indirection is considered shared by the
compiler. The implication is that if you have an array or a pointer to
something that you semantically want to be protected by the class's
mutex, you have to cast things to unshared. It was decided that things
should be safe against low-level races first, and convenience was
relegated to a secondary concern. I don't like it very much, but that's
what was decided and written in TDPL.
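
A minimal sketch of the rule being described, assuming the TDPL semantics for synchronized classes (the class and its fields are made up for illustration, and this is not necessarily what the compiler enforces today):

synchronized class Wrapper {
    private int direct;    // directly stored: protected by the class's mutex
    private int* indirect; // the pointer is protected, but the int it
                           // points to is still typed shared

    void set(int v) {
        direct = v;                  // OK: head-level field
        // *indirect = v;            // error: *indirect is shared(int)
        *(cast(int*) indirect) = v;  // must cast away shared by hand
    }
}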

I'd go a little further. If the guarantees that shared was supposed to provide are strong, i.e. they apply no matter what threading module is used, then I utterly despise it. It's one of the worst decisions made in the design of D. Making things pedantically strict, so that the type system gets in the way more than it helps, encourages the user to reflexively circumvent the type system without thinking hard about doing so, thus defeating its purpose. (The alternative of always complying with what the type system "expects" you to do is too inflexible to even be worth considering.) Type systems should err on the side of accepting a superset of what's correct and treating code as innocent until proven guilty, not the other way around. I still believe this even if some of the bugs it could be letting pass through might be very difficult to debug. See the discussion we had a few weeks ago about implicit integer casting and porting code to 64-bit.

My excuse for std.parallelism is that it's pedal-to-the-metal parallelism, so it's more acceptable for it to be dangerous than general-case concurrency. IMHO, using the non-@safe parts of std.parallelism (i.e. most of the library) is equivalent to casting away shared in a whole bunch of places. Typing "import std.parallelism;" in a non-@safe module is an explicit enough step here.

The guarantee is still preserved that, if you use only std.concurrency (D's flagship "safe" concurrency module) for multithreading and don't cast away shared, there can be no low-level data races. IMHO this is still a substantial accomplishment, in that there exists a way to do safe, statically checkable concurrency in D, even if it's not the **only** way concurrency can be done. BTW, core.thread can also be used to get around D's type system, not just std.parallelism. If you want to verify that only safe concurrency is used, imports of std.parallelism and core.thread can be grepped for just as easily as casts away from shared.
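
For contrast, a minimal sketch of the std.concurrency style in question (spawn/send/receiveOnly; messages cross the thread boundary by value, so no shared state is involved):

import std.concurrency, std.stdio;

void worker() {
    // Messages arrive by value; there is no shared memory to race on.
    auto value = receiveOnly!int();
    writeln("got ", value);
}

void main() {
    auto tid = spawn(&worker);
    tid.send(42);  // only values (or shared/immutable data) may be sent
}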

If, on the other hand, the guarantees of shared are supposed to be weak in that they only apply to programs where only std.concurrency is used for multithreading, then I think strictness is the right thing to do. The whole point of std.concurrency is to give strong guarantees, but if you prefer more dangerous but more flexible multithreading, other paradigms should be readily available.


Now we're in the exact same situation (except that no classes are
involved) and you're proposing that in this case we make convenience
take precedence over low-level race safety? To me, going that route
would be an utter lack of coherency in the design of D's "synchronized"
feature.

For the case above, wouldn't it be better to use an atomic add instead
of a synchronized block?

Yes. I've used this pattern in much more complicated cases, though, where atomic wouldn't cut it.
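
For completeness, here's the atomic version of the earlier example (a sketch using core.atomic.atomicOp; note that sum has to be declared shared for atomicOp to accept it):

import std.parallelism, std.stdio;
import core.atomic;

void main() {
    shared int sum = 0;
    foreach (int value; taskPool.parallel([0, 2, 3, 6, 1, 4, 6, 3, 3, 3, 6])) {
        atomicOp!"+="(sum, value);  // atomic read-modify-write, no lock
    }
    writeln(sum);
}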

In this case you can make "sum" shared and have the type system check
that everything is safe (using my proposed rules). And if your data
structure is bigger, it'll likely be a synchronized class, and you
won't have to resort to bypassing the type system's safety checks
inside your loop (although you might have to inside your synchronized
class, but that's another matter).


I'm **still** totally confused about how shared is supposed to work, because I don't have a fully debugged and implemented version, or good examples of code written in this paradigm, to play around with. TDPL helps to some degree, but for me it's a lot easier to understand something by actually trying to use it.
