On Fri, 14 May 2010 20:00:01 +0100, Daniel Ruoso <dan...@ruoso.com> wrote:

On Friday, 2010-05-14 at 18:13 +0100, nigelsande...@btconnect.com wrote:
The point I (we)'ve been trying to make is that once you have a reentrant
interpreter, and the ability to spawn one in an OS thread, all the other
bits can be built on top. But unless you have that ability, whilst the
others can be constructed, they cannot make use of SMP. I.e. they cannot
scale.

Okay, this is an important point... Having a reentrant interpreter is a
given already; the form I tried to implement that in SMOP was by using
CPS (continuation-passing style).
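
A minimal sketch of why CPS buys reentrancy, in C (the names are
illustrative, not SMOP's actual API): all execution state lives in a
continuation frame rather than on the C stack or in globals, so any
number of instances can be stepped independently.

    #include <stdio.h>

    typedef struct frame frame;
    typedef frame *(*step_fn)(frame *);

    struct frame {
        step_fn step;   /* the continuation: what to run next */
        int     acc;    /* interpreter state carried between steps */
        int     n;
    };

    /* one "opcode": sum n..1, one trampoline bounce per step */
    static frame *count_down(frame *f) {
        if (f->n == 0)
            return NULL;             /* computation complete */
        f->acc += f->n--;
        return f;                    /* same step is the next continuation */
    }

    int main(void) {
        frame f = { count_down, 0, 10 };
        for (frame *cur = &f; cur != NULL; cur = cur->step(cur))
            ;                        /* the trampoline loop */
        printf("%d\n", f.acc);       /* prints 55 */
        return 0;
    }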

The idea of using green threads is just to enforce serialized access to
some data in order to avoid locking (locking, with the amount of
polymorphism required by Perl 6, is too expensive), while the regular OS
threads wouldn't share any data between them.

Of course we could try to make the data structures thread-safe. I didn't
go down that path, since I was trying to get Perl 5 interoperability and
therefore used a refcounting garbage collector.

The other possibility would be to use processor affinity to force
different threads to run on the same processor, and that way ensure
serialized access to the non-thread-safe data.
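
On Linux that pinning would look something like this (a sketch using the
GNU pthread_setaffinity_np() extension):

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>

    /* Pin a thread to CPU 0, so that every thread so pinned is
       scheduled on the same processor.  Linux-specific. */
    static int pin_to_cpu0(pthread_t t) {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set);
        return pthread_setaffinity_np(t, sizeof set, &set);
    }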

daniel



Sorry, but restricting yourself to one processor in order to avoid locking makes no sense at all.

It's like making one of your 4x400m relay runners run the whole race, to save carrying and passing the baton.


There are essentially four cases to consider:

1) IO-bound coroutines running on a single CPU machine:
   Locking will have very little effect upon coroutines bound by IO.

2) CPU-bound coroutines running on a single CPU machine:
   Locking would have some effect on throughput.
But why would you parallelise CPU-bound coroutines on a single core? Even without locking, the context switching would have a negative effect upon throughput. You'd be better off just running the routines serially.

3) IO-bound coroutines running on a multi-core machine:
   Again, locking has almost no effect due to IO latency.

4) CPU-bound coroutines running on a multi-core machine:
There is no way that the absence of locking is going to compensate for using only 1/2 of the available processing power on a 2-core or hyper-threading processor; much less only 1/4 on the current crop of quad-core commodity processors.
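
To put a number on what's being given up, consider the obvious kernel-threaded decomposition of a CPU-bound task (a sketch; busy_work() stands in for any data-parallel computation). On an N-core box it approaches N-fold throughput; pinned to a single core it can never beat the serial version, however cheap the locking:

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    static void *busy_work(void *arg) {
        volatile double x = 0.0;
        for (long i = 1; i < 50 * 1000 * 1000; i++)
            x += 1.0 / (double)i;    /* pure CPU; no IO, no shared state */
        (void)x;
        return NULL;
    }

    int main(void) {
        pthread_t tid[NTHREADS];
        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&tid[i], NULL, busy_work, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tid[i], NULL);  /* ~N-fold speedup on N cores */
        puts("done");
        return 0;
    }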


Also, basing the Perl 6 concurrency model upon what is convenient for the implementation of SMOP:

    http://www.perlfoundation.org/perl6/index.cgi?smop

as clever as it is, and as important as that has been to the evolution of the Perl 6 development effort, is not a good idea.

Given its dependency list, it seems unlikely that SMOP is going to become *the* Perl 6 interpreter in the long term. Interoperability with Perl 5 and its reference counting should not be a high priority in the decision making process for defining the Perl 6 concurrency model.

Ultimately, I think your apprehensions about the costs of locking are unfounded.

If your interpreter is reentrant, then you should be able to start two copies in two threads, without any need to add internal locking as far as lexicals are concerned, because the scoping will prevent any attempts at concurrent access. Put simply, neither will be able to see the variables within the other, so even though they share the same address space, the language scoping keeps them apart.
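
In outline, something like the following sketch, where interp_t, interp_new() and interp_run() are stand-ins for whatever the reentrant interpreter actually provides:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in interpreter: the point is that ALL of its state hangs
       off the per-instance structure, so two instances sharing an
       address space still cannot see each other's lexicals. */
    typedef struct {
        const char *src;
    } interp_t;

    static interp_t *interp_new(const char *src) {
        interp_t *in = malloc(sizeof *in);
        in->src = src;
        return in;
    }

    static void interp_run(interp_t *in) {
        printf("running: %s\n", in->src);   /* stand-in for execution */
    }

    static void *run_one(void *arg) {
        interp_run(arg);
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, run_one, interp_new("say 'one'"));
        pthread_create(&b, NULL, run_one, interp_new("say 'two'"));
        pthread_join(a, NULL);       /* no locks taken anywhere above */
        pthread_join(b, NULL);
        return 0;
    }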

There are 3 exceptions to this:

1) Process-global entities: file & directory handles, environment variables, etc.

These are sufficiently rare that applying per-entity locking is not a problem.

Per-entity, user-space locking (one bit in each scalar's status word, manipulated with an atomic bit test-and-set) requires minimal development and maintenance effort, and imposes minimal runtime performance impact, in the light of the overhead of the IO itself anyway.
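
A sketch of the mechanism, assuming C11 atomics; the flag position and the scalar layout are illustrative only:

    #include <stdatomic.h>
    #include <sched.h>

    #define SCALAR_LOCK_BIT (1u << 0)    /* illustrative flag position */

    typedef struct {
        _Atomic unsigned status;         /* per-scalar status word */
        /* ... the scalar's real payload ... */
    } scalar_t;

    static void scalar_lock(scalar_t *sv) {
        /* atomic bit test-and-set: loop until the bit was clear */
        while (atomic_fetch_or_explicit(&sv->status, SCALAR_LOCK_BIT,
                                        memory_order_acquire)
               & SCALAR_LOCK_BIT)
            sched_yield();               /* contention should be rare */
    }

    static void scalar_unlock(scalar_t *sv) {
        atomic_fetch_and_explicit(&sv->status, ~SCALAR_LOCK_BIT,
                                  memory_order_release);
    }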

2) Explicitly shared data.

These require (internal) locks regardless of how you wrap them. Be it STM, which is looking less and less viable; or message passing, which works, but precludes (or renders grossly inefficient) many data-parallel algorithms. I've reached the conclusion that some provision for user-specified locking must be made available for some applications. But that doesn't mean there isn't scope for hiding the complexities of the underlying (POSIX) mechanisms.
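
For instance, the language might expose nothing more than acquire/release on a shared aggregate, with the POSIX machinery entirely hidden underneath. A sketch, with invented names (shared_new(), shared_acquire(), shared_release()):

    #include <pthread.h>
    #include <stdlib.h>

    /* The user never sees the pthread_mutex_t; they see only an
       opaque shared aggregate and acquire/release on it. */
    typedef struct {
        pthread_mutex_t mtx;
        void           *data;        /* the explicitly shared aggregate */
    } shared_t;

    static shared_t *shared_new(void *data) {
        shared_t *s = malloc(sizeof *s);
        pthread_mutex_init(&s->mtx, NULL);
        s->data = data;
        return s;
    }

    static void *shared_acquire(shared_t *s) {
        pthread_mutex_lock(&s->mtx); /* blocks until held */
        return s->data;
    }

    static void shared_release(shared_t *s) {
        pthread_mutex_unlock(&s->mtx);
    }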

3) The tough-y: Closed-over variables.

These are tough because closures expose lexicals to sharing; but closures are so natural to use that it is hard to suggest banning them in concurrent routines.

However, interpreters already have to detect closed-over variables in order to 'lift' them and extend their lifetimes beyond their natural scope. It doesn't seem it would be any harder to lift them to shared-variable status, moving them out of the thread-local lexical pads and into the same data-space as process globals and explicitly shared data.
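
Concretely, the lift that implementations already perform for escaping lexicals just gains one more step. A sketch, with an invented cell layout:

    #include <stdatomic.h>
    #include <stdlib.h>

    /* A closed-over lexical "lifted" out of the thread-local pad into
       a heap cell carrying the same status word (and so the same lock
       bit) as process globals.  Layout and flag are invented. */
    typedef struct {
        _Atomic unsigned status;     /* lock bit, shared flag, ... */
        _Atomic int      refs;       /* lifetime now exceeds its scope */
        double           value;      /* the variable's payload */
    } cell_t;

    #define CELL_SHARED (1u << 1)    /* invented flag: visible to threads */

    static cell_t *lift_to_shared(double initial) {
        cell_t *c = malloc(sizeof *c);
        atomic_init(&c->status, CELL_SHARED);
        atomic_init(&c->refs, 1);
        c->value = initial;
        return c;                    /* closures hold this, not a pad slot */
    }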

My currently favoured mechanism for handling shared data is via message passing, but passing references to the shared data rather than the data itself. This seems to give the reason-ability, compose-ability and controlled access of message passing, whilst retaining the efficiency of direct, shared-state mutability. Only the code that declares the shared data, plus any other thread it chooses to send a handle to, has any knowledge of, and therefore access to, the shared state.

Effectively, allocating a shared entity returns a handle to the underlying state, and only the holder of that handle can access it. Such handles would be indirect references, and only usable from the thread that creates them. When a handle is passed as a message to another thread, it is transformed during the transfer into a handle usable by the recipient thread, and the old handle becomes invalid. Attempts to use an old handle after it has been sent result in a runtime exception.
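
In code, that discipline might look like the following sketch. handle_t, handle_deref() and handle_send() are invented names, and the per-thread validity check is reduced to a simple owner field:

    #include <pthread.h>
    #include <stdlib.h>

    typedef struct {
        void     *state;             /* the underlying shared entity */
        pthread_t owner;             /* only this thread may use it */
        int       valid;             /* cleared when the handle is sent */
    } handle_t;

    /* deref: fails unless called by the current owner of a live handle */
    static void *handle_deref(handle_t *h) {
        if (!h->valid || !pthread_equal(h->owner, pthread_self()))
            return NULL;             /* would throw at the language level */
        return h->state;
    }

    /* transfer: the recipient gets a fresh handle; the old one dies */
    static handle_t *handle_send(handle_t *h, pthread_t recipient) {
        if (!h->valid || !pthread_equal(h->owner, pthread_self()))
            return NULL;
        handle_t *fresh = malloc(sizeof *fresh);
        fresh->state = h->state;
        fresh->owner = recipient;
        fresh->valid = 1;
        h->valid = 0;                /* using h after this is an error */
        return fresh;                /* deliver via the message queue */
    }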

This gives you the lock-free shared state manipulations of message passing and the efficiency of direct shared-state access.

Anyway, this is probably already longer than anyone will read, so I'll stop there, and hope that I've convinced a few people not to give up the scalability of kernel threading, and the efficiency of mutable shared state, too lightly. There are ways of making them usable. They just need to be thrashed out.

Buk
