forks and threads (was: Re: META.yml how to declare the need for threaded perl?)

Dr.Ruud Wed, 05 Nov 2008 00:05:46 -0800

"Christopher Brown" wrote:

> I spent some time thinking about Dr. Ruud's reply over the past
> several days.  Although I mostly share his views, I think that he has
> done a disservice to the discussion of the relative merits of
> threading and forking.  Like most things in technology, different
> approaches exploit different trade-offs.  Usually no one approach
> is superior to all others in all situations.  (cf. Python vs Perl vs
> Ruby, e.g.)


Hello Christopher, thanks for your comments. Yes, I pulled harder on one
side of the string than is practical.

We haven't defined what we mean by threads versus forks, their
differences and commonalities. Some see threads as "lightweight
processes", others see the type of memory space as the only
significant difference.

There is a hardware-oriented and a software-oriented view on forks and
threads.

And then there is the Perl meaning of "thread". (which might have been
the one that was meant in the "wrong question" statement)


> Here is my opinion on the matter.
>
> Forking is a simpler and cleaner approach.   By design the programmer
> does not have to worry about sharing of data and race conditions.
> When your algorithm is limited in memory requirements and completely
> compartmentalized, this is a good choice.
> And this is very often the case.
>
> Threading is not without its place, however.  Notwithstanding, Dr.
> Ruud's argument:
>
> Most wishes for "readily-shared memory" result from (and to) bad
> design.
>
> This argument, stated without proof, makes a grand generalization.

Yes, and I hope it was taken as that.

Read also the article "The problem with threads", which is linked from
http://en.wikipedia.org/wiki/Thread_(computer_science)
"Deterministic ends should be accomplished with deterministic means."


> There are quite a few algorithms (sort, search, merge) that greatly
> benefit from shared (or increased) memory.  The simplest is using a
> very large shared hash to map values from a very, very large data
> stream.  The ability of the shared memory will allow for more
> cooperative threads than forks where memory cannot be shared.

Even for those cases there is often a good and less connected parallel
solution available, where independent processes deal with considerable
parts of the stream, and come together only at the end to sync and merge
the results.
Such a solution often means that "double work" is performed, and that
significantly more CPU cycles are used than would have been with
"readily-shared memory" (but the result can still be achieved earlier
and more reliably).
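
To make the "independent processes, merge at the end" idea concrete,
here is a minimal sketch in Python (chosen only for brevity; the same
shape works with fork and pipes in Perl). The function names
count_words and parallel_count, the word-count task and the number of
workers are all assumptions for illustration, not something from this
thread. It needs a Unix-like system, because it calls fork directly.

```python
import os
import pickle
from collections import Counter


def count_words(lines):
    """Work done independently by one process: count words in its part."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def parallel_count(lines, workers=2):
    """Fork one child per part; merge the partial results only at the end."""
    parts = [lines[i::workers] for i in range(workers)]
    children = []
    for part in parts:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: works on its own part, no shared state to guard.
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump(count_words(part), out)
            finally:
                os._exit(0)  # never fall through into the parent's code
        os.close(write_fd)
        children.append((pid, read_fd))

    total = Counter()
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as src:
            total.update(pickle.load(src))  # the single sync-and-merge step
        os.waitpid(pid, 0)
    return total


if __name__ == "__main__":
    stream = ["a b a", "b c", "a c c", "d"]
    print(parallel_count(stream))
```

The children never talk to each other; the only coordination is the
parent reading each partial result once. The serialization at the end
is the extra cost that was called "double work" above.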


> It is possible to marshall, serialize, or use other IPC between
> forks, but then the balance of simpler, cleaner probably tilts toward
> a threading solution.

Threads can run simultaneously on a multiprocessor system, but are
susceptible to deadlocks and to wasting resources while polling.
Forks share memory (code, read-only data, copy-on-write data) and files
too.
See also http://en.wikipedia.org/wiki/Non-blocking_synchronization
about how to make multiprocessing more efficient. A lot of research
is going on in that field.
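
The copy-on-write remark about forks can be illustrated with a small
sketch, again in Python and again assuming a Unix-like system. The
function name child_sees_private_copy and the "hits" table are made up
for the example. It shows the semantics only: the child starts with
the parent's data, and a write in the child stays in the child. It
does not measure how many pages are actually shared.

```python
import os


def child_sees_private_copy():
    """Return (parent_value, child_value) after the child modifies its copy."""
    table = {"hits": 0}  # built once in the parent, inherited by the child
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the write below touches the child's copy only.
        try:
            os.close(read_fd)
            table["hits"] += 1
            os.write(write_fd, str(table["hits"]).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    child_value = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return table["hits"], child_value


if __name__ == "__main__":
    print(child_sees_private_copy())  # parent still sees 0, child saw 1
```

Anything the child must hand back has to travel over an explicit
channel, here a pipe, which is what keeps the two processes from
stepping on each other's data.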

-- 
Affijn, Ruud

"Gewoon is een tijger."
