[Haskell-cafe] Re: Can you do everything without shared-memory concurrency?

2008-09-10 Thread Simon Marlow

Bruce Eckel wrote:
So this is the kind of problem I keep running into. There will seem to 
be consensus that you can do everything with isolated processes and message
passing (and note here that I include Actors in this scenario even if 
their mechanism is more complex). And then someone will pipe up and say 
"well, of course, you have to have threads" and the argument is usually 
"for efficiency."


I make two observations here which I'd like comments on:

1) What good is more efficiency if the majority of programmers can never 
get it right? My position: if a programmer has to explicitly synchronize 
anywhere in the program, they'll get it wrong. This of course is a point 
of contention; I've met a number of people who say "well, I know you 
don't believe it, but *I* can write successful threaded programs." I 
used to think that, too. But now I think it's just a learning phase, and 
you aren't a reliable thread programmer until you say "it's impossible 
to get right" (yes, a conundrum).


(welcome Bruce!)

Let's back up a bit.  If the goal is just to make something go faster, 
then threads are definitely not the first tool the programmer should be 
looking at, and neither is message passing or STM.  The reason is that 
threads and mutable state inherently introduce non-determinism, and when 
you're just trying to make something go faster non-determinism is almost 
certainly unnecessary (there are problems where non-determinism helps, 
but usually not).


In Haskell, for example, we have par/seq and Strategies which are 
completely deterministic, don't require threads or mutable state, and are
trivial to use correctly.  Now, getting good speedup is still far from 
trivial, but that's something we're actively working on.  Still, people 
are often able to get a speedup just by using a parMap or something. 
Soon we'll have Data Parallel Haskell too, which also targets the need 
for deterministic parallelism.
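
For concreteness, here is roughly what that looks like (a minimal
sketch against the parallel package; note the 2008-era API Simon is
describing spelled rseq as rwhnf):

  import Control.Parallel (par, pseq)
  import Control.Parallel.Strategies (parMap, rseq)

  fib :: Int -> Int
  fib n | n < 2     = n
        | otherwise = fib (n - 1) + fib (n - 2)

  -- par sparks x for evaluation in parallel while we evaluate y;
  -- the answer is the same whether or not the spark ever runs.
  parSum :: Int -> Int -> Int
  parSum a b = x `par` (y `pseq` x + y)
    where
      x = fib a
      y = fib b

  main :: IO ()
  main = do
    print (parSum 30 31)
    -- parMap evaluates the list elements in parallel, but the
    -- result is identical to map fib [20 .. 30].
    print (parMap rseq fib [20 .. 30])

Compiled with ghc -threaded and run with +RTS -N, the sparks may run
on several cores, but the output never varies.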


We make a clean distinction between Concurrency and Parallelism. 
Concurrency is a _programming paradigm_, wherein threads are used 
typically for dealing with multiple asynchronous events from the 
environment, or for structuring your program as a collection of 
interacting agents.  Parallelism, on the other hand, is just about 
making your programs go faster.  You shouldn't need threads to do 
parallelism, because there are no asynchronous stimuli to respond to. 
It just so happens that it's possible to run a concurrent program in 
parallel on a multiprocessor, but that's just a bonus.  I guess the main 
point I'm making is that to make your program go faster, you shouldn't 
have to make it concurrent.  Concurrent programs are hard to get right;
parallel programs needn't be.
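
To make the contrast concrete, a minimal sketch: the program below is
concurrent in Simon's sense -- the interleaving of the two threads,
and hence the order of its output lines, is nondeterministic -- while
the par/parMap sketch above always prints the same thing.

  import Control.Concurrent

  main :: IO ()
  main = do
    done <- newEmptyMVar
    _ <- forkIO (putStrLn "handling event A" >> putMVar done ())
    _ <- forkIO (putStrLn "handling event B" >> putMVar done ())
    takeMVar done
    takeMVar done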


Cheers,
Simon

2) What if you have lots of processors? Does that change the picture 
any? That is, if you use isolated processes with message passing and you 
have as many processors as you want, do you still think you need 
shared-memory threading?


A comment on the issue of serialization -- note that any time you need 
to protect shared memory, you use some form of serialization. Even 
optimistic methods guarantee serialization, even if it happens after the 
memory is corrupted, by backing up to the uncorrupted state. The effect 
is the same; only one thread can access the shared state at a time.
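
GHC's STM is a concrete example of the optimistic variety: a
transaction runs against a consistent snapshot and is re-executed if a
conflicting commit is detected, so the net effect is as if the
transactions ran one at a time. A minimal sketch using the stm
package:

  import Control.Concurrent.STM

  -- Atomically move n units between two shared counters. However
  -- the transactions interleave, the combined total is preserved.
  transfer :: TVar Int -> TVar Int -> Int -> IO ()
  transfer from to n = atomically $ do
    modifyTVar' from (subtract n)
    modifyTVar' to (+ n)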


On Tue, Sep 9, 2008 at 4:03 AM, Sebastian Sylvan
<[EMAIL PROTECTED]> wrote:

On Mon, Sep 8, 2008 at 8:33 PM, Bruce Eckel <[EMAIL PROTECTED]> wrote:

As some of you on this list may know, I have struggled to understand
concurrency, on and off for many years, but primarily in the C++ and
Java domains. As time has passed and experience has stacked up, I have
become more convinced that while the world runs in parallel, we think
sequentially, and so shared-memory concurrency is impossible for
programmers to get right -- not only are we unable to think in such a
way to solve the problem, the unnatural domain-cutting that happens in
shared-memory concurrency always trips you up, especially when the
scale increases.

I think that the inclusion of threads and locks in Java was just a
knee-jerk response to solving the concurrency problem. Indeed, there
were subtle threading bugs in the system until Java 5. I personally
find the Actor model to be most attractive when talking about
threading and objects, but I don't yet know where the limitations of
Actors are.

However, I keep running across comments where people claim they "must"
have shared memory concurrency. It's very hard for me to tell whether
this is just because the person knows threads or if there is truth to
it.

For correctness, maybe not, for efficiency, yes definitely!

Imagine a pro

[Haskell-cafe] Re: Can you do everything without shared-memory concurrency?

2008-09-11 Thread Aaron Denney
On 2008-09-10, David Roundy <[EMAIL PROTECTED]> wrote:
> On Wed, Sep 10, 2008 at 03:30:50PM +0200, Jed Brown wrote:
>> On Wed 2008-09-10 09:05, David Roundy wrote:
>> > I should point out, however, that in my experience MPI programming
>> > involves deadlocks and synchronization handling that are at least as
>> > nasty as any I've run into doing shared-memory threading.
>>
>> Absolutely, avoiding deadlock is the first priority (before error
>> handling).  If you use the non-blocking interface, you have to be very
>> conscious of whether a buffer is being used or the call has completed.
>> Regardless, the API requires the programmer to maintain a very clear
>> distinction between locally owned and remote memory.
>
> Even with the blocking interface, you had subtle bugs that I found
> pretty tricky to deal with.  e.g. the reduce functions in lam3 (or was
> it lam4) at one point didn't actually manage to result in the same
> values on all nodes (with differences caused by roundoff error), which
> led to rare deadlocks, when it so happened that two nodes disagreed as
> to when a loop was completed.  Perhaps someone made the mistake of
> assuming that addition was associative, or maybe it was something
> triggered by the non-IEEE floating point we were using.  But in any
> case, it was pretty nasty.  And it was precisely the kind of bug that
> won't show up except when you're doing something like MPI where you
> are pretty much forced to assume that the same (pure!) computation has
> the same effect on each node.

Ah, okay.  I think that's a real edge case, and probably not how most
use MPI.  I've used both threads and MPI; MPI, while cumbersome, never
gave me any hard-to-debug deadlock problems.
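
The roundoff behaviour David describes is easy to reproduce:
floating-point addition simply is not associative, so any reduce that
doesn't fix the order of additions can disagree across nodes. A quick
GHCi check:

  ghci> (0.1 + 0.2) + 0.3 :: Double
  0.6000000000000001
  ghci> 0.1 + (0.2 + 0.3) :: Double
  0.6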

-- 
Aaron Denney
-><-

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Can you do everything without shared-memory concurrency?

2008-09-12 Thread Robert Greayer
--- On Fri, 9/12/08, Bruce Eckel <[EMAIL PROTECTED]> wrote:

> OK, let me throw another idea out here. When Allen Holub first
> explained Actors to me, he made the statement that Actors prevent
> deadlocks. In my subsequent understanding of them, I haven't seen
> anything that would disagree with that -- as long as you only use
> Actors and nothing else for parallelism.

Since I believe you can emulate shared resources, and locks controlling
concurrent access to them, using the actor model, I can't see how this
can be true.
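
A sketch of that emulation (mailboxes as plain channels, an MVar as
the reply slot -- the names and representation are illustrative only):

  import Control.Concurrent
  import Control.Concurrent.Chan
  import Control.Monad (forever)

  -- The actor alone owns the lock state; clients interact with it
  -- only by message passing, yet they get mutex semantics --
  -- including the ability to deadlock by taking two such locks in
  -- opposite orders.
  data Lock = Lock { acquireCh :: Chan (MVar ()), releaseCh :: Chan () }

  newLock :: IO Lock
  newLock = do
    acq <- newChan
    rel <- newChan
    _ <- forkIO $ forever $ do
           reply <- readChan acq   -- next waiting client, FIFO
           putMVar reply ()        -- grant the lock
           readChan rel            -- block until it is released
    return (Lock acq rel)

  acquire :: Lock -> IO ()
  acquire l = do
    reply <- newEmptyMVar
    writeChan (acquireCh l) reply
    takeMVar reply

  release :: Lock -> IO ()
  release l = writeChan (releaseCh l) ()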

rcg



  


Re: [Haskell-cafe] Re: Can you do everything without shared-memory concurrency?

2008-09-12 Thread Bruce Eckel
OK, let me throw another idea out here. When Allen Holub first
explained Actors to me, he made the statement that Actors prevent
deadlocks. In my subsequent understanding of them, I haven't seen
anything that would disagree with that -- as long as you only use
Actors and nothing else for parallelism.

If someone were to create a programming system where you were only
able to use Actors and nothing else for parallelism, could you do
everything using Actors? Is there anything you couldn't do?

I'm assuming again that we can throw lots of processors at a problem.

On Thu, Sep 11, 2008 at 8:17 PM, Aaron Denney <[EMAIL PROTECTED]> wrote:
> Ah, okay.  I think that's a real edge case, and probably not how most
> use MPI.  I've used both threads and MPI; MPI, while cumbersome, never
> gave me any hard-to-debug deadlock problems.



-- 
Bruce Eckel


Re: [Haskell-cafe] Re: Can you do everything without shared-memory concurrency?

2008-09-12 Thread Sebastian Sylvan
On Fri, Sep 12, 2008 at 4:07 PM, Bruce Eckel <[EMAIL PROTECTED]> wrote:

> OK, let me throw another idea out here. When Allen Holub first
> explained Actors to me, he made the statement that Actors prevent
> deadlocks. In my subsequent understanding of them, I haven't seen
> anything that would disagree with that -- as long as you only use
> Actors and nothing else for parallelism.


I think you need to specify what you mean by actors, because I can't see
how they would eliminate deadlocks as I understand them. Could you not
write an actor that holds a single-cell mailbox (both reads and writes
are blocking), then set up two actors that shuffle values between the
same two mailboxes in opposite directions?
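
Spelled out (a sketch, with MVars standing in for the single-cell
blocking mailboxes):

  import Control.Concurrent

  -- Both mailboxes start empty, so each shuffler blocks in takeMVar
  -- and neither ever reaches its putMVar: a deadlock built from
  -- nothing but message passing. GHC's runtime detects this and
  -- throws "thread blocked indefinitely in an MVar operation".
  shuffle :: MVar Int -> MVar Int -> IO ()
  shuffle mine theirs = do
    v <- takeMVar mine
    putMVar theirs v
    shuffle mine theirs

  main :: IO ()
  main = do
    a <- newEmptyMVar
    b <- newEmptyMVar
    _ <- forkIO (shuffle a b)
    shuffle b a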

-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862