On 16/05/11 20:31, dm-list-haskell-c...@scs.stanford.edu wrote:
At Mon, 16 May 2011 10:56:02 +0100,
Simon Marlow wrote:

Yes, it's not actually documented as far as I know, and we should fix
that.  But if you think about it, sequential consistency is really the
only sensible policy: suppose one processor creates a heap object and
writes a reference to it in the IORef, then another processor reads the
IORef.  The writes that created the heap object must be visible to the
second processor, otherwise it will encounter uninitialised memory and
crash.  So sequential consistency is necessary to ensure concurrent
programs can't crash.

Now perhaps it's possible to have a relaxed memory model that provides
the no-crashes guarantee but still allows IORef writes to be reordered
(e.g. some kind of causal consistency).  That might be important if
there is some processor arcitecture that provides that memory model, but
as far as I know there isn't.

Actually, in your heap object example, it sounds like you only really
care about preserving program order, rather than write atomicity.
Thus, you can get away with less-than-sequential consistency and not
crash.

The x86 is an example of a relaxed memory model that provides the
no-crashes guarantee you are talking about.  Specifically, the x86
deviates from sequential consistency in two ways

   1. A load can finish before an earlier store to a different memory
      location.  [intel, Sec. 8.2.3.4]

   2. A thread can read its own writes early. [intel, 8.2.3.5]

   [Section references are to the intel architecture manual, vol 3a:
    http://www.intel.com/Assets/PDF/manual/253668.pdf]

One could imagine an implementation of IORefs that relies on the fact
that pointer writes are atomic and that program order is preserved to
avoid mutex overhead for most calls.  E.g.:

   struct IORef {
     spinlock_t lock;     /* Only ever used by atomicModifyIORef */
     HaskellValue *val;   /* Updated atomically because pointer-sized
                             writes are atomic */
   };

   HaskellValue *
   readIORef (struct IORef *ref)
   {
     return ref->val;
   }

   void
   writeIORef (struct IORef *ref, HaskellValue *val)
   {
     /* Note that if *val was initialized in the same thread, then by
      * the time another CPU sees ref->val, it will also see the
      * correct contents of *ref->val, because stores are seen in a
      * consistent order by other processors [intel, Sec. 8.2.3.7].
      *
      * If *val was initialized in a different thread, then since this
      * thread has seen it, other threads will too, because x86
      * guarantees stores are transitively visible [intel, Sec. 8.2.3.6].
      */
     ref->val = val;
   }

   /* modifyIORef is built out of readIORef and writeIORef */

   HaskellValue *
   atomicModifyIORef (Struct IORef *ref, HaskellFunction *f)
   {
     HaskellValue *result;
     spinlock_acquire (&ref->lock);

     result = modifyIORef (ref, f);

     spinlock_release (&ref->lock);
     return result;
   }

This is actually how I assumed IORefs worked.

Right, that is how IORefs work. (well, atomicModifyIORef is a bit different, but the differences aren't important here)

But then consider the
following program:

   maybePrint :: IORef Bool ->  IORef Bool ->  IO ()
   maybePrint myRef yourRef = do
     writeIORef myRef True
     yourVal<- readIORef yourRef
     unless yourVal $ putStrLn "critical section"

   main :: IO ()
   main = do
     r1<- newIORef False
     r2<- newIORef False
     forkOS $ maybePrint r1 r2
     forkOS $ maybePrint r2 r1
     threadDelay 1000000

Under sequential consistency, the string "critical section" should be
output at most once.  However, with the above IORef implementation on
x86, since a read can finish before a write to a different location,
both threads might see False for the value of yourVal.

To prevent this deviation from sequential consistency, you would need
to do something like stick an MFENCE instruction at the end of
writeIORef, and that would slow down the common case where you don't
care about sequential consistency.  In fact, I would argue that if you
care about S.C., you should either be using atomicModifyIORef or
MVars.

Good example - so it looks like we don't get full sequential consistency on x86 (actually I'd been thinking only about write ordering and forgetting that reads could be reordered around writes).

But that's bad because it means Haskell has a memory model, and we have to say what it is, or at least say that ordering is undefined.

In practice I don't think anyone actually does use IORef in this way. Typically you need at least one atomicModifyIORef somewhere, and that acts as a barrier.

As an aside, these days one sees a lot of hand-wringing over the fact
that CPU clock rates have been flat for a while and the only way to
get more performance is through parallelism.  "How are we going to
teach programmers to write concurrent code when it's so hard to write
and debug?" I've heard numerous people ask.

Haskell could be a major step in the right direction, since in the
absence of variables, it's impossible to have data races.  (You can
still have deadlock and other kinds of race condition, such as the one
in maybePrint above, if you had my definition of IORef, but data races
are by far the most pernicious concurrency problems.)  Of course, the
key to making Haskell useful in a parallel setting is that things like
the memory model have to be fully specified...

Hmm, as it happens I have rather a lot to say on this particular matter! But I've never heard anyone claim that a prerequisite to Haskell being useful as a parallel programming language is a well-defined memory model. I think there's a couple of reasons for that:

  - deterministic parallel programming models (e.g. Strategies,
    monad-par) don't care about memory models.  These are the
    first port of call for parallel programming.

  - If you have to use concurrency, then none of MVars,
    atomicModifyIORef or STM care about memory models either.

So the memory model only becomes visible when you use concurrency with shared IORefs (without atomicModifyIORef) or bare peek/poke, which is pretty rare and easily avoided.

Cheers,
        Simon

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to