Stephan Diestelhorst writes:
>
> On Tuesday, 25 November 2014, 11:15:36, Hans Boehm wrote:
> > I'm no hardware architect, but fundamentally it seems to me that
> >
> > load x
> > acquire_fence
> >
> > imposes a much more stringent constraint than
> >
> > load_acquire x
> >
> > Consider the case in which the load from x is an L1 hit, but a preceding
> > load (from say y) is a long-latency miss.  If we enforce ordering by just
> > waiting for completion of prior operation, the former has to wait for the
> > load from y to complete; while the latter doesn't.  I find it hard to
> > believe that this doesn't leave an appreciable amount of performance on the
> > table, at least for some interesting microarchitectures.
>
> I agree, Hans, that this is a reasonable assumption.  Load_acquire x
> does allow roach motel, whereas the acquire fence does not.
>
> >  In addition, for better or worse, fencing requirements on at least
> >  Power are actually driven as much by store atomicity issues, as by
> >  the ordering issues discussed in the cookbook.  This was not
> >  understood in 2005, and unfortunately doesn't seem to be amenable to
> >  the kind of straightforward explanation as in Doug's cookbook.
>
> Coming from a strongly ordered architecture to a weakly ordered one
> myself, I also needed some mental adjustment about store (multi-copy)
> atomicity.  I can imagine others will be unaware of this difference,
> too, even in 2014.

Sorry, I'm missing the connection between fences and multi-copy atomicity.
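
For concreteness, the two forms contrasted in the quoted text can be sketched in Java roughly as follows. This is only a sketch under assumptions: it uses the java.lang.invoke.VarHandle API (Java 9 and later), and the class AcquireForms, its fields x and y, and the method names are hypothetical, chosen just to mirror the "load x; acquire_fence" versus "load_acquire x" pseudocode. It shows where the ordering constraint attaches, not what any particular hardware does.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical illustration class; names are not from the thread.
class AcquireForms {
    int x;
    int y;

    static final VarHandle X;
    static final VarHandle Y;
    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            X = lookup.findVarHandle(AcquireForms.class, "x", int.class);
            Y = lookup.findVarHandle(AcquireForms.class, "y", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Form 1: "load x; acquire_fence".
    // The fence orders ALL earlier loads (including the load of y)
    // before later loads and stores.
    int loadThenFence() {
        int ry = (int) Y.getOpaque(this);  // earlier load, possibly a long-latency miss
        int rx = (int) X.getOpaque(this);  // the load of x, possibly an L1 hit
        VarHandle.acquireFence();
        return rx + ry;
    }

    // Form 2: "load_acquire x".
    // Only the load of x is ordered before later accesses; the earlier
    // load of y carries no such constraint.
    int loadAcquire() {
        int ry = (int) Y.getOpaque(this);
        int rx = (int) X.getAcquire(this);
        return rx + ry;
    }

    public static void main(String[] args) {
        AcquireForms a = new AcquireForms();
        X.setRelease(a, 3);
        Y.setRelease(a, 4);
        System.out.println(a.loadThenFence());
        System.out.println(a.loadAcquire());
    }
}
```

Single-threaded, both methods return the same value; the difference is only in which reorderings each form permits, which is the point being argued above.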

David

> Stephan