On 14-11-2012 16:08, Andrei Alexandrescu wrote:
On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
On 14-11-2012 15:14, Andrei Alexandrescu wrote:
On 11/14/12 1:19 AM, Walter Bright wrote:
On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
Being able to have double-checked locking work would be valuable, and having
memory barriers would reduce race condition weirdness when locks aren't used
properly, so I think that it would be desirable to have memory barriers.
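
For readers unfamiliar with the pattern, here is a minimal sketch of double-checked locking, written in C++11 rather than D purely for illustration. Config, get_config, instance and init_mutex are hypothetical names that do not come from this thread. The point is that the unlocked first check is only safe if the load has acquire semantics and the publishing store has release semantics; with plain loads and stores the compiler or CPU could expose the pointer before the object is fully constructed.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical lazily-built object; the name is illustrative only.
struct Config {
    int value = 42;
};

static std::atomic<Config*> instance{nullptr};
static std::mutex init_mutex;

Config* get_config() {
    // First check, no lock taken: the acquire load pairs with the
    // release store below, so a non-null pointer implies the object
    // it points to is fully constructed.
    Config* p = instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> guard(init_mutex);
        // Second check, under the lock: another thread may have
        // initialized the object while we were waiting.
        p = instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Config();  // intentionally never freed in this sketch
            instance.store(p, std::memory_order_release);
        }
    }
    assert(p != nullptr);
    return p;
}
```

Every call returns the same pointer, and only the first caller pays for the lock and the allocation.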

I'm not saying "memory barriers are bad". I'm saying that having the
compiler blindly insert them for shared reads/writes is far from the
right way to do it.

Let's not hasten. That works for Java and C#, and is allowed in C++.

Andrei



I need some clarification here: By memory barrier, do you mean x86's
mfence, sfence, and lfence?

Sorry, I was imprecise. We need to (a) define intrinsics for loading and
storing data with high-level semantics (a short list: acquire, release,
acquire+release, and sequentially-consistent) and THEN (b) implement the
needed code generation appropriately for each architecture. Indeed on
x86 there is little need to insert fence instructions, BUT there is a
definite need for the compiler to prevent certain reorderings. That's
why implementing shared data operations (whether implicit or explicit)
as sheer library code is NOT possible.
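
To make the "high-level semantics" concrete, here is a rough sketch using C++11 atomics as a stand-in for the intrinsics proposed above; it is not how core.atomic or any D compiler implements them. The names payload, ready, publish and consume are assumptions made up for this example. On x86 neither the release store nor the acquire load needs a fence instruction, but the compiler still must not move the payload write below the store or the payload read above the load.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration; none of these come from the thread.
static int payload = 0;                 // plain, non-atomic data
static std::atomic<bool> ready{false};  // flag that publishes the payload

// Writer side: the release store forbids reordering the payload write
// after the flag write, both in the compiler and on the CPU.
void publish(int value) {
    // Precondition for this sketch: publish is called at most once.
    assert(!ready.load(std::memory_order_relaxed));
    payload = value;
    ready.store(true, std::memory_order_release);
}

// Reader side: the acquire load forbids reordering the payload read
// before the flag read, so a reader that sees ready == true also sees
// the payload written by publish.
int consume() {
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();  // wait until the writer has published
    }
    return payload;
}
```

In real use publish and consume would run on different threads; calling them in sequence on one thread is enough to exercise the code.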

Let's continue this part of the discussion in my other reply (the one explaining how core.atomic is implemented in the various compilers).


Because as Walter said, inserting those blindly when unnecessary can
lead to terrible performance because it practically murders
pipelining.

I think at this point we need to develop a better understanding of
what's going on before issuing assessments.

I dunno. On low-end architectures like ARM, out-of-order processing is pretty much what makes them usable at all, because they don't have the raw power that x86 does (I even recall an ARM Holdings executive saying that they couldn't possibly switch to a strong memory model with an in-order pipeline without severely reducing the efficiency of ARM). So I'm just putting that out there. It's definitely worth taking into consideration, because very few architectures have a memory model as strongly ordered as x86's.



Andrei

--
Alex Rønne Petersen
a...@lycus.org
http://lycus.org
