On Mon, Dec 1, 2014 at 1:51 PM, Hans Boehm <bo...@acm.org> wrote:
> Needless to say, I would clearly also like to see a simple correspondence.
>
> But this does raise the interesting question of whether put/get and
> store(..., memory_order_relaxed)/load(memory_order_relaxed) are intended to
> have similar semantics.  I would guess not, in that the former don't satisfy
> coherence; accesses to the same variable can be reordered as for normal
> variable accesses, while the C++11/C11 variants do provide those guarantees.
> On most, but not all, architectures that's entirely a compiler issue; the
> hardware claims to provide that guarantee.
>
> This affects, for example, whether a variable that is only ever incremented
> by one thread can appear to another thread to decrease in value.  Or if a
> reference set to a non-null value exactly once can appear to change back to
> null after appearing non-null.  In my opinion, it makes sense to always
> provide coherence for atomics, since the overhead is small, and so are the
> odds of getting code relying on non-coherent racing accesses correct.  But
> for ordinary variables whose accesses are not intended to race the
> trade-offs are very different.

It would be nice to pretend that ordinary Java loads and stores map
perfectly to C11 relaxed loads and stores.  That would fit well with
the absence of undefined behavior for data races in Java.  But the
mapping breaks down in two ways: ordinary accesses lack the coherence
you describe, and Java longs and doubles are not guaranteed to be
atomic.  I have no intuition as to whether always requiring
per-variable sequential consistency (i.e. coherence) would be a
performance problem.  Introducing an explicit relaxed memory order
mode in Java, where the difference from ordinary access is smaller
than in C/C++11, would be confusing.

Despite all that, it would be clean, consistent and seemingly
straightforward to simply add all of the C/C++ atomic loads, stores
and fences to sun.misc.Unsafe (with the possible exception of consume,
which is still under a cloud).  If that works out for jdk-internal
code, we can add them to a public API.  Providing the full set will
help with interoperability with C code running in another thread
accessing a direct buffer.
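
As a rough illustration of what the full set of modes looks like from
Java, here is a sketch of release/acquire message passing.  It assumes
a JDK 9 or later runtime and uses the java.util.concurrent.atomic
access-mode methods rather than sun.misc.Unsafe, so it is not the API
proposed above; the class name and the mapping in the comments are
approximations for illustration only.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public final class MessagePassingSketch {
    // Approximate correspondence (an assumption, not a formal mapping):
    //   getPlain / setPlain       ~ ordinary Java field access
    //   getOpaque / setOpaque     ~ memory_order_relaxed
    //   getAcquire / setRelease   ~ memory_order_acquire / memory_order_release
    //   get / set                 ~ memory_order_seq_cst (volatile semantics)
    //   VarHandle.fullFence()     ~ atomic_thread_fence(memory_order_seq_cst)
    // There is no counterpart to memory_order_consume.

    private int payload;                               // ordinary (plain) field
    private final AtomicBoolean ready = new AtomicBoolean();

    void publish(int value) {
        payload = value;                               // plain store
        ready.setRelease(true);                        // release: payload store cannot sink below
    }

    int awaitPayload() {
        while (!ready.getAcquire()) {                  // acquire: payload load cannot float above
            Thread.onSpinWait();
        }
        return payload;                                // sees the value written before the release
    }

    // Hypothetical driver: one producer thread, the caller consumes.
    public static int demo() {
        MessagePassingSketch box = new MessagePassingSketch();
        Thread producer = new Thread(() -> box.publish(42));
        producer.start();
        int seen = box.awaitPayload();
        try {
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The same release/acquire pairing is what a C thread would use on its
side of a shared direct buffer, which is why having matching modes on
the Java side matters for interoperability.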
