On Jan 14, 2020, at 9:11 AM, Doug Lea <[email protected]> wrote:
>
> On 1/13/20 4:44 PM, Tobi Ajila wrote:
>> Hi John
>>
>> Given that inline types can be flattened there is a possibility that
>> data races will occur in places where users were not expecting it
>> before. So your `__AlwaysAtomic` modifier is a necessary tool as the
>> existing spec will only enforce atomicity for 32bit primitives and
>> references. I just want to confirm if the intention of the
>> __AlwaysAtomic bit on an inline class is only to ensure atomic reads and
>> writes of inline types and that there are no happens-before ordering
>> expectations as there are with the existing volatile modifier on fields.
>>
>
> In which case "__AlwaysOpaque" would be a more accurate term.
Very interesting! I guess this is the most relevant definition of opaque:

http://gee.cs.oswego.edu/dl/html/j9mm.html#opaquesec

Doug, in honor of one of your pet expressions, I would have preferred to spell this keyword “not-too-tearable”, but that’s truly an opaque phrase.

OK, so the above document defines a nice linear scale of four memory access modes, ordered from weak to strong: Plain, Opaque, Release/Acquire, and Volatile. “Any guaranteed property of a weaker mode, plus more, holds for a stronger mode.” For a JIT writer, this means stronger modes will require additional ordering constraints, in IR and/or as hardware fence instructions, and perhaps also stronger memory access instructions. In the worst case (which we may see with inline types), library calls may be required to perform some accesses, plus the space overhead of control variables for things like seq-locks or mutexes.

The effect of Plain mode on atomicity is described here:

> Additionally, while Java Plain accesses to int, char, short, float, byte, and
> reference types are primitively bitwise atomic, for the others, long, double,
> as well as compound Value Types planned for future JDK releases, it is
> possible for a racy read to return a value with some bits from a write by one
> thread, and other bits from another, with unusable results.

Then, Opaque mode tightens up the behavior of Plain mode by adding Bitwise Atomicity (what I want here), plus three more guarantees: Per-variable antecedence acyclicity, Coherence, and Progress. The document then suggests that these three additional guarantees won’t inconvenience the JIT writer:

> Opaque mode does not directly impose any ordering constraints with respect to
> other variables beyond Plain mode.

But I think there might well be inconveniences.
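For readers following along in Java source terms: the four modes in Doug’s document correspond to VarHandle access methods. A minimal sketch (the class name `Cell` and method `demo` are illustrative, not from any JDK API):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class Cell {
    long value;  // 64-bit field: Plain accesses are allowed to tear under the JMM

    static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(Cell.class, "value", long.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static long demo() {
        Cell c = new Cell();
        c.value = 1L;               // Plain: no atomicity guarantee for long
        VALUE.setOpaque(c, 2L);     // Opaque: bitwise atomic, coherent; no cross-variable ordering
        VALUE.setRelease(c, 3L);    // Release: orders preceding accesses before this store
        VALUE.setVolatile(c, 4L);   // Volatile: full sequentially consistent ordering
        return (long) VALUE.getVolatile(c);
    }
}
```

Each call site down the list asks the JIT for strictly more ordering, which is exactly the cost gradient discussed above.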
Our current prototype doesn’t mess with STM or HTM, but just buffers every new value (under always-atomic or volatile) into a freshly allocated heap node, issues a Release fence, and publishes the node reference into the relevant 64-bit variable. The node reference itself is stored in Plain (Relaxed) mode, not Opaque or Release mode, and subsequent loads are also relaxed (no IR or HW fences).

What we are doing with this buffering trick is meeting the requirements of atomicity by using the previously specified mechanisms for safe publication (of regular identity classes with final instance variables). In order to use this trick correctly we need to ensure that the specified behavior of the always-atomic store does not make additional requirements.

When I look at the HotSpot code, I find that, if I were to classify loads and stores of always-atomic as always-Opaque, I would find myself adding more IR constraints than if I simply use the trick of buffering for safe publication. Maybe HotSpot is doing some overkill on Opaque mode (see notes below for evidence of that), but I can’t help thinking that at least the requirement of Progress (for Opaque) will require the loop optimizer to take special care with always-Opaque variables that it would not have to take with merely always-atomic ones.

This is a roundabout way of saying, “really Opaque? Why not just atomic?” If I take always-Opaque as the definition, I can use a clearly defined category in upcoming JMM revisions (good!), but OTOH I get knock-on requirements (slow-downs) from that same category (bad!).

It’s not right to say, “but always-atomic values will *always* be *slow* as well, so quit complaining about lost optimizations”. That’s because the JVM will often pack small always-atomic values into single memory units (64- or 128-bit, whatever the hardware supports with native atomicity). In such cases, Plain order has a real performance benefit relative to Opaque order, yes?
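The buffering trick above can be sketched in plain Java (all names here are illustrative, and this is a user-level analogy, not the HotSpot implementation): each store allocates a fresh node whose final fields get the safe-publication guarantee of the constructor’s freeze action, and then publishes the node reference, which is itself naturally atomic, with an ordinary Plain write:

```java
// Analogy for the prototype's buffering: an immutable "buffer" node plus a
// plain reference store. The final-field freeze at constructor exit plays the
// role of the Release fence described above.
class BufferedAtomic {
    // Immutable buffer: final fields get final-field publication semantics.
    static final class Node {
        final long lo, hi;  // stand-in for a 128-bit composite value
        Node(long lo, long hi) { this.lo = lo; this.hi = hi; }
    }

    // The reference is written and read in Plain mode; atomicity of the
    // composite comes from the indirection, not from any fence on this field.
    private Node ref = new Node(0L, 0L);  // the default combination

    void store(long lo, long hi) {
        ref = new Node(lo, hi);  // buffer, freeze finals, publish (Plain)
    }

    long[] load() {
        Node n = ref;                      // one atomic reference read
        return new long[] { n.lo, n.hi };  // never a torn (lo, hi) pair
    }
}
```

A racy `load()` can return a stale pair, but never a mixed one, which is exactly the atomicity-without-ordering distinction being argued for.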
So, in the end, I’d like to call it always-atomic, and leave Opaque mode as an additional opt-in for these types.

— John

P.S. More background, FTR:

Our intention with always-atomic types is to guarantee a modest extension of type safety: that combinations of field values which appear from memory reads will never be different from combinations that have been created by constructor code (or else they are the default combination). This appeal to constructors extends type safety in the sense that the inline class is able to exclude from its value set some of the composite values that would otherwise appear if the composite were an unconstrained tuple type (aka. direct product). If the class has the power to absolutely constrain its value set in this way, any type safety properties that depend on exclusion of values can be proven, while if tearing is allowed, type safety proofs must confront all values physically possible for the corresponding tuple type.

In particular, if an inline type implements a var-handle holding the components of an unsafe addressing mode (Object, long), tearing could introduce arbitrary combinations of (previously stored) Object and long values, breaking delicate type safety invariants. Such inlines need to be marked always-atomic to exclude tearing.

This concern is separate from any other control over memory ordering and race exclusion. In particular, the JVM will (by default) freely reorder reads and writes of always-atomic inlines like any other Plain mode operations, as if the inline were a naturally atomic value (like an int or reference). The semantics will be “as if” the inline value were actually represented as a reference to an all-final box-like object. We call such an object a “buffer” to distinguish it from a user-visible “box” or “wrapper object”. The point is not to always require memory allocations when storing an always-atomic value (though such allocations are a valid tactic).
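To make the type-safety point concrete, here is a toy composite (the class name `Range` is hypothetical) whose constructor excludes some tuples from the value set. Downstream code may rely on that exclusion; a torn read pairing `lo` from one write with `hi` from another could produce a value the constructor never admitted, voiding the proof:

```java
// Illustrative composite: the constructor excludes lo > hi, so any proof that
// span() is non-negative rests on no torn (lo, hi) pair ever being observed.
final class Range {
    final int lo, hi;

    Range(int lo, int hi) {
        if (lo > hi)
            throw new IllegalArgumentException("lo > hi");
        this.lo = lo;
        this.hi = hi;
    }

    int span() {
        return hi - lo;  // non-negative only because the constructor ran
    }
}
```

The (Object, long) var-handle example in the text is the high-stakes version of this: there the excluded combinations are not merely inconvenient but memory-unsafe.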
The point is to align atomicity constraints with the safe-publication rules enjoyed by non-inline identity objects, whether or not actual memory buffers are created.

In that case, why not just make inline values safe by default, by analogy with safe publication of their (all-final) indirect cousins? The problem is that safe publication is likely to be more expensive for inlines than for identity objects, so it needs to be an opt-in feature. Hence the very special always-atomic modifier.

(Thought exercise: What would always-atomic mean if applied to stateful identity objects? I think it would mean that updates to the object would become transactional, as if the whole object state were a single always-atomic tuple. I suppose every method body would be treated as a transaction, probably under an N readers / 1 writer discipline. As with inlines, such a transaction would be enforced by whatever tricks the JVM could muster, maybe TSX on Intel, or STM, or hidden buffering.)

To finish the analogy with naturally atomic types (int, reference), an inline can be accessed in Opaque, Release/Acquire, or Volatile mode by means of appropriate var-handle operations (or the equivalent). Such accesses will be subject to stronger ordering constraints. They will also be atomic whether or not the inline type was declared always-atomic.

BTW, some code in HotSpot refers to Plain mode as Relaxed. Both the JVM and the Unsafe API work in the expected way with stores that Release and reads that Acquire. Confusingly, other HotSpot code assigns the term RELAXED to Opaque, using UNORDERED for Plain. Naming is hard. Also, the HotSpot code treats RELAXED/Opaque as stronger, not weaker, than VOLATILE. Not everyone is on the same page yet. So in the current state of the HotSpot code, Opaque (aka. RELAXED > VOLATILE) does in fact add IR ordering constraints where Plain mode does not.
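The thought exercise above (an identity object whose every method body is a transaction under an N readers / 1 writer discipline) can be sketched at the library level with a StampedLock; the class name `Point2D` is made up, and a JVM would of course use HTM or hidden buffering rather than a visible lock:

```java
import java.util.concurrent.locks.StampedLock;

// Sketch of "always-atomic applied to a stateful identity object": each
// method body is a transaction, with N readers / 1 writer via StampedLock.
final class Point2D {
    private final StampedLock lock = new StampedLock();
    private long x, y;  // two longs: individually tearable under Plain mode

    void move(long nx, long ny) {           // writer "transaction"
        long stamp = lock.writeLock();
        try {
            x = nx;
            y = ny;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    long[] read() {                         // reader "transaction"
        long stamp = lock.tryOptimisticRead();
        long rx = x, ry = y;
        if (!lock.validate(stamp)) {        // writer raced us: retry pessimistically
            stamp = lock.readLock();
            try {
                rx = x;
                ry = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return new long[] { rx, ry };
    }
}
```

As with inline types, the observable guarantee is that `read()` never returns a mixed (x, y) pair from two different `move()` calls.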
