----- Original Message -----
> From: "daniel smith" <daniel.sm...@oracle.com>
> To: "valhalla-spec-experts" <valhalla-spec-experts@openjdk.java.net>
> Sent: Tuesday, 5 October 2021 01:34:37
> Subject: Addressing the full range of use cases
> When we talk about use cases for Valhalla, we've often considered a very broad
> set of class abstractions that represent immutable, identity-free data. JEP 401
> mentions varieties of integers and floats, points, dates and times, tuples,
> records, subarrays, cursors, etc. However, as shorthand this broad set often
> gets reduced to an example like Point or Int128, and these latter examples are
> not necessarily representative of all candidate value types.

Yes!

> Specifically, our favorite example classes have a property that doesn't
> generalize: they'll happily accept any combination of field values as a valid
> instance. (In fact, they're even happy to accept any combination of *bits* of
> the appropriate length.) Many candidate primitive classes don't have this
> property—the constructors do important validation work, and only certain
> combinations of fields are allowed to represent valid instances.

I now believe the mantra "codes like a class, works like an int" is harmful: a
class provides encapsulation, an int has no encapsulation, so there is a
mismatch.

> Related areas of concern that we've had on the radar for a while:
>
> - The "all zeros is your default value" strategy forces an all-zero instance
> into the class's value set, even if that doesn't make sense for the class.
> Many candidate classes have no reasonable default at all, leading naturally to
> a wish for "null is your default value" (or other, more exotic, strategies
> involving revisiting the idea that every type has a default value). We've
> provided 'P.ref' for those use sites that *need* null, but haven't provided a
> complete story for value types that want it to be *their* default value, too.
>
> - Non-atomic heap updates can be used to create new instances that arbitrarily
> combine previously-validated instances' fields. There is no guarantee that the
> new combination of fields is semantically valid.
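The hazard just described (racy, non-atomic updates mixing the fields of two valid instances) can be simulated in plain Java today. The Range class below is a hypothetical example written as an ordinary mutable class, not Valhalla syntax; it stands in for a flattened two-field value.

```java
// Plain-Java simulation of the hazard above: a racy reader of a value whose
// two fields are updated non-atomically can observe a combination of fields
// that no constructor ever produced. (Range is a hypothetical example class.)
final class Range {
    int lo, hi;   // intended invariant: lo <= hi
}

public class StructTearingDemo {
    static final Range shared = new Range();   // stand-in for a flattened field

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                // Alternate between two individually valid values: [0,1] and [2,3].
                shared.lo = 0; shared.hi = 1;
                shared.lo = 2; shared.hi = 3;
            }
        });
        writer.start();
        for (int i = 0; i < 100_000; i++) {
            int lo = shared.lo;
            int hi = shared.hi;
            if (lo > hi) {   // e.g. lo from [2,3] combined with hi from [0,1]
                System.out.println("observed mixed fields: [" + lo + "," + hi + "]");
                break;
            }
        }
        writer.join();   // after join, the writer's last value [2,3] is visible
    }
}
```

Whether a mixed read is actually observed depends on the scheduler and hardware; the point is that the memory model permits it, and every mixed combination is still a structurally well-formed Range.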
> Again, while there's precedent for this with 'double' and 'long' (JLS 17.7),
> those are special cases that don't generalize—any combination of double bit
> fields is *still a valid double*. (This is usually described as "tearing",
> although JLS 17.6 has something else in mind when it uses that word...) The
> language provides 'volatile' as a use-site opt-in to atomicity, and we've
> toyed with a declaration-site opt-in as well. But object integrity being "off"
> by default may not be ideal.
>
> - Existing class types like LocalDate are both nullable and atomic. These are
> useful properties to preserve during migration; nullability, in particular, is
> essential for source compatibility. We've provided reference-default
> declarations as a mechanism to make reference types (which have these
> properties) the default, with 'P.val' as an opt-in to value types. But in
> doing so we take away the many benefits of value types by default, and force
> new code to work with the "bad name".

The existing class LocalDate is not atomic per se: "atomic" in Java implies
volatile, and currently, if a LocalDate field is updated in one thread, another
thread may never see that update. LocalDate is currently not tearable, whereas
a 'QLocalDate;' is tearable in the presence of racy code. And yes, nullability
is a huge compatibility issue.

> While we can provide enough knobs to accommodate all of these special cases,
> we're left with a complex user model which asks class authors to make n
> different choices they may not immediately grasp the consequences of, and
> class users to keep 2^n different categories straight in their heads.

Yes!

> As an alternative, we've been exploring whether a simpler model is workable.
> It is becoming clear that there are (at least) two clusters of uses for value
> types.
> The "classic" value types are like numerics -- they'll happily accept any
> combination of field values as a valid instance, and the zero value is a
> sensible (often the best possible) default value. They make relatively little
> use of encapsulation. These are the ones that best "work like an int." The
> "encapsulated" value types are those that are more like typical aggregates
> ("codes like a class") -- their constructors do important validation work, and
> only certain combinations of fields are allowed to represent valid instances.
> These are more likely to not have valid zero values (and hence want to be
> nullable).

I agree.

> Some questions to consider for this approach:
>
> - How do we group features into clusters so that they meet the sweet spot of
> user expectations and use cases while minimizing complexity? Is two clusters
> the right number? Is two already too many? (And what do we call them? What
> keywords best convey the intended intuitions?)

Two is too many, see below.

> - If there are knobs within the clusters, what are the right defaults? E.g.,
> should atomicity be opt-in or opt-out?

I prefer opt-in, see below.

> - What are the performance costs (or, in the other direction, performance
> gains) associated with each feature? For certain feature combinations, have we
> canceled out the performance gains over identity classes (and at that point,
> is that combination even worth supporting?)

Good question... Let me reformulate. But first, we can note that there are
three ways of specifying primitive class features:
- we can use different types, for example Foo.val vs Foo.ref;
- we can use container attributes (opt-in or opt-out), for example declaring a
  field volatile makes it non-tearable;
- we can use runtime knobs, for example an array that does or does not allow
  null.

First, the problem. As you said, if we have code like the one just below, the
field primFoo is flattened, so primFoo.someValue is 0, bypassing the
constructor.
  primitive class PrimFoo {
    PrimFoo(int someValue) {
      if (someValue == 0) {
        throw new IllegalArgumentException();
      }
      this.someValue = someValue;
    }

    int someValue;
  }

  class Foo {
    PrimFoo primFoo;   // flattened: primFoo.someValue starts at 0
  }

I believe we should try to make a primitive class nullable and flattenable by
default, so that we have one tent pole plus knobs for two special cases:
non-nullable primitive classes (for use cases like Complex) and non-flattenable
classes when stored in a field or array cell (the "atomicity" use case).

So a primitive class (the default):
- represents the null value (its initial value) with a supplementary field when
  stored on the heap, and a supplementary register if necessary;
- is tearable in case of racy code (don't write racy code);
- is represented by a Q-type in the bytecode for full flattening, or by an
  L-type using a pointer to be backward compatible;
- is represented by two different java.lang.Class objects (one for the Q-type,
  the primary class, and one for the L-type, the secondary class).

I think that a Q-type can be backward compatible with an L-type in method
descriptors: a Q-type should be represented as an L-type plus an out-of-band
bit saying that this is a Q-type, so that it is loaded eagerly (like we use
out-of-band attributes for generic specialization). Obviously, the way to
create a Q-type (default + with + with) is still different from the way to
create an L-type (new + dup + invokespecial), so creating a Q-type instead of
an L-type is not backward compatible. So the VM has to generate several method
entry points for any method that is annotated with the attribute saying there
is a Q-type in its descriptor (or that overrides a method carrying such an
attribute).

The special cases:

1) Non-nullable when flattened. I believe that all primitive classes should be
nullable, but a user should have a knob to declare that a primitive class is
non-nullable when flattened. So the VM will throw an NPE if null is stored in a
field or an array annotated with something saying that null is not a supported
value.
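A minimal plain-Java sketch of that null-restriction knob: NullRestrictedCell is a hypothetical class standing in for a flattened field or array cell whose metadata says "null is not a supported value", and the VM-inserted check becomes an explicit requireNonNull at the store site.

```java
import java.util.Objects;

// Sketch of a null-restricted flattened container (hypothetical class): the
// null check happens at the store, so the flattened storage never has to
// encode null at all, and reads stay cheap.
final class NullRestrictedCell<T> {
    private T value;   // imagine this as flattened storage with no null channel

    void store(T newValue) {
        // Storing null fails fast with an NPE, as proposed above.
        this.value = Objects.requireNonNull(newValue, "null not supported here");
    }

    T load() {
        return value;
    }
}
```

Putting the check at the store site is the design choice that matters: stores are rarer than loads, and rejecting null on entry means the invariant holds for every subsequent read.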
For arrays, we already have that bit at runtime; I believe we should have a
field modifier saying whether null is a possible value when flattened.

2) Non-tearable. We already support the modifier 'volatile' to say that a
primitive class should be manipulated by pointer. Should we also have a
declaration-site keyword? I don't know; it is perhaps a corner case where not
using a primitive class at all is the better answer.

To summarize, I believe that if a primitive class is always nullable (apart
from some opt-in special cases), it can be backward compatible (enough) to
transform all value-based classes into primitive classes and just let the new
version of javac replace all the L-types with Q-types in method descriptors
(using an attribute), without asking users to think too much about it (apart
from racy code).

regards,
Rémi