On Feb 7, 2020, at 2:05 PM, Brian Goetz <[email protected]> wrote:
>>
>> So, summary:
>>
>> - Yes, we should figure out how to support abstract class supertypes of
>>   inline classes, if only at the VM level;
>> - There should be one way to declare an inline class, with a modifier
>>   saying which projection gets the good name;
>> - Both the ref and val projections should have the same accessibility, in
>>   part so that the compiler can freely use inline widening/narrowing as
>>   convenient;
>> - We would prefer to avoid duplication of the methods on both projections,
>>   where possible;
>> - The migration case requires that, for ref-default inline classes, we
>>   translate so that the methods appear on the ref projection.

Abstract classes, check. User control over good name, check. Co-accessibility of both projections, check. No schema duplication, check. Methods on ref projection for migration, check. Awesome!

I’m relieved that we are embracing abstract classes, because (a) the JVM processes them a little more easily than interfaces, and (b) they have fewer nit-picky limitations than interfaces (toString/equals/hashCode, package access members). Thanks, Dan and whoever else agitated for abstract classes; the JVM thanks you.

I have a tiny reservation about the co-accessibility of both projections, although it’s a good principle overall. There might be cases (migration and maybe new code) where the nullable type has wider access than the inline type, where the type’s contract somehow embraces nullability to the extent that the .val projection is invisible. But we can cross that bridge when and if we come to it; I can’t think of compelling examples.

> Let me flesh this out some more, since the previous mail was a bit of a
> winding lead-up.
>
> #### Abstract class supertypes
>
> It is desirable, both for the migration story and for the language in
> general, for inline classes to be able to extend abstract classes. There are
> restrictions: no fields, no constructors, no instance initializers, no
> synchronized methods. These can be checked by the compiler at compile time,
> and need to be re-checked by the VM at load time.

(Nitpick: The JVM *fully* checks synchronization of such things dynamically; it cannot fully check at load time. Given that, it is not a good idea to partially check for evidence of synchronization; that just creates the semblance of an invariant where one does not exist. The JVM tries hard to make static checks that actually prove things, rather than just “catch user errors”. So, please, no JVM load-time checks for synchronized methods, except *maybe* within the inline classes themselves.)
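
To make the quoted restrictions concrete, here is a rough sketch of the *structural* test a compiler or lint tool might apply, written with ordinary core reflection. Everything in it is an assumption for illustration: `InlineFriendlyCheck`, `looksInlineFriendly`, and the three sample classes are hypothetical names, not part of any proposal. Reflection cannot see constructor bodies or instance initializers, so the sketch only approximates "no constructors" by requiring a single no-arg constructor. It is a tooling-side check and says nothing about what the VM should verify at load time.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class InlineFriendlyCheck {

    /**
     * Hypothetical helper: returns true if the class and all of its
     * superclasses below Object are abstract, declare no instance fields,
     * declare no synchronized instance methods, and have only a no-arg
     * constructor. Constructor bodies and instance initializers are not
     * visible to reflection, so they are NOT checked here.
     */
    static boolean looksInlineFriendly(Class<?> c) {
        if (c.isInterface()) {
            return false; // the restrictions are about abstract classes
        }
        for (Class<?> k = c; k != null && k != Object.class; k = k.getSuperclass()) {
            if (!Modifier.isAbstract(k.getModifiers())) {
                return false; // a concrete class brings a real constructor
            }
            for (Field f : k.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                    return false; // instance field
                }
            }
            for (Method m : k.getDeclaredMethods()) {
                int mods = m.getModifiers();
                if (!Modifier.isStatic(mods) && Modifier.isSynchronized(mods)) {
                    return false; // synchronized instance method
                }
            }
            Constructor<?>[] ctors = k.getDeclaredConstructors();
            if (ctors.length != 1 || ctors[0].getParameterCount() != 0) {
                return false; // anything other than a lone no-arg constructor
            }
        }
        return true;
    }

    // Sample classes, invented for this sketch.
    abstract static class Shape {
        static final String UNIT = "cm"; // static fields are allowed
        abstract double area();
    }

    abstract static class Counter {
        int count; // instance field: not inline-friendly
        abstract void tick();
    }

    abstract static class Locked {
        synchronized void touch() { } // synchronized method: not inline-friendly
    }

    public static void main(String[] args) {
        System.out.println("Shape:   " + looksInlineFriendly(Shape.class));
        System.out.println("Counter: " + looksInlineFriendly(Counter.class));
        System.out.println("Locked:  " + looksInlineFriendly(Locked.class));
    }
}
```

Walking the whole superclass chain reflects the point that every intervening super has to satisfy the restrictions, not just the immediate one.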

> The VM folks have indicated that their preferred way to say "inline-friendly
> abstract class" is to have only a no-arg constructor which is ACC_ABSTRACT.
> For abstract classes that meet the inline-friendly requirement, the static
> compiler can replace the default constructor we generate now with an abstract
> one. The VM would have to be able to deal with subclasses doing
> `invokespecial <init>` super-calls on these.

More info, from a JVM perspective: In that case, and that case alone, the JVM would validly look up the superclass chain for a non-abstract <init> method, and link to that instead. This is a very special case of inheritance where a constructor is inherited and used as-is, rather than wrapped by a subclass constructor. It’s a valid operation precisely because the abstract constructor is provably a no-op.

The Object constructor is the initial point of this inheritance process, and the end of the upward search. I’m leaning towards keeping that as non-abstract, both for compatibility, and as a physical landing place for the upward search past abstract constructors.

For inlines, we say that the inline class constructor is required to inherit the Object constructor, with no non-abstract constructors in intervening supers, and furthermore that the JVM is allowed to omit the call to the Object constructor. This amounts to a special pleading that “everybody knows Object.<init> does nothing”. Actually in HotSpot it does something: For a class with a finalizer it registers something somewhere. But that’s precisely irrelevant to inlines.

>
> My current bikeshed preference for how to indicate these is to do just the
> test structurally, with good error messages, and back it up with annotation
> support similar to `@FunctionalInterface` that turns on stricter type
> checking and documentation support.
> (The case we would worry about, which
> stronger declaration-site indication would help with, would be: a public
> accidentally-inline-friendly abstract class in one maintenance domain,
> extended by an inline class in another maintenance domain, and then
> subsequently the abstract class is modified to, say, add a field. This could
> happen, but likely would not happen that often; we can warn users of the
> risks by additionally issuing a warning on the subclass when the superclass
> is not marked with the annotation.)

That seems OK, even under restrictions about the effects of annotations. Annotations which cause the compiler to exit with an error don’t change the runtime semantics.

And then the translation strategy can say: “I’ve got a new trick up my sleeve! If the constructor is truly empty, with just a delegating call to my super <init>, then I can express this condition as an abstract constructor, rather than some classfile boilerplate.”

As a JVM person, I’m always itchy when somebody pours boilerplate into classfiles. Maybe I need to write a “boilerplate considered harmful” manifesto about classfiles and translation strategies.

> #### Val and ref projections
>
> …

(Yay!)

> #### Translation -- classfiles
>
> A val-default inline class `C` is translated to two classfiles, `C` (val
> projection) and `C$ref` (ref projection). A ref-default inline class `D` is
> translated to two classfiles, `D` (ref projection) and `D$val` (val
> projection), as follows:
>
> - The two classfiles are members of the same nest.
> - The ref projection is a sealed abstract class that permits only the val
>   projection.
> - Instance fields are lifted onto the val projection.
> - Supertypes, methods (including static methods, and including the static
>   "constructor" factory), and static fields are lifted onto the ref projection.
>   Method bodies may internally require downcasting to `C.val` to access
>   fields.
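
The shape of the quoted translation can be imitated in today's Java (17 or later, for `sealed`) with two ordinary classes. This is only an emulation under stated assumptions: `PointRef` and `PointVal` are hypothetical stand-ins for `C$ref` and `C`, both are plain identity classes rather than real inline classes, and `narrow` is an invented helper that mimics ref-to-val narrowing as a cast plus null check.

```java
import java.util.Objects;

final class ProjectionSketch {

    // Stand-in for the ref projection: a sealed abstract class that permits
    // only the val projection. Methods and the static factory live here.
    abstract static sealed class PointRef permits PointVal {

        // The static "constructor" factory is lifted onto the ref projection.
        static PointRef of(int x, int y) {
            return new PointVal(x, y);
        }

        // Method bodies downcast to the val projection to reach the fields.
        int x() {
            return ((PointVal) this).x;
        }

        int y() {
            return ((PointVal) this).y;
        }

        int manhattan() {
            PointVal v = (PointVal) this;
            return Math.abs(v.x) + Math.abs(v.y);
        }
    }

    // Stand-in for the val projection: it holds only the instance fields.
    static final class PointVal extends PointRef {
        final int x;
        final int y;

        PointVal(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Hypothetical helper: narrowing ref -> val as a cast plus null check.
    static PointVal narrow(PointRef r) {
        return (PointVal) Objects.requireNonNull(r, "C.ref was null");
    }

    public static void main(String[] args) {
        PointRef r = PointRef.of(3, -4);    // widening val -> ref is plain subtyping
        System.out.println(r.manhattan());  // method found on the ref projection
        System.out.println(narrow(r).x);    // field access goes through the val projection
    }
}
```

Because the ref stand-in is sealed to a single subclass, the downcasts inside the method bodies cannot fail for a non-null receiver, which is the property the quoted scheme relies on.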

This is a little like MVT, in that inline classes end up containing very little other than fields. This is the right move, IMO, for migrated classes.

Hollowing out *all* inline classes strikes me as over-rotation for the sake of migration. I see how it allows both cases to have the same translation strategy, *except for the name*. That’s a pleasing property on paper. Maybe I can get used to it, but I’m uncomfortable with loading everything into the ref class even in the val-default case. I’d prefer (if consistency were not an issue) to make the ref class be completely empty (except for an abstract constructor), just like a marker interface, for the common case of a val-default.

In the case of reflection, I think we can afford to show a consistent view for both kinds of inlines, by making all fields and methods appear on both projections. In other words, core reflection doesn’t require you to hunt around through both projections to find some API point; all API points are present on both projections. Does anybody see a downside to that?

If we put API points just where they appear in the classfile, then people have to hunt around, which is bad, since it’s a translation strategy option which might conceivably change. If we put API points only on the val projection, legacy code will fail for migrated classes. If we put API points only on the ref projection, then users of val-default classes will be always fumbling around to fetch the ref projection when they reflect API points. So reflecting everything in both places looks OK to me.

If we support non-sealed abstract supers of inlines (records!) then the hack on core reflection should copy the API points from the inline *only* if the super is sealed uniquely to the inline.

> #### Translation -- uses
>
> Variables of type `C.ref` are translated as L types (`LC` or `LC$ref`,
> depending); variables of type `C.val` are translated as Q types (`QC` or
> `QC$val`, depending.)
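
The "depending" in the quoted paragraph can be spelled out as a small function. The sketch below is illustrative only: `DescriptorSketch`, `Projection`, and `descriptor` are hypothetical names, and the `Q` descriptor form exists only in this proposal, not in any released JVM. The one assumption is the naming rule from the translation section: the default projection keeps the plain class name, and the other projection gets the `$ref` or `$val` suffix.

```java
final class DescriptorSketch {

    enum Projection { REF, VAL }

    /**
     * Hypothetical helper: computes the field descriptor for a use of an
     * inline class under the proposed scheme.
     *
     * @param name       plain class name, e.g. "C"
     * @param valDefault true if the class is val-default, false if ref-default
     * @param p          which projection the variable's type denotes
     */
    static String descriptor(String name, boolean valDefault, Projection p) {
        boolean isDefaultProjection = (p == Projection.VAL) == valDefault;
        String suffix = (p == Projection.VAL) ? "$val" : "$ref";
        String classfileName = isDefaultProjection ? name : name + suffix;
        String kind = (p == Projection.VAL) ? "Q" : "L"; // Q for val, L for ref
        return kind + classfileName + ";";
    }

    public static void main(String[] args) {
        // Val-default class C: classfiles C (val) and C$ref (ref).
        System.out.println(descriptor("C", true, Projection.VAL));
        System.out.println(descriptor("C", true, Projection.REF));
        // Ref-default class D: classfiles D (ref) and D$val (val).
        System.out.println(descriptor("D", false, Projection.REF));
        System.out.println(descriptor("D", false, Projection.VAL));
    }
}
```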

The Q-descriptor gives a necessary and sufficient signal to the JVM to load the inline class and determine its layout. The JVM is free to reject QR; where R fails to be an inline class, and the JVM is free to treat LV;, where V is an inline class, as an ill-defined descriptor, like L__noSuchClass;. (Note that the JVM does not *reject* ill-defined descriptors; it’s physically impossible, except in special cases like the resolution of C_MethodType. Resolving a C_MT of ()LV; should fail, though, if V is an inline. It was the compiler’s responsibility to say ()QV; in such a case.)

> `C.val` is widened to `C.ref` by direct assignment, since in the VM, an
> inline class is related to its supertypes by subtyping. `C.ref` is narrowed
> to `C.val` by casting, which the VM can optimize to a null check.

+1

> Instance field accesses on `C.val` are translated to `getfield`; field
> accesses on `C.ref` are translated by casting to `C.val` and `getfield`.

+1 Construction requests on either type are translated to calls to a factory C.val::<init>.

> Method invocation on `C.val` or `C.ref` can be translated directly, except
> for private methods, which would require casting `C.val` to `C.ref` first
> (not because they are inaccessible, but because they are not inherited.)
> Same for static fields.

+0.5; I see this is a place where consistency pays off, just a little, but I’m still annoyed that the ref class gets all the members except fields and constructors. If we flip the other way, then it’s like this:

Method invocation on `C.val` or `C.ref` can be translated directly, except for private methods, which may require casting first *to the class holding the method* (not because they are inaccessible, but because they are not inherited.) Same for static fields.

By saying “the class holding the method” instead of “the C.ref”, we preserve the immediate goal of supporting migration of Optional etc., but we incur some migration debt, because it’s harder to move from val-default to ref-default. This, I think, is best fixed by adding auto-bridging of some sort later, rather than over-rotating towards the migration case right now.

(Did I miss some other reason for putting everything on C.ref?)

> Conversion of `C.ref` to supertypes is ordinary subtyping; conversion of
> `C.val` goes through widening to `C.ref`. Similarly, `instanceof` on an
> operand of type `C.val` goes through casting to `C.ref`.

Casting (actually, unboxing) conversion of C.ref to C.val is a regular checkcast. Conversion (via cast or anything else) of C.val to C.ref is a no-op. Instanceof never needs a checkcast, because the JVM treats the operand of instanceof as an untyped reference; there’s nothing new here for instanceof.

> There are other stackings, of course, but this is a starting point, chosen
> for simplicity and compatibility.

I like it, very very much, with the one reservation harped on above.

— John
