Re: User model stacking: current status

Brian Goetz Mon, 13 Jun 2022 16:36:06 -0700

I've done a little more shaking of this tree. It involves keeping the notion that the non-identity buckets differ only in the treatment of their val projection, but makes a further normalization that enables the buckets to mostly collapse away.


"value class X" means:


 - Instances are identity-free

 - There are two types, X.ref (reference, nullable) and X.val (direct, non-nullable)

 - Reference types are atomic, as always
 - X is an alias for X.ref

Now, what is the essence of B2? B2 means not "I hate zeros", but "I don't like that uninitialized variables are initialized to zero." It doesn't mean the .val projection is meaningless, it means that we don't trust arbitrary clients with it. So, we can make a slight adjustment:

 - The .val type is always there, but for "B2" classes, it is *inaccessible outside the nest*, as per ordinary accessibility.

This means that within the nest, code that understands the restrictions can, say, create `new X.val[7]` and expose it as an `X[]`, as long as it doesn't let the zero escape. This gives B2 classes a lot more latitude to use the .val type in safe ways. Basically: if you don't trust people with the .val type, don't let the val type escape.
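Value classes and `.val` arrays do not exist in today's Java, so the following is only an analogy, not the proposed feature. A private `int[]` stands in for `new X.val[7]`, and the hypothetical class `PositiveInt` (zero is not a valid instance) plays the role of the nest that never lets a zero slot escape.

```java
import java.util.AbstractList;
import java.util.List;

// Hypothetical stand-in for a "B2" class: zero is not a valid PositiveInt.
final class PositiveInt {
    private final int value;

    private PositiveInt(int value) {
        if (value <= 0) throw new IllegalArgumentException("must be positive: " + value);
        this.value = value;
    }

    static PositiveInt of(int value) { return new PositiveInt(value); }

    int value() { return value; }

    // Returns a read-only view backed by a flat int[], the analogue of X.val[].
    // The raw array starts out all zeros (invalid), but it is only reachable
    // through the view, and the view is created after every slot is filled.
    static List<PositiveInt> firstN(int n) {
        int[] flat = new int[n];          // zero-initialized, like new X.val[n]
        for (int i = 0; i < n; i++) {
            flat[i] = i + 1;              // overwrite each zero with a valid payload
        }
        return new AbstractList<PositiveInt>() {
            @Override public PositiveInt get(int i) { return new PositiveInt(flat[i]); }
            @Override public int size() { return flat.length; }
        };
    }

    public static void main(String[] args) {
        List<PositiveInt> xs = firstN(7);
        System.out.println(xs.size() + " elements, first = " + xs.get(0).value());
    }
}
```

Clients only ever see `PositiveInt` references; the flat, zero-defaulted storage stays private to the class that understands the restriction.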


There's a bikeshed to paint, but it might look something like:

    value class B2 {
        private class val { }
    }

or, flipping the default:

    value class B3a {
        public class val { }
    }

So B2 is really a B3a whose value projection is encapsulated.

The other bucket, B3n, I think can live with a modifier:

    non-atomic value class B3n { }

While these are all the same buckets as before, this feels much more like "one new bucket" (the `non-atomic` modifier is like `volatile` on a field; we don't think of this as creating a different bucket of fields.)


Summary:

    class B1 { }
    value class B2 { private class val { } }
    value class B3a { }
    non-atomic value class B3n { }

Value class here is clearly the star of the show; all value classes are treated uniformly (ref-default, have a val); some value classes encapsulate the val type; some value classes further relax the integrity requirements of instances on the heap, to get better flattening and performance, when their semantics don't require that integrity.

It's an orthogonal choice whether the default is "val is private" or "val is public".




On 6/3/2022 3:14 PM, Brian Goetz wrote:

Continuing to shake this tree.

I'm glad we went through the exploration of "flattenable B3.ref"; while I think we probably could address the challenges of tearing across the null channel / data channels boundary, I'm pretty willing to let this one go. Similarly I'm glad we went through the "atomicity orthogonal to buckets" exploration, and am ready to let that one go too.

What I'm not willing to let go of is making atomicity explicit in the model. Not only is piggybacking non-atomicity on something like val-ness too subtle and surprising, but non-atomicity seems like it is a property that the class author needs to ask for. Flatness is an important benefit, but only when it doesn't get in the way of safety.

Recall that we have three different representation techniques:

 - no-flat -- use a pointer
 - low-flat -- for sufficiently small (depending on size of atomic instructions provided by the hardware) values, pack multiple fields into a single, atomically accessed unit.
 - full-flat -- flatten the layout, access individual fields directly, may allow tearing.

The "low-flat" bucket got some attention recently when we discovered that there are usable 128-bit atomics on Intel (based on a recent revision of the chip spec), but this is not a slam-dunk; it requires some serious compiler heroics to pack multiple values into single accesses. But there may be targets of opportunity here for single-field values (like Optional) or final fields. And we can always fall back to no-flat whenever the VM feels like it.
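The "low-flat" idea can be imitated by hand in current Java. This is only a sketch of the packing trick, not of what the VM would do: the hypothetical class `PackedPoint` squeezes two 32-bit fields into one 64-bit word held in an `AtomicLong`, so a reader always sees both fields from the same write.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical two-field value (x, y) stored "low-flat": both 32-bit fields are
// packed into one 64-bit word so a single atomic access reads or writes both.
final class PackedPoint {
    private final AtomicLong word = new AtomicLong();

    static long pack(int x, int y) {
        // x goes in the high 32 bits; y is masked so its sign bits do not smear into x.
        return ((long) x << 32) | (y & 0xFFFF_FFFFL);
    }

    static int unpackX(long w) { return (int) (w >> 32); }

    static int unpackY(long w) { return (int) w; }

    void set(int x, int y) { word.set(pack(x, y)); }   // one atomic store

    int[] get() {
        long w = word.get();                             // one atomic load
        return new int[] { unpackX(w), unpackY(w) };
    }

    public static void main(String[] args) {
        PackedPoint p = new PackedPoint();
        p.set(-3, 7);
        int[] xy = p.get();
        System.out.println(xy[0] + ", " + xy[1]);
    }
}
```

The approach stops working as soon as the fields no longer fit the widest atomic access available, which is why the size limit in the list above depends on the hardware.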

One of the questions that has been raised is how similar B3.ref is to B2, specifically with respect to atomicity. We've gone back and forth on this.

Having shaken the tree quite a bit, what feels like the low energy state to me right now is:

 - The ref types of all non-identity classes are treated uniformly; B3.ref and B2.ref are translated the same, treated the same, have the same atomicity, the same nullity, etc.
 - The only difference across the spectrum of non-identity classes is the treatment of the val type. For B2, this means the val type is *illegal*; for B3, this means it is atomic; for B3n, it is non-atomic (which in practice will mean more flatness.)
 - (controversial) For all types, the ref type is the default. This means that some current value-based classes can migrate not only to B2, but to B3 or B3n. (And that we could migrate to B2 today and further to B3 tomorrow.)

While this is technically four flavors, I don't think it needs to feel that complex. I'll pick some obviously silly modifiers for exposition:

 - class B1 { }
 - zero-hostile value class B2 { }
 - value class B3 { }
 - tearing-happy value class B3n { }

In other words: one new concept ("value class"), with two sub-modifiers (zero-hostile, and tearing-happy) which affect the behavior of the val type (forbidden for B2, loosened for B3n.)

For heap flattening, what this gets us is:

 - B1 -- no-flat
 - B2, B3.ref, B3n.ref -- low-flat atomic (with null channel)
 - B3 -- low-flat (atomic, no null channel)
 - B3n -- full-flat (non-atomic, no null channel)

This is a slight departure from earlier tree-shakings with respect to tearing. In particular, refs do not tear at all, so programs that use all refs will never see tearing (but it is still possible to get a torn value using .val and then box that into a ref.)
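To make "tear, then box" concrete, here is a single-threaded replay of one unlucky interleaving, written in ordinary Java. Everything in it is an assumption for illustration: the hypothetical `Range` has the invariant `lo <= hi`, and a two-slot `int[]` stands in for a full-flat, non-atomic layout.

```java
// Hypothetical "B3n"-style value with invariant lo <= hi, stored full-flat as two
// independent slots. A reader that runs between the two slot writes sees a torn pair.
final class TornRangeDemo {
    // Immutable "boxed" form. It deliberately does not re-check the invariant,
    // mirroring the point that boxing a torn value does not repair it.
    static final class Range {
        final int lo;
        final int hi;
        Range(int lo, int hi) { this.lo = lo; this.hi = hi; }
        boolean valid() { return lo <= hi; }
    }

    // Flat storage: the two fields live in separate array slots.
    private final int[] slots = new int[2];

    void writeLo(int lo) { slots[0] = lo; }   // first half of a non-atomic write
    void writeHi(int hi) { slots[1] = hi; }   // second half of a non-atomic write

    Range readAndBox() { return new Range(slots[0], slots[1]); }

    // Replays the interleaving on one thread so the outcome is deterministic.
    static Range tornRead() {
        TornRangeDemo cell = new TornRangeDemo();
        cell.writeLo(1);
        cell.writeHi(2);                      // cell holds the valid range [1, 2]
        cell.writeLo(10);                     // writer starts storing [10, 20] ...
        Range seen = cell.readAndBox();       // ... reader boxes what it sees right now
        cell.writeHi(20);                     // writer finishes
        return seen;
    }

    public static void main(String[] args) {
        Range r = tornRead();
        System.out.println("lo=" + r.lo + " hi=" + r.hi + " valid=" + r.valid());
    }
}
```

The boxed `Range` is itself immutable and never changes again, yet it carries a combination of fields that no writer ever intended, which is the residual risk the paragraph describes.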

If you turn this around, the declaration-site decision tree becomes:

 - Do I need identity (mutability, subclassing, aliasing)? Then B1.
 - Are uninitialized values unacceptable?  Then B2.
 - Am I willing to tolerate tearing to enable more flattening?  Then B3n.
 - Otherwise, B3.
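
The declaration-site questions can be read as a first-match chain. A minimal sketch in ordinary Java follows; the class, enum, and parameter names are made up for illustration and are not part of any proposal.

```java
// Encodes the declaration-site decision tree as a first-match chain.
final class DeclarationSite {
    enum Bucket { B1, B2, B3, B3N }

    // Questions are asked in the same order as the list; the first "yes" wins.
    static Bucket choose(boolean needsIdentity,
                         boolean zeroUnacceptable,
                         boolean toleratesTearing) {
        if (needsIdentity) return Bucket.B1;
        if (zeroUnacceptable) return Bucket.B2;
        if (toleratesTearing) return Bucket.B3N;
        return Bucket.B3;
    }

    public static void main(String[] args) {
        // No identity needed, but an all-zero instance would be meaningless.
        System.out.println(choose(false, true, false));
    }
}
```

Because the order matters, a class that both rejects zero and would tolerate tearing still lands in B2 under this reading.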

And the use-site decision tree becomes:

 - For B1, B2 -- no choices to make.
 - Do I need nullity?  Then .ref
 - Do I need atomicity, and the class doesn't already provide it?  Then .ref
 - Otherwise, can use .val
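
The use-site choice can be sketched the same way. The names below are hypothetical, and the `classIsAtomic` rule simply restates the earlier bullet that B3's val type is atomic while B3n's is not.

```java
// Encodes the use-site decision tree: which projection should a client name?
final class UseSite {
    enum Bucket { B1, B2, B3, B3N }
    enum Projection { REF, VAL }

    static Projection choose(Bucket bucket, boolean needsNull, boolean needsAtomicity) {
        // B1 and B2 offer clients only the reference type, so there is no choice.
        if (bucket == Bucket.B1 || bucket == Bucket.B2) return Projection.REF;
        if (needsNull) return Projection.REF;
        // Only B3n gives up atomicity on its val type; B3.val is already atomic.
        boolean classIsAtomic = bucket != Bucket.B3N;
        if (needsAtomicity && !classIsAtomic) return Projection.REF;
        return Projection.VAL;
    }

    public static void main(String[] args) {
        System.out.println(choose(Bucket.B3N, false, true));
        System.out.println(choose(Bucket.B3, false, true));
    }
}
```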

The main downside of making ref the default is that people will grumble about having to say .val at the use site all the time. And they will! And it does feel a little odd that you have to opt into val-ness at both the declaration and use sites. But it unlocks a lot of things (see Kevin's list for more):

 - The default name is the safest version.
 - Every unadorned name works the same way; it's always a reference type. You don't need to maintain a mental database around "which kind of name is this".
 - Migration from B1 -> B2 -> B3 is possible. This is huge (and more than we had hoped for when we started this game.)

(The one thing to still worry about is that while refs can't tear, you can still observe a torn value through a ref, if someone tore it and then boxed it. I don't see how we defend against this, but the non-atomic label should be enough of a warning.)

On 5/6/2022 10:04 AM, Brian Goetz wrote:

In this model, (non-atomic B3).ref takes the place of (non-atomic B2) in the stacking I've been discussing. Is that what you're saying?

    class B1 { }  // ref, identity, atomic
    value-based class B2 { }  // ref, non-identity, atomic
    [ non-atomic ] value class B3 { }  // ref or val, zero is ok, both projections share atomicity

If we go with ref-default, then this is a small leap from yesterday's stacking, because "B3" and "B2" are both reference types, so if you want a tearable, non-atomic reference type, saying `non-atomic value class B3` and then just using B3 gets you that. Then:

 - B2 is like B1, minus identity
 - B3 means "uninitialized values are OK, you get two types, a zero-default and a non-default"
 - Non-atomicity is an extra property we can add to B3, to get more flattening in exchange for less integrity
 - The use cases for non-atomic B2 are served by non-atomic B3 (when .ref is the default)

I think this still has the properties I want; I can freely choose the reasonable subsets of { identity, has-zero, nullable, atomicity } that I want; the orthogonality of non-atomic across buckets becomes orthogonality of non-atomic with nullity, and the "B3.ref is just like B2" is shown to be the "false friend."
