Re: minimal value types proposal

John Rose Mon, 29 Aug 2016 17:05:23 -0700

On Aug 29, 2016, at 4:17 PM, Dan Smith <daniel.sm...@oracle.com> wrote:
> 
> Some high-level feedback from me:
> 
> I think the idea is reasonable. In other circles, we might call this a 
> "milestone". Should we define a first milestone that we're willing to commit 
> to strongly, with some sort of distribution channel (something better than 
> build-your-own-JDK) and some level of support commitment to users who want to 
> get their hands dirty? Sure, absolutely.
> 
> There are some design decisions that surprise/confuse me. Basically, this is 
> me saying "YAGNI" over and over again:
> 
> 1) Automatic boxing adds tons of complexity, and I don't see the benefit. The 
> feature eliminates boilerplate and supports migration, but I'm not looking 
> for either of those in a minimal first step. We're talking about a handful of 
> value types, which can easily be defined like this:
> 
> class Val {
>    public final int i;
>    public final int j;
>    public static ValBox box(Val x) { return new ValBox(x); }
>    public static Val unbox(ValBox b) { return b.x; }
> }
> 
> class ValBox implements Foo {
>    public final Val x;
>    public ValBox(Val x) { this.x = x; }
> }
> 
> Get rid of boxes, and you can get rid of interfaces, default methods, 
> automatic conversions, constructors, …


It's worth thinking about, and Brian has encouraged me to think about it also.

Boxes (and the other stuff you mention) are so useful that removing them may 
well cause more trouble than supporting them up front.  Inside the JVM, we need 
a boxed representation for some data flows (unless we make all data flows 
radically value-safe up front).  For the user, a boxed representation is needed 
for basic debuggability.  What does println or JVMTI do unless there's a box?

I do like the idea of requiring the user to set up both classes manually, at 
first.  It has the advantage of making very clear (all too clear) the 
distinction between the Q-type and the L-type:  No source code defines both; 
the Val guy would (presumably) disable its L-type so people could not use it.  
(Maybe the JVM would use that for an internal box:  But see where that string 
leads!)  Maybe that's the way to go, if the JVM implementation of the 
single-source-class solution turns out to be difficult.
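
A minimal sketch of the manual two-class arrangement, written in today's Java. It assumes an ordinary final class can stand in for the Q-type, since no Q-types exist yet. The constructor, the toString body and the ManualBoxDemo class are illustrative additions, and the Foo interface from the quoted example is left out. The box is the place where println-style conventions can live.

```java
// Sketch only: Val is a plain final class standing in for a value type.
final class Val {
    public final int i;
    public final int j;

    Val(int i, int j) { this.i = i; this.j = j; }

    // Explicit conversions take the place of automatic boxing.
    public static ValBox box(Val x) { return new ValBox(x); }
    public static Val unbox(ValBox b) { return b.x; }
}

// Hand-written L-type companion for Val.
final class ValBox {
    public final Val x;

    public ValBox(Val x) { this.x = x; }

    // Object conventions such as toString are defined on the box,
    // which is what println and debuggers would see.
    @Override
    public String toString() { return "Val[i=" + x.i + ", j=" + x.j + "]"; }
}

class ManualBoxDemo {
    public static void main(String[] args) {
        Val v = new Val(1, 2);
        System.out.println(Val.box(v)); // printing goes through ValBox.toString
    }
}
```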

> 2) Instance methods also add tons of complexity.

I disagree; I think the incremental complexity is comparable to trying to do 
everything with statics, which is why I'm recommending this in the minimal 
model.

The only invocation path for instance methods (and instance fields) on Q-types 
is through method handles.  Method handles treat all arguments (including 
'this') symmetrically, so any effort applied to have them work on Q-types *at 
all* will apply to 'this' parameters for Q-types.
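
The symmetry is visible in the existing java.lang.invoke API. As a small illustration (using String as a stand-in receiver, since Q-types are not available), a handle for an instance method simply lists the receiver as its first parameter. The class and helper names below are made up for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ReceiverSymmetryDemo {
    // Looks up the instance method String.length().
    static MethodHandle lengthHandle() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // The receiver is passed like any other argument.
    static int lengthOf(String s) {
        try {
            return (int) lengthHandle().invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        // The handle's type has String as parameter 0 and int as the return type.
        System.out.println(lengthHandle().type());
        System.out.println(lengthOf("value"));
    }
}
```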

Perhaps you are objecting to the inefficiency of operating on 'this' in the 
boxed L-type form, when the operation starts as a MH-based invocation of a 
Q-type?  That's only a startup transient; there are several tactics we can use 
to remove it.  For example, box elision (already in the JITs, though not 
value-friendly yet) would remove boxing overheads without requiring any manual 
recoding at all.

> Again, they only exist for convenience and migration. If static methods can 
> operate on value types, that's all you need. No longer necessary to deal with 
> bytecode written to operate on an L-typed 'this' and somehow re-interpret it 
> for a Q-typed 'this'. No longer necessary to deal with Object methods 
> (because no operation supports invoking them).

Convenience and migration cannot be driven to zero; that optimizes for 
"minimal" at the expense of "viable".  To preserve viability, there are at 
least a few really basic conventions, like Object.toString, that would have to 
be re-encoded using such statics.  Re-building virtuals (at least some of them) 
on top of statics has its own cost, in wasted motion and confusion.

> (If we really do want instance methods, I suggest making 'this' Q-typed to 
> begin with, not diverting resources into figuring out how to make L-typed 
> instance methods efficient.)

Making L-typed instance methods efficient is a sunk cost; it's something the 
JITs are already good at.

We can and should work towards real Q-typed 'this'.  The simplest way is what 
I'm proposing with the method handle hack.  In addition, I suggest 
experimentally modifying javac to emit two copies of non-static methods in 
value-capable classes, one with the standard bytecodes, and one as a static 
(with mangled name) which takes a Q-typed 'this' in local 0.  Then teach the 
method handle resolver to find these guys and bind them, in preference to the 
boxed-this dance.  Users can get on with their business, unaware of all of this.
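
A rough sketch of that two-copy scheme in plain Java. Everything here is hypothetical: the "$value" suffix stands in for whatever name mangling javac would use, the static copy takes an ordinary reference where the real one would take a Q-typed 'this', and DualCopyResolver is a toy stand-in for the method handle resolver.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class Pt {
    final int x, y;
    Pt(int x, int y) { this.x = x; this.y = y; }

    // Standard instance method.
    int sum() { return x + y; }

    // Hypothetical second copy: static, mangled name, receiver as argument 0.
    static int sum$value(Pt self) { return self.x + self.y; }
}

// A class that has only the standard copy, to exercise the fallback.
final class OldPt {
    final int x, y;
    OldPt(int x, int y) { this.x = x; this.y = y; }
    int sum() { return x + y; }
}

class DualCopyResolver {
    static final String SUFFIX = "$value"; // assumed mangling, not a real convention
    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();

    // Prefer the static copy; fall back to the ordinary instance method.
    // Either way the result has the receiver as its first parameter.
    static MethodHandle resolve(Class<?> owner, String name, MethodType type) {
        try {
            return LOOKUP.findStatic(owner, name + SUFFIX, type.insertParameterTypes(0, owner));
        } catch (ReflectiveOperationException noStaticCopy) {
            try {
                return LOOKUP.findVirtual(owner, name, type);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Reports which member the resolver actually bound.
    static String resolvedName(Class<?> owner, String name, MethodType type) {
        return LOOKUP.revealDirect(resolve(owner, name, type)).getName();
    }

    static int invokeSum(Object receiver) {
        MethodHandle mh = resolve(receiver.getClass(), "sum", MethodType.methodType(int.class));
        try {
            return (int) mh.invoke(receiver);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Callers only ever ask for "sum"; which copy gets bound is invisible to them.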

> 3) The minimal feature set for basic operations -- field getters, default 
> value, withers, comparison, arrays -- is a class (e.g., ValueTypeSupport) 
> with bootstrap methods that can be called via invokedynamic. No need to touch 
> MethodHandles.Lookup, etc.

I don't think the cost of touching MH.Lookup is great, especially given that 
the MH runtime will have to be able to work with Q-types more or less 
pervasively.  I agree that all the extended lookup functionality could be 
placed on a new class (alongside findWither etc.), but I don't see any benefit 
to doing that.  Given that we are touching the MH runtime, it's better to put 
the new stuff in one place.  The new class will probably just be a wormhole 
back in to java.lang.invoke, to call non-public API points (which will 
eventually be public).
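
To make the "wormhole" point concrete, here is a hedged sketch of what a separate support class might look like. ValueTypeSupportSketch, its getField method and the Pair class are all invented for illustration; the bootstrap method does nothing but delegate to the existing Lookup.findGetter. Java source cannot emit invokedynamic, so the demo calls the bootstrap method directly.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Ordinary class standing in for a value type.
final class Pair {
    final int i;
    final int j;
    Pair(int i, int j) { this.i = i; this.j = j; }
}

class ValueTypeSupportSketch {
    // Has the standard bootstrap-method shape (Lookup, String, MethodType).
    // The call site type is (Owner)FieldType; the work is done by Lookup.
    static CallSite getField(MethodHandles.Lookup caller, String fieldName, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle getter =
                caller.findGetter(type.parameterType(0), fieldName, type.returnType());
        return new ConstantCallSite(getter);
    }

    // Stands in for an invokedynamic call site bound to getField.
    static int readI(Pair p) {
        try {
            CallSite site = getField(MethodHandles.lookup(), "i",
                    MethodType.methodType(int.class, Pair.class));
            return (int) site.dynamicInvoker().invokeExact(p);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```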

> More generally, why so much attention given to reflection? Sure, you need 
> class objects to represent all the JVM's types. But member lookup? Fields, 
> Methods, Constructors? These do not seem necessary.

Because method handles are where the functionality comes from; you need basic 
reflection in order to mention the method handles you want.  Bytecode spinning 
is not enough, since that would require us to invent a full bytecode set and 
implement it.  The MH runtime is more malleable than the JVM's interpreter, so 
we are starting with MHs.  Hence the need for the reflective lookups that produce them.

> If I squint, I can kind of see how the idea is that somebody might want to 
> write reflective code to operate on values, since they don't have language 
> support.

And they don't have bytecode support either.  The javac runtime (indy BSMs for 
vgetfield, etc.) will have to do some of this stuff too.

> But almost everything has to be boxed when using these libraries, which means 
> if you care about performance (which is why you're using this prototype), 
> you're going to be spinning bytecode to do your low-level operations.

Not completely.  The bytecode will use MHs or indy to do low-level stuff.

> If this is the use case, I think a better use of resources would be to 
> surface Q types in the language.

Yes, surface them, but don't require a full set of bytecodes to operate on 
them.  That's the slow way to do it.

> 4) I don't love hacking CONSTANT_Class to encode new types, but I can 
> probably live with it. My preference is to design it the "right" way -- 
> however we envision these ultimately being expressed -- rather than this 
> intermediate step in which everybody learns to interpret some new syntax, 
> only to turn around and deprecate that syntax a little while later. (I 
> realize it's probably easier to change string formats than it is to add a new 
> constant pool form.)

Yes, that's why we are starting this way.  CONSTANT_Class CP entries get 
overloaded; a bunch of other legacy API points get overloaded.  It's an 
expedient when the data flows in and out of the APIs can be augmented more 
easily than we can invent new API points.  But (as the shady-values document 
says several times) the final API is likely to be different, and in particular 
to make the new distinctions in more principled ways.

> I don't think it's necessary to support Q types as the receivers of 
> CONSTANT_Fieldrefs and CONSTANT_Methodrefs. The receiver can be a vanilla 
> CONSTANT_Class, and the client (in this case, the 'vgetfield' API point) can 
> figure out what to do with the resolved reference.

Yes, that's one way to go.  But representing Q-types as java.lang.Class objects 
will be a sunk cost, so passing the L/Q distinction through existing data flows 
(on "overloaded" API points) is a reasonable design pattern, for a prototype.

I also think (in this case) the Lookup API will, in the long term, look 
something like the current sketch; there won't be a separate 
Lookup.findValueGetter any more than there is a separate 
Lookup.findInterfaceVirtual.
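
The precedent is easy to check against the current API: Lookup.findVirtual accepts an interface as the reference class as readily as a concrete class. A small sketch follows; the class and method names are made up for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

class UniformLookupDemo {
    // Resolves size() against the given reference class, which may be
    // an interface (List) or a concrete class (ArrayList).
    static int sizeVia(Class<?> refc, List<String> list) {
        try {
            MethodHandle size = MethodHandles.lookup()
                    .findVirtual(refc, "size", MethodType.methodType(int.class));
            return (int) size.invoke(list);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        List<String> xs = new ArrayList<>(List.of("a", "b"));
        System.out.println(sizeVia(List.class, xs));      // interface as reference class
        System.out.println(sizeVia(ArrayList.class, xs)); // concrete class, same entry point
    }
}
```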

— John
