Re: Evolving instance creation

2022-03-01 Thread Dan Smith
On Mar 1, 2022, at 6:56 AM, Kevin Bourrillion <kev...@google.com> wrote:

The main thing I think CICEs/`new` accomplish is simply to "cross the bridge". 
Constructors are void and non-static; yet somehow we need to call them as if 
they're static and non-void! `new` gets us across that gap. This seems to me 
like a special-snowflake problem that `new` is custom built to address, and I 
would hope we keep it.

Okay. So support for 'new Point()' (1) over just 'Point()' (3) on the basis 
that constructor declarations need special magic to enter the context where the 
constructor body lives. So as long as we're declaring value class constructors 
in the same way as identity class constructors, it makes sense for both to have 
the same invocation syntax, and for that syntax to be somehow different from a 
method invocation.

I suppose (3) envisions this magic happening invisibly, as part of the 
instantiation API provided by the class—there's some magic under the covers 
where a bridge/factory-like entity gets invoked and sets up the context for the 
constructor body. But I agree that it's probably better not to have to appeal 
to something invisible when people are already used to the magic being explicit.

A couple more minor points about the factories idea:

A related, possibly-overlapping new Java feature idea (not concretely proposed, 
but something the language might want in the future) is the declaration of 
canonical factory methods in a class, which intentionally *don't* promise 
unique instances (for example, they might implement interning). These factories 
would be like constructors in that they wouldn't have a unique method name, but 
otherwise would behave like ad hoc static factory methods—take some arguments, 
use them to create/locate an appropriate instance, return it.
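
As a rough illustration of the interning case, here is how such a factory looks with today's named static methods. `Symbol`, `of`, and `CACHE` are hypothetical names, not part of any proposal; the unnamed canonical factory syntax discussed here does not exist in current Java.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class: a named static factory that interns its instances.
final class Symbol {
    private static final Map<String, Symbol> CACHE = new ConcurrentHashMap<>();

    private final String name;

    // The constructor stays private, so callers cannot bypass the cache.
    private Symbol(String name) {
        this.name = name;
    }

    // Creates a Symbol on first use and returns the existing one afterwards,
    // so this factory deliberately does not promise a unique instance per call.
    static Symbol of(String name) {
        return CACHE.computeIfAbsent(name, Symbol::new);
    }

    String name() {
        return name;
    }
}
```

Two calls to `Symbol.of("a")` return the same object, which is exactly the behavior a constructor invoked through `new` cannot offer for an identity class.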

Can you clarify what these offer that static methods don't already provide? The 
two weaknesses I'm aware of with static factory methods are (1) subclasses 
still need a constructor to call and (2) often you don't really want the burden 
of naming them, you just want them to look like the obvious standard creation 
path. It sounds like this addresses (2) but not (1), and I assume also 
addresses some (3).

A couple of things:

- If it's canonical, everybody knows where to find it. APIs like reflection and 
tools like serialization can create instances through a universally-recognized 
mechanism (but one that is more flexible than constructors).

- In a similar vein, if JVMS can count on instantiation being supported by a 
canonical method name, then this approach can subsume existing uses of 
'new/dup/<init>', which are a major source of complexity. This is a very long 
game, but the idea is that eventually the old mechanism (specifically, use of 
the 'new' bytecode outside of the class being instantiated) could be deprecated.
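
To make the first point concrete, here is a small sketch of how a reflective tool creates instances today. `Celsius`, `of`, and `ReflectiveCreation` are hypothetical names chosen for illustration. The constructor can be located from the parameter types alone, while the static factory can only be located if the tool already knows its name.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

// Hypothetical class for illustration only.
final class Celsius {
    final double degrees;

    Celsius(double degrees) {
        this.degrees = degrees;
    }

    // An ad hoc static factory: its name is a convention private to this class.
    static Celsius of(double degrees) {
        return new Celsius(degrees);
    }
}

class ReflectiveCreation {
    // Constructors are discoverable by parameter types alone.
    static Celsius viaConstructor(double degrees) {
        try {
            Constructor<Celsius> ctor = Celsius.class.getDeclaredConstructor(double.class);
            return ctor.newInstance(degrees);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // A static factory can only be found if the caller supplies its name.
    static Celsius viaNamedFactory(String factoryName, double degrees) {
        try {
            Method factory = Celsius.class.getDeclaredMethod(factoryName, double.class);
            return (Celsius) factory.invoke(null, degrees);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A canonical factory would, in principle, give tools the first kind of lookup with the flexibility of the second.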

(2) 'new Foo()' as a general-purpose creation tool

In this approach, 'new Foo()' is the use-site syntax for *both* factory and 
constructor invocation. Factories and constructors live in the same overload 
resolution "namespace", and all will be considered by the use site.

It sounds to me like these factories would be static, so `new` would not be 
required by the "cross the bridge" interpretation given above.

Right. This approach gives up the use-site/declaration-site alignment, instead 
interpreting 'new' as "make me one of these, using whatever mechanism the class 
provides".
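
An existing API already behaves this way: 'Integer.valueOf', which the original proposal cites as a migration aid, may hand back an instance that already exists. The sketch below relies only on the documented guarantee that values from -128 to 127 are cached; `ValueOfDemo` is a hypothetical name.

```java
class ValueOfDemo {
    // Integer.valueOf is specified to cache values in the range -128..127,
    // so both calls return the same object and '==' reports true.
    static boolean smallValuesShareIdentity() {
        Integer a = Integer.valueOf(127);
        Integer b = Integer.valueOf(127);
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(smallValuesShareIdentity());
    }
}
```

Outside the cached range the result of '==' is not specified, which is the kind of "whatever mechanism the class provides" behavior that approach (2) would extend to 'new'.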


Re: Evolving instance creation

2022-03-01 Thread Kevin Bourrillion
Seems like this decision is trending in the direction I'd prefer already,
but here's some argumentation that *might* be helpful from the
programming-model perspective.


On Tue, Feb 22, 2022 at 1:17 PM Dan Smith wrote:

> One of the longstanding properties of class instance creation expressions
> ('new Foo()') is that the instance being produced is unique—that is, not
> '==' to any previously-created instance.
>

I'll argue that this is an incidental association only.

Note that `new` has simply never been *needed* for identityless types
before; for those (all 8 of them), literals and binary expressions and
things had us covered. So `new` has so far seemed associated with
identityful types. But I think the expectation quoted here clearly comes
from the identity-type-ness, not from the `new` keyword. If we use `new`
with identityless objects or values, a distinct-identity expectation simply
doesn't apply.

Plus as Remi says, changing to that type changes `==` into a different
operator with the same name. So I think that this:


> new Point(1, 2) == new Point(1, 2) // always true
>

is entirely *un*problematic!
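
Value classes are not available in released Java, so the expression above cannot be run as written. For contrast, here is a minimal sketch of what the same comparison does today with an identity class. `Point` and `IdentityDemo` are hypothetical names.

```java
import java.util.Objects;

// Hypothetical identity class; in current Java every class instance has identity.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Point
                && ((Point) other).x == x
                && ((Point) other).y == y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class IdentityDemo {
    // '==' on identity objects compares identity: two 'new' expressions never match.
    static boolean sameIdentity() {
        return new Point(1, 2) == new Point(1, 2);
    }

    // equals() compares state, which is all a value class would have.
    static boolean sameState() {
        return new Point(1, 2).equals(new Point(1, 2));
    }
}
```

Here `sameIdentity()` is false and `sameState()` is true. For a value class, as described in this thread, `==` would behave like the state comparison, which is why the uniqueness expectation stops applying.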

Dan H. says, "`new` carries the mental model of allocating space" -- again,
I think it's incidental. Because the *point* of introducing identityless
types is that the distinction between creating and reusing (summoning from
the ether somehow) vanishes. We shouldn't be able to distinguish those
cases.

As Dan S. says later,

> I'd rather have programmers think in these terms: when you instantiate a
> value class, you might get an object that already exists. Whether there are
> copies of that object at different memory locations or not is
> irrelevant—it's still *the same object*.


But my reaction is: then to the programming model it *might as well just
look like creation*. It can't really "look like reusing" without forcing
the question of "reusing what from where?". It can only just look
*different*. But I don't think it needs to. (I think this is what Dan H
ends up supporting too.)

The main thing I think CICEs/`new` accomplish is simply to "cross the
bridge". Constructors are void and non-static; yet somehow we need to
*call* them as if they're static and non-void! `new` gets us across that
gap. This seems to me like a special-snowflake problem that `new` is custom
built to address, and I would hope we keep it.

Seen this way, it's essential that this bridge-crossing happens *somewhere*,
but it doesn't necessarily mean constructors need to be spiffy public API.
It could be a dirty secret we hide within our static factory methods. And
we often do this, because public constructors *aren't* spiffy; they can't
have names, relying purely on argument types and order to disambiguate, and
they weirdly promise the caller never to return a subtype even though a
caller should never even care about that (because substitutability
principle).
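
A minimal sketch of that pattern, with hypothetical names (`Shape`, `square`, `rect`, `Rect`): the constructor is private, the factories have names that disambiguate what argument types alone could not, and they are free to return a subtype the caller never sees.

```java
// Hypothetical API: the only public creation paths are named static factories.
abstract class Shape {
    // The "bridge crossing" via 'new' happens only inside this class.
    private Shape() {}

    abstract double area();

    // Names carry the meaning that a (double) vs. (double, double) overload would not.
    static Shape square(double side) {
        return new Rect(side, side);
    }

    static Shape rect(double width, double height) {
        return new Rect(width, height);
    }

    // Callers only ever see Shape; the concrete subtype is an implementation detail.
    private static final class Rect extends Shape {
        private final double width;
        private final double height;

        Rect(double width, double height) {
            this.width = width;
            this.height = height;
        }

        @Override
        double area() {
            return width * height;
        }
    }
}
```

Callers write `Shape.square(2)` and never touch `new`, while subclasses outside `Shape` cannot exist at all, which is weakness (1) of static factories raised later in this message.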


> Here are three approaches that I could imagine pursuing:
>
> (1) Value classes are a special case for 'new Foo()'
>
> This is the plan of record: the unique instance invariant continues to
> hold for 'new Foo()' where Foo is an identity class, but if Foo is a value
> class, you might get an existing instance.
>

(I've argued above it's not even a special case.)


> Biggest concerns: for now, it can be surprising that 'new' doesn't always
> give you a unique instance.


This is the best kind of surprise! Because grappling with it points you
directly toward understanding what identityless classes are all about.


A couple more minor points about the factories idea:


> A related, possibly-overlapping new Java feature idea (not concretely
> proposed, but something the language might want in the future) is the
> declaration of canonical factory methods in a class, which intentionally
> *don't* promise unique instances (for example, they might implement
> interning). These factories would be like constructors in that they
> wouldn't have a unique method name, but otherwise would behave like ad hoc
> static factory methods—take some arguments, use them to create/locate an
> appropriate instance, return it.
>

Can you clarify what these offer that static methods don't already provide?
The two weaknesses I'm aware of with static factory methods are (1)
subclasses still need a constructor to call and (2) often you don't really
want the burden of naming them, you just want them to look like the obvious
standard creation path. It sounds like this addresses (2) but not (1), and
I assume also addresses some (3).



> (2) 'new Foo()' as a general-purpose creation tool
>
> In this approach, 'new Foo()' is the use-site syntax for *both* factory
> and constructor invocation. Factories and constructors live in the same
> overload resolution "namespace", and all will be considered by the use site.
>

It sounds to me like these factories would be static, so `new` would not be
required by the "cross the bridge" interpretation given above.


On Thu, Feb 24, 2022 at 7:40 AM Dan Heidinga  

Re: Evolving instance creation

2022-02-24 Thread Brian Goetz
I find DanH's way of presenting it more natural (and makes perfect sense 
now that it's been said that way): it *is* allocating something, just not 
in the heap.  It is requesting new storage for a new object, which might 
be in the heap, or the stack, or registers. And we might find that new 
object to be == to an old object, but we're still requesting that space 
for a new object be allocated.



"new" always creates a new instance but in case of value types, == does not 
allow us to see if the instances are different or not.

I'm not sure this is a good way to think about value creation, though. It suggests 
that there still *is* an identity there (i.e., the new value has been newly 
allocated), you just can't see it.

I'd rather have programmers think in these terms: when you instantiate a value 
class, you might get an object that already exists. Whether there are copies of 
that object at different memory locations or not is irrelevant—it's still *the 
same object*.




Re: Evolving instance creation

2022-02-24 Thread Dan Smith
> On Feb 24, 2022, at 8:47 AM, Remi Forax  wrote:
> 
> - Original Message -
>> From: "Dan Heidinga" 
>> To: "daniel smith" 
>> Cc: "valhalla-spec-experts" 
>> Sent: Thursday, February 24, 2022 4:39:52 PM
>> Subject: Re: Evolving instance creation
> 
>> Repeating what I said in the EG meeting:
>> 
>> * "new" carries the mental model of allocating space.  For identity
>> objects, that's on the heap.  For values, that may just be stack space
>> / registers.  But it indicates that some kind of allocation / demand
>> for new storage has occurred.
>> 
>> * It's important that "new" returns a unique instance.  That invariant
>> has existed since Java's inception and we should be careful about
>> breaking it.  In the case of values, two identical values can't be
>> differentiated so I think we're safe to say they are unique but
>> indistinguishable as no user program can differentiate them.
> 
> Yes, it's more about == being different than "new" being different.
> 
> "new" always creates a new instance but in case of value types, == does not 
> allow us to see if the instances are different or not.

I'm not sure this is a good way to think about value creation, though. It suggests 
that there still *is* an identity there (i.e., the new value has been newly 
allocated), you just can't see it.

I'd rather have programmers think in these terms: when you instantiate a value 
class, you might get an object that already exists. Whether there are copies of 
that object at different memory locations or not is irrelevant—it's still *the 
same object*.

Re: Evolving instance creation

2022-02-24 Thread Remi Forax
- Original Message -
> From: "Dan Heidinga" 
> To: "daniel smith" 
> Cc: "valhalla-spec-experts" 
> Sent: Thursday, February 24, 2022 4:39:52 PM
> Subject: Re: Evolving instance creation

> Repeating what I said in the EG meeting:
> 
> * "new" carries the mental model of allocating space.  For identity
> objects, that's on the heap.  For values, that may just be stack space
> / registers.  But it indicates that some kind of allocation / demand
> for new storage has occurred.
> 
> * It's important that "new" returns a unique instance.  That invariant
> has existed since Java's inception and we should be careful about
> breaking it.  In the case of values, two identical values can't be
> differentiated so I think we're safe to say they are unique but
> indistinguishable as no user program can differentiate them.

Yes, it's more about == being different than "new" being different.

"new" always creates a new instance but in case of value types, == does not 
allow us to see if the instances are different or not.

> 
> The rest of this is more of a language design question than a VM one.
> The `Foo()` (without a new) is a good starting point for a canonical
> factory model.  The challenge will be in expressing the difference
> between the factory method and the constructor as they need to be
> distinct items in the source (different invariants, different return
> values, etc)
> 
> --Dan

Rémi

> 
> On Tue, Feb 22, 2022 at 4:17 PM Dan Smith  wrote:
>>
>> One of the longstanding properties of class instance creation expressions
>> ('new Foo()') is that the instance being produced is unique—that is, not
>> '==' to any previously-created instance.
>>
>> Value classes will disrupt this invariant, because it's possible to "create"
>> an instance of a value class that already exists:
>>
>> new Point(1, 2) == new Point(1, 2) // always true
>>
>> A related, possibly-overlapping new Java feature idea (not concretely
>> proposed, but something the language might want in the future) is the
>> declaration of canonical factory methods in a class, which intentionally
>> *don't* promise unique instances (for example, they might implement
>> interning). These factories would be like constructors in that they wouldn't
>> have a unique method name, but otherwise would behave like ad hoc static
>> factory methods—take some arguments, use them to create/locate an
>> appropriate instance, return it.
>>
>> I want to focus here on the usage of class instance creation expressions, and
>> how to approach changes to their semantics. This involves balancing the needs
>> of programmers who depend on the unique instance invariant with those who
>> don't care and would prefer fewer knobs/less complexity.
>>
>> Here are three approaches that I could imagine pursuing:
>>
>> (1) Value classes are a special case for 'new Foo()'
>>
>> This is the plan of record: the unique instance invariant continues to hold
>> for 'new Foo()' where Foo is an identity class, but if Foo is a value class,
>> you might get an existing instance.
>>
>> In bytecode, the translation of 'new Foo()' depends on the kind of class (as
>> determined at compile time). Identity class creation continues to be
>> implemented via 'new Foo; dup; invokespecial Foo.<init>()V'. Value class
>> creation occurs via 'invokestatic Foo.()LFoo;' (method name
>> bikeshedding tk). There is no compatibility between the two (e.g., if an
>> identity class becomes a value class).
>>
>> In a way, it shouldn't be surprising that a value class doesn't guarantee
>> unique instances, because uniqueness is closely tied to identity. So
>> special-casing 'new Foo()' isn't that different from special-casing
>> 'Object.equals'—in the absence of identity, we'll do something reasonable,
>> but not quite the same.
>>
>> Factories don't enter into this story at all. If we end up having unnamed
>> factories in the future, they will be declared and invoked with a separate
>> syntax, and will be declarable both by identity classes and value classes.
>> (Value class factories don't seem particularly compelling, but they could,
>> say, be used to smooth migration, like 'Integer.valueOf'.)
>>
>> Biggest concerns: for now, it can be surprising that 'new' doesn't always
>> give you a unique instance. In a future with factories, navigating between
>> the 'new' syntax and the factory invocation syntax may be burdensome

Re: Evolving instance creation

2022-02-24 Thread Dan Heidinga
Repeating what I said in the EG meeting:

* "new" carries the mental model of allocating space.  For identity
objects, that's on the heap.  For values, that may just be stack space
/ registers.  But it indicates that some kind of allocation / demand
for new storage has occurred.

* It's important that "new" returns a unique instance.  That invariant
has existed since Java's inception and we should be careful about
breaking it.  In the case of values, two identical values can't be
differentiated so I think we're safe to say they are unique but
indistinguishable as no user program can differentiate them.

The rest of this is more of a language design question than a VM one.
The `Foo()` (without a new) is a good starting point for a canonical
factory model.  The challenge will be in expressing the difference
between the factory method and the constructor as they need to be
distinct items in the source (different invariants, different return
values, etc)

--Dan

On Tue, Feb 22, 2022 at 4:17 PM Dan Smith  wrote:
>
> One of the longstanding properties of class instance creation expressions 
> ('new Foo()') is that the instance being produced is unique—that is, not '==' 
> to any previously-created instance.
>
> Value classes will disrupt this invariant, because it's possible to "create" 
> an instance of a value class that already exists:
>
> new Point(1, 2) == new Point(1, 2) // always true
>
> A related, possibly-overlapping new Java feature idea (not concretely 
> proposed, but something the language might want in the future) is the 
> declaration of canonical factory methods in a class, which intentionally 
> *don't* promise unique instances (for example, they might implement 
> interning). These factories would be like constructors in that they wouldn't 
> have a unique method name, but otherwise would behave like ad hoc static 
> factory methods—take some arguments, use them to create/locate an appropriate 
> instance, return it.
>
> I want to focus here on the usage of class instance creation expressions, and 
> how to approach changes to their semantics. This involves balancing the needs 
> of programmers who depend on the unique instance invariant with those who 
> don't care and would prefer fewer knobs/less complexity.
>
> Here are three approaches that I could imagine pursuing:
>
> (1) Value classes are a special case for 'new Foo()'
>
> This is the plan of record: the unique instance invariant continues to hold 
> for 'new Foo()' where Foo is an identity class, but if Foo is a value class, 
> you might get an existing instance.
>
> In bytecode, the translation of 'new Foo()' depends on the kind of class (as 
> determined at compile time). Identity class creation continues to be 
> implemented via 'new Foo; dup; invokespecial Foo.<init>()V'. Value class 
> creation occurs via 'invokestatic Foo.()LFoo;' (method name 
> bikeshedding tk). There is no compatibility between the two (e.g., if an 
> identity class becomes a value class).
>
> In a way, it shouldn't be surprising that a value class doesn't guarantee 
> unique instances, because uniqueness is closely tied to identity. So 
> special-casing 'new Foo()' isn't that different from special-casing 
> 'Object.equals'—in the absence of identity, we'll do something reasonable, but 
> not quite the same.
>
> Factories don't enter into this story at all. If we end up having unnamed 
> factories in the future, they will be declared and invoked with a separate 
> syntax, and will be declarable both by identity classes and value classes. 
> (Value class factories don't seem particularly compelling, but they could, 
> say, be used to smooth migration, like 'Integer.valueOf'.)
>
> Biggest concerns: for now, it can be surprising that 'new' doesn't always 
> give you a unique instance. In a future with factories, navigating between 
> the 'new' syntax and the factory invocation syntax may be burdensome, with 
> style wars about which approach is better.
>
> (2) 'new Foo()' as a general-purpose creation tool
>
> In this approach, 'new Foo()' is the use-site syntax for *both* factory and 
> constructor invocation. Factories and constructors live in the same overload 
> resolution "namespace", and all will be considered by the use site.
>
> In bytecode, the preferred translation of 'new Foo()' is 'invokestatic 
> Foo.()LFoo;'. Note that this is the case for both value classes *and 
> identity classes*. For compatibility, 'new/dup/<init>' also needs to be 
> supported for now; eventually, it might be deprecated. Refactoring between 
> constructors and factories is generally compatible.
>
> Because this re-interpretation of 'new Foo()' supports factories, there is no 
> unique instance invariant. At best, particular classes can document that they 
> produce unique instances, and clients who need this behavior should ensure 
> they're working with classes that promise it. (It's not as simple as looking 
> for a *current* factory, because constructors can be refactored to