----- Original Message -----
> From: "Dan Heidinga" <heidi...@redhat.com>
> To: "daniel smith" <daniel.sm...@oracle.com>
> Cc: "valhalla-spec-experts" <valhalla-spec-experts@openjdk.java.net>
> Sent: Thursday, February 24, 2022 4:39:52 PM
> Subject: Re: Evolving instance creation

> Repeating what I said in the EG meeting:
> 
> * "new" carries the mental model of allocating space.  For identity
> objects, that's on the heap.  For values, that may just be stack space
> / registers.  But it indicates that some kind of allocation / demand
> for new storage has occurred.
> 
> * It's important that "new" returns a unique instance.  That invariant
> has existed since Java's inception and we should be careful about
> breaking it.  In the case of values, two identical values can't be
> differentiated so I think we're safe to say they are unique but
> indistinguishable as no user program can differentiate them.

Yes, it's more about == being different than "new" being different.

"new" always creates a new instance, but in the case of value types, == does
not let us see whether the instances are different or not.
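
To make the distinction concrete, here is a minimal sketch in today's Java
(no Valhalla features), where Point is an ordinary identity class. The names
NewIdentityDemo and Point are only illustrative. With an identity class, ==
can observe that two "new" expressions produced different instances; for a
value class, as discussed in this thread, == would compare state instead, so
that observation would no longer be possible.

```java
import java.util.Objects;

class NewIdentityDemo {
    // Hypothetical identity class with state-based equals/hashCode.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) {
                return false;
            }
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        // Identity comparison: two 'new' expressions, two distinct instances.
        System.out.println(p == q);      // false
        // State comparison: the fields are the same.
        System.out.println(p.equals(q)); // true
    }
}
```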

> 
> The rest of this is more of a language design question than a VM one.
> The `Foo()` (without a new) is a good starting point for a canonical
> factory model.  The challenge will be in expressing the difference
> between the factory method and the constructor as they need to be
> distinct items in the source (different invariants, different return
> values, etc.).
> 
> --Dan

Rémi

> 
> On Tue, Feb 22, 2022 at 4:17 PM Dan Smith <daniel.sm...@oracle.com> wrote:
>>
>> One of the longstanding properties of class instance creation expressions
>> ('new Foo()') is that the instance being produced is unique; that is, not
>> '==' to any previously-created instance.
>>
>> Value classes will disrupt this invariant, because it's possible to "create"
>> an instance of a value class that already exists:
>>
>> new Point(1, 2) == new Point(1, 2) // always true
>>
>> A related, possibly-overlapping new Java feature idea (not concretely
>> proposed, but something the language might want in the future) is the
>> declaration of canonical factory methods in a class, which intentionally
>> *don't* promise unique instances (for example, they might implement
>> interning). These factories would be like constructors in that they wouldn't
>> have a unique method name, but otherwise would behave like ad hoc static
>> factory methods: take some arguments, use them to create/locate an
>> appropriate instance, return it.
>>
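
For illustration, here is a sketch of the kind of interning factory described
above, written as an ordinary *named* static factory in today's Java, since
the unnamed factory syntax is only an idea at this point. Label, of, and CACHE
are hypothetical names, not part of any proposal.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class Label {
    // Interning table: at most one Label instance per distinct text.
    private static final Map<String, Label> CACHE = new ConcurrentHashMap<>();

    private final String text;

    // Private constructor: clients can only obtain instances via of().
    private Label(String text) {
        this.text = text;
    }

    // Factory that locates an existing instance or creates one on first use,
    // so it deliberately does not promise a unique instance per call.
    static Label of(String text) {
        return CACHE.computeIfAbsent(text, Label::new);
    }

    String text() {
        return text;
    }
}
```

Two calls such as Label.of("a") return the same instance, which is exactly
the behavior that 'new' on an identity class rules out.
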
>> I want to focus here on the usage of class instance creation expressions, and
>> how to approach changes to their semantics. This involves balancing the needs
>> of programmers who depend on the unique instance invariant with those who
>> don't care and would prefer fewer knobs/less complexity.
>>
>> Here are three approaches that I could imagine pursuing:
>>
>> (1) Value classes are a special case for 'new Foo()'
>>
>> This is the plan of record: the unique instance invariant continues to hold
>> for 'new Foo()' where Foo is an identity class, but if Foo is a value class,
>> you might get an existing instance.
>>
>> In bytecode, the translation of 'new Foo()' depends on the kind of class (as
>> determined at compile time). Identity class creation continues to be
>> implemented via 'new Foo; dup; invokespecial Foo.<init>()V'. Value class
>> creation occurs via 'invokestatic Foo.<newvalue>()LFoo;' (method name
>> bikeshedding tk). There is no compatibility between the two (e.g., if an
>> identity class becomes a value class).
>>
>> In a way, it shouldn't be surprising that a value class doesn't guarantee
>> unique instances, because uniqueness is closely tied to identity. So
>> special-casing 'new Foo()' isn't that different from special-casing
>> 'Object.equals': in the absence of identity, we'll do something reasonable,
>> but not quite the same.
>>
>> Factories don't enter into this story at all. If we end up having unnamed
>> factories in the future, they will be declared and invoked with a separate
>> syntax, and will be declarable both by identity classes and value classes.
>> (Value class factories don't seem particularly compelling, but they could,
>> say, be used to smooth migration, like 'Integer.valueOf'.)
>>
>> Biggest concerns: for now, it can be surprising that 'new' doesn't always
>> give you a unique instance. In a future with factories, navigating between
>> the 'new' syntax and the factory invocation syntax may be burdensome, with
>> style wars about which approach is better.
>>
>> (2) 'new Foo()' as a general-purpose creation tool
>>
>> In this approach, 'new Foo()' is the use-site syntax for *both* factory and
>> constructor invocation. Factories and constructors live in the same overload
>> resolution "namespace", and all will be considered by the use site.
>>
>> In bytecode, the preferred translation of 'new Foo()' is 'invokestatic
>> Foo.<new>()LFoo;'. Note that this is the case for both value classes *and
>> identity classes*. For compatibility, 'new/dup/<init>' also needs to be
>> supported for now; eventually, it might be deprecated. Refactoring between
>> constructors and factories is generally compatible.
>>
>> Because this re-interpretation of 'new Foo()' supports factories, there is no
>> unique instance invariant. At best, particular classes can document that they
>> produce unique instances, and clients who need this behavior should ensure
>> they're working with classes that promise it. (It's not as simple as looking
>> for a *current* factory, because constructors can be refactored to
>> factories.)
>>
>> For developers who don't care about unique instances, this is the simplest
>> approach: whenever you want an instance of Foo, you say 'new Foo()'.
>>
>> Biggest concerns: we've demoted an ironclad semantic guarantee to an optional
>> property of some classes. For those developers/use cases who care about the
>> unique instance invariant, that may be difficult, especially because we're
>> undoing a longstanding property rather than designing it this way from the
>> beginning.
>>
>> (3) 'new Foo()' for unique instances and just 'Foo()' otherwise
>>
>> Here, the 'new' keyword is reserved for cases in which a unique instance is
>> guaranteed. For value class creation, factory invocation, and constructor
>> invocation when unique instances don't matter, a bare 'Foo()' call is used
>> instead. 'new Point()' would be an error—this syntax doesn't work with value
>> classes.
>>
>> In bytecode, 'new Foo()' always compiles to 'new/dup/<init>', while plain
>> 'Foo()' typically compiles to 'invokestatic Foo.<make>()LFoo;' (method name
>> bikeshedding tk). For compatibility, plain 'Foo()' would support
>> 'new/dup/<init>' invocations as well, if that's all the class provides.
>> Refactoring between constructors and factories is generally compatible for
>> plain 'Foo()' use sites, but not 'new Foo()' use sites.
>>
>> The plain 'Foo()' would become the preferred style for general-purpose usage,
>> while 'new Foo()' would (eventually, after a long migration period) signal an
>> interest in the unique instance guarantee. Java code written with the updated
>> style is a little lighter on "ceremony".
>>
>> Biggest concerns: a somewhat arbitrary shift in coding style for all
>> programmers to learn, which at a minimum must be adopted when working with
>> value classes.
>>
>> ---
>>
>> What are your thoughts about the significance of the unique instance
>> invariant? Is it important enough to design instance creation syntax around
>> it? Does either (2) or (3) above sound like a better destination than the
>> plan of record?
