Re: Evolving instance creation

2022-02-24 Thread Brian Goetz
I find DanH's way of presenting it more natural (and makes perfect sense 
now that it's been said that way): it *is* allocating something, just not 
in the heap.  It is requesting new storage for a new object, which might 
be in the heap, or the stack, or registers. And we might find that new 
object to be == to an old object, but we're still requesting that space 
for a new object be allocated.



>> "new" always creates a new instance but in case of value types, == does not 
>> allow us to see whether the instances are different or not.
>
> I'm not sure this is a good way to think about value creation, though. It suggests 
> that there still *is* an identity there (i.e., the new value has been newly 
> allocated), you just can't see it.
>
> I'd rather have programmers think in these terms: when you instantiate a value 
> class, you might get an object that already exists. Whether there are copies of 
> that object at different memory locations or not is irrelevant—it's still *the 
> same object*.




Re: Evolving instance creation

2022-02-24 Thread Dan Smith
> On Feb 24, 2022, at 8:47 AM, Remi Forax wrote:
> 
> - Original Message -
>> From: "Dan Heidinga" 
>> To: "daniel smith" 
>> Cc: "valhalla-spec-experts" 
>> Sent: Thursday, February 24, 2022 4:39:52 PM
>> Subject: Re: Evolving instance creation
> 
>> Repeating what I said in the EG meeting:
>> 
>> * "new" carries the mental model of allocating space.  For identity
>> objects, that's on the heap.  For values, that may just be stack space
>> / registers.  But it indicates that some kind of allocation / demand
>> for new storage has occurred.
>> 
>> * It's important that "new" returns a unique instance.  That invariant
>> has existed since Java's inception and we should be careful about
>> breaking it.  In the case of values, two identical values can't be
>> differentiated so I think we're safe to say they are unique but
>> indistinguishable as no user program can differentiate them.
> 
> Yes, it's more about == being different than "new" being different.
> 
> "new" always creates a new instance but in case of value types, == does not 
> allow us to see whether the instances are different or not.

I'm not sure this is a good way to think about value creation, though. It suggests 
that there still *is* an identity there (i.e., the new value has been newly 
allocated), you just can't see it.

I'd rather have programmers think in these terms: when you instantiate a value 
class, you might get an object that already exists. Whether there are copies of 
that object at different memory locations or not is irrelevant—it's still *the 
same object*.
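
To make the distinction concrete, here is a minimal sketch in today's Java, with a record standing in for a value class. The Point record and the SameObjectDemo class are hypothetical names chosen for illustration. Value classes are not part of current Java releases, so the code can only show the identity-based behavior that exists today; the comments note where a value class would be expected to differ.

```java
// Hypothetical stand-in: a record is still an identity class in current Java.
record Point(int x, int y) {}

class SameObjectDemo {
    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);

        // State comparison: the two points are interchangeable.
        System.out.println(a.equals(b)); // true

        // Identity comparison: today 'new' yields two distinct objects.
        // Under the value class model discussed in this thread, '==' would
        // compare state, and both expressions would denote the same object.
        System.out.println(a == b); // false for an identity class
    }
}
```

In the "same object" reading, the second comparison is not hiding an identity that exists somewhere out of view; for a value class there would simply be nothing besides the state to compare.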

Re: Abstract class with fields implementing ValueObject

2022-02-24 Thread Dan Smith
TLDR: I'm convinced, let's revise our approach so that the JVM never infers 
interfaces for abstract classes.

On Feb 24, 2022, at 8:57 AM, Dan Heidinga <heidi...@redhat.com> wrote:

> Whether
> they can be instantiated is a decision better left to other parts of
> the spec (in this case, I believe verification will succeed and
> resolution of the `super()`  call will fail).

Right, my mistake. Verifier doesn't care what methods are declared in the 
superclass, but resolution of the invokespecial will fail.

>> (3) no ACC_PERMITS_VALUE, <init> declaration
>>
>> The JVM infers that this class implements IdentityObject, if it doesn't 
>> already. If it also implements ValueObject, an error occurs at class load time.

> I think this should be driven purely by the presence of the
> ACC_PERMITS_VALUE flag and the VM shouldn't be looking at the <init>
> methods.

Sounds like the consensus, agreed.

> The JVM shouldn't infer either IdentityObject or ValueObject
> for this abstract class - any inference decision should be delayed to
> the subclasses that extend this abstract class.

My initial reaction was that, no, we really do want IdentityObject here, 
because it's useful to be able to assign an abstract class type to 
IdentityObject.

But: for new classes, the compiler will have an opportunity to be explicit. 
It's mostly a question of how we handle legacy classes. And there, it would 
actually be bad to infer IdentityObject, when in most cases the class will get 
permits_value when it is recompiled. Probably best to avoid a scenario like:

- Compile against legacy API, assign library.AbstractBanana to IdentityObject 
in your code
- Upgrade to newer version of the API, assignment from library.AbstractBanana 
to IdentityObject is an error

So, okay, let's say we limit JVM inference to concrete classes. And javac will 
infer/generate 'implements IdentityObject' if it decides an abstract class 
can't be permits_value.

What about separate compilation? javac's behavior might be something like: 1) 
look for fields, 'synchronized', etc. in the class declaration, and if any are 
present, add 'implements IdentityObject' (if it's not already there); 2) if the 
superclass is permits_value and this class doesn't extend IdentityObject 
(directly or indirectly), set permits_value. (1) is a local decision, while (2) 
depends on multiple classes, so can be disrupted by separate compilation. But, 
thinking through the scenarios here... I'm pretty comfortable saying that an 
abstract class that is neither permits_value nor a subclass of IdentityObject 
is in an unstable state, and, like the legacy case, it's probably better if 
programmers *don't* write code assuming they can assign to IdentityObject.
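
The two javac steps described above can be sketched as a small decision procedure. Everything in this sketch is an assumption made for illustration: the ClassDecl and Inference records, their fields, and the infer method are hypothetical and do not correspond to javac internals.

```java
// Hypothetical model of what the compiler knows about one abstract class.
record ClassDecl(
        boolean hasInstanceFields,
        boolean usesSynchronized,          // e.g. declares synchronized methods
        boolean implementsIdentityObject,  // directly or indirectly
        boolean superPermitsValue) {}      // superclass has permits_value set

// Hypothetical result: what the compiler would add to the class file.
record Inference(boolean addIdentityObject, boolean setPermitsValue) {}

class AbstractClassInference {
    static Inference infer(ClassDecl c) {
        // Step 1 (local decision): identity-only features force
        // 'implements IdentityObject' unless it is already there.
        boolean needsIdentity = c.hasInstanceFields() || c.usesSynchronized();
        boolean addIdentityObject = needsIdentity && !c.implementsIdentityObject();

        // Step 2 (depends on the superclass): permits_value is set only if the
        // superclass permits it and this class is not an IdentityObject.
        boolean isIdentity = needsIdentity || c.implementsIdentityObject();
        boolean setPermitsValue = c.superPermitsValue() && !isIdentity;

        return new Inference(addIdentityObject, setPermitsValue);
    }

    public static void main(String[] args) {
        // Has fields: becomes an IdentityObject, never permits_value.
        System.out.println(infer(new ClassDecl(true, false, false, true)));
        // No identity features, superclass permits value: gets permits_value.
        System.out.println(infer(new ClassDecl(false, false, false, true)));
        // Neither: the "unstable state" discussed above.
        System.out.println(infer(new ClassDecl(false, false, false, false)));
    }
}
```

The third case is the one that separate compilation can produce: nothing is inferred in either direction, which is why code assigning such a class to IdentityObject would be fragile.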



Re: Abstract class with fields implementing ValueObject

2022-02-24 Thread Dan Heidinga
On Wed, Feb 23, 2022 at 1:36 PM Dan Smith wrote:
>
> Fred suggested that we enumerate the whole space here. So, some cases to 
> consider:
>
> { ACC_PERMITS_VALUE, not }
> { has an <init> declaration, not }
> { implements IdentityObject, not }
> { implements ValueObject, not }
>
> "implements" here refers to both direct and indirect superinterfaces.
>
> I'll focus on the first two, which affect the inference of superinterfaces.
>
> (1) ACC_PERMITS_VALUE, <init> declaration
>
> This is a class that is able to support both identity and value subclasses. 
> It implements no extra interfaces, but can restrict its subclasses via 
> 'implements IdentityObject' or 'implements ValueObject'.
>
> (2) ACC_PERMITS_VALUE, no <init> declaration
>
> The JVM infers that this class implements ValueObject, if it doesn't already. 
> If it also implements IdentityObject, an error occurs at class load time.
>
> (Design alternative: we could ignore the <init> declarations and treat this 
> like case (1). In that case, the class could implement IdentityObject or be 
> extended by identity classes without error (as long as it doesn't also 
> implement ValueObject). But those identity subclasses couldn't declare 
> verification-compatible <init> methods, just like subclasses of abstract 
> classes that have no <init> methods today.)

I think ignoring the <init> declarations is the model we want here.
Both (1) and (2) should be treated the same by the VM - in either case
the subclasses can implement IdentityObject or ValueObject.  Whether
they can be instantiated is a decision better left to other parts of
the spec (in this case, I believe verification will succeed and
resolution of the `super()`  call will fail).

>
> (3) no ACC_PERMITS_VALUE, <init> declaration
>
> The JVM infers that this class implements IdentityObject, if it doesn't 
> already. If it also implements ValueObject, an error occurs at class load 
> time.

I think this should be driven purely by the presence of the
ACC_PERMITS_VALUE flag and the VM shouldn't be looking at the <init>
methods.  The JVM shouldn't infer either IdentityObject or ValueObject
for this abstract class - any inference decision should be delayed to
the subclasses that extend this abstract class.

An abstract class that doesn't have the ACC_PERMITS_VALUE flag binds
tightly to the IdentityObject interface.  The presence of the
ACC_PERMITS_VALUE flag delays the interface binding until we hit
either a concrete class or an abstract class without the flag.

>
> (4) no ACC_PERMITS_VALUE, no <init> declaration
>
> This is a class that effectively supports *no* subclasses. We don't infer any 
> superinterfaces, but it can choose to implement IdentityObject or 
> ValueObject. A value class that extends this class will fail to load.

See above.  No ACC_PERMITS_VALUE flag means it binds tightly to
IdentityObject and can only be subclassed by IdentityObject-compatible
classes.  All value classes extending this abstract will fail to load.

> If the class doesn't implement ValueObject, an identity class that extends 
> this class could load, but couldn't declare verification-compatible <init> 
> methods, just like subclasses of abstract classes that have no <init> methods 
> today.

The class extending this will load but cannot be instantiated.
Verification succeeds, but any super() calls in the
constructor will fail to resolve.

>
> (Design alternative: we could ignore the <init> declarations and treat this 
> like case (3). In that case, it would be an error for the class to implement 
> ValueObject, because it also implicitly implements IdentityObject.)

+1.

>
> ---
>
> Spelling this out makes me feel like treating the presence of <init> methods 
> as an inference signal may be overreaching and overcomplicating things. 
> Today, declaring an <init> method, or not, has no direct impact on anything, 
> other than the side-effect that you can't write verification-compatible 
> <init> methods in your subclasses. I like the parallel between "permits 
> identity (via <init>)" and "permits value (via flag)", but flags and <init> 
> methods aren't really parallel constructs; in cases (2) and (4), we still 
> "permit" identity subclasses, even if they're pretty useless.
>
> (And it doesn't help that javac doesn't give you any way to create these 
> <init>-free classes, so in practice they certainly don't have parallel 
> prevalence.)
>
> Pursuing the "design alternative" strategies would essentially collapse this 
> down to two cases: (1) ACC_PERMITS_VALUE, no superinterfaces inferred, but 
> various checks performed (e.g., no instance fields); and (3) no 
> ACC_PERMITS_VALUE, IdentityObject is inferred, error if there's also an 
> explicit 'implements ValueObject'.
>
> How do we feel about that?
>

I think this design alternative is the right strategy and more in line
with existing conventions for how the VM handles classes.
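
For reference, the collapsed two-case rule from the quoted text can be sketched as a load-time check. This is only an illustration of the rule as stated above; the AbstractClassLoadCheck class, the Marker enum, and the effectiveMarkers method are hypothetical names and are not JVM or HotSpot API.

```java
import java.util.EnumSet;
import java.util.Set;

class AbstractClassLoadCheck {
    // Hypothetical markers standing in for the two superinterfaces.
    enum Marker { IDENTITY_OBJECT, VALUE_OBJECT }

    // Returns the markers an abstract class ends up with, or throws to
    // model an error at class load time.
    static Set<Marker> effectiveMarkers(boolean permitsValue,
                                        boolean hasInstanceFields,
                                        Set<Marker> declared) {
        EnumSet<Marker> result = EnumSet.noneOf(Marker.class);
        result.addAll(declared);
        if (permitsValue) {
            // Case (1): nothing is inferred, but the flag's checks apply.
            if (hasInstanceFields) {
                throw new IllegalStateException(
                        "ACC_PERMITS_VALUE class declares instance fields");
            }
        } else {
            // Case (3): IdentityObject is inferred; ValueObject is an error.
            if (result.contains(Marker.VALUE_OBJECT)) {
                throw new IllegalStateException(
                        "class without ACC_PERMITS_VALUE implements ValueObject");
            }
            result.add(Marker.IDENTITY_OBJECT);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMarkers(true, false, Set.of()));  // []
        System.out.println(effectiveMarkers(false, true, Set.of()));  // [IDENTITY_OBJECT]
    }
}
```

Note that the presence or absence of <init> methods does not appear anywhere in the check, which is the point of the design alternative.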

--Dan



Re: Evolving instance creation

2022-02-24 Thread Remi Forax
- Original Message -
> From: "Dan Heidinga" 
> To: "daniel smith" 
> Cc: "valhalla-spec-experts" 
> Sent: Thursday, February 24, 2022 4:39:52 PM
> Subject: Re: Evolving instance creation

> Repeating what I said in the EG meeting:
> 
> * "new" carries the mental model of allocating space.  For identity
> objects, that's on the heap.  For values, that may just be stack space
> / registers.  But it indicates that some kind of allocation / demand
> for new storage has occurred.
> 
> * It's important that "new" returns a unique instance.  That invariant
> has existed since Java's inception and we should be careful about
> breaking it.  In the case of values, two identical values can't be
> differentiated so I think we're safe to say they are unique but
> indistinguishable as no user program can differentiate them.

Yes, it's more about == being different than "new" being different.

"new" always creates a new instance but in case of value types, == does not 
allow us to see whether the instances are different or not.

> 
> The rest of this is more of a language design question than a VM one.
> The `Foo()` (without a new) is a good starting point for a canonical
> factory model.  The challenge will be in expressing the difference
> between the factory method and the constructor as they need to be
> distinct items in the source (different invariants, different return
> values, etc)
> 
> --Dan

Rémi

> 
> On Tue, Feb 22, 2022 at 4:17 PM Dan Smith wrote:
>>
>> One of the longstanding properties of class instance creation expressions 
>> ('new
>> Foo()') is that the instance being produced is unique—that is, not '==' to 
>> any
>> previously-created instance.
>>
>> Value classes will disrupt this invariant, because it's possible to "create" 
>> an
>> instance of a value class that already exists:
>>
>> new Point(1, 2) == new Point(1, 2) // always true
>>
>> A related, possibly-overlapping new Java feature idea (not concretely 
>> proposed,
>> but something the language might want in the future) is the declaration of
>> canonical factory methods in a class, which intentionally *don't* promise
>> unique instances (for example, they might implement interning). These 
>> factories
>> would be like constructors in that they wouldn't have a unique method name, 
>> but
>> otherwise would behave like ad hoc static factory methods—take some 
>> arguments,
>> use them to create/locate an appropriate instance, return it.
>>
>> I want to focus here on the usage of class instance creation expressions, and
>> how to approach changes to their semantics. This involves balancing the needs
>> of programmers who depend on the unique instance invariant with those who 
>> don't
>> care and would prefer fewer knobs/less complexity.
>>
>> Here are three approaches that I could imagine pursuing:
>>
>> (1) Value classes are a special case for 'new Foo()'
>>
>> This is the plan of record: the unique instance invariant continues to hold 
>> for
>> 'new Foo()' where Foo is an identity class, but if Foo is a value class, you
>> might get an existing instance.
>>
>> In bytecode, the translation of 'new Foo()' depends on the kind of class (as
>> determined at compile time). Identity class creation continues to be
>> implemented via 'new Foo; dup; invokespecial Foo.<init>()V'. Value class
>> creation occurs via 'invokestatic Foo.()LFoo;' (method name
>> bikeshedding tk). There is no compatibility between the two (e.g., if an
>> identity class becomes a value class).
>>
>> In a way, it shouldn't be surprising that a value class doesn't guarantee 
>> unique
>> instances, because uniqueness is closely tied to identity. So special-casing
>> 'new Foo()' isn't that different from special-casing 'Object.equals'—in the
>> absence of identity, we'll do something reasonable, but not quite the same.
>>
>> Factories don't enter into this story at all. If we end up having unnamed
>> factories in the future, they will be declared and invoked with a separate
>> syntax, and will be declarable both by identity classes and value classes.
>> (Value class factories don't seem particularly compelling, but they could, 
>> say,
>> be used to smooth migration, like 'Integer.valueOf'.)
>>
>> Biggest concerns: for now, it can be surprising that 'new' doesn't always 
>> give
>> you a unique instance. In a future with factories, navigating between the 
>> 'new'
>> syntax and the factory invocation syntax may be burdensome, with style wars
>> about which approach is better.
>>
>> (2) 'new Foo()' as a general-purpose creation tool
>>
>> In this approach, 'new Foo()' is the use-site syntax for *both* factory and
>> constructor invocation. Factories and constructors live in the same overload
>> resolution "namespace", and all will be considered by the use site.
>>
>> In bytecode, the preferred translation of 'new Foo()' is 'invokestatic
>> Foo.()LFoo;'. Note that this is the case for both value classes *and
>> identity classes*. For compatibility, 'new/dup/<init>' also needs to be
>> 

Re: Evolving instance creation

2022-02-24 Thread Dan Heidinga
Repeating what I said in the EG meeting:

* "new" carries the mental model of allocating space.  For identity
objects, that's on the heap.  For values, that may just be stack space
/ registers.  But it indicates that some kind of allocation / demand
for new storage has occurred.

* It's important that "new" returns a unique instance.  That invariant
has existed since Java's inception and we should be careful about
breaking it.  In the case of values, two identical values can't be
differentiated so I think we're safe to say they are unique but
indistinguishable as no user program can differentiate them.

The rest of this is more of a language design question than a VM one.
The `Foo()` (without a new) is a good starting point for a canonical
factory model.  The challenge will be in expressing the difference
between the factory method and the constructor as they need to be
distinct items in the source (different invariants, different return
values, etc)
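
As a rough illustration of how a constructor and a canonical factory differ in their invariants, here is a sketch in today's Java. The unnamed `Foo()` factory syntax discussed here does not exist, so a named static method stands in for it; InternedPoint and its of method are hypothetical names, and interning is just one possible factory policy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class InternedPoint {
    private static final Map<Long, InternedPoint> CACHE = new ConcurrentHashMap<>();

    final int x;
    final int y;

    // Constructor invariant: every call produces a fresh, unique instance.
    InternedPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Factory invariant: may return an instance that already exists.
    static InternedPoint of(int x, int y) {
        // Pack both coordinates into one key so distinct pairs never collide.
        long key = ((long) x << 32) | (y & 0xFFFFFFFFL);
        return CACHE.computeIfAbsent(key, k -> new InternedPoint(x, y));
    }

    public static void main(String[] args) {
        System.out.println(new InternedPoint(1, 2) == new InternedPoint(1, 2)); // false
        System.out.println(InternedPoint.of(1, 2) == InternedPoint.of(1, 2));   // true
        // Integer.valueOf is an existing factory of this kind: values in the
        // range -128 to 127 are always cached.
        System.out.println(Integer.valueOf(127) == Integer.valueOf(127));       // true
    }
}
```

The constructor and the factory take the same arguments but make different promises about uniqueness, which is the distinction the source syntax would need to express.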

--Dan

On Tue, Feb 22, 2022 at 4:17 PM Dan Smith wrote:
>
> One of the longstanding properties of class instance creation expressions 
> ('new Foo()') is that the instance being produced is unique—that is, not '==' 
> to any previously-created instance.
>
> Value classes will disrupt this invariant, because it's possible to "create" 
> an instance of a value class that already exists:
>
> new Point(1, 2) == new Point(1, 2) // always true
>
> A related, possibly-overlapping new Java feature idea (not concretely 
> proposed, but something the language might want in the future) is the 
> declaration of canonical factory methods in a class, which intentionally 
> *don't* promise unique instances (for example, they might implement 
> interning). These factories would be like constructors in that they wouldn't 
> have a unique method name, but otherwise would behave like ad hoc static 
> factory methods—take some arguments, use them to create/locate an appropriate 
> instance, return it.
>
> I want to focus here on the usage of class instance creation expressions, and 
> how to approach changes to their semantics. This involves balancing the needs 
> of programmers who depend on the unique instance invariant with those who 
> don't care and would prefer fewer knobs/less complexity.
>
> Here are three approaches that I could imagine pursuing:
>
> (1) Value classes are a special case for 'new Foo()'
>
> This is the plan of record: the unique instance invariant continues to hold 
> for 'new Foo()' where Foo is an identity class, but if Foo is a value class, 
> you might get an existing instance.
>
> In bytecode, the translation of 'new Foo()' depends on the kind of class (as 
> determined at compile time). Identity class creation continues to be 
> implemented via 'new Foo; dup; invokespecial Foo.<init>()V'. Value class 
> creation occurs via 'invokestatic Foo.()LFoo;' (method name 
> bikeshedding tk). There is no compatibility between the two (e.g., if an 
> identity class becomes a value class).
>
> In a way, it shouldn't be surprising that a value class doesn't guarantee 
> unique instances, because uniqueness is closely tied to identity. So 
> special-casing 'new Foo()' isn't that different from special-casing 
> 'Object.equals'—in the absence of identity, we'll do something reasonable, but 
> not quite the same.
>
> Factories don't enter into this story at all. If we end up having unnamed 
> factories in the future, they will be declared and invoked with a separate 
> syntax, and will be declarable both by identity classes and value classes. 
> (Value class factories don't seem particularly compelling, but they could, 
> say, be used to smooth migration, like 'Integer.valueOf'.)
>
> Biggest concerns: for now, it can be surprising that 'new' doesn't always 
> give you a unique instance. In a future with factories, navigating between 
> the 'new' syntax and the factory invocation syntax may be burdensome, with 
> style wars about which approach is better.
>
> (2) 'new Foo()' as a general-purpose creation tool
>
> In this approach, 'new Foo()' is the use-site syntax for *both* factory and 
> constructor invocation. Factories and constructors live in the same overload 
> resolution "namespace", and all will be considered by the use site.
>
> In bytecode, the preferred translation of 'new Foo()' is 'invokestatic 
> Foo.()LFoo;'. Note that this is the case for both value classes *and 
> identity classes*. For compatibility, 'new/dup/<init>' also needs to be 
> supported for now; eventually, it might be deprecated. Refactoring between 
> constructors and factories is generally compatible.
>
> Because this re-interpretation of 'new Foo()' supports factories, there is no 
> unique instance invariant. At best, particular classes can document that they 
> produce unique instances, and clients who need this behavior should ensure 
> they're working with classes that promise it. (It's not as simple as looking 
> for a *current* factory, because constructors can be refactored to