Re: Evolving instance creation

2022-03-01 Thread Dan Smith
On Mar 1, 2022, at 6:56 AM, Kevin Bourrillion <kev...@google.com> wrote:

> The main thing I think CICEs/`new` accomplish is simply to "cross the bridge". 
> Constructors are void and non-static; yet somehow we need to call them as if 
> they're static and non-void! `new` gets us across that gap. This seems to me 
> like a special-snowflake problem that `new` is custom built to address, and I 
> would hope we keep it.

Okay. So support for 'new Point()' (1) over just 'Point()' (3) on the basis 
that constructor declarations need special magic to enter the context where the 
constructor body lives. So as long as we're declaring value class constructors 
in the same way as identity class constructors, it makes sense for both to have 
the same invocation syntax, and for that syntax to be somehow different from a 
method invocation.

I suppose (3) envisions this magic happening invisibly, as part of the 
instantiation API provided by the class—there's some magic under the covers 
where a bridge/factory-like entity gets invoked and sets up the context for the 
constructor body. But I agree that it's probably better not to have to appeal 
to something invisible when people are already used to the magic being explicit.

> A couple more minor points about the factories idea:
>
>> A related, possibly-overlapping new Java feature idea (not concretely proposed, 
>> but something the language might want in the future) is the declaration of 
>> canonical factory methods in a class, which intentionally *don't* promise 
>> unique instances (for example, they might implement interning). These factories 
>> would be like constructors in that they wouldn't have a unique method name, but 
>> otherwise would behave like ad hoc static factory methods: take some arguments, 
>> use them to create/locate an appropriate instance, return it.
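
For readers who want something concrete: the "canonical factory" syntax described above does not exist, so the following is only a sketch of the closest thing available today, an ordinary static factory that interns. The names InterningFactoryDemo, Color, and of are made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class InterningFactoryDemo {
    // Hypothetical class, used only to illustrate an interning factory.
    static final class Color {
        private static final Map<String, Color> CACHE = new ConcurrentHashMap<>();
        private final String name;

        // The constructor is hidden; callers must go through the factory.
        private Color(String name) { this.name = name; }

        // Ad hoc static factory: creates an instance on first use,
        // then returns the same instance for the same argument.
        static Color of(String name) {
            return CACHE.computeIfAbsent(name, Color::new);
        }

        String name() { return name; }
    }

    public static void main(String[] args) {
        // The factory makes no "unique instance" promise.
        System.out.println(Color.of("red") == Color.of("red")); // true
        // The JDK does the same for small boxed integers (-128..127).
        System.out.println(Integer.valueOf(100) == Integer.valueOf(100)); // true
    }
}
```

The difference from the idea in the thread is that this factory needs a name (of) and callers need to know it.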

> Can you clarify what these offer that static methods don't already provide? The 
> two weaknesses I'm aware of with static factory methods are (1) subclasses 
> still need a constructor to call and (2) often you don't really want the burden 
> of naming them, you just want them to look like the obvious standard creation 
> path. It sounds like this addresses (2) but not (1), and I assume also 
> addresses some (3).

A couple of things:

- If it's canonical, everybody knows where to find it. APIs like reflection and 
tools like serialization can create instances through a universally-recognized 
mechanism (but one that is more flexible than constructors).

- In a similar vein, if JVMS can count on instantiation being supported by a 
canonical method name, then this approach can subsume existing uses of 
'new/dup/<init>', which are a major source of complexity. This is a very long 
game, but the idea is that eventually the old mechanism (specifically, use of 
the 'new' bytecode outside of the class being instantiated) could be deprecated.
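
To illustrate the "everybody knows where to find it" point with today's reflection API: a constructor can be looked up from its parameter types alone, while a static factory can only be found if the caller already knows its name. This is a sketch; Point, of, and the two helper methods are hypothetical names.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

public class ReflectiveCreationDemo {
    // Hypothetical class with both a constructor and a named factory.
    public static final class Point {
        public final int x, y;
        public Point(int x, int y) { this.x = x; this.y = y; }
        public static Point of(int x, int y) { return new Point(x, y); }
    }

    // Constructors are discoverable without any naming convention.
    static Point viaConstructor(int x, int y) throws ReflectiveOperationException {
        Constructor<Point> ctor = Point.class.getDeclaredConstructor(int.class, int.class);
        return ctor.newInstance(x, y);
    }

    // A static factory can only be found if the tool knows the name "of".
    static Point viaFactory(int x, int y) throws ReflectiveOperationException {
        Method factory = Point.class.getDeclaredMethod("of", int.class, int.class);
        return (Point) factory.invoke(null, x, y);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(viaConstructor(1, 2).x); // 1
        System.out.println(viaFactory(3, 4).y);     // 4
    }
}
```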

>> (2) 'new Foo()' as a general-purpose creation tool
>>
>> In this approach, 'new Foo()' is the use-site syntax for *both* factory and 
>> constructor invocation. Factories and constructors live in the same overload 
>> resolution "namespace", and all will be considered by the use site.
>
> It sounds to me like these factories would be static, so `new` would not be 
> required by the "cross the bridge" interpretation given above.

Right. This approach gives up the use-site/declaration-site alignment, instead 
interpreting 'new' as "make me one of these, using whatever mechanism the class 
provides".


Re: Abstract class with fields implementing ValueObject

2022-03-01 Thread Dan Heidinga
>
> My initial reaction was that, no, we really do want IdentityObject here, 
> because it's useful to be able to assign an abstract class type to 
> IdentityObject.
>
> But: for new classes, the compiler will have an opportunity to be explicit. 
> It's mostly a question of how we handle legacy classes. And there, it would 
> actually be bad to infer IdentityObject, when in most cases the class will 
> get permits_value when it is recompiled. Probably best to avoid a scenario 
> like:
>
> - Compile against legacy API, assign library.AbstractBanana to IdentityObject 
> in your code
> - Upgrade to newer version of the API, assignment from library.AbstractBanana 
> to IdentityObject is an error
>
> So, okay, let's say we limit JVM inference to concrete classes. And javac 
> will infer/generate 'implements IdentityObject' if it decides an abstract 
> class can't be permits_value.
>
> What about separate compilation? javac's behavior might be something like: 1) 
> look for fields, 'synchronized', etc. in the class declaration, and if any 
> are present, add 'implements IdentityObject' (if it's not already there); 2) 
> if the superclass is permits_value and this class doesn't extend 
> IdentityObject (directly or indirectly), set permits_value. (1) is a local 
> decision, while (2) depends on multiple classes, so can be disrupted by 
> separate compilation. But, thinking through the scenarios here... I'm pretty 
> comfortable saying that an abstract class that is neither permits_value nor a 
> subclass of IdentityObject is in an unstable state, and, like the legacy 
> case, it's probably better if programmers *don't* write code assuming they 
> can assign to IdentityObject.
>

I agree that "it's probably better if programmers *don't* write code
assuming they can assign to IdentityObject."  We've historically
assumed that separate compilation should endeavour to be consistent
and if not, should live with the consequences.  The best analogy I can
think of is the StackOverflowError due to bridge method loops - there
we tell users to fix their separate compilation issues and don't try
to paper over it in the VM - and I think it's best to do the same
here.

Using both (1) and (2) is the right call here - no need to limit to
the local information when we can make the better decision using other
necessarily available classes (i.e., the superclass).
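
As a rough sketch of the two javac rules quoted above, here is one way the decision for an abstract class could be written down. This is not javac code; the type names, the three boolean inputs, and the Outcome values are assumptions made for illustration.

```java
public class IdentityInferenceSketch {
    // Hypothetical outcomes for an abstract class.
    enum Outcome { IMPLEMENTS_IDENTITY_OBJECT, PERMITS_VALUE, NEITHER }

    // Hypothetical summary of what the compiler knows about the class.
    record AbstractClassInfo(boolean declaresFieldsOrSynchronized,
                             boolean extendsIdentityObject,
                             boolean superclassPermitsValue) {}

    static Outcome infer(AbstractClassInfo c) {
        // Rule (1), a local decision: fields, 'synchronized', etc. force
        // 'implements IdentityObject' (a no-op if it is already there).
        if (c.declaresFieldsOrSynchronized() || c.extendsIdentityObject()) {
            return Outcome.IMPLEMENTS_IDENTITY_OBJECT;
        }
        // Rule (2), which depends on another class: inherit permits_value
        // from the superclass as seen at compile time.
        if (c.superclassPermitsValue()) {
            return Outcome.PERMITS_VALUE;
        }
        // The "unstable state": neither permits_value nor an IdentityObject.
        return Outcome.NEITHER;
    }

    public static void main(String[] args) {
        System.out.println(infer(new AbstractClassInfo(true, false, true)));
        System.out.println(infer(new AbstractClassInfo(false, false, true)));
        System.out.println(infer(new AbstractClassInfo(false, false, false)));
    }
}
```

Only the second rule reads information from a different class file, which is why it is the one separate compilation can disrupt.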

--Dan



Re: Evolving instance creation

2022-03-01 Thread Kevin Bourrillion
Seems like this decision is trending in the direction I'd prefer already,
but here's some argumentation that *might* be helpful from the
programming-model perspective.


On Tue, Feb 22, 2022 at 1:17 PM Dan Smith wrote:

> One of the longstanding properties of class instance creation expressions
> ('new Foo()') is that the instance being produced is unique—that is, not
> '==' to any previously-created instance.
>

I'll argue that this is an incidental association only.

Note that `new` has simply never been *needed* for identityless types
before; for those (all 8 of them), literals and binary expressions and
things had us covered. So `new` has so far seemed associated with
identityful types. But I think the expectation quoted here clearly comes
from the identity-type-ness, not from the `new` keyword. If we use `new`
with identityless objects or values, a distinct-identity expectation simply
doesn't apply.

Plus as Remi says, changing to that type changes `==` into a different
operator with the same name. So I think that this:


> new Point(1, 2) == new Point(1, 2) // always true
>

is entirely *un*problematic!
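
For contrast, a small sketch of what the same expression does in current Java, where every class (records included) is an identity class. The "always true" result quoted above assumes Point is a value class under the design being discussed; that part cannot be shown in released Java. EqualityDemo and Point are illustrative names.

```java
public class EqualityDemo {
    // Today a record is still an identity class.
    record Point(int x, int y) {}

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);

        // Identity comparison: two 'new' expressions give two identities.
        System.out.println(a == b);      // false
        // State comparison: what '==' would mean for a value class.
        System.out.println(a.equals(b)); // true

        // The existing identityless types already behave this way.
        int i = 1 + 2;
        int j = 3;
        System.out.println(i == j);      // true
    }
}
```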

Dan H. says, "`new` carries the mental model of allocating space" -- again,
I think it's incidental. Because the *point of introducing* identityless
types is that the distinction between creating and reusing (summoning from
the ether somehow) vanishes. We shouldn't be able to distinguish those
cases.

As Dan S. says later,

> I'd rather have programmers think in these terms: when you instantiate a
> value class, you might get an object that already exists. Whether there are
> copies of that object at different memory locations or not is
> irrelevant—it's still *the same object*.


But my reaction is: then to the programming model it *might as well just
look like creation*. It can't really "look like reusing" without forcing
the question of "reusing what from where?". It can only just look
*different*. But I don't think it needs to. (I think this is what Dan H
ends up supporting too.)

The main thing I think CICEs/`new` accomplish is simply to "cross the
bridge". Constructors are void and non-static; yet somehow we need to
*call* them as if they're static and non-void! `new` gets us across that
gap. This seems to me like a special-snowflake problem that `new` is custom
built to address, and I would hope we keep it.

Seen this way, it's essential that this bridge-crossing happens *somewhere*,
but it doesn't necessarily mean constructors need to be spiffy public API.
It could be a dirty secret we hide within our static factory methods. And
we often do this, because public constructors *aren't* spiffy; they can't
have names, relying purely on argument types and order to disambiguate, and
they weirdly promise the caller never to return a subtype even though a
caller should never even care about that (because substitutability
principle).
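
A minimal sketch of the "dirty secret" pattern described above, in which the `new` calls are hidden behind named static factories. Shape, circle, and square are hypothetical names. Both factories take a single double, which two constructors could not do, and both return a subtype the caller never sees.

```java
public class NamedFactoryDemo {
    static abstract class Shape {
        // Not public API: only the nested subclasses can call this.
        private Shape() {}

        abstract double area();

        // Named factories: same parameter types, disambiguated by name.
        static Shape circle(double radius) { return new Circle(radius); }
        static Shape square(double side) { return new Square(side); }

        // The bridge-crossing 'new' lives here, out of the caller's sight.
        private static final class Circle extends Shape {
            private final double radius;
            Circle(double radius) { this.radius = radius; }
            @Override double area() { return Math.PI * radius * radius; }
        }

        private static final class Square extends Shape {
            private final double side;
            Square(double side) { this.side = side; }
            @Override double area() { return side * side; }
        }
    }

    public static void main(String[] args) {
        System.out.println(Shape.square(2.0).area()); // 4.0
        System.out.println(Shape.circle(1.0).area() == Math.PI); // true
    }
}
```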


> Here are three approaches that I could imagine pursuing:
>
> (1) Value classes are a special case for 'new Foo()'
>
> This is the plan of record: the unique instance invariant continues to
> hold for 'new Foo()' where Foo is an identity class, but if Foo is a value
> class, you might get an existing instance.
>

(I've argued above it's not even a special case.)


> Biggest concerns: for now, it can be surprising that 'new' doesn't always
> give you a unique instance.


This is the best kind of surprise! Because grappling with it points you
directly toward understanding what identityless classes are all about.


A couple more minor points about the factories idea:


> A related, possibly-overlapping new Java feature idea (not concretely
> proposed, but something the language might want in the future) is the
> declaration of canonical factory methods in a class, which intentionally
> *don't* promise unique instances (for example, they might implement
> interning). These factories would be like constructors in that they
> wouldn't have a unique method name, but otherwise would behave like ad hoc
> static factory methods—take some arguments, use them to create/locate an
> appropriate instance, return it.
>

Can you clarify what these offer that static methods don't already provide?
The two weaknesses I'm aware of with static factory methods are (1)
subclasses still need a constructor to call and (2) often you don't really
want the burden of naming them, you just want them to look like the obvious
standard creation path. It sounds like this addresses (2) but not (1), and
I assume also addresses some (3).



> (2) 'new Foo()' as a general-purpose creation tool
>
> In this approach, 'new Foo()' is the use-site syntax for *both* factory
> and constructor invocation. Factories and constructors live in the same
> overload resolution "namespace", and all will be considered by the use site.
>

It sounds to me like these factories would be static, so `new` would not be
required by the "cross the bridge" interpretation given above.


On Thu, Feb 24, 2022 at 7:40 AM Dan Heidinga wrote: