One of the longstanding properties of class instance creation expressions ('new
Foo()') is that the instance being produced is unique—that is, not '==' to any
previously-created instance.
Value classes will disrupt this invariant, because it's possible to "create" an
instance of a value class that already exists:
    new Point(1, 2) == new Point(1, 2) // always true
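For contrast, here is a minimal sketch of what the invariant looks like today with an ordinary identity class. 'IdentityPoint' and 'UniqueInstanceDemo' are hypothetical names made up for illustration; value classes themselves can't be shown in code that compiles on a released JDK.

```java
class UniqueInstanceDemo {
    // Two creation expressions with equal arguments yield distinct objects,
    // because IdentityPoint is an identity class.
    static boolean sameInstance() {
        return new IdentityPoint(1, 2) == new IdentityPoint(1, 2);
    }

    // The same two objects are still equal according to equals().
    static boolean equalState() {
        return new IdentityPoint(1, 2).equals(new IdentityPoint(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(sameInstance()); // false
        System.out.println(equalState());   // true
    }
}

// Hypothetical identity class standing in for Point (assumption: not a value class).
final class IdentityPoint {
    final int x;
    final int y;

    IdentityPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof IdentityPoint)) {
            return false;
        }
        IdentityPoint p = (IdentityPoint) o;
        return p.x == x && p.y == y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}
```

If Point were a value class, the first comparison is the one whose result would change.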
A related, possibly-overlapping new Java feature idea (not concretely proposed,
but something the language might want in the future) is the declaration of
canonical factory methods in a class, which intentionally *don't* promise
unique instances (for example, they might implement interning). These factories
would be like constructors in that they wouldn't have a unique method name, but
otherwise would behave like ad hoc static factory methods—take some arguments,
use them to create/locate an appropriate instance, return it.
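As a rough illustration of the behavior such a factory might have, here is an interning factory written with today's tools, as a named static method. The class 'Name', the method 'of', and the cache are all assumptions made up for this sketch; the unnamed factory declaration syntax itself has not been proposed and is not shown.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class InterningDemo {
    public static void main(String[] args) {
        // Equal arguments locate the same instance rather than creating a new one.
        System.out.println(Name.of("x") == Name.of("x")); // true
    }
}

// Hypothetical class whose only creation path is an interning factory.
final class Name {
    private static final Map<String, Name> CACHE = new ConcurrentHashMap<>();

    private final String text;

    private Name(String text) {
        this.text = text;
    }

    // Takes an argument, creates or locates an appropriate instance, returns it.
    static Name of(String text) {
        return CACHE.computeIfAbsent(text, Name::new);
    }

    String text() {
        return text;
    }
}
```

The only difference from the factories described above would be the lack of a method name at the declaration and use sites.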
I want to focus here on the usage of class instance creation expressions, and
how to approach changes to their semantics. This involves balancing the needs
of programmers who depend on the unique instance invariant with those who don't
care and would prefer fewer knobs/less complexity.
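One familiar kind of code that depends on the invariant is the private sentinel object. The following is only a sketch with made-up names ('SentinelDemo', 'ABSENT', 'describe'); it works because 'new Object()' is guaranteed not to be '==' to anything a caller could have stored in the map.

```java
import java.util.HashMap;
import java.util.Map;

class SentinelDemo {
    // Relies on the unique instance invariant: no other reference is == to this.
    private static final Object ABSENT = new Object();

    // Distinguishes "no mapping" from "mapped to null" using the sentinel.
    static String describe(Map<String, Object> map, String key) {
        Object value = map.getOrDefault(key, ABSENT);
        if (value == ABSENT) {
            return "absent";
        }
        return value == null ? "present, null" : "present: " + value;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        map.put("k", null);
        System.out.println(describe(map, "k")); // present, null
        System.out.println(describe(map, "z")); // absent
    }
}
```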
Here are three approaches that I could imagine pursuing:
(1) Value classes are a special case for 'new Foo()'
This is the plan of record: the unique instance invariant continues to hold for
'new Foo()' where Foo is an identity class, but if Foo is a value class, you
might get an existing instance.
In bytecode, the translation of 'new Foo()' depends on the kind of class (as
determined at compile time). Identity class creation continues to be
implemented via 'new Foo; dup; invokespecial Foo.<init>()V'. Value class
creation occurs via 'invokestatic Foo.<newvalue>()LFoo;' (method name
bikeshedding tk). The two translations are not compatible with each other
(e.g., if an identity class becomes a value class, previously compiled
'new Foo()' call sites must be recompiled).
In a way, it shouldn't be surprising that a value class doesn't guarantee
unique instances, because uniqueness is closely tied to identity. So
special-casing 'new Foo()' isn't that different from special-casing
'Object.equals': in the absence of identity, we'll do something reasonable, but
not quite the same.
Factories don't enter into this story at all. If we end up having unnamed
factories in the future, they will be declared and invoked with a separate
syntax, and will be declarable both by identity classes and value classes.
(Value class factories don't seem particularly compelling, but they could, say,
be used to smooth migration, like 'Integer.valueOf'.)
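For reference, this is how 'Integer.valueOf' already behaves today: it is documented to cache values in the range -128 to 127, so it makes no unique-instance promise there. The class name 'ValueOfDemo' is made up for this sketch, and nothing is asserted about values outside the documented range.

```java
class ValueOfDemo {
    // Within the documented cache range, valueOf returns the same instance.
    static boolean cachedSame() {
        Integer a = Integer.valueOf(100);
        Integer b = Integer.valueOf(100);
        return a == b;
    }

    // equals() compares the int value, regardless of which instance is returned.
    static boolean equalValue() {
        return Integer.valueOf(100).equals(Integer.valueOf(100));
    }

    public static void main(String[] args) {
        System.out.println(cachedSame()); // true
        System.out.println(equalValue()); // true
    }
}
```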
Biggest concerns: for now, it can be surprising that 'new' doesn't always give
you a unique instance. In a future with factories, navigating between the 'new'
syntax and the factory invocation syntax may be burdensome, with style wars
about which approach is better.
(2) 'new Foo()' as a general-purpose creation tool
In this approach, 'new Foo()' is the use-site syntax for *both* factory and
constructor invocation. Factories and constructors live in the same overload
resolution "namespace", and all will be considered by the use site.
In bytecode, the preferred translation of 'new Foo()' is 'invokestatic
Foo.<new>()LFoo;'. Note that this is the case for both value classes *and
identity classes*. For compatibility, 'new/dup/<init>' also needs to be
supported for now; eventually, it might be deprecated. Refactoring between
constructors and factories is generally compatible.
Because this re-interpretation of 'new Foo()' supports factories, there is no
unique instance invariant. At best, particular classes can document that they
produce unique instances, and clients who need this behavior should ensure
they're working with classes that promise it. (It's not as simple as looking
for a *current* factory, because constructors can be refactored to factories.)
For developers who don't care about unique instances, this is the simplest
approach: whenever you want an instance of Foo, you say 'new Foo()'.
Biggest concerns: we've demoted an ironclad semantic guarantee to an optional
property of some classes. For those developers/use cases who care about the
unique instance invariant, that may be difficult, especially because we're
undoing a longstanding property rather than designing it this way from the
beginning.
(3) 'new Foo()' for unique instances and just 'Foo()' otherwise
Here, the 'new' keyword is reserved for cases in which a unique instance is
guaranteed. For value class creation, factory invocation, and constructor
invocation when unique instances don't matter, a bare 'Foo()' call is used
instead. 'new Point()' would be an error—this syntax doesn't work with value
classes.
In bytecode, 'new Foo()' always compiles to 'new/dup/<init>', while plain
'Foo()' typically compiles to 'invokestatic Foo.<make>()LFoo;' (method name
bikeshedding tk). For compatibility, plain 'Foo()' would support
'new/dup/<init>' invocations as well, if that's all the class provides.
Refactoring between constructors and factories is generally compatible for
plain 'Foo()' use sites, but not 'new Foo()' use sites.
The plain 'Foo()' would become the preferred style for general-purpose usage,
while 'new Foo()' would (eventually, after a long migration period) signal an
interest in the unique instance guarantee. Java code written with the updated
style is a little lighter on "ceremony".
Biggest concerns: a somewhat arbitrary shift in coding style for all
programmers to learn, which at a minimum must be adopted when working with
value classes.
---
What are your thoughts about the significance of the unique instance invariant?
Is it important enough to design instance creation syntax around it? Does either
(2) or (3) above sound like a better destination than the plan of record?