Thanks Dan for putting the work in to provide a credible alternative.

Let me add some background for how we came up with these things.  At some point we asked ourselves, what if we had identity and value classes from day 1?  How would that affect the object model?  And we concluded at the time that we probably wouldn't want the identity-indeterminacy of Object, but instead would want something like

    abstract class Object { }
    class IdentityObject extends Object { }
    abstract class ValueObject extends Object { }

So the {Identity,Value}Object interfaces seemed valuable pedagogically, in that they make the object hierarchy reflect the language division.  At the time, we imagined there might be methods that apply to all value objects, that could live in ValueObject.

A separate factor is that we were taking operations that were previously total (locking, weak refs) and making them partial. This is scary!  So we wanted a way to make these expressible in the static type system.

Unfortunately, the interfaces do not really deliver on either goal, because we can't turn back time.  We still have to deal with `new Object()`, so we can't (yet) make Object abstract. Many signatures will not be changeable from "Object" to "IdentityObject" for reasons of compatibility, unless we make IdentityObject erase to Object (which has its own problems).  If people use it at all for type bounds, we'll see lots of uses of `Foo<? extends Bar&IdentityObject>`, which will put more pressure on our weak support for intersection types.  And dynamic errors will still happen, because too much of the world was built using signatures that don't express identity-ness. (Kevin will see a parallel to introducing nullness annotations; it might be fine if you build the world that way from scratch, but the transition is painful when you have to interpret an unadorned type as "of unspecified identity-ness.")
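
To make the type-bound point concrete, here is a minimal sketch. It assumes a stand-in marker interface named `IdentityObject` declared locally for illustration; `Bar`, `Tagged`, `Plain`, and `lockAndName` are hypothetical names. Java accepts intersection bounds on type variables but not on wildcards, so the sketch uses `<T extends Bar & IdentityObject>`.

```java
public class IntersectionBoundSketch {
    // Stand-in for the interface under discussion; declared here only for illustration.
    interface IdentityObject { }

    interface Bar { String name(); }

    // A class written "from scratch" with the marker.
    static final class Tagged implements Bar, IdentityObject {
        public String name() { return "tagged"; }
    }

    // A legacy-style class: an identity object at run time, but its type does not say so.
    static final class Plain implements Bar {
        public String name() { return "plain"; }
    }

    // The intersection bound lets the method lock on its argument without a dynamic check.
    static <T extends Bar & IdentityObject> String lockAndName(T t) {
        synchronized (t) {
            return t.name();
        }
    }

    public static void main(String[] args) {
        System.out.println(lockAndName(new Tagged()));
        // lockAndName(new Plain());  // does not compile: Plain is not an IdentityObject
    }
}
```

`Plain` is perfectly lockable, yet the bound rejects it. That is the transition problem in miniature: existing types would have to be retrofitted before such bounds become usable.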

Several years on, we're still leaning on the same few motivating examples -- capturing things like "I might lock this" in the type system.  That we haven't come up with more killer examples is notable.  And I grow increasingly skeptical of the value of the locking example, both because this is not how concurrent code is written, and because we *still* have to deal with the risk of dynamic errors because most of the world's code has not been (and will not be) written to use IdentityObject throughout.


As Dan points out, the main thing we give up by backing off from these interfaces is the static typing; we don't get to use `IdentityObject` as a parameter type, return type, or type bound.  And the only reason we've come up with so far to want that is a pretty lame one -- locking.

From a language design perspective, I find it pretty leaky that you declare a class with `value class`, but express the subclassing constraint with `extends IdentityObject`.

On 3/22/2022 7:56 PM, Dan Smith wrote:
In response to some encouragement from Remi, John, and others, I've decided to 
take a closer look at how we might approach the categorization of value and 
identity classes without relying on the IdentityObject and ValueObject 
interfaces.

(For background, see the thread "The interfaces IdentityObject and ValueObject must 
die" in January.)

These interfaces have found a number of different uses (enumerated below), 
while mostly leaning on the existing functionality of interfaces, so there's a 
pretty good complexity vs. benefit trade-off. But their use has some rough 
edges, and inserting them everywhere has a nontrivial compatibility impact. Can 
we do better?

Language proposal:

- A "value class" is any class whose instances are all value objects. An "identity class" is any 
class whose instances are all identity objects. Abstract classes can be value classes or identity classes, or neither. 
Interfaces can be "value interfaces" or "identity interfaces", or neither.

- A class/interface can be designated a value class with the 'value' modifier.

    value class Foo {}
    abstract value class Bar {}
    value interface Baz {}
    value record Rec(int x) {}

A class/interface can be designated an identity class with the 'identity' 
modifier.

    identity class Foo {}
    abstract identity class Bar {}
    identity interface Baz {}
    identity record Rec(int x) {}

- Concrete classes with neither modifier are implicitly 'identity'; abstract 
classes with neither modifier, but with certain identity-dependent features 
(instance fields, initializers, synchronized methods, ...) are implicitly 
'identity' (possibly with a warning). Other abstract classes and interfaces are 
fine being neither (thus supporting both kinds of subclasses).

- The properties are inherited: if you extend a value class/interface, you are 
a value class/interface. (Same for identity classes/interfaces.) It's an error 
to be both.

- The usual restrictions apply to value classes, both concrete and abstract; and also to 
"neither" abstract classes, if they haven't been implicitly made 'identity'.

- An API ('Object.isValueObject()'?) allows for dynamically distinguishing between value 
objects and identity objects. The reflection API (in java.lang.Class) allows for 
detection of value classes/interfaces, identity classes/interfaces, and 
"neither" classes/interfaces.

- TBD whether/how we track these properties statically so that the type system 
can catch mismatches between non-identity class types and uses that assume identity.

JVM proposal:

- Same conceptual framework.

- Classes can be ACC_VALUE, ACC_IDENTITY, or neither.

- Legacy-version classes are implicitly ACC_IDENTITY. Legacy interfaces are 
not. Optionally, modern-version concrete classes are also implicitly 
ACC_IDENTITY.

(Trying out this alternative approach to abstract classes: there's no more 
ACC_PERMITS_VALUE; instead, legacy-version abstract classes are automatically 
ACC_IDENTITY, and modern-version abstract classes permit value subclasses 
unless they opt out with ACC_IDENTITY. It's the bytecode generator's 
responsibility to set these flags appropriately. Conceptually cleaner, maybe 
too risky...)

- At class load time, we inherit value/identity-ness and check for conflicts. It's okay 
to have neither flag set but inherit the property from one of your supers. We also 
enforce constraints on value classes and "neither" abstract classes.
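
The inherit-and-check step can be modeled in a few lines of ordinary Java. This is only a sketch of the rule as stated above, not of any JVM implementation: the `Kind` enum, the `effectiveKind` method, and the use of `IllegalStateException` to signal a conflict are all assumptions made for illustration.

```java
import java.util.List;

public class KindInheritanceSketch {
    // The three states described above; NEITHER means no flag set and nothing inherited.
    enum Kind { VALUE, IDENTITY, NEITHER }

    // Combines a class's own flag with the effective kinds of its supers.
    // Throws if the result would be both VALUE and IDENTITY.
    static Kind effectiveKind(Kind declared, List<Kind> supers) {
        Kind result = declared;
        for (Kind inherited : supers) {
            if (inherited == Kind.NEITHER) {
                continue; // a "neither" super imposes nothing
            }
            if (result == Kind.NEITHER) {
                result = inherited; // no flag so far: inherit the property
            } else if (result != inherited) {
                throw new IllegalStateException(
                    "conflict: " + result + " vs. inherited " + inherited);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // No flag set, one value super: the property is inherited.
        System.out.println(effectiveKind(Kind.NEITHER, List.of(Kind.NEITHER, Kind.VALUE)));
        // Declared identity with a value super: rejected.
        try {
            effectiveKind(Kind.IDENTITY, List.of(Kind.VALUE));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```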

---

So how does this score as a replacement for the list of features enabled by the 
interfaces?

- Dynamic detection: 'obj instanceof ValueObject' is quite straightforward; if 
we can replace that with 'obj.isValueObject()', that feels about equally 
useful. (I'd be more pessimistic about something like 
'Objects.isValueObject(obj)'.)

- Subclass restriction: 'implements IdentityObject' has been replaced with the 'identity' 
modifier. Complexity cost of special modifiers seems on par with the complexity of 
special rules for inferring and checking the superinterfaces. I think it's a win that we 
use the 'value' modifier and "value" terminology for all kinds of 
classes/interfaces, not just concrete classes.

- Variable types: I don't see a good way to get the equivalent of an 'IdentityObject' type. It 
would involve tracking the 'identity' property through the whole type system, which seems like a 
huge burden for the occasional "I'm not sure you can lock on that" error message. So we'd 
probably need to be okay letting that go. Fortunately, I'm not sure it's a great loss—lots of code 
today seems happy using 'Object' when it means, informally, "object that I've created for the 
sole purpose of locking".

- Type variable bounds: this one seems more achievable, by using the 'value' and 
'identity' keywords to indicate a new kind of bounds check ('<identity T extends 
Runnable>'). Again, it's added complexity, but it's more localized. We should 
think more about the use cases, and decide if it passes the cost/benefit analysis. If 
not, nothing else depends on this, so it could be dropped. (Or left to a future, more 
general feature?)

- Documentation: we've lost the handy javadoc location to put some explanations 
about identity & value objects in a place that curious programmers can easily 
stumble on. Anything we want to say needs to go in JLS/JVMS (or perhaps the 
java.lang.Object javadoc).

- Compatibility: pretty clear win here. No interface injection means tools that 
depend on reflection results won't be broken. (We've found a significant number 
of these problems in our own code/tests, FWIW.) No new static types means 
inference results won't change. There's less risk of incompatibilities when 
adding/removing the 'identity' and 'value' keywords (although there can still 
be source, binary, and behavioral incompatibilities).
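
For the "Variable types" item above, the idiom in question looks roughly like the following in today's Java. The class and member names are hypothetical; the point is only that the lock's static type is plain `Object`, and nothing in the type system records that it is meant for locking.

```java
public class LockObjectSketch {
    // A plain Object created for the sole purpose of locking.
    // Its static type is just Object; nothing marks it as an identity object.
    private final Object lock = new Object();
    private int count;

    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    int count() {
        synchronized (lock) {
            return count;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockObjectSketch counter = new LockObjectSketch();
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counter.count()); // prints 2000
    }
}
```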
