In response to some encouragement from Remi, John, and others, I've decided to 
take a closer look at how we might approach the categorization of value and 
identity classes without relying on the IdentityObject and ValueObject 
interfaces.

(For background, see the thread "The interfaces IdentityObject and ValueObject 
must die" in January.)

These interfaces have found a number of different uses (enumerated below), 
while mostly leaning on the existing functionality of interfaces, so there's a 
pretty good complexity vs. benefit trade-off. But their use has some rough 
edges, and inserting them everywhere has a nontrivial compatibility impact. Can 
we do better?

Language proposal:

- A "value class" is any class whose instances are all value objects. An 
"identity class" is any class whose instances are all identity objects. 
Abstract classes can be value classes or identity classes, or neither. 
Interfaces can be "value interfaces" or "identity interfaces", or neither.

- A class/interface can be designated a value class with the 'value' modifier.

    value class Foo {}
    abstract value class Bar {}
    value interface Baz {}
    value record Rec(int x) {}

A class/interface can be designated an identity class with the 'identity' 
modifier.

    identity class Foo {}
    abstract identity class Bar {}
    identity interface Baz {}
    identity record Rec(int x) {}

- Concrete classes with neither modifier are implicitly 'identity'; abstract 
classes with neither modifier, but with certain identity-dependent features 
(instance fields, initializers, synchronized methods, ...) are implicitly 
'identity' (possibly with a warning). Other abstract classes and interfaces are 
fine being neither (thus supporting both kinds of subclasses).

- The properties are inherited: if you extend a value class/interface, you are 
a value class/interface. (Same for identity classes/interfaces.) It's an error 
to be both.

- The usual restrictions apply to value classes, both concrete and abstract; 
and also to "neither" abstract classes, if they haven't been implicitly made 
'identity'.

- An API ('Object.isValueObject()'?) allows for dynamically distinguishing 
between value objects and identity objects. The reflection API (in 
java.lang.Class) allows for detection of value classes/interfaces, identity 
classes/interfaces, and "neither" classes/interfaces.

- TBD whether/how we track these properties statically so that the type system 
can catch mismatches between non-identity class types and uses that assume 
identity.
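
To make the modifier and defaulting rules above a bit more concrete, here is a 
rough model in plain Java. It is only a sketch: 'Kind', 'Feature', 'Decl', and 
'implicitKind' are made-up names for illustration, not part of any proposed 
API, and the 'Feature' enum lists only the three identity-dependent features 
named above (the real list is longer). Inheritance from supers is not modeled 
here.

```java
import java.util.EnumSet;
import java.util.Set;

class ModifierDefaults {
    enum Kind { VALUE, IDENTITY, NEITHER }

    // Only the identity-dependent features named in the message; the full list is open-ended.
    enum Feature { INSTANCE_FIELD, INSTANCE_INITIALIZER, SYNCHRONIZED_METHOD }

    /** Hypothetical summary of a declaration: what it is, its explicit modifier, and its features. */
    record Decl(boolean isInterface, boolean isAbstract, Kind modifier, Set<Feature> features) {}

    /** Kind a declaration ends up with before inheritance from its supers is considered. */
    static Kind implicitKind(Decl d) {
        if (d.modifier() != Kind.NEITHER) {
            return d.modifier();            // an explicit 'value' or 'identity' modifier wins
        }
        if (d.isInterface()) {
            return Kind.NEITHER;            // unmarked interfaces stay "neither"
        }
        if (!d.isAbstract()) {
            return Kind.IDENTITY;           // unmarked concrete classes are implicitly 'identity'
        }
        // Unmarked abstract classes become 'identity' only if they use identity-dependent features.
        return d.features().isEmpty() ? Kind.NEITHER : Kind.IDENTITY;
    }

    public static void main(String[] args) {
        Decl concrete = new Decl(false, false, Kind.NEITHER, Set.of());
        Decl plainAbstract = new Decl(false, true, Kind.NEITHER, Set.of());
        Decl abstractWithField =
                new Decl(false, true, Kind.NEITHER, EnumSet.of(Feature.INSTANCE_FIELD));
        System.out.println(implicitKind(concrete));
        System.out.println(implicitKind(plainAbstract));
        System.out.println(implicitKind(abstractWithField));
    }
}
```

The point of the sketch is just that the default depends on three inputs (the 
explicit modifier, concrete vs. abstract vs. interface, and the presence of 
identity-dependent features), checked in that order.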

JVM proposal:

- Same conceptual framework.

- Classes can be ACC_VALUE, ACC_IDENTITY, or neither.

- Legacy-version classes are implicitly ACC_IDENTITY. Legacy interfaces are 
not. Optionally, modern-version concrete classes are also implicitly 
ACC_IDENTITY.

(Trying out this alternative approach to abstract classes: there's no more 
ACC_PERMITS_VALUE; instead, legacy-version abstract classes are automatically 
ACC_IDENTITY, and modern-version abstract classes permit value subclasses 
unless they opt out with ACC_IDENTITY. It's the bytecode generator's 
responsibility to set these flags appropriately. Conceptually cleaner, maybe 
too risky...)

- At class load time, we inherit value/identity-ness and check for conflicts. 
It's okay to have neither flag set but inherit the property from one of your 
supers. We also enforce constraints on value classes and "neither" abstract 
classes.
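
The load-time inheritance and conflict check could be modeled roughly as 
follows. This is an illustrative sketch, not how a JVM would implement it: 
'ClassModel', 'Kind', and 'resolve' are hypothetical names, the declared flag 
is assumed to already reflect the implicit ACC_IDENTITY rules above, and the 
additional constraints on value classes and "neither" abstract classes are not 
checked.

```java
import java.util.List;

class KindResolution {
    enum Kind { VALUE, IDENTITY, NEITHER }

    /** Hypothetical stand-in for a loaded class: its declared flag plus its direct supers. */
    record ClassModel(String name, Kind declared, List<ClassModel> supers) {

        /** Effective kind: the declared flag merged with whatever the supers contribute. */
        Kind resolve() {
            Kind result = declared;
            for (ClassModel s : supers) {
                Kind inherited = s.resolve();
                if (inherited == Kind.NEITHER) {
                    continue;               // a "neither" super imposes nothing
                }
                if (result == Kind.NEITHER) {
                    result = inherited;     // no flag of our own: inherit the property
                } else if (result != inherited) {
                    // Stand-in for whatever linkage error a real JVM would report.
                    throw new IllegalStateException(
                            name + " would be both value and identity (via " + s.name() + ")");
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        ClassModel valueIface = new ClassModel("Baz", Kind.VALUE, List.of());
        ClassModel identityBase = new ClassModel("Bar", Kind.IDENTITY, List.of());
        ClassModel unflagged = new ClassModel("Sub", Kind.NEITHER, List.of(valueIface));
        ClassModel conflicting =
                new ClassModel("Bad", Kind.NEITHER, List.of(identityBase, valueIface));

        System.out.println(unflagged.resolve());
        try {
            conflicting.resolve();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Here 'Sub' has neither flag set but picks up the value property from 'Baz', 
while 'Bad' is rejected because its two supers disagree.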

---

So how does this score as a replacement for the list of features enabled by the 
interfaces?

- Dynamic detection: 'obj instanceof ValueObject' is quite straightforward; if 
we can replace that with 'obj.isValueObject()', that feels about equally 
useful. (I'd be more pessimistic about something like 
'Objects.isValueObject(obj)'.)

- Subclass restriction: 'implements IdentityObject' has been replaced with the 
'identity' modifier. Complexity cost of special modifiers seems on par with the 
complexity of special rules for inferring and checking the superinterfaces. I 
think it's a win that we use the 'value' modifier and "value" terminology for 
all kinds of classes/interfaces, not just concrete classes.

- Variable types: I don't see a good way to get the equivalent of an 
'IdentityObject' type. It would involve tracking the 'identity' property 
through the whole type system, which seems like a huge burden for the 
occasional "I'm not sure you can lock on that" error message. So we'd probably 
need to be okay letting that go. Fortunately, I'm not sure it's a great 
loss—lots of code today seems happy using 'Object' when it means, informally, 
"object that I've created for the sole purpose of locking".

- Type variable bounds: this one seems more achievable, by using the 'value' 
and 'identity' keywords to indicate a new kind of bounds check ('<identity T 
extends Runnable>'). Again, it's added complexity, but it's more localized. We 
should think more about the use cases, and decide if it passes the cost/benefit 
analysis. If not, nothing else depends on this, so it could be dropped. (Or 
left to a future, more general feature?)

- Documentation: we've lost the handy javadoc location to put some explanations 
about identity & value objects in a place that curious programmers can easily 
stumble on. Anything we want to say needs to go in JLS/JVMS (or perhaps the 
java.lang.Object javadoc).

- Compatibility: pretty clear win here. No interface injection means tools that 
depend on reflection results won't be broken. (We've found a significant number 
of these problems in our own code/tests, FWIW.) No new static types means 
inference results won't change. There's less risk of incompatibilities when 
adding/removing the 'identity' and 'value' keywords (although there can still 
be source, binary, and behavioral incompatibilities).
