I haven't caught up on the plans for equality in a long time.

This is a good time to catch up on this.

Today, the JVM provides an equality operation on objects in the form of the `ACMP` instructions (`if_acmpeq` and `if_acmpne`).  It also provides per-primitive equality operations (`ICMP`, `FCMP`, etc.) for the various primitive types. (The JVM mostly erases boolean, byte, char, and short to int, so some of these instructions are "missing".)

Today, the language translates the `==` operator to the appropriate ACMP / ICMP / etc. instruction, depending on the static type of the operands.  (JLS Ch5 (Conversions and Contexts) does the heavy lifting of managing mismatches when we, say, compare an object to a primitive.)  The important thing to take away here is that there really are multiple `==` operators; they are just spelled the same way, and disambiguated by static typing. Let's call them `id==`, `int==`, etc. if there's any ambiguity.  Note that `float==` and `double==` are weird when it comes to `NaN`, so `==` on primitives is not necessarily just a straight bitwise comparison.
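To make the "multiple `==` operators" point concrete, here is a small sketch in today's Java. The class and method names (`EqualsOperatorsDemo`, `intEq`, `idEq`, `floatEq`, `bitsEq`) are made up for illustration; the comments about which instruction javac emits describe typical javac output and are worth confirming with `javap -c`.

```java
final class EqualsOperatorsDemo {
    // int==: operands are statically int (javac typically emits an if_icmp* instruction).
    static boolean intEq(int a, int b) { return a == b; }

    // id==: operands are statically references (javac typically emits an if_acmp* instruction).
    static boolean idEq(Object a, Object b) { return a == b; }

    // float==: IEEE 754 comparison, which is not a bitwise comparison.
    static boolean floatEq(float a, float b) { return a == b; }

    // A straight bitwise comparison of the raw representations, for contrast.
    static boolean bitsEq(float a, float b) {
        return Float.floatToRawIntBits(a) == Float.floatToRawIntBits(b);
    }

    public static void main(String[] args) {
        Object o = new Object();
        System.out.println(intEq(3, 3));                      // true
        System.out.println(idEq(o, o));                       // true
        System.out.println(idEq(new Object(), new Object())); // false
        System.out.println(floatEq(Float.NaN, Float.NaN));    // false
        System.out.println(bitsEq(Float.NaN, Float.NaN));     // true
        System.out.println(floatEq(0.0f, -0.0f));             // true
        System.out.println(bitsEq(0.0f, -0.0f));              // false
    }
}
```

The last four lines are the `NaN` (and signed zero) weirdness: `float==` and the bitwise test disagree in both directions.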

Object has an `equals` method; the default implementation is:

    boolean equals(Object other) {
        return this == other;
    }

So in the absence of code to the contrary, two objects are `equals` if they are the same object.

Extrapolating, ACMP is a _substitutability test_; it says that substituting one for the other would make no detectable difference.  Because all objects have a unique identity, comparing the identities is both necessary and sufficient for a substitutability test.  This is the foundation on which we abstract `==` on the new classes.

If C is a class with no identity, that means an instance is the state, the whole state, and nothing but the state.  So the natural way to ask "could I substitute instance c1 for instance c2" is to compare each of its fields with a substitutability test.  Which is exactly what `ACMP` does on primitive objects.  In keeping with the notion that each primitive type has its own `==`, we'll write `Point==` for the equality on `Point`.

For a simple `Point` primitive class, this is obvious, but it gets tricky when a primitive is hiding behind a broader static type like Object or an interface type.  Consider:

    primitive class Box {
        Object contents;
    }

How do we compare two boxes?  By comparing their contents.  How do we compare contents?  With a substitutability test.  If we have identity objects in the box, then the box comparison is "are you both boxes, and are your contents `id==`".  What if we have Points in the box?  We need to compare them with `Point==`.  How do we know we have Points in the box?  By looking at their dynamic type.  So the `==` operation on primitive objects not only recurses into fields, but for fields that could hold _either_ identity or primitive objects (these are `Object`, interfaces, and some abstract classes), we dynamically select the `==` operator to use on that field.  (Edge cases: an id object is never `==` to a primitive object; null is always `==` to itself.)
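The recursion described above can be approximated in today's Java. This is only a sketch under loud assumptions: records stand in for primitive (identity-free) classes, reflection stands in for what the VM would do inside `ACMP`, and the names `SubstitutabilitySketch` and `substitutable` are hypothetical. It is not how the VM implements the test.

```java
import java.lang.reflect.RecordComponent;

final class SubstitutabilitySketch {
    // Assumption: records play the role of primitive classes in this model.
    record Point(int x, int y) {}
    record Box(Object contents) {}

    static boolean substitutable(Object a, Object b) {
        if (a == b) return true;                  // same identity object, or both null
        if (a == null || b == null) return false; // null is only == to itself
        Class<?> type = a.getClass();
        if (type != b.getClass()) return false;   // dynamic types must match
        if (!type.isRecord()) return false;       // identity objects: id== already failed
        try {
            // Identity-free case: compare state, field by field.
            for (RecordComponent rc : type.getRecordComponents()) {
                var accessor = rc.getAccessor();
                accessor.setAccessible(true);
                Object x = accessor.invoke(a);
                Object y = accessor.invoke(b);
                // Primitive fields arrive boxed; the wrapper's equals compares the values.
                // Reference fields pick their comparison from the dynamic type, recursively.
                boolean same = rc.getType().isPrimitive() ? x.equals(y) : substitutable(x, y);
                if (!same) return false;
            }
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object id = new Object();
        System.out.println(substitutable(new Box(new Point(1, 2)), new Box(new Point(1, 2)))); // true
        System.out.println(substitutable(new Box(id), new Box(id)));                           // true
        System.out.println(substitutable(new Box(new Object()), new Box(new Object())));       // false
        System.out.println(substitutable(new Box(new Point(1, 2)), new Box(id)));              // false
    }
}
```

Note that the choice between `id==` and the field-wise comparison is made per field value at run time, which is the point of the `Box` example: the static type `Object` tells us nothing.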

Note that `.ref` is transparent here; in order to get a `Point` into the `Object` field, we (probably silently) converted it to `Point.ref`.  But `Point.ref` uses the same `==` computation as `Point`.  The same is true for the B2/B3 distinction; no difference.  Objects without identity are equal when their state is equal, whether they're a B2, B3, or B3.ref.

Possibly surprisingly, this has been pushed all the way into `ACMP`.  This means that existing code like the default implementation of `Object::equals` just works; if you give it primitive objects, it knows what to do, and performs the proper substitutability test.  One rough edge is that we don't use `==` as the test for float and double fields, because it's not a proper substitutability test; we use the semantics of `Float::equals` and `Double::equals` instead.  Historical wart.
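There is precedent in today's Java for treating floating-point fields this way: the equality that records derive compares a `double` component with wrapper-style semantics rather than with `double==`. The sketch below uses a record purely as an analogy (it is not the primitive-class mechanism), and the names `FloatFieldDemo` and `Reading` are made up.

```java
final class FloatFieldDemo {
    // Hypothetical carrier with a single double field.
    record Reading(double value) {}

    // double== as the language defines it.
    static boolean operatorEq(double a, double b) { return a == b; }

    // The semantics of Double::equals.
    static boolean wrapperEq(double a, double b) {
        return Double.valueOf(a).equals(Double.valueOf(b));
    }

    // Record-derived equality on a double component.
    static boolean recordEq(double a, double b) {
        return new Reading(a).equals(new Reading(b));
    }

    public static void main(String[] args) {
        System.out.println(operatorEq(Double.NaN, Double.NaN)); // false
        System.out.println(wrapperEq(Double.NaN, Double.NaN));  // true
        System.out.println(recordEq(Double.NaN, Double.NaN));   // true
        System.out.println(operatorEq(0.0, -0.0));              // true
        System.out.println(wrapperEq(0.0, -0.0));               // false
        System.out.println(recordEq(0.0, -0.0));                // false
    }
}
```

With `double==`, a value holding `NaN` would not be substitutable for itself, while `0.0` and `-0.0` would be treated as interchangeable even though code can tell them apart (for example via `1.0 / x`). The `Double::equals` semantics avoid both problems.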

The bottom line is that `==` is preserved as a substitutability test on instances of all primitive classes, whether they're "stored" by reference or value.  A corollary is that (finally) Integer instances provide reliable `==` semantics, rather than the old unreliable cache-based semantics.  (One rift healed.)
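For anyone who hasn't been bitten by it, the "old unreliable cache-based semantics" look like this in today's Java. The class name `IntegerIdentityDemo` is made up; the result for `1000` depends on the JVM's cache configuration, so it is described in the comment rather than asserted.

```java
final class IntegerIdentityDemo {
    // id== on Integer: compares object identity today, not numeric value.
    static boolean idEq(Integer a, Integer b) { return a == b; }

    public static void main(String[] args) {
        // Integer.valueOf always caches -128..127, so these are the same object.
        System.out.println(idEq(Integer.valueOf(127), Integer.valueOf(127)));     // true

        // Outside the guaranteed range the result depends on the implementation;
        // with default HotSpot settings this typically prints false.
        System.out.println(idEq(Integer.valueOf(1000), Integer.valueOf(1000)));

        // equals compares values and is reliable regardless of caching.
        System.out.println(Integer.valueOf(1000).equals(Integer.valueOf(1000))); // true
    }
}
```

Under the model described above, the second comparison would be a state comparison and so would be reliably true.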

