We characterize patterns by their /applicability/ (static type checking), /unconditionality/ (can matching be determined without a dynamic check, akin to the difference between a static and a dynamic cast), and /behavior/ (under what conditions does it match, and what bindings do we get?).

       Currently shipping

As currently shipping, we have one kind of pattern: type patterns for reference types. We define the useful term “downcast convertible” to mean there is a cast conversion that is not unchecked. So |Object| and |ArrayList| are downcast-convertible to each other, as are |List| and |ArrayList|, as are |List<String>| and |ArrayList<String>|, but not |List<?>| to |ArrayList<String>|.

A type pattern |T t| for a ref type T is /applicable to/ a ref type U if U is downcast-convertible to T.

A type pattern |T t| is /unconditional/ on |U| if |U <: T|.

A type pattern |T t| matches a target x when the pattern is unconditional, or when |x instanceof T|; if so, its binding is |(T) x|.
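The shipped behavior can be seen in a small sketch (class and method names here are hypothetical, for illustration only): the pattern |ArrayList<?> a| is applicable to |Object| because |Object| is downcast-convertible to |ArrayList|, but it is not unconditional on |Object|, so matching requires a dynamic check; on success the binding is the cast value.

```java
import java.util.ArrayList;
import java.util.List;

public class TypePatternDemo {
    static String describe(Object x) {
        if (x instanceof ArrayList<?> a) {    // conditional match; binds a = (ArrayList<?>) x
            return "ArrayList of size " + a.size();
        } else if (x instanceof List<?> l) {  // broader type pattern, tried next
            return "other List of size " + l.size();
        }
        return "not a List";                  // includes null: instanceof rejects null
    }

    public static void main(String[] args) {
        System.out.println(describe(new ArrayList<>(List.of(1, 2, 3))));
        System.out.println(describe(List.of(1, 2)));
        System.out.println(describe("hello"));
    }
}
```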


       Record patterns

In the next round, we will add /record patterns/, which bring in /nested patterns/.

A record pattern |R(P*)| is applicable to a reference type U if U is downcast-convertible to R. A record pattern is never unconditional.

A record pattern |R(P*)| matches a target |x| when |x instanceof R|, and each component value of |x| matches the corresponding nested pattern |P_i|. Matching against components is performed using the /instantiated/ static type of the component.
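Since record patterns have not yet shipped, here is a rough hand-desugaring of what matching against a record pattern like |Point(int px, int py)| does: the |instanceof R| test, followed by extracting each component and matching it against its nested pattern. |Point| and |describe| are hypothetical names for illustration.

```java
public class RecordPatternDesugar {
    record Point(int x, int y) {}

    // Roughly equivalent to:  if (o instanceof Point(int px, int py)) ...
    static String describe(Object o) {
        if (o instanceof Point p) {   // the `x instanceof R` part of the match
            int px = p.x();           // nested pattern `int px` is unconditional on int,
            int py = p.y();           // so component extraction cannot fail to match
            return "point (" + px + ", " + py + ")";
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2)));
        System.out.println(describe("hello"));
    }
}
```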

Record patterns also drag in primitive patterns, because records can have primitive components.

A primitive type pattern |P p| is applicable to, and unconditional on, the type P. A primitive type pattern matches a target x when the pattern is unconditional, and its binding is |(P) x|.

Record patterns also drag in |var| patterns as nested patterns. A |var| pattern is applicable to, and unconditional on, every type U; when matched against a target |x| whose static type is |U|, its binding is |x| (think: identity conversion).

This is what we intend to specify for 19.


       Primitive patterns

Looking ahead, we’ve talked about how far to extend primitive patterns beyond exact matches. While I know that this makes some people uncomfortable, I am still convinced that there is a more powerful role for patterns to play here, and that is: as the cast precondition.

A language that has casts but no way to ask “would this cast succeed” is deficient; either casts will not be used, or we would have to tolerate cast failure, manifesting as either exceptions or data loss / corruption. (One could argue that for primitive casts, Java is deficient in this way now (you can make a lossy cast from long to int), but the monomorphic nature of primitive types mitigates this somewhat.) Prior to patterns, users have internalized that before a cast, you should first do an |instanceof| to the same type. For reference types, the |instanceof| operator is the “cast precondition” operator, with an additional (sensible) opinion that |null| is not deemed to be an instance of anything, because even if the cast were to succeed, the result would be unlikely to be usable as the target type.

There are many types that can be cast to |int|, at least under some conditions:

 * Integer, except null
 * byte, short, and char, unconditionally
 * Byte, Short, and Character, except null
 * long, but with potential loss of precision
 * Object or Number, if it’s not null and is an Integer

Just as |instanceof T| for a reference type T tells us whether a cast to T would profitably succeed, we can define |instanceof int| the same way: whether a cast to int would succeed without error or loss of precision. By this measure, |instanceof int| would be true for:

 * any int
 * Integer, when the instance is non-null (unboxing)
 * any reference type that is cast-convertible to Integer and is
   |instanceof Integer| (unboxing)
 * byte, short, and char, unconditionally (types that can be widened to
   int)
 * Byte, Short, and Character, when non-null (unboxing plus widening)
 * long when in the range of int (narrowing)
 * Long when non-null, and in the range of int (unboxing plus narrowing)

This table can be generated simply by looking at the set of cast conversions — and we haven’t talked about patterns yet. This is simply the generalization of |instanceof| to primitives. If we are to allow |instanceof int| at all, I don’t think there is really any choice of what it means. And this is useful in the language we have today, separate from patterns:

 * asking if something fits in the range of a byte or int; doing this
   by hand is annoying and error-prone
 * asking if casting from long to int would produce truncation; doing
   this by hand is annoying and error-prone
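These by-hand checks can be sketched as plain helper methods today; this is only an approximation of what a built-in |instanceof int| would subsume, and the method names are hypothetical.

```java
public class IntCastPrecondition {
    // long -> int: narrowing succeeds without truncation iff the value is in int range
    static boolean fitsInInt(long v) {
        return v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE;
    }

    // reference -> int: needs a non-null Integer to unbox;
    // instanceof already rejects null, matching the table above
    static boolean fitsInInt(Object o) {
        return o instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println(fitsInInt(42L));            // in int range
        System.out.println(fitsInInt(1L << 40));       // cast would truncate
        System.out.println(fitsInInt((Object) null));  // null never unboxes
    }
}
```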

Doing this means that

|if (x instanceof T) ... (T) x ... |

becomes universally meaningful, and captures exactly the preconditions for when the cast succeeds without error, loss of precision, or null escape. (And as Valhalla is going to bring primitives more into the world of objects, generalizing this relationship will become only more important.)

And if we’ve given meaning to |instanceof int|, it is hard to see how the pattern |int x| could behave any differently than |instanceof int|, because otherwise, we could not refactor the above idiom to:

|if (x instanceof T t) ... t ... |

Extending instanceof / pattern matching to primitives in this way is not only a sensible generalization, but failing to do so would expose gratuitous asymmetries that would be impediments to refactoring:

 * Cannot necessarily refactor |int x = 0| to |let int x = 0|. While
   this may seem non-problematic on the surface, as soon as |let|
   acquires any other feature besides “straight unconditional pattern
   assignment”, such as let-expressions, it forces users into a bad
   choice between “can use let, or can use assignment conversion, but
   not both.”

 * Loss of duality between |new X(args)| and |case X(ARGS)|. The
   duality between construction and deconstruction patterns (and
   similarly for static factories/patterns, builders/“unbuilders”, and
   collection literals/patterns) is a key part of the story; we take
   things apart in the same way we put them together. Any gratuitous
   divergence becomes an avoidable sharp edge.

Since these are related to assignment and method invocation, let’s ask: how do these conversions line up with assignment and method invocation conversions?

There are two main differences between the safe cast conversions and assignment context. One has to do with narrowing: “if it’s a constant expression and in range” is the best approximation assignment can make, while a context that accepts partial patterns can be more discriminating about the actual value, and so it should be. The other is the treatment of null: again, because of the totality requirement, assignment throws when unboxing a null, but pattern matching in a partial context can deal with it more gracefully, and simply decline to match.
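The null-unboxing difference is easy to demonstrate with the language as it stands; |tryUnbox| is a hypothetical helper for illustration.

```java
public class NullUnboxDemo {
    // Assignment context must be total, so unboxing null throws;
    // a partial pattern context could simply decline to match instead.
    static String tryUnbox(Integer boxed) {
        try {
            int i = boxed;   // unboxing conversion in assignment context
            return "unboxed " + i;
        } catch (NullPointerException e) {
            return "NPE: assignment has no way to decline";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryUnbox(5));
        System.out.println(tryUnbox(null));
    }
}
```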

There are also some small differences between the safe cast conversions and method invocation context. There is the same issue with unboxing null (throws in (loose) invocation context), and method invocation context makes no attempt to do narrowing, even for literals. This last seems mostly a historical wart, which now can’t be changed because it would either potentially change (very few) overload selection choices, or would require another stage of selection.

What are the arguments against this interpretation? They seem to be various flavors of “ok, but, do we really need this?” and “yikes, new complexity.”

The first argument comes from a desire to treat pattern matching as a “Coin”-like feature, strictly limiting its scope. (As an example of a similar kind of pushback, in the early days it was asked: “Does pattern matching have to be an expression? Couldn’t we just have an ‘ifmatch’ statement?” See the answer here: http://mail.openjdk.java.net/pipermail/amber-dev/2018-December/003842.html) This is the sort of question we get a lot; there’s a natural tendency to try to “scope down” features that seem unfamiliar. But I think it’s counterproductive here.

The second argument is largely a red herring, in that this is /not/ new complexity, since these are exactly the rules for successful casts. In fact, not doing it might well be perceived as new complexity, since it results in more corner cases where refactorings that seem like they should work, do not, because of conversions.
