Primitive type patterns and conversions

Brian Goetz Mon, 01 Mar 2021 14:05:23 -0800

Right now, we've spent almost all our time on patterns whose targets are reference types (type patterns, record patterns, array patterns, deconstruction patterns). It's getting to be time to nail down (a) the semantics of primitive type patterns, and (b) the conversions-and-contexts (JLS 5) rules. And, because we're on the cusp of the transition to Valhalla, we must be mindful of both the current set of primitive conversions, and the more general object model as it will apply to primitive classes.

If we focus on type patterns alone, let's bear in mind that primitive type patterns are not nearly as powerful as other type patterns, because (under the current rules) primitives are "islands" in the type system -- no supertypes, no subtypes. In other words, they are *always total* on the types they would be strictly applicable to, which means any conditionality would come from conversions like boxing, unboxing, and widening. But I'm not sure pattern matching has quite as much to offer these more ad-hoc conversions.

We have special rules for integer literals; the literal `0` has a standalone type of `int`, but in most contexts, can be narrowed to `byte`, `short`, or `char` if it fits into the range. When we were considering constant patterns, we considered whether those rules were helpful for applying in reverse to constant patterns, and concluded that it added a lot of complexity for little benefit. Now that we've decided against constant patterns for the time being, it may be moot anyway, but let me draw the example as I think it might be helpful.


Consider the following switch:

    int anInt = 300;

    switch (anInt) {
        case byte b:  A
        case short s: B
        case int i: C
    }

What do we expect to happen? One interpretation is that `byte b` is a pattern that is applicable to all integral types, and only matches the range of byte values. (In this interpretation, the second case would match.) The other is that this is a type error; the patterns `byte b` and `short s` are not applicable to `int`, so the compiler complains. (In fact, in this interpretation, these patterns are always total, and their main use is in nested patterns.)

If your initial reaction is that the first interpretation seems pretty good, beware that the sirens are probably singing to you. Yes, having the ability to say "does this int fit in a byte" is a reasonable test to want to be able to express. But cramming this into the semantics of the type pattern `byte b` is an exercise in complexity, since now we have to have special rules for each (from, to) pair of primitives we want to support.
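
To make the "does this int fit in a byte" test concrete, here is a minimal sketch in today's Java of the range checks that the first interpretation would have to bake into `byte b` and `short s`. The class and method names (`IntRangeChecks`, `fitsInByte`, `fitsInShort`, `classify`) are made up for illustration and are not part of any proposal.

```java
// Hypothetical helper: range tests written as ordinary library code.
final class IntRangeChecks {

    // True when the int survives a round trip through byte unchanged.
    static boolean fitsInByte(int value) {
        return (byte) value == value;
    }

    // True when the int survives a round trip through short unchanged.
    static boolean fitsInShort(int value) {
        return (short) value == value;
    }

    // Mirrors the case order of the switch above under the first interpretation.
    static String classify(int value) {
        if (fitsInByte(value)) {
            return "byte";
        }
        if (fitsInShort(value)) {
            return "short";
        }
        return "int";
    }

    public static void main(String[] args) {
        System.out.println(classify(300)); // prints "short"
    }
}
```

Note that this covers only the (int, byte) and (int, short) pairs; every further (from, to) pair needs its own check, which is the complexity the paragraph above warns about.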


Another flavor of this problem is:

    Object o = Short.valueOf((short) 3);

    switch (o) {
        case byte b:  A
        case short s: B
    }

3 can be crammed into a `byte`, and therefore could theoretically match the first case, but is this really the kind of complexity we want to layer atop the definition of primitive type patterns?
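
For comparison, here is a rough sketch of the two readings in today's Java, using plain `instanceof` on the box types. `BoxedShortDemo`, `classifyByType`, and `valueFitsInByte` are hypothetical names; the second method is only my approximation of what a value-based rule would have to do for a single pair of types.

```java
// Hypothetical demo: type-based versus value-based treatment of a boxed Short.
final class BoxedShortDemo {

    // Type-based reading: only the dynamic class of the object matters.
    static String classifyByType(Object o) {
        if (o instanceof Byte) {
            return "Byte";
        }
        if (o instanceof Short) {
            return "Short";
        }
        return "other";
    }

    // Value-based reading: unbox, then ask whether the value fits in a byte.
    // This handles Byte and Short only; each further box type needs a new branch.
    static boolean valueFitsInByte(Object o) {
        if (o instanceof Byte) {
            return true;
        }
        if (o instanceof Short) {
            short s = (Short) o;
            return (byte) s == s;
        }
        return false;
    }

    public static void main(String[] args) {
        Object o = Short.valueOf((short) 3);
        System.out.println(classifyByType(o));  // prints "Short"
        System.out.println(valueFitsInByte(o)); // prints "true"
    }
}
```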

I think there's a better answer: lean on explicit patterns for conversions. The conversions from byte <--> int form an embedding-projection pair, which means that they are suited for a total factory + partial pattern pair:


    class int {
        static int fromByte(byte b) { return b; }
        pattern(byte b) fromByte() { ... succeed if target in range ... }
    }

Then we can replace the first switch with:

    switch (anInt) {
        case fromByte(var b): A    // static or instance patterns on `int`
        case fromShort(var s): B
    }

which (a) is explicit, (b) uses straight library code rather than complex language magic, and (c) scales to non-built-in primitive classes. (Readers may first think that the name `fromXxx` is backwards, and that it should be `toXxx`, but what we're asking is: "could this int have come from a byte-to-int conversion".)
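
Since declared patterns don't exist in the language yet, the total factory + partial pattern pair can only be approximated today. The sketch below stands in for the pattern with a method returning `Optional`, where empty means "no match". `IntConversions`, `matchFromByte`, `matchFromShort`, and `describe` are hypothetical names chosen for this example; only `fromByte` and `fromShort` come from the text above.

```java
import java.util.Optional;

// Hypothetical stand-in for the embedding-projection pair, in today's Java.
final class IntConversions {

    // Total direction (the factory): every byte widens to an int.
    static int fromByte(byte b) {
        return b;
    }

    // Partial direction (stand-in for the pattern): succeeds only if the
    // target could have come from a byte-to-int conversion.
    static Optional<Byte> matchFromByte(int target) {
        if ((byte) target == target) {
            return Optional.of((byte) target);
        }
        return Optional.empty();
    }

    static int fromShort(short s) {
        return s;
    }

    static Optional<Short> matchFromShort(int target) {
        if ((short) target == target) {
            return Optional.of((short) target);
        }
        return Optional.empty();
    }

    // Plays the role of the rewritten switch: try each projection in order.
    static String describe(int anInt) {
        Optional<Byte> b = matchFromByte(anInt);
        if (b.isPresent()) {
            return "A: byte " + b.get();
        }
        Optional<Short> s = matchFromShort(anInt);
        if (s.isPresent()) {
            return "B: short " + s.get();
        }
        return "C: int " + anInt;
    }

    public static void main(String[] args) {
        System.out.println(describe(300)); // prints "B: short 300"
    }
}
```

The round trip `matchFromByte(fromByte(b))` always succeeds and yields `b` back, which is what makes the two directions a pair rather than two unrelated methods.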


So, strawman:

A primitive type pattern `P p` is applicable _only_ to type `P` (and therefore is always total). Accordingly, its primary utility is as a nested pattern.

Now, let's ask the same questions about boxing and unboxing. (Boxing is always total; unboxing might NPE.)

Here, I think introducing boxing/unboxing conversions into pattern matching per se is even less useful. If a pattern binds an int, but we wanted an Integer (or vice versa), then we are free (by virtue of boxing/unboxing in assignment and related contexts) to just use the binding. For example:


    void m(Integer i) { ... }
    ...
    // plus some pattern Foo(int x)
    ...

    switch (target) {
        case Foo(int x): m(x);    // the int binding is boxed at the call
    }

We don't care that we got an int out; when we need an Integer, the right thing happens. In the other direction, we have to worry about NPEs, but we can fix that with pattern tools we have:


    switch (target) {
        case Bar(Integer x & true(x != null)): ... safe to unbox x ...
    }
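
Both directions can be seen in today's Java without any pattern support. The sketch below is only an illustration under that assumption: `BoxingDemo`, `passInt`, `unboxThrows`, and `unboxOrDefault` are hypothetical names, and the explicit null check plays the role of the `true(x != null)` guard.

```java
// Hypothetical demo of boxing (total) and unboxing (may throw) in existing contexts.
final class BoxingDemo {

    static String m(Integer i) {
        return "Integer " + i;
    }

    // An int value is boxed automatically in method invocation context.
    static String passInt(int x) {
        return m(x);
    }

    // Unboxing a null Integer throws NullPointerException.
    static boolean unboxThrows(Integer boxed) {
        try {
            int unboxed = boxed;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Guarding with a null check first makes the unboxing safe.
    static int unboxOrDefault(Integer boxed, int fallback) {
        if (boxed != null) {
            return boxed;
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(passInt(42));              // prints "Integer 42"
        System.out.println(unboxThrows(null));        // prints "true"
        System.out.println(unboxOrDefault(null, -1)); // prints "-1"
    }
}
```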

So I think our strawman holds up: primitive type patterns are total on their type, with no added boxing/narrowing/widening weirdness. We can characterize this as a new context in Ch5 ("conditional pattern match context"), that permits only identity and reference widening conversions. And when we get to Valhalla, the same is true for type patterns on primitive classes.



** BONUS ROUND **

Now, let's talk about pattern assignment statements, such as:

    Point(var x, var y) = aPoint

The working theory is that the pattern on the LHS must be total on the type of the expression on the RHS, with some remainder allowed, and will throw on any remainder (e.g., can throw NPE on null.) If we want to align this with the semantics of local variable declaration + initializer, we probably *do* want the full set of assignment-context conversions, which I think is fine in this context (so, a second new context: unconditional pattern assignment, which allows all the same conversions as are allowed in assignment context.)
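
As a reminder of what "the same conversions as are allowed in assignment context" covers, here is a small sketch using ordinary local variable declarations in today's Java. `AssignmentContextDemo` and its method names are hypothetical; the conversions themselves are the existing JLS 5 assignment-context ones, and the null case is the "remainder" that throws.

```java
// Hypothetical demo of assignment-context conversions in local variable declarations.
final class AssignmentContextDemo {

    // Widening primitive conversion: int to long.
    static long widen(int i) {
        long l = i;
        return l;
    }

    // Boxing conversion: int to Integer.
    static Integer box(int i) {
        Integer boxed = i;
        return boxed;
    }

    // Widening reference conversion: String to Object.
    static Object widenRef(String s) {
        Object o = s;
        return o;
    }

    // Narrowing of a constant expression that fits the target type.
    static byte narrowConstant() {
        byte b = 10;
        return b;
    }

    // Unboxing conversion: total except for null, which throws.
    static String unbox(Integer boxed) {
        try {
            int i = boxed;
            return "ok " + i;
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        System.out.println(unbox(3));    // prints "ok 3"
        System.out.println(unbox(null)); // prints "NPE"
    }
}
```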

If the set of conversions is the same, then we are well on our way to being able to interpret


    T t = e

as *either* a local variable declaration, *or* a pattern match, without the user being able to tell the difference:

- The scoping is the same (since the pattern either completes normally or throws);

- The mutability is the same (we fixed this one just in time);

- The set of conversions, applicable types, and potential exceptions are the same (exercise left to the reader.)

Which means (drum roll) local variable assignment is revealed to have been a degenerate case of pattern match all along. (And the crowd goes wild.)
