On Fri, Mar 25, 2022 at 8:39 AM Brian Goetz <brian.go...@oracle.com>
wrote:
We still have a lot of work to do on the current round of pattern
matching (record patterns), but let's take a quick peek down the
road. Pattern assignment is a sensible next building block, not
only because it is directly useful, but also because it will be
required for _declaring_ deconstruction patterns in classes
(that's how one pattern delegates to another.) What follows is a
rambling sketch of all the things we _could_ do with pattern
assignment, though we need not do all of them initially, or even
ever.
# Pattern assignment
So far, we've got two contexts in the language that can
accommodate patterns --
`instanceof` and `switch`. Both of these are conditional
contexts, designed for
dealing with partial patterns -- test whether a pattern matches,
and if so,
conditionally extract some state and act on it.
There are cases, though, when we know a pattern will always match,
in which case
we'd like to spare ourselves the ceremony of asking. If we have a
3D `Point`,
asking if it is a `Point` is redundant and distracting:
```
Point p = ...
if (p instanceof Point(var x, var y, var z)) {
// use x, y, z
}
```
In this situation, we're asking a question to which we know the
answer, and
we're distorting the structure of our code to do it. Further,
we're depriving
ourselves of the type checking the compiler would willingly do to
validate that
the pattern is total. Much better to have a way to _assert_ that
the pattern
matches.
## Let-bind statements
In such a case, where we want to assert that the pattern matches,
and forcibly
bind it, we'd rather say so directly. We've experimented with a
few ways to
express this, and the best approach seems to be some sort of `let`
statement:
```
let Point(var x, var y, var z) p = ...;
// can use x, y, z, p
```
Other ways to surface this might be to call it `bind`:
```
bind Point(var x, var y, var z) p = ...;
```
or even use no keyword, and treat it as a generalization of
assignment:
```
Point(var x, var y, var z) p = ...;
```
(Usual disclaimer: we discuss substance before syntax.)
A `let` statement takes a pattern and an expression, and we
statically verify
that the pattern is exhaustive on the type of the expression; if
it is not, this is a
type error at compile time. Any bindings that appear in the
pattern are
definitely assigned and in scope in the remainder of the block
that encloses the
`let` statement.
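For comparison, something close to this can be approximated in Java 21 by negating an `instanceof` test and throwing on the failure branch. The sketch below is only an approximation: the `Point` record and the `LetEmulation` class are hypothetical names, and the compiler does not check that the pattern is exhaustive, which is exactly the check a `let` statement would add.

```java
// Hypothetical 3D point record, used only for illustration.
record Point(int x, int y, int z) {}

final class LetEmulation {
    // Approximates `let Point(var x, var y, var z) = p;` with today's syntax.
    static int sum(Point p) {
        if (!(p instanceof Point(var x, var y, var z))) {
            // Reached only when the pattern does not match (here: p == null).
            throw new IllegalStateException("pattern did not match: " + p);
        }
        // Because the if-branch cannot complete normally, x, y and z are
        // in scope and definitely assigned for the rest of the block.
        return x + y + z;
    }
}
```

The binding scope matches what is described above, but the `if`/`throw` ceremony is what a `let` statement would remove.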
Let statements are also useful in _declaring_ patterns; just as a
subclass
constructor will delegate part of its job to a superclass
constructor, a
subclass deconstruction pattern will likely want to delegate part
of its job to
a superclass deconstruction pattern. Let statements are a natural
way to invoke
total patterns from other total patterns.
#### Remainder
Let statements require that the pattern be exhaustive on the type
of the expression.
For total patterns like type patterns, this means that every value
is matched,
including `null`:
```
let Object o = x;
```
Whatever the value of `x`, `o` will be bound to that value (even if
`x` is null)
because `Object o` is total on `Object`. Similarly, some patterns
are clearly
not total on some types:
```
Object o = ...
let String s = o; // compile error
```
Here, `String s` is not total on `Object`, so the `let` statement
is not valid.
But as previously discussed, there is a middle ground -- patterns
that are
_total with remainder_ -- which are "total enough" to be allowed
to be considered
exhaustive, but which in fact do not match on certain "weird"
values. An
example is the record pattern `Box(var x)`; it matches all box
instances, even
those containing null, but does not match a `null` value itself
(because to
deconstruct a `Box`, we effectively have to invoke an instance
member on the
box, and we cannot invoke instance members on null receivers.)
Similarly, the
pattern `Box(Bag(String s))` is total on `Box<Bag<String>>`, with
remainder
`null` and `Box(null)`.
Because a `let` statement guarantees that its bindings are
definitely assigned
after the `let` statement completes normally, the natural thing to
do when
presented with a remainder value is to complete abruptly by reason
of exception.
(This is what `switch` does as well.) So the following statement:
```
Box<Bag<String>> bbs = ...
let Box(Bag(String s)) = bbs;
```
would throw when encountering `null` or `Box(null)` (but not
`Box(Bag(null))`,
because that matches the pattern, with `s=null`, just as a
switch containing
only this case would).
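The comparison with `switch` can be tried out in Java 21. The sketch below uses hypothetical generic records `Box` and `Bag` and a single-case switch over the same pattern. In Java 21 a `null` selector produces a `NullPointerException` and `Box(null)` produces a `MatchException`; which exception a `let` statement would throw is not settled here, so treat those types as a property of `switch` only.

```java
// Hypothetical generic records standing in for the Box and Bag of the text.
record Bag<T>(T item) {}
record Box<T>(T content) {}

final class RemainderDemo {
    // The single case is exhaustive on Box<Bag<String>>, so no default is
    // required; remainder values are rejected at run time instead.
    static String unwrap(Box<Bag<String>> bbs) {
        return switch (bbs) {
            case Box(Bag(String s)) -> "s=" + s;
        };
    }
}
```

Calling `unwrap` with a `Box` holding a `Bag` holding `null` matches and binds `s` to `null`; only `null` itself and a `Box` holding `null` are rejected.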
#### Conversions
JLS Chapter 5 ("Conversions and Contexts") outlines the
conversions (widening,
narrowing, boxing, unboxing, etc) that are permitted in various
contexts
(assignment, loose method invocation, strict method invocation,
cast, etc.)
We need to define the set of conversions we're willing to perform
in the context
of a `let` statement as well; which of the following do we want to
support?
```
let int x = aShort; // primitive widening
let byte b = 0; // primitive narrowing
let Integer x = 0; // boxing
let int x = anInteger; // unboxing
```
The above examples -- all of which use type patterns -- look a lot
like local
variable declarations (especially if we choose to go without a
keyword); this
strongly suggests we should align the valid set of conversions in
`let`
statements with those permitted in assignment context. The one
place where we
have to exercise care is conversions that involve unboxing; a null
in such
circumstances feeds into the remainder of the pattern, rather than
having
matching throw (we're still likely to throw, but it affects the
timing of how
far we progress in a pattern switch before we do so.) So for
example, the
pattern `int x` is exhaustive on `Integer`, but with remainder
`null`.
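For reference, here is how the four conversions already behave in ordinary local variable declarations, which are an assignment context. This is a sketch of existing Java behavior, not of `let`; the class and method names are hypothetical. The unboxing case shows the `null` that would become remainder for `int x` on `Integer`.

```java
final class AssignmentContext {
    static int widen(short aShort) {
        int x = aShort;     // primitive widening: short to int
        return x;
    }

    static byte narrowConstant() {
        byte b = 0;         // narrowing of a constant expression that fits in byte
        return b;
    }

    static Integer box() {
        Integer x = 0;      // boxing: int to Integer
        return x;
    }

    static int unbox(Integer anInteger) {
        int x = anInteger;  // unboxing: throws NullPointerException when anInteger is null
        return x;
    }
}
```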
## Possible extensions
There are a number of ways we can extend `let` statements to make
them more
useful; these could be added at the same time, or at a later time.
#### What about partial patterns?
There are times when it may be more convenient to use a `let` even
when we know
the pattern is partial. In most cases, we'll still want to
complete abruptly if the
pattern doesn't match, but we may want to control what happens.
For example:
```
let Optional.of(var contents) = optName
else throw new IllegalArgumentException("name is empty");
```
Having an `else` clause allows us to use a partial pattern; the
clause receives
control if the pattern does not match. The `else` clause could
choose to throw,
but could also choose to `break` or `return` to an enclosing
context, or even
recover by assigning the bindings.
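Java has no `Optional.of(...)` pattern today, so the closest existing equivalent goes through the `Optional` API rather than through pattern matching. A minimal sketch, assuming a hypothetical `PartialLet` class, of what the `let ... else throw` above would express:

```java
import java.util.Optional;

final class PartialLet {
    // Roughly: let Optional.of(var contents) = optName
    //          else throw new IllegalArgumentException("name is empty");
    static String requireName(Optional<String> optName) {
        String contents = optName.orElseThrow(
                () -> new IllegalArgumentException("name is empty"));
        // contents is available for the rest of the method, as a binding would be.
        return contents;
    }
}
```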
#### What about recovery?
If we're supporting partial patterns, we might want to allow the
`else` clause
to provide defaults for the bindings, rather than throw. We can
make the bindings of the
pattern in the `let` statement be in scope, but definitely
unassigned, in the
`else` clause, which means the `else` clause could initialize them
and continue:
```
let Optional.of(var contents) = optName
else contents = "Unnamed";
```
This allows us to continue, while preserving the invariant that
when the `let`
statement completes normally, all bindings are DA.
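The definite-assignment reasoning here is the same one Java already applies to a blank `final` local assigned on both branches of an `if`. A sketch under that analogy (the `LetRecovery` class is hypothetical, and `Optional.isPresent`/`get` stand in for the pattern match):

```java
import java.util.Optional;

final class LetRecovery {
    static String nameOrDefault(Optional<String> optName) {
        final String contents;          // in scope, definitely unassigned
        if (optName.isPresent()) {
            contents = optName.get();   // the "match" path assigns the variable
        } else {
            contents = "Unnamed";       // the "else" path assigns a default
        }
        // contents is definitely assigned on every path that reaches here.
        return contents;
    }
}
```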
#### What about guards?
If we're supporting partial patterns, we also need to consider the
case where
the pattern matches but we still want to reject the content. This
could of
course be handled by testing and throwing after the `let`
completes, but if we
want to recover via the `else` clause, we might want to handle
this directly.
We've already introduced a means to do this for switch cases -- a
`when` clause
-- and this works equally well in `let`:
```
let Point(var x, var y) = aPoint
when x >= 0 && y >= 0
else { x = y = 0; }
```
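For comparison, the same effect takes noticeably more ceremony in Java 21, because the pattern bindings are scoped to the `if` and have to be copied into ordinary locals. A sketch, with a hypothetical two-component `Point` record and `GuardedLet` class:

```java
// Hypothetical 2D point record, used only for illustration.
record Point(int x, int y) {}

final class GuardedLet {
    // Returns {x, y} when the point is non-null and in the first quadrant,
    // otherwise falls back to the origin.
    static int[] clampToOrigin(Point aPoint) {
        int x, y;
        if (aPoint instanceof Point(var px, var py) && px >= 0 && py >= 0) {
            x = px;         // copy the bindings out of the if-scope
            y = py;
        } else {
            x = y = 0;      // the recovery path, as in the else clause above
        }
        return new int[] { x, y };
    }
}
```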
#### What about expressions?
The name `let` conjures up the image of `let` expressions in
functional
languages, where we introduce a local binding for use in the scope
of a single
expression. This is not an accident! It is quite useful when the
same expression
is going to be used multiple times, or when we want to limit the
scope of a local
to a specific computation.
It is a short hop to `let` being usable as an expression, by
providing an `in`
clause:
```
String lastThree =
let int len = s.length()
in s.substring(len-3, len);
```
The scope of the binding `len` is the expression to the right of
the `in`,
nothing else. (As with `switch` expressions, the expression to
the right
of the `in` could be a block with a `yield` statement.)
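One way to get a feel for this scoping today is a small helper that passes a value to a lambda, so the lambda parameter plays the role of the binding. This is only a library-level imitation: `let` below is a hypothetical static method, not a language feature, and it handles a single plain variable rather than a pattern.

```java
import java.util.function.Function;

final class LetExpr {
    // Hypothetical helper: binds `value` to the lambda parameter for the
    // duration of one expression, and returns that expression's result.
    static <T, R> R let(T value, Function<? super T, ? extends R> body) {
        return body.apply(value);
    }

    static String lastThree(String s) {
        // len is visible only inside the lambda body, like the `in` clause.
        return let(s.length(), len -> s.substring(len - 3, len));
    }
}
```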
It is a further short hop to permitting _multiple_ matches in a
single `let`
statement or expression:
```
int area = let Point(var x0, var y0) = lowerLeft,
Point(var x1, var y1) = upperRight
in (x1-x0) * (y1-y0);
```
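In Java 21, a similar multi-match can be imitated by wrapping the operands in a record and using a one-case `switch` expression over a nested record pattern. The sketch assumes hypothetical `Point` and `Corners` records; the wrapper allocation is the price of not having a multi-pattern `let`.

```java
// Hypothetical records, used only for illustration.
record Point(int x, int y) {}
record Corners(Point lowerLeft, Point upperRight) {}

final class AreaDemo {
    static int area(Point lowerLeft, Point upperRight) {
        // The single case is exhaustive on Corners; a null corner is
        // remainder and is rejected at run time.
        return switch (new Corners(lowerLeft, upperRight)) {
            case Corners(Point(var x0, var y0), Point(var x1, var y1)) ->
                    (x1 - x0) * (y1 - y0);
        };
    }
}
```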
#### What about parameter bindings?
Destructuring with total patterns is also useful for method and lambda
parameters. For a lambda that accepts a `Point`, we could include
the pattern
in the lambda parameter list, and the bindings would automatically
be in scope in the body. Instead of:
```
areaFn = (Point lowerLeft, Point upperRight)
-> (upperRight.x() - lowerLeft.x()) * (upperRight.y() - lowerLeft.y());
```
we could do the destructuring in the lambda header:
```
areaFn = (let Point(var x0, var y0) lowerLeft,
let Point(var x1, var y1) upperRight)
-> (x1-x0) * (y1-y0);
```
This allows us to treat the derived values as "parameters" of
the lambda. We
would enforce totality at compile time, and dynamically reject
remainder as we
do with `switch` and `let` statements.
I think this one may be a bridge too far, though. The method
header should
probably be reserved for API declaration, and destructuring only
serves the
implementation. I think I'd prefer to move the `let` into the
body of the
method or lambda.