A guard construct need not itself be a pattern.
True. What is minimally needed is a _syntactic_ separation of what is
pattern and what is expression, without having to wait for semantic
analysis to understand what is being combined. This is, in part,
because there are still sequences that are matched by both the pattern
and expression productions, notably `Identifier()` (could be a
deconstruction pattern with no bindings, or could be a method invocation.)
Rather, it can be viewed as a map from patterns to patterns. Indeed,
guards are formulated in exactly that way in Gavin’s BNF in JEP
JDK-8213076 “Pattern Matching for switch”: a guard is not a pattern,
but can only appear within a pattern as the right-hand operand of `&`:
Pattern:
    PatternOperand
    Pattern & PatternOperandOrGuard
PatternOperandOrGuard:
    PatternOperand
    GuardPattern
As a result, a guard necessarily appears to the right of a `&` and
therefore necessarily to the right of a pattern. We should also
inquire as to whether it is ever desirable in practice, within a chain
of `&` (pattern conditional-and) operations for a pattern to appear to
the right of a guard.
I have long had a nagging feeling that this will eventually be
desirable. Let's say we have P & g & Q & h; under what conditions can
we commute g and Q without regret? I can think of four potential
sources of regret:
- g declares bindings that are inputs to Q
- the cost model of Q is such that we'd like to run g first, and
short-circuit
- Q might throw an exception when g does not hold
- Q might have side-effects that we don't want to run if g does not hold
I think we can eliminate the last one; I'm pretty comfortable saying
that if you write side effects in pattern declarations, you get what you
deserve. And, the linguistic parts of patterns are not supposed to throw
exceptions, but badly written pattern declarations may anyway. But that
still leaves the dataflow and performance concerns; I think I will
eventually want to be able to specify the order, and get
short-circuiting. This is why I've resisted this direction to date.
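The ordering concern can be made concrete with a toy model. This is only a sketch under loud assumptions: each conjunct (pattern or guard) is modeled as a plain `Predicate<String>`, bindings are ignored, and `traced`, `all`, and `run` are hypothetical helper names, not part of any proposed API. It records which conjuncts actually run when `g` fails, for the orders `P & g & Q` and `P & Q & g`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class ConjunctionOrderSketch {
    /** Wraps a conjunct so that every evaluation is recorded in the trace. */
    static Predicate<String> traced(String name, Predicate<String> body, List<String> trace) {
        return s -> {
            trace.add(name);
            return body.test(s);
        };
    }

    /** Left-to-right conjunction that stops at the first failing conjunct. */
    @SafeVarargs
    static Predicate<String> all(Predicate<String>... conjuncts) {
        return s -> {
            for (Predicate<String> c : conjuncts) {
                if (!c.test(s)) {
                    return false;
                }
            }
            return true;
        };
    }

    /** Evaluates P & g & Q (guardFirst) or P & Q & g and returns the conjuncts that ran. */
    static List<String> run(boolean guardFirst, String input) {
        List<String> trace = new ArrayList<>();
        Predicate<String> p = traced("P", s -> s != null, trace);
        Predicate<String> g = traced("g", s -> !s.isEmpty(), trace);
        // Q stands in for a potentially expensive pattern.
        Predicate<String> q = traced("Q", s -> s.chars().allMatch(Character::isDigit), trace);
        Predicate<String> chain = guardFirst ? all(p, g, q) : all(p, q, g);
        chain.test(input);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(true, ""));   // g fails before Q is reached
        System.out.println(run(false, ""));  // Q runs even though g will fail
    }
}
```

In this model, commuting `g` and `Q` never changes the match result, only how much work is done before failing, which is the performance half of the concern. The dataflow half (bindings of `g` feeding `Q`) is not modeled here.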
If not, then `&` chains always have the simple form
pattern & pattern & … & pattern & guard & guard & … & guard
where the number of patterns must be positive but the number of guards
may be zero. And if this is the case, it is not unreasonable to ask
whether readability might not be better served by better marking that
transition from patterns to guard in the chain, for example:
pattern & pattern & … & pattern when guard & guard & … & guard
And then we see that there really is no reason to try to overload `&`
(however it is actually spelled) to mean both pattern conjunction and
guard conjunction, because guard conjunction already exists in the
form of the `&&` expression operator:
pattern & pattern & … & pattern when guard && guard && … && guard
and therefore we can, after all, simplify this general form to the
case of zero or one guards:
pattern & pattern & … & pattern [when guard]
There's one more turn of this crank: if we are willing to move the
guards all to the right (big if), then, why say `when`, and not `&&`?
Then it looks just like the `if...instanceof` situation.
case pattern & pattern && guard:
This further aligns patterns in instanceof with patterns in switch. (With
one potentially surprising caveat: we can never switch on booleans;
`case true && false` would not match `true`. Pause for groans.)
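For comparison, the `if...instanceof` situation mentioned above already combines a type pattern with a boolean condition through `&&` (Java 16 and later). A minimal sketch; the `Point` record and the `describe` method are illustrative names, not taken from this thread:

```java
final class InstanceofGuardSketch {
    record Point(int x, int y) {}

    static String describe(Object o) {
        // The binding p is in scope to the right of &&, so the condition
        // reads like a guard attached to the pattern.
        if (o instanceof Point p && p.x() > 0) {
            return "point with positive x: " + p.x();
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2)));
        System.out.println(describe(new Point(-1, 2)));
    }
}
```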
Finally, given that (an earlier version of) the patterns design
already encompasses forms that can bind the entire object as well as
components (what is done in other languages with `as`), I have to ask:
what are the envisioned practical applications of pattern conjunction
other than as a cute way to include guards or a (more verbose) way to
bind the entire value as well as components? Maybe as a way to fake
intersection types?
When a pattern focuses on a part, rather than the whole. A pattern like
`Point(var x, var y)` matches / destructures the whole thing, but other
patterns can act as queries. Imagine we have a pattern
`Map.with(key)(var value)`, which matches maps that have the specified
key. We would likely want to combine these with &:
if (x instanceof (Map.with(key1)(var val1)
                  & Map.with(key2)(var val2))) { ... }
This scales up to query APIs such as a JSON parsing API, where you only
want to match a blob of JSON if it has all the parts you are looking for
(similar to "spec/conform" in Clojure.) From
https://github.com/openjdk/amber-docs/blob/master/site/design-notes/pattern-match-object-model.md:
switch (doc) {
    case stringKey("firstName")(var first)
         & stringKey("lastName")(var last)
         & intKey("age")(var age)
         & objectKey("address")(
             stringKey("city")(var city)
             & stringKey("state")(var state)
             & ...): ...
}
This expresses not only the all-or-nothing nature of the composite
query, but permits the pattern to match the structure of what is being
queried (the `objectKey` pattern has a nested pattern which applies to
the body of that object, which itself is an & pattern).
This is the sort of example I could imagine wanting to stick a guard
in the middle of; I could well want to guard "name not empty" and not
bother parsing the rest of the document. Semantically that might be
equivalent to putting the guard at the end, but the user might not thank
us for not letting them short-circuit out.
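A rough approximation of "guard in the middle" can be written with today's `instanceof` and `&&` (Java 16 and later), where tests and conditions may be interleaved freely. This is only a sketch: a plain `Map<String, Object>` stands in for the JSON document, and `summarize` and the key names are illustrative assumptions rather than the API from the design note.

```java
import java.util.Map;

final class JsonQuerySketch {
    static String summarize(Map<String, Object> doc) {
        if (doc.get("firstName") instanceof String first
                && !first.isEmpty()   // guard in the middle: give up before looking further
                && doc.get("lastName") instanceof String last
                && doc.get("age") instanceof Integer age
                && doc.get("address") instanceof Map<?, ?> address
                && address.get("city") instanceof String city) {
            return first + " " + last + ", " + age + ", " + city;
        }
        return "no match";
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of(
                "firstName", "Ada",
                "lastName", "Lovelace",
                "age", 36,
                "address", Map.of("city", "London"));
        System.out.println(summarize(doc));
    }
}
```

Because `&&` evaluates left to right and short-circuits, an empty first name stops the match before any of the later lookups run.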
The real unknown is:
- g declares bindings that are inputs to Q
Can we construct a credible example?
As to intersection types, pattern-& is not even the right vehicle,
because in
case Foo f & Bar b:
then f and b will have types Foo and Bar, but really, I want something
of type (Foo&Bar). If this were important, I'd probably want to be able
to use an intersection type in the type pattern:
case (Foo&Bar) fb:
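The difference can be seen with features that exist today. In the sketch below (illustrative names only: `Foo`, `Bar`, `Both`, `twoBindings`, `oneBinding`), two `instanceof` patterns give two bindings of two different types, whereas an intersection cast combined with `var` gives a single variable whose type is `Foo & Bar`:

```java
final class IntersectionSketch {
    interface Foo { String foo(); }
    interface Bar { String bar(); }

    record Both(String name) implements Foo, Bar {
        public String foo() { return "foo:" + name; }
        public String bar() { return "bar:" + name; }
    }

    /** Analogue of `case Foo f & Bar b`: two bindings, neither of type Foo & Bar. */
    static String twoBindings(Object o) {
        if (o instanceof Foo f && o instanceof Bar b) {
            return f.foo() + " " + b.bar();
        }
        return "no match";
    }

    /** Analogue of `case (Foo&Bar) fb`: one variable of the intersection type. */
    static String oneBinding(Object o) {
        if (o instanceof Foo && o instanceof Bar) {
            var fb = (Foo & Bar) o;   // var infers the intersection type
            return fb.foo() + " " + fb.bar();
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(twoBindings(new Both("x")));
        System.out.println(oneBinding(new Both("x")));
    }
}
```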
Now, all of this has no bearing on whether or not guards are required
to be “top level only” in all cases; it argues only that guards need
not appear within pattern-conjunction chains. But I believe it would
be perfectly reasonable to write
case Point(int x when x > 0, int y when y > x):
Rémi has argued that this would be better written
case Point(int x, int y) when x > 0 && y > x:
but I would argue that this choice is, and should be, a matter of
style, and when matching against a record with many fields it might be
more readable to mention each field’s constraint next to its binding
rather than to make the reader compare a list of bindings against a
list of constraints.
Agreed, this is a user choice.
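As a point of reference only: the trailing-guard style is the one that can be written in Java 21 and later, where a `when` clause attaches to the whole case label; per-component guards as in the first form are not accepted there. A minimal sketch, with `Point` and `classify` as illustrative names:

```java
final class TrailingGuardSketch {
    record Point(int x, int y) {}

    static String classify(Object o) {
        return switch (o) {
            // Bindings from the record pattern are in scope in the guard.
            case Point(int x, int y) when x > 0 && y > x ->
                    "above the diagonal: x=" + x + ", y=" + y;
            case Point p -> "some other point";
            default -> "not a point";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new Point(1, 2)));
        System.out.println(classify(new Point(2, 1)));
    }
}
```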
Bottom line: there are conceptually three distinct combining forms:
pattern conjunction
guard conjunction
to a pattern, attach a guard
and it may be a mistake after all to conflate them by trying to use
the same syntax or symbol for all three.
So what I would like to see is the convincing application example
where you really do want to write
pattern & guard & pattern
Did the "guard in the middle of the JSON blob" do that?
because then everything I’ve written above falls to the ground.
Well, even if so, it's possible it can be propped up, just not using one
universal support. Suppose we allow guards to be conjoined on the end
of a pattern with &&. Then you could say
case Foo(var x) && x > 0:
as well, even, as
case Foo(var x && x > 0):
But if you wanted to do the P & g & Q thing, you'd need a grobble-style
pattern to do so:
case P & grobble(g) & Q:
Recall that we will eventually be able to write `grobble()` as an
ordinary declared pattern, and we can also still resurrect the
true/false built-in patterns if we like, for rescuing this case. I
still think we may eventually want a `non-null(e)` pattern, as it will
likely be the most common form of grobbling:
case Foo(var x && non-null(x)):
Sorry to bikeshed here, but while “when” is nice, I think “if” is even
more appealing (short, familiar, already a keyword), especially if it
alone can express attachment of a guard to a pattern (and we can argue
about whether the parentheses are required):
case Foo(var x) if (x > 0):
There's precedent for this, of course. My concern with this is that the
colon too easily hides the flow of what's going on:
case Foo(var x) if (x > 0): if (x > 10) println(x);
Having `if` on both sides there seems likely to lead to both "which is
which" confusion and "why can't I have Perl-style `if` at the
end of a statement."
case grobble(e):
which is later revealed to be sugar for:
case Foo(var _) & grobble(e):
I think you meant it is sugar for
case var _ & grobble(e):
Yes.
If so, then compare that to the claim that
case if (e):
is sugar for
case var _ if (e):
Or:
case &&e: