While we continue with the discussion for the syntax of guards/AND
patterns, I'd like to sync on the premise, which, now that we've put it
out there, seems the obvious choice: that the combination of AND
patterns (which we eventually want anyway), and a way to turn a boolean
expression into a pattern, are much better than an ad-hoc guard syntax
for switches.
The reason, in hindsight, is simple: it is more compositional. A guard
syntax for switches is ad-hoc, and works only in switches. If we wanted
to extend patterns to, say, catch clauses, we'd have to invent another
feature (which might be syntactically similar, but would still be
different). So the "return on complexity" for guards is pretty small,
in that we don't get to amortize the incremental complexity over very
many uses. On the other hand, combining patterns into bigger patterns
with AND is a natural feature, obviously relevant (at least when we get
to patterns on large composites, like maps or JSON), and can be directly
used in all the places where patterns are allowed. Modeling predicates
as patterns allows all patterns to be guarded, not just switch patterns.
Further, in a guarded switch, the guards would almost certainly only
come after the patterns. So if we were composing multiple patterns (P,
Q) and multiple guards (G, H), we'd be able to write:
P then Q then G then H
but not
P then G then Q then H
Whereas, if guards are patterns, then we can AND them together in
whatever order of execution we want. Because, composition.
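To make the ordering point concrete, here is a minimal sketch in today's Java that models patterns as plain library objects. Everything in it (the `Pat` interface, `named`, `run`, the trace list) is hypothetical scaffolding for illustration, not proposed syntax or API; it only shows that once a guard is just another conjunct, the interleaving P, G, Q, H is expressible and a failed guard stops the later matches from running.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class AndPatternSketch {
    /** Hypothetical stand-in for a pattern: true means the match succeeded. */
    interface Pat<T> {
        boolean matches(T target);

        /** Conjunction: the right side is tried only if the left side matched. */
        default Pat<T> and(Pat<T> next) {
            return target -> this.matches(target) && next.matches(target);
        }
    }

    /** Records the order in which sub-patterns are attempted. */
    static final List<String> trace = new ArrayList<>();

    /** Wraps a boolean test as a named pattern that logs each attempt. */
    static <T> Pat<T> named(String name, Predicate<T> test) {
        return target -> {
            trace.add(name);
            return test.test(target);
        };
    }

    /** Matches value against P & G & Q & H and returns the attempt order. */
    static List<String> run(int value) {
        trace.clear();
        Pat<Integer> p = named("P", v -> true);     // stands in for a pattern
        Pat<Integer> g = named("G", v -> v > 0);    // stands in for a guard
        Pat<Integer> q = named("Q", v -> true);     // another pattern
        Pat<Integer> h = named("H", v -> v < 100);  // another guard
        p.and(g).and(q).and(h).matches(value);
        return new ArrayList<>(trace);
    }

    public static void main(String[] args) {
        System.out.println(run(5));   // [P, G, Q, H]
        System.out.println(run(-1));  // [P, G]
    }
}
```

With a guards-come-last syntax, the second call would still have attempted Q before discovering that G fails.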
Regardless of the spelling of pattern-and-pattern or
pattern-and-expression, the direction seems far better than the
alternatives (at least today; tomorrow may be different!).
On 1/8/2021 12:45 PM, Brian Goetz wrote:
Picking up on this topic: there's a third possibility, which I'm
starting to like better.
The first two possibilities were:
- An imperative statement (`continue`)
- A declarative clause (`when <predicate>`) on case labels.
The possibly-better possibility is: instead of spending syntactic
budget on guards (which are strictly tied to switch, and would then
have to be extended in other places that use patterns, such as catch
clauses), spend that budget on AND patterns instead. The doc I posted
this week shows that AND patterns are useful anyway, but here's a way
we can use AND patterns in place of guards.
Thought experiment: imagine we already had static patterns. So we
write a static pattern. I'm going to use a pathologically awful
syntax just to ensure that no one is distracted by the syntax.
static<T> __pattern
    __target(T that)
    __bindings()
    __arguments(boolean expr)
    __name = "guard"
    __body {
        if (expr)
            __match_succeeds_with_bindings()
        else
            __match_does_not_succeed
    }
The point of this is that the object model I have described _already_
supports guards being declared as ordinary library patterns, once we
get to declared static patterns, if we have AND patterns. Because
now, a guarded pattern can be written as:
case Foo(int x) __AND guard(x > 3):
This has multiple big advantages: we spend our budget on a more
general *and composable* feature (pattern conjunction) rather than a
narrower, more ad-hoc feature (case guards in switch).
It is also more expressive (because of the composability). If we are
already composing patterns with AND:
case P(var x) __AND Q(var y) when (x > 0):
we could only put the guard at the end. But we might not want that --
we might want to execute the guard after the P match, before going on
to the Q match. If guards were just patterns, we'd be able to write:
case P(var x) __AND guard(x > 0) __AND Q(var y):
and have better control over the order of matching. Much bigger
payoff for a pretty similar investment.
We can't write our `guard` pattern in Java code yet, but we can
have built-in patterns called `true(expr)` and `false(expr)` which
behave just like the declared patterns above, and our guarded case
becomes:
case P(var x) __AND true(x > 0) __AND Q(var y):
Now, how to spell __AND? We can't spell it `&&`, since we'd have an
ambiguity with:
if (x instanceof P && Q)
(is that the pattern conjunction P&&Q, or `x instanceof P` && `Q`?)
But I think we can use `&`. We don't need it much in instanceof,
since we already have &&, but we can in switch:
case P(var x) & true(x > 0):
case P(var x) & false(x > 0):
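For comparison, a small sketch of the instanceof side of this, which needs no new conjunction operator. It assumes a Java version with type patterns in instanceof (Java 16 or later); the class and method names are made up for illustration.

```java
public class InstanceofGuard {
    /** Hypothetical helper: a type pattern and a boolean condition joined by ordinary &&. */
    static String describe(Object o) {
        // The binding i is in scope on the right-hand side of &&,
        // so the "guard" is just part of the boolean expression.
        if (o instanceof Integer i && i > 3) {
            return "big int " + i;
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(describe(5));
        System.out.println(describe(2));
        System.out.println(describe("x"));
    }
}
```

A case label is not a boolean expression, so this trick is not available there, which is why the conjunction operator matters mostly in switch.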
On 8/14/2020 1:20 PM, Brian Goetz wrote:
- Guards. (John, Tagir) There is acknowledgement that some sort of
"whoops, not this case" support is needed in order to maintain
switch as a useful construct in the face of richer case labels, but
some disagreement about whether an imperative statement (e.g.,
continue) or a declarative guard (e.g., `when <predicate>`) is the
right choice.
This is probably the biggest blocking decision in front of us.
John correctly points out that the need for some sort of guard is a
direct consequence of making switch stronger; with the current
meaning of switch, which is "which one of these is it", there's no
need for backtracking, but as we can express richer case labels, the
risk of the case label _not being rich enough_ starts to loom.
We explored rolling boolean guards into patterns themselves (`P &&
g`), which was theoretically attractive but turned out to not be all
that great. There are some potential ambiguities (even if we do
something else about constant patterns, there are still some patterns
that look like expressions and vice versa, making the grammar ugly
here) and it just doesn't have that much incremental expressive
power, since the most credible other use of patterns (instanceof)
already has no problem conjoining additional conditions, because
it's a boolean expression. So this is largely about filling in the
gaps of switch so that we don't have fall-off-the-cliff behaviors.
There are two credible approaches here:
- An imperative statement (like `continue` or `next-case`), which
means "whoops, fell in the wrong bucket, please backtrack to the
dispatch";
- A declarative clause on the case label (like `when <predicate>`)
that qualifies whether the case is selected.
Most of the discussion so far has been on the axis of "continue is
lower-level, and therefore better suited to be a language primitive"
vs "the code that uses guards is easier to read and reason about."
Assuming we have to do one (and I think we do), we have three choices
(one, the other, or both). I think we should step away from the
either/or mentality and try to shine a light on what goes well, or
badly, when we _don't_ have one or the other.
For example, with guards, we can express fine degrees of refinement
in the case labels:
case P & g1: ...
case P & g2: ...
case P & g3: ...
but without them, we can only have one `case P`:
case P:
    if (g1) { ... }
    else if (g2) { ... }
    else if (g3) { ... }
My main fear of the without-guards branches is that it will be
prohibitively hard to understand what a switch is doing, because the
case arms will be full of imperative control-flow logic.
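A runnable sketch of the without-guards shape, using only a classic string switch so that it compiles on any recent Java. The `classify` method, the labels, and the thresholds are invented for illustration; the point is only that the refinement logic (g1, g2, g3 above) ends up inside the arm as an if/else chain rather than in the case labels.

```java
public class NoGuardSwitch {
    /** Hypothetical example: refine one case by size without any guard syntax. */
    static String classify(String kind, int size) {
        switch (kind) {
            case "circle":
                // The three refinements share a single label, so the
                // dispatch logic continues imperatively inside the arm.
                if (size > 100) {
                    return "large circle";
                } else if (size > 10) {
                    return "medium circle";
                } else {
                    return "small circle";
                }
            case "square":
                return "square";
            default:
                return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("circle", 500));
        System.out.println(classify("circle", 50));
        System.out.println(classify("circle", 5));
    }
}
```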
On the other hand, a valid concern when you have guards is that there
will be so much logic in the guard that you won't be able to tell
where the case label ends and where the arm begins.