Re: Guards

Brian Goetz Fri, 05 Mar 2021 15:12:19 -0800

A guard construct need not itself be a pattern.

True. What is minimally needed is a _syntactic_ separation of what is pattern and what is expression, without having to wait for semantic analysis to understand what is being combined. This is, in part, because there are still sequences that are matched by both the pattern and expression productions, notably `Identifier()` (could be a deconstruction pattern with no bindings, or could be a method invocation.)

Rather, it can be viewed as a map from patterns to patterns. Indeed, they are formulated in exactly that way in Gavin’s BNF in JEP JDK-8213076 "Pattern Matching for switch": a guard is not a pattern, but can only appear within a pattern as the right-hand operand of `&`:

    Pattern:
        PatternOperand
        Pattern & PatternOperandOrGuard
    PatternOperandOrGuard:
        PatternOperand
        GuardPattern

As a result, a guard necessarily appears to the right of a `&` and therefore necessarily to the right of a pattern. We should also inquire as to whether it is ever desirable in practice, within a chain of `&` (pattern conditional-and) operations, for a pattern to appear to the right of a guard.

I have long had a nagging feeling that this will eventually be desirable. Let's say we have P & g & Q & h; under what conditions can we commute g and Q without regret? I can think of four potential sources of regret:


 - g declares bindings that are inputs to Q
 - the cost model of Q is such that we'd like to run g first, and short-circuit
 - Q might throw an exception when g does not hold
 - Q might have side-effects that we don't want to run if g does not hold

I think we can eliminate the last one; I'm pretty comfortable saying that if you write side effects in pattern declarations, you get what you deserve. And the linguistic parts of patterns are not supposed to throw exceptions, though badly written pattern declarations may anyway. But that still leaves the dataflow and performance concerns; I think I will eventually want to be able to specify the order, and get short-circuiting. This is why I've resisted this direction to date.
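
The dataflow and cost concerns can be made concrete with a small model. This is only a sketch: the `Pat` interface, the `and` combinator, and the operands `P`, `G`, and `Q` are hypothetical stand-ins for declared patterns and guards, not anything from the proposed language. It assumes conjunction evaluates left to right and stops at the first operand that fails.

```java
import java.util.HashMap;
import java.util.Map;

final class GuardOrderSketch {
    /** Hypothetical stand-in for a pattern: returns true on match and records its bindings. */
    interface Pat {
        boolean match(Object target, Map<String, Object> bindings);
    }

    /** Left-to-right conjunction that stops at the first operand that fails. */
    static Pat and(Pat... operands) {
        return (target, bindings) -> {
            for (Pat p : operands) {
                if (!p.match(target, bindings)) {
                    return false;
                }
            }
            return true;
        };
    }

    /** Counts how often the "expensive" pattern Q actually ran. */
    static int expensiveRuns = 0;

    /** P: matches strings and binds "s". */
    static final Pat P = (t, b) -> {
        if (t instanceof String) {
            b.put("s", t);
            return true;
        }
        return false;
    };

    /** g: a guard that consumes the binding produced by P. */
    static final Pat G = (t, b) -> !((String) b.get("s")).isEmpty();

    /** Q: pretend this is costly; it also consumes P's binding. */
    static final Pat Q = (t, b) -> {
        expensiveRuns++;
        b.put("len", ((String) b.get("s")).length());
        return true;
    };

    /** Runs the chain against the target and reports how many times Q executed. */
    static int runsOfQ(Pat chain, Object target) {
        expensiveRuns = 0;
        chain.match(target, new HashMap<>());
        return expensiveRuns;
    }

    public static void main(String[] args) {
        System.out.println(runsOfQ(and(P, G, Q), "")); // 0: g fails first, Q is skipped
        System.out.println(runsOfQ(and(P, Q, G), "")); // 1: Q ran before g could reject
    }
}
```

Both orderings accept and reject the same inputs here; the only observable difference is whether Q was paid for, which is the "regret" in question.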

If not, then `&` chains always have the simple form

    pattern & pattern & … & pattern & guard & guard & … & guard

where the number of patterns must be positive but the number of guards may be zero. And if this is the case, it is not unreasonable to ask whether readability might not be better served by better marking that transition from patterns to guard in the chain, for example:

    pattern & pattern & … & pattern when guard & guard & … & guard

And then we see that there really is no reason to try to overload `&` (however it is actually spelled) to mean both pattern conjunction and guard conjunction, because guard conjunction already exists in the form of the `&&` expression operator:

    pattern & pattern & … & pattern when guard && guard && … && guard

and therefore we can, after all, simplify this general form to the case of zero or one guards:

    pattern & pattern & … & pattern [when guard]

There's one more turn of this crank: if we are willing to move the guards all to the right (big if), then, why say `when`, and not `&&`? Then it looks just like the `if...instanceof` situation.


    case pattern & pattern && guard:

This further aligns patterns in instanceof with patterns in switch. (With one potentially surprising caveat: we can never switch on booleans; `case true && false` would not match `true`. Pause for groans.)
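
For comparison, the `if...instanceof` situation referred to above looks like this in Java 16 and later. The `describe` method and its threshold are made up for illustration; the point is only that the type pattern introduces a binding and `&&` then attaches an ordinary boolean expression that may use it.

```java
final class InstanceofGuardSketch {
    static String describe(Object o) {
        // `String s` is the pattern; everything after `&&` is a plain expression over s.
        if (o instanceof String s && s.length() > 3) {
            return "long string: " + s;
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe("guards")); // long string: guards
        System.out.println(describe("abc"));    // no match
        System.out.println(describe(42));       // no match
    }
}
```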

Finally, given that (an earlier version of) the patterns design already encompasses forms that can bind the entire object as well as components (what is done in other languages with `as`), I have to ask: what are the envisioned practical applications of pattern conjunction other than as a cute way to include guards or a (more verbose) way to bind the entire value as well as components? Maybe as a way to fake intersection types?

When a pattern focuses on a part, rather than the whole. A pattern like `Point(var x, var y)` matches / destructures the whole thing, but other patterns can act as queries. Imagine we have a pattern `Map.with(key)(var value)`, which matches maps that have the specified key. We would likely want to combine these with &:

    if (x instanceof (Map.with(key1)(var val1) & Map.with(key2)(var val2))) { ... }

This scales up to query APIs such as a JSON parsing API, where you only want to match a blob of JSON if it has all the parts you are looking for (similar to "spec/conform" in Clojure.) From https://github.com/openjdk/amber-docs/blob/master/site/design-notes/pattern-match-object-model.md:


switch (doc) {
    case stringKey("firstName")(var first)
         & stringKey("lastName")(var last)
         & intKey("age")(var age)
         & objectKey("address")(
                 stringKey("city")(var city)
                 & stringKey("state")(var state)
                 & ...): ...
}

This expresses not only the all-or-nothing nature of the composite query, but permits the pattern to match the structure of what is being queried (the `objectKey` pattern has a nested pattern which applies to the body of that object, which itself is an & pattern).

This is the sort of example I could imagine wanting to stick a guard in the middle of; I could well want to guard "name not empty" and not bother parsing the rest of the document. Semantically that might be equivalent to putting the guard at the end, but the user might not thank us for not letting them short-circuit out.
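
Absent declared patterns, the shape of such a query with a guard in the middle can be approximated today with `Optional`. This is a sketch under stated assumptions: the document is modeled as a plain `Map`, and `stringKey` and `fullName` are hypothetical helpers that merely borrow the names used above. `filter` plays the role of the guard, and `flatMap` ensures the later lookup only happens if everything before it matched.

```java
import java.util.Map;
import java.util.Optional;

final class JsonQuerySketch {
    /** Hypothetical stand-in for stringKey(name)(var value): empty unless doc is a map holding a String under name. */
    static Optional<String> stringKey(Object doc, String name) {
        if (doc instanceof Map<?, ?> m && m.get(name) instanceof String s) {
            return Optional.of(s);
        }
        return Optional.empty();
    }

    /** Roughly: stringKey("firstName")(var first) & guard(!first.isEmpty()) & stringKey("lastName")(var last). */
    static Optional<String> fullName(Object doc) {
        return stringKey(doc, "firstName")
                .filter(first -> !first.isEmpty())             // the guard in the middle
                .flatMap(first -> stringKey(doc, "lastName")   // skipped entirely if the guard failed
                        .map(last -> first + " " + last));
    }

    public static void main(String[] args) {
        System.out.println(fullName(Map.of("firstName", "Ada", "lastName", "Lovelace")));
        System.out.println(fullName(Map.of("firstName", "", "lastName", "Lovelace")));
    }
}
```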


The real unknown is:
    - g declares bindings that are inputs to Q

Can we construct a credible example?

As to intersection types, pattern-& is not even the right vehicle, because in


    case Foo f & Bar b:

then f and b will have types Foo and Bar, but really, I want something of type (Foo&Bar). If this were important, I'd probably want to be able to use an intersection type in the type pattern:


    case (Foo&Bar) fb:
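
The closest approximation in existing Java (10 or later) is a pair of `instanceof` tests followed by an intersection cast, with `var` retaining the non-denotable type. The interfaces `Foo` and `Bar`, the class `Both`, and the method `use` below are illustrative only.

```java
final class IntersectionSketch {
    interface Foo { default String foo() { return "foo"; } }
    interface Bar { default String bar() { return "bar"; } }
    static final class Both implements Foo, Bar {}

    static String use(Object o) {
        if (o instanceof Foo && o instanceof Bar) {
            // Intersection cast; `var` infers the type Foo & Bar, so both interfaces are usable.
            var fb = (Foo & Bar) o;
            return fb.foo() + fb.bar();
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(use(new Both())); // foobar
        System.out.println(use("neither"));  // no match
    }
}
```

A single binding of type `Foo & Bar` is what `case Foo f & Bar b` does not give you: there, the same object is reachable only through two separately typed names.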

Now, all of this has no bearing on whether or not guards are required to be “top level only” in all cases; it argues only that guards need not appear within pattern-conjunction chains. But I believe it would be perfectly reasonable to write

    case Point(int x when x > 0, int y when y > x):

Rémi has argued that this would be better written

    case Point(int x, int y) when x > 0 && y > x:

but I would argue that this choice is, and should be, a matter of style, and when matching against a record with many fields it might be more readable to mention each field’s constraint next to its binding rather than to make the reader compare a list of bindings against a list of constraints.


Agreed, this is a user choice.
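
As a point of reference, the trailing-guard style is runnable as written in Java 21 and later, where a `when` clause may follow the pattern in a `case` label. The sketch below assumes such a compiler; the `Point` record and `classify` method are made up for illustration. The per-field style has no direct counterpart in that release, so only the trailing form is shown.

```java
final class WhenGuardSketch {
    record Point(int x, int y) {}

    static String classify(Object o) {
        return switch (o) {
            // Bindings x and y come from the record pattern; the guard is an ordinary && expression.
            case Point(int x, int y) when x > 0 && y > x -> "ascending: " + x + " < " + y;
            case Point p -> "other point";
            default -> "not a point";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new Point(1, 2))); // ascending: 1 < 2
        System.out.println(classify(new Point(2, 1))); // other point
        System.out.println(classify("hello"));         // not a point
    }
}
```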

Bottom line: there are conceptually three distinct combining forms:

 - pattern conjunction
 - guard conjunction
 - to a pattern, attach a guard

and it may be a mistake after all to conflate them by trying to use the same syntax or symbol for all three.

So what I would like to see is the convincing application example where you really do want to write

    pattern & guard & pattern


Did the "guard in the middle of the JSON blob" do that?

because then everything I’ve written above falls to the ground.

Well, even if so, it's possible it can be propped up, just not using one universal support. Suppose we allow guards to be conjoined on the end of a pattern with &&. Then you could say


    case Foo(var x) && x > 0:

as well, even, as

    case Foo(var x && x > 0):

But if you wanted to do the P & g & Q thing, you'd need a grobble-style pattern to do so:


    case P & grobble(g) & Q:

Recall that we will eventually be able to write `grobble()` as an ordinary declared pattern, and we can also still resurrect the true/false built-in patterns if we like, for rescuing this case. I still think we may eventually want a `non-null(e)` pattern, as it will likely be the most common form of grobbling:


    case Foo(var x && non-null(x)):

Sorry to bikeshed here, but while “when” is nice, I think “if” is even more appealing (short, familiar, already a keyword), especially if it alone can express attachment of a guard to a pattern (and we can argue about whether the parentheses are required):

    case Foo(var x) if (x > 0):

There's precedent for this, of course. My concern with this is that the colon too easily hides the flow of what's going on:


    case Foo(var x) if (x > 0): if (x > 10) println(x);

Having `if` on both sides there seems likely to lead to both "which is which" confusion, as well as "why can't I have Perl-style `if` at the end of a statement."

    case grobble(e):

which is later revealed to be sugar for:

    case Foo(var _) & grobble(e):


I think you meant it is sugar for

    case var _ & grobble(e):


Yes.

If so, then compare that to the claim that

    case if (e):

is sugar for

    case var _ if (e):


Or:

    case &&e:
