A guard construct need not itself be a pattern.

True.  What is minimally needed is a _syntactic_ separation of what is pattern and what is expression, without having to wait for semantic analysis to understand what is being combined.  This is, in part, because there are still sequences that are matched by both the pattern and expression productions, notably `Identifier()` (could be a deconstruction pattern with no bindings, or could be a method invocation.)

Rather, it can be viewed as a map from patterns to patterns.  Indeed, guards are formulated in exactly that way in Gavin’s BNF in JEP JDK-8213076 "Pattern Matching for switch": a guard is not a pattern, but can only appear within a pattern as the right-hand operand of `&`:

Pattern:
    PatternOperand
    Pattern & PatternOperandOrGuard

PatternOperandOrGuard:
    PatternOperand
    GuardPattern

As a result, a guard necessarily appears to the right of a `&` and therefore necessarily to the right of a pattern.  We should also ask whether it is ever desirable in practice, within a chain of `&` (pattern conditional-and) operations, for a pattern to appear to the right of a guard.

I have long had a nagging feeling that this will eventually be desirable.  Let's say we have P & g & Q & h; under what conditions can we commute g and Q without regret?  I can think of four potential sources of regret:

 - g declares bindings that are inputs to Q
 - the cost model of Q is such that we'd like to run g first, and short-circuit
 - Q might throw an exception when g does not hold
 - Q might have side-effects that we don't want to run if g does not hold

I think we can eliminate the last one; I'm pretty comfortable saying that if you write side effects in pattern declarations, you get what you deserve.  And the linguistic parts of patterns are not supposed to throw exceptions, though badly written pattern declarations may anyway.  But that still leaves the dataflow and performance concerns; I think I will eventually want to be able to specify the order, and get short-circuiting.  This is why I've resisted this direction to date.

If not, then `&` chains always have the simple form

pattern & pattern & … & pattern & guard & guard & … & guard

where the number of patterns must be positive but the number of guards may be zero.  And if this is the case, it is not unreasonable to ask whether readability might not be better served by better marking that transition from patterns to guard in the chain, for example:

pattern & pattern & … & pattern when guard & guard & … & guard

And then we see that there really is no reason to try to overload `&` (however it is actually spelled) to mean both pattern conjunction and guard conjunction, because guard conjunction already exists in the form of the `&&` expression operator:

pattern & pattern & … & pattern when guard && guard && … && guard

and therefore we can, after all, simplify this general form to the case of zero or one guards:

pattern & pattern & … & pattern [when guard]

There's one more turn of this crank: if we are willing to move the guards all to the right (big if), then, why say `when`, and not `&&`?  Then it looks just like the `if...instanceof` situation.

    case pattern & pattern && guard:

This further aligns patterns in instanceof with patterns in switch. (With one potentially surprising caveat: we can never switch on booleans; `case true && false` would not match `true`. Pause for groans.)
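For comparison, here is the `if...instanceof` shape being referred to, as a minimal sketch that should compile on Java 16 or later.  The class `InstanceofGuardDemo` and the method `describe` are made up for illustration; only the "pattern, then `&&`, then boolean expression" shape matters.

```java
// Hypothetical demo class, not part of any proposal.
class InstanceofGuardDemo {
    static String describe(Object o) {
        // A type pattern followed by an ordinary boolean expression joined
        // with &&.  The binding `i` is in scope both in the guard expression
        // and in the body of the if.
        if (o instanceof Integer i && i > 0) {
            return "positive int " + i;
        }
        return "something else";
    }

    public static void main(String[] args) {
        System.out.println(describe(42));
        System.out.println(describe(-1));
        System.out.println(describe("hi"));
    }
}
```

Under the `&&` spelling discussed above, the `case` label would read the same way as the condition of this `if`.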

Finally, given that (an earlier version of) the patterns design already encompasses forms that can bind the entire object as well as components (what is done in other languages with `as`), I have to ask: what are the envisioned practical applications of pattern conjunction other than as a cute way to include guards or a (more verbose) way to bind the entire value as well as components?  Maybe as a way to fake intersection types?

When a pattern focuses on a part, rather than the whole.  A pattern like `Point(var x, var y)` matches / destructures the whole thing, but other patterns can act as queries.  Imagine we have a pattern `Map.with(key)(var value)`, which matches maps that have the specified key.  We would likely want to combine these with &:

    if (x instanceof (Map.with(key1)(var val1) & Map.with(key2)(var val2))) { ... }

This scales up to query APIs such as a JSON parsing API, where you only want to match a blob of JSON if it has all the parts you are looking for (similar to "spec/conform" in Clojure.)  From https://github.com/openjdk/amber-docs/blob/master/site/design-notes/pattern-match-object-model.md:

switch (doc) {
    case stringKey("firstName")(var first)
         & stringKey("lastName")(var last)
         & intKey("age")(var age)
         & objectKey("address")(
                 stringKey("city")(var city)
                 & stringKey("state")(var state)
                 & ...): ...
}

This expresses not only the all-or-nothing nature of the composite query, but permits the pattern to match the structure of what is being queried (the `objectKey` pattern has a nested pattern which applies to the body of that object, which itself is an & pattern).

This is the sort of example I could imagine wanting to stick a guard in the middle of; I could well want to guard "name not empty" and not bother parsing the rest of the document.  Semantically that might be equivalent to putting the guard at the end, but the user might not thank us for not letting them short-circuit out.
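To make the short-circuiting concern concrete, here is a rough emulation using only `instanceof` chains and plain maps (Java 16 or later).  Everything in it is an assumption made for illustration: the class `GuardInMiddleDemo`, the helper `address` standing in for an expensive nested query, and the call counter.  It is not the `stringKey`/`objectKey` API from the design note.

```java
import java.util.Map;

// Hypothetical demo: a P && g && Q chain where g sits in the middle.
class GuardInMiddleDemo {
    // Counts how often the (pretend) expensive query runs.
    static int expensiveCalls = 0;

    // Stand-in for an expensive nested query on the document.
    static Object address(Map<String, Object> doc) {
        expensiveCalls++;
        return doc.get("address");
    }

    static String match(Map<String, Object> doc) {
        if (doc.get("firstName") instanceof String first    // P: binds first
                && !first.isEmpty()                          // g: guard in the middle
                && address(doc) instanceof Map<?, ?> addr    // Q: skipped if g fails
                && addr.get("city") instanceof String city) {
            return first + " lives in " + city;
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(match(Map.of("firstName", "", "address", Map.of("city", "Oslo"))));
        System.out.println("expensive calls so far: " + expensiveCalls);
        System.out.println(match(Map.of("firstName", "Ada", "address", Map.of("city", "Oslo"))));
        System.out.println("expensive calls so far: " + expensiveCalls);
    }
}
```

With the guard moved to the end of the chain the result would be the same, but `address` would run even for the empty name.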

The real unknown is:
    - g declares bindings that are inputs to Q

Can we construct a credible example?

As to intersection types, pattern-& is not even the right vehicle, because in

    case Foo f & Bar b:

then f and b will have types Foo and Bar, but really, I want something of type (Foo&Bar).  If this were important, I'd probably want to be able to use an intersection type in the type pattern:

    case (Foo&Bar) fb:
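
The difference between two separately typed bindings and one binding of intersection type can be sketched with existing Java features (records, `var`, and intersection casts, so Java 16 or later).  The interfaces `Foo` and `Bar`, the record `Both`, and the method names are all hypothetical, chosen only to mirror the names used above.

```java
// Hypothetical types for illustration only.
interface Foo { String foo(); }
interface Bar { String bar(); }
record Both(String foo, String bar) implements Foo, Bar {}

class IntersectionDemo {
    // A method that genuinely needs something of type (Foo & Bar).
    static <T extends Foo & Bar> String useBoth(T t) {
        return t.foo() + "+" + t.bar();
    }

    // Two bindings, as with `Foo f & Bar b`: f is only a Foo, b is only a Bar,
    // so neither useBoth(f) nor useBoth(b) would compile here.
    static String viaTwoBindings(Object o) {
        if (o instanceof Foo f && o instanceof Bar b) {
            return f.foo() + "+" + b.bar();
        }
        return "no match";
    }

    // One binding of intersection type, obtained today with a cast and `var`.
    static String viaIntersectionCast(Object o) {
        if (o instanceof Foo && o instanceof Bar) {
            var fb = (Foo & Bar) o;
            return useBoth(fb);
        }
        return "no match";
    }

    public static void main(String[] args) {
        Object o = new Both("a", "b");
        System.out.println(viaTwoBindings(o));
        System.out.println(viaIntersectionCast(o));
    }
}
```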

Now, all of this has no bearing on whether or not guards are required to be “top level only” in all cases; it argues only that guards need not appear within pattern-conjunction chains.  But I believe it would be perfectly reasonable to write

case Point(int x when x > 0, int y when y > x):

Rémi has argued that this would be better written

case Point(int x, int y) when x > 0 && y > x:

but I would argue that this choice is, and should be, a matter of style, and when matching against a record with many fields it might be more readable to mention each field’s constraint next to its binding rather than to make the reader compare a list of bindings against a list of constraints.
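
As a point of reference, the trailing-guard form can be sketched as runnable code, assuming the `when` spelling and a compiler for Java 21 or later.  The record `Point` and the class `TrailingGuardDemo` are made up for illustration; the nested per-field form appears only in a comment because it is a proposed spelling, not something this sketch can compile.

```java
// Hypothetical record matching the Point used in the discussion.
record Point(int x, int y) {}

class TrailingGuardDemo {
    static String classify(Object o) {
        return switch (o) {
            // Proposed nested spelling from the discussion (not compiled here):
            //     case Point(int x when x > 0, int y when y > x)
            // Trailing-guard spelling: bindings first, all constraints after.
            case Point(int x, int y) when x > 0 && y > x -> "ascending positive point";
            case Point p -> "some other point";
            default -> "not a point";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new Point(1, 2)));
        System.out.println(classify(new Point(1, 0)));
        System.out.println(classify("hello"));
    }
}
```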

Agreed, this is a user choice.

Bottom line: there are conceptually three distinct combining forms:

pattern conjunction
guard conjunction
to a pattern, attach a guard

and it may be a mistake after all to conflate them by trying to use the same syntax or symbol for all three.

So what I would like to see is the convincing application example where you really do want to write

pattern & guard & pattern

Did the "guard in the middle of the JSON blob" do that?

because then everything I’ve written above falls to the ground.

Well, even if so, it's possible it can be propped up, just not using one universal support.  Suppose we allow guards to be conjoined on the end of a pattern with &&.  Then you could say

    case Foo(var x) && x > 0:

as well, even, as

    case Foo(var x && x > 0):

But if you wanted to do the P & g & Q thing, you'd need a grobble-style pattern to do so:

    case P & grobble(g) & Q:

Recall that we will eventually be able to write `grobble()` as an ordinary declared pattern, and we can also still resurrect the true/false built-in patterns if we like, for rescuing this case.  I still think we may eventually want a `non-null(e)` pattern, as it will likely be the most common form of grobbling:

    case Foo(var x && non-null(x)):

Sorry to bikeshed here, but while “when” is nice, I think “if” is even more appealing (short, familiar, already a keyword), especially if it alone can express attachment of a guard to a pattern (and we can argue about whether the parentheses are required):

case Foo(var x) if (x > 0):

There's precedent for this, of course.  My concern with this is that the colon too easily hides the flow of what's going on:

    case Foo(var x) if (x > 0): if (x > 10) println(x);

Having `if` on both sides there seems likely to lead to both "which is which" confusion and "why can't I have Perl-style `if` at the end of a statement."


    case grobble(e):

which is later revealed to be sugar for:

    case Foo(var _) & grobble(e):

I think you meant it is sugar for

    case var _ & grobble(e):

Yes.

If so, then compare that to the claim that

    case if (e):

is sugar for

    case var _ if (e):

Or:

    case &&e:

