I recently reviewed the spec changes for patterns in switch, and found the
treatment of switch labels pretty hard to work out. Since this is really about
the design more than the specification, I thought I'd share some thoughts here.
Consider this early feedback on the Java 19 preview.

The status quo: syntactically, a switch label is a set of colon- and
comma-separated elements, where an *element* is one of { constant, enum,
pattern, null, default }. (I'm oversimplifying delimiters a little: 'case' or a
plain 'default' must follow ':', but not ','.) Layered on top of that syntax
are a number of restrictions: colon delimiters aren't allowed when using switch
*rules*; the entire set of elements must not have repeat elements; constants
and enums are only allowed for certain switch types; certain combinations of
elements are disallowed, while others are okay. Then there are inter-label
restrictions, mainly expressed by the dominance relation.

I find all these non-syntactic rules to be pretty hard to keep in my head,
especially when it comes to which combinations of elements are allowed.

A couple of simplifying moves I'd suggest:

(1) Don't try to merge sets of switch block labels (e.g., 'case foo: case bar:').

These consecutive labels have the *effect* of handling multiple cases with a
single block of code, but I think we can formally treat them as two distinct
blocks, the first of which falls through to the second. And I think that
framing is more in line with how programmers would typically read the syntax.

In this framing, the restrictions about sets of elements in a single label
don't apply, because we're talking about two different labels. But we have
rules to prevent various abuses. Examples:

    case 23: case Pattern:           // illegal before and now, due to fallthrough Pattern rule
    case Pattern: case Pattern:      // ditto
    case null: case Pattern:         // allowed before, illegal now: use a comma for null
    case Pattern: default:           // illegal before, legal now: you fell through to the default case
    case Pattern: noop(); default:   // legal before and now
    case Pattern: case null:         // binds the pattern before, pattern is out of scope now
    case Pattern: noop(); case null: // legal with pattern out of scope, before and now
    case Pattern: case 23:           // illegal before, legal now with fallthrough
    case 23: case 23:                // illegal before and now, due to dominance
    case default: default:           // illegal before and now, due to "only one default" rule

Another way to argue this is that I think 'case Pattern: case somethingelse'
has a lot more in common, syntactically and conceptually, with 'case Pattern:
noop(); case somethingelse:' than it does with 'case Pattern, somethingelse'.
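
To make that comparison concrete, here is a minimal sketch that uses only constant labels, so it compiles without any preview features. The class and method names are hypothetical, and 'noop()' is just the placeholder from the examples above. The two methods agree on every input, which is the sense in which adjacent labels can be read as two blocks with fallthrough:

```java
// Hypothetical demo class; all names are illustrative only.
final class FallthroughDemo {
    // Adjacent labels, as usually written: 'case 1: case 2:'.
    static String adjacent(int x) {
        switch (x) {
            case 1: case 2:
                return "small";
            default:
                return "other";
        }
    }

    // The same switch read as two blocks: the block for 'case 1' runs a
    // no-op statement and then falls through into the block for 'case 2'.
    static String explicitFallthrough(int x) {
        switch (x) {
            case 1:
                noop();
            case 2:
                return "small";
            default:
                return "other";
        }
    }

    // Placeholder statement that does nothing.
    static void noop() { }

    public static void main(String[] args) {
        for (int x = 1; x <= 3; x++) {
            System.out.println(x + ": " + adjacent(x) + " / " + explicitFallthrough(x));
        }
    }
}
```

The comma form ('case 1, 2:') would behave the same way here too; the difference only starts to matter once patterns and bindings are involved.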

A possible limitation is if there are already special rules in the language
that treat colon-delimited labels differently than separate blocks with
fallthrough. I can't think of any right now, but I may be forgetting
something...

(2) Reduce the syntactic surface of comma-separated switch labels.

There are a lot of combinations of elements—maybe a majority?—that don't make
sense and we prohibit. There are some others that are a bit odd, but we allow
them anyway. I'd prefer to cut back on having multiple ways to do things, and
syntactically enumerate the few cases that are actually meaningful.
Something like:

    SwitchLabel:
        case CaseValue { , CaseValue } :
        case Pattern { Guard }:
        case null, Pattern:
        default:
        case null, default:

    CaseValue:
        null
        ConstantExpression
        EnumConstantName

(There's some ambiguity in CaseValue, and I think we should do better with
'ConstantExpression', but okay, set that aside, this is the concept at least.)

Note that the second kind of Pattern SwitchLabel is especially weird—it binds
'null' to a pattern variable, and requires the pattern to be a (possibly
parenthesized) type pattern. So it's nice to break it out as its own syntactic
case. I'd also suggest rethinking whether "case null," is the right way to
express the two kinds of nullable SwitchLabels, but anyway now it's really
clear that they are special.

Rules like "no duplicates" and "only for certain switch types" only need to be
expressed as constraints on the first kind of SwitchLabel. Dominance also
seems like it would be more manageable to specify/understand.
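
For a feel of what code looks like when it sticks to these few forms, here is a rough sketch. It assumes a JDK where pattern matching for switch is a final feature with 'when' guards (Java 21 or later, as far as I know); the class, enum, and method names are hypothetical. The 'case null, Pattern:' form is left out because I don't believe the finalized feature accepts it.

```java
// Hypothetical demo class; assumes Java 21+ (pattern switch as a final feature).
final class LabelFormsDemo {
    enum Color { RED, GREEN, BLUE }

    static String describe(Object o) {
        return switch (o) {
            // case Pattern Guard
            case Integer i when i > 100 -> "big int";
            // case Pattern
            case Integer i -> "int";
            case String s -> "string";
            // case null, default
            case null, default -> "null or something else";
        };
    }

    static String warmth(Color c) {
        return switch (c) {
            // case CaseValue
            case RED -> "warm";
            // case CaseValue, CaseValue
            case GREEN, BLUE -> "cool";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(500));
        System.out.println(describe("hello"));
        System.out.println(describe(null));
        System.out.println(warmth(Color.BLUE));
    }
}
```

Every label here is one of the enumerated forms, and nothing needs a combination rule to explain why it is legal.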

Some things this would newly disallow:
- 'case default:' -- just say 'default:'
- 'case 23, default:' -- just say 'default:' (mention '23' in a comment if it's
  important to call out)
- 'case default, null:' -- could add this case, I guess, or just say that
  'default' always goes second
- 'case Pattern, null:' -- ditto
- 'case null, Pattern Guard:' -- confusing whether the guard is checked when the
  value is null

This cuts back especially on the degrees of freedom for 'default': the only
useful thing you can add to it, and should want to add to it, is making it
match null.