> From: "Brian Goetz" <brian.go...@oracle.com>
> To: "amber-spec-experts" <amber-spec-experts@openjdk.java.net>
> Sent: Monday, May 17, 2021 23:36:30
> Subject: Rehabilitating switch -- a scorecard
> This is a good time to look at the progress we've made with switch. When
> we started looking at extending switch to support pattern matching (four
> years ago!) we identified a lot of challenges deriving from switch's C
> legacy, some of which are summarized here:
>
> http://cr.openjdk.java.net/~briangoetz/amber/switch-rehab.html
>
> We had two primary driving goals for improving switch: switches as
> expressions, and switches with patterns as labels. In turn, these pushed
> on a number of other uncomfortable aspects of switch: fallthrough,
> totality, scoping, and null handling.
>
> Initially, we were unsure we would be able to rehabilitate switch to
> support these new requirements without being forever bogged down by the
> mistakes of the past. Bit by bit, we have chipped away at the negative
> aspects of switch, while respecting the existing code that depends on
> those aspects. I think where we've landed is, in many ways, better than
> we could have initially hoped for.
>
> Throughout this exercise, there were periodic calls for "just toss it and
> invent something new" (which we sometimes called "snitch", shorthand for
> "new switch"*), and no shortage of people's attempts to design their
> ideal switch construct. We resisted this line of attack, because we
> believed having two similar-but-different constructs living side by side
> would be more annoying (and confusing) to users than a rehabilitated,
> albeit more complex, construct.
>
> The first round of improvements came with expression switches. This was
> the easy batch, because it didn't materially change the set of questions
> we could ask with switch, just the form in which we asked the question.
> This brought the following improvements:
>
> - Switches as expressions. Many existing switch statements are in
>   reality modeling expressions, in a more roundabout and less safe way.
>   Expressing it directly is simpler and less error-prone.
> - Checked totality. The compiler enforces that a switch expression is
>   exhaustive (because expressions must be total). In the case of enum
>   switches, a switch that covers all the cases needs no default clause,
>   and the compiler inserts an extra case to catch novel values and throw
>   (ICCE) on them. (Eventually the same will be true for switches on
>   sealed classes as well.)
> - A fallthrough-free option. Switches now give us a choice between two
>   styles of _switch blocks_, the old willy-nilly style, and the new
>   single-consequent (arrow) style. Switches that choose arrow-style need
>   not reason about fallthrough.
>
> Unfortunately, it also brought a new asymmetry; switch expressions must
> be total (and you get enhanced type checking for this), but switch
> statements cannot be. This is a shame, since the improved type checking
> for totality is one of the best things about the improvements in switch,
> as a switch that is total by virtue of actually covering all the cases
> acts as a tripwire against new enum constants / permitted subtypes being
> added later, rather than papering it over with a catch-all. We explored
> several ways to explicitly add back totality checking, but this always
> felt like a hack, and requires the programmer to remember to ask for this
> checking.
>
> Our resolution here offers a path to true healing with minimal user
> impact, by (temporarily) carving out the semantic space of old statement
> switches. A "legacy switch" is a statement switch on a numeric primitive
> or its box, enum, or string, and which contains no pattern labels (i.e.,
> a statement switch that is valid today.) Like expression switches, we
> will require non-legacy statement switches to be exhaustive, and warn on
> non-exhaustive legacy switches. (To make the warning go away, just insert
> a "default: " or "default: break" at the bottom of the switch; not
> painful.)
> After some time, we should be able to make this warning an error, which
> again is easy to mitigate with a single line. In the end, all switch
> constructs will be total and type-checked for exhaustiveness, and once
> done, the notion of "legacy switch" can be garbage-collected.
>
> Looking ahead to patterns in switch, we have several legacy
> considerations to navigate:
>
> - Fallthrough and bindings. While fallthrough is not inherently
>   problematic (though the choice of fallthrough-by-default was
>   unfortunate), if a case label introduces a pattern variable, then
>   fallthrough to another case (at least one that doesn't introduce the
>   same pattern variable with the same type) makes little sense, and such
>   fallthrough has been outlawed.
> - Scoping. The block of a switch is one big scope, rather than each case
>   label group being its own scope. (Again, one might call this a
>   historical error, since there's little good that comes from this.) With
>   case labels introducing variable declarations, this could have been a
>   big problem, if one case polluted later cases (forcing users to pick
>   unique names for each binding in a switch statement), but flow scoping
>   solves that one.
> - Nulls. In Java 1.0, switching over reference types was not permitted,
>   so we didn't have to worry about this. In Java 5, autoboxing and enums
>   meant we could switch over some reference types, but for all of these,
>   null was a "silly" value so we didn't care about NPEing on null. In
>   Java 7, when we added string switch, we could have conceivably allowed
>   `case null`, but instead chose to follow the precedent set by Java 5.
>   But once we introduce switches over any type, with richer patterns,
>   eagerly NPEing on null becomes much more problematic. We've navigated
>   this by saying that switches can NPE on null if they have no nullable
>   cases; nullable cases are those that explicitly say "null", and total
>   patterns (which always come last since they dominate all others.)
>   The old rule of "switches throw on null" becomes "switches throw on
>   null, except when they say 'case null' or the bottom case is total."
>   Default continues to mean what it always did -- "anything not already
>   matched, except null."
>
> The new treatment of null actually would have fallen out of the decisions
> on totality, had we not gotten there already via another path. Our notion
> of totality accounts for "remainder", which includes things like novel
> subclasses of sealed types that did not exist at compile time, which it
> would not be reasonable to ask users to write code to deal with, and null
> fits into this treatment as well. We type check that a switch is
> sufficiently total, and then insert extra code to catch "silly" values
> that are not otherwise handled, including null, and throw. (This also
> enables DA analysis to truly trust switch totality.)
>
> Where we land is a single unified switch construct that can be either a
> statement or an expression; that can use either old-style flow (colon) or
> the more constrained flow style (arrow); whose case labels can be
> constants, patterns (including guarded patterns), or a mix of the two;
> which can accept the legacy null-hostility behavior, or can override it
> by explicitly using nullable case labels; and which is almost always type
> checked for totality (with some temporary, legacy exceptions.)
> Fallthrough is basically unchanged; you can get fallthrough when using
> the old-style flow, but it becomes less important as fallthrough is
> (mostly) nonsensical in the presence of pattern cases with bindings, and
> the compiler prevents this misuse. The distinction between "legacy"
> switches and pattern switches is temporary, with a path to getting to
> "all switches are total" over time.
>
> I think we've done a remarkable job at rehabilitating this monster.
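For readers who want something concrete, the unified construct summarized above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the syntax as it was later finalized in Java 21 (guards written with `when`, which differs from the drafts under discussion at the time), and `Shape`, `Circle`, `Square`, `describe`, and `kind` are hypothetical names that do not come from the original message. It deliberately shows only the explicit `case null` form of null handling.

```java
// Sketch only: assumes Java 21 or later, where pattern matching for switch is final.
// Shape, Circle, Square, describe and kind are hypothetical names for illustration.
sealed interface Shape permits Circle, Square {}
record Circle(double radius) implements Shape {}
record Square(double side) implements Shape {}

final class SwitchScorecard {
    // Expression switch, arrow form, type patterns, one guard, explicit null label.
    static String describe(Shape shape) {
        return switch (shape) {
            case null -> "no shape";                        // opts out of throwing on null
            case Circle c when c.radius() == 0 -> "point";  // guarded pattern
            case Circle c -> "circle of radius " + c.radius();
            case Square s -> "square of side " + s.side();
            // No default: the compiler checks that Circle and Square cover Shape.
        };
    }

    // Without a 'case null' label the switch keeps the legacy null-hostility
    // and throws NullPointerException when shape is null.
    static String kind(Shape shape) {
        return switch (shape) {
            case Circle c -> "round";
            case Square s -> "angular";
        };
    }
}
```

If a third permitted subtype were added to `Shape`, both switches would stop compiling until a matching case was written, which is the tripwire behavior the message describes.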
I believe the only pending issue on that matter is the position of default
inside the switch. With a legacy switch, default can be in the middle; with
a switch on types, default has to be the last case. I think we should try
to emit a warning if "default" is not in the last position; both Eclipse
and IntelliJ already have that warning.

Rémi

> *Someone actually suggested using the syntax "new switch", on the basis
> that new was already a keyword. Would not have aged well.

If we add a prefix "new" to switch for each LTS release, e.g. "new new
switch" for 6 years after 2018, it would help future historians, because
radiocarbon dating does not work well on source code.
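On the position of default raised in the reply above, here is a minimal sketch of what a legacy switch currently tolerates. The class and method names (`DefaultPosition`, `classify`) are hypothetical and chosen only for illustration; the code uses nothing beyond the classic statement switch.

```java
// Sketch only: DefaultPosition and classify are hypothetical names.
final class DefaultPosition {
    // Legacy statement switch: 'default' may legally sit between other labels.
    static String classify(int code) {
        String result;
        switch (code) {
            case 1:
                result = "one";
                break;
            default:            // legal here, but easy to misread
                result = "other";
                break;
            case 2:             // still reachable: labels are matched first,
                result = "two"; // default applies only when none of them match
                break;
        }
        return result;
    }
}
```

Here `classify(2)` yields "two" and any value other than 1 or 2 yields "other", so the placement does not change behavior in this fallthrough-free example; it only hurts readability, which is the case the proposed warning targets.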