Hi,
Some of you may have noticed that in the rewrite from rustboot to rustc
we're becoming substantially more expression-language-ish. This is
mostly a result of me yielding to the preferences of other developers
(and LLVM's semantics), as well as some hint that things get much easier
in syntax extensions and in calculating compile-time constants if we
permit more "statement-ish" forms as expressions. Particularly conditionals.
We've run into a (common, seen in many other languages) sort of problem
along the way here, which is that some expressions are implicitly
ignored (or must be, due to being in an ignored context) whereas others
are not. We have a nil-type (), but we don't always have sensible rules
for forcing things to have the nil type by context.
This email is a poll of alternative solutions. I'll give two example
cases and ask people for their input on which modification of the rules
feels best.
Example case that does compile:
A: auto x = if (foo()) { 10; } else { 11; };
Example case that does not compile:
B: if (foo()) { 10; } else { "hello"; }
We can write case B in rust at the moment, but under the rustc
typechecking rules it will fail to compile: 'if' is an
expression-statement, expressions have types, and the two branches (each
judged by the value of its last statement, if that is an expression, or
else nil) have different types.
Here are some approaches to solving this example. Please pick the one
you like the most:
(1) Kick all branchy expressions out of the expression grammar, put them
back in the statement grammar. Case B will compile, and case A must be
rewritten like so:
A: auto x = { auto t = 11; if (foo()) { t = 10; }; t; };
This is the C-with-GNU-extensions model.
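For anyone who wants to try the shape of rewrite (1), here is a rough
sketch in present-day Rust syntax rather than the syntax used in this
email. The function names, and the flag parameter that stands in for
whatever foo() would compute, are assumptions made purely for illustration.

```rust
// Hypothetical stand-in for the foo() used in the examples above;
// it just hands back the flag so both paths can be exercised.
fn foo(flag: bool) -> bool {
    flag
}

// Case A in the statement-only style of option (1): the `if` is used
// purely for its side effect, and the result travels through a temporary.
fn case_a_statement_style(flag: bool) -> i32 {
    let mut t = 11;
    if foo(flag) {
        t = 10;
    }
    return t;
}
```

Calling case_a_statement_style(true) takes the branch that assigns 10;
with false the temporary keeps its initial value of 11.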
(2) Hoist all statements up into the expression language and make
semicolon into a sequencing operator, with a trailing-semi ignored by
the parser. Then we need to rewrite only the second case to force unit
types in the to-be-ignored differing branches.
B: if (foo()) { 10; () } else { "hello"; () }
Though we'd also be *allowed* to rewrite the first case to drop the
semicolons:
A: auto x = if (foo()) { 10 } else { 11 };
This is the OCaml approach.
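The two rewrites in (2) can be approximated in present-day Rust, with
the caveat that Rust's semicolon is not literally a sequencing operator,
so this only mirrors the shape of the idea. The function names and the
flag parameter are hypothetical.

```rust
// Hypothetical stand-in for the foo() used in the examples above.
fn foo(flag: bool) -> bool {
    flag
}

// Case B: both branches compute values of different types, discard
// them, and then end in an explicit () so the branch types agree.
fn case_b_forced_unit(flag: bool) {
    if foo(flag) { 10; () } else { "hello"; () }
}

// Case A with the inner semicolons dropped: each branch's value is
// its final expression, so the whole `if` has an integer type.
fn case_a_no_semicolons(flag: bool) -> i32 {
    let x = if foo(flag) { 10 } else { 11 };
    x
}
```

Only case B needs the explicit () here; case A typechecks because both
branches end in an integer.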
(3) A slightly weaker form of (2), which is to reformulate blocks with
the following grammar:
block ::= { [ stmt ; ]* expr? }
In other words, every block becomes a brace-enclosed sequence of
semicolon-terminated statements, followed by an optional expr. If the
expr is missing, it is implied as (). In this case we'd be rewriting
only the first case:
A: auto x = if (foo()) { 10 } else { 11 };
This is similar to the OCaml rule in practice, except that it makes the
presence of a final semicolon in a block equivalent to ending the block
with a value of nil type, (). This is a possible hazard
(especially during refactoring or editing) to users who want to write a
value-producing block but accidentally semicolon-terminate the last
expression; but it's not a huge hazard since the typechecker will tell
them the value they produced is of nil type. It just might be hit a lot.
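Blocks in present-day Rust happen to follow this same "optional trailing
expression" grammar, so rule (3) and its hazard can be tried directly.
This is a sketch in that later syntax, not the syntax of this email; the
function names and the flag parameter are hypothetical.

```rust
// Hypothetical stand-in for the foo() used in the examples above.
fn foo(flag: bool) -> bool {
    flag
}

// Case A: no semicolon after the last expression in each block, so
// each block's value is that expression.
fn case_a(flag: bool) -> i32 {
    let x = if foo(flag) { 10 } else { 11 };
    x
}

// Case B unchanged: the final semicolons make both blocks end in (),
// so the differing types of 10 and "hello" never meet.
fn case_b(flag: bool) {
    if foo(flag) { 10; } else { "hello"; }
}

// The hazard: accidentally terminating the last expression turns the
// block's value into (), which the typechecker rejects.
//
//     let x: i32 = if foo(flag) { 10; } else { 11; };  // type error
```

The commented-out line is the refactoring slip described above: it is
rejected at compile time rather than silently producing a wrong value.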
(4) Statically determine the contexts in which an expression's value
"will be used" in an outer expression, and only typecheck those
contexts. This permits both of the examples to compile as-is, but it's
the most unorthodox approach, and poses a refactoring hazard as code may
become type-invalid when nested into an expression context that "uses"
its previously-ignored result. Again, as in (3) the typechecker will
catch these cases, but they might happen more or less often than those
in (3).
We can't think of any other options. Significant whitespace is not an
option :)
Personally my knee-jerk reaction is to embrace (1) since I like
statements anyway, but I can see plausible arguments for the other 3.
Can I get a show of hands? We have to pick something.
-Graydon
_______________________________________________
Rust-dev mailing list
https://mail.mozilla.org/listinfo/rust-dev