I want to answer this before we start the meetings because I really think that
restricted keywords, as I propose them, solve the issues Stephan raised.
----- Original Message -----
From: "Stephan Herrmann" <stephan.herrm...@berlin.de>
To: jigsaw-dev@openjdk.java.net
Sent: Tuesday, 16 May 2017 11:49:45
Subject: Re: An alternative to "restricted keywords"
Thanks, Remi, for taking this to the EG list.
Some collected responses:
Remi: "from the user point of view, '^' looks like a hack"
This is, of course, a subjective statement. I don't share this view
and in years of experience with Xtext-languages (where this concept
is used by default) I never heard any user complain about this.
More importantly, I hold that such aesthetic considerations are of
much lesser significance than the question, whether we can explain
- unambiguously explain - the concept in a few simple sentences.
Explaining must be possible at two levels: in a rigorous specification
and in simple words for users of the language.
I'm not against ^, or `, which has already been proposed as a way to escape an
identifier, but as you said it's a pervasive change that applies to the whole
grammar, while I think that with restricted keywords (which really should be
called local keywords) the changes only impact the grammar that specifies a
module-info.java.
Remi: "a keyword which is activated if you are at a position in the
grammar where it can be recognized".
I don't think 'being at a position in the grammar' is a good way of
explaining. Parsing doesn't generally have one position in a grammar,
multiple productions can be active in the same parser state.
Also speaking of a "loop" for modifiers seems to complicate matters
more than necessary.
Under these considerations I still see '^' as the clearest of all
solutions. Clear as a specification, simple to explain to users.
Eclipse uses an LR parser, and for an LR parser, position == dotted production,
as I have written earlier, so there is no problem because it corresponds to
only one parser state. Note that even if one does not use an LR or an LL
parser, most hand-written parsers I've seen, javac being one of them, also
refer to dotted productions in the comments of the corresponding methods.
Peter spoke about module names vs. package names.
I think we agree, that module names cannot use "module words",
whereas package names should be expected to contain them.
Yes, that's the main issue: package names may contain unqualified names like
'transitive', 'with' or 'to'.
But I think people will also want to use an existing package, or more exactly
a prefix of an existing package, as a module name, so we should also support
having a restricted keyword as part of a module name.
The grammar is:
open? module module_name {
  requires (transitive | static)* module_name;
  exports package_name;
  exports package_name to module_name1, module_name2;
  opens package_name;
  opens package_name to module_name1, module_name2;
  uses xxx;
  provides xxx with xxx, yyy;
}
If we just consider package names, only 'opens' and 'exports' are followed by a
package name, and a package name can only be followed by ';' or 'to'. So once
'opens' is parsed, you know that only an identifier can follow; if the next
token is not an identifier but one of the restricted keywords, it should be
considered an identifier.
As I said earlier, the scanner can see the restricted keyword as a keyword and,
before feeding the token to the parser, you can check the parser state to see
whether the keyword has to be lowered to an identifier or not.
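
To make the lowering step concrete, here is a minimal sketch in Java. It is not
taken from javac or Eclipse: the class RestrictedKeywordLowering, its Token
type and the methods scan, lower and kinds are all hypothetical names, and the
"parser state" is reduced to a single boolean that says whether a name part is
expected. It only covers the 'exports' and 'opens' directives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: lower restricted keywords to identifiers using parser state. */
class RestrictedKeywordLowering {
    enum Kind { KEYWORD, IDENTIFIER, PUNCTUATION }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    // Words the scanner always reports as keywords (assumed list for this sketch).
    static final Set<String> RESTRICTED = Set.of(
        "open", "module", "requires", "transitive", "exports",
        "opens", "to", "uses", "provides", "with");

    static final Pattern TOKEN = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|[.,;]");

    /** Context-free scanner: every restricted word comes out as a KEYWORD. */
    static List<Token> scan(String source) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            String text = m.group();
            Kind kind;
            if (RESTRICTED.contains(text)) {
                kind = Kind.KEYWORD;
            } else if (text.matches("[.,;]")) {
                kind = Kind.PUNCTUATION;
            } else {
                kind = Kind.IDENTIFIER;
            }
            tokens.add(new Token(kind, text));
        }
        return tokens;
    }

    /** Filter between scanner and parser: lowers a keyword when a name part is expected. */
    static List<Token> lower(List<Token> scanned) {
        List<Token> out = new ArrayList<>();
        boolean expectsNamePart = false; // stand-in for the real parser state
        for (Token t : scanned) {
            if (t.kind == Kind.KEYWORD && expectsNamePart) {
                t = new Token(Kind.IDENTIFIER, t.text);
            }
            out.add(t);
            if (t.kind == Kind.KEYWORD) {
                expectsNamePart = true;               // 'exports', 'opens', 'to' precede a name
            } else if (t.kind == Kind.PUNCTUATION) {
                expectsNamePart = !t.text.equals(";"); // '.' and ',' precede a name part
            } else {
                expectsNamePart = false;              // after an identifier: '.', 'to', ',' or ';'
            }
        }
        return out;
    }

    /** One letter per token: K(eyword), I(dentifier), P(unctuation). */
    static String kinds(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Token t : tokens) sb.append(t.kind.name().charAt(0));
        return sb.toString();
    }
}
```

With the input "exports to.transitive to with.open;", only 'exports' and the
second 'to' stay keywords; every other restricted word is handed to the parser
as an identifier.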
For module names, there is the additional problem of 'transitive', because if a
module name starts with transitive, you can have a conflict. As I said earlier,
instead of using the next token to know whether transitive is the keyword or
part of the module name, I think we should consider it a keyword, as the JLS
says a restricted keyword is activated when it can appear, so "requires
transitive" is not a valid directive.
Remi: "you should use reverse DNS naming for package so no problem :)"
"to" is a "module word" and a TLD.
I think we should be very careful in judging that an existing conflict
is not a real problem. Better to clearly and rigorously avoid the
conflict in the first place.
'to' as the first part of a package/module name and 'to' as in exports ... to
cannot be present in the same dotted production, because exports has to be
followed by a package_name, so 'to' here means the start of a package name.
Then, because a package name cannot end with '.', you always know whether you
are inside the production recognizing the package_name or outside it, matching
the 'to' of the exports directive.
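
The argument above can be sketched as a small hand-written parser for the
exports directive, with the production written in a comment above each method.
This is only an illustration under simplifying assumptions: the class
ExportsDirectiveParser and its members are hypothetical names, tokens are plain
strings, and any word is accepted as a name part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of a hand-written parser for one 'exports' directive. */
class ExportsDirectiveParser {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    String packageName;
    final List<String> targets = new ArrayList<>();

    ExportsDirectiveParser(String source) {
        Matcher m = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|[.,;]").matcher(source);
        while (m.find()) tokens.add(m.group());
    }

    private String next() {
        if (pos >= tokens.size()) throw new IllegalStateException("unexpected end of input");
        return tokens.get(pos++);
    }

    private boolean peekIs(String text) {
        return pos < tokens.size() && tokens.get(pos).equals(text);
    }

    private void expect(String text) {
        String t = next();
        if (!t.equals(text)) {
            throw new IllegalStateException("expected '" + text + "' but found '" + t + "'");
        }
    }

    // word: any identifier-like token; restricted keywords are plain words here
    private String word() {
        String t = next();
        if (!Character.isJavaIdentifierStart(t.charAt(0))) {
            throw new IllegalStateException("expected a name part but found '" + t + "'");
        }
        return t;
    }

    // name: word ('.' word)*
    // The name continues only while the next token is '.', so a 'to' seen
    // right after a word is always outside the name.
    private String name() {
        StringBuilder sb = new StringBuilder(word());
        while (peekIs(".")) {
            next();
            sb.append('.').append(word());
        }
        return sb.toString();
    }

    // exports_directive: 'exports' name ('to' name (',' name)*)? ';'
    void parse() {
        expect("exports");
        packageName = name();
        if (peekIs("to")) {
            next();
            targets.add(name());
            while (peekIs(",")) {
                next();
                targets.add(name());
            }
        }
        expect(";");
    }
}
```

For "exports to.foo to to.bar, with.baz;" the first 'to' is consumed as the
start of the package name, the second one as the keyword of the directive, and
the third one as the start of a module name.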
Some additional notes from my side:
In the escape-approach, it may be prudent to technically allow
escaping even words that are identifiers in Java 9, but could become
keywords in a future version. This ensures that modules which need
more escaping in Java 9+X can still be parsed in Java 9.
Yes, that's why I think that escaping is not the right mechanism here: we want
to solve a very local problem, so we do not need a global, grammar-wide way to
solve it.
Current focus was on names of modules, packages and types.
A complete solution must also give an answer for annotations on modules.
Some possible solutions:
a. Assume that annotations for modules are designed with modules in mind
and thus have to avoid any module words in their names.
b. Support escaping also in annotations
c. Refine the scope where "module words" are keywords, let it start only
when the word "module" or the group "open module" has been consumed.
This would make the words "module" and "open" special, as being
switch words, where we switch from one language to another.
(For this I previously coined the term "scoped keywords" [1])
For annotations, again, because annotation names are qualified, you know when
you see 'module' whether you are in the middle of the annotation name or
outside it.
I think we all agree that the conflicts we are solving here are rare
corner cases. Most names do not contain module words. Still, from a
conceptual and technical p.o.v. the solution must be bullet proof.
But there's no need to be afraid of module declarations being spammed
with dozens of '^' characters. Realistically, this will not happen.
I agree, and I strongly believe that scoped keywords, local keywords or
restricted keywords, i.e. whatever the name, keywords that are keywords or
identifiers depending on the parser state, are the general mechanism that
solves our problem.
Stephan
[1] http://www.objectteams.org/def/1.3/sA.html#sA.0.1
Rémi