I'll update http://svn.openlaszlo.org/vendor/ and the wiki pages [1,2]
later.
[1] http://wiki.openlaszlo.org/SubversionBuildInstructions#Install_JavaCC
[2] http://wiki.openlaszlo.org/NativeWindows7SubversionBuildInstructions#Install_javacc
Change bargull-20110202-U3G by bargull@Bargull02 on 2011-02-02 17:03:34
in /home/anba/src/svn/openlaszlo/trunk
for http://svn.openlaszlo.org/openlaszlo/trunk
Summary: Upgrade to JavaCC5, fix ASI bugs in parser
New Features: LPP-8788 (We should use a more modern version of JavaCC)
Bugs Fixed: LPP-9714 (Parser does not enforce no-line-terminator in
break/continue/throw/postfix-op statements), LPP-9719 (Semicolon after
do-while creates empty statement), LPP-9720 (Newline in multi-line
comment not treated as newline for ASI), LPP-9721 (TypeIdentifier()
should enforce NoLineTerminator before Nullable()/NonNullable()),
LPP-9722 (Multi-line comment not handled when computing ASI)
Technical Reviewer: ptw, dda
QA Reviewer: (pending)
Overview:
JavaCC 5 has a bug with javacode non-terminals: the token stream is
not rewound properly in a (deeply?) nested syntactic lookahead when it
is used in conjunction with javacode non-terminals (e.g. see the
lookahead for ModifiedDefinition() in Directive() and the Sc()
non-terminal). Even a no-op javacode non-terminal broke the parser
(JAVACODE void Nop () {/*empty*/}). For that reason I had to
re-implement the whole optional semicolon mechanism in the parser.
While doing this rework, I encountered several bugs
(LPP-9714, LPP-9719, LPP-9720, LPP-9721, LPP-9722).
Details:
All Sc() non-terminals for optional semicolons have been rewritten to
use a semantic lookahead: [LOOKAHEAD({Sc()}) ";"]
The new Sc() function returns `true` if the next token in the stream
is a semicolon; the semicolon is then consumed by the optional block.
If the next token is not a semicolon, but an acceptable choice for
automatic semicolon insertion (ASI), Sc() returns `false`. The
lookahead then fails, and the optional block [...] is not processed
any further. If the next token is neither a semicolon nor an
acceptable choice for ASI, a parser exception is thrown.
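The three outcomes of Sc() can be sketched in plain Java as follows.
This is only an illustration, not the code from Parser.jjt: the Tok
class and the exact ASI conditions (end of input, "}", or a preceding
newline, following ECMAScript 5, 7.9.1) are assumptions made for the
example.

```java
final class ScSketch {
    /** Hypothetical stand-in for a lexer token; not the JavaCC Token class. */
    static final class Tok {
        final String image;           // token text, "" marks end of input
        final boolean newlineBefore;  // true if a line terminator precedes the token

        Tok(String image, boolean newlineBefore) {
            this.image = image;
            this.newlineBefore = newlineBefore;
        }
    }

    /**
     * Mirrors the three outcomes described for Sc():
     * true   -> the next token is ";" and the caller consumes it,
     * false  -> no ";" but ASI applies, so the optional block is skipped,
     * throws -> neither case holds.
     */
    static boolean sc(Tok next) {
        if (next.image.equals(";")) {
            return true;
        }
        // Assumed ASI conditions: end of input, "}", or a preceding newline.
        if (next.image.isEmpty() || next.image.equals("}") || next.newlineBefore) {
            return false;
        }
        throw new IllegalStateException("Expected ';' before '" + next.image + "'");
    }

    public static void main(String[] args) {
        System.out.println(sc(new Tok(";", false)));  // explicit semicolon
        System.out.println(sc(new Tok("}", false)));  // ASI before a closing brace
        System.out.println(sc(new Tok("x", true)));   // ASI after a newline
    }
}
```

In the grammar, the `true` case corresponds to the optional block
consuming the ";" and the `false` case to the block being skipped.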
JavaCC cannot analyze the content of a javacode non-terminal, so it
could not warn about possible ambiguities in the grammar after Sc()
calls. Now that the old Sc() non-terminal is gone, I had to add
additional lookaheads to resolve these ambiguities. See the
CallExpression(), AdditiveExpression() and RelationalExpression()
productions.
LPP-9714 (no-line-terminator rule):
The ECMAScript grammar has restricted productions for the
PostfixExpression, ContinueStatement, BreakStatement, ThrowStatement
and ReturnStatement productions. In these productions, a line
terminator is not permitted at the restricted position (e.g. between
`throw` and its expression). This restriction was only implemented
for the ReturnStatement; it's now implemented for the other
productions as well. The pattern used to implement the restriction
is: [LOOKAHEAD(NonTerminal(), {NoLineTerminator()}) NonTerminal()].
This combines a syntactic and a semantic lookahead: if the next
token(s) can be parsed as NonTerminal() and there are no intervening
newlines, consume the tokens, otherwise just proceed. I've added an
extra function to produce a nice error message for the ThrowStatement
production, so it's easier for the user to understand what's going
wrong when she adds a newline after the `throw` keyword. The default
error message by JavaCC is pretty useless here:
Encountered "" at line 1, column 61. Was expecting one of: (empty list
here)
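To illustrate the restricted-production pattern, here is a small Java
sketch for a simplified "break [Identifier]" statement. It is not the
generated parser code: the Tok class, isIdentifier() and
parseBreakLabel() are hypothetical names, and the syntactic lookahead
is reduced to a simple identifier check.

```java
import java.util.Arrays;
import java.util.List;

final class RestrictedProductionSketch {
    /** Hypothetical stand-in token: text plus whether a newline precedes it. */
    static final class Tok {
        final String image;
        final boolean newlineBefore;

        Tok(String image, boolean newlineBefore) {
            this.image = image;
            this.newlineBefore = newlineBefore;
        }
    }

    /** Semantic part of the lookahead: no line terminator before the token. */
    static boolean noLineTerminator(Tok next) {
        return !next.newlineBefore;
    }

    /** Syntactic part of the lookahead, simplified to "looks like an identifier". */
    static boolean isIdentifier(Tok t) {
        return !t.image.isEmpty() && Character.isJavaIdentifierStart(t.image.charAt(0));
    }

    /**
     * Parses "break [Identifier]"; tokens.get(0) must be "break".
     * Returns the label, or null when the optional part is skipped.
     */
    static String parseBreakLabel(List<Tok> tokens) {
        if (tokens.isEmpty() || !tokens.get(0).image.equals("break")) {
            throw new IllegalArgumentException("expected 'break'");
        }
        if (tokens.size() > 1) {
            Tok next = tokens.get(1);
            // Both conditions must hold, as in LOOKAHEAD(NonTerminal(), {NoLineTerminator()}).
            if (isIdentifier(next) && noLineTerminator(next)) {
                return next.image;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // "break outer" on one line: the label belongs to the break statement.
        System.out.println(parseBreakLabel(
            Arrays.asList(new Tok("break", false), new Tok("outer", false))));
        // "break" followed by a newline and "outer": the label is not consumed.
        System.out.println(parseBreakLabel(
            Arrays.asList(new Tok("break", false), new Tok("outer", true))));
    }
}
```

When the label is not consumed, the statement ends after `break` and
the following token starts a new statement, which is where ASI comes
into play.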
LPP-9719 (do-while loop):
Just added the optional semicolon predicate for the do-while loop in
IterationStatement().
LPP-9720 (newline in multi-line comment):
Multi-line comments are now scanned for newlines. A multi-line comment
that contains a line terminator counts as a newline for the syntactic
grammar, cf. ECMAScript 5, 7.4 Comments.
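A minimal sketch of such a scan, assuming the comment text is
available as a string (the class and method names are made up for
this example; the set of line terminator characters is the one from
ECMAScript 5, 7.3):

```java
final class CommentNewlineSketch {
    /** True for the four ECMAScript 5 line terminator characters. */
    static boolean isLineTerminator(char c) {
        return c == '\n' || c == '\r' || c == '\u2028' || c == '\u2029';
    }

    /** True if the comment text contains at least one line terminator. */
    static boolean containsLineTerminator(String commentImage) {
        for (int i = 0; i < commentImage.length(); i++) {
            if (isLineTerminator(commentImage.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsLineTerminator("/* one line */"));
        System.out.println(containsLineTerminator("/* first\n second */"));
    }
}
```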
LPP-9721 (restricted production for TypeIdentifier):
TypeIdentifier was not a restricted production, but it should be,
otherwise "var x:Boolean \n !x;", where \n denotes a line break, is
not parsed as "var x:Boolean; !x;".
LPP-9722 (multi-line comment for ASI):
The (now removed) optionalSc() function only inspected the first
special token, but special tokens are chained, e.g. consider multiple
single-line and multi-line comments appearing next to each other.
Therefore it's necessary to loop over the specialToken field, just
like it's now done in NoLineTerminator().
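The loop over the chain might look roughly like this. The Tok class
is a hypothetical stand-in that only mirrors the `image` and
`specialToken` fields of a JavaCC token, and the newline test is
simplified to "\n" and "\r"; the real NoLineTerminator() may differ.

```java
final class SpecialTokenChainSketch {
    /** Hypothetical stand-in mirroring two fields of a JavaCC Token. */
    static final class Tok {
        final String image;
        final Tok specialToken;  // preceding special token (e.g. a comment), or null

        Tok(String image, Tok specialToken) {
            this.image = image;
            this.specialToken = specialToken;
        }
    }

    /** True if any special token in front of the given token contains a newline. */
    static boolean newlineBefore(Tok token) {
        // Follow the whole chain instead of looking at the first special token only.
        for (Tok t = token.specialToken; t != null; t = t.specialToken) {
            if (t.image.indexOf('\n') >= 0 || t.image.indexOf('\r') >= 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Source text: x /* a<newline> */ /* b */ y
        Tok first = new Tok("/* a\n */", null);
        Tok second = new Tok("/* b */", first);
        Tok y = new Tok("y", second);
        // The comment next to y has no newline, but the one before it does.
        System.out.println(newlineBefore(y));
    }
}
```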
Token.java:
JavaCC 5 requires a new factory method in Token: Token.newToken(int, String).
TestParserASI.java:
JUnit test case for LPP-9714, LPP-9719, LPP-9720, LPP-9721, LPP-9722
build.xml:
- update 1.4 -> 1.5
- comment out the org/openlaszlo/iv/flash/context package, as recently
discussed on laszlo-dev
- explicitly add the EDU/oswego package to the build, because my ant
sometimes doesn't pick it up
- add TestParserASI to the "test" target
Tests:
TestParserASI test case, lfc compiles, smokecheck and alldata (swf10,
dhtml)x(Firefox, Opera, Safari, IE)
(Don't forget to download and install JavaCC 5 and update your
JAVACC_HOME environment variable accordingly!)
Files:
M WEB-INF/lib/javacc.jar
M WEB-INF/lps/server/build.xml
A WEB-INF/lps/server/src/org/openlaszlo/test/TestParserASI.java
M WEB-INF/lps/server/sc/src/org/openlaszlo/sc/Parser.jjt
M WEB-INF/lps/server/sc/src/org/openlaszlo/sc/parser/Token.java
Changeset:
http://svn.openlaszlo.org/openlaszlo/patches/bargull-20110202-U3G.tar