I am working on https://issues.apache.org/jira/browse/CALCITE-5541 (upgrading JavaCC) and the Babel parser is having problems deducing whether a keyword is reserved. Investigating this, I took a look at the generated code, and found something interesting.
Here are the NonReservedKeyWord and NonReservedKeyWord0of3 methods in Babel (babel/build/javacc/javaCCMain/org/apache/calcite/sql/parser/babel/SqlBabelParserImpl.java): final public String NonReservedKeyWord() throws ParseException { if (jj_2_1116(2)) { NonReservedKeyWord0of3(); } else if (jj_2_1117(2)) { NonReservedKeyWord1of3(); } else if (jj_2_1118(2)) { NonReservedKeyWord2of3(); } else { jj_consume_token(-1); throw new ParseException(); } {if ("" != null) return unquotedIdentifier();} throw new Error("Missing return statement in function"); } /** @see #NonReservedKeyWord */ final public void NonReservedKeyWord0of3() throws ParseException { if (jj_2_1119(2)) { jj_consume_token(A); } else if (jj_2_1120(2)) { jj_consume_token(ACTION); } else if (jj_2_1121(2)) { jj_consume_token(ADMIN); ... And here are the same methods in Core (core/build/javacc/javaCCMain/org/apache/calcite/sql/parser/impl/SqlParserImpl.java): final public String NonReservedKeyWord() throws ParseException { switch ((jj_ntk==-1)?jj_ntk_f():jj_ntk) { case A: case ACTION: case ADMIN: case APPLY: ... case YEARS:{ NonReservedKeyWord0of3(); break; } case ABSENT: ... case ZONE:{ NonReservedKeyWord1of3(); break; } ... default: jj_la1[436] = jj_gen; jj_consume_token(-1); throw new ParseException(); } {if ("" != null) return unquotedIdentifier();} throw new Error("Missing return statement in function"); } /** @see #NonReservedKeyWord */ final public void NonReservedKeyWord0of3() throws ParseException { switch ((jj_ntk==-1)?jj_ntk_f():jj_ntk) { case A:{ jj_consume_token(A); break; } case ACTION:{ jj_consume_token(ACTION); break; } case ADMIN:{ jj_consume_token(ADMIN); break; } ... Both of the above are generated using JavaCC 7.0.13. Other parsers, such as Server, look similar to Core. Under JavaCC 4.0, all parsers generate a 'switch'. In all parsers we split the reserved keywords into 3 rules (0of3, 1of3, 2of3) due to the size restrictions noted in https://issues.apache.org/jira/browse/CALCITE-2405. I was puzzled why one is generating a 'switch' and the other is generating chained 'if'...'else-if's. At first I thought it was that Babel had more keywords, but some experiments eliminated that possibility. I also disproved the hypothesis that it is because Babel allows extra characters in identifiers (see https://issues.apache.org/jira/browse/CALCITE-5668). My current hypothesis is that Babel needs to use lookahead in order to determine whether a non-reserved keyword can be converted to an identifier. But whatever the reason, something seems to be very different about the Babel grammar. Given how frequently identifiers occur when parsing SQL, I would not be surprised if the Babel parser is significantly slower than the regular parser under JavaCC 7.0.13. In my opinion, that is not a bug that should prevent us from upgrading JavaCC. Especially given that JavaCC 4.0 has a performance bug that is affecting all of our parser variants. Julian