Many ANTLR Tokens

David Mollitor Thu, 09 Apr 2020 15:36:48 -0700

Hello Gang,

I am investigating HIVE-23172, but I am having trouble addressing it
because compiling the grammar fails with the following error:


hive-parser: Compilation failure
[ERROR]
/home/apache/hive/hive/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveParser.java:[40,38]
code too large

I traced it down to the fact that there are too many tokens defined.  In
HiveParser.java, there is the following:

 public static final String[] tokenNames = new String[] { ... };

That list is so long that it breaks Java compilation.  Someone else came
across this a while ago: HIVE-15577.
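
For background: the JVM limits the bytecode of any single method to 64 KB,
and javac compiles a static array initializer into the class's static
initializer method, so a long enough literal can trigger "code too large".
Below is a minimal sketch of one generic way to keep each method small by
building the array in slices. It is an illustration only, not what ANTLR
generates and not a proposed patch; the class name, the helper methods, and
the KW_EXAMPLE / TOK_EXAMPLE entries are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; this is not the code ANTLR generates.
public class TokenNamesSketch {

    // Each helper contributes one slice of the names, so no single method
    // (including the static initializer) has to hold the whole table.
    private static void addKeywordTokens(List<String> names) {
        // KW_EXAMPLE is a placeholder, not a real Hive token.
        Collections.addAll(names, "KW_TRUNCATE", "KW_EXAMPLE");
    }

    private static void addTreeTokens(List<String> names) {
        // TOK_EXAMPLE is a placeholder, not a real Hive token.
        Collections.addAll(names, "TOK_TRUNCATETABLE", "TOK_EXAMPLE");
    }

    // Assembles the full table from the slices, in a fixed order.
    public static String[] buildTokenNames() {
        List<String> names = new ArrayList<>();
        addKeywordTokens(names);
        addTreeTokens(names);
        return names.toArray(new String[0]);
    }

    // The static initializer now only contains one method call.
    public static final String[] tokenNames = buildTokenNames();

    public static void main(String[] args) {
        System.out.println(tokenNames.length + " token names");
    }
}
```

Since HiveParser.java is generated, a change like this would have to happen
in the generator or its templates, which is part of why reducing the number
of tokens looks like the simpler route.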

I observed that the parser defines two tokens for most elements, for example:

KW_TRUNCATE / TOK_TRUNCATETABLE

What is the value of having both?  Can we consolidate this down to one and
conserve some space?  I would propose just using TOK_TRUNCATE and getting
rid of the KW version.

Does anyone have any insight into why things are set up the way they are?
