There is an issue to upgrade JavaCC: 
https://issues.apache.org/jira/browse/CALCITE-5541

It may be a worthwhile effort to do it.

Mihai
________________________________
From: Cyril DE CATHEU <[email protected]>
Sent: Friday, July 5, 2024 8:55 AM
To: [email protected] <[email protected]>
Subject: Re: SqlBabelParserImpl instantiation is slow because it instantiates a 
java.lang.Error

Me again,

So this LookaheadSuccess comes from JavaCC.
It seems this issue was solved in JavaCC some time ago, from JavaCC 7.0.5:
https://github.com/javacc/javacc/pull/99
https://github.com/javacc/javacc/blob/46aa917fda0d0726690e384632e5cefae46bab68/docs/release-notes.md?plain=1#L163



On Fri, Jul 5, 2024 at 5:44 PM Cyril DE CATHEU 
<[email protected]<mailto:[email protected]>> wrote:
Had a deeper look, it seems the LookaheadSuccess instance is exploited for 
conditional branching

Eg:

try { return !jj_3_2(); }
catch(LookaheadSuccess ls) { return true; }
finally { jj_save(1, xla); }

So I guess lazy instantiation will not help because the LookaheadSuccess will 
be instantiated anyway.
If we know this error is always caught internally in the class, could we 
override the constructor to not do a call to Throwable.fillInStackTrace(); ? 
which is what makes this slow.

On Fri, Jul 5, 2024 at 10:53 AM Cyril DE CATHEU 
<[email protected]<mailto:[email protected]>> wrote:
Hey,

I'm using Calcite to parse snippets of SQL.
My (simplified) function looks like this:

public static SqlNode expressionToNode(final String sqlExpression,
    final SqlParser.Config config) {
  SqlParser sqlParser = SqlParser.create(sqlExpression, config);
  try {
    return sqlParser.parseExpression();
  } catch (SqlParseException e) {
     //do something
  }
}

and it is called many times.
The parser set in the config is Babel, so the parser instantiated is 
SqlBabelParserImpl.
A SqlBabelParserImpl is instantiated every time this function is called. From 
what I understand this parser cannot be re-used easily to parse different 
expressions but let me know if I'm incorrect.

The issue I'm encountering is the instantiation of this class is pretty slow 
because one of the instance field  jj_ls extends java.lang.Error and is 
instantiated when the SqlBabelParserImpl is instantiated.

[Screenshot 2024-07-05 at 10.37.17.png]

here is the extract of the generated code in SqlBabelParserImpl:

static private final class LookaheadSuccess extends java.lang.Error { }

final private LookaheadSuccess jj_ls = new LookaheadSuccess();

final private boolean jj_scan_token(int kind) {
  if (jj_scanpos == jj_lastpos) {
    jj_la--;
    if (jj_scanpos.next == null) {
      jj_lastpos = jj_scanpos = jj_scanpos.next = token_source.getNextToken();
    } else {
      jj_lastpos = jj_scanpos = jj_scanpos.next;
    }
  } else {
    jj_scanpos = jj_scanpos.next;
  }
  if (jj_rescan) {
    int i = 0; Token tok = token;
    while (tok != null && tok != jj_scanpos) { i++; tok = tok.next; }
    if (tok != null) jj_add_error_token(kind, i);
  }
  if (jj_scanpos.kind != kind) return true;
  if (jj_la == 0 && jj_scanpos == jj_lastpos) throw jj_ls;
  return false;
}


jj_ls is only used in an error case. (before last line of jj_scan_token)
I'm assuming the re-use of a single error instance is done on purpose,
but could we consider instantiate jj_ls lazily?

I can try to do this but I'm a bit lost in the code generation codebase so I 
would need some pointers.

Have a nice day.
--

 [StarTree] <https://startree.ai>
Cyril de Catheu
Software Engineer
+33 (684) 829-908<tel:+33+(684)+829-908>
Follow us: [RSS] <https://www.startree.ai/blogs> [LinkedIn] 
<https://www.linkedin.com/company/startreedata/> [Twitter] 
<https://twitter.com/startreedata> [Slack] <https://stree.ai/slack> [YouTube] 
<https://youtube.com/StarTreeData>

[Try StarTree Cloud 
Today]<https://get.startree.ai/startree-cloud?utm_campaign=byoc-edition-of-startree-cloud&utm_source=email&utm_content=startree-employee-email-signatures>

Reply via email to