Hi,

> However, If prefixPath is not a leaf node, a StringBuilder will be
created instead of reference access.

In your example, prefixPath is a leaf node, is that right?
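
As a rough illustration of the behaviour in question, here is a simplified model of how getText() works on a parse tree in ANTLR 4, as far as I understand it. This is a sketch and not ANTLR's actual code: Node, Leaf, Rule and GetTextSketch are hypothetical stand-ins for the runtime's parse-tree classes. A leaf returns the token text it already holds, while a rule node with children concatenates its children's text through a new StringBuilder on every call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse-tree node; not ANTLR's real interface.
interface Node {
    String getText();
}

// Leaf: hands back the string it already holds, with no new allocation.
final class Leaf implements Node {
    private final String text;

    Leaf(String text) {
        this.text = text;
    }

    @Override
    public String getText() {
        return text;
    }
}

// Inner rule node: rebuilds its text from the children on every call.
final class Rule implements Node {
    private final List<Node> children = new ArrayList<>();

    Rule add(Node child) {
        children.add(child);
        return this;
    }

    @Override
    public String getText() {
        if (children.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder(); // fresh builder per call
        for (Node child : children) {
            sb.append(child.getText()); // recurses into nested rules
        }
        return sb.toString();
    }
}

final class GetTextSketch {
    public static void main(String[] args) {
        // Models a path such as root.sg1, assumed here to be three tokens.
        Rule prefixPath = new Rule()
            .add(new Leaf("root"))
            .add(new Leaf("."))
            .add(new Leaf("sg1"));
        System.out.println(prefixPath.getText());
    }
}
```

If the same non-leaf context is asked for its text many times, reading it once and reusing the result would avoid the repeated builders.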

Maybe the bad performance comes from how the API is being called rather than
from ANTLR 4 itself. Can we run some tests? E.g., implement just one or two
grammar rules with both ANTLR 3 and ANTLR 4 and compare the performance?
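
A rough shape for such a comparison, as a sketch only: ParseBench and averageNanos are made-up names, and the two Runnables are placeholders that would have to be replaced with calls into the ANTLR 3 and ANTLR 4 parsers for the same statement (that wiring is not shown here). The example statement is only assumed from the setStorageGroup rule. A serious measurement would probably use a benchmarking harness such as JMH rather than System.nanoTime().

```java
final class ParseBench {
    // Runs the task `warmup` times untimed, then returns the mean
    // wall-clock nanoseconds per call over `iterations` timed runs.
    static double averageNanos(Runnable task, int warmup, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / (double) iterations;
    }

    public static void main(String[] args) {
        // Example statement, assumed from the setStorageGroup rule.
        String sql = "SET STORAGE GROUP TO root.sg1";

        // Placeholders: replace with calls into the v3 and v4 parsers.
        Runnable parseWithV3 = () -> sql.length();
        Runnable parseWithV4 = () -> sql.toLowerCase();

        System.out.println("v3 ns/op: " + averageNanos(parseWithV3, 1000, 10000));
        System.out.println("v4 ns/op: " + averageNanos(parseWithV4, 1000, 10000));
    }
}
```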

By the way, I noticed that Calcite uses JavaCC...

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


康愈圆 <[email protected]> wrote on Mon, Sep 9, 2019 at 11:43 AM:

> Hi,
>
> Yes, the antlr3.g file has the same detailed definitions. However, ANTLR v3
> allows users to explicitly define the structure of the tree.
>
> For example,
>
> setStorageGroup
>   : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath
>   -> ^(TOK_SET ^(TOK_STORAGEGROUP prefixPath))
>   ;
>
> the structure of the tree is like:
>
>             'SET'
>               |
>         'STORAGEGROUP'
>               |
>          prefixPath
>
> The prefixPath is itself a subtree. Users can analyse the AST recursively
> with a function like analyze(prefixPath), and data are accessed by reference.
>
> However, ANTLR v4 no longer supports the '->' operator, so the statement for
> setting a storage group is defined as
>
> setStorageGroup
>   : KW_SET KW_STORAGE KW_GROUP KW_TO prefixPath
>   ;
>
> If we need the string value of prefixPath, we can call
> prefixPath.getText(), which is clearer and more direct for developers.
> However, if prefixPath is not a leaf node, a StringBuilder is created
> instead of the data being accessed by reference. Although operations on a
> StringBuilder are faster than on a String, creating StringBuilders too
> frequently is a heavy overhead, which cancels out the benefit and can even
> reduce overall performance.
>
> Currently, I think this is what leads to the problem.
>
> Best,
> ---------------------
> Yuyuan KANG
>
>
>
> > -----Original Message-----
> > From: "Xiangdong Huang" <[email protected]>
> > Sent: 2019-09-09 00:08:00 (Monday)
> > To: [email protected]
> > Cc:
> > Subject: Re: [jira] [Created] (IOTDB-201) Query parsing runs slower when
> > using ANTLR v4
> >
> > Hi,
> >
> > > There are some grammar definitions that are too detailed, such as
> > > decimal numbers, which are categorized into many types. I think making
> > > the rules more general may decrease the times of calling getText()
> > > method.
> >
> > One question, does the antlr3.g file have the same detailed definition,
> > e.g., the decimal numbers?
> >
> > Best,
> >
> > -----------------------------------
> > Xiangdong Huang
> > School of Software, Tsinghua University
> >
> >  黄向东
> > 清华大学 软件学院
> >
> >
> > 康愈圆 <[email protected]> wrote on Thu, Sep 5, 2019 at 11:11 PM:
> >
> > > Hi,
> > >
> > > I've been working on JIRA issue [IOTDB-190 switch to ANTLR v4] these
> > > days.
> > >
> > > I implemented the SQL parsing module. However, it seems that parsing
> > > efficiency drops a lot when using ANTLR v4.
> > >
> > > It turns out that RuleContext.getText() is called frequently and takes
> > > more than 90% of the CPU time.
> > >
> > > The grammar definition (.g4 file) here is a continuation of the previous
> > > version (ANTLR v3). There are some grammar definitions that are too
> > > detailed, such as decimal numbers, which are categorized into many types.
> > > I think making the rules more general may reduce the number of calls to
> > > getText().
> > >
> > > I plan to restructure the grammar definition to improve parsing
> > > efficiency.
> > >
> > > ----
> > > Yuyuan KANG
> > >
> > > On 2019-09-06 13:30:00, Yuyuan KANG (Jira) <[email protected]> wrote:
> > > > Yuyuan KANG created IOTDB-201:
> > > > ---------------------------------
> > > >
> > > >              Summary: Query parsing runs slower when using ANTLR v4
> > > >                  Key: IOTDB-201
> > > >                  URL: https://issues.apache.org/jira/browse/IOTDB-201
> > > >              Project: Apache IoTDB
> > > >           Issue Type: Improvement
> > > >             Reporter: Yuyuan KANG
> > > >
> > > >
> > > > The system now uses ANTLR v3. When it is migrated to ANTLR v4 with the
> > > > previous grammar definition, experimental results show that the
> > > > efficiency of logical plan generation is negatively impacted.
> > > >
> > > >
> > > >
> > > > --
> > > > This message was sent by Atlassian Jira
> > > > (v8.3.2#803003)
> > >
> > >
>
