Thanks, Mihai and Julian, for your replies. I will give it a try. I also think we can include non-standard operators in the Babel module.
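
To make the idea concrete for myself, here is a minimal sketch of the approach of constructing ">>" from consecutive ">" tokens, written in plain Java rather than JavaCC. All names here (`ShiftSketch`, `Tok`, `lex`, `isRightShiftAt`, `typeNestingDepth`) are hypothetical and are not Calcite or JavaCC APIs. The assumption is only that the lexer never emits a ">>" token, and that the parser treats two directly adjacent ">" tokens as a shift where it expects a binary operator.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not Calcite or JavaCC code. */
final class ShiftSketch {
  /** A token together with its start offset in the input. */
  static final class Tok {
    final String text;
    final int pos;
    Tok(String text, int pos) { this.text = text; this.pos = pos; }
  }

  /** Toy lexer: words/numbers are one token, every other character
   *  (including each '>') is its own token. It never emits ">>". */
  static List<Tok> lex(String s) {
    List<Tok> out = new ArrayList<>();
    int i = 0;
    while (i < s.length()) {
      char c = s.charAt(i);
      if (Character.isWhitespace(c)) {
        i++;
      } else if (Character.isLetterOrDigit(c) || c == '_') {
        int j = i;
        while (j < s.length()
            && (Character.isLetterOrDigit(s.charAt(j)) || s.charAt(j) == '_')) {
          j++;
        }
        out.add(new Tok(s.substring(i, j), i));
        i = j;
      } else {
        out.add(new Tok(String.valueOf(c), i));
        i++;
      }
    }
    return out;
  }

  /** Parser-side check, meant to be called only where a binary operator
   *  is expected: two '>' tokens with no gap between them form ">>". */
  static boolean isRightShiftAt(List<Tok> toks, int i) {
    return i + 1 < toks.size()
        && toks.get(i).text.equals(">")
        && toks.get(i + 1).text.equals(">")
        && toks.get(i + 1).pos == toks.get(i).pos + 1;
  }

  /** In a type context each '>' closes one level of '<' on its own,
   *  so "map<int, map<int, int>>" balances back to depth 0. */
  static int typeNestingDepth(List<Tok> toks) {
    int depth = 0;
    for (Tok t : toks) {
      if (t.text.equals("<")) depth++;
      if (t.text.equals(">")) depth--;
    }
    return depth;
  }

  public static void main(String[] args) {
    System.out.println(isRightShiftAt(lex("a >> 2"), 1));   // adjacent '>' tokens
    System.out.println(isRightShiftAt(lex("a > > 2"), 1));  // separated by a space
    System.out.println(typeNestingDepth(
        lex("map<varchar multiset, map<int, int>>")));
  }
}
```

The point of the design is that the decision is made in the parser from context, so the lexer does not need to switch states. A real JavaCC grammar would express the adjacency check differently; this only shows the logic.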
Hongyu Guo

On 2023/10/12 18:06:52 Julian Hyde wrote:
> As Mihai says, the choice is between lexical states and constructing ‘>>’ from consecutive ‘>’ tokens. I strongly prefer the latter.
>
> (We use lexical states in Calcite for comments and to accommodate different quoting styles, but the parser does not ask the lexer to switch states, because that introduces coupling between two components that should be separate.)
>
> > On Oct 12, 2023, at 10:59 AM, <mb...@gmail.com> <mb...@gmail.com> wrote:
> >
> > This is a notorious problem in C++-like languages.
> > There are possible workarounds, depending on the parser generator you are using.
> > In the P4 compiler, which does parsing using flex and bison, we essentially never parse ">>" as a token [1], but we reconstruct it from two adjacent ">" when used within an expression. This works for any number of ">". I am not familiar enough with the JavaCC generator used by Calcite to guess exactly how it should be written. The P4 compiler relies on two features of flex: lookahead [2] and parser states (introduced in this commit [3], used to change the precedence of the "<" operator based on context).
> >
> > [1] https://github.com/p4lang/p4c/blob/d79e2e8bfa07c7797891d44b7d084910947bf0a7/frontends/parsers/p4/p4parser.ypp#L212
> > [2] https://github.com/p4lang/p4c/blob/d79e2e8bfa07c7797891d44b7d084910947bf0a7/frontends/parsers/p4/p4lexer.ll#L300
> > [3] https://github.com/p4lang/p4c/commit/49c4c651a2c00fb89814cae311b34ba42dbf29d9
> >
> > Mihai
> >
> > -----Original Message-----
> > From: Julian Hyde <jh...@gmail.com>
> > Sent: Thursday, October 12, 2023 8:19 AM
> > To: dev@calcite.apache.org
> > Subject: Re: Question about bitwise right operator
> >
> > The problem is that you are making the language ambiguous. To get your desired behavior you probably need to make the lexical analyzer less eager. That’s a hard thing to do, because you probably need to adjust its behavior based on the syntactic context - whether the parser thinks that it is parsing the name of a type.
> >
> > Every Java parser has to solve the same problem, so you should see how they solve it.
> >
> > I question whether we want SQL to look like Java. This includes whether we use < and > for parameterized types, and whether we include every possible operator such as >> and &&.
> >
> > Julian
> >
> >> On Oct 12, 2023, at 02:13, Hongyu Guo <gu...@gmail.com> wrote:
> >>
> >> Hi devs,
> >>
> >> I want to add some new bitwise operators to the Babel parser, specifically
> >> the left shift "<<" and the right shift ">>".
> >>
> >> To do this, I added `"< BITWISE_RIGHT_SHIFT: \">>\" >"` to
> >> binaryOperatorsTokens in babel/src/main/codegen/config.fmpp and a new
> >> SqlBinaryOperator to
> >> core/src/main/java/org/apache/calcite/sql/fun/SqlLibraryOperators.java,
> >> similar to what was done in CALCITE-4980 [1].
> >>
> >> But I have noticed that this change causes an error for SQL like
> >> `select cast(a as map<varchar multiset, map<int, int>>)`, because
> >> the SQL contains the string `>>`. I would like to know how to avoid this error.
> >> Thanks.
> >>
> >> [1] https://issues.apache.org/jira/browse/CALCITE-4980
> >>
> >> Best,
> >> hongyu guo