Alternatively, you could post-process the results after all the optimizations are performed. It is not a great deal of work to write a recursive Rel/Rex Shuttle.
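To make the post-processing idea concrete, here is a minimal, self-contained sketch in Java. It deliberately does not use Calcite's classes: Expr, Leaf, Call, Search, Rel, Shuttle and ExpandSearchShuttle are hypothetical toy stand-ins for RexNode, RexCall, a SEARCH call, RelNode and RexShuttle, and the toy Sarg holds only discrete points, not ranges. In a real project the rewrite step would presumably delegate to RexUtil#expandSearch (mentioned further down this thread) rather than build the OR by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy expression node; a stand-in for RexNode, not the Calcite API. */
abstract class Expr {
  abstract Expr accept(Shuttle shuttle);
}

/** Leaf: a column reference or literal, rendered verbatim. */
final class Leaf extends Expr {
  final String text;
  Leaf(String text) { this.text = text; }
  @Override Expr accept(Shuttle shuttle) { return shuttle.visitLeaf(this); }
  @Override public String toString() { return text; }
}

/** Operator call such as AND, OR, =, >. */
final class Call extends Expr {
  final String op;
  final List<Expr> operands;
  Call(String op, Expr... operands) { this(op, Arrays.asList(operands)); }
  Call(String op, List<Expr> operands) { this.op = op; this.operands = operands; }
  @Override Expr accept(Shuttle shuttle) { return shuttle.visitCall(this); }
  @Override public String toString() {
    List<String> parts = new ArrayList<>();
    for (Expr e : operands) parts.add(e.toString());
    return "(" + String.join(" " + op + " ", parts) + ")";
  }
}

/** Toy SEARCH(ref, Sarg[points]); a real Sarg also carries ranges. */
final class Search extends Expr {
  final Expr ref;
  final List<String> points;
  Search(Expr ref, List<String> points) { this.ref = ref; this.points = points; }
  @Override Expr accept(Shuttle shuttle) { return shuttle.visitSearch(this); }
  @Override public String toString() { return "SEARCH(" + ref + ", Sarg" + points + ")"; }
}

/** Recursive shuttle: by default it rebuilds the tree unchanged. */
class Shuttle {
  Expr visitLeaf(Leaf leaf) { return leaf; }
  Expr visitCall(Call call) {
    List<Expr> out = new ArrayList<>();
    for (Expr e : call.operands) out.add(e.accept(this));
    return new Call(call.op, out);
  }
  Expr visitSearch(Search s) { return new Search(s.ref.accept(this), s.points); }
}

/** Post-processing pass: every SEARCH becomes an OR of equalities. */
final class ExpandSearchShuttle extends Shuttle {
  @Override Expr visitSearch(Search s) {
    Expr ref = s.ref.accept(this);
    List<Expr> terms = new ArrayList<>();
    for (String p : s.points) terms.add(new Call("=", ref, new Leaf(p)));
    if (terms.isEmpty()) return new Leaf("FALSE"); // empty Sarg matches nothing
    return terms.size() == 1 ? terms.get(0) : new Call("OR", terms);
  }
}

/** Toy relational node (stand-in for RelNode): optional condition plus inputs. */
final class Rel {
  final String kind;
  final Expr condition; // null when the node has no condition (e.g. a scan)
  final List<Rel> inputs;
  Rel(String kind, Expr condition, Rel... inputs) { this(kind, condition, Arrays.asList(inputs)); }
  Rel(String kind, Expr condition, List<Rel> inputs) {
    this.kind = kind; this.condition = condition; this.inputs = inputs;
  }
  /** Applies the expression shuttle to this node and, recursively, to all inputs. */
  Rel rewrite(Shuttle shuttle) {
    List<Rel> newInputs = new ArrayList<>();
    for (Rel input : inputs) newInputs.add(input.rewrite(shuttle));
    Expr newCondition = condition == null ? null : condition.accept(shuttle);
    return new Rel(kind, newCondition, newInputs);
  }
  @Override public String toString() {
    StringBuilder sb = new StringBuilder(kind);
    if (condition != null) sb.append("[").append(condition).append("]");
    if (!inputs.isEmpty()) {
      List<String> parts = new ArrayList<>();
      for (Rel r : inputs) parts.add(r.toString());
      sb.append("(").append(String.join(", ", parts)).append(")");
    }
    return sb.toString();
  }
}

class ExpandSearchDemo {
  public static void main(String[] args) {
    Expr cond = new Call("AND",
        new Search(new Leaf("x"), Arrays.asList("1", "3")),
        new Call(">", new Leaf("y"), new Leaf("0")));
    Rel plan = new Rel("Filter", cond, new Rel("Scan", null));
    System.out.println("before: " + plan);
    System.out.println("after:  " + plan.rewrite(new ExpandSearchShuttle()));
  }
}
```

Because the pass walks the condition of every node it meets, Filter and Join (or any other node that carries an expression) need no special cases; that is the attraction of running it once, after all optimizations are done.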
James

On Tue, Aug 8, 2023 at 7:12 PM LakeShen <shenleifight...@gmail.com> wrote:

> From Calcite's point of view, Calcite is not bound to any engine and is
> not aware of the specific implementation of the underlying engine. As
> Julian said, SEARCH (and Sarg) was introduced for Calcite's internal
> optimization.
>
> In our project, we also do some extra processing for SEARCH (and Sarg).
> Maybe we could add a configuration parameter to control whether the
> SEARCH (and Sarg) RexNode is produced. For example, we could add a
> configuration parameter to SqlToRelConverter for users who don't want
> SEARCH (and Sarg) expressions; if it is false, the SqlNode-to-RelNode
> conversion would not produce the SEARCH (and Sarg) RexNode.
>
> Best,
> LakeShen
>
> Julian Hyde <jhyde.apa...@gmail.com> wrote on Wed, Aug 9, 2023 at 04:28:
>
> > The optimizations are the reason that SEARCH (and Sarg) exist. For the
> > simplifier to handle all of the combinations of <, <=, >, >=, =, <>, and
> > AND, OR, NOT is prohibitively expensive; if the same expressions are
> > converted to Sargs they can be optimized using simple set operations.
> >
> > > On Aug 4, 2023, at 5:57 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
> > >
> > > Thanks, Julian, for your answer. Initially, I was thinking that SEARCH
> > > might not be essential. Since Calcite first translates the expression
> > > into a SEARCH operator, and then we can use RexUtil#expandSearch to
> > > rewrite it into an operator supported by Spark, it seems like doing the
> > > same job twice. So, instead, why not consider disabling this operator,
> > > thus avoiding the need to convert it back later into an expression
> > > supported by other execution engines, such as Spark? Frankly speaking,
> > > I might not have taken into account the potential simplifications
> > > brought by the SEARCH operator for optimizing the execution plan.
> > > If these simplifications produce a more efficient execution plan or
> > > make the optimization stage more efficient, then I'm open to exploring
> > > ways to implement an equivalent transformation within Kylin, or even
> > > exploring the possibility of creating a similar implementation in other
> > > execution engines like Spark.
> > >
> > > On Sat, Aug 5, 2023 at 2:57 AM Julian Hyde <jhyde.apa...@gmail.com> wrote:
> > >
> > >> I agree that it should be solved ‘by config’ but not by global config.
> > >> The mere fact that you are talking to Spark (i.e. using the JDBC adapter
> > >> with the Spark dialect) should be sufficient, right?
> > >>
> > >> Put another way: Calcite’s internal representation for expressions is
> > >> what it is. The fact that SEARCH is part of that representation has many
> > >> benefits for simplification. Just expect there to be a translation
> > >> step from that representation to any backend.
> > >>
> > >> Julian
> > >>
> > >>> On Aug 4, 2023, at 7:22 AM, P.F. ZHAN <dethr...@gmail.com> wrote:
> > >>>
> > >>> Very nice suggestion. I wonder, can we introduce this feature by
> > >>> config? Maybe that is better for users who use more than one query
> > >>> engine to interpret and execute queries.
> > >>>
> > >>> On Fri, Aug 4, 2023 at 22:03 Alessandro Solimando <alessandro.solima...@gmail.com> wrote:
> > >>>
> > >>>> Hello,
> > >>>> as LakeShen suggests, you can take a look at RexUtil#expandSearch.
> > >>>> You can see it in action in the RexProgramTest tests; one example:
> > >>>>
> > >>>> https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/test/java/org/apache/calcite/rex/RexProgramTest.java#L1710-L1727
> > >>>>
> > >>>> Best regards,
> > >>>> Alessandro
> > >>>>
> > >>>> On Fri, 4 Aug 2023 at 15:45, LakeShen <shenleifight...@gmail.com> wrote:
> > >>>>
> > >>>>> Hi P.F. ZHAN, Calcite has a method, RexUtil#expandSearch, that
> > >>>>> expands SEARCH; maybe you could get some information from it.
> > >>>>>
> > >>>>> There is also some logic to simplify SEARCH in the
> > >>>>> RexSimplify#simplifySearch method. I hope this helps.
> > >>>>> Here's the code:
> > >>>>> 1. https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L593
> > >>>>> 2. https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L2132
> > >>>>>
> > >>>>> Best,
> > >>>>> LakeShen
> > >>>>>
> > >>>>> Soumyadeep Mukhopadhyay <soumyamy...@gmail.com> wrote on Fri, Aug 4, 2023 at 20:29:
> > >>>>>
> > >>>>>> Thank you, shall explore more on this! :)
> > >>>>>>
> > >>>>>> On Fri, 4 Aug 2023 at 5:53 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> Aha, I'm using Apache Kylin, which uses Calcite to generate a
> > >>>>>>> logical plan and then converts it to a Spark plan to execute a query.
> > >>>>>>> Given that Calcite has more operations for aggregations, and Kylin
> > >>>>>>> wants to take full advantage of precomputed cubes (something like
> > >>>>>>> Calcite's materialized views), it uses both Calcite and Spark (for
> > >>>>>>> distributed computing). Maybe it's wild and a little fun, but it
> > >>>>>>> does work well in many scenarios.
> > >>>>>>>
> > >>>>>>> On Fri, Aug 4, 2023 at 8:10 PM Soumyadeep Mukhopadhyay <soumyamy...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> I am curious about your use case. Are you not losing out on the
> > >>>>>>>> optimisations of Calcite when you are using Spark? Is it possible
> > >>>>>>>> for you to share a general approach where we will be able to keep
> > >>>>>>>> the optimisations done by Calcite and use Spark on top of it?
> > >>>>>>>>
> > >>>>>>>> On Fri, 4 Aug 2023 at 5:19 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Generally speaking, the SEARCH operator is very good, but when we
> > >>>>>>>>> use Calcite to optimize the logical plan and then use Spark to
> > >>>>>>>>> execute it, the operator is unsupported. So is there a more
> > >>>>>>>>> elegant way to disable the SEARCH operator? Or how can we convert
> > >>>>>>>>> the SEARCH operator to the IN operator before converting the
> > >>>>>>>>> Calcite logical plan to the Spark logical plan? If we do this, we
> > >>>>>>>>> need to consider Join / Filter; are there any other RelNodes?
> > >>>>>>>>>
> > >>>>>>>>> Maybe it would be better to make this optimization optional for
> > >>>>>>>>> now, since many query execution engines do not support this
> > >>>>>>>>> operator?