The optimizations are the reason that SEARCH (and Sarg) exist. For the simplifier to handle all of the combinations of <, <=, >, >=, =, <>, and AND, OR, NOT is prohibitively expensive; if the same expressions are converted to Sargs they can be optimized using simple set operations.
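To make the "simple set operations" point concrete, here is a minimal sketch in Python. It is a toy model, not Calcite's Sarg or RexSimplify code: it assumes a single integer column x, represents a predicate as a sorted list of disjoint half-open ranges [lo, hi), and all of its function names (normalize, union, complement, intersect, to_predicate and the comparison builders) are made up for illustration. OR becomes union, NOT becomes complement, and AND becomes intersection, so a mixed bag of comparisons collapses to a canonical range list, which can then be translated back to plain comparisons for a backend that has no SEARCH operator.

```python
import math

INF = math.inf

def normalize(ranges):
    """Sort half-open [lo, hi) ranges, drop empty ones, merge overlapping or touching ones."""
    out = []
    for lo, hi in sorted(r for r in ranges if r[0] < r[1]):
        if out and lo <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], hi))
        else:
            out.append((lo, hi))
    return out

def union(a, b):
    """OR of two predicates."""
    return normalize(a + b)

def complement(a):
    """NOT of a predicate: the gaps between its ranges."""
    out, prev = [], -INF
    for lo, hi in normalize(a):
        if prev < lo:
            out.append((prev, lo))
        prev = hi
    if prev < INF:
        out.append((prev, INF))
    return out

def intersect(a, b):
    """AND of two predicates, via De Morgan: a AND b = NOT(NOT a OR NOT b)."""
    return complement(union(complement(a), complement(b)))

# Comparison builders. Assumption: x is an integer, so x <= c is x < c + 1.
def lt(c): return [(-INF, c)]
def le(c): return [(-INF, c + 1)]
def gt(c): return [(c + 1, INF)]
def ge(c): return [(c, INF)]
def eq(c): return [(c, c + 1)]
def ne(c): return complement(eq(c))

def to_predicate(ranges, col="x"):
    """Translate a range list back to plain comparisons (the 'expand' direction)."""
    terms = []
    for lo, hi in ranges:
        if lo == -INF and hi == INF:
            return "TRUE"
        if lo == -INF:
            terms.append(f"{col} < {hi}")
        elif hi == INF:
            terms.append(f"{col} >= {lo}")
        elif hi - lo == 1:
            terms.append(f"{col} = {lo}")
        else:
            terms.append(f"({col} >= {lo} AND {col} < {hi})")
    return " OR ".join(terms) or "FALSE"

# (x >= 1 AND x < 5) OR x = 5 OR (x > 3 AND x <> 7 AND x <= 9)
simplified = union(
    union(intersect(ge(1), lt(5)), eq(5)),
    intersect(intersect(gt(3), ne(7)), le(9)),
)
print(simplified)                # [(1, 7), (8, 10)]
print(to_predicate(simplified))  # (x >= 1 AND x < 7) OR (x >= 8 AND x < 10)
```

A contradiction such as x < 3 AND x > 5 falls out for free: intersect(lt(3), gt(5)) is the empty list, which to_predicate renders as FALSE. No pairwise rules for the individual operators are needed, which is the cost the range representation avoids.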
> On Aug 4, 2023, at 5:57 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
>
> Thanks, Julian, for your answer. Initially, I was thinking that SEARCH
> might not be essential. Since Calcite first translates it into a SEARCH
> operator, and then we can use RexUtil#expandSearch to rewrite it into an
> operator supported by Spark, it seems like doing the same job twice. So,
> instead, why not consider disabling this operator, thus avoiding the need
> to reconvert the operator back into an expression supported by other
> execution engines, such as Spark, later on? Frankly speaking, I might not
> have taken into account the potential simplifications brought by the SEARCH
> operator for optimizing the execution plan. If these simplifications
> produce a more efficient execution plan or make the optimization stage
> more efficient, then I'm open to exploring ways to implement an equivalent
> transformation within Kylin, or even exploring the possibility of creating
> a similar implementation in other execution engines like Spark.
>
> On Sat, Aug 5, 2023 at 2:57 AM Julian Hyde <jhyde.apa...@gmail.com> wrote:
>
>> I agree that it should be solved ‘by config’ but not by global config. The
>> mere fact that you are talking to Spark (i.e. using the JDBC adapter with
>> the Spark dialect) should be sufficient, right?
>>
>> Put another way. Calcite’s internal representation for expressions is what
>> it is. The fact that SEARCH is part of that representation has many
>> benefits for simplification. Just expect there to be a translation step
>> from that representation to any backend.
>>
>> Julian
>>
>>
>>> On Aug 4, 2023, at 7:22 AM, P.F. ZHAN <dethr...@gmail.com> wrote:
>>>
>>> Very nice suggestion. I wonder whether we can introduce this feature by
>>> config? Maybe it’s better for users who use more than one query engine
>>> to interpret and execute queries.
>>>
>>>
>>> On Fri, Aug 4, 2023 at 22:03 Alessandro Solimando <
>>> alessandro.solima...@gmail.com> wrote:
>>>
>>>> Hello,
>>>> as LakeShen suggests, you can take a look at RexUtil#expandSearch. You
>>>> can see it in action in the RexProgramTest tests; one example:
>>>>
>>>> https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/test/java/org/apache/calcite/rex/RexProgramTest.java#L1710-L1727
>>>>
>>>> Best regards,
>>>> Alessandro
>>>>
>>>> On Fri, 4 Aug 2023 at 15:45, LakeShen <shenleifight...@gmail.com> wrote:
>>>>
>>>>> Hi P.F. ZHAN, Calcite has a method, RexUtil#expandSearch, to expand
>>>>> Search; maybe you could get some information from this method.
>>>>>
>>>>> There is also some logic to simplify Search in the
>>>>> RexSimplify#simplifySearch method. I hope this helps.
>>>>> Here's the code:
>>>>> 1. https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L593
>>>>> 2. https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L2132
>>>>>
>>>>> Best,
>>>>> LakeShen
>>>>>
>>>>> On Fri, Aug 4, 2023 at 20:29, Soumyadeep Mukhopadhyay
>>>>> <soumyamy...@gmail.com> wrote:
>>>>>
>>>>>> Thank you, shall explore more on this! :)
>>>>>>
>>>>>>
>>>>>> On Fri, 4 Aug 2023 at 5:53 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
>>>>>>
>>>>>>> Aha, I'm using Apache Kylin, which uses Calcite to generate a logical
>>>>>>> plan and then converts it to a Spark plan to execute a query. Given
>>>>>>> that Calcite has more operations for aggregations, and Kylin wants to
>>>>>>> take full advantage of precomputed cubes (something like Calcite's
>>>>>>> materialized views), it uses both Calcite and Spark (for distributed
>>>>>>> computing). Maybe it's wild and a little fun, but it does work well
>>>>>>> in many scenarios.
>>>>>>>
>>>>>>> On Fri, Aug 4, 2023 at 8:10 PM Soumyadeep Mukhopadhyay <
>>>>>>> soumyamy...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I am curious about your use case. Are you not losing out on the
>>>>>>>> optimisations of Calcite when you are using Spark? Is it possible
>>>>>>>> for you to share a general approach where we will be able to keep
>>>>>>>> the optimisations done by Calcite and use Spark on top of it?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, 4 Aug 2023 at 5:19 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Generally speaking, the SEARCH operator is very good, but when we
>>>>>>>>> use Calcite to optimize the logical plan and then use Spark to
>>>>>>>>> execute it, the operator is unsupported. So is there a more
>>>>>>>>> elegant way to disable the SEARCH operator? Or how can we convert
>>>>>>>>> the SEARCH operator to the IN operator before converting the
>>>>>>>>> Calcite logical plan to the Spark logical plan? If we do this, we
>>>>>>>>> need to consider Join / Filter; are there any other RelNodes?
>>>>>>>>>
>>>>>>>>> Maybe it would be better to make this optimization optional for
>>>>>>>>> now, since many query execution engines do not support this
>>>>>>>>> operator?