Re: [Discussion] Can we forbidden SEARCH operator when use other execution engine?

P.F. ZHAN Wed, 09 Aug 2023 06:53:07 -0700

Thanks. I am planning to go through this operator and apply your
suggestions to my use case.
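
For illustration, the post-processing idea quoted below (walk the optimized
plan recursively and rewrite each SEARCH back into plain comparisons) can be
sketched on a toy expression tree. This is only a sketch under stated
assumptions: Expr, Leaf, Call, Search and expand are hypothetical stand-ins
and not Calcite classes. In Calcite itself the rewrite would presumably live
in a RexShuttle that delegates to RexUtil#expandSearch, and this toy only
handles point values, not ranges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SearchExpansionSketch {
    // Hypothetical toy expression tree; NOT Calcite's RexNode hierarchy.
    interface Expr {}

    // A column reference or literal, kept as plain text.
    static final class Leaf implements Expr {
        final String text;
        Leaf(String text) { this.text = text; }
        @Override public String toString() { return text; }
    }

    // An operator applied to operands, printed infix.
    static final class Call implements Expr {
        final String op;
        final List<Expr> operands;
        Call(String op, List<Expr> operands) { this.op = op; this.operands = operands; }
        @Override public String toString() {
            List<String> parts = new ArrayList<>();
            for (Expr e : operands) parts.add(e.toString());
            return "(" + String.join(" " + op + " ", parts) + ")";
        }
    }

    // SEARCH(ref, points): true when ref equals any of the listed points.
    static final class Search implements Expr {
        final Expr ref;
        final List<Integer> points;
        Search(Expr ref, List<Integer> points) { this.ref = ref; this.points = points; }
        @Override public String toString() { return "SEARCH(" + ref + ", " + points + ")"; }
    }

    // Recursively rewrites every Search node into an OR of equality tests.
    static Expr expand(Expr e) {
        if (e instanceof Search) {
            Search s = (Search) e;
            List<Expr> tests = new ArrayList<>();
            for (Integer p : s.points) {
                tests.add(new Call("=", Arrays.asList(s.ref, new Leaf(p.toString()))));
            }
            if (tests.isEmpty()) return new Leaf("FALSE"); // empty set matches nothing
            return tests.size() == 1 ? tests.get(0) : new Call("OR", tests);
        }
        if (e instanceof Call) {
            Call c = (Call) e;
            List<Expr> rewritten = new ArrayList<>();
            for (Expr operand : c.operands) rewritten.add(expand(operand));
            return new Call(c.op, rewritten);
        }
        return e; // leaves are returned unchanged
    }

    public static void main(String[] args) {
        Expr filter = new Call("AND", Arrays.asList(
            new Search(new Leaf("a"), Arrays.asList(1, 3)),
            new Call(">", Arrays.asList(new Leaf("b"), new Leaf("0")))));
        System.out.println(expand(filter));
    }
}
```

Running main prints (((a = 1) OR (a = 3)) AND (b > 0)). Because the rewrite
runs after optimization, the optimizer still benefits from SEARCH, and only
the final plan handed to the execution engine is free of it.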



On Wed, Aug 9, 2023 at 11:46 James Starr <jamesst...@gmail.com> wrote:

> Alternatively, you could post-process the results after all the
> optimizations are performed.  It is not a great deal of work to write a
> recursive Rel/Rex Shuttle.
>
> James
>
> On Tue, Aug 8, 2023 at 7:12 PM LakeShen <shenleifight...@gmail.com> wrote:
>
> > From Calcite's point of view, Calcite is not bound to any engine and is
> > not aware of the specific implementation of the underlying engine. As
> > Julian said, SEARCH (and Sarg) was introduced for Calcite's internal
> > optimization.
> >
> > In our project, we also do some extra processing for SEARCH (and Sarg).
> > Maybe we could add a configuration parameter that controls whether
> > SEARCH (and Sarg) RexNodes are generated. For example, we could add a
> > configuration parameter to SqlToRelConverter for users who do not want
> > SEARCH (and Sarg) expressions: if it is false, the SqlNode-to-RelNode
> > conversion would not produce SEARCH (and Sarg) RexNodes.
> >
> > Best, LakeShen
> >
> > Julian Hyde <jhyde.apa...@gmail.com> wrote on Wed, Aug 9, 2023 at 04:28:
> >
> > > The optimizations are the reason that SEARCH (and Sarg) exist. For the
> > > simplifier to handle all of the combinations of <, <=, >, >=, =, <>,
> > > and AND, OR, NOT is prohibitively expensive; if the same expressions
> > > are converted to Sargs they can be optimized using simple set
> > > operations.
> > >
> > >
> > > > On Aug 4, 2023, at 5:57 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
> > > >
> > > > Thanks, Julian, for your answer. Initially, I was thinking that
> > > > SEARCH might not be essential. Since Calcite first translates the
> > > > expression into a SEARCH operator, and then we can use
> > > > RexUtil#expandSearch to rewrite it into an operator supported by
> > > > Spark, it seems like doing the same job twice. So, instead, why not
> > > > consider disabling this operator, thus avoiding the need to
> > > > reconvert the operator back into an expression supported by other
> > > > execution engines, such as Spark, later on? Frankly speaking, I
> > > > might not have taken into account the potential simplifications
> > > > brought by the SEARCH operator for optimizing the execution plan. If
> > > > these simplifications produce a more efficient execution plan or
> > > > make the optimization stage more efficient, then I'm open to
> > > > exploring ways to implement an equivalent transformation within
> > > > Kylin, or even exploring the possibility of creating a similar
> > > > implementation in other execution engines like Spark.
> > > >
> > > > On Sat, Aug 5, 2023 at 2:57 AM Julian Hyde <jhyde.apa...@gmail.com> wrote:
> > > >
> > > >> I agree that it should be solved ‘by config’ but not by global
> > > >> config. The mere fact that you are talking to Spark (i.e. using the
> > > >> JDBC adapter with the Spark dialect) should be sufficient, right?
> > > >>
> > > >> Put another way. Calcite’s internal representation for expressions
> > > >> is what it is. The fact that SEARCH is part of that representation
> > > >> has many benefits for simplification. Just expect there to be a
> > > >> translation step from that representation to any backend.
> > > >>
> > > >> Julian
> > > >>
> > > >>
> > > >>> On Aug 4, 2023, at 7:22 AM, P.F. ZHAN <dethr...@gmail.com> wrote:
> > > >>>
> > > >>> Very nice suggestion. I wonder whether we can introduce this
> > > >>> feature via config? That may be better for users who use more than
> > > >>> one query engine to interpret and execute queries.
> > > >>>
> > > >>>
> > > >>> On Fri, Aug 4, 2023 at 22:03 Alessandro Solimando <
> > > >>> alessandro.solima...@gmail.com> wrote:
> > > >>>
> > > >>>> Hello,
> > > >>>> as LakeShen suggests, you can take a look at RexUtil#expandSearch.
> > > >>>> You can see it in action in the RexProgramTest tests; one example:
> > > >>>>
> > > >>>> https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/test/java/org/apache/calcite/rex/RexProgramTest.java#L1710-L1727
> > > >>>>
> > > >>>> Best regards,
> > > >>>> Alessandro
> > > >>>>
> > > >>>> On Fri, 4 Aug 2023 at 15:45, LakeShen <shenleifight...@gmail.com> wrote:
> > > >>>>
> > > >>>>> Hi P.F. ZHAN, Calcite has a method, RexUtil#expandSearch, that
> > > >>>>> expands SEARCH; you may find it useful.
> > > >>>>>
> > > >>>>> There is also some logic to simplify SEARCH in the
> > > >>>>> RexSimplify#simplifySearch method. I hope this helps.
> > > >>>>> Here's the code:
> > > >>>>> 1. https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L593
> > > >>>>> 2. https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L2132
> > > >>>>>
> > > >>>>> Best,
> > > >>>>> LakeShen
> > > >>>>>
> > > >>>>> Soumyadeep Mukhopadhyay <soumyamy...@gmail.com> wrote on Fri, Aug 4, 2023 at 20:29:
> > > >>>>>
> > > >>>>>> Thank you, shall explore more on this! :)
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Fri, 4 Aug 2023 at 5:53 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>>> Aha, I'm using Apache Kylin, which uses Calcite to generate a
> > > >>>>>>> logical plan and then converts it to a Spark plan to execute
> > > >>>>>>> the query. Given that Calcite has more operations for
> > > >>>>>>> aggregations, and Kylin wants to take full advantage of
> > > >>>>>>> precomputed cubes (something like Calcite's materialized
> > > >>>>>>> views), it uses both Calcite and Spark (for distributed
> > > >>>>>>> computing). Maybe it's wild and a little fun, but it does work
> > > >>>>>>> well in many scenarios.
> > > >>>>>>>
> > > >>>>>>> On Fri, Aug 4, 2023 at 8:10 PM Soumyadeep Mukhopadhyay <
> > > >>>>>>> soumyamy...@gmail.com> wrote:
> > > >>>>>>>
> > > >>>>>>>> I am curious about your use case. Are you not losing out on
> > > >>>>>>>> the optimisations of Calcite when you are using Spark? Is it
> > > >>>>>>>> possible for you to share a general approach where we will be
> > > >>>>>>>> able to keep the optimisations done by Calcite and use Spark
> > > >>>>>>>> on top of it?
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Fri, 4 Aug 2023 at 5:19 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Generally speaking, the SEARCH operator is very good, but
> > > >>>>>>>>> when we use Calcite to optimize the logical plan and then
> > > >>>>>>>>> use Spark to execute it, SEARCH is unsupported. So is there
> > > >>>>>>>>> a more elegant way to disable the SEARCH operator? Or how
> > > >>>>>>>>> can we convert the SEARCH operator to the IN operator before
> > > >>>>>>>>> converting the Calcite logical plan to the Spark logical
> > > >>>>>>>>> plan? If we do this, we need to consider Join / Filter; are
> > > >>>>>>>>> there any other RelNodes?
> > > >>>>>>>>>
> > > >>>>>>>>> Perhaps this optimization should be optional for now, since
> > > >>>>>>>>> many query execution engines do not support this operator?
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>
> > > >>
> > >
> > >
> >
>
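
To make the point about Sargs and set operations concrete, here is a minimal
sketch of the idea: once each comparison is turned into a set of ranges, OR
becomes a union and AND becomes an intersection, so contradictory or
overlapping predicates collapse without case-by-case reasoning about
operators. Everything in it is an assumption for illustration: RangeSetSketch,
Range, union and intersect are hypothetical names, the domain is restricted
to integers with closed bounds, and none of this is Calcite's actual Sarg
implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class RangeSetSketch {
    // A closed integer interval [lo, hi]; hypothetical, not Calcite's Sarg.
    static final class Range {
        final int lo, hi;
        Range(int lo, int hi) { this.lo = lo; this.hi = hi; }
        @Override public String toString() { return "[" + lo + ".." + hi + "]"; }
    }

    // OR: pool the ranges, sort by lower bound, merge overlapping or adjacent ones.
    static List<Range> union(List<Range> a, List<Range> b) {
        List<Range> all = new ArrayList<>(a);
        all.addAll(b);
        all.sort(Comparator.comparingInt((Range r) -> r.lo));
        List<Range> out = new ArrayList<>();
        for (Range r : all) {
            if (!out.isEmpty() && r.lo <= (long) out.get(out.size() - 1).hi + 1) {
                Range last = out.remove(out.size() - 1);
                out.add(new Range(last.lo, Math.max(last.hi, r.hi)));
            } else {
                out.add(r);
            }
        }
        return out;
    }

    // AND: keep the non-empty pairwise overlaps.
    static List<Range> intersect(List<Range> a, List<Range> b) {
        List<Range> out = new ArrayList<>();
        for (Range x : a) {
            for (Range y : b) {
                int lo = Math.max(x.lo, y.lo);
                int hi = Math.min(x.hi, y.hi);
                if (lo <= hi) out.add(new Range(lo, hi));
            }
        }
        return out;
    }

    static List<Range> of(int lo, int hi) {
        return Collections.singletonList(new Range(lo, hi));
    }

    public static void main(String[] args) {
        // x BETWEEN 10 AND 20 OR x BETWEEN 15 AND 30
        System.out.println(union(of(10, 20), of(15, 30)));
        // x > 5 AND x < 3 (no integer satisfies both)
        System.out.println(intersect(of(6, Integer.MAX_VALUE), of(Integer.MIN_VALUE, 2)));
        // x = 1 OR x = 2 OR x = 3
        System.out.println(union(union(of(1, 1), of(2, 2)), of(3, 3)));
    }
}
```

Running main prints [[10..30]], then [] (the predicate is always false), then
[[1..3]]. The last merge is valid only because the toy domain is integers; a
general implementation would have to track open and closed bounds and NULL
handling separately.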
