Re: [Discussion] Can we forbidden SEARCH operator when use other execution engine?

P.F. ZHAN Fri, 04 Aug 2023 17:57:57 -0700

Thanks, Julian, for your answer. Initially, I was thinking that SEARCH
might not be essential. Since Calcite first translates it into a SEARCH
operator, and then we can use RexUtil#expandSearch to rewrite it into an
operator supported by Spark, it seems like doing the same job twice. So,
instead, why not consider disabling this operator, thus avoiding the need
to convert the operator back into an expression supported by other
execution engines, such as Spark, later on? Frankly speaking, I might not
have taken into account the potential simplifications brought by the SEARCH
operator for optimizing the execution plan. If these
simplifications produce a more efficient execution plan or make the
optimization stage more efficient, then I'm open to exploring ways to
implement an equivalent transformation within Kylin, or even exploring the
possibility of creating a similar implementation in other execution engines
like Spark.
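
To make the round trip concrete, here is a toy sketch in plain Java of what
expanding a SEARCH back into ordinary predicates amounts to. It does not use
Calcite: the Interval class and the expand method are hypothetical stand-ins
for Calcite's Sarg and RexUtil#expandSearch, and the sketch assumes closed
integer ranges only.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT Calcite's Sarg or RexUtil#expandSearch.
// All names here are hypothetical and exist purely for illustration.
final class SearchExpansionSketch {

  /** A closed interval [lower, upper]; a single point has lower == upper. */
  static final class Interval {
    final int lower;
    final int upper;

    Interval(int lower, int upper) {
      if (lower > upper) {
        throw new IllegalArgumentException("lower > upper");
      }
      this.lower = lower;
      this.upper = upper;
    }

    boolean isPoint() {
      return lower == upper;
    }
  }

  /** Rewrites "column SEARCH intervals" as SQL text using only =, IN, >=, <=, AND, OR. */
  static String expand(String column, List<Interval> intervals) {
    List<String> points = new ArrayList<>();
    List<String> terms = new ArrayList<>();
    for (Interval i : intervals) {
      if (i.isPoint()) {
        // Single values are collected so they can become one "=" or IN term.
        points.add(Integer.toString(i.lower));
      } else {
        // A real range becomes a pair of comparisons.
        terms.add("(" + column + " >= " + i.lower
            + " AND " + column + " <= " + i.upper + ")");
      }
    }
    if (points.size() == 1) {
      terms.add(0, column + " = " + points.get(0));
    } else if (points.size() > 1) {
      terms.add(0, column + " IN (" + String.join(", ", points) + ")");
    }
    // An empty set of intervals matches nothing.
    return terms.isEmpty() ? "FALSE" : String.join(" OR ", terms);
  }

  public static void main(String[] args) {
    List<Interval> sarg = new ArrayList<>();
    sarg.add(new Interval(1, 1));
    sarg.add(new Interval(3, 3));
    sarg.add(new Interval(10, 20));
    // prints: deptno IN (1, 3) OR (deptno >= 10 AND deptno <= 20)
    System.out.println(expand("deptno", sarg));
  }
}
```

The point of the sketch is only that the expanded form uses operators any
engine understands, which is why the expansion step is what an engine like
Spark would consume.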


On Sat, Aug 5, 2023 at 2:57 AM Julian Hyde <jhyde.apa...@gmail.com> wrote:

> I agree that it should be solved ‘by config’ but not by global config. The
> mere fact that you are talking to Spark (i.e. using the JDBC adapter with
> the Spark dialect) should be sufficient, right?
>
> Put another way. Calcite’s internal representation for expressions is what
> it is. The fact that SEARCH is part of that representation has many
> benefits for simplification. Just expect there to be a translation step
> from that representation to any backend.
>
> Julian
>
>
> > On Aug 4, 2023, at 7:22 AM, P.F. ZHAN <dethr...@gmail.com> wrote:
> >
> > Very nice suggestion. I wonder whether we can introduce this feature via
> > config. That might be better for users who use more than one query engine
> > to interpret and execute queries.
> >
> >
> > On Fri, Aug 4, 2023 at 22:03 Alessandro Solimando <
> > alessandro.solima...@gmail.com> wrote:
> >
> >> Hello,
> >> as LakeShen suggests, you can take a look at RexUtil#expandSearch. You
> >> can see it in action in the RexProgramTest tests. One example:
> >>
> >> https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/test/java/org/apache/calcite/rex/RexProgramTest.java#L1710-L1727
> >>
> >> Best regards,
> >> Alessandro
> >>
> >>> On Fri, 4 Aug 2023 at 15:45, LakeShen <shenleifight...@gmail.com> wrote:
> >>
> >>> Hi P.F. ZHAN, Calcite has a method, RexUtil#expandSearch, to expand
> >>> SEARCH; maybe you could get some information from it.
> >>>
> >>> There is also some logic to simplify Search in the
> >>> RexSimplify#simplifySearch method. I hope this could help you.
> >>> Here's the code:
> >>> 1. https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/main/java/org/apache/calcite/rex/RexUtil.java#L593
> >>> 2. https://github.com/apache/calcite/blob/98f3048fb1407e2878162ffc80388d4f9dd094b2/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L2132
> >>>
> >>> Best,
> >>> LakeShen
> >>>
> >>> On Fri, Aug 4, 2023 at 20:29, Soumyadeep Mukhopadhyay <soumyamy...@gmail.com> wrote:
> >>>
> >>>> Thank you, shall explore more on this! :)
> >>>>
> >>>>
> >>>> On Fri, 4 Aug 2023 at 5:53 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
> >>>>
> >>>>> Aha, I'm using Apache Kylin, which uses Calcite to generate a
> >>>>> logical plan and then converts it to a Spark plan to execute the
> >>>>> query. Given that Calcite has more operations for aggregations, and
> >>>>> Kylin wants to take full advantage of precomputed cubes (something
> >>>>> like Calcite's materialized views), it uses both Calcite and Spark
> >>>>> (for distributed computing). Maybe it's wild and a little fun, but
> >>>>> it does work well in many scenarios.
> >>>>>
> >>>>> On Fri, Aug 4, 2023 at 8:10 PM Soumyadeep Mukhopadhyay <
> >>>>> soumyamy...@gmail.com> wrote:
> >>>>>
> >>>>>> I am curious about your use case. Are you not losing out on the
> >>>>>> optimisations of Calcite when you are using Spark? Is it possible
> >>>>>> for you to share a general approach where we will be able to keep
> >>>>>> the optimisations done by Calcite and use Spark on top of it?
> >>>>>>
> >>>>>>
> >>>>>> On Fri, 4 Aug 2023 at 5:19 PM, P.F. ZHAN <dethr...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Generally speaking, the SEARCH operator is very good, but when we
> >>>>>>> use Calcite to optimize the logical plan and then use Spark to
> >>>>>>> execute it, this operator is unsupported. So is there a more
> >>>>>>> elegant way to disable the SEARCH operator? Or how can we convert
> >>>>>>> the SEARCH operator to the IN operator before converting the
> >>>>>>> Calcite logical plan to the Spark logical plan? If we do this, we
> >>>>>>> need to consider Join / Filter; are there any other RelNodes?
> >>>>>>>
> >>>>>>> Maybe it would be better to make this optimization optional for
> >>>>>>> now, since many query execution engines do not support this
> >>>>>>> operator?
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>
