Re: Re: Re: Problem with converters and possibly rule matching

Stamatis Zampetakis Fri, 15 Nov 2019 00:04:30 -0800

Hi again Vladimir,

In my previous email, I mentioned three rules for performing logical to
physical conversions.
It's normal that if you add more operators we will end up with more rules.
Now that the example has a filter we have the following rule list:


Rule 1: RootLogicalRel -> RootPhysicalRel
Rule 2: ProjectLogicalRel -> ProjectPhysicalRel
Rule 3: FilterLogicalRel -> FilterPhysicalRel
Rule 4: MapScanLogicalRel -> MapScanPhysicalRel

If I understand well, your concern is that in order to get the optimal plan
you would have to introduce many additional rules for the Exchange operator
(e.g., ExchangeProjectTransposeRule, ExchangeFilterTransposeRue, etc.).
I was thinking that it is not necessary to introduce these kind of rules.
You could have only one rule, i.e., PhysicaExchangeRule, for enforcing a
distribution that would be applied when there is a requirement for
particular distribution.

So for the initial plan
LogicalRoot [distribution=SINGLETON]
-> LogicalProject
 -> LogicalFilter
  -> LogicalScan

The ruleset above should generate the following alternatives where the
PhysicalExchangeRule is applied 3 times.

Alt1:
PhysicalRoot
-> SingletonExchange
 -> PhysicalProject
  -> PhysicalFilter
   -> PhysicalScan

Alt2:
PhysicalRoot
-> PhysicalProject
 -> SingletonExchange
  -> PhysicalFilter
   -> PhysicalScan

Alt3:
PhysicalRoot
-> PhysicalProject
 -> PhysicalFilter
  -> SingletonExchange
   -> PhysicalScan

This is still in the same spirit of the previous example where the physical
property (distribution) is either satisfied immediately or passed down to
the next level.

Best,
Stamatis

On Thu, Nov 14, 2019 at 11:14 AM Vladimir Ozerov <ppoze...@gmail.com> wrote:

> search place* = search space
>
> чт, 14 нояб. 2019 г. в 13:10, Vladimir Ozerov <ppoze...@gmail.com>:
>
> > Hi Haisheng,
> >
> > I double-checked the code. My original version returned false for some
> > cases, but it didn't affect number of rules calls anyway, so I changed it
> > to always return true. Please note that if I change the code as you
> > suggested, the test started failing, because bottom-up propagation of
> rule
> > calls no longer work: when the child is converted to physical form, the
> > parent logical node is not notified. This is the very problem I address
> > with that weird physical-to-logical conversions: they do not make sense,
> > and converter expansion does not produce any new rels, but their
> existence
> > allow for logical rule re-trigger which ultimately allow the plan to
> > compile.
> >
> > Regarding two conventions - I agree that it may look strange, but I do
> not
> > see any problems from the correctness perspective. Separation of logical
> > and physical planning helps me avoid excessive expansion of the search
> > place: I do not want my physical rules to produce new rels from not
> > optimized logical rels. Do you see any problems with that approach?
> >
> > Regards,
> > Vladimir
> >
> > ср, 6 нояб. 2019 г. в 03:52, Haisheng Yuan <h.y...@alibaba-inc.com>:
> >
> >> Hi Vladimir,
> >>
> >> The code in PHYSICAL convention L44 looks weird, I think it always
> >> returns true.
> >>
> >>
> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/HazelcastConventions.java#L44
> >>
> >> Try this:
> >>
> >> fromTraits.containsIfApplicable(Convention.PHYSICAL)
> >>     && toTraits.containsIfApplicable(Convention.PHYSICAL);
> >>
> >>
> >> Adding a AbstractConverter on logical operators is meaningless. Calcite
> >> is mixing the concept of logical and physical together, which is sad.
> >>
> >> BTW, using 2 conventions is not appropriate and wrong.
> >>
> >> - Haisheng
> >>
> >> ------------------------------------------------------------------
> >> 发件人：Vladimir Ozerov<ppoze...@gmail.com>
> >> 日 期：2019年11月05日 18:02:15
> >> 收件人：Haisheng Yuan<h.y...@alibaba-inc.com>
> >> 抄 送：dev@calcite.apache.org (dev@calcite.apache.org)<
> >> dev@calcite.apache.org>
> >> 主 题：Re: Re: Problem with converters and possibly rule matching
> >>
> >> Hi Haisheng,
> >>
> >>  think I already tried something very similar to what you explained, but
> >> it gave not an optimal plan. Please let me describe what I did. I would
> >> appreciate your feedback.
> >>
> >> 1) We start with a simple operator tree Root <- Project <- Scan, where
> >> the root is a final aggregator in the distributed query engine:
> >> -> LogicalRoot
> >>  -> LogicalProject
> >>   -> LogicalScan
> >>
> >> 2) First, we convert the Root and enforce SINGLETON distribution on a
> >> child:
> >> *-> PhysicalRoot[SINGLETON]*
> >> * -> Enforcer#1[SINGLETON]*
> >>   -> LogicalProject
> >>    -> LogicalScan
> >>
> >> 3) Then the project's rule is invoked. It doesn't know the distribution
> >> of the input, so it requests ANY distribution. Note that we have to set
> ANY
> >> to the project as well since we do not know the distribution of the
> input:
> >> -> PhysicalRoot[SINGLETON]
> >>  -> Enforcer#1[SINGLETON]
> >> *  -> PhysicalProject[ANY]*
> >> *   -> Enforcer#2[ANY]*
> >>     -> LogicalScan
> >>
> >> 4) Finally, the physical scan is created and its distribution is
> >> resolved. Suppose that it is REPLICATED, i.e. the whole result set is
> >> located on all nodes.
> >> -> PhysicalRoot[SINGLETON]
> >>  -> Enforcer#1[SINGLETON]
> >>   -> PhysicalProject[ANY]
> >>    -> Enforcer#2[ANY]
> >> *    -> PhysicalScan[REPLICATED]*
> >>
> >> 5) Now as all logical nodes are converted, we start resolving enforcers.
> >> The second one is no-op, since REPLICATED satisfies ANY:
> >> -> PhysicalRoot[SINGLETON]
> >>  -> Enforcer#1[SINGLETON]
> >>   -> PhysicalProject[ANY]
> >>    -> PhysicalScan[REPLICATED]
> >>
> >> 6) But the first enforcer now requires an Exchange, since ANY doesn't
> >> satisfy SINGLETON!
> >> -> PhysicalRoot[SINGLETON]
> >> * -> SingletonExchange[SINGLETON]*
> >>   -> PhysicalProject[ANY] // <= unresolved!
> >>    -> PhysicalScan[REPLICATED]
> >>
> >> The resulting plan requires data movement only because we didn't know
> >> precise distribution of the PhysicalProject when it was created. But
> should
> >> I enable Convention.Impl.canConvertConvention, bottom-up propagation
> >> kicks in, and the correct plan is produced because now LogicalProject
> >> has a chance to be converted to PhysicalProject with the concrete
> >> distribution. The optimized plan looks like this (since REPLICATED
> >> satisfies SINGLETON):
> >> -> PhysicalRoot[SINGLETON]
> >>  -> PhysicalProject[REPLICATED]
> >>   -> PhysicalScan[REPLICATED]
> >>
> >> You may see this in action in my reproducer:
> >> 1) Test producing "bad" plan:
> >>
> https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L45
> >> 2) Root enforces SINGLETON on Project:
> >>
> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/RootPhysicalRule.java#L45
> >> 3) Project enforces default (ANY) distribution on Scan:
> >>
> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49
> >>
> >> Please let me know if this flow is similar to what you meant.
> >>
> >> Regards,
> >> Vladimir.
> >>
> >> пн, 4 нояб. 2019 г. в 10:33, Haisheng Yuan <h.y...@alibaba-inc.com>:
> >>
> >>> Hi Vladimir,
> >>>
> >>> This is still can be done through top-down request approach.
> >>>
> >>> PhysicalFilter operator should request ANY distribution from child
> >>> operator, unless there is outer reference in the filter condition, in
> which
> >>> case, PhysicalFilter should request SINGLETON or BROADCAST
> distribution. So
> >>> in your case, PhysicalFilter request ANY, its required distribution
> will be
> >>> enforced on filter's output.
> >>>
> >>> Regarding index usage, you should have a FIlterTableScan2IndexGet
> >>> logical transformation rule, and a IndexGet2IndexScan physical
> >>> implementation rule. Note that IndexGet is a logical operator and
> IndexScan
> >>> is a physical operator, which are also used by SQL Server.
> >>>
> >>> - Haisheng
> >>>
> >>> ------------------------------------------------------------------
> >>> 发件人：Vladimir Ozerov<ppoze...@gmail.com>
> >>> 日 期：2019年11月01日 17:30:26
> >>> 收件人：<dev@calcite.apache.org>
> >>> 主 题：Re: Problem with converters and possibly rule matching
> >>>
> >>> Hi Stamatis,
> >>>
> >>> Thank you for your reply. I also thought that we may set the
> distribution
> >>> trait during logical planning because it is known in advance. And the
> >>> example I gave will work! :-) But unfortunately, it will work only
> >>> because
> >>> the tree is very simple, and the Project is adjacent to the Scan. This
> is
> >>> how my reproducer will work in that case:
> >>> 1) Root: enforce "SINGLETON" on Project
> >>> 2) Project: check the logical Scan, infer already resolved
> distribution,
> >>> then convert to [PhyiscalProject <- PhysicalScan]
> >>> 3) Resolve Root enforcer, adding and Exchange if needed.
> >>>
> >>> But this stops working as soon as a plan becomes more complex so that
> it
> >>> is
> >>> impossible to infer the distribution from the child immediately. E.g.:
> >>> LogicalRoot [distribution=SINGLETON]
> >>> -> LogicalProject // We are here new and cannot produce the physical
> >>> project
> >>> -> LogicalFilter[distribution=?]
> >>> -> LogicalScan[distribution=REPLICATED]
> >>>
> >>> This is where your suggestion with cascading enforcement may kick in.
> But
> >>> now consider that instead of having REPLICATED distribution, which
> >>> satisfies SINGLETON, we have a PARTITIONED distribution, It doesn't
> >>> satisfy
> >>> SINGLETON, so we have to insert the exchange.
> >>>
> >>> Before:
> >>> PhysicalRoot [SINGLETON]
> >>> -> PhysicalProject [SINGLETON]
> >>> -> PhysicalFilter [SINGLETON]
> >>> -> LogicalScan [PARTITIONED] // SINGLETON is enforced here
> >>>
> >>> After:
> >>> PhysicalRoot [SINGLETON]
> >>> -> PhysicalProject [SINGLETON]
> >>> -> PhysicalFilter [SINGLETON]
> >>> -> SingletonExchange [SINGLETON]
> >>> -> PhysicalScan [PARTITIONED]
> >>>
> >>> But unfortunately, this plan is not optimal. Since we perform Project
> and
> >>> Filter after the exchange, we now have to reshuffle the whole table.
> The
> >>> optimal plan would be to pushdown Project and Filter past exchange:
> >>> PhysicalRoot [SINGLETON]
> >>> -> SingletonExchange [SINGLETON]
> >>> -> PhysicalProject [PARTITIONED]
> >>> -> PhysicalFilter [PARTITIONED]
> >>> -> PhysicalScan [PARTITIONED]
> >>>
> >>> But in order to achieve that, now I need to introduce new rules which
> >>> will
> >>> attempt to transpose dozens of different operators past exchange, so
> most
> >>> likely the planning will never finish :-)
> >>>
> >>> This is why it seems that the bottom-up approach seems to be more
> natural
> >>> for such cases. Another example is index support. A table may have a
> >>> number
> >>> of access methods in addition to plain scan, and every such method may
> >>> produce a physical scan with different collations. It is not practical
> to
> >>> try to enforce something from the top. Instead, we'd better produce
> >>> alternatives from the bottom.
> >>>
> >>> Basically, Drill tries to mix two approaches. First, it tries to derive
> >>> traits from the child. If it fails and no transformations were created,
> >>> it
> >>> just stops trait propagation. As a result, an exchange is created on
> top
> >>> of
> >>> the operator where we stopped, which again leads to not optimal plans.
> >>> You
> >>> may observe this in action if you run my reproducer. "testPass"
> produces
> >>> an
> >>> optimal plan with help of abstract converters. "testDrill" produces not
> >>> optimal plan with the unnecessary exchange.
> >>>
> >>> All in all, I cannot get how we can employ distribution metadata and
> >>> index
> >>> metadata for optimal planning without using a pure bottom-up approach.
> >>>
> >>> Regards,
> >>> Vladimir.
> >>>
> >>> пт, 1 нояб. 2019 г. в 11:42, Stamatis Zampetakis <zabe...@gmail.com>:
> >>>
> >>> > The discussion is interesting thanks for the nice examples.
> >>> > I am replying in this thread since it has all the examples and the
> >>> history
> >>> > of the conversation.
> >>> >
> >>> > *Part I: Distribution physical or logical property*
> >>> >
> >>> > The Volcano paper writes the following regarding properties:
> >>> >
> >>> > "Logical properties can be derived from the logical algebra
> expression
> >>> and
> >>> > include schema, expected size, etc., while physical properties depend
> >>> on
> >>> > algorithms, e.g., sort order, partitioning, etc."
> >>> >
> >>> > We could say that partitioning is not only a property which depends
> on
> >>> an
> >>> > algorithm but a property which depends on the schema. When you have a
> >>> table
> >>> > you know in advance that the particular table is replicated.
> >>> >
> >>> > Basically, this means that the starting "logical" plan in your case
> >>> could
> >>> > be the following:
> >>> > RootLogicalRel[convention=LOGICAL, distribution=ANY]
> >>> > -> ProjectLogicalRel[convention=LOGICAL, distribution=ANY]
> >>> > -> MapScanLogicalRel[convention=LOGICAL, distribution=REPLICATED]
> >>> >
> >>> > When you are about to create your physical projection it is not
> >>> necessary
> >>> > to create three equivalent paths but the most promising by asking the
> >>> input
> >>> > what can it offer as traits. Again this partially overlaps with the
> >>> > discussion about "On demand traitset" [1] where as I wrote there is
> >>> already
> >>> > a mechanism to derive traits from children (collation, distribution,
> >>> etc.).
> >>> >
> >>> > Note that I am not saying that there is nothing more to be done to
> >>> improve
> >>> > the Calcite optimizer; I am just trying to provide a viable
> alternative
> >>> > given the current situation of the project.
> >>> >
> >>> > *Part II: Example with the original Volcano optimizer*
> >>> >
> >>> > Now let me try to show how the original (from the paper) Volcano
> >>> optimizer
> >>> > could handle your query.
> >>> >
> >>> > The optimization starts and we request two properties convention and
> >>> > distribution.
> >>> >
> >>> > RootLogicalRel[convention=PHYSICAL, distribution=SINGLETON]
> >>> > -> ProjectLogicalRel
> >>> > -> MapScanLogicalRel
> >>> >
> >>> > Rule 1: RootLogicalRel -> RootPhysicalRel applies the algorithm but
> >>> cannot
> >>> > satisfy the SINGLETON property so it propagates the demand
> >>> >
> >>> > RootPhysicalRel
> >>> > -> ProjectLogicalRel [convention=PHYSICAL, distribution=SINGLETON]
> >>> > -> MapScanLogicalRel
> >>> >
> >>> > Rule 2: ProjectLogicalRel -> ProjectPhysicalRel applies the algorithm
> >>> but
> >>> > cannot satisfy the SINGLETON property so it propagates the demand
> >>> >
> >>> > RootPhysicalRel
> >>> > -> ProjectPhysicalRel
> >>> > -> MapScanLogicalRel [convention=PHYSICAL, distribution=SINGLETON]
> >>> >
> >>> > Rule 3: MapScanLogicalRel -> MapScanPhysicalRel applies the algorithm
> >>> and
> >>> > given that the table is replicated it satisfies the distribution
> >>> property.
> >>> >
> >>> > RootPhysicalRel
> >>> > -> ProjectPhysicalRel
> >>> > -> MapScanPhysicalRel
> >>> >
> >>> > So we end up with a complete plan that satisfies all properties and
> >>> does
> >>> > not have redundant exchange operators.
> >>> > Moreover we didn't require all possible distributions when we applied
> >>> Rule
> >>> > 2 since we just want to satisfy the SINGLETON property requested by
> the
> >>> > parent.
> >>> > The example above does not demonstrate the additional optimization
> >>> paths
> >>> > which would be apply an enforcer (an Exchange operator to satisfy the
> >>> > SINGLETON property) at a higher level than the scan.
> >>> >
> >>> > Would this be an acceptable approach? Does it still suffer from a big
> >>> > search space?
> >>> >
> >>> > To make the analog with the actual Volcano optimizer in Calcite the
> >>> thing
> >>> > that may be missing is to be able to know what trait was requested by
> >>> the
> >>> > parent during the application of Rule 2.
> >>> >
> >>> > Best,
> >>> > Stamatis
> >>> >
> >>> > [1]
> >>> >
> >>> >
> >>>
> https://lists.apache.org/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E
> >>> >
> >>> > On Wed, Oct 30, 2019 at 7:24 PM Vladimir Ozerov <ppoze...@gmail.com>
> >>> > wrote:
> >>> >
> >>> > > Hi Stamatis,
> >>> > >
> >>> > > The problem that the presented reproducer is a very simplified
> >>> version of
> >>> > > what is actually needed. In reality, there are several distribution
> >>> > types -
> >>> > > PARTITIONED, REPLICATED, SINGLETON. To make things worse,
> >>> PARTITIONED may
> >>> > > or may not have concrete distribution fields. In theory, I can
> >>> create one
> >>> > > transformation per distribution type, but that would increase plan
> >>> space
> >>> > > significantly. In my sample "Root <- Project <- Scan" plan, there
> is
> >>> no
> >>> > > room for optimization at all, we only need to convert the nodes and
> >>> > > propagate traits. But if I follow the proposed approach, the
> planner
> >>> > would
> >>> > > create three equivalent paths. For complex queries, this may
> increase
> >>> > > optimization time significantly.
> >>> > >
> >>> > > What I need instead is to gradually convert and optimize nodes
> >>> > *bottom-up*,
> >>> > > instead of top-bottom. That is, create a physical scan first, then
> >>> > create a
> >>> > > physical project on top of it, etc. But at the same time, some
> rules
> >>> > still
> >>> > > require the top-bottom approach. So essentially I need the
> optimizer
> >>> to
> >>> > do
> >>> > > both. Abstract converters help me establish bottom-up preparation
> >>> but do
> >>> > > this at the cost of considering too many trait pairs, and as a
> >>> result,
> >>> > also
> >>> > > pollute the search space.
> >>> > >
> >>> > > To the contrast, precise command "I transformed the node A to node
> B,
> >>> > > please re-trigger the rules for A's parents" would allow us to
> >>> re-trigger
> >>> > > only required rules, without adding more nodes.
> >>> > >
> >>> > > Does it make sense?
> >>> > >
> >>> > > Regards,
> >>> > > Vladimir.
> >>> > >
> >>> > > ср, 30 окт. 2019 г. в 21:02, Stamatis Zampetakis <
> zabe...@gmail.com
> >>> >:
> >>> > >
> >>> > > > I admit that I didn't thoroughly read the exchanges so far but
> >>> forcing
> >>> > > the
> >>> > > > rules trigger does not seem the right approach.
> >>> > > >
> >>> > > > Other than that I would like to backtrack a bit to point 4.3
> >>> raised by
> >>> > > > Vladimir.
> >>> > > >
> >>> > > > "ProjectPhysicalRule [6] - transforms logical project to physical
> >>> > > > project *ONLY* if there is an underlying physical input with
> >>> REPLICATED
> >>> > > or
> >>> > > > SINGLETON distribution"
> >>> > > >
> >>> > > > The rule could be modified to do the following two
> transformations:
> >>> > > > 1. Create a physical project and require the input to be
> >>> REPLICATED.
> >>> > > > 2. Create a physical project and require
> >>> > > > the input to be SINGLETON.
> >>> > > >
> >>> > > > I would assume that afterwards when your scan rule fires it
> should
> >>> go
> >>> > to
> >>> > > > the appropriate subset and give you back the desired plan. Is
> >>> there a
> >>> > > > problem with this approach?
> >>> > > >
> >>> > > > Best,
> >>> > > > Stamatis
> >>> > > >
> >>> > > > On Wed, Oct 30, 2019, 5:52 PM Seliverstov Igor <
> >>> gvvinbl...@gmail.com>
> >>> > > > wrote:
> >>> > > >
> >>> > > > > Unfortunately it requires package-private API usage.
> >>> > > > >
> >>> > > > > Best solution there if Calcite provide it for end users.
> >>> > > > >
> >>> > > > > ср, 30 окт. 2019 г., 18:48 Vladimir Ozerov <ppoze...@gmail.com
> >:
> >>> > > > >
> >>> > > > > > “e pension” == “expand conversion” :)
> >>> > > > > >
> >>> > > > > > ср, 30 окт. 2019 г. в 18:46, Vladimir Ozerov <
> >>> ppoze...@gmail.com>:
> >>> > > > > >
> >>> > > > > > > Yes, that may work. Even if e pension rule is used, for the
> >>> most
> >>> > > > cases
> >>> > > > > it
> >>> > > > > > > will not trigger any real conversions, since we are moving
> >>> from
> >>> > > > > abstract
> >>> > > > > > > convention to physical, and created converters will have
> the
> >>> > > opposite
> >>> > > > > > trait
> >>> > > > > > > direction (from physical to abstract).
> >>> > > > > > >
> >>> > > > > > > But again - ideally we only need to re-trigger the rules
> for
> >>> a
> >>> > > > specific
> >>> > > > > > > node, no more than that. So API support like
> >>> > > > > > > “VolcanoPlanner.forceRules(RelNode)” would be very
> >>> convenient.
> >>> > > > > > >
> >>> > > > > > > What do you think?
> >>> > > > > > >
> >>> > > > > > > ср, 30 окт. 2019 г. в 17:56, Seliverstov Igor <
> >>> > > gvvinbl...@gmail.com
> >>> > > > >:
> >>> > > > > > >
> >>> > > > > > >> I considered manual rules calling too, for now we use
> >>> abstract
> >>> > > > > > converters
> >>> > > > > > >> +
> >>> > > > > > >> ExpandConversionRule for exchanges producing.
> >>> > > > > > >>
> >>> > > > > > >> You may create such converters manually (checking
> >>> appropriate
> >>> > > > subset)
> >>> > > > > > this
> >>> > > > > > >> case you may reduce created converters count, also, a
> >>> converter
> >>> > > is a
> >>> > > > > > quite
> >>> > > > > > >> special node, that does almost nothing (without
> >>> corresponding
> >>> > > rule)
> >>> > > > it
> >>> > > > > > may
> >>> > > > > > >> be used just as a rule trigger.
> >>> > > > > > >>
> >>> > > > > > >> Regards,
> >>> > > > > > >> Igor
> >>> > > > > > >>
> >>> > > > > > >> ср, 30 окт. 2019 г., 17:31 Vladimir Ozerov <
> >>> ppoze...@gmail.com
> >>> > >:
> >>> > > > > > >>
> >>> > > > > > >> > One funny hack which helped me is manual registration
> of a
> >>> > fake
> >>> > > > > > RelNode
> >>> > > > > > >> > with desired traits through VolcanoPlanner.register()
> >>> method.
> >>> > > But
> >>> > > > > > again,
> >>> > > > > > >> > this leads to trashing. What could really help is a call
> >>> to
> >>> > > > > > >> > VolcanoPlanner.fireRules() with desired rel. But this
> >>> doesn't
> >>> > > work
> >>> > > > > out
> >>> > > > > > >> of
> >>> > > > > > >> > the box since some internals of the rule queue needs to
> be
> >>> > > > adjusted.
> >>> > > > > > >> >
> >>> > > > > > >> > What does the community think about adding a method
> which
> >>> will
> >>> > > > > re-add
> >>> > > > > > >> rules
> >>> > > > > > >> > applicable to the specific RelNode to the rule queue?
> >>> > > > > > >> >
> >>> > > > > > >> > ср, 30 окт. 2019 г. в 17:00, Vladimir Ozerov <
> >>> > > ppoze...@gmail.com
> >>> > > > >:
> >>> > > > > > >> >
> >>> > > > > > >> > > Hi Igor,
> >>> > > > > > >> > >
> >>> > > > > > >> > > Yes, I came to the same conclusion, thank you. This is
> >>> how
> >>> > it
> >>> > > > > > >> basically
> >>> > > > > > >> > > happens when converters are disabled:
> >>> > > > > > >> > > 1) We start with initial tree: [LogicalProject] <-
> >>> > > [LogicalScan]
> >>> > > > > > >> > > 2) Then we convert LogicalScan to PhysicalScan, so it
> is
> >>> > added
> >>> > > > to
> >>> > > > > > the
> >>> > > > > > >> > > set: [LogicalProject] <- [LogicalScan, PhysicalScan]
> >>> > > > > > >> > > 3) Finally, when it is time to fire a rule for
> >>> PhysicalScan,
> >>> > > we
> >>> > > > > try
> >>> > > > > > to
> >>> > > > > > >> > get
> >>> > > > > > >> > > parents of that scan set with traits of the
> >>> PhysicalScan.
> >>> > > Since
> >>> > > > > > there
> >>> > > > > > >> are
> >>> > > > > > >> > > no such parents (we skipped it intentionally), the
> rule
> >>> is
> >>> > not
> >>> > > > > > queued.
> >>> > > > > > >> > >
> >>> > > > > > >> > > But when converters are enabled, a converter rel is
> >>> created:
> >>> > > > > > >> > [LogicalProject]
> >>> > > > > > >> > > <- [LogicalScan, PhysicalScan,
> >>> > > ConverterFromPhysicalToLogical].
> >>> > > > No
> >>> > > > > > >> rules
> >>> > > > > > >> > > are fired for PhysicalScan again, but they are fired
> for
> >>> > > > converter
> >>> > > > > > >> since
> >>> > > > > > >> > > it has the necessary LOGICAL trait.
> >>> > > > > > >> > >
> >>> > > > > > >> > > It makes sense, that converters essentially allow
> >>> forcing
> >>> > rule
> >>> > > > > > >> invocation
> >>> > > > > > >> > > on parents, even if the child was created with
> different
> >>> > > traits.
> >>> > > > > But
> >>> > > > > > >> it
> >>> > > > > > >> > > seems that this mechanism may be too heavy for complex
> >>> > queries
> >>> > > > > > >> because it
> >>> > > > > > >> > > literally creates hundreds of new converter rels and
> >>> > triggers
> >>> > > > > rules
> >>> > > > > > >> over
> >>> > > > > > >> > > and over again.
> >>> > > > > > >> > >
> >>> > > > > > >> > > We need some fine-grained alternative. Basically, what
> >>> would
> >>> > > be
> >>> > > > > > enough
> >>> > > > > > >> > for
> >>> > > > > > >> > > me is to let the planner know somehow: "I created that
> >>> rel,
> >>> > > and
> >>> > > > I
> >>> > > > > > want
> >>> > > > > > >> > you
> >>> > > > > > >> > > to execute parent rules not only using its trait but
> >>> also on
> >>> > > > this
> >>> > > > > > and
> >>> > > > > > >> > those
> >>> > > > > > >> > > traits."
> >>> > > > > > >> > > Is there any API in Calcite which allows doing this
> >>> without
> >>> > > > > > creating a
> >>> > > > > > >> > new
> >>> > > > > > >> > > rel node?
> >>> > > > > > >> > >
> >>> > > > > > >> > > Regards,
> >>> > > > > > >> > > Vladimir.
> >>> > > > > > >> > >
> >>> > > > > > >> > >
> >>> > > > > > >> > > ср, 30 окт. 2019 г. в 09:25, Seliverstov Igor <
> >>> > > > > gvvinbl...@gmail.com
> >>> > > > > > >:
> >>> > > > > > >> > >
> >>> > > > > > >> > >> Vladimir,
> >>> > > > > > >> > >>
> >>> > > > > > >> > >> Probably it'll help you:
> >>> > > > > > >> > >>
> >>> > > > > > >> > >> Seems the cause of issue in RelSubset.getParentRels()
> >>> The
> >>> > > > check
> >>> > > > > > used
> >>> > > > > > >> > when
> >>> > > > > > >> > >> the planner schedules newly matched rules after
> >>> successful
> >>> > > > > > >> > transformation
> >>> > > > > > >> > >> (see VolcanoRuleCall.matchRecurse), it prevents the
> >>> parent
> >>> > > rule
> >>> > > > > be
> >>> > > > > > >> > applied
> >>> > > > > > >> > >> once again (here your logical project with an input
> >>> having
> >>> > > ANY
> >>> > > > > > >> > >> distribution
> >>> > > > > > >> > >> doesn't satisfy a transformed input traits).
> >>> > > > > > >> > >>
> >>> > > > > > >> > >> In our case we use another workaround, so there are
> >>> also
> >>> > much
> >>> > > > > more
> >>> > > > > > >> > >> transformations than we wanted, so the desired rule
> is
> >>> > > > triggered.
> >>> > > > > > >> > >>
> >>> > > > > > >> > >>
> >>> > > > > > >> > >> вт, 29 окт. 2019 г., 14:46 Vladimir Ozerov <
> >>> > > ppoze...@gmail.com
> >>> > > > >:
> >>> > > > > > >> > >>
> >>> > > > > > >> > >> > Hi Vladimir,
> >>> > > > > > >> > >> >
> >>> > > > > > >> > >> > I am sorry. Pushed, it works now.
> >>> > > > > > >> > >> >
> >>> > > > > > >> > >> > вт, 29 окт. 2019 г. в 14:41, Vladimir Sitnikov <
> >>> > > > > > >> > >> > sitnikov.vladi...@gmail.com
> >>> > > > > > >> > >> > >:
> >>> > > > > > >> > >> >
> >>> > > > > > >> > >> > > > mvn clean test
> >>> > > > > > >> > >> > >
> >>> > > > > > >> > >> > > [ERROR] The goal you specified requires a project
> >>> to
> >>> > > > execute
> >>> > > > > > but
> >>> > > > > > >> > >> there is
> >>> > > > > > >> > >> > > no POM in this directory
> >>> > > > > > >> > >> > >
> >>> > > > > > >> > >> > > Vladimir, please push missing files
> >>> > > > > > >> > >> > >
> >>> > > > > > >> > >> > > Vladimir
> >>> > > > > > >> > >> > >
> >>> > > > > > >> > >> >
> >>> > > > > > >> > >>
> >>> > > > > > >> > >
> >>> > > > > > >> >
> >>> > > > > > >>
> >>> > > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> >>>
> >>
>

Re: Re: Re: Problem with converters and possibly rule matching

Reply via email to