Hello,

I've seen that parts from this discussion have moved to another thread [1].
I decided to reply in this thread in an attempt to not deviate a lot the
discussion in [1], which I feel is more general than the problems mentioned
here.

Getting things from the beginning, Xiening mentioned three drawbacks but I
would say that we are mostly bothered by number 1.

For the second drawback, the fact that we use rules to transform a
LogicalFilter to an EnumerableFilter is not a big issue from my point of
view. If the plan is to register a special builder in the planner to
replace these rules then it seems that we move the code from one place to
another. Plus, if we cannot replace all the rules of Logical to Enumerable
and we split the logic into two places then the code will be more difficult
to understand.

For the third drawback, I would say that it is mostly a problem of the
adapter. Every rule accepts a RelBuilderFactory so if the adapter wants to
create physical nodes it should pass an appropriate factory to the
respective rules. Recently in [1], and other times in the past, it was
brought up as a problem that ProjectMergeRule (and other similar kind of
rules) match Enumerable nodes and create Logical ones. I would say that
this is mostly a bad configuration rather than a problem that should be
resolved automatically by the planner. Avoiding the aforementioned behavior
does not look very complicated; we need to instantiate the rule so that it
matches EnumerableProject and uses a RelBuilderFactory that creates
Enumerable operators.

>From my last comment, I think it is evident that I am also leaning to the
direction outlined by Julian, a RelBuilder capable of creating many
physical operators.

It's been a while that I didn't check the progress in [2] so apologies if
what I wrote here is obsolete.

Best,
Stamatis

[1]
https://lists.apache.org/thread.html/r38ea71968c069f465921e7197488329c15413b46831c90ad4d48f87e%40%3Cdev.calcite.apache.org%3E
[2] https://github.com/apache/calcite/pull/1884

On Thu, Apr 9, 2020, 12:57 AM Julian Hyde <jh...@apache.org> wrote:

> It's challenging to support all physical operators, but we can and
> should support many of them in RelBuilder.
>
> As Haisheng points out, some physical operators have extra operands.
> Maybe some of those operands can be added to the factory interfaces,
> or shoe-horned in in some clever way. (Note how we represent ASC and
> DESC keywords in RelBuilder.sort(RexNode...) as if they were function
> calls, or made RelBuilder.AggCall a fluent API to accommodate the many
> possible operands of an aggregate function call.)
>
> Many physical operators do not have extra operands. They implement the
> same contract as the logical operator, but with different physical
> properties (traits) (e.g. in a different convention, or assuming the
> input is sorted or partitioned in a particular way). The "operands" to
> these operands are the physical properties that they consume and
> produce.
>
> But conversely, there are huge benefits to sharing the code involved
> in creating RelNodes. A HashAggregate physical operator has a lot in
> common with LogicalAggregate, so can benefit from the same code.
> Without RelBuilder (or its embedded helper objects like RexSimplifier)
> being the place to put that logic, it will end up being spread out
> over, and duplicated in, many RelOptRule instances.
>
> We should be ambitious, and aim to make RelBuilder useful for creating
> most physical nodes. Perhaps make modest extensions to RelBuilder and
> RelFactory APIs to achieve this. If we fail, people can still manually
> create RelNodes using RelBuilder.push().
>
> Julian
>
>
> On Wed, Apr 8, 2020 at 10:07 AM Xiening Dai <xndai....@gmail.com> wrote:
> >
> > Yes, this is a good question. I think it would be up to the builder
> factory to decide. One can just create a logical join if it couldn’t decide
> which join algorithm to use. Then the withConvention() syntax does not
> provide guarantees. The caller will need an assert if they expect the
> RelNode returned to have the requested convention.
> >
> > I think eventually it comes down to how we define the role of
> RelBuilder. Now it does look like it's doing some simple logical
> transformation (see filter() and sort()).
> >
> > > On Apr 7, 2020, at 5:43 PM, Haisheng Yuan <h.y...@alibaba-inc.com>
> wrote:
> > >
> > > Thanks Xiening for moving this to dev list.
> > >
> > > Unlike logical operators, with which many systems share the same
> structure, physical operators are pretty implementation dependent.
> RelBuilder provides the common interface to create aggregate, join. But
> what kind of physical aggregate? HashAgg, StreamAgg? How do I specify it as
> a local aggregate, or global aggregate? In many cases, these physical
> operator constructors have system dependent arguments, will the RelBuilder
> support these? Shuffle / Exchange operator of different system may also
> differ on their arguments.
> > >
> > > The worst case is that only physical filter, project, sort can be
> created using the physical RelBuilder.
> > >
> > > - Haisheng
> > >
> > > ------------------------------------------------------------------
> > > 发件人:Xiening Dai<xndai....@gmail.com>
> > > 日 期:2020年04月08日 07:36:43
> > > 收件人:<dev@calcite.apache.org>
> > > 主 题:[DICUSS] Support building physical RelNode in Calcite
> > >
> > > Hi all,
> > >
> > > In light of CALCITE-2970, I’d like to initiate a discussion.
> > >
> > > Currently the framework itself does not have a way to create physical
> RelNode (RelNode with a particular convention). We completely rely on
> adapter rules to convert logical nodes into physical ones. There are a few
> major drawbacks with this approach -
> > >
> > > 1. During trait enforcement, we have to create logic node and then get
> it implemented by converter rule, even though we know the target
> convention. This results in the increase of memo size and planning time.
> CALCITE-2970 is one good example.
> > >
> > > 2. A lot of implementation rules today simply copy the inputs and
> create a physical node (examples like EnumerableProjectRule,
> EnumerableFilterRule, etc). If the framework was able to create physical
> node, a lot of these trivial rules could be eliminated.
> > >
> > > 3. There are a number of adapter rules today using RelBuilder and
> creating logical node. This is not desirable since in a lot of cases what
> they need are physical nodes instead. Creating logical nodes then getting
> them implemented by converters again is inefficient.
> > >
> > > To solve this problem, there have been two proposals so far -
> > >
> > > a) Extend current RelBuilder interface to support withConvention()
> syntax, which means you could build a physical node with the convention
> specified. This can be done through extending the RelNode factory withheld
> by the builder. Calcite user can register a particular factory for their
> own convention.
> > >
> > > b) Extend Convention interface and add something like "RelNode
> enforce(RelNode input, RelTrait trait)". This would be used to address
> issue #1. A convention could implement this method to create exact physical
> node to be used for satisfying given trait.
> > >
> > > I personally prefer a) since I believe this is a basic building block
> and it’s not just for enforcement. Also extending RelBuild feels nature to
> me - the intention was clearly there when this class was created (look at
> the class comments). There have been discussions about whether or not
> RelBuilder should support creating physical nodes. Although we have been
> using it solely for logical nodes building today, but that doesn’t mean we
> shouldn’t extend to physical ones. I haven't seen clear reasoning.
> > >
> > > Would like to hear your thoughts.
> >
>

Reply via email to