Re: [DICUSS] Support building physical RelNode in Calcite

2020-04-28 Thread Xiening Dai



Re: [DICUSS] Support building physical RelNode in Calcite

2020-04-28 Thread Stamatis Zampetakis


Re: [DICUSS] Support building physical RelNode in Calcite

2020-04-08 Thread Julian Hyde
It's challenging to support all physical operators, but we can and
should support many of them in RelBuilder.

As Haisheng points out, some physical operators have extra operands.
Maybe some of those operands can be added to the factory interfaces,
or shoehorned in some clever way. (Note how we represent the ASC and
DESC keywords in RelBuilder.sort(RexNode...) as if they were function
calls, and how we made RelBuilder.AggCall a fluent API to accommodate
the many possible operands of an aggregate function call.)
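
To illustrate the "keywords as function calls" trick, here is a minimal
sketch. It is a toy model, not Calcite code: Expr, field(), desc() and
sort() are hypothetical stand-ins that only mimic the idea of wrapping a
sort key in a pseudo-call and unwrapping it when the sort is built.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; these are not Calcite classes. */
final class SortKeySketch {
  /** An expression is a field reference (op == null) or a one-operand call. */
  static final class Expr {
    final String op;     // e.g. "DESC"; null for a plain field
    final Expr operand;  // wrapped expression; null for a plain field
    final String field;  // field name; null for a call

    Expr(String op, Expr operand, String field) {
      this.op = op;
      this.operand = operand;
      this.field = field;
    }
  }

  static Expr field(String name) {
    return new Expr(null, null, name);
  }

  /** Marks a sort key as descending by wrapping it in a pseudo-call. */
  static Expr desc(Expr e) {
    return new Expr("DESC", e, null);
  }

  /** Unwraps the pseudo-calls and returns one "field direction" entry per key. */
  static List<String> sort(Expr... keys) {
    List<String> collation = new ArrayList<>();
    for (Expr key : keys) {
      String direction = "ASC"; // default when no wrapper is present
      while (key.op != null) {
        if (key.op.equals("DESC")) {
          direction = "DESC";
        }
        key = key.operand;
      }
      collation.add(key.field + " " + direction);
    }
    return collation;
  }

  public static void main(String[] args) {
    // prints [deptno ASC, sal DESC]
    System.out.println(sort(field("deptno"), desc(field("sal"))));
  }
}
```

The point of the design is that the builder method keeps a single
expression-typed signature, and the extra operand (the direction) rides
along inside the expression instead of widening the API.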

Many physical operators do not have extra operands. They implement the
same contract as the logical operator, but with different physical
properties (traits) (e.g. in a different convention, or assuming the
input is sorted or partitioned in a particular way). The "operands" to
these operators are the physical properties that they consume and
produce.

But conversely, there are huge benefits to sharing the code involved
in creating RelNodes. A HashAggregate physical operator has a lot in
common with LogicalAggregate, so can benefit from the same code.
Without RelBuilder (or its embedded helper objects like RexSimplifier)
being the place to put that logic, it will end up being spread out
over, and duplicated in, many RelOptRule instances.

We should be ambitious, and aim to make RelBuilder useful for creating
most physical nodes. Perhaps make modest extensions to RelBuilder and
RelFactory APIs to achieve this. If we fail, people can still create
RelNodes manually and add them to the builder using RelBuilder.push().

Julian



Re: [DICUSS] Support building physical RelNode in Calcite

2020-04-08 Thread Xiening Dai
Yes, this is a good question. I think it would be up to the builder factory to 
decide. A factory can simply create a logical join if it cannot decide which 
join algorithm to use. In that case the withConvention() syntax provides no 
guarantee, and the caller will need an assert if it expects the returned 
RelNode to have the requested convention.
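
A rough sketch of what that could look like from the caller's side. This is
a toy model under stated assumptions, not Calcite code: Node, Convention,
createJoin() and requireConvention() are hypothetical names made up for
illustration.

```java
/** Toy model only; these are not Calcite classes. */
final class FallbackSketch {
  enum Convention { NONE, ENUMERABLE }

  static final class Node {
    final String op;
    final Convention convention;

    Node(String op, Convention convention) {
      this.op = op;
      this.convention = convention;
    }
  }

  /**
   * Hypothetical join factory: produces a physical join only when it can
   * pick an algorithm, otherwise falls back to a logical join.
   */
  static Node createJoin(Convention requested, boolean hasEquiKeys) {
    if (requested == Convention.ENUMERABLE && hasEquiKeys) {
      return new Node("HashJoin", Convention.ENUMERABLE);
    }
    return new Node("LogicalJoin", Convention.NONE);
  }

  /** Caller-side check: fail fast if the requested convention was not honored. */
  static Node requireConvention(Node node, Convention expected) {
    if (node.convention != expected) {
      throw new IllegalStateException(
          "expected " + expected + " but got " + node.convention
              + " (" + node.op + ")");
    }
    return node;
  }

  public static void main(String[] args) {
    Node physical = requireConvention(
        createJoin(Convention.ENUMERABLE, true), Convention.ENUMERABLE);
    System.out.println(physical.op);

    try {
      requireConvention(
          createJoin(Convention.ENUMERABLE, false), Convention.ENUMERABLE);
    } catch (IllegalStateException e) {
      // The factory fell back to a logical join, so the check fails.
      System.out.println(e.getMessage());
    }
  }
}
```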

I think it eventually comes down to how we define the role of RelBuilder. 
Today it already appears to do some simple logical transformations (see 
filter() and sort()).




Re: [DICUSS] Support building physical RelNode in Calcite

2020-04-07 Thread Haisheng Yuan
Thanks Xiening for moving this to dev list.

Unlike logical operators, whose structure many systems share, physical 
operators are pretty implementation dependent. RelBuilder provides a common 
interface to create aggregates and joins. But what kind of physical aggregate? 
HashAgg or StreamAgg? How do I specify whether it is a local aggregate or a 
global aggregate? In many cases, these physical operator constructors have 
system-dependent arguments. Will RelBuilder support these? The Shuffle / 
Exchange operators of different systems may also differ in their arguments.

In the worst case, only physical filter, project, and sort can be created 
using the physical RelBuilder.

- Haisheng



[DICUSS] Support building physical RelNode in Calcite

2020-04-07 Thread Xiening Dai
Hi all,

In light of CALCITE-2970, I’d like to initiate a discussion.

Currently the framework itself does not have a way to create a physical 
RelNode (a RelNode with a particular convention). We rely completely on 
adapter rules to convert logical nodes into physical ones. There are a few 
major drawbacks with this approach -

1. During trait enforcement, we have to create a logical node and then get it 
implemented by a converter rule, even though we know the target convention. 
This increases memo size and planning time. CALCITE-2970 is one good example.

2. A lot of implementation rules today simply copy the inputs and create a 
physical node (for example EnumerableProjectRule, EnumerableFilterRule, etc). 
If the framework were able to create physical nodes, a lot of these trivial 
rules could be eliminated.

3. A number of adapter rules today use RelBuilder and create logical nodes. 
This is not desirable since in a lot of cases what they need are physical 
nodes instead. Creating logical nodes and then getting them implemented by 
converters again is inefficient.

To solve this problem, there have been two proposals so far -

a) Extend the current RelBuilder interface to support a withConvention() 
syntax, which means you could build a physical node with the convention 
specified. This can be done by extending the RelNode factories held by the 
builder. Calcite users can register a particular factory for their own 
convention.

b) Extend the Convention interface and add something like "RelNode 
enforce(RelNode input, RelTrait trait)". This would be used to address issue 
#1. A convention could implement this method to create the exact physical 
node needed to satisfy the given trait.
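
To make the two proposals concrete, here is a minimal sketch of both shapes. 
It is a toy model under stated assumptions, not Calcite code and not a 
proposed final API: Node, FilterFactory, Builder, register() and 
EnforcingConvention are hypothetical names, and conventions and traits are 
plain strings for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; none of these types are Calcite classes. */
final class ProposalSketch {
  static final class Node {
    final String op;
    final String convention;
    final Node input;

    Node(String op, String convention, Node input) {
      this.op = op;
      this.convention = convention;
      this.input = input;
    }
  }

  /** Proposal (a): a per-convention factory the builder can delegate to. */
  interface FilterFactory {
    Node createFilter(Node input, String condition);
  }

  static final class Builder {
    private final Map<String, FilterFactory> factories = new HashMap<>();
    private String convention = "NONE";
    private Node top;

    Builder() {
      // Default factory creates logical nodes, as the builder does today.
      factories.put("NONE", (in, cond) -> new Node("LogicalFilter", "NONE", in));
    }

    Builder register(String conv, FilterFactory factory) {
      factories.put(conv, factory);
      return this;
    }

    Builder withConvention(String conv) {
      this.convention = conv;
      return this;
    }

    Builder push(Node node) {
      top = node;
      return this;
    }

    Builder filter(String condition) {
      // Use the logical factory when nothing is registered for the convention.
      FilterFactory factory =
          factories.getOrDefault(convention, factories.get("NONE"));
      top = factory.createFilter(top, condition);
      return this;
    }

    Node build() {
      return top;
    }
  }

  /** Proposal (b): the convention itself knows how to satisfy a trait. */
  interface EnforcingConvention {
    Node enforce(Node input, String requiredTrait);
  }

  public static void main(String[] args) {
    Node scan = new Node("EnumerableTableScan", "ENUMERABLE", null);

    // (a) The builder creates the physical node directly.
    Node filter = new Builder()
        .register("ENUMERABLE",
            (in, cond) -> new Node("EnumerableFilter", "ENUMERABLE", in))
        .withConvention("ENUMERABLE")
        .push(scan)
        .filter("deptno = 10")
        .build();
    System.out.println(filter.op + " over " + filter.input.op);

    // (b) The convention creates the enforcer node for a required trait.
    EnforcingConvention enumerable =
        (in, trait) -> new Node("EnumerableSort[" + trait + "]", "ENUMERABLE", in);
    System.out.println(enumerable.enforce(filter, "sorted on deptno").op);
  }
}
```

In this sketch (a) is a general building block that any rule can use, while 
(b) only covers the enforcement path.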

I personally prefer a) since I believe this is a basic building block and it’s 
not just for enforcement. Also, extending RelBuilder feels natural to me - the 
intention was clearly there when this class was created (look at the class 
comments). There have been discussions about whether or not RelBuilder should 
support creating physical nodes. Although we use it solely for building 
logical nodes today, that doesn’t mean we shouldn’t extend it to physical 
ones. I haven't seen clear reasoning against doing so.

Would like to hear your thoughts.