Re: [DISCUSS] Towards Cascades Optimizer

2020-05-14 Thread Julian Hyde
> I even added a
> VolcanoPlannerFactory in my code to allow users to specify their own
> planner. But I also see that Julian is more in favor of keeping one planner.

I am fine with multiple planners. (We already have multiple planners -
Hep and Volcano.) Another one or two is fine.

In the short term (while this development is going on, and until we
build consensus about which planner (or planners) best serves the
community) we will definitely have multiple planners.

But in the long term I don't want to end up with two planners that are
distinct without good reason. If it is architecturally possible, let's
make one planner that has the best features of all of the prototypes.

Also, please think carefully about the other places where we can put
"planning" code - the corpus of RelNodes, rules, traits, and metadata.
These systems are orthogonal to the planners, and if code can be put there,
it becomes available to all planners.
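
For illustration, here is a minimal Java sketch of that orthogonality: the
same rule instance can be registered with either existing engine (assuming
FilterMergeRule.INSTANCE as the example rule; nothing below is new API):

    import org.apache.calcite.plan.RelOptRule;
    import org.apache.calcite.plan.hep.HepProgram;
    import org.apache.calcite.plan.hep.HepProgramBuilder;
    import org.apache.calcite.plan.volcano.VolcanoPlanner;
    import org.apache.calcite.rel.rules.FilterMergeRule;

    public class PlannerAgnosticRules {
      public static void main(String[] args) {
        // A rule is data plus a transformation; it knows nothing about
        // the engine that will fire it.
        RelOptRule rule = FilterMergeRule.INSTANCE;

        // The same instance registers with the cost-based Volcano engine ...
        VolcanoPlanner volcano = new VolcanoPlanner();
        volcano.addRule(rule);

        // ... and with a heuristic Hep program.
        HepProgram program = new HepProgramBuilder()
            .addRuleInstance(rule)
            .build();
      }
    }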

Julian

On Mon, May 11, 2020 at 11:49 PM Jinpeng Wu  wrote:
>
> Hi, Roman.
>
> Your suggestion is good, and that's what I thought before. I even added a
> VolcanoPlannerFactory in my code to allow users to specify their own
> planner. But I also see that Julian is more in favor of keeping one planner.
> This has been a long discussion. I think it can hardly reach a consensus
> until one solution is considered the best, which is, as you mentioned, the
> fastest, most stable, and backward-compatible one. So how about we just move
> forward, making our solution good enough first?
>
> One thing to clarify is that Haisheng and I are actually from the same
> team. It looks like we raised two different solutions, but actually they
> are two different components of the same planner. Haisheng is focusing more
> on trait propagation while I focus on pruning. Even for the conflict, we
> have already had many discussions in private, and the conclusion is that I
> will find another solution that does not rely on AC. So Haisheng's
> interference is not a problem for me. As your planner is a brand new one, I
> think it should not be a problem for you, either.
> The difference between Haisheng and me lies in how we contribute those
> efforts to the community. Haisheng is confident that his code is fully
> controllable and the new feature is useful for the ordinary Volcano
> procedure. So he modified the VolcanoPlanner directly. As for my code, it is
> controllable now, but I am not so confident about trying something more
> aggressive in the future.
> Later on, I will follow Julian's advice and try to keep one planner. I agree
> that modifying the VolcanoPlanner directly does not necessarily make it
> unstable or incompatible. It will surely slow me down as I move forward, but
> it is not impossible. Also, I don't think one planner is the only way to go.
> For example, if the planner has if/else blocks to check flags everywhere,
> it is no better than having another planner and switching between
> implementations via settings.
>
> Thanks,
> Jinpeng
>
> On Tue, May 12, 2020 at 2:16 AM Julian Hyde  wrote:
>
> > If there’s a way for all of these efforts to be committed to master,
> > disabled by default, but such that they can be enabled by setting a flag,
> > then we can at least avoid merge conflicts.
> >
> > I hope that this “trunk-based development” approach is technically
> > feasible. If the rule and metadata APIs are essentially unchanged, and the
> > planner API is evolved smoothly, we should be able to do it.
> >
> > From the perspective of the project and community, this is certainly a
> > good problem to have. An embarrassment of riches.
> >
> > From the perspective of the developers, we should make their work as
> > painless as possible. We, the community, also owe it to them to make a
> > timely decision about which effort (or efforts) we wish to support
> > long-term.
> >
> > Julian
> >
> >
> > > On May 11, 2020, at 7:04 AM, Roman Kondakov 
> > wrote:
> > >
> > > Hi Jinpeng,
> > >
> > > The Apache Calcite community has entered an interesting situation: we
> > > have several concurrent efforts to build a new Cascades-style
> > > optimizer. I can see at least three activities (correct me if I'm wrong):
> > >
> > > 1. Haisheng's gradual rebuilding of VolcanoPlanner [1]
> > > 2. Jinpeng's new CascadesPlanner based on VolcanoPlanner [2]
> > > 3. My brand new CascadesPlanner [3]
> > >
> > > All these activities are independent and may interfere with each other.
> > > This is not good. We still don't know which approach is better and which
> > > optimizer will be the most efficient while keeping backward compatibility.
> > >
> > > In my opinion, we must refrain from breaking changes in the master branch
> > > right now, because they can break the other optimizers.
> > >
> > > I think we need to develop a further strategy for activities related to
> > > the new optimizers:
> > > - decide how we will choose the new default Cascades optimizer
> > > for Calcite: it should be a stable, fast, and backward-compatible
> > > optimizer. If all optimizers 

Re: [DISCUSS] Towards Cascades Optimizer

2020-05-12 Thread Jinpeng Wu
Hi, Roman.

Your suggestion is good, and that's what I thought before. I even added a
VolcanoPlannerFactory in my code to allow users to specify their own
planner. But I also see that Julian is more in favor of keeping one planner.
This has been a long discussion. I think it can hardly reach a consensus
until one solution is considered the best, which is, as you mentioned, the
fastest, most stable, and backward-compatible one. So how about we just move
forward, making our solution good enough first?

One thing to clarify is that Haisheng and I are actually from the same
team. It looks like we raised two different solutions, but actually they
are two different components of the same planner. Haisheng is focusing more
on trait propagation while I focus on pruning. Even for the conflict, we
have already had many discussions in private, and the conclusion is that I
will find another solution that does not rely on AC. So Haisheng's
interference is not a problem for me. As your planner is a brand new one, I
think it should not be a problem for you, either.
The difference between Haisheng and me lies in how we contribute those
efforts to the community. Haisheng is confident that his code is fully
controllable and the new feature is useful for the ordinary Volcano
procedure. So he modified the VolcanoPlanner directly. As for my code, it is
controllable now, but I am not so confident about trying something more
aggressive in the future.
Later on, I will follow Julian's advice and try to keep one planner. I agree
that modifying the VolcanoPlanner directly does not necessarily make it
unstable or incompatible. It will surely slow me down as I move forward, but
it is not impossible. Also, I don't think one planner is the only way to go.
For example, if the planner has if/else blocks to check flags everywhere,
it is no better than having another planner and switching between
implementations via settings.
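
For illustration, here is a hedged Java sketch of that contrast; the class
names and the flag are hypothetical, not code from any of the PRs:

    import org.apache.calcite.plan.RelOptPlanner;
    import org.apache.calcite.plan.volcano.VolcanoPlanner;
    import org.apache.calcite.rel.RelNode;

    // Style 1: one planner, with a feature flag checked at each decision
    // point. The if/else branches end up scattered through the code.
    class FlaggedPlanner extends VolcanoPlanner {
      private final boolean topDown;   // hypothetical feature flag

      FlaggedPlanner(boolean topDown) {
        this.topDown = topDown;
      }

      @Override public RelNode findBestExp() {
        if (topDown) {
          // ... drive the search with top-down tasks (elided) ...
        }
        return super.findBestExp();    // classic rule-queue search
      }
    }

    // Hypothetical stand-in for a separate Cascades-style planner.
    class CascadePlanner extends VolcanoPlanner {
    }

    // Style 2: two implementations behind the common RelOptPlanner
    // interface, selected once from a setting.
    class Planners {
      static RelOptPlanner create(boolean topDown) {
        return topDown ? new CascadePlanner() : new VolcanoPlanner();
      }
    }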

Thanks,
Jinpeng

On Tue, May 12, 2020 at 2:16 AM Julian Hyde  wrote:

> If there’s a way for all of these efforts to be committed to master,
> disabled by default, but such that they can be enabled by setting a flag,
> then we can at least avoid merge conflicts.
>
> I hope that this “trunk-based development” approach is technically
> feasible. If the rule and metadata APIs are essentially unchanged, and the
> planner API is evolved smoothly, we should be able to do it.
>
> From the perspective of the project and community, this is certainly a
> good problem to have. An embarrassment of riches.
>
> From the perspective of the developers, we should make their work as
> painless as possible. We, the community, also owe it to them to make a
> timely decision about which effort (or efforts) we wish to support
> long-term.
>
> Julian
>
>
> > On May 11, 2020, at 7:04 AM, Roman Kondakov 
> wrote:
> >
> > Hi Jinpeng,
> >
> > The Apache Calcite community has entered an interesting situation: we
> > have several concurrent efforts to build a new Cascades-style
> > optimizer. I can see at least three activities (correct me if I'm wrong):
> >
> > 1. Haisheng's gradual rebuilding of VolcanoPlanner [1]
> > 2. Jinpeng's new CascadesPlanner based on VolcanoPlanner [2]
> > 3. My brand new CascadesPlanner [3]
> >
> > All these activities are independent and may interfere with each other.
> > This is not good. We still don't know which approach is better and which
> > optimizer will be the most efficient while keeping backward compatibility.
> >
> > In my opinion, we must refrain from breaking changes in the master branch
> > right now, because they can break the other optimizers.
> >
> > I think we need to develop a further strategy for activities related to
> > the new optimizers:
> > - decide how we will choose the new default Cascades optimizer
> > for Calcite: it should be a stable, fast, and backward-compatible
> > optimizer. If all optimizers prove to be stable and backward
> > compatible, maybe we should choose the fastest one?
> > - think about the possibility of using plug-in optimizers: it would be a
> > great feature for users if they want to use their own custom optimizer for
> > specific tasks or research purposes. It could be configured with
> > FrameworkConfig.setPlanner(new CustomPlanner()).
> >
> > What do you think?
> >
> > [1] https://github.com/apache/calcite/pull/1953
> > [2] https://github.com/apache/calcite/pull/1950
> > [3] https://github.com/apache/calcite/pull/1948
> >
> >
> > --
> > Kind Regards
> > Roman Kondakov
> >
> >
> > On 11.05.2020 11:12, Jinpeng Wu wrote:
> >> Hi, Roman.
> >>
> >> Thanks to Julian's tips, I added a CoreCascadeQuidemTest to the code. It
> >> runs the .iq files that CoreQuidemTest would run, using CascadePlanner
> >> instead of VolcanoPlanner to generate physical plans. Currently all
> >> tests have passed. There are some plan changes, but they are actually
> >> equivalent plans in different forms. I've also committed those plan
> >> changes as '.cas' files.
> >> Moreover, if you would like to set -Dcalcite.cascade=true during tests,
> 

Re: [DISCUSS] Towards Cascades Optimizer

2020-05-11 Thread Julian Hyde
If there’s a way for all of these efforts to be committed to master, disabled 
by default, but such that they can be enabled by setting a flag, then we can at 
least avoid merge conflicts.

I hope that this “trunk-based development” approach is technically feasible. If 
the rule and metadata APIs are essentially unchanged, and the planner API is 
evolved smoothly, we should be able to do it.

From the perspective of the project and community, this is certainly a good 
problem to have. An embarrassment of riches.

From the perspective of the developers, we should make their work as painless 
as possible. We, the community, also owe it to them to make a timely decision 
about which effort (or efforts) we wish to support long-term.

Julian


> On May 11, 2020, at 7:04 AM, Roman Kondakov  
> wrote:
> 
> Hi Jinpeng,
> 
> The Apache Calcite community has entered an interesting situation: we
> have several concurrent efforts to build a new Cascades-style optimizer.
> I can see at least three activities (correct me if I'm wrong):
> 
> 1. Haisheng's gradual rebuilding of VolcanoPlanner [1]
> 2. Jinpeng's new CascadesPlanner based on VolcanoPlanner [2]
> 3. My brand new CascadesPlanner [3]
> 
> All these activities are independent and may interfere with each other.
> This is not good. We still don't know which approach is better and which
> optimizer will be the most efficient while keeping backward compatibility.
> 
> In my opinion, we must refrain from breaking changes in the master branch
> right now, because they can break the other optimizers.
> 
> I think we need to develop a further strategy for activities related to
> the new optimizers:
> - decide how we will choose the new default Cascades optimizer
> for Calcite: it should be a stable, fast, and backward-compatible
> optimizer. If all optimizers prove to be stable and backward
> compatible, maybe we should choose the fastest one?
> - think about the possibility of using plug-in optimizers: it would be a
> great feature for users if they want to use their own custom optimizer for
> specific tasks or research purposes. It could be configured with
> FrameworkConfig.setPlanner(new CustomPlanner()).
> 
> What do you think?
> 
> [1] https://github.com/apache/calcite/pull/1953
> [2] https://github.com/apache/calcite/pull/1950
> [3] https://github.com/apache/calcite/pull/1948
> 
> 
> -- 
> Kind Regards
> Roman Kondakov
> 
> 
> On 11.05.2020 11:12, Jinpeng Wu wrote:
>> Hi, Roman.
>> 
>> Thanks to Julian's tips, I added a CoreCascadeQuidemTest to the code. It
>> runs the .iq files that CoreQuidemTest would run, using CascadePlanner
>> instead of VolcanoPlanner to generate physical plans. Currently all tests
>> have passed. There are some plan changes, but they are actually equivalent
>> plans in different forms. I've also committed those plan changes as '.cas'
>> files.
>> Moreover, if you would like to set -Dcalcite.cascade=true during tests, all
>> test cases will use the new planner for planning. There would be some
>> failures, typically because there is no common way across all tests to tell
>> whether a RelNode is physical or logical.
>> 
>> 
>> When I was running the tests, I found that Haisheng's latest changes have a
>> conflict with my PR. The new planner relies on AbstractConverter to tell
>> whether a subset canConvert/needConvert to another subset. Haisheng's
>> change will remove AbstractConverter and break this. Currently, I switch
>> off top-down trait propagation in the new planner as a workaround. I will
>> merge these two top-down processes later on to completely address this
>> problem.
>> One thing to recall is that, while I am writing a new planner, Haisheng
>> modified the VolcanoPlanner directly and used a flag to switch the new
>> feature on and off. I need to decide how to merge them: to keep a separate
>> planner that invokes the new trait-propagation interfaces, or to modify the
>> VolcanoPlanner directly. I am trying the first way now. But of course, if
>> keeping one planner is preferred, I can also take the second way.
>> 
>> 
>> Thanks,
>> Jinpeng
>> 
>> 
>> On Fri, May 1, 2020 at 6:40 AM Haisheng Yuan  wrote:
>> 
>>> Hi all,
>>> 
>>> As planned in my proposal, I opened the pull request [1] for CALCITE-3896
>>> to achieve:
>>> 1. Top-down trait request
>>> 2. Bottom-up trait derivation
>>> 3. Trait enforcement without AbstractConverter
>>> 
>>> The feature can be turned on or off by a flag, either through a property
>>> config file or a VolcanoPlanner setter. Since Calcite doesn't turn on
>>> AbstractConverter until the latest master, I just disabled AbstractConverter
>>> by turning on top-down trait request; now all tests pass.
>>> 
>>> In our system, the test results of the 99 TPC-DS queries show almost no
>>> plan diff, but the number of RelNodes created during optimization is
>>> reduced by 10-15% on average (even without space pruning). I believe that
>>> for other systems using VolcanoPlanner, more than 20% reduction can be
>>> expected.
>>> 
>>> It also 

Re: [DISCUSS] Towards Cascades Optimizer

2020-05-11 Thread Roman Kondakov
Hi Jinpeng,

The Apache Calcite community has entered an interesting situation: we
have several concurrent efforts to build a new Cascades-style optimizer.
I can see at least three activities (correct me if I'm wrong):

1. Haisheng's gradual rebuilding of VolcanoPlanner [1]
2. Jinpeng's new CascadesPlanner based on VolcanoPlanner [2]
3. My brand new CascadesPlanner [3]

All these activities are independent and may interfere with each other.
This is not good. We still don't know which approach is better and which
optimizer will be the most efficient while keeping backward compatibility.

In my opinion, we must refrain from breaking changes in the master branch
right now, because they can break the other optimizers.

I think we need to develop a further strategy for activities related to
the new optimizers:
- decide how we will choose the new default Cascades optimizer
for Calcite: it should be a stable, fast, and backward-compatible
optimizer. If all optimizers prove to be stable and backward
compatible, maybe we should choose the fastest one?
- think about the possibility of using plug-in optimizers: it would be a
great feature for users if they want to use their own custom optimizer for
specific tasks or research purposes. It could be configured with
FrameworkConfig.setPlanner(new CustomPlanner()), as sketched below.
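
For concreteness, a hedged sketch of the proposed hook. Note that
FrameworkConfig.setPlanner and CustomPlanner are hypothetical names from this
proposal, not existing Calcite API; today a config is assembled via
Frameworks.newConfigBuilder():

    import org.apache.calcite.plan.volcano.VolcanoPlanner;
    import org.apache.calcite.tools.FrameworkConfig;
    import org.apache.calcite.tools.Frameworks;
    import org.apache.calcite.tools.Planner;

    // CustomPlanner stands in for any user-supplied optimizer.
    class CustomPlanner extends VolcanoPlanner {
      // Override the search strategy, pruning, etc. for research purposes.
    }

    class PluggablePlannerExample {
      Planner example() {
        FrameworkConfig config = Frameworks.newConfigBuilder()
            // .setPlanner(new CustomPlanner())  // proposed, not-yet-existing hook
            .build();
        return Frameworks.getPlanner(config);
      }
    }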

What do you think?

[1] https://github.com/apache/calcite/pull/1953
[2] https://github.com/apache/calcite/pull/1950
[3] https://github.com/apache/calcite/pull/1948


-- 
Kind Regards
Roman Kondakov


On 11.05.2020 11:12, Jinpeng Wu wrote:
> Hi, Roman.
> 
> Thanks to Julian's tips, I added a CoreCascadeQuidemTest to the code. It
> runs the .iq files that CoreQuidemTest would run, using CascadePlanner
> instead of VolcanoPlanner to generate physical plans. Currently all tests
> have passed. There are some plan changes, but they are actually equivalent
> plans in different forms. I've also committed those plan changes as '.cas'
> files.
> Moreover, if you would like to set -Dcalcite.cascade=true during tests, all
> test cases will use the new planner for planning. There would be some
> failures, typically because there is no common way across all tests to tell
> whether a RelNode is physical or logical.
> 
> 
> When I was running the tests, I found that Haisheng's latest changes have a
> conflict with my PR. The new planner relies on AbstractConverter to tell
> whether a subset canConvert/needConvert to another subset. Haisheng's
> change will remove AbstractConverter and break this. Currently, I switch
> off top-down trait propagation in the new planner as a workaround. I will
> merge these two top-down processes later on to completely address this
> problem.
> One thing to recall is that, while I am writing a new planner, Haisheng
> modified the VolcanoPlanner directly and used a flag to switch the new
> feature on and off. I need to decide how to merge them: to keep a separate
> planner that invokes the new trait-propagation interfaces, or to modify the
> VolcanoPlanner directly. I am trying the first way now. But of course, if
> keeping one planner is preferred, I can also take the second way.
> 
> 
> Thanks,
> Jinpeng
> 
> 
> On Fri, May 1, 2020 at 6:40 AM Haisheng Yuan  wrote:
> 
>> Hi all,
>>
>> As planned in my proposal, I opened the pull request [1] for CALCITE-3896
>> to achieve:
>> 1. Top-down trait request
>> 2. Bottom-up trait derivation
>> 3. Trait enforcement without AbstractConverter
>>
>> The feature can be turned on or off by a flag, either through a property
>> config file or a VolcanoPlanner setter. Since Calcite doesn't turn on
>> AbstractConverter until the latest master, I just disabled AbstractConverter
>> by turning on top-down trait request; now all tests pass.
>>
>> In our system, the test results of the 99 TPC-DS queries show almost no
>> plan diff, but the number of RelNodes created during optimization is
>> reduced by 10-15% on average (even without space pruning). I believe that
>> for other systems using VolcanoPlanner, more than 20% reduction can be
>> expected.
>>
>> It is also designed with top-down rule application in mind, and can later
>> be evolved to top-down rule application and space pruning, e.g. by
>> integrating code from Jinpeng's and Roman's work. But the interface that
>> is exposed to users, as described in the proposal, can remain the same.
>>
>> Haisheng
>>
>> [1] https://github.com/apache/calcite/pull/1953
>>
>>
>> On 2020/04/30 18:01:26, Julian Hyde  wrote:
>>> If your test cases are SQL scripts, it might be fairly straightforward
>>> to port them to Quidem (.iq) files. Plenty of examples in
>>> https://github.com/apache/calcite/tree/master/core/src/test/resources/sql.
>>>
>>> Quidem files are basically SQL scripts. Expected output is embedded in
>>> the script. You can run the script once, and if the output looks right,
>>> overwrite the input file with the output.
>>>
>>> Julian
>>>
>>>
 On Apr 30, 2020, at 3:26 AM, Jinpeng Wu  wrote:

Re: [DISCUSS] Towards Cascades Optimizer

2020-05-11 Thread Jinpeng Wu
Hi, Roman.

Thanks to Julian's tips, I added a CoreCascadeQuidemTest to the code. It
runs the .iq files that CoreQuidemTest would run, using CascadePlanner
instead of VolcanoPlanner to generate physical plans. Currently all tests
have passed. There are some plan changes, but they are actually equivalent
plans in different forms. I've also committed those plan changes as '.cas'
files.
Moreover, if you would like to set -Dcalcite.cascade=true during tests, all
test cases will use the new planner for planning. There would be some
failures, typically because there is no common way across all tests to tell
whether a RelNode is physical or logical.


When I was running the tests, I found that Haisheng's latest changes have a
conflict with my PR. The new planner relies on AbstractConverter to tell
whether a subset canConvert/needConvert to another subset. Haisheng's
change will remove AbstractConverter and break this. Currently, I switch
off top-down trait propagation in the new planner as a workaround. I will
merge these two top-down processes later on to completely address this
problem.
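
To make the dependency concrete, here is a hedged sketch of the Volcano-style
enforcement being discussed: an AbstractConverter registered against a subset
with a required trait set, which ExpandConversionRule may later replace with a
real converter. The collation is an arbitrary example:

    import org.apache.calcite.plan.RelOptCluster;
    import org.apache.calcite.plan.RelTraitSet;
    import org.apache.calcite.plan.volcano.AbstractConverter;
    import org.apache.calcite.plan.volcano.RelSubset;
    import org.apache.calcite.plan.volcano.VolcanoPlanner;
    import org.apache.calcite.rel.RelCollations;
    import org.apache.calcite.rel.RelFieldCollation;

    class TraitRequestSketch {
      // Ask for the contents of a subset, but sorted on column 0. The
      // planner can then reason about whether the subset can satisfy
      // (canConvert) or must be converted to (needConvert) these traits.
      void requestSortedOutput(VolcanoPlanner planner, RelSubset subset) {
        RelOptCluster cluster = subset.getCluster();
        RelTraitSet required = subset.getTraitSet()
            .replace(RelCollations.of(new RelFieldCollation(0)));
        AbstractConverter ac =
            new AbstractConverter(cluster, subset, null, required);
        planner.register(ac, subset);
      }
    }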
One thing to recall is that, while I am writing a new planner, Haisheng
modified the VolcanoPlanner directly and used a flag to switch the new
feature on and off. I need to decide how to merge them: to keep a separate
planner that invokes the new trait-propagation interfaces, or to modify the
VolcanoPlanner directly. I am trying the first way now. But of course, if
keeping one planner is preferred, I can also take the second way.


Thanks,
Jinpeng


On Fri, May 1, 2020 at 6:40 AM Haisheng Yuan  wrote:

> Hi all,
>
> As planned in my proposal, I opened the pull request [1] for CALCITE-3896
> to achieve:
> 1. Top-down trait request
> 2. Bottom-up trait derivation
> 3. Trait enforcement without AbstractConverter
>
> The feature can be turned on or off by a flag, either through a property
> config file or a VolcanoPlanner setter. Since Calcite doesn't turn on
> AbstractConverter until the latest master, I just disabled AbstractConverter
> by turning on top-down trait request; now all tests pass.
>
> In our system, the test results of the 99 TPC-DS queries show almost no
> plan diff, but the number of RelNodes created during optimization is
> reduced by 10-15% on average (even without space pruning). I believe that
> for other systems using VolcanoPlanner, more than 20% reduction can be
> expected.
>
> It is also designed with top-down rule application in mind, and can later
> be evolved to top-down rule application and space pruning, e.g. by
> integrating code from Jinpeng's and Roman's work. But the interface that
> is exposed to users, as described in the proposal, can remain the same.
>
> Haisheng
>
> [1] https://github.com/apache/calcite/pull/1953
>
>
> On 2020/04/30 18:01:26, Julian Hyde  wrote:
> > If your test cases are SQL scripts, it might be fairly straightforward
> > to port them to Quidem (.iq) files. Plenty of examples in
> > https://github.com/apache/calcite/tree/master/core/src/test/resources/sql.
> >
> > Quidem files are basically SQL scripts. Expected output is embedded in
> > the script. You can run the script once, and if the output looks right,
> > overwrite the input file with the output.
> >
> > Julian
> >
> >
> > > On Apr 30, 2020, at 3:26 AM, Jinpeng Wu  wrote:
> > >
> > > Sure. I will add more cases to my PR.
> > >
> > > I did not design more cases because our own product has a test
> > > framework, which contains thousands of actual user queries.
> > > Calcite's code base is quite different, so I cannot just migrate the
> > > cases to Calcite. It may take some time.
> > >
> > > On Wed, Apr 29, 2020 at 4:27 AM Roman Kondakov
> 
> > > wrote:
> > >
> > >> Hi Jinpeng,
> > >>
> > >> I went through your PR and it seemed very impressive to me. It is very
> > >> similar to what I did, but you've reused much existing logic from the
> > >> Volcano planner. We should definitely stay in sync in our experiments.
> > >> I believe the future Cascades planner will be some combination of our
> > >> works.
> > >>
> > >> Is there any way to run tests that are close to real system query
> > >> execution? Maybe with the Enumerable convention, or, better, with a
> > >> convention that supports a distribution trait? I just want to look
> > >> through your planner's optimization steps more thoroughly. I've found
> > >> some tests in the org.apache.calcite.plan.volcano package, but they
> > >> use synthetic conventions and nodes. Maybe I missed something.
> > >>
> > >> Thank you for sharing your work!
> > >>
> > >> --
> > >> Kind Regards
> > >> Roman Kondakov
> > >>
> > >>
> > >> On 28.04.2020 15:19, Jinpeng Wu wrote:
> > >>> Hi, Roman. It's great to see your proposal. Actually my team has also
> > >>> been working on a Cascades planner based on Calcite, and we already
> > >>> have some results as well. Maybe we can combine our works.
> > >>>
> > >>> I've pushed my code as 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-30 Thread Haisheng Yuan
Hi all,

As planned in my proposal, I opened the pull request [1] for CALCITE-3896 to 
achieve:
1. Top-down trait request
2. Bottom-up trait derivation
3. Trait enforcement without AbstractConverter

The feature can be turned on or off by a flag, either through a property config 
file or a VolcanoPlanner setter. Since Calcite doesn't turn on 
AbstractConverter until the latest master, I just disabled AbstractConverter by 
turning on top-down trait request; now all tests pass.
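
For concreteness, a hedged sketch of flipping such a flag; the setter and
property names follow the spirit of the PR but should be treated as
assumptions, not confirmed API:

    import org.apache.calcite.plan.volcano.VolcanoPlanner;

    class TopDownFlagExample {
      void example() {
        VolcanoPlanner planner = new VolcanoPlanner();
        // Enable top-down trait request via the planner setter
        // (name assumed from the PR description).
        planner.setTopDownOpt(true);

        // Alternatively via a property config file, e.g. in
        // saffron.properties (property name assumed for illustration):
        //   calcite.planner.topdown.opt=true
      }
    }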

In our system, the test results of the 99 TPC-DS queries show almost no plan 
diff, but the number of RelNodes created during optimization is reduced by 
10-15% on average (even without space pruning). I believe that for other 
systems using VolcanoPlanner, more than 20% reduction can be expected.

It is also designed with top-down rule application in mind, and can later be 
evolved to top-down rule application and space pruning, e.g. by integrating 
code from Jinpeng's and Roman's work. But the interface that is exposed to 
users, as described in the proposal, can remain the same.

Haisheng

[1] https://github.com/apache/calcite/pull/1953


On 2020/04/30 18:01:26, Julian Hyde  wrote: 
> If your test cases are SQL scripts, it might be fairly straightforward to 
> port them to Quidem (.iq) files. Plenty of examples in 
> https://github.com/apache/calcite/tree/master/core/src/test/resources/sql.
> 
> Quidem files are basically SQL scripts. Expected output is embedded in the 
> script. You can run the script once, and if the output looks right, overwrite 
> the input file with the output.
> 
> Julian
> 
> 
> > On Apr 30, 2020, at 3:26 AM, Jinpeng Wu  wrote:
> > 
> > Sure. I will add more cases to my PR.
> > 
> > I did not design more cases because our own product has a test framework,
> > which contains thousands of actual user queries.
> > Calcite's code base is quite different, so I cannot just migrate the cases
> > to Calcite. It may take some time.
> > 
> > On Wed, Apr 29, 2020 at 4:27 AM Roman Kondakov 
> > wrote:
> > 
> >> Hi Jinpeng,
> >> 
> >> I went through your PR and it seemed very impressive to me. It is very
> >> similar to what I did, but you've reused much existing logic from the
> >> Volcano planner. We should definitely stay in sync in our experiments. I
> >> believe the future Cascades planner will be some combination of our
> >> works.
> >> 
> >> Is there any way to run tests that are close to real system query
> >> execution? Maybe with the Enumerable convention, or, better, with a
> >> convention that supports a distribution trait? I just want to look through
> >> your planner's optimization steps more thoroughly. I've found some tests
> >> in the org.apache.calcite.plan.volcano package, but they use synthetic
> >> conventions and nodes. Maybe I missed something.
> >> 
> >> Thank you for sharing your work!
> >> 
> >> --
> >> Kind Regards
> >> Roman Kondakov
> >> 
> >> 
> >> On 28.04.2020 15:19, Jinpeng Wu wrote:
> >>> Hi, Roman. It's great to see your proposal. Actually my team has also
> >>> been working on a Cascades planner based on Calcite, and we already
> >>> have some results as well. Maybe we can combine our works.
> >>> 
> >>> I've pushed my code as https://github.com/apache/calcite/pull/1950 .
> >>> 
> >>> Our works have many places in common. We both developed a new
> >>> CascadePlanner and avoided modifying the old VolcanoPlanner directly. We
> >>> both implemented the top-down search strategy according to the
> >>> Columbia optimizer generator
> >>> <https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf>.
> >>> But we also have some differences.
> >>> 
> >>> The first difference is that I try to reuse the existing VolcanoPlanner
> >>> as much as possible. My CascadePlanner inherits from the existing
> >>> VolcanoPlanner. Except that it overrides the ruleQueue and findBestPlan
> >>> methods to rearrange rule applications, most logic is inherited from
> >>> VolcanoPlanner. For example,
> >>>  - It reuses the RelSet and RelSubset classes and the register method
> >>>  - Rules are fired as soon as a RelNode is registered (in the
> >>> Columbia optimizer generator, rules are not fired until exploring). The
> >>> ApplyRule task controls when to invoke the onMatch method of a RuleMatch.
> >>> This design has the benefit that we do not need to worry about missing a
> >>> rule or firing a rule multiple times.
> >>>  - It leverages AbstractConverter to pass trait requirements down. So
> >>> currently AC is still essential in my code.
> >>> This makes the new planner highly compatible with the old VolcanoPlanner.
> >>> Features like MV and Hints can apply to it directly. And I tried to
> >>> change VolcanoPlanner to the new CascadePlanner in tests. Most tests
> >>> passed. Several cases did fail. I know the reason and how to fix them.
> >>> But I am still thinking about marking them as "won't fix", as the rule
> >>> set violates some basic principles of top-down trait requests.
> >>> 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-30 Thread Julian Hyde
If your test cases are SQL scripts, it might be fairly straightforward to port 
them to Quidem (.iq) files. Plenty of examples in 
https://github.com/apache/calcite/tree/master/core/src/test/resources/sql.

Quidem files are basically SQL scripts. Expected output is embedded in the 
script. You can run the script once, and if the output looks right, overwrite 
the input file with the output.

Julian


> On Apr 30, 2020, at 3:26 AM, Jinpeng Wu  wrote:
> 
> Sure. I will add more cases to my PR.
> 
> I did not design more cases because our own product has a test framework,
> which contains thousands of actual user queries.
> Calcite's code base is quite different, so I cannot just migrate the cases
> to Calcite. It may take some time.
> 
> On Wed, Apr 29, 2020 at 4:27 AM Roman Kondakov 
> wrote:
> 
>> Hi Jinpeng,
>> 
>> I went through your PR and it seemed very impressive to me. It is very
>> similar to what I did, but you've reused much existing logic from the
>> Volcano planner. We should definitely stay in sync in our experiments. I
>> believe the future Cascades planner will be some combination of our
>> works.
>> 
>> Is there any way to run tests that are close to real system query
>> execution? Maybe with the Enumerable convention, or, better, with a
>> convention that supports a distribution trait? I just want to look through
>> your planner's optimization steps more thoroughly. I've found some tests
>> in the org.apache.calcite.plan.volcano package, but they use synthetic
>> conventions and nodes. Maybe I missed something.
>> 
>> Thank you for sharing your work!
>> 
>> --
>> Kind Regards
>> Roman Kondakov
>> 
>> 
>> On 28.04.2020 15:19, Jinpeng Wu wrote:
>>> Hi, Roman. It's great to see your proposal. Actually my team has also
>>> been working on a Cascades planner based on Calcite, and we already have
>>> some results as well. Maybe we can combine our works.
>>> 
>>> I've pushed my code as https://github.com/apache/calcite/pull/1950 .
>>> 
>>> Our works have many places in common. We both developed a new
>>> CascadePlanner and avoided modifying the old VolcanoPlanner directly. We
>>> both implemented the top-down search strategy according to the
>>> Columbia optimizer generator
>>> <https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf>.
>>> But we also have some differences.
>>> 
>>> The first difference is that I try to reuse the existing VolcanoPlanner
>>> as much as possible. My CascadePlanner inherits from the existing
>>> VolcanoPlanner. Except that it overrides the ruleQueue and findBestPlan
>>> methods to rearrange rule applications, most logic is inherited from
>>> VolcanoPlanner. For example,
>>>  - It reuses the RelSet and RelSubset classes and the register method
>>>  - Rules are fired as soon as a RelNode is registered (in the
>>> Columbia optimizer generator, rules are not fired until exploring). The
>>> ApplyRule task controls when to invoke the onMatch method of a RuleMatch.
>>> This design has the benefit that we do not need to worry about missing a
>>> rule or firing a rule multiple times.
>>>  - It leverages AbstractConverter to pass trait requirements down. So
>>> currently AC is still essential in my code.
>>> This makes the new planner highly compatible with the old VolcanoPlanner.
>>> Features like MV and Hints can apply to it directly. And I tried to
>>> change VolcanoPlanner to the new CascadePlanner in tests. Most tests
>>> passed. Several cases did fail. I know the reason and how to fix them.
>>> But I am still thinking about marking them as "won't fix", as the rule
>>> set violates some basic principles of top-down trait requests.
>>> 
>>> The second difference is that our design has the ability to do space
>>> pruning. Currently it contains a simple LowerBoundCost metadata to
>>> compute the lower bound cost of a RelNode. Because logical properties
>>> like the cardinality of a RelSet are not stable during exploration, a
>>> group must be fully explored (implementation rules and enforcement rules
>>> should never modify the logical properties) before it can provide a
>>> valid lower bound cost. Because of that, logical search space pruning is
>>> not supported now. It can only prune out implementation rules and
>>> enforcement rules. Testing with cases in our own product, the new
>>> planner saves about 10% of rule applications. I am still considering how
>>> to support logical space pruning, looking forward to more improvements.
>>> 
>>> Hope my code will help.
>>> 
>>> Thanks,
>>> Jinpeng
>>> 
>>> 
>>> On Tue, Apr 28, 2020 at 11:22 AM Xiening Dai 
>> wrote:
>>> 
>>>> For #1, aside from that, we need to be able to build physical nodes
>>>> based on a convention. For example, if we merge two EnumerableProject,
>>>> we would want to create an EnumerableProject as a result, instead of
>>>> LogicalProject. The RelBuilder change 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-30 Thread Jinpeng Wu
Sure. I will add more cases to my PR.

I did not design more cases because our own product has a test framework,
which contains thousands of actual user queries.
Calcite's code base is quite different, so I cannot just migrate the cases
to Calcite. It may take some time.

On Wed, Apr 29, 2020 at 4:27 AM Roman Kondakov 
wrote:

> Hi Jinpeng,
>
> I went through your PR and it seemed very impressive to me. It is very
> similar to what I did, but you've reused much existing logic from the
> Volcano planner. We should definitely stay in sync in our experiments. I
> believe the future Cascades planner will be some combination of our
> works.
>
> Is there any way to run tests that are close to real system query
> execution? Maybe with the Enumerable convention, or, better, with a
> convention that supports a distribution trait? I just want to look through
> your planner's optimization steps more thoroughly. I've found some tests
> in the org.apache.calcite.plan.volcano package, but they use synthetic
> conventions and nodes. Maybe I missed something.
>
> Thank you for sharing your work!
>
> --
> Kind Regards
> Roman Kondakov
>
>
> On 28.04.2020 15:19, Jinpeng Wu wrote:
> > Hi, Roman. It's great to see your proposal. Actually my team has also
> > been working on a Cascades planner based on Calcite, and we already have
> > some results as well. Maybe we can combine our works.
> >
> > I've pushed my code as https://github.com/apache/calcite/pull/1950 .
> >
> > Our works have many places in common. We both developed a new
> > CascadePlanner and avoided modifying the old VolcanoPlanner directly. We
> > both implemented the top-down search strategy according to the
> > Columbia optimizer generator
> > <https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf>.
> > But we also have some differences.
> >
> > The first difference is that I try to reuse the existing VolcanoPlanner
> > as much as possible. My CascadePlanner inherits from the existing
> > VolcanoPlanner. Except that it overrides the ruleQueue and findBestPlan
> > methods to rearrange rule applications, most logic is inherited from
> > VolcanoPlanner. For example,
> >   - It reuses the RelSet and RelSubset classes and the register method
> >   - Rules are fired as soon as a RelNode is registered (in the
> > Columbia optimizer generator, rules are not fired until exploring). The
> > ApplyRule task controls when to invoke the onMatch method of a RuleMatch.
> > This design has the benefit that we do not need to worry about missing a
> > rule or firing a rule multiple times.
> >   - It leverages AbstractConverter to pass trait requirements down. So
> > currently AC is still essential in my code.
> > This makes the new planner highly compatible with the old VolcanoPlanner.
> > Features like MV and Hints can apply to it directly. And I tried to
> > change VolcanoPlanner to the new CascadePlanner in tests. Most tests
> > passed. Several cases did fail. I know the reason and how to fix them.
> > But I am still thinking about marking them as "won't fix", as the rule
> > set violates some basic principles of top-down trait requests.
> >
> > The second difference is that our design has the ability to do space
> > pruning. Currently it contains a simple LowerBoundCost metadata to
> > compute the lower bound cost of a RelNode. Because logical properties
> > like the cardinality of a RelSet are not stable during exploration, a
> > group must be fully explored (implementation rules and enforcement rules
> > should never modify the logical properties) before it can provide a
> > valid lower bound cost. Because of that, logical search space pruning is
> > not supported now. It can only prune out implementation rules and
> > enforcement rules. Testing with cases in our own product, the new
> > planner saves about 10% of rule applications. I am still considering how
> > to support logical space pruning, looking forward to more improvements.
> >
> > Hope my code will help.
> >
> > Thanks,
> > Jinpeng
> >
> >
> > On Tue, Apr 28, 2020 at 11:22 AM Xiening Dai 
> wrote:
> >
> >> For #1, aside from that, we need to be able to build physical nodes
> >> based on a convention. For example, if we merge two EnumerableProject,
> >> we would want to create an EnumerableProject as a result, instead of
> >> LogicalProject. The RelBuilder change I work on would help this case.
> >>
> >> For #2, I don't think it's just a bug. If the physical cost cannot be
> >> reliable before transformation is finished, we should probably delay the
> >> physical cost calculation, or we risk doing it over again. The other way
> >> is to complete RelSet transformation before implementing it - which is a
> >> common practice in industry, including Orca.
> >>
> >> The multi-convention case is a key scenario, and I agree we should
> >> support it. My thinking is more about separating the logical one
> >> (Convention.NONE) from the others.
> >>
> >>
> >>> On Apr 27, 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-28 Thread Roman Kondakov
Hi Xiening,

thank you for your feedback.

> 1. Do we really need RelGroup and RelSubGroup? I believe the memo structure 
> would be largely the same even if we move towards a Cascades planner. I think 
> we can reuse most of the RelSet/RelSubset today. RelSubset is a tricky one. 
> There's no corresponding concept in the Cascades framework. But we do need a 
> way to track the RelNode physical traits anyway. A RelSubset is more like a 
> logical grouping of the RelNodes with the same trait set, which I think is 
> still valuable.

No, we don't really need RelGroup and RelSubGroup. My initial approach
was to use only RelGroup, as in the original Cascades framework, but later I
realized that RelSubGroup is a convenient concept and introduced it. I'm
going to replace both with RelSet/RelSubset later, even though
RelSet/RelSubset have many things that are not needed for Cascades (parents,
propagateCostImprovements, etc.).

> Keeping backward compatibility is a key goal. We have to think through this 
> at the very beginning of the design. The community has invested a lot with 
> Volcano planner over the years. We cannot ask people to rewrite their rules 
> just to make them work with the new planner. We should expect current rules 
> would just work, or would work with “minimal” changes.

I agree with you. My next step is to bring backward compatibility to the
Cascades planner.

> There are some building blocks that are currently missing which will prevent 
> us from using the Cascade top down search strategy. For example, the way how 
> rule is matched, the lack of transformation phase, the lack of the ability to 
> create physical nodes by the framework, etc. If we don’t care about current 
> rules, and start from scratch, we probably don’t need to fix these issues. 
> But with #2 in mind, we have to work out solutions for these so we can carry 
> over existing rules.

I think we can reuse existing rules and at the same time achieve a correct
top-down Cascades search strategy. For example, I've changed the rule
matching strategy in the prototype: rules are matched on demand in a
top-down way (see the org.apache.calcite.plan.cascades.ApplyRule.Bindery
class). And I believe that all other things (like creating physical
nodes) can be fixed in a similar way: by changing the planner code
while maintaining backward compatibility.


-- 
Kind Regards
Roman Kondakov


On 28.04.2020 03:12, Xiening Dai wrote:
> Hi Roman,
> 
> First, thank you for sharing your design and prototype code. I took a quick 
> look at your design and have some high level feedback -
> 
> 1. Do we really need RelGroup and RelSubGroup? I believe the memo structure 
> would be largely the same even if we move towards a Cascades planner. I think 
> we can reuse most of the RelSet/RelSubset today. RelSubset is a tricky one. 
> There's no corresponding concept in the Cascades framework. But we do need a 
> way to track the RelNode physical traits anyway. A RelSubset is more like a 
> logical grouping of the RelNodes with the same trait set, which I think is 
> still valuable.
> 
> 2. Keeping backward compatibility is a key goal. We have to think through 
> this at the very beginning of the design. The community has invested a lot 
> with Volcano planner over the years. We cannot ask people to rewrite their 
> rules just to make them work with the new planner. We should expect current 
> rules would just work, or would work with “minimal” changes.
> 
> 3. There are some building blocks that are currently missing which will 
> prevent us from using the Cascade top down search strategy. For example, the 
> way how rule is matched, the lack of transformation phase, the lack of the 
> ability to create physical nodes by the framework, etc. If we don’t care 
> about current rules, and start from scratch, we probably don’t need to fix 
> these issues. But with #2 in mind, we have to work out solutions for these so 
> we can carry over existing rules.
> 
> I think your execution plan looks good overall. We can iterate on the design 
> while working out a document.
> 
> For the purpose of transparency, I've been working with Haisheng in the last 
> few months regarding this topic. I agree with him on most parts of his 
> roadmap, which I think is a tangible plan to evolve the current Volcano 
> planner.
> 
> 
>> On Apr 27, 2020, at 11:29 AM, Roman Kondakov  
>> wrote:
>>
>> Hi all,
>>
>> Stamatis, Haisheng thank you very much for your feedback! I really
>> appreciate it.
>>
>>> If in the new planner we end up copy-pasting code then I guess it will be a 
>>> bad idea.
>>
>> Yes, there is some code duplication between the Volcano planner and the
>> Cascades planner. I think I'll move it to some common superclass: either
>> the existing AbstractRelOptPlanner or a new AbstractCostBasedPlanner.
>>
>>> Like that it will be easier for everybody to test and see if the changes 
>>> make this better or worse. From a backward compatibility perspective it 
>>> seems feasible to keep the new features configurable for 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-28 Thread Roman Kondakov
Hi Jinpeng,

I went through your PR and it seemed very impressive to me. It is very
similar to what I did, but you've reused much existing logic from the
Volcano planner. We should definitely stay in sync in our experiments. I
believe the future Cascades planner will be some combination of our
works.

Is there any way to run tests that are close to real system query
execution? Maybe with the Enumerable convention, or, better, with a
convention that supports a distribution trait? I just want to look through
your planner's optimization steps more thoroughly. I've found some tests
in the org.apache.calcite.plan.volcano package, but they use synthetic
conventions and nodes. Maybe I missed something.

Thank you for sharing your work!

-- 
Kind Regards
Roman Kondakov


On 28.04.2020 15:19, Jinpeng Wu wrote:
> Hi, Roman. It's great to see your proposal. Actually my team has also been
> working on a Cascades planner based on Calcite, and we already have some
> results as well. Maybe we can combine our works.
> 
> I've pushed my code as https://github.com/apache/calcite/pull/1950 .
> 
> Our works have many places in common. We both developed a new
> CascadePlanner and avoided modifying the old VolcanoPlanner directly. We
> both implemented the top-down search strategy according to the
> Columbia optimizer generator
> <https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf>.
> But we also have some differences.
> 
> The first difference is that I try to reuse the existing VolcanoPlanner as
> much as possible. My CascadePlanner inherits from the existing
> VolcanoPlanner. Except that it overrides the ruleQueue and findBestPlan
> methods to rearrange rule applications, most logic is inherited from
> VolcanoPlanner. For example,
>   - It reuses the RelSet and RelSubset classes and the register method
>   - Rules are fired as soon as a RelNode is registered (in the
> Columbia optimizer generator, rules are not fired until exploring). The
> ApplyRule task controls when to invoke the onMatch method of a RuleMatch.
> This design has the benefit that we do not need to worry about missing a
> rule or firing a rule multiple times.
>   - It leverages AbstractConverter to pass trait requirements down. So
> currently AC is still essential in my code.
> This makes the new planner highly compatible with the old VolcanoPlanner.
> Features like MV and Hints can apply to it directly. And I tried to change
> VolcanoPlanner to the new CascadePlanner in tests. Most tests passed.
> Several cases did fail. I know the reason and how to fix them. But I am
> still thinking about marking them as "won't fix", as the rule set violates
> some basic principles of top-down trait requests.
> 
> The second difference is that our design has the ability to do space
> pruning. Currently it contains a simple LowerBoundCost metadata to compute
> the lower bound cost of a RelNode. Because logical properties like the
> cardinality of a RelSet are not stable during exploration, a group must
> be fully explored (implementation rules and enforcement rules should never
> modify the logical properties) before it can provide a valid lower bound
> cost. Because of that, logical search space pruning is not supported now.
> It can only prune out implementation rules and enforcement rules. Testing
> with cases in our own product, the new planner saves about 10% of rule
> applications. I am still considering how to support logical space pruning,
> looking forward to more improvements.
> 
> Hope my code will help.
> 
> Thanks,
> Jinpeng
> 
> 
> On Tue, Apr 28, 2020 at 11:22 AM Xiening Dai  wrote:
> 
>> For #1, aside from that, we need to be able to build physical nodes based
>> on a convention. For example, if we merge two EnumerableProject, we would
>> want to create an EnumerableProject as a result, instead of LogicalProject.
>> The RelBuilder change I work on would help this case.
>>
>> For #2, I don't think it's just a bug. If the physical cost cannot be
>> reliable before transformation is finished, we should probably delay the
>> physical cost calculation, or we risk doing it over again. The other way is
>> to complete RelSet transformation before implementing it - which is a
>> common practice in industry, including Orca.
>>
>> The multi-convention case is a key scenario, and I agree we should support
>> it. My thinking is more about separating the logical one (Convention.NONE)
>> from the others.
>>
>>
>>> On Apr 27, 2020, at 4:59 PM, Julian Hyde  wrote:
>>>
>>> Re 1. By all means have multiple instances of a rule (e.g. one instance
>>> that matches LogicalFilter and another that matches FooFilter) and enable
>>> different instances during different phases. (We have been slow to create
>>> all of these variant instances, in part because of the difficulty of
>>> changing the constructors of many existing rules. My proposed change
>>> https://issues.apache.org/jira/browse/CALCITE-3923 would 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-28 Thread Jinpeng Wu
Hi, Roman. It's great to see your proposal. Actually my team has also been
working on a Cascades planner based on Calcite, and we already have some
results as well. Maybe we can combine our works.

I've pushed my code as https://github.com/apache/calcite/pull/1950 .

Our works have many places in common. We both developed a new
CascadePlanner and avoided modifying the old VolcanoPlanner directly. We
both implemented the top-down search strategy according to the
Columbia optimizer generator
<https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf>.
But we also have some differences.
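
For readers unfamiliar with the Columbia design, here is a hedged Java sketch
of its task-based top-down search loop; the names are illustrative, not code
from either PR:

    import java.util.ArrayDeque;
    import java.util.Deque;

    // A task may push further tasks (optimize a group's inputs, explore a
    // group, apply a rule), so the stack drives the search top-down from
    // the root group.
    interface Task {
      void perform(Deque<Task> tasks);
    }

    class TaskScheduler {
      void run(Task optimizeRootGroup) {
        Deque<Task> tasks = new ArrayDeque<>();
        tasks.push(optimizeRootGroup);   // e.g. an OptimizeGroup task
        while (!tasks.isEmpty()) {
          tasks.pop().perform(tasks);    // may schedule child tasks
        }
      }
    }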

The first difference is that I try to reuse the existing VolcanoPlanner as
much as possible. My CascadePlanner inherits from the existing
VolcanoPlanner. Except that it overrides the ruleQueue and findBestPlan
methods to rearrange rule applications, most logic is inherited from
VolcanoPlanner. For example,
  - It reuses the RelSet and RelSubset classes and the register method
  - Rules are fired as soon as a RelNode is registered (in the
Columbia optimizer generator, rules are not fired until exploring). The
ApplyRule task controls when to invoke the onMatch method of a RuleMatch.
This design has the benefit that we do not need to worry about missing a
rule or firing a rule multiple times.
  - It leverages AbstractConverter to pass trait requirements down. So
currently AC is still essential in my code.
This makes the new planner highly compatible with the old VolcanoPlanner.
Features like MV and Hints can apply to it directly. And I tried to change
VolcanoPlanner to the new CascadePlanner in tests. Most tests passed.
Several cases did fail. I know the reason and how to fix them. But I am
still thinking about marking them as "won't fix", as the rule set violates
some basic principles of top-down trait requests.

The second difference is that our design has the ability to do space
pruning. Currently it contains a simple LowerBoundCost metadata to compute
the lower bound cost of a RelNode. Because logical properties like the
cardinality of a RelSet are not stable during exploration, a group must be
fully explored (implementation rules and enforcement rules should never
modify the logical properties) before it can provide a valid lower bound
cost. Because of that, logical search space pruning is not supported now.
It can only prune out implementation rules and enforcement rules. Testing
with cases in our own product, the new planner saves about 10% of rule
applications. I am still considering how to support logical space pruning,
looking forward to more improvements.
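
To make the pruning rule concrete, a hedged sketch of the bound check; the
method is modeled on the description above and is not actual API:

    import org.apache.calcite.plan.RelOptCost;

    class PruningSketch {
      // lowerBound: optimistic cost for any plan in a fully explored group;
      // upperBound: cost of the best complete plan found so far.
      boolean shouldPrune(RelOptCost lowerBound, RelOptCost upperBound) {
        // If even the cheapest conceivable plan of the group costs at least
        // as much as a plan we already have, implementing it is wasted work.
        return lowerBound != null && upperBound != null
            && upperBound.isLe(lowerBound);
      }
    }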

Hope my code will help.

Thanks,
Jinpeng


On Tue, Apr 28, 2020 at 11:22 AM Xiening Dai  wrote:

> For #1, aside from that, we need to be able to build physical nodes based
> on a convention. For example, if we merge two EnumerableProject, we would
> want to create an EnumerableProject as a result, instead of LogicalProject.
> The RelBuilder change I work on would help this case.
>
> For #2, I don't think it's just a bug. If the physical cost cannot be
> reliable before transformation is finished, we should probably delay the
> physical cost calculation, or we risk doing it over again. The other way is
> to complete RelSet transformation before implementing it - which is a
> common practice in industry, including Orca.
>
> The multi-convention case is a key scenario, and I agree we should support
> it. My thinking is more about separating the logical one (Convention.NONE)
> from the others.
>
>
> > On Apr 27, 2020, at 4:59 PM, Julian Hyde  wrote:
> >
> > Re 1. By all means have multiple instances of a rule (e.g. one instance
> > that matches LogicalFilter and another that matches FooFilter) and enable
> > different instances during different phases. (We have been slow to create
> > all of these variant instances, in part because of the difficulty of
> > changing the constructors of many existing rules. My proposed change
> > https://issues.apache.org/jira/browse/CALCITE-3923 would make it easier
> > to create new instances that are more precisely targeted.)
> >
> > Re 2. Sure, there are bugs.
> >
> > Re 3. That’s a good idea. The planner could have a "single-convention
> mode", which might be faster, but would throw if it encountered a rel in a
> different convention.
> >
> > I think we’re in agreement - separating rules is a best practice. But we
> shouldn’t force that best practice on everyone. The multi-convention case
> is still crucial for planning hybrid queries (e.g. joining MySQL to
> MongoDB).
> >
> > Julian
> >
> >
> >> On Apr 27, 2020, at 4:28 PM, Xiening Dai  wrote:
> >>
> >> Hi Julian,
> >>
> >> In my view, separating logical and physical rules has a number of
> >> benefits -
> >>
> >> 1. With the current design, a rule can match both physical and logical
> >> nodes. This behavior could cause duplication of rule firings and explosion
> >> of the memo and search space. There was a long discussion regarding this
> 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Xiening Dai
For #1, aside from that, we need to be able to build physical nodes based on a 
convention. For example, if we merge two EnumerableProject, we would want to 
create an EnumerableProject as a result, instead of LogicalProject. The 
RelBuilder change I work on would help this case.

For #2, I don't think it's just a bug. If the physical cost cannot be reliable 
before transformation is finished, we should probably delay the physical cost 
calculation, or we risk doing it over again. The other way is to complete 
RelSet transformation before implementing it - which is a common practice in 
industry, including Orca.

The multi-convention case is a key scenario, and I agree we should support it. 
My thinking is more about separating the logical one (Convention.NONE) from 
the others.


> On Apr 27, 2020, at 4:59 PM, Julian Hyde  wrote:
> 
> Re 1. By all means have multiple instances of a rule (e.g. one instance that 
> matches LogicalFilter and another that matches FooFilter) and enable 
> different instances during different phases. (We have been slow to create all 
> of these variant instances, in part because of the difficulty of changing the 
> constructors of many existing rules. My proposed change 
> https://issues.apache.org/jira/browse/CALCITE-3923 would make it easier to 
> create new instances that are more precisely targeted.)
> 
> Re 2. Sure, there are bugs.
> 
> Re 3. That’s a good idea. The planner could have a "single-convention mode", 
> which might be faster, but would throw if it encountered a rel in a different 
> convention.
> 
> I think we’re in agreement - separating rules is a best practice. But we 
> shouldn’t force that best practice on everyone. The multi-convention case is 
> still crucial for planning hybrid queries (e.g. joining MySQL to MongoDB).
> 
> Julian
> 
> 
>> On Apr 27, 2020, at 4:28 PM, Xiening Dai  wrote:
>> 
>> Hi Julian,
>> 
>> In my view, separating logical and physical rules has a number of benefits -
>> 
>> 1. With the current design, a rule can match both physical and logical nodes. 
>> This behavior could cause duplication of rule firings and explosion of the memo 
>> and search space. There was a long discussion regarding this (CALCITE-2223). 
>> Although the indefinite rule matching problem is fixed by a separate 
>> change, the duplicate rule firing is not resolved. There's a patch trying to 
>> address it (https://github.com/apache/calcite/pull/1543), but it still fell 
>> short due to current design limitations.
>> 
>> 2. We have a few metadata inconsistency issues today, which are due to the 
>> fact that we don’t clearly define a transformation phase. For example, we 
>> don’t have a solution for CALCITE-2166 as long as transformation can still 
>> apply to a RelSet after it’s been implemented, which means the group logical 
>> properties (such as row count) can still change and invalidate all the 
>> previous best cost calculations, so the best cost can increase (oops!).
>> 
>> 3. The other benefit is that the planner can choose to shortcut a number of 
>> expensive operations (such as RelSubset registration, cost calculation and 
>> propagation, etc.) during the transformation phase, if that phase were clearly 
>> defined and enforced by the framework.
>> 
>> 
>>> On Apr 27, 2020, at 11:59 AM, Julian Hyde  wrote:
>>> 
>>> This thread has almost gotten too long to respond to. I confess I’ve not 
>>> read much of it. I’m going to reply anyway. Sorry.
>>> 
>>> I support making Calcite’s optimizer support “Cascades”.
>>> 
>>> We should keep the existing VolcanoPlanner working during the transition, 
>>> and perhaps longer. (I acknowledge that this will not be easy. But it will 
>>> not be impossible, because we already have 2 planner engines, and going 
>>> from 2 to 3 is much easier than going from 1 to 2.)
>>> 
>>> I perhaps have a different philosophy than Haisheng + Cascades on logical 
>>> vs physical. I do believe that logical & physical rels and rules are more 
>>> similar than they are different, therefore they should have the same 
>>> classes, etc. Pragmatically, it makes a lot of sense to create rule sets 
>>> and planner phases that deal with just logical rels, or just physical rels. 
>>> But that’s a best practice, not a rule.
>>> 
>>> This philosophy manifested in a discussion a couple of weeks ago about 
>>> whether RelBuilder should be able to create physical rels. I still believe 
>>> that it should.
>>> 
>>> Julian
>>> 
>>> 
 On Apr 27, 2020, at 11:29 AM, Roman Kondakov  
 wrote:
 
 Hi all,
 
 Stamatis, Haisheng thank you very much for your feedback! I really
 appreciate it.
 
> If in the new planner we end up copy-pasting code then I guess it will be 
> a bad idea.
 
 Yes, there is some code duplication between the Volcano planner and the
 Cascades planner. I think I'll move it to some common superclass: either
 to existing AbstractRelOptPlanner 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Xiening Dai
Hi Roman,

First, thank you for sharing your design and prototype code. I took a quick 
look at your design and have some high-level feedback -

1. Do we really need RelGroup and RelSubGroup? I believe the memo structure 
would be largely the same even if we move towards a Cascades planner. I think we 
can reuse most of RelSet/RelSubset today. RelSubset is a tricky one. 
There’s no corresponding concept in the Cascades framework. But we do need a way 
to track a RelNode's physical traits anyway. A RelSubset is more like a logical 
grouping of the RelNodes with the same trait set, which I think is still valuable.
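
Roughly, in a simplified sketch (not the real class definitions):

    // A RelSet is an equivalence group; each RelSubset narrows the set
    // down to the members that deliver one particular trait set.
    class RelSet {
      List<RelNode> rels;       // all equivalent expressions
      List<RelSubset> subsets;  // one per distinct RelTraitSet
    }
    class RelSubset {
      RelTraitSet traits;       // e.g. ENUMERABLE.hash[a]
      RelNode best;             // cheapest rel found so far for 'traits'
    }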

2. Keeping backward compatibility is a key goal. We have to think through this 
at the very beginning of the design. The community has invested a lot with 
Volcano planner over the years. We cannot ask people to rewrite their rules 
just to make them work with the new planner. We should expect current rules 
would just work, or would work with “minimal” changes.

3. There are some building blocks that are currently missing which will prevent 
us from using the Cascades top-down search strategy. For example, the way 
rules are matched, the lack of a transformation phase, the lack of the ability for 
the framework to create physical nodes, etc. If we don’t care about current 
rules, and start from scratch, we probably don’t need to fix these issues. But 
with #2 in mind, we have to work out solutions for these so we can carry over 
existing rules.
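
For the last point, what I mean by letting the framework create physical nodes 
could be sketched like this (the interface is invented for illustration):

    // Hypothetical: the operator exposes a factory, and the planner, not
    // the rule, decides when to instantiate the physical node during the
    // implementation phase.
    interface PhysicalNodeFactory {
      RelNode createPhysical(RelNode logical, RelTraitSet requested);
    }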

I think your execution plan looks good overall. We can iterate on the design 
while working out a document.

For the purpose of transparency, I’ve been working with Haisheng in the last 
few months regarding this topic. I agree with him on most parts of his 
roadmap, which I think is a tangible plan to evolve the current Volcano planner.


> On Apr 27, 2020, at 11:29 AM, Roman Kondakov  
> wrote:
> 
> Hi all,
> 
> Stamatis, Haisheng thank you very much for your feedback! I really
> appreciate it.
> 
>> If in the new planner we end up copy-pasting code then I guess it will be a 
>> bad idea.
> 
> Yes, there is some code duplication between the Volcano planner and the
> Cascades planner. I think I'll move it to some common superclass: either
> to the existing AbstractRelOptPlanner or a new AbstractCostBasedPlanner.
> 
>> Like that it will be easier for everybody to test and see if the changes 
>> make this better or worse. From a backward compatibility perspective it 
>> seems feasible to keep the new features configurable for a certain amount of 
>> time.
> 
> I was thinking about backward compatibility in this way: what if we could
> switch planners in tests by setting some flag (a system property?)
> somewhere and check what happens with the new planner if we replace Volcano
> with it. And if it ever turns out that the new planner passes all of
> Volcano's tests and at the same time works more efficiently, we can
> safely replace the Volcano planner with the Cascades planner.
> 
>> A design doc would definitely help, especially if it has a few end-to-end 
>> (from logical to physical plan) examples showing how the optimizer works
> at each step before/after the changes. This is actually what is usually
> missing in research papers that makes them hard to understand.
>> I am thinking of something similar to the examples that Haisheng sent in the 
>> first email but possibly a bit more detailed.
> 
> I agree, I'll add some examples to the design doc very soon.
> 
>> I looked very briefly in the PR by Roman but I think I didn't see tests 
>> where the final plan contains operators from multiple conventions. Multiple 
>> conventions are among the choices that complicate certain parts of the 
>> existing planner so we should make sure that we take this into account.
> 
> It's on my radar. I'm going to add these tests.
> 
> I would like to ask the community about my next steps for the transition
> of the Cascades planner from prototype status to becoming
> a part of the project. I see these steps like this:
> 
> 1. Create a jira ticket.
> 2. Update design document with examples.
> 3. Do some research on backward compatibility with the Volcano
> planner, to be able to replace the Volcano planner with the Cascades planner in
> tests and to understand the current problems with the planner.
> 4. Solve known problems
>  - materialized views
>  - hints
>  - multiple conventions
>  - listener hooks
>  - problems from p.3.
> 5. New PR, review and merge.
> 6. Replace the Volcano planner with Cascades after several releases.
> 
> What do you think about this roadmap?
> 
> 
> -- 
> Kind Regards
> Roman Kondakov
> 
> 
> On 27.04.2020 01:55, Stamatis Zampetakis wrote:
>> Hi all,
>> 
>> I am very excited about the ideas discussed so far and especially by the
>> enthusiasm of many people that are ready to help for pulling this out.
>> I wouldn't expect that we could have a prototype so quickly.
>> Thanks a lot everyone!
>> 
>> In the debate between creating new planner or patching the existing one, I
>> don't have a 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Julian Hyde
PS Someone mentioned that logical properties do not propagate across RelSubsets 
in the same RelSet. That is a bug, and we should fix it.

For example, if subset#1 has determined that MinRowCount=1 then subset#2 in the 
same set should also inherit that MinRowCount. The same goes for other logical 
properties, such as RelMdPredicates. The benefits of sharing properties among 
subsets are very significant.
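
A hedged sketch of what that inheritance could look like (the accessors are 
hypothetical, and this is not how RelMdMinRowCount is written today):

    // Logical properties hold for the whole equivalence set, so answer a
    // subset's query by consulting every rel in its set and keeping the
    // tightest lower bound any equivalent expression can guarantee.
    Double minRowCount(RelSubset subset, RelMetadataQuery mq) {
      Double best = null;
      for (RelNode rel : subset.getSet().getRels()) {  // hypothetical accessors
        Double d = mq.getMinRowCount(rel);
        if (d != null && (best == null || d > best)) {
          best = d;  // a larger lower bound is a tighter guarantee
        }
      }
      return best;
    }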

We might need to formalize what we mean by “logical properties”. It would not 
be valid to propagate RelMdCollation between two subsets that have a different 
RelCollation property.

Julian


> On Apr 27, 2020, at 4:59 PM, Julian Hyde  wrote:
> 
> Re 1. By all means have multiple instances of a rule (e.g. one instance that 
> matches LogicalFilter and another that matches FooFilter) and enable 
> different instances during different phases. (We have been slow to create all 
> of these variant instances, in part because of the difficulty of changing the 
> constructors of many existing rules. My proposed change 
> https://issues.apache.org/jira/browse/CALCITE-3923 would make it easier to
> create new instances that are more precisely targeted.)
> 
> Re 2. Sure, there are bugs.
> 
> Re 3. That’s a good idea. The planner could have a "single-convention mode", 
> which might be faster, but would throw if it encountered a rel in a different 
> convention.
> 
> I think we’re in agreement - separating rules is a best practice. But we 
> shouldn’t force that best practice on everyone. The multi-convention case is 
> still crucial for planning hybrid queries (e.g. joining MySQL to MongoDB).
> 
> Julian
> 
> 
>> On Apr 27, 2020, at 4:28 PM, Xiening Dai wrote:
>> 
>> Hi Julian,
>> 
>> In my view, separating logical and physical rules has a number of benefits -
>> 
>> 1. With the current design, a rule can match both physical and logical nodes. 
>> This behavior could cause duplication of rule firings and explosion of the memo 
>> and search space. There was a long discussion regarding this (CALCITE-2223). 
>> Although the indefinite rule matching problem is fixed by a separate 
>> change, the duplicate rule firing is not resolved. There's a patch trying to 
>> address it (https://github.com/apache/calcite/pull/1543), but it still fell 
>> short due to current design limitations.
>> 
>> 2. We have a few metadata inconsistency issues today, which are due to the 
>> fact that we don’t clearly define a transformation phase. For example, we 
>> don’t have a solution for CALCITE-2166 as long as transformation can still 
>> apply to a RelSet after it’s been implemented, which means the group logical 
>> properties (such as row count) can still change and invalidate all the 
>> previous best cost calculations, so the best cost can increase (oops!).
>> 
>> 3. The other benefit is that the planner can choose to shortcut a number of 
>> expensive operations (such as RelSubset registration, cost calculation and 
>> propagation, etc.) during the transformation phase, if that phase were clearly 
>> defined and enforced by the framework.
>> 
>> 
>>> On Apr 27, 2020, at 11:59 AM, Julian Hyde wrote:
>>> This thread has almost gotten too long to respond to. I confess I’ve not 
>>> read much of it. I’m going to reply anyway. Sorry.
>>> 
>>> I support making Calcite’s optimizer support “Cascades”.
>>> 
>>> We should keep the existing VolcanoPlanner working during the transition, 
>>> and perhaps longer. (I acknowledge that this will not be easy. But it will 
>>> not be impossible, because we already have 2 planner engines, and going 
>>> from 2 to 3 is much easier than going from 1 to 2.)
>>> 
>>> I perhaps have a different philosophy than Haisheng + Cascades on logical 
>>> vs physical. I do believe that logical & physical rels and rules are more 
>>> similar than they are different, therefore they should have the same 
>>> classes, etc. Pragmatically, it makes a lot of sense to create rule sets 
>>> and planner phases that deal with just logical rels, or just physical rels. 
>>> But that’s a best practice, not a rule.
>>> 
>>> This philosophy manifested in a discussion a couple of weeks ago about 
>>> whether RelBuilder should be able to create physical rels. I still believe 
>>> that it should.
>>> 
>>> Julian
>>> 
>>> 
 On Apr 27, 2020, at 11:29 AM, Roman Kondakov wrote:
 Hi all,
 
 Stamatis, Haisheng thank you very much for your feedback! I really
 appreciate it.
 
> If in the new planner we end up copy-pasting code then I guess it will be 
> a bad idea.
 
 Yes, there is some code duplication between the Volcano planner and the
 Cascades planner. I think I'll move it to some common superclass: either
 to existing 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Julian Hyde
Re 1. By all means have multiple instances of a rule (e.g. one instance that 
matches LogicalFilter and another that matches FooFilter) and enable different 
instances during different phases. (We have been slow to create all of these 
variant instances, in part because of the difficulty of changing the 
constructors of many existing rules. My proposed change 
https://issues.apache.org/jira/browse/CALCITE-3923 would make it easier to 
create new instances that are more precisely targeted.)
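
For example (a sketch using the pre-CALCITE-3923 rule API; FooFilter stands in 
for an adapter's filter class):

    // Two instances of one rule, each matching a different Filter
    // subclass, so each can be enabled in a different planner phase.
    public class MyFilterRule extends RelOptRule {
      public static final MyFilterRule LOGICAL =
          new MyFilterRule(LogicalFilter.class);
      public static final MyFilterRule FOO =
          new MyFilterRule(FooFilter.class);

      private MyFilterRule(Class<? extends Filter> clazz) {
        super(operand(clazz, any()),
            "MyFilterRule:" + clazz.getSimpleName());
      }

      @Override public void onMatch(RelOptRuleCall call) {
        // transformation logic shared by both instances
      }
    }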

Re 2. Sure, there are bugs.

Re 3. That’s a good idea. The planner could have a "single-convention mode", 
which might be faster, but would throw if it encountered a rel in a different 
convention.
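
Something like this, purely hypothetically (no such mode or fields exist today):

    // Hypothetical guard during registration when single-convention mode
    // is on; Convention.NONE stays allowed for purely logical rels.
    private void checkConvention(RelNode rel) {
      Convention c = rel.getConvention();
      if (singleConventionMode
          && c != requiredConvention && c != Convention.NONE) {
        throw new AssertionError("rel " + rel + " has convention " + c
            + "; planner is locked to " + requiredConvention);
      }
    }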

I think we’re in agreement - separating rules is a best practice. But we 
shouldn’t force that best practice on everyone. The multi-convention case is 
still crucial for planning hybrid queries (e.g. joining MySQL to MongoDB).

Julian


> On Apr 27, 2020, at 4:28 PM, Xiening Dai  wrote:
> 
> Hi Julian,
> 
> In my view, separating logical and physical rules has a number of benefits -
> 
> 1. With the current design, a rule can match both physical and logical nodes. 
> This behavior could cause duplication of rule firings and explosion of the memo 
> and search space. There was a long discussion regarding this (CALCITE-2223). 
> Although the indefinite rule matching problem is fixed by a separate 
> change, the duplicate rule firing is not resolved. There's a patch trying to 
> address it (https://github.com/apache/calcite/pull/1543), but it still fell 
> short due to current design limitations.
> 
> 2. We have a few metadata inconsistency issues today, which are due to the fact 
> that we don’t clearly define a transformation phase. For example, we don’t have 
> a solution for CALCITE-2166 as long as transformation can still apply to a 
> RelSet after it’s been implemented, which means the group logical properties 
> (such as row count) can still change and invalidate all the previous best 
> cost calculations, so the best cost can increase (oops!).
> 
> 3. The other benefit is that the planner can choose to shortcut a number of expensive 
> operations (such as RelSubset registration, cost calculation and propagation, 
> etc.) during the transformation phase, if that phase were clearly defined and 
> enforced by the framework.
> 
> 
>> On Apr 27, 2020, at 11:59 AM, Julian Hyde  wrote:
>> 
>> This thread has almost gotten too long to respond to. I confess I’ve not 
>> read much of it. I’m going to reply anyway. Sorry.
>> 
>> I support making Calcite’s optimizer support “Cascades”.
>> 
>> We should keep the existing VolcanoPlanner working during the transition, 
>> and perhaps longer. (I acknowledge that this will not be easy. But it will 
>> not be impossible, because we already have 2 planner engines, and going from 
>> 2 to 3 is much easier than going from 1 to 2.)
>> 
>> I perhaps have a different philosophy than Haisheng + Cascades on logical vs 
>> physical. I do believe that logical & physical rels and rules are more 
>> similar than they are different, therefore they should have the same 
>> classes, etc. Pragmatically, it makes a lot of sense to create rule sets and 
>> planner phases that deal with just logical rels, or just physical rels. But 
>> that’s a best practice, not a rule.
>> 
>> This philosophy manifested in a discussion a couple of weeks ago about 
>> whether RelBuilder should be able to create physical rels. I still believe 
>> that it should.
>> 
>> Julian
>> 
>> 
>>> On Apr 27, 2020, at 11:29 AM, Roman Kondakov  
>>> wrote:
>>> 
>>> Hi all,
>>> 
>>> Stamatis, Haisheng thank you very much for your feedback! I really
>>> appreciate it.
>>> 
 If in the new planner we end up copy-pasting code then I guess it will be 
 a bad idea.
>>> 
>>> Yes, there is some code duplication between the Volcano planner and the
>>> Cascades planner. I think I'll move it to some common superclass: either
>>> to the existing AbstractRelOptPlanner or a new AbstractCostBasedPlanner.
>>> 
 Like that it will be easier for everybody to test and see if the changes 
 make this better or worse. From a backward compatibility perspective it 
 seems feasible to keep the new features configurable for a certain amount 
 of time.
>>> 
>>> I was thinking about backward compatibility in this way: what if we could
>>> switch planners in tests by setting some flag (a system property?)
>>> somewhere and check what happens with the new planner if we replace Volcano
>>> with it. And if it ever turns out that the new planner passes all of
>>> Volcano's tests and at the same time works more efficiently, we can
>>> safely replace the Volcano planner with the Cascades planner.
>>> 
 A design doc would definitely help, especially if it has a few end-to-end 
 (from logical to physical plan) examples showing how the optimizer works
>>> at each step before/after the changes. This is actually 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Xiening Dai
Hi Julian,

In my view, separating logical and physical rules has a number of benefits -

1. With the current design, a rule can match both physical and logical nodes. This 
behavior could cause duplication of rule firings and explosion of the memo and 
search space. There was a long discussion regarding this (CALCITE-2223). 
Although the indefinite rule matching problem is fixed by a separate change, 
the duplicate rule firing is not resolved. There's a patch trying to address it 
(https://github.com/apache/calcite/pull/1543), but it still fell short due to 
current design limitations.

2. We have a few metadata inconsistency issues today, which are due to the fact 
that we don’t clearly define a transformation phase. For example, we don’t have a 
solution for CALCITE-2166 as long as transformation can still apply to a RelSet 
after it’s been implemented, which means the group logical properties (such as 
row count) can still change and invalidate all the previous best cost 
calculations, so the best cost can increase (oops!).

3. The other benefit is that the planner can choose to shortcut a number of expensive 
operations (such as RelSubset registration, cost calculation and propagation, 
etc.) during the transformation phase, if that phase were clearly defined and 
enforced by the framework.


> On Apr 27, 2020, at 11:59 AM, Julian Hyde  wrote:
> 
> This thread has almost gotten too long to respond to. I confess I’ve not read 
> much of it. I’m going to reply anyway. Sorry.
> 
> I support making Calcite’s optimizer support “Cascades”.
> 
> We should keep the existing VolcanoPlanner working during the transition, and 
> perhaps longer. (I acknowledge that this will not be easy. But it will not be 
> impossible, because we already have 2 planner engines, and going from 2 to 3 
> is much easier than going from 1 to 2.)
> 
> I perhaps have a different philosophy than Haisheng + Cascades on logical vs 
> physical. I do believe that logical & physical rels and rules are more 
> similar than they are different, therefore they should have the same classes, 
> etc. Pragmatically, it makes a lot of sense to create rule sets and planner 
> phases that deal with just logical rels, or just physical rels. But that’s a 
> best practice, not a rule.
> 
> This philosophy manifested in a discussion a couple of weeks ago about 
> whether RelBuilder should be able to create physical rels. I still believe 
> that it should.
> 
> Julian
> 
> 
>> On Apr 27, 2020, at 11:29 AM, Roman Kondakov  
>> wrote:
>> 
>> Hi all,
>> 
>> Stamatis, Haisheng thank you very much for your feedback! I really
>> appreciate it.
>> 
>>> If in the new planner we end up copy-pasting code then I guess it will be a 
>>> bad idea.
>> 
>> Yes, there is some code duplication between the Volcano planner and the
>> Cascades planner. I think I'll move it to some common superclass: either
>> to the existing AbstractRelOptPlanner or a new AbstractCostBasedPlanner.
>> 
>>> Like that it will be easier for everybody to test and see if the changes 
>>> make this better or worse. From a backward compatibility perspective it 
>>> seems feasible to keep the new features configurable for a certain amount of 
>>> time.
>> 
>> I was thinking about backward compatibility in this way: what if we could
>> switch planners in tests by setting some flag (a system property?)
>> somewhere and check what happens with the new planner if we replace Volcano
>> with it. And if it ever turns out that the new planner passes all of
>> Volcano's tests and at the same time works more efficiently, we can
>> safely replace the Volcano planner with the Cascades planner.
>> 
>>> A design doc would definitely help, especially if it has a few end-to-end 
>>> (from logical to physical plan) examples showing how the optimizer works
>> at each step before/after the changes. This is actually what is usually
>> missing in research papers that makes them hard to understand.
>>> I am thinking of something similar to the examples that Haisheng sent in the 
>>> first email but possibly a bit more detailed.
>> 
>> I agree, I'll add some examples to the design doc very soon.
>> 
>>> I looked very briefly in the PR by Roman but I think I didn't see tests 
>>> where the final plan contains operators from multiple conventions. Multiple 
>>> conventions are among the choices that complicate certain parts of the 
>>> existing planner so we should make sure that we take this into account.
>> 
>> It's on my radar. I'm going to add these tests.
>> 
>> I would like to ask the community about my next steps for the transition
>> of the Cascades planner from prototype status to becoming
>> a part of the project. I see these steps like this:
>> 
>> 1. Create a jira ticket.
>> 2. Update design document with examples.
>> 3. Do some research on backward compatibility with the Volcano
>> planner, to be able to replace the Volcano planner with the Cascades planner in
>> tests and to understand the current problems with the planner.
>> 4. Solve 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Julian Hyde
This thread has almost gotten too long to respond to. I confess I’ve not read 
much of it. I’m going to reply anyway. Sorry.

I support making Calcite’s optimizer support “Cascades”.

We should keep the existing VolcanoPlanner working during the transition, and 
perhaps longer. (I acknowledge that this will not be easy. But it will not be 
impossible, because we already have 2 planner engines, and going from 2 to 3 is 
much easier than going from 1 to 2.)

I perhaps have a different philosophy than Haisheng + Cascades on logical vs 
physical. I do believe that logical & physical rels and rules are more similar 
than they are different, therefore they should have the same classes, etc. 
Pragmatically, it makes a lot of sense to create rule sets and planner phases 
that deal with just logical rels, or just physical rels. But that’s a best 
practice, not a rule.

This philosophy manifested in a discussion a couple of weeks ago about whether 
RelBuilder should be able to create physical rels. I still believe that it 
should.

Julian


> On Apr 27, 2020, at 11:29 AM, Roman Kondakov  
> wrote:
> 
> Hi all,
> 
> Stamatis, Haisheng thank you very much for your feedback! I really
> appreciate it.
> 
>> If in the new planner we end up copy-pasting code then I guess it will be a 
>> bad idea.
> 
> Yes, there is some code duplication between the Volcano planner and the
> Cascades planner. I think I'll move it to some common superclass: either
> to the existing AbstractRelOptPlanner or a new AbstractCostBasedPlanner.
> 
>> Like that it will be easier for everybody to test and see if the changes 
>> make this better or worse. From a backward compatibility perspective it 
>> seems feasible to keep the new features configurable for a certain amount of 
>> time.
> 
> I was thinking about backward compatibility in this way: what if we could
> switch planners in tests by setting some flag (a system property?)
> somewhere and check what happens with the new planner if we replace Volcano
> with it. And if it ever turns out that the new planner passes all of
> Volcano's tests and at the same time works more efficiently, we can
> safely replace the Volcano planner with the Cascades planner.
> 
>> A design doc would definitely help, especially if it has a few end-to-end 
>> (from logical to physical plan) examples showing how the optimizer works
> at each step before/after the changes. This is actually what is usually
> missing in research papers that makes them hard to understand.
>> I am thinking of something similar to the examples that Haisheng sent in the 
>> first email but possibly a bit more detailed.
> 
> I agree, I'll add some examples to the design doc very soon.
> 
>> I looked very briefly in the PR by Roman but I think I didn't see tests 
>> where the final plan contains operators from multiple conventions. Multiple 
>> conventions are among the choices that complicate certain parts of the 
>> existing planner so we should make sure that we take this into account.
> 
> It's on my radar. I'm going to add these tests.
> 
> I would like to ask the community about my next steps for the transition
> of the Cascades planner from prototype status to becoming
> a part of the project. I see these steps like this:
> 
> 1. Create a jira ticket.
> 2. Update design document with examples.
> 3. Do some research on backward compatibility with the Volcano
> planner, to be able to replace the Volcano planner with the Cascades planner in
> tests and to understand the current problems with the planner.
> 4. Solve known problems
>  - materialized views
>  - hints
>  - multiple conventions
>  - listener hooks
>  - problems from p.3.
> 5. New PR, review and merge.
> 6. Replace the Volcano planner with Cascades after several releases.
> 
> What do you think about this roadmap?
> 
> 
> -- 
> Kind Regards
> Roman Kondakov
> 
> 
> On 27.04.2020 01:55, Stamatis Zampetakis wrote:
>> Hi all,
>> 
>> I am very excited about the ideas discussed so far and especially by the
>> enthusiasm of many people that are ready to help for pulling this out.
>> I wouldn't expect that we could have a prototype so quickly.
>> Thanks a lot everyone!
>> 
>> In the debate between creating new planner or patching the existing one, I
>> don't have a clear preference.
>> I think the answer depends on how many things we can reuse.
>> If in the new planner we end up copy-pasting code then I guess it will be a
>> bad idea.
>> On the other hand, if the new and old planner do not have many things in
>> common then I guess the answer is obvious.
>> 
>> From Haisheng's description, I was thinking that many of the proposed
>> changes could go in the existing planner.
>> Like that it will be easier for everybody to test and see if the changes
>> make this better or worse.
>> From a backward compatibility perspective it seems feasible to keep the new
>> features configurable for a certain amount of time.
>> 
>> From the wish-list, I think we should focus initially on points:
>> 1. Top-down trait request
>> 2. 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-27 Thread Roman Kondakov
Hi all,

Stamatis, Haisheng thank you very much for your feedback! I really
appreciate it.

> If in the new planner we end up copy-pasting code then I guess it will be a 
> bad idea.

Yes, there is some code duplication between the Volcano planner and the
Cascades planner. I think I'll move it to some common superclass: either
to the existing AbstractRelOptPlanner or a new AbstractCostBasedPlanner.

> Like that it will be easier for everybody to test and see if the changes make 
> this better or worse. From a backward compatibility perspective it seems 
> feasible to keep the new features configurable for a certain amount of time.

I was thinking about backward compatibility in this way: what if we could
switch planners in tests by setting some flag (a system property?)
somewhere and check what happens with the new planner if we replace Volcano
with it. And if it ever turns out that the new planner passes all of
Volcano's tests and at the same time works more efficiently, we can
safely replace the Volcano planner with the Cascades planner.
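
Concretely, I imagine something like this in the test fixtures (the property 
name is just an example):

    // Hypothetical test-time switch between the two planners.
    static RelOptPlanner createPlanner() {
      return Boolean.getBoolean("calcite.planner.cascades")
          ? new CascadesPlanner()   // prototype planner from the PR
          : new VolcanoPlanner();
    }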

> A design doc would definitely help, especially if it has a few end-to-end 
> (from logical to physical plan) examples showing how the optimizer works
at each step before/after the changes. This is actually what is usually
missing in research papers that makes them hard to understand.
> I am thinking of something similar to the examples that Haisheng sent in the 
> first email but possibly a bit more detailed.

I agree, I'll add some examples to the design doc very soon.

> I looked very briefly in the PR by Roman but I think I didn't see tests where 
> the final plan contains operators from multiple conventions. Multiple 
> conventions are among the choices that complicate certain parts of the existing 
> planner so we should make sure that we take this into account.

It's on my radar. I'm going to add these tests.

I would like to ask the community about my next steps for the transition
of the Cascades planner from prototype status to becoming
a part of the project. I see these steps like this:

1. Create a jira ticket.
2. Update design document with examples.
3. Do some research on backward compatibility with the Volcano
planner, to be able to replace the Volcano planner with the Cascades planner in
tests and to understand the current problems with the planner.
4. Solve known problems
  - materialized views
  - hints
  - multiple conventions
  - listener hooks
  - problems from p.3.
5. New PR, review and merge.
6. Replace the Volcano planner with Cascades after several releases.

What do you think about this roadmap?


-- 
Kind Regards
Roman Kondakov


On 27.04.2020 01:55, Stamatis Zampetakis wrote:
> Hi all,
> 
> I am very excited about the ideas discussed so far and especially by the
> enthusiasm of many people that are ready to help for pulling this out.
> I wouldn't expect that we could have a prototype so quickly.
> Thanks a lot everyone!
> 
> In the debate between creating new planner or patching the existing one, I
> don't have a clear preference.
> I think the answer depends on how many things we can reuse.
> If in the new planner we end up copy-pasting code then I guess it will be a
> bad idea.
> On the other hand, if the new and old planner do not have many things in
> common then I guess the answer is obvious.
> 
> From Haisheng's description, I was thinking that many of the proposed
> changes could go in the existing planner.
> Like that it will be easier for everybody to test and see if the changes
> make this better or worse.
> From a backward compatibility perspective it seems feasible to keep the new
> features configurable for a certain amount of time.
> 
> From the wish-list, I think we should focus initially on points:
> 1. Top-down trait request
> 2. Convert traits without Abstract converters
> 4. Bottom-up trait derivation
> 
> I know that 3 and 5 are also important, but I have the feeling they can
> wait a bit longer.
> 
> A design doc would definitely help, especially if it has a few end-to-end
> (from logical to physical plan) examples showing how the optimizer works at
> each step before/after the changes.
> This is actually what is usually missing in research papers that makes them
> hard to understand.
> I am thinking of something similar to the examples that Haisheng sent in the first
> email but possibly a bit more detailed.
> 
> I looked very briefly in the PR by Roman but I think I didn't see tests
> where the final plan contains operators from multiple conventions.
> Multiple conventions is among the choices that complicate certain parts of
> the existing planner so we should make sure that we take this into account.
> 
> Hoping to find some time to think over all this more quietly. Very
> interesting stuff :)
> 
> Best,
> Stamatis
> 
> On Sun, Apr 26, 2020 at 11:14 PM Haisheng Yuan  wrote:
> 
>> Hi Roman,
>>
>> Excellent! This is definitely a helpful contribution to the Calcite
>> community.
>> Thank you for your endeavors.
>>
>> Haisheng
>>
>> On 2020/04/26 19:25:00, Roman 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-26 Thread Stamatis Zampetakis
Hi all,

I am very excited about the ideas discussed so far and especially by the
enthusiasm of many people that are ready to help for pulling this out.
I wouldn't expect that we could have a prototype so quickly.
Thanks a lot everyone!

In the debate between creating new planner or patching the existing one, I
don't have a clear preference.
I think the answer depends on how many things we can reuse.
If in the new planner we end up copy-pasting code then I guess it will be a
bad idea.
On the other hand, if the new and old planner do not have many things in
common then I guess the answer is obvious.

From Haisheng's description, I was thinking that many of the proposed
changes could go in the existing planner.
Like that it will be easier for everybody to test and see if the changes
make this better or worse.
From a backward compatibility perspective it seems feasible to keep the new
features configurable for a certain amount of time.

From the wish-list, I think we should focus initially on points:
1. Top-down trait request
2. Convert traits without Abstract converters
4. Bottom-up trait derivation

I know that 3 and 5 are also important, but I have the feeling they can
wait a bit longer.

A design doc would definitely help, especially if it has a few end-to-end
(from logical to physical plan) examples showing how the optimizer works at
each step before/after the changes.
This is actually what is usually missing in research papers that makes them
hard to understand.
I am thinking of something similar to the examples that Haisheng sent in the first
email but possibly a bit more detailed.

I looked very briefly in the PR by Roman but I think I didn't see tests
where the final plan contains operators from multiple conventions.
Multiple conventions is among the choices that complicate certain parts of
the existing planner so we should make sure that we take this into account.

Hoping to find some time to think over all this more quietly. Very
interesting stuff :)

Best,
Stamatis

On Sun, Apr 26, 2020 at 11:14 PM Haisheng Yuan  wrote:

> Hi Roman,
>
> Excellent! This is definitely a helpful contribution to the Calcite
> community.
> Thank you for your endeavors.
>
> Haisheng
>
> On 2020/04/26 19:25:00, Roman Kondakov 
> wrote:
> > Hi everyone!
> >
> > Haisheng, thank you for bringing this subject up. A new Cascades-style
> > optimizer should definitely be the next step for Apache Calcite. Many
> > projects suffer from the lack of this kind of optimizer.
> >
> > That was the reason why several weeks ago I started working on the
> > prototype of Cascades optimizer for Apache Calcite. I was not sure that
> > I would build something useful without much experience in this area. But
> > now I'd like to share my work with the community. You can find the
> > Cascades prototype in PR [1]. This prototype is based on the well-known
> > paper [2].
> >
> > What prototype can do:
> > - Top-down trait request
> > - Convert traits without Abstract converters
> > - Top-down rule apply
> > - Bottom-up trait derivation
> >
> > What is not supported yet:
> > - Search space pruning (but I'm going to fix it in the next commits)
> > - Materialized views
> > - Planner hooks
> > - Hints
> > - Backward compatibility with Volcano planner (some research needed)
> >
> > I prepared a design doc for this planner [3], you can find many details
> > there. I also opened it for comments.
> >
> > I've written several basic test cases in
> > org.apache.calcite.plan.cascades.CascadesPlannerTest including that was
> > discussed in this thread previously in the context of trait requests:
> >
> > >  MergeJoin hash[a]
> > >   | TableScan R hash[a] (RelSubset)
> > >   + TableScan S hash[a] (RelSubset)
> >
> >
> > Haisheng, this is very similar to what you propose. Answering your
> question:
> >
> > > There are 2 ways:
> > > a) Modify the current VolcanoPlanner.
> > >   Pros: code reuse, existing numerous test cases and infrastructure,
> fast integration
> > >   Cons: changing code always brings risk
> > >
> > > b) Add a new Planner
> > >   Pros: no risk, no tech debt, no need to worry about backward
> compatibility
> > >   Cons: separate test cases for new planner, one more planner to
> maintain
> >
> > I've chosen the second approach, because I don't have a clear
> > understanding of how to fix the Volcano planner gradually.
> >
> > This new planner is very far from perfect, but I think it can be a good
> > starting point for community.
> >
> > Please, share your thoughts about this planner.
> >
> >
> > [1] PR: https://github.com/apache/calcite/pull/1948
> > [2] Paper:
> >
> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
> > [3] Design doc:
> >
> https://docs.google.com/document/d/1qaV3eSKTw4gfLuBR3XB_LVIc51hxIng9k1J4VD6qnRg/edit?usp=sharing
> >
> >
> >
> > --
> > Kind Regards
> > Roman Kondakov
> >
> >
> > On 22.04.2020 09:52, Danny Chan wrote:
> > >> Is there any recommended approach to make 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-26 Thread Haisheng Yuan
Hi Roman,

Excellent! This is definitely a helpful contribution to the Calcite community.
Thank you for your endeavors.

Haisheng

On 2020/04/26 19:25:00, Roman Kondakov  wrote: 
> Hi everyone!
> 
> Haisheng, thank you for bringing this subject up. A new Cascades-style
> optimizer should definitely be the next step for Apache Calcite. Many
> projects suffer from the lack of this kind of optimizer.
> 
> That was the reason why several weeks ago I started working on the
> prototype of Cascades optimizer for Apache Calcite. I was not sure that
> I would build something useful without much experience in this area. But
> now I'd like to share my work with the community. You can find the
> Cascades prototype in PR [1]. This prototype is based on the well-known
> paper [2].
> 
> What prototype can do:
> - Top-down trait request
> - Convert traits without Abstract converters
> - Top-down rule apply
> - Bottom-up trait derivation
> 
> What is not supported yet:
> - Search space pruning (but I'm going to fix it in the next commits)
> - Materialized views
> - Planner hooks
> - Hints
> - Backward compatibility with Volcano planner (some research needed)
> 
> I prepared a design doc for this planner [3], you can find many details
> there. I also opened it for comments.
> 
> I've written several basic test cases in
> org.apache.calcite.plan.cascades.CascadesPlannerTest including that was
> discussed in this thread previously in the context of trait requests:
> 
> >  MergeJoin hash[a] 
> >   | TableScan R hash[a] (RelSubset)
> >   + TableScan S hash[a] (RelSubset)
> 
> 
> Haisheng, this is very similar to what you propose. Answering your question:
> 
> > There are 2 ways:
> > a) Modify the current VolcanoPlanner.
> >   Pros: code reuse, existing numerous test cases and infrastructure, fast 
> > integration
> >   Cons: changing code always brings risk
> > 
> > b) Add a new Planner
> >   Pros: no risk, no tech debt, no need to worry about backward compatibility
> >   Cons: separate test cases for new planner, one more planner to maintain
> 
> I've chosen the second approach, because I don't have a clear
> understanding of how to fix the Volcano planner gradually.
> 
> This new planner is very far from perfect, but I think it can be a good
> starting point for community.
> 
> Please, share your thoughts about this planner.
> 
> 
> [1] PR: https://github.com/apache/calcite/pull/1948
> [2] Paper:
> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
> [3] Design doc:
> https://docs.google.com/document/d/1qaV3eSKTw4gfLuBR3XB_LVIc51hxIng9k1J4VD6qnRg/edit?usp=sharing
> 
> 
> 
> -- 
> Kind Regards
> Roman Kondakov
> 
> 
> On 22.04.2020 09:52, Danny Chan wrote:
> >> Is there any recommended approach to make that happen smoothly besides
> coding and testing work? We need to be aware that the new planner might
> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> > VolcanoPlanner.
> > 
> > If that is true, I might say the new planner probably does not have that
> > good a design; we should see in advance the cases/reasons for which users
> > would keep the old VolcanoPlanner, and we *must* give a solution for
> > those problems in the new design.
> > 
> > I was expecting that migrating to a new planner would take at least 1 year
> > of development. If that is true, modifying directly based on the current
> > planner means that for the next 3~4 Calcite versions, each release would bring
> > in huge plan changes/bugs, which I believe all the users of
> > Calcite don't want to see. And no one can guarantee that modifying directly
> > can keep good stability and compatibility; only the test set can.
> > 
> From the experience of the Alibaba Blink planner, which was contributed to
> Apache Flink, yes, the old/new planners would co-exist at least for 2 years,
> for the reason that the new and old planners have different abilities in some
> corner cases.
> > 
> > From my point of view, we should at least:
> > - Give a convincing test set for the new planner that makes us believe the
> > new planner is stable and powerful enough. I mean obviously the current
> > rule tests are far from enough to support the new planner
> > - We should give a more detailed design doc about the new planner,
> > especially about the interface changes and any change that would bring in
> > compatibility problems. Then we can make a more accurate decision about how much
> > work the new planner would bring in, and then decide whether switching to
> > pure new-planner development is a good idea or whether to modify the existing one.
> > 
> > 
> > Haisheng Yuan wrote on Wed, Apr 22, 2020 at 9:45 AM:
> > 
> >> Hi Andrii,
> >>
> >>> Obviously, from what is written here, I could guess that this would
> >> require me to change my physical planning rules, even if only by
> >> implementing a marker interface.
> >> You don't need to change your physical rules, it will be treated as equal
> >> 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-26 Thread Roman Kondakov
Hi everyone!

Haisheng, thank you for bringing this subject up. A new Cascades-style
optimizer should definitely be the next step for Apache Calcite. Many
projects suffer from the lack of this kind of optimizer.

That was the reason why several weeks ago I started working on the
prototype of Cascades optimizer for Apache Calcite. I was not sure that
I would build something useful without much experience in this area. But
now I'd like to share my work with the community. You can find the
Cascades prototype in PR [1]. This prototype is based on the well-known
paper [2].

What prototype can do:
- Top-down trait request
- Convert traits without Abstract converters
- Top-down rule apply
- Bottom-up trait derivation

What is not supported yet:
- Search space pruning (but I'm going to fix it in the next commits)
- Materialized views
- Planner hooks
- Hints
- Backward compatibility with Volcano planner (some research needed)

I prepared a design doc for this planner [3], you can find many details
there. I also opened it for comments.

I've written several basic test cases in
org.apache.calcite.plan.cascades.CascadesPlannerTest including that was
discussed in this thread previously in the context of trait requests:

>  MergeJoin hash[a] 
>   | TableScan R hash[a] (RelSubset)
>   + TableScan S hash[a] (RelSubset)
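
In code, the top-down trait request for that plan could look roughly like this 
(a hedged sketch with invented method names, not the prototype's actual API):

    // The join asks each input subset for hash distribution on the join
    // key instead of relying on converters being inserted afterwards.
    RelNode requestTraits(Join join) {
      RelTraitSet leftReq = join.getLeft().getTraitSet()
          .plus(RelDistributions.hash(ImmutableList.of(0)));   // hash[a]
      RelTraitSet rightReq = join.getRight().getTraitSet()
          .plus(RelDistributions.hash(ImmutableList.of(0)));
      return join.copy(join.getTraitSet(), ImmutableList.of(
          convert(join.getLeft(), leftReq),
          convert(join.getRight(), rightReq)));
    }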


Haisheng, this is very similar to what you propose. Answering your question:

> There are 2 ways:
> a) Modify the current VolcanoPlanner.
>   Pros: code reuse, existing numerous test cases and infrastructure, fast 
> integration
>   Cons: changing code always brings risk
> 
> b) Add a new Planner
>   Pros: no risk, no tech debt, no need to worry about backward compatibility
>   Cons: separate test cases for new planner, one more planner to maintain

I've chosen the second approach, because I don't have a clear
understanding of how to fix the Volcano planner gradually.

This new planner is very far from perfect, but I think it can be a good
starting point for community.

Please, share your thoughts about this planner.


[1] PR: https://github.com/apache/calcite/pull/1948
[2] Paper:
https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
[3] Design doc:
https://docs.google.com/document/d/1qaV3eSKTw4gfLuBR3XB_LVIc51hxIng9k1J4VD6qnRg/edit?usp=sharing



-- 
Kind Regards
Roman Kondakov


On 22.04.2020 09:52, Danny Chan wrote:
>> Is there any recommended approach to make that happen smoothly besides
> coding and testing work? We need to be aware that the new planner might
> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> VolcanoPlanner.
> 
> If that is true, I might say the new planner probably does not have that
> good a design; we should see in advance the cases/reasons for which users
> would keep the old VolcanoPlanner, and we *must* give a solution for
> those problems in the new design.
> 
> I was expecting that migrating to a new planner would take at least 1 year
> of development. If that is true, modifying directly based on the current
> planner means that for the next 3~4 Calcite versions, each release would bring
> in huge plan changes/bugs, which I believe all the users of
> Calcite don't want to see. And no one can guarantee that modifying directly
> can keep good stability and compatibility; only the test set can.
> 
> From the experience of the Alibaba Blink planner, which was contributed to
> Apache Flink, yes, the old/new planners would co-exist at least for 2 years,
> for the reason that the new and old planners have different abilities in some
> corner cases.
> 
> From my point of view, we should at least:
> - Give a convincing test set for the new planner that makes us believe the
> new planner is stable and powerful enough. I mean obviously the current
> rule tests are far from enough to support the new planner
> - We should give a more detailed design doc about the new planner,
> especially about the interface changes and any change that would bring in
> compatibility problems. Then we can make a more accurate decision about how much
> work the new planner would bring in, and then decide whether switching to
> pure new-planner development is a good idea or whether to modify the existing one.
> 
> 
> Haisheng Yuan wrote on Wed, Apr 22, 2020 at 9:45 AM:
> 
>> Hi Andrii,
>>
>>> Obviously, from what is written here, I could guess that this would
>> require me to change my physical planning rules, even if only by
>> implementing a marker interface.
> You don't need to change your physical rules; they will be treated the same
> as logical rules and be applied together with the real logical rules, with no
> more logical/physical rule difference. This is also how the current
> VolcanoPlanner works.
>>
>>> I don't want you to think that I somehow resent the changes you are
>> pushing.
> Don't get me wrong. I am seriously thinking of reverting these changes; since
> most people like the idea of adding a new planner, why don't we make all the

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-22 Thread Danny Chan
> Is there any recommended approach to make that happen smoothly besides
coding and testing work? We need to be aware that the new planner might
co-exist with VolcanoPlanner for 5 or more years, or even never replace
VolcanoPlanner.

If that is true, I might say the new planner probably does not have that
good a design; we should see in advance the cases/reasons for which users
would keep the old VolcanoPlanner, and we *must* give a solution for
those problems in the new design.

I was expecting that migrating to a new planner would take at least 1 year
of development. If that is true, modifying directly based on the current
planner means that for the next 3~4 Calcite versions, each release would bring
in huge plan changes/bugs, which I believe all the users of
Calcite don't want to see. And no one can guarantee that modifying directly
can keep good stability and compatibility; only the test set can.

From the experience of the Alibaba Blink planner, which was contributed to
Apache Flink, yes, the old/new planners would co-exist at least for 2 years,
for the reason that the new and old planners have different abilities in some
corner cases.

From my point of view, we should at least:
- Give a convincing test set for the new planner that makes us believe the
new planner is stable and powerful enough. I mean obviously the current
rule tests are far from enough to support the new planner
- We should give a more detailed design doc about the new planner,
especially about the interface changes and any change that would bring in
compatibility problems. Then we can make a more accurate decision about how much
work the new planner would bring in, and then decide whether switching to
pure new-planner development is a good idea or whether to modify the existing one.


Haisheng Yuan wrote on Wed, Apr 22, 2020 at 9:45 AM:

> Hi Andrii,
>
> > Obviously, from what is written here, I could guess that this would
> require me to change my physical planning rules, even if only by
> implementing a marker interface.
> You don't need to change your physical rules; they will be treated the same
> as logical rules and be applied together with the real logical rules, with no
> more logical/physical rule difference. This is also how the current
> VolcanoPlanner works.
>
> > I don't want you to think that I somehow resent the changes you are
> pushing.
> Don't get me wrong. I am seriously thinking of reverting these changes; since
> most people like the idea of adding a new planner, why don't we make all the
> plan changes in the new planner, instead of forcing people to change test
> cases for code changes that they might not need in VolcanoPlanner
> during an upgrade.
>
> I didn't intend to replace VolcanoPlanner, but just to change the search
> strategy and add a trait derivation mechanism, because most of the code in
> VolcanoPlanner can be reused. But since many agree to add a new planner with
> replacing VolcanoPlanner as the final goal, I won't be against most people's
> decision.
>
> Is there any recommended approach to make that happen smoothly besides
> coding and testing work? We need to be aware that the new planner might
> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> VolcanoPlanner.
>
> More thoughts are welcome.
>
> Haisheng
>
> On 2020/04/21 19:56:25, Андрей Цвелодуб  wrote:
> > Hello Haisheng,
> >
> > > To keep backward compatibility, all the un-marked rules will be treated
> > as logical rules, except rules that use AbstractConverter as a rule
> operand;
> > these rules still need to be applied top-down, or in random order.
> > Obviously, from what is written here, I could guess that this would
> require
> > me to change my physical planning rules, even if only by implementing a
> > marker interface. I am not saying this is a bad thing, but this is a
> thing
> > that should be communicated and planned ahead in case the VolcanoPlanner
> is
> > modified.
> >
> > > Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> > because they will cause another tons of plan changes.
> > I see you are still bitter due to all the discussions on this list
> lately,
> > I'm sorry. I don't want you to think that I somehow resent the changes
> you
> > are pushing, au contraire I support them and would be happy to help if I
> > can. I just want the process of these changes to be executed in the best
> > possible way.
> > As I see there are already several opinions in this thread that basically
> > align with what I am saying, so I guess I am not the crazy guy running
> > around and yelling "the end is nigh!".
> >
> > Thank you for taking these mumbled thoughts into account.
> >
> > Bestest Regards,
> > Andrii Tsvielodub
> >
> > On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan  wrote:
> >
> > > Hi Andrii,
> > >
> > > > I guess changing the planner would lead to changes in tons of rules
> and
> > > even more tests.
> > > Obviously you didn't read through my email. You are not required to do
> any
> > > changes to your rule if you 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Haisheng Yuan
Hi Andrii,

> Obviously, from what is written here, I could guess that this would require 
> me to change my physical planning rules, even if only by implementing a 
> marker interface.
You don't need to change your physical rules; they will be treated the same as 
logical rules and be applied together with the real logical rules, with no more 
logical/physical rule difference. This is also how the current VolcanoPlanner 
works.
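
For reference, the marking I proposed is as small as this (names illustrative):

    // An empty marker interface; un-marked rules keep today's behavior.
    public interface PhysicalRule {}

    // Opting a rule in is a one-line change:
    public class FooJoinImplRule extends RelOptRule implements PhysicalRule {
      // ... rule body unchanged
    }

    // Planner side: schedule marked rules in the implementation phase.
    boolean isPhysical = rule instanceof PhysicalRule;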

> I don't want you to think that I somehow resent the changes you are pushing.
Don't get me wrong. I am seriously thinking of reverting these changes; since most 
people like the idea of adding a new planner, why don't we make all the plan 
changes in the new planner, instead of forcing people to change test cases for 
code changes that they might not need in VolcanoPlanner during an upgrade.

I didn't intend to replace VolcanoPlanner, but just to change the search 
strategy and add a trait derivation mechanism, because most of the code in 
VolcanoPlanner can be reused. But since many agree to add a new planner with 
replacing VolcanoPlanner as the final goal, I won't be against most people's 
decision.

Is there any recommended approach to make that happen smoothly besides coding 
and testing work? We need to be aware that the new planner might co-exist 
with VolcanoPlanner for 5 or more years, or even never replace VolcanoPlanner.

More thoughts are welcome.

Haisheng

On 2020/04/21 19:56:25, Андрей Цвелодуб  wrote: 
> Hello Haisheng,
> 
> > To keep backward compatibility, all the un-marked rules will be treated
> as logical rules, except rules that use AbstractConverter as a rule operand;
> these rules still need to be applied top-down, or in random order.
> Obviously, from what is written here, I could guess that this would require
> me to change my physical planning rules, even if only by implementing a
> marker interface. I am not saying this is a bad thing, but this is a thing
> that should be communicated and planned ahead in case the VolcanoPlanner is
> modified.
> 
> > Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> because they will cause another tons of plan changes.
> I see you are still bitter due to all the discussions on this list lately,
> I'm sorry. I don't want you to think that I somehow resent the changes you
> are pushing, au contraire I support them and would be happy to help if I
> can. I just want the process of these changes to be executed in the best
> possible way.
> As I see there are already several opinions in this thread that basically
> align with what I am saying, so I guess I am not the crazy guy running
> around and yelling "the end is nigh!".
> 
> Thank you for taking these mumbled thoughts into account.
> 
> Bestest Regards,
> Andrii Tsvielodub
> 
> On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan  wrote:
> 
> > Hi Andrii,
> >
> > > I guess changing the planner would lead to changes in tons of rules and
> > even more tests.
> > Obviously you didn't read through my email. You are not required to make any
> > changes to your rules if you don't want to, but if you do, you just need to mark
> > the rule to tell the planner whether it is a physical rule or not, simply by
> > implementing an empty interface.
> >
> > > many on this list already experienced problems with upgrading even
> > between the minor versions of Calcite.
> > Sorry to see the problem you have experienced when upgrading Calcite.
> > Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> > because they will cause another tons of plan changes.
> >
> > But I will see if I can add a setting to use the old search strategy,
> > which can be left untouched.
> >
> > Haisheng
> >
> > On 2020/04/21 06:33:08, Андрей Цвелодуб  wrote:
> > > Hello everyone,
> > >
> > > First of all, thanks for this great effort of improving the core parts of
> > > the framework we all are using,
> > > I believe this is long overdue and hope this will have benefits both for
> > > the maintainers and users of the library.
> > >
> > > I don't have anything to say about the general idea at the moment,
> > > but I want to make a point that maintaining the old implementation of
> > > VolcanoPlanner during
> > > the initial stages of implementing the new planner is absolutely
> > CRITICAL.
> > > As a lot of users of Calcite do various customizations to the engine, to
> > > the rules
> > > and all that is there in between, I believe changing the implementation
> > of
> > > the core component
> > > would have a huge impact on most users of the library. I think many on
> > this
> > > list
> > > already experienced problems with upgrading even between the minor
> > versions
> > > of Calcite,
> > > so I guess changing the planner would lead to changes in tons of rules
> > and
> > > even more tests.
> > >
> > > I don't have anything against replacing VolcanoPlanner as a final goal of
> > > this effort,
> > > but I don't think that modifying it directly and merging it to master is
> > a
> > > viable development approach.

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Андрей Цвелодуб
Hello Haisheng,

> To keep backward compatibility, all the un-marked rules will be treated
as logical rules, except rules that uses AbstractConverter as rule operand,
these rules still need to applied top-down, or random order.
Obviously, from what is written here, I gather that this would require
me to change my physical planning rules, even if only by implementing a
marker interface. I am not saying this is a bad thing, but it is a thing
that should be communicated and planned ahead of time if the VolcanoPlanner
is modified.

> Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
because they will cause another tons of plan changes.
I see you are still bitter about all the discussions on this list lately;
I'm sorry. I don't want you to think that I somehow resent the changes you
are pushing; au contraire, I support them and would be happy to help if I
can. I just want the process of these changes to be executed in the best
possible way.
As I see it, there are already several opinions in this thread that basically
align with what I am saying, so I guess I am not the crazy guy running
around yelling "the end is nigh!".

Thank you for taking these mumbled thoughts into account.

Bestest Regards,
Andrii Tsvielodub

On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan  wrote:

> Hi Andrii,
>
> > I guess changing the planner would lead to changes in tons of rules and
> even more tests.
> Obviously you didn't read through my email. You are not required to do any
> changes to your rule if you don't want to, but if you do, just need to mark
> the rule to tell planner whether it is a physical rule or not, simply by
> implementing an empty interface.
>
> > many on this list already experienced problems with upgrading even
> between the minor versions of Calcite.
> Sorry to see the problem you have experienced when upgrading Calcite.
> Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> because they will cause another tons of plan changes.
>
> But I will see if I can add a setting to use the old search strategy,
> which can be left untouched.
>
> Haisheng
>
> On 2020/04/21 06:33:08, Андрей Цвелодуб  wrote:
> > Hello everyone,
> >
> > First of all, thanks for this great effort of improving the core parts of
> > the framework we all are using,
> > I believe this is long overdue and hope this will have benefits both for
> > the maintainers and users of the library.
> >
> > I don't have anything to say about the general idea at the moment,
> > but I want to make a point that maintaining the old implementation of
> > VolcanoPlanner during
> > the initial stages of implementing the new planner is absolutely
> CRITICAL.
> > As a lot of users of Calcite do various customizations to the engine, to
> > the rules
> > and all that is there in between, I believe changing the implementation
> of
> > the core component
> > would have a huge impact on most users of the library. I think many on
> this
> > list
> > already experienced problems with upgrading even between the minor
> versions
> > of Calcite,
> > so I guess changing the planner would lead to changes in tons of rules
> and
> > even more tests.
> >
> > I don't have anything against replacing VolcanoPlanner as a final goal of
> > this effort,
> > but I don't think that modifying it directly and merging it to master is
> a
> > viable development approach.
> > While I understand how burdensome it is to maintain several parallel core
> > components at once
> > (we did this while moving the engine of our product to Calcite), we
> should
> > still respect those who depend
> > on it and not introduce the risks related to the development of a new
> > component into existing processing flows.
> >
> > A good model to try to follow would be the way new Garbage Collectors are
> > introduced in Java.
> > First, add it as an experimental option, then make it generally
> available,
> > then after everyone agrees
> > this is the best option - make it the default one.
> > With this approach, everyone can then move to the new planner at their
> own
> > pace, guaranteeing a smooth transition overall.
> > Yes, this could take some time, maybe even a year, but this is the price
> of
> > doing major changes in a popular framework.
> >
> > Again, thank you for initiating this discussion and leading this effort.
> >
> > Best Regards,
> > Andrii Tsvielodub
> >
> > On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu  wrote:
> >
> > > Hi, Xiening.
> > >
> > > Regarding calculating the logical cost, here are some ways I though:
> > > 1. Logical rel may implement their own computeSelfCost method. Some
> > > rels can provide such information, for example the
> > > LogicalProject/LogicalFilter contains nearly the same information as
> their
> > > physical implementations. If we don't have enough confidence, just
> return
> > > zeroCost is also OK, as it only affects pruning.
> > > 2. Logical rel  tells its parents what its physical input could be
> after
> > > implementation. 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Haisheng Yuan
Hi Andrii,

> I guess changing the planner would lead to changes in tons of rules and even 
> more tests.
Obviously you didn't read through my email. You are not required to make any 
changes to your rules if you don't want to; if you do, you just need to mark the 
rule to tell the planner whether it is a physical rule or not, simply by 
implementing an empty interface.

> many on this list already experienced problems with upgrading even between 
> the minor versions of Calcite.
Sorry to see the problems you have experienced when upgrading Calcite. It looks 
like I have to revert the changes in CALCITE-2970 and CALCITE-3753, because they 
will cause another ton of plan changes.

But I will see if I can add a setting to use the old search strategy, which can 
be left untouched.
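
Roughly like this; a sketch only, and the flag name below is made up, not an 
existing setting:

class PlannerConfig {
  // false = legacy rule-queue firing, left untouched; true = top-down search.
  boolean topDownOpt = false;
}

class SketchPlanner {
  private final PlannerConfig config;

  SketchPlanner(PlannerConfig config) {
    this.config = config;
  }

  void findBestExp() {
    if (config.topDownOpt) {
      optimizeTopDown();   // new search strategy with trait derivation
    } else {
      optimizeLegacy();    // old search strategy, left as it is today
    }
  }

  private void optimizeTopDown() { /* ... */ }
  private void optimizeLegacy() { /* ... */ }
}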

Haisheng

On 2020/04/21 06:33:08, Андрей Цвелодуб  wrote: 
> Hello everyone,
> 
> First of all, thanks for this great effort of improving the core parts of
> the framework we all are using,
> I believe this is long overdue and hope this will have benefits both for
> the maintainers and users of the library.
> 
> I don't have anything to say about the general idea at the moment,
> but I want to make a point that maintaining the old implementation of
> VolcanoPlanner during
> the initial stages of implementing the new planner is absolutely CRITICAL.
> As a lot of users of Calcite do various customizations to the engine, to
> the rules
> and all that is there in between, I believe changing the implementation of
> the core component
> would have a huge impact on most users of the library. I think many on this
> list
> already experienced problems with upgrading even between the minor versions
> of Calcite,
> so I guess changing the planner would lead to changes in tons of rules and
> even more tests.
> 
> I don't have anything against replacing VolcanoPlanner as a final goal of
> this effort,
> but I don't think that modifying it directly and merging it to master is a
> viable development approach.
> While I understand how burdensome it is to maintain several parallel core
> components at once
> (we did this while moving the engine of our product to Calcite), we should
> still respect those who depend
> on it and not introduce the risks related to the development of a new
> component into existing processing flows.
> 
> A good model to try to follow would be the way new Garbage Collectors are
> introduced in Java.
> First, add it as an experimental option, then make it generally available,
> then after everyone agrees
> this is the best option - make it the default one.
> With this approach, everyone can then move to the new planner at their own
> pace, guaranteeing a smooth transition overall.
> Yes, this could take some time, maybe even a year, but this is the price of
> doing major changes in a popular framework.
> 
> Again, thank you for initiating this discussion and leading this effort.
> 
> Best Regards,
> Andrii Tsvielodub
> 
> On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu  wrote:
> 
> > Hi, Xiening.
> >
> > Regarding calculating the logical cost, here are some ways I though:
> > 1. Logical rel may implement their own computeSelfCost method. Some
> > rels can provide such information, for example the
> > LogicalProject/LogicalFilter contains nearly the same information as their
> > physical implementations. If we don't have enough confidence, just return
> > zeroCost is also OK, as it only affects pruning.
> > 2. Logical rel  tells its parents what its physical input could be after
> > implementation. Then the problem come back to calculating lower bound of a
> > physical rel.
> > There should always be ways. The only problem is how to find a pretty one.
> >
> > Regarding the risk, new planner do have different risk. It is not because
> > new planner could stop us doing something wrong but we can decide when to
> > use the new one. Some scenarios:
> > 1. If modifying the VolcanoPlanner directly, the only way user could
> > control the risk is not to upgrade calcite version until it is considered
> > stable. You know, it is quite different from keeping calcite updated and
> > switching to the new planner at a proper time.
> > 2. It is very importance for SLA control. For the important business and
> > jobs, we may keep using the old and stable planner. And use the new one
> > only for jobs that have fault tolerance. And this helps testing new planner
> > with actual scenarios.
> > 3. It is helpful when upgrading online services. When the new planner
> > happened to have some bugs, we can switch to the old planner directly
> > without rollback the whole service.
> > 4. With all these ways to prevent issues becoming disasters, we are not
> > vulnerable to making mistakes. This not only enables faster iterations but
> > also let us have enough time to resolve big bugs, like considering it in
> > detail and applying a time-consuming refactoring for it. To work around a
> > critical bug using tricky ways usually introduces more issues.
> >
> > Thanks,

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Chunwei Lei
Haisheng and Xiening, thanks for sharing these wonderful ideas. I believe this
will be a huge improvement
and will definitely benefit all users.

From my experience of upgrading the Calcite version, there are always some
changes in the new version that may lead to unexpected behavior, due to a lack
of integration tests with other systems, even when it is a minor version.
Currently, the VolcanoPlanner has proved to be production-ready, and it has to
stay stable to make sure that other systems built on it do not have much
trouble after moving to the new Calcite version.

So from my side, I hope we can provide an option to use the new planner, at
least considering how huge the changes are.


Best,
Chunwei


On Tue, Apr 21, 2020 at 2:42 PM Андрей Цвелодуб 
wrote:

> Hello everyone,
>
> First of all, thanks for this great effort of improving the core parts of
> the framework we all are using,
> I believe this is long overdue and hope this will have benefits both for
> the maintainers and users of the library.
>
> I don't have anything to say about the general idea at the moment,
> but I want to make a point that maintaining the old implementation of
> VolcanoPlanner during
> the initial stages of implementing the new planner is absolutely CRITICAL.
> As a lot of users of Calcite do various customizations to the engine, to
> the rules
> and all that is there in between, I believe changing the implementation of
> the core component
> would have a huge impact on most users of the library. I think many on this
> list
> already experienced problems with upgrading even between the minor versions
> of Calcite,
> so I guess changing the planner would lead to changes in tons of rules and
> even more tests.
>
> I don't have anything against replacing VolcanoPlanner as a final goal of
> this effort,
> but I don't think that modifying it directly and merging it to master is a
> viable development approach.
> While I understand how burdensome it is to maintain several parallel core
> components at once
> (we did this while moving the engine of our product to Calcite), we should
> still respect those who depend
> on it and not introduce the risks related to the development of a new
> component into existing processing flows.
>
> A good model to try to follow would be the way new Garbage Collectors are
> introduced in Java.
> First, add it as an experimental option, then make it generally available,
> then after everyone agrees
> this is the best option - make it the default one.
> With this approach, everyone can then move to the new planner at their own
> pace, guaranteeing a smooth transition overall.
> Yes, this could take some time, maybe even a year, but this is the price of
> doing major changes in a popular framework.
>
> Again, thank you for initiating this discussion and leading this effort.
>
> Best Regards,
> Andrii Tsvielodub
>
> On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu  wrote:
>
> > Hi, Xiening.
> >
> > Regarding calculating the logical cost, here are some ways I though:
> > 1. Logical rel may implement their own computeSelfCost method. Some
> > rels can provide such information, for example the
> > LogicalProject/LogicalFilter contains nearly the same information as
> their
> > physical implementations. If we don't have enough confidence, just return
> > zeroCost is also OK, as it only affects pruning.
> > 2. Logical rel  tells its parents what its physical input could be after
> > implementation. Then the problem come back to calculating lower bound of
> a
> > physical rel.
> > There should always be ways. The only problem is how to find a pretty
> one.
> >
> > Regarding the risk, new planner do have different risk. It is not because
> > new planner could stop us doing something wrong but we can decide when to
> > use the new one. Some scenarios:
> > 1. If modifying the VolcanoPlanner directly, the only way user could
> > control the risk is not to upgrade calcite version until it is considered
> > stable. You know, it is quite different from keeping calcite updated and
> > switching to the new planner at a proper time.
> > 2. It is very importance for SLA control. For the important business and
> > jobs, we may keep using the old and stable planner. And use the new one
> > only for jobs that have fault tolerance. And this helps testing new
> planner
> > with actual scenarios.
> > 3. It is helpful when upgrading online services. When the new planner
> > happened to have some bugs, we can switch to the old planner directly
> > without rollback the whole service.
> > 4. With all these ways to prevent issues becoming disasters, we are not
> > vulnerable to making mistakes. This not only enables faster iterations
> but
> > also let us have enough time to resolve big bugs, like considering it in
> > detail and applying a time-consuming refactoring for it. To work around a
> > critical bug using tricky ways usually introduces more issues.
> >
> > Thanks,
> > Jinpeng
> >
> > On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai  wrote:
> 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-21 Thread Андрей Цвелодуб
Hello everyone,

First of all, thanks for this great effort to improve the core parts of
the framework we are all using.
I believe this is long overdue and hope it will benefit both
the maintainers and the users of the library.

I don't have anything to say about the general idea at the moment,
but I want to make the point that maintaining the old implementation of
VolcanoPlanner during the initial stages of implementing the new planner
is absolutely CRITICAL.
As a lot of Calcite users apply various customizations to the engine, to
the rules, and to everything in between, I believe changing the
implementation of the core component would have a huge impact on most
users of the library. I think many on this list have already experienced
problems upgrading even between minor versions of Calcite, so I guess
changing the planner would lead to changes in tons of rules and even
more tests.

I don't have anything against replacing VolcanoPlanner as a final goal of
this effort,
but I don't think that modifying it directly and merging it into master is
a viable development approach.
While I understand how burdensome it is to maintain several parallel core
components at once
(we did this while moving the engine of our product to Calcite), we should
still respect those who depend
on it and not introduce the risks related to the development of a new
component into existing processing flows.

A good model to follow would be the way new garbage collectors are
introduced in Java: first add it as an experimental option, then make it
generally available, and then, after everyone agrees it is the best
option, make it the default one.
With this approach, everyone can move to the new planner at their own
pace, guaranteeing a smooth transition overall.
Yes, this could take some time, maybe even a year, but that is the price
of making major changes in a popular framework.

Again, thank you for initiating this discussion and leading this effort.

Best Regards,
Andrii Tsvielodub

On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu  wrote:

> Hi, Xiening.
>
> Regarding calculating the logical cost, here are some ways I though:
> 1. Logical rel may implement their own computeSelfCost method. Some
> rels can provide such information, for example the
> LogicalProject/LogicalFilter contains nearly the same information as their
> physical implementations. If we don't have enough confidence, just return
> zeroCost is also OK, as it only affects pruning.
> 2. Logical rel  tells its parents what its physical input could be after
> implementation. Then the problem come back to calculating lower bound of a
> physical rel.
> There should always be ways. The only problem is how to find a pretty one.
>
> Regarding the risk, new planner do have different risk. It is not because
> new planner could stop us doing something wrong but we can decide when to
> use the new one. Some scenarios:
> 1. If modifying the VolcanoPlanner directly, the only way user could
> control the risk is not to upgrade calcite version until it is considered
> stable. You know, it is quite different from keeping calcite updated and
> switching to the new planner at a proper time.
> 2. It is very importance for SLA control. For the important business and
> jobs, we may keep using the old and stable planner. And use the new one
> only for jobs that have fault tolerance. And this helps testing new planner
> with actual scenarios.
> 3. It is helpful when upgrading online services. When the new planner
> happened to have some bugs, we can switch to the old planner directly
> without rollback the whole service.
> 4. With all these ways to prevent issues becoming disasters, we are not
> vulnerable to making mistakes. This not only enables faster iterations but
> also let us have enough time to resolve big bugs, like considering it in
> detail and applying a time-consuming refactoring for it. To work around a
> critical bug using tricky ways usually introduces more issues.
>
> Thanks,
> Jinpeng
>
> On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai  wrote:
>
> > Hi Jinpeng,
> >
> > Regarding this comment - I believe there are ways to calculate the
> logical
> > cost, but I think it’s not that simple as  "cardinality *
> unit_copy_cost.”,
> > would  you provide more details of other different ways? Just the
> algorithm
> > description or pseudo code would help us understand. Thanks.
> >
> > Regarding the approach of creating new planner, I don’t think a new
> > planner would lower the risk. We don’t know what we don’t know. If we
> > introduced an issue while modifying the planner, most likely we would do
> > the same with new planner class. A new planner doesn’t necessarily
> prevent
> > the issue from happening, but just delay its surfacing, which is worse
> IMO.
> >
> > There’s one obvious benefit with new planner is that we can provide some
> > sort of isolation so the change won’t cause test baseline updates, which
> > could be painful at times.  We should see if we 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-20 Thread Jinpeng Wu
Hi, Xiening.

Regarding calculating the logical cost, here are some ways I thought of:
1. A logical rel may implement its own computeSelfCost method (see the sketch
below). Some rels can provide such information; for example,
LogicalProject/LogicalFilter contain nearly the same information as their
physical implementations. If we don't have enough confidence, just returning
zeroCost is also OK, as it only affects pruning.
2. A logical rel tells its parents what its physical input could be after
implementation. Then the problem comes back to calculating the lower bound of
a physical rel.
There should always be ways. The only problem is how to find a pretty one.
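
To make way 1 concrete, here is a rough sketch; the classes are simplified
stand-ins for RelOptCost and the rel classes, not the real interfaces:

class Cost {
  static final Cost ZERO = new Cost(0);
  final double value;
  Cost(double value) { this.value = value; }
}

abstract class LogicalRelSketch {
  // Default: no confidence in a logical cost, so report zero and never
  // prune based on this node.
  Cost computeSelfCost(double inputRows) {
    return Cost.ZERO;
  }
}

class LogicalFilterSketch extends LogicalRelSketch {
  @Override Cost computeSelfCost(double inputRows) {
    // A filter costs about one pass over its input, which is nearly the
    // same information its physical implementation would report.
    return new Cost(inputRows);
  }
}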

Regarding the risk, a new planner does have a different risk profile. It is
not that a new planner could stop us from doing something wrong, but that we
can decide when to use the new one. Some scenarios:
1. If we modify the VolcanoPlanner directly, the only way users can control
the risk is not to upgrade the Calcite version until it is considered stable.
That is quite different from keeping Calcite updated and switching to the new
planner at the proper time.
2. It is very important for SLA control. For important business jobs, we may
keep using the old and stable planner, and use the new one only for jobs that
can tolerate faults. This also helps test the new planner against real
scenarios.
3. It is helpful when upgrading online services. If the new planner happens
to have bugs, we can switch to the old planner directly without rolling back
the whole service.
4. With all these ways to keep issues from becoming disasters, we are not as
vulnerable to mistakes. This not only enables faster iteration but also gives
us enough time to resolve big bugs properly, such as thinking them through in
detail and applying a time-consuming refactoring. Working around a critical
bug in tricky ways usually introduces more issues.

Thanks,
Jinpeng

On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai  wrote:

> Hi Jinpeng,
>
> Regarding this comment - I believe there are ways to calculate the logical
> cost, but I think it’s not that simple as  "cardinality * unit_copy_cost.”,
> would  you provide more details of other different ways? Just the algorithm
> description or pseudo code would help us understand. Thanks.
>
> Regarding the approach of creating new planner, I don’t think a new
> planner would lower the risk. We don’t know what we don’t know. If we
> introduced an issue while modifying the planner, most likely we would do
> the same with new planner class. A new planner doesn’t necessarily prevent
> the issue from happening, but just delay its surfacing, which is worse IMO.
>
> There’s one obvious benefit with new planner is that we can provide some
> sort of isolation so the change won’t cause test baseline updates, which
> could be painful at times.  We should see if we could use planner config to
> achieve the same if we decide to just modify the current planner.
>
>
> > On Apr 20, 2020, at 8:32 AM, Haisheng Yuan  wrote:
> >
> > Igor,
> >
> > That's great.
> >
> > On 2020/04/20 11:17:49, Seliverstov Igor  wrote:
> >> Haisheng, Xiening,
> >>
> >> Ok, Now I see how it should work.
> >>
> >> Thanks for your replies.
> >>
> >> Regards,
> >> Igor
> >>
> >>> On 20 Apr 2020, at 09:56, Seliverstov Igor 
> wrote:
> >>>
> >>> Haisheng, Xiening,
> >>>
> >>> Thanks for clarifying.
> >>>
> >>> In this proposal, we are not trying to split logical and physical
> planning entirely. - actually I was in doubt about an idea of entire
> splitting logical and physical phases, if you aren't going to, I have no
> objections.
> >>>
> >>> But it returns me to my first question: how we will propagate traits
> in bottom-up manner using proposed approach. (See [DISCUSS] Proposal to add
> API to force rules matching specific rels for details)
> >>>
> >>> One of inconveniences of current VolcanoPlanner implementation is
> amount of tricks that we need to get desired behaviour. It would be great
> if some of issues (or all of them) were solved in the new approach.
> >>>
> >>> Regards,
> >>> Igor
> >>>
> >>> On Mon, 20 Apr 2020 at 7:02, Xiening Dai  xndai@gmail.com>> wrote:
> >>> Hi Igor,
> >>>
> >>> Your comment - "because actual cost may be calculated correctly using
> physical operators only. So won't be able to implement Branch and Bound
> Space Pruning.“ is actually not true. In Cascade’s lower bound / upper
> bound pruning algorithm, you can get cost lower bound of input RelNode
> using cardinality * unit_copy_cost. The unit_copy_cost is a constant, which
> stands for the minimal cost for processing a tuple from the input. So this
> will be the minimal cost the input RelNode can achieve, and if this is
> indeed larger than the current best cost, this input node can be pruned.
> >>>
> >>> In this proposal, we are not trying to split logical and physical
> planning entirely. But for any given equivalent set, we would need to
> finish exploration before implementation. But for the entire memo tree,
> each set could be in different 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-20 Thread Haisheng Yuan
Hi Hanumath,

The trait in the example covers distribution only, for brevity, not collation. 
No matter whether it is a hash join, a merge join, or a nested-loop join, the 
same distribution requirement applies.
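
To illustrate the point, a sketch with hypothetical names, not the proposed 
API:

enum Distribution { ANY, HASH_ON_JOIN_KEYS, SINGLETON }

abstract class JoinSketch {
  // The distribution requirement is the same for every join algorithm:
  // both inputs hashed on the join keys (or everything on one node).
  Distribution requiredInputDistribution() {
    return Distribution.HASH_ON_JOIN_KEYS;
  }
}

class HashJoinSketch extends JoinSketch { }

class NestedLoopJoinSketch extends JoinSketch { }

class MergeJoinSketch extends JoinSketch {
  // Merge join differs only by an additional collation requirement on its
  // inputs, which the example omitted for brevity.
}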

> Are you planning to use the same interface as that of VolcanoPlanner?
Yes, not only for compatibility, but also because the design of RelNode and 
Rule has a huge impact on how the planner can be refactored. They are integrated 
components, not really separable. I have no choice.

> The concern about the optimizer can take up huge time for optimizing large 
> queries is a genuine. 
Because very little effort has been spent on improving the speed of the core 
framework over the last 10 years. Most queries (putting MV aside), as long as 
there is no dynamic-programming-style join reordering, should be optimized in a 
reasonable span of time; otherwise it is a **bug**. Alibaba MaxCompute processes 
millions of queries per day; they are mission-critical and time-critical, and 
all have optimization and execution time limits. We spent a lot of effort on 
improving optimization latency, and all the changes we make to Calcite will be 
battle-tested. The inefficiency caused by the current search strategy and the 
lack of a trait propagation mechanism, which other Calcite-based systems have 
also experienced, will be resolved by this proposal. If you genuinely want 
iteration-based optimization, you will be disappointed: even Cascades' first 
adopter, SQL Server, doesn't provide iteration-based optimization; instead it 
has a phase with a very limited set of optimization rules for transactional 
processing. You can't have your cake and eat it too.

I am inclined to modify VolcanoPlanner. A separate new planner won't make 
things move faster or lower the risk for online production. Nobody knows the 
pain of maintaining two optimizers more than I do. Ask Greenplum database 
developers how they feel about working on both Orca and the Postgres planner.

Haisheng

On 2020/04/20 18:07:06, hanu mapr  wrote: 
> Hello Haisheng,
> 
> 
> Thanks for the detailed analysis on the support for cascades framework. I
> am quite interested to be part of the new optimization framework. I believe
> this a very important infrastructural work to make calcite a robust query
> optimizer.
> 
> 
> I like your approach on the trait propagation and derivation of the traits
> from child nodes. I have a question though in the example you provided.
> Shouldn't HashJoin make more sense in place of MergeJoin as it can
> passThrough the traits like Hash whereas MergeJoin needs the sorted
> traits.(Please correct me if I am missing anything here).
> 
> 
> How are you imagining the new Optimizer interface from user point of view.
> Are you planning to use the same interface as that of VolcanoPlanner?
> 
> 
> The concern about the optimizer can take up huge time for optimizing large
> queries is a genuine. At the time when I was part of writing an optimizer
> for a federated engine, I also dealt with it by making the search
> time-bound. As mentioned even iteration based approach might be viable but
> I think a little more thought might be needed to finalize.
> 
> 
> I am of the opinion that option b add a new Planner might be the right
> approach. We can continuously work on the new optimizer and make it robust
> so that switching to the new optimizer can be seamless and can be
> controlled by the user. From my experience any bug in the optimizer is
> quite subtle and can have high magnitude degradation in performance, so
> tweaking the existing VolcanoOptimizer can be risky.
> 
> 
> Thanks,
> 
> -Hanumath Maduri
> 
> On Mon, Apr 20, 2020 at 11:04 AM Xiening Dai  wrote:
> 
> > Hi Jinpeng,
> >
> > Regarding this comment - I believe there are ways to calculate the logical
> > cost, but I think it’s not that simple as  "cardinality * unit_copy_cost.”,
> > would  you provide more details of other different ways? Just the algorithm
> > description or pseudo code would help us understand. Thanks.
> >
> > Regarding the approach of creating new planner, I don’t think a new
> > planner would lower the risk. We don’t know what we don’t know. If we
> > introduced an issue while modifying the planner, most likely we would do
> > the same with new planner class. A new planner doesn’t necessarily prevent
> > the issue from happening, but just delay its surfacing, which is worse IMO.
> >
> > There’s one obvious benefit with new planner is that we can provide some
> > sort of isolation so the change won’t cause test baseline updates, which
> > could be painful at times.  We should see if we could use planner config to
> > achieve the same if we decide to just modify the current planner.
> >
> >
> > > On Apr 20, 2020, at 8:32 AM, Haisheng Yuan  wrote:
> > >
> > > Igor,
> > >
> > > That's great.
> > >
> > > On 2020/04/20 11:17:49, Seliverstov Igor  wrote:
> > >> Haisheng, Xiening,
> > >>
> > >> Ok, Now I see how it should work.
> > >>
> > >> Thanks for your replies.
> > >>
> > >> Regards,
> > >> Igor
> > >>
> > >>> 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-20 Thread hanu mapr
Hello Haisheng,


Thanks for the detailed analysis of the support for the Cascades framework. I
am quite interested in being part of the new optimization framework. I believe
this is very important infrastructural work to make Calcite a robust query
optimizer.


I like your approach to trait propagation and the derivation of traits from
child nodes. I have a question, though, about the example you provided:
wouldn't HashJoin make more sense in place of MergeJoin, as it can pass
through traits like hash distribution, whereas MergeJoin needs sorted traits?
(Please correct me if I am missing anything here.)


How are you imagining the new optimizer interface from the user's point of
view? Are you planning to use the same interface as that of VolcanoPlanner?


The concern that the optimizer can take a huge amount of time optimizing
large queries is a genuine one. When I was part of writing an optimizer for a
federated engine, I also dealt with it by making the search time-bound. As
mentioned, even an iteration-based approach might be viable, but I think a
little more thought is needed before finalizing.
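
The time-bound search I have in mind looks roughly like the following; this is
a sketch with a made-up task-queue driver, not any existing API:

import java.util.ArrayDeque;
import java.util.Deque;

class TimeBoundSearch {
  private final long budgetMillis;
  private final Deque<Runnable> tasks = new ArrayDeque<>();

  TimeBoundSearch(long budgetMillis) {
    this.budgetMillis = budgetMillis;
  }

  void schedule(Runnable task) {
    tasks.push(task);
  }

  // Drain tasks until finished or the budget expires; the best plan found
  // so far is kept even if exploration is incomplete.
  void run() {
    long deadline = System.currentTimeMillis() + budgetMillis;
    while (!tasks.isEmpty() && System.currentTimeMillis() < deadline) {
      tasks.pop().run();
    }
  }
}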


I am of the opinion that option (b), adding a new planner, might be the right
approach. We can continuously work on the new optimizer and make it robust, so
that switching to the new optimizer is seamless and can be controlled by the
user. In my experience, any bug in an optimizer is quite subtle and can cause
severe performance degradation, so tweaking the existing VolcanoPlanner can be
risky.


Thanks,

-Hanumath Maduri

On Mon, Apr 20, 2020 at 11:04 AM Xiening Dai  wrote:

> Hi Jinpeng,
>
> Regarding this comment - I believe there are ways to calculate the logical
> cost, but I think it’s not that simple as  "cardinality * unit_copy_cost.”,
> would  you provide more details of other different ways? Just the algorithm
> description or pseudo code would help us understand. Thanks.
>
> Regarding the approach of creating new planner, I don’t think a new
> planner would lower the risk. We don’t know what we don’t know. If we
> introduced an issue while modifying the planner, most likely we would do
> the same with new planner class. A new planner doesn’t necessarily prevent
> the issue from happening, but just delay its surfacing, which is worse IMO.
>
> There’s one obvious benefit with new planner is that we can provide some
> sort of isolation so the change won’t cause test baseline updates, which
> could be painful at times.  We should see if we could use planner config to
> achieve the same if we decide to just modify the current planner.
>
>
> > On Apr 20, 2020, at 8:32 AM, Haisheng Yuan  wrote:
> >
> > Igor,
> >
> > That's great.
> >
> > On 2020/04/20 11:17:49, Seliverstov Igor  wrote:
> >> Haisheng, Xiening,
> >>
> >> Ok, Now I see how it should work.
> >>
> >> Thanks for your replies.
> >>
> >> Regards,
> >> Igor
> >>
> >>> On 20 Apr 2020, at 09:56, Seliverstov Igor 
> wrote:
> >>>
> >>> Haisheng, Xiening,
> >>>
> >>> Thanks for clarifying.
> >>>
> >>> In this proposal, we are not trying to split logical and physical
> planning entirely. - actually I was in doubt about an idea of entire
> splitting logical and physical phases, if you aren't going to, I have no
> objections.
> >>>
> >>> But it returns me to my first question: how we will propagate traits
> in bottom-up manner using proposed approach. (See [DISCUSS] Proposal to add
> API to force rules matching specific rels for details)
> >>>
> >>> One of inconveniences of current VolcanoPlanner implementation is
> amount of tricks that we need to get desired behaviour. It would be great
> if some of issues (or all of them) were solved in the new approach.
> >>>
> >>> Regards,
> >>> Igor
> >>>
> >>> On Mon, 20 Apr 2020 at 7:02, Xiening Dai  xndai@gmail.com>> wrote:
> >>> Hi Igor,
> >>>
> >>> Your comment - "because actual cost may be calculated correctly using
> physical operators only. So won't be able to implement Branch and Bound
> Space Pruning.“ is actually not true. In Cascade’s lower bound / upper
> bound pruning algorithm, you can get cost lower bound of input RelNode
> using cardinality * unit_copy_cost. The unit_copy_cost is a constant, which
> stands for the minimal cost for processing a tuple from the input. So this
> will be the minimal cost the input RelNode can achieve, and if this is
> indeed larger than the current best cost, this input node can be pruned.
> >>>
> >>> In this proposal, we are not trying to split logical and physical
> planning entirely. But for any given equivalent set, we would need to
> finish exploration before implementation. But for the entire memo tree,
> each set could be in different planning stage, and an equivalent set can be
> pruned even before it’s implemented. A text book example of aforementioned
> “lower bound / upper bound pruning” would be the join ordering case.
> >>>
> >>> Regarding #3, I think we can still achieve that (partially) through
> this proposal. Remember every time when we start the optimization, we 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-20 Thread Xiening Dai
Hi Jinpeng,

Regarding this comment - "I believe there are ways to calculate the logical 
cost, but I think it's not as simple as cardinality * unit_copy_cost" - would 
you provide more details of those other ways? Just an algorithm description or 
pseudocode would help us understand. Thanks.

Regarding the approach of creating a new planner, I don't think a new planner 
would lower the risk. We don't know what we don't know. If we would introduce an 
issue while modifying the planner, most likely we would do the same in a new 
planner class. A new planner doesn't necessarily prevent the issue from 
happening; it just delays its surfacing, which is worse IMO.

One obvious benefit of a new planner is that we can provide some sort of 
isolation, so the changes won't cause test baseline updates, which can be 
painful at times. We should see whether we could use planner config to achieve 
the same if we decide to just modify the current planner.


> On Apr 20, 2020, at 8:32 AM, Haisheng Yuan  wrote:
> 
> Igor,
> 
> That's great.
> 
> On 2020/04/20 11:17:49, Seliverstov Igor  wrote: 
>> Haisheng, Xiening,
>> 
>> Ok, Now I see how it should work.
>> 
>> Thanks for your replies.
>> 
>> Regards,
>> Igor
>> 
>>> On 20 Apr 2020, at 09:56, Seliverstov Igor  
>>> wrote:
>>> 
>>> Haisheng, Xiening,
>>> 
>>> Thanks for clarifying. 
>>> 
>>> In this proposal, we are not trying to split logical and physical planning 
>>> entirely. - actually I was in doubt about an idea of entire splitting 
>>> logical and physical phases, if you aren't going to, I have no objections.
>>> 
>>> But it returns me to my first question: how we will propagate traits in 
>>> bottom-up manner using proposed approach. (See [DISCUSS] Proposal to add 
>>> API to force rules matching specific rels for details)
>>> 
>>> One of inconveniences of current VolcanoPlanner implementation is amount of 
>>> tricks that we need to get desired behaviour. It would be great if some of 
>>> issues (or all of them) were solved in the new approach.
>>> 
>>> Regards,
>>> Igor
>>> 
>>> On Mon, 20 Apr 2020 at 7:02, Xiening Dai >> > wrote:
>>> Hi Igor,
>>> 
>>> Your comment - "because actual cost may be calculated correctly using 
>>> physical operators only. So won't be able to implement Branch and Bound 
>>> Space Pruning.“ is actually not true. In Cascade’s lower bound / upper 
>>> bound pruning algorithm, you can get cost lower bound of input RelNode 
>>> using cardinality * unit_copy_cost. The unit_copy_cost is a constant, which 
>>> stands for the minimal cost for processing a tuple from the input. So this 
>>> will be the minimal cost the input RelNode can achieve, and if this is 
>>> indeed larger than the current best cost, this input node can be pruned. 
>>> 
>>> In this proposal, we are not trying to split logical and physical planning 
>>> entirely. But for any given equivalent set, we would need to finish 
>>> exploration before implementation. But for the entire memo tree, each set 
>>> could be in different planning stage, and an equivalent set can be pruned 
>>> even before it’s implemented. A text book example of aforementioned “lower 
>>> bound / upper bound pruning” would be the join ordering case.
>>> 
>>> Regarding #3, I think we can still achieve that (partially) through this 
>>> proposal. Remember every time when we start the optimization, we pass down 
>>> an upper bound limit. Initially this upper bound for root is infinite - 
>>> meaning that no feasible plan is available. Every time when we find a 
>>> physical plan we update this upper bound, then start the search again. We 
>>> could stop the search when the cost is less than a pre-defined threshold - 
>>> which gives you a “good enough” plan with early termination. Still this 
>>> wouldn't avoid the logical exploration. For that, you would probably 
>>> archive through rule configurations, and avoid some expansive 
>>> transformation to keep the cost down. 
>>> 
>>> 
 On Apr 19, 2020, at 7:30 PM, Haisheng Yuan >>> > wrote:
 
 Igor,
 
 a) Given current Calcite's stats derivation strategy, mixing logical and 
 physical planning won't make it better. I hope you went through my email 
 to the end, currently operators inside a memo group don't share stats 
 info, each operator's stats may differ with the other one, and the 
 RelSubset's best node and stats may vary during optimization. So logical 
 and physical rule pruning are not safe at the moment, otherwise it almost 
 implies changing downstream project's cost model silently. 
 
 On the other hand, ensuring child group exploration task finish first will 
 make rule mutual exclusivity check possible, like the result of 
 ReduceExpressionRule won't need trigger the same rule again, The join 
 result of JoinCommuteRule won't need trigger JoinCommuteRule and 
 ReduceExpressionRule again. 
 
 More importantly, 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-20 Thread Haisheng Yuan
Igor,

That's great.

On 2020/04/20 11:17:49, Seliverstov Igor  wrote: 
> Haisheng, Xiening,
> 
> Ok, Now I see how it should work.
> 
> Thanks for your replies.
> 
> Regards,
> Igor
> 
> > On 20 Apr 2020, at 09:56, Seliverstov Igor  
> > wrote:
> > 
> > Haisheng, Xiening,
> > 
> > Thanks for clarifying. 
> > 
> > In this proposal, we are not trying to split logical and physical planning 
> > entirely. - actually I was in doubt about an idea of entire splitting 
> > logical and physical phases, if you aren't going to, I have no objections.
> > 
> > But it returns me to my first question: how we will propagate traits in 
> > bottom-up manner using proposed approach. (See [DISCUSS] Proposal to add 
> > API to force rules matching specific rels for details)
> > 
> > One of inconveniences of current VolcanoPlanner implementation is amount of 
> > tricks that we need to get desired behaviour. It would be great if some of 
> > issues (or all of them) were solved in the new approach.
> > 
> > Regards,
> > Igor
> > 
> > On Mon, 20 Apr 2020 at 7:02, Xiening Dai  > > wrote:
> > Hi Igor,
> > 
> > Your comment - "because actual cost may be calculated correctly using 
> > physical operators only. So won't be able to implement Branch and Bound 
> > Space Pruning.“ is actually not true. In Cascade’s lower bound / upper 
> > bound pruning algorithm, you can get cost lower bound of input RelNode 
> > using cardinality * unit_copy_cost. The unit_copy_cost is a constant, which 
> > stands for the minimal cost for processing a tuple from the input. So this 
> > will be the minimal cost the input RelNode can achieve, and if this is 
> > indeed larger than the current best cost, this input node can be pruned. 
> > 
> > In this proposal, we are not trying to split logical and physical planning 
> > entirely. But for any given equivalent set, we would need to finish 
> > exploration before implementation. But for the entire memo tree, each set 
> > could be in different planning stage, and an equivalent set can be pruned 
> > even before it’s implemented. A text book example of aforementioned “lower 
> > bound / upper bound pruning” would be the join ordering case.
> > 
> > Regarding #3, I think we can still achieve that (partially) through this 
> > proposal. Remember every time when we start the optimization, we pass down 
> > an upper bound limit. Initially this upper bound for root is infinite - 
> > meaning that no feasible plan is available. Every time when we find a 
> > physical plan we update this upper bound, then start the search again. We 
> > could stop the search when the cost is less than a pre-defined threshold - 
> > which gives you a “good enough” plan with early termination. Still this 
> > wouldn't avoid the logical exploration. For that, you would probably 
> > archive through rule configurations, and avoid some expansive 
> > transformation to keep the cost down. 
> > 
> > 
> > > On Apr 19, 2020, at 7:30 PM, Haisheng Yuan  > > > wrote:
> > > 
> > > Igor,
> > > 
> > > a) Given current Calcite's stats derivation strategy, mixing logical and 
> > > physical planning won't make it better. I hope you went through my email 
> > > to the end, currently operators inside a memo group don't share stats 
> > > info, each operator's stats may differ with the other one, and the 
> > > RelSubset's best node and stats may vary during optimization. So logical 
> > > and physical rule pruning are not safe at the moment, otherwise it almost 
> > > implies changing downstream project's cost model silently. 
> > > 
> > > On the other hand, ensuring child group exploration task finish first 
> > > will make rule mutual exclusivity check possible, like the result of 
> > > ReduceExpressionRule won't need trigger the same rule again, The join 
> > > result of JoinCommuteRule won't need trigger JoinCommuteRule and 
> > > ReduceExpressionRule again. 
> > > 
> > > More importantly, most if not all the the long planning queries in 
> > > Calcite are not caused by too many alternatives, but mutual triggering, 
> > > or cyclic triggering, which can be avoided by customizing the rules 
> > > instead of using the default one. Unless you use dynamic programming 
> > > (Calcite use heuristic) on join reordering (left-deep, right-deep, 
> > > bushy), space pruning won't help make long / infinite running query 
> > > faster.
> > > 
> > > b) No evidence shows current version of Calcite will return the most 
> > > promising plan in first planning iteration. Instead of praying for 
> > > getting good enough plan in the first iteration, why not focus on fixing 
> > > rules that causes the issue?
> > > 
> > > c) That is not the goal.
> > > 
> > > On 2020/04/19 15:14:57, Seliverstov Igor  > > > wrote: 
> > >> Haisheng,
> > >> 
> > >> From my point of view splitting logical and physical planning parts 
> > >> isn’t a good idea. 
> > >> 
> > >> I think so because actual 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-20 Thread Seliverstov Igor
Haisheng, Xiening,

Ok, Now I see how it should work.

Thanks for your replies.

Regards,
Igor

> On 20 Apr 2020, at 09:56, Seliverstov Igor  wrote:
> 
> Haisheng, Xiening,
> 
> Thanks for clarifying. 
> 
> In this proposal, we are not trying to split logical and physical planning 
> entirely. - actually I was in doubt about an idea of entire splitting logical 
> and physical phases, if you aren't going to, I have no objections.
> 
> But it returns me to my first question: how we will propagate traits in 
> bottom-up manner using proposed approach. (See [DISCUSS] Proposal to add API 
> to force rules matching specific rels for details)
> 
> One of inconveniences of current VolcanoPlanner implementation is amount of 
> tricks that we need to get desired behaviour. It would be great if some of 
> issues (or all of them) were solved in the new approach.
> 
> Regards,
> Igor
> 
> On Mon, 20 Apr 2020 at 7:02, Xiening Dai  > wrote:
> Hi Igor,
> 
> Your comment - "because actual cost may be calculated correctly using 
> physical operators only. So won't be able to implement Branch and Bound Space 
> Pruning.“ is actually not true. In Cascade’s lower bound / upper bound 
> pruning algorithm, you can get cost lower bound of input RelNode using 
> cardinality * unit_copy_cost. The unit_copy_cost is a constant, which stands 
> for the minimal cost for processing a tuple from the input. So this will be 
> the minimal cost the input RelNode can achieve, and if this is indeed larger 
> than the current best cost, this input node can be pruned. 
> 
> In this proposal, we are not trying to split logical and physical planning 
> entirely. But for any given equivalent set, we would need to finish 
> exploration before implementation. But for the entire memo tree, each set 
> could be in different planning stage, and an equivalent set can be pruned 
> even before it’s implemented. A text book example of aforementioned “lower 
> bound / upper bound pruning” would be the join ordering case.
> 
> Regarding #3, I think we can still achieve that (partially) through this 
> proposal. Remember every time when we start the optimization, we pass down an 
> upper bound limit. Initially this upper bound for root is infinite - meaning 
> that no feasible plan is available. Every time when we find a physical plan 
> we update this upper bound, then start the search again. We could stop the 
> search when the cost is less than a pre-defined threshold - which gives you a 
> “good enough” plan with early termination. Still this wouldn't avoid the 
> logical exploration. For that, you would probably archive through rule 
> configurations, and avoid some expansive transformation to keep the cost 
> down. 
> 
> 
> > On Apr 19, 2020, at 7:30 PM, Haisheng Yuan  > > wrote:
> > 
> > Igor,
> > 
> > a) Given current Calcite's stats derivation strategy, mixing logical and 
> > physical planning won't make it better. I hope you went through my email to 
> > the end, currently operators inside a memo group don't share stats info, 
> > each operator's stats may differ with the other one, and the RelSubset's 
> > best node and stats may vary during optimization. So logical and physical 
> > rule pruning are not safe at the moment, otherwise it almost implies 
> > changing downstream project's cost model silently. 
> > 
> > On the other hand, ensuring child group exploration task finish first will 
> > make rule mutual exclusivity check possible, like the result of 
> > ReduceExpressionRule won't need trigger the same rule again, The join 
> > result of JoinCommuteRule won't need trigger JoinCommuteRule and 
> > ReduceExpressionRule again. 
> > 
> > More importantly, most if not all the the long planning queries in Calcite 
> > are not caused by too many alternatives, but mutual triggering, or cyclic 
> > triggering, which can be avoided by customizing the rules instead of using 
> > the default one. Unless you use dynamic programming (Calcite use heuristic) 
> > on join reordering (left-deep, right-deep, bushy), space pruning won't help 
> > make long / infinite running query faster.
> > 
> > b) No evidence shows current version of Calcite will return the most 
> > promising plan in first planning iteration. Instead of praying for getting 
> > good enough plan in the first iteration, why not focus on fixing rules that 
> > causes the issue?
> > 
> > c) That is not the goal.
> > 
> > On 2020/04/19 15:14:57, Seliverstov Igor  > > wrote: 
> >> Haisheng,
> >> 
> >> From my point of view splitting logical and physical planning parts isn’t 
> >> a good idea. 
> >> 
> >> I think so because actual cost may be calculated correctly using physical 
> >> operators only. So that we:
> >> a) won't be able to implement Branch and Bound Space Pruning (as far as I 
> >> understand, at exploring time there are no physical operators, no correct 
> >> costs, but assumptions only, I don’t think we 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-20 Thread Seliverstov Igor
Haisheng, Xiening,

Thanks for clarifying.

"In this proposal, we are not trying to split logical and physical planning
entirely" - actually, I was in doubt about the idea of entirely splitting the
logical and physical phases; if you aren't going to, I have no objections.

But it brings me back to my first question: how will we propagate traits in
a bottom-up manner using the proposed approach? (See "[DISCUSS] Proposal to
add API to force rules matching specific rels" for details.)
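
Roughly, what I would like to be expressible is something like the following;
the names are made up for illustration, not part of the proposal's API:

enum Collation { NONE, SORTED_ON_JOIN_KEYS }

abstract class PhysicalNodeSketch {
  // Called after a child is implemented: given the traits the child
  // actually delivers, produce a parent variant that exploits them,
  // or null if none applies.
  abstract PhysicalNodeSketch derive(Collation childCollation);
}

class HashJoinNodeSketch extends PhysicalNodeSketch {
  @Override PhysicalNodeSketch derive(Collation childCollation) {
    // If a child happens to deliver sorted output, a merge join variant
    // becomes worth registering.
    return childCollation == Collation.SORTED_ON_JOIN_KEYS
        ? new MergeJoinNodeSketch()
        : null;
  }
}

class MergeJoinNodeSketch extends PhysicalNodeSketch {
  @Override PhysicalNodeSketch derive(Collation childCollation) {
    return null; // already consumes the sortedness
  }
}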

One of the inconveniences of the current VolcanoPlanner implementation is the
number of tricks that we need to get the desired behaviour. It would be great
if some of these issues (or all of them) were solved in the new approach.

Regards,
Igor

On Mon, 20 Apr 2020 at 7:02, Xiening Dai  wrote:

> Hi Igor,
>
> Your comment - "because actual cost may be calculated correctly using
> physical operators only. So won't be able to implement Branch and Bound
> Space Pruning.“ is actually not true. In Cascade’s lower bound / upper
> bound pruning algorithm, you can get cost lower bound of input RelNode
> using cardinality * unit_copy_cost. The unit_copy_cost is a constant, which
> stands for the minimal cost for processing a tuple from the input. So this
> will be the minimal cost the input RelNode can achieve, and if this is
> indeed larger than the current best cost, this input node can be pruned.
>
> In this proposal, we are not trying to split logical and physical planning
> entirely. But for any given equivalent set, we would need to finish
> exploration before implementation. But for the entire memo tree, each set
> could be in different planning stage, and an equivalent set can be pruned
> even before it’s implemented. A text book example of aforementioned “lower
> bound / upper bound pruning” would be the join ordering case.
>
> Regarding #3, I think we can still achieve that (partially) through this
> proposal. Remember every time when we start the optimization, we pass down
> an upper bound limit. Initially this upper bound for root is infinite -
> meaning that no feasible plan is available. Every time when we find a
> physical plan we update this upper bound, then start the search again. We
> could stop the search when the cost is less than a pre-defined threshold -
> which gives you a “good enough” plan with early termination. Still this
> wouldn't avoid the logical exploration. For that, you would probably
> archive through rule configurations, and avoid some expansive
> transformation to keep the cost down.
>
>
> > On Apr 19, 2020, at 7:30 PM, Haisheng Yuan  wrote:
> >
> > Igor,
> >
> > a) Given current Calcite's stats derivation strategy, mixing logical and
> physical planning won't make it better. I hope you went through my email to
> the end, currently operators inside a memo group don't share stats info,
> each operator's stats may differ with the other one, and the RelSubset's
> best node and stats may vary during optimization. So logical and physical
> rule pruning are not safe at the moment, otherwise it almost implies
> changing downstream project's cost model silently.
> >
> > On the other hand, ensuring child group exploration task finish first
> will make rule mutual exclusivity check possible, like the result of
> ReduceExpressionRule won't need trigger the same rule again, The join
> result of JoinCommuteRule won't need trigger JoinCommuteRule and
> ReduceExpressionRule again.
> >
> > More importantly, most if not all the the long planning queries in
> Calcite are not caused by too many alternatives, but mutual triggering, or
> cyclic triggering, which can be avoided by customizing the rules instead of
> using the default one. Unless you use dynamic programming (Calcite use
> heuristic) on join reordering (left-deep, right-deep, bushy), space pruning
> won't help make long / infinite running query faster.
> >
> > b) No evidence shows current version of Calcite will return the most
> promising plan in first planning iteration. Instead of praying for getting
> good enough plan in the first iteration, why not focus on fixing rules that
> causes the issue?
> >
> > c) That is not the goal.
> >
> > On 2020/04/19 15:14:57, Seliverstov Igor  wrote:
> >> Haisheng,
> >>
> >> From my point of view, splitting the logical and physical planning parts
> >> isn’t a good idea.
> >>
> >> I think so because the actual cost may be calculated correctly using
> >> physical operators only. As a result, we:
> >> a) won't be able to implement Branch and Bound Space Pruning (as far as
> >> I understand, at exploration time there are no physical operators and no
> >> correct costs, only assumptions, and I don’t think we should rely on them)
> >> b) won’t be able to get a good enough plan (without Branch and Bound
> >> Space Pruning it’s a non-trivial task to get the right search direction
> >> and the most promising plans in the first planning iterations)
> >> c) won’t be able to tune the good enough plan while several similar
> >> queries are executed
> >>
> >> Regards,
> >> Igor
> >>
> >>
> >>> On Apr 19, 2020, at 17:37, Haisheng Yuan  

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-20 Thread 吴金朋
Hi, Haisheng and Igor.

I think we do need the ability to prune the logical search space, but we can
achieve it step by step. In the first trial, we implement and optimize a
rel only after it is fully explored. Then, after solving problems like
group-shared stats and logical cost accuracy, we move it forward.
Haisheng's current design does not prevent future refactoring.

I believe there are ways to calculate the logical cost, but I think it's
not as simple as "cardinality * unit_copy_cost". The "current best
cost" is computed by the actual cost model from a physical plan, and it is
not comparable to such a simple cost model. Achieving this goal may
require some modification to the RelNode interface (see the sketch below).
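For illustration only, a minimal sketch of the kind of hook this could mean,
with a hypothetical interface name (nothing below is an existing Calcite
API; only RelOptCost and RelMetadataQuery are Calcite's):

    import org.apache.calcite.plan.RelOptCost;
    import org.apache.calcite.rel.metadata.RelMetadataQuery;

    // Hypothetical lower-bound hook for rels. The key requirement raised
    // above is that the returned cost must be comparable with the costs
    // the engine's actual cost model produces for physical plans.
    public interface LowerBoundAware {
      // An optimistic bound: no physical implementation of this rel (with
      // these inputs) may ever cost less than the returned value.
      RelOptCost getLowerBoundCost(RelMetadataQuery mq);
    }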


For the discussion mentioned in the last part of Haisheng's proposal, I am
more in favor of the second way: add a new planner. Here are my opinions:

1. This proposal is a long-term work item. It may take months or even years
before it is ready for production. The product I am working on serves
millions of queries every day. Sometimes a simple change may cause big
problems. I cannot afford the risk and testing effort if every release
contains so many big changes.

2. This work is more an exploration than a step-by-step plan. Sometimes you
may prefer one implementation, and after moving on to the next goal you
find that another design may be better. It may involve refactoring on top
of refactoring. Modifying the VolcanoPlanner directly not only introduces
more risk but also makes the design harder, as you need to keep backward
compatibility always in mind. With a new planner, we can try wilder things.
We can even commit staged code that is not yet useful on its own, to ease
cooperation.

3. Though it is a new planner, it still satisfies the Planner interface, and
I see no difficulty in switching between the VolcanoPlanner and the new one.
Actually, a simple factory that creates planner instances from a config
setting is the only effort it needs (see the sketch after this list). This
applies to test cases, too, so all the tests are sharable. And this feature,
switching between the old and new implementations with one config setting,
is very helpful when publishing to an online service.

4. The new planner can derive from the VolcanoPlanner and share the same
data structures, so it does not require more coding effort than modifying
the VolcanoPlanner directly. And, with such derivation, after some
achievements are considered stable and helpful, it is also possible to
apply the changes to the VolcanoPlanner directly, to help improve the old
planner.
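A minimal sketch of the factory idea from point 3, assuming a hypothetical
CascadesPlanner class and a single boolean flag (only VolcanoPlanner is a
real Calcite class here):

    import org.apache.calcite.plan.RelOptPlanner;
    import org.apache.calcite.plan.volcano.VolcanoPlanner;

    public final class PlannerFactory {
      private PlannerFactory() {}

      // One config flag decides which planner callers (and tests) get, so
      // old and new implementations can be swapped without code changes.
      public static RelOptPlanner create(boolean useNewPlanner) {
        // CascadesPlanner is hypothetical; per point 4 it could derive
        // from VolcanoPlanner and share its data structures.
        return useNewPlanner ? new CascadesPlanner() : new VolcanoPlanner();
      }
    }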


On Mon, Apr 20, 2020 at 12:02 PM, Xiening Dai  wrote:

> Hi Igor,
>
> Your comment - "because actual cost may be calculated correctly using
> physical operators only. So won't be able to implement Branch and Bound
> Space Pruning“ - is actually not true. In the Cascades lower bound / upper
> bound pruning algorithm, you can get the cost lower bound of an input
> RelNode using cardinality * unit_copy_cost. The unit_copy_cost is a
> constant which stands for the minimal cost of processing a tuple from the
> input. So this will be the minimal cost the input RelNode can achieve, and
> if this is indeed larger than the current best cost, the input node can be
> pruned.
>
> In this proposal, we are not trying to split logical and physical planning
> entirely. But for any given equivalent set, we would need to finish
> exploration before implementation, while for the entire memo tree, each set
> could be in a different planning stage, and an equivalent set can be pruned
> even before it’s implemented. A textbook example of the aforementioned “lower
> bound / upper bound pruning” would be the join ordering case.
>
> Regarding #3, I think we can still achieve that (partially) through this
> proposal. Remember that every time we start the optimization, we pass down
> an upper bound limit. Initially this upper bound for the root is infinite -
> meaning that no feasible plan has been found yet. Every time we find a
> physical plan we update this upper bound, then start the search again. We
> could stop the search when the cost is less than a pre-defined threshold -
> which gives you a “good enough” plan with early termination. Still, this
> wouldn't avoid the logical exploration; for that, you would probably
> achieve it through rule configurations, avoiding some expensive
> transformations to keep the cost down.
>
>
> > On Apr 19, 2020, at 7:30 PM, Haisheng Yuan  wrote:
> >
> > Igor,
> >
> > a) Given current Calcite's stats derivation strategy, mixing logical and
> > physical planning won't make it better. I hope you went through my email
> > to the end: currently operators inside a memo group don't share stats
> > info, each operator's stats may differ from the others', and the
> > RelSubset's best node and stats may vary during optimization. So logical
> > and physical rule pruning are not safe at the moment; otherwise it almost
> > implies silently changing downstream projects' cost models.
> >
> > On the other hand, ensuring that child group exploration tasks finish
> > first will make rule mutual exclusivity 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-19 Thread Xiening Dai
Hi Igor,

Your comment - "because actual cost may be calculated correctly using physical
operators only. So won't be able to implement Branch and Bound Space Pruning“ -
is actually not true. In the Cascades lower bound / upper bound pruning
algorithm, you can get the cost lower bound of an input RelNode using
cardinality * unit_copy_cost. The unit_copy_cost is a constant which stands for
the minimal cost of processing a tuple from the input. So this will be the
minimal cost the input RelNode can achieve, and if this is indeed larger than
the current best cost, the input node can be pruned.
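A sketch of that check with made-up names, just to make the arithmetic
concrete (this is illustrative, not Calcite API):

    // Lower bound / upper bound pruning as described above.
    static final double UNIT_COPY_COST = 1.0;  // minimal cost per input tuple

    // True if the input group can be pruned: even the cheapest conceivable
    // physical implementation (cardinality * UNIT_COPY_COST) already costs
    // more than the best complete plan found so far.
    static boolean canPrune(double inputCardinality, double currentBestCost) {
      double lowerBound = inputCardinality * UNIT_COPY_COST;
      return lowerBound > currentBestCost;
    }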

In this proposal, we are not trying to split logical and physical planning
entirely. But for any given equivalent set, we would need to finish exploration
before implementation, while for the entire memo tree, each set could be in a
different planning stage, and an equivalent set can be pruned even before it’s
implemented. A textbook example of the aforementioned “lower bound / upper
bound pruning” would be the join ordering case.

Regarding #3, I think we can still achieve that (partially) through this
proposal. Remember that every time we start the optimization, we pass down an
upper bound limit. Initially this upper bound for the root is infinite -
meaning that no feasible plan has been found yet. Every time we find a physical
plan we update this upper bound, then start the search again. We could stop the
search when the cost is less than a pre-defined threshold - which gives you a
“good enough” plan with early termination. Still, this wouldn't avoid the
logical exploration; for that, you would probably achieve it through rule
configurations, avoiding some expensive transformations to keep the cost down.
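A sketch of that search loop under made-up names; optimizeOnce(bound) stands
in for one search pass that returns the cheapest plan costing less than the
bound, or null if none exists, and Plan / GOOD_ENOUGH_THRESHOLD are
stand-ins as well:

    // Illustrative only: iterative search with a tightening upper bound
    // and early termination on a "good enough" plan.
    double upperBound = Double.POSITIVE_INFINITY;  // no feasible plan yet
    Plan best = null;
    while (true) {
      Plan candidate = optimizeOnce(upperBound);
      if (candidate == null) {
        break;                        // search space exhausted
      }
      best = candidate;
      upperBound = candidate.cost();  // tighten the bound, then search again
      if (upperBound < GOOD_ENOUGH_THRESHOLD) {
        break;                        // early termination: good enough
      }
    }
    // 'best' now holds the chosen plan (null if none was found).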


> On Apr 19, 2020, at 7:30 PM, Haisheng Yuan  wrote:
> 
> Igor,
> 
> a) Given current Calcite's stats derivation strategy, mixing logical and
> physical planning won't make it better. I hope you went through my email to
> the end: currently operators inside a memo group don't share stats info,
> each operator's stats may differ from the others', and the RelSubset's best
> node and stats may vary during optimization. So logical and physical rule
> pruning are not safe at the moment; otherwise it almost implies silently
> changing downstream projects' cost models.
> 
> On the other hand, ensuring that child group exploration tasks finish first
> will make rule mutual-exclusivity checks possible: the result of
> ReduceExpressionRule won't need to trigger the same rule again, and the
> join result of JoinCommuteRule won't need to trigger JoinCommuteRule and
> ReduceExpressionRule again.
> 
> More importantly, most, if not all, of the long-planning queries in Calcite
> are caused not by too many alternatives but by mutual or cyclic rule
> triggering, which can be avoided by customizing the rules instead of using
> the default ones. Unless you use dynamic programming (Calcite uses
> heuristics) on join reordering (left-deep, right-deep, bushy), space
> pruning won't help make long- or infinite-running queries faster.
> 
> b) No evidence shows that the current version of Calcite will return the
> most promising plan in the first planning iteration. Instead of praying for
> a good enough plan in the first iteration, why not focus on fixing the
> rules that cause the issue?
> 
> c) That is not the goal.
> 
> On 2020/04/19 15:14:57, Seliverstov Igor  wrote: 
>> Haisheng,
>> 
>> From my point of view, splitting the logical and physical planning parts
>> isn’t a good idea.
>> 
>> I think so because the actual cost may be calculated correctly using
>> physical operators only. As a result, we:
>> a) won't be able to implement Branch and Bound Space Pruning (as far as I
>> understand, at exploration time there are no physical operators and no
>> correct costs, only assumptions, and I don’t think we should rely on them)
>> b) won’t be able to get a good enough plan (without Branch and Bound Space
>> Pruning it’s a non-trivial task to get the right search direction and the
>> most promising plans in the first planning iterations)
>> c) won’t be able to tune the good enough plan while several similar
>> queries are executed
>> 
>> Regards,
>> Igor
>> 
>> 
>>> On Apr 19, 2020, at 17:37, Haisheng Yuan  wrote:
>>> 
>>> Hi Igor,
>>> 
>>> You can't have your cake and eat it.
>>> But one feasible work item we can definitely do is: once we hit the
>>> timeout, stop exploring, use the first available physical operator in
>>> each group, and optimize it.
>>> 
>>> Because most, if not all, of the long / infinite running optimizations
>>> are caused by project-related transpose rules, join reordering (join
>>> commute + associative), and constant reduction for large expressions (see
>>> CALCITE-3460), all of which are logical transformation rules and many of
>>> which have corresponding JIRAs. So another alternative is: let's fix
>>> these bugs to improve Calcite and make the timeout option less necessary.
>>> 
>>> Another thing worth noting is that SQL Server's optimizer timeout mechanism 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-19 Thread Haisheng Yuan
Igor,

a) Given current Calcite's stats derivation strategy, mixing logical and
physical planning won't make it better. I hope you went through my email to
the end: currently operators inside a memo group don't share stats info, each
operator's stats may differ from the others', and the RelSubset's best node
and stats may vary during optimization. So logical and physical rule pruning
are not safe at the moment; otherwise it almost implies silently changing
downstream projects' cost models.

On the other hand, ensuring that child group exploration tasks finish first
will make rule mutual-exclusivity checks possible: the result of
ReduceExpressionRule won't need to trigger the same rule again, and the join
result of JoinCommuteRule won't need to trigger JoinCommuteRule and
ReduceExpressionRule again.

More importantly, most, if not all, of the long-planning queries in Calcite
are caused not by too many alternatives but by mutual or cyclic rule
triggering, which can be avoided by customizing the rules instead of using
the default ones. Unless you use dynamic programming (Calcite uses
heuristics) on join reordering (left-deep, right-deep, bushy), space pruning
won't help make long- or infinite-running queries faster.

b) No evidence shows that the current version of Calcite will return the most
promising plan in the first planning iteration. Instead of praying for a good
enough plan in the first iteration, why not focus on fixing the rules that
cause the issue?

c) That is not the goal.

On 2020/04/19 15:14:57, Seliverstov Igor  wrote: 
> Haisheng,
> 
> From my point of view, splitting the logical and physical planning parts
> isn’t a good idea.
> 
> I think so because the actual cost may be calculated correctly using
> physical operators only. As a result, we:
>  a) won't be able to implement Branch and Bound Space Pruning (as far as I
> understand, at exploration time there are no physical operators and no
> correct costs, only assumptions, and I don’t think we should rely on them)
>  b) won’t be able to get a good enough plan (without Branch and Bound Space
> Pruning it’s a non-trivial task to get the right search direction and the
> most promising plans in the first planning iterations)
>  c) won’t be able to tune the good enough plan while several similar
> queries are executed
> 
> Regards,
> Igor
> 
> 
> > On Apr 19, 2020, at 17:37, Haisheng Yuan  wrote:
> > 
> > Hi Igor,
> > 
> > You can't have your cake and eat it.
> > But one feasible work item we can definitely do is: once we hit the
> > timeout, stop exploring, use the first available physical operator in
> > each group, and optimize it.
> > 
> > Because most, if not all, of the long / infinite running optimizations
> > are caused by project-related transpose rules, join reordering (join
> > commute + associative), and constant reduction for large expressions (see
> > CALCITE-3460), all of which are logical transformation rules and many of
> > which have corresponding JIRAs. So another alternative is: let's fix
> > these bugs to improve Calcite and make the timeout option less necessary.
> > 
> > Another thing worth noting is that SQL Server's optimizer timeout
> > mechanism is based not on clock time but on the number of optimization
> > tasks it has completed [1].
> > 
> > [1] 
> > https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188
> > 
> > Regards,
> > Haisheng 
> > 
> > On 2020/04/19 11:31:27, Seliverstov Igor  wrote: 
> >> Haisheng,
> >> 
> >> Ok, then such a notification isn't needed.
> >>
> >> But in this case we don't have any control over how long planning takes.
> >>
> >> For some systems it's necessary to get a good enough plan right now
> >> instead of the best one after a while.
> >>
> >> For example, we've been considering a case where a query is optimised
> >> several times in short iterations when it's impossible to get the best
> >> plan in a reasonable period of time (for example, there is an SLA on
> >> response time).
> >>
> >> This means we need all the required physical implementations after each
> >> logical transformation is applied.
> >> 
> >> Regards,
> >> Igor
> >> 
> >> 
> >> On Sun, Apr 19, 2020 at 13:55, Haisheng Yuan  wrote:
> >> 
> >>> Hi Igor,
> >>> 
> >>> There will be no rule queue anymore. Y will be fully explored (logical
> >>> rules are matched and applied) before it can be implemented and optimized.
> >>> 
> >>> Thanks,
> >>> Haisheng
> >>> 
> >>> On 2020/04/19 10:11:45, Seliverstov Igor  wrote:
>  Hi Haisheng,
>  
>  Great explanation, thanks.
>  
>  One thing I'd like to cover in advance is trait propagation process (I
> >>> need
>  it for Apache Ignite SQL engine implementation).
>  
>  For example:
>  
>  There are two nodes: Rel X and its child node Rel Y
>  
>  Both nodes are in Optimized state, and there is a Logical rule for Rel Y
> >>> in
>  a rules queue matched,
>  
>  After logical rule is applied, there is a logical rel Rel Y' and
> >>> 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-19 Thread Seliverstov Igor
Haisheng,

From my point of view, splitting the logical and physical planning parts
isn’t a good idea.

I think so because the actual cost may be calculated correctly using
physical operators only. As a result, we:
 a) won't be able to implement Branch and Bound Space Pruning (as far as I
understand, at exploration time there are no physical operators and no
correct costs, only assumptions, and I don’t think we should rely on them)
 b) won’t be able to get a good enough plan (without Branch and Bound Space
Pruning it’s a non-trivial task to get the right search direction and the
most promising plans in the first planning iterations)
 c) won’t be able to tune the good enough plan while several similar queries
are executed

Regards,
Igor


> On Apr 19, 2020, at 17:37, Haisheng Yuan  wrote:
> 
> Hi Igor,
> 
> You can't have your cake and eat it.
> But one feasible work item we can definitely do is: once we hit the
> timeout, stop exploring, use the first available physical operator in each
> group, and optimize it.
> 
> Because most, if not all, of the long / infinite running optimizations are
> caused by project-related transpose rules, join reordering (join commute +
> associative), and constant reduction for large expressions (see
> CALCITE-3460), all of which are logical transformation rules and many of
> which have corresponding JIRAs. So another alternative is: let's fix these
> bugs to improve Calcite and make the timeout option less necessary.
> 
> Another thing worth noting is that SQL Server's optimizer timeout mechanism
> is based not on clock time but on the number of optimization tasks it has
> completed [1].
> 
> [1] 
> https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188
> 
> Regards,
> Haisheng 
> 
> On 2020/04/19 11:31:27, Seliverstov Igor  wrote: 
>> Haisheng,
>> 
>> Ok, then such a notification isn't needed.
>> 
>> But in this case we don't have any control over how long planning takes.
>> 
>> For some systems it's necessary to get a good enough plan right now
>> instead of the best one after a while.
>> 
>> For example, we've been considering a case where a query is optimised
>> several times in short iterations when it's impossible to get the best
>> plan in a reasonable period of time (for example, there is an SLA on
>> response time).
>> 
>> This means we need all the required physical implementations after each
>> logical transformation is applied.
>> 
>> Regards,
>> Igor
>> 
>> 
>> On Sun, Apr 19, 2020 at 13:55, Haisheng Yuan  wrote:
>> 
>>> Hi Igor,
>>> 
>>> There will be no rule queue anymore. Y will be fully explored (logical
>>> rules are matched and applied) before it can be implemented and optimized.
>>> 
>>> Thanks,
>>> Haisheng
>>> 
>>> On 2020/04/19 10:11:45, Seliverstov Igor  wrote:
 Hi Haisheng,
 
 Great explanation, thanks.
 
 One thing I'd like to cover in advance is trait propagation process (I
>>> need
 it for Apache Ignite SQL engine implementation).
 
 For example:
 
 There are two nodes: Rel X and its child node Rel Y
 
 Both nodes are in Optimized state, and there is a Logical rule for Rel Y
>>> in
 a rules queue matched,
 
 After logical rule is applied, there is a logical rel Rel Y' and
>>> containing
 it set moves into Exploring state (since there is a logical node which
>>> has
 to be implemented)
 
 After whole process Exploring -> ... -> Optimized we need to force parent
 Rel X set to switch its state from Optimized to Optimizing to peek the
 newly implemented child and (possible) produce a new Rel X' physical rel
>>> on
 the basis of previously requested from the Rel X set traits.
 
 If a new Rel X' is produced, a Rel X' parent should move its state
 Optimized -> Optimizing and repeat described above operations.
 
 Does it look like true?
 
 Regards,
 Igor
 
 
 On Sun, Apr 19, 2020 at 6:52, Haisheng Yuan  wrote:
 
> Hi,
> 
> In the past few months, we have discussed a lot about Cascades style
> top-down optimization, including on-demand trait derivation/request,
>>> rule
> apply, branch and bound space pruning. Now we think it is time to move
> towards these targets.
> 
> We will separate it into several small issues, and each one can be
> integrated as a standalone, independent feature, and most importantly,
> meanwhile keep backward compatibility.
> 
> 1. Top-down trait request
> In other words, pass traits requirements from parent nodes to child
>>> nodes.
> The trait requests happens after all the logical transformation rules
>>> and
> physical implementation rules are done, in a top-down manner, driven
>>> from
> root set. e.g.:
> SELECT a, sum(c) FROM
>(SELECT * FROM R JOIN S USING (a, b)) t
> GROUP BY a;
> 
> Suppose we have the following plan tree in the MEMO, and let's only
> consider distribution for simplicity, each group represents a 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-19 Thread Haisheng Yuan
Hi Igor,

You can't have your cake and eat it.
But one feasible work item we can definitely do is: once we hit the timeout,
stop exploring, use the first available physical operator in each group, and
optimize it.

Because most, if not all, of the long / infinite running optimizations are
caused by project-related transpose rules, join reordering (join commute +
associative), and constant reduction for large expressions (see CALCITE-3460),
all of which are logical transformation rules and many of which have
corresponding JIRAs. So another alternative is: let's fix these bugs to
improve Calcite and make the timeout option less necessary.

Another thing worth noting is that SQL Server's optimizer timeout mechanism is
based not on clock time but on the number of optimization tasks it has
completed [1].

[1] 
https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188
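For comparison, a sketch of a task-count-based budget of the kind described
in [1]; every name here is illustrative:

    // A "timeout" measured in optimization tasks rather than wall-clock
    // time, so the same query stops at the same point on any machine.
    final class TaskBudget {
      private final long maxTasks;
      private long tasksDone;

      TaskBudget(long maxTasks) {
        this.maxTasks = maxTasks;
      }

      // Call once per optimization task (rule application, group
      // exploration, ...); false means the budget is exhausted.
      boolean tick() {
        return ++tasksDone <= maxTasks;
      }
    }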

Regards,
Haisheng 

On 2020/04/19 11:31:27, Seliverstov Igor  wrote: 
> Haisheng,
> 
> Ok, then such a notification isn't needed.
> 
> But in this case we don't have any control over how long planning takes.
> 
> For some systems it's necessary to get a good enough plan right now
> instead of the best one after a while.
> 
> For example, we've been considering a case where a query is optimised
> several times in short iterations when it's impossible to get the best
> plan in a reasonable period of time (for example, there is an SLA on
> response time).
> 
> This means we need all the required physical implementations after each
> logical transformation is applied.
> 
> Regards,
> Igor
> 
> 
> On Sun, Apr 19, 2020 at 13:55, Haisheng Yuan  wrote:
> 
> > Hi Igor,
> >
> > There will be no rule queue anymore. Y will be fully explored (logical
> > rules are matched and applied) before it can be implemented and optimized.
> >
> > Thanks,
> > Haisheng
> >
> > On 2020/04/19 10:11:45, Seliverstov Igor  wrote:
> > > Hi Haisheng,
> > >
> > > Great explanation, thanks.
> > >
> > > One thing I'd like to cover in advance is trait propagation process (I
> > need
> > > it for Apache Ignite SQL engine implementation).
> > >
> > > For example:
> > >
> > > There are two nodes: Rel X and its child node Rel Y
> > >
> > > Both nodes are in Optimized state, and there is a Logical rule for Rel Y
> > in
> > > a rules queue matched,
> > >
> > > After logical rule is applied, there is a logical rel Rel Y' and
> > containing
> > > it set moves into Exploring state (since there is a logical node which
> > has
> > > to be implemented)
> > >
> > > After whole process Exploring -> ... -> Optimized we need to force parent
> > > Rel X set to switch its state from Optimized to Optimizing to peek the
> > > newly implemented child and (possible) produce a new Rel X' physical rel
> > on
> > > the basis of previously requested from the Rel X set traits.
> > >
> > > If a new Rel X' is produced, a Rel X' parent should move its state
> > > Optimized -> Optimizing and repeat described above operations.
> > >
> > > Does it look like true?
> > >
> > > Regards,
> > > Igor
> > >
> > >
 On Sun, Apr 19, 2020 at 6:52, Haisheng Yuan  wrote:
> > >
> > > > Hi,
> > > >
> > > > In the past few months, we have discussed a lot about Cascades style
> > > > top-down optimization, including on-demand trait derivation/request,
> > rule
> > > > apply, branch and bound space pruning. Now we think it is time to move
> > > > towards these targets.
> > > >
> > > > We will separate it into several small issues, and each one can be
> > > > integrated as a standalone, independent feature, and most importantly,
> > > > meanwhile keep backward compatibility.
> > > >
> > > > 1. Top-down trait request
> > > > In other words, pass traits requirements from parent nodes to child
> > nodes.
> > > > The trait requests happens after all the logical transformation rules
> > and
> > > > physical implementation rules are done, in a top-down manner, driven
> > from
> > > > root set. e.g.:
> > > > SELECT a, sum(c) FROM
> > > > (SELECT * FROM R JOIN S USING (a, b)) t
> > > > GROUP BY a;
> > > >
> > > > Suppose we have the following plan tree in the MEMO, and let's only
> > > > consider distribution for simplicity, each group represents a RelSet
> > in the
> > > > MEMO.
> > > >
> > > >Group 1: Agg on [a]
> > > >Group 2:   +-- MergeJoin on [a, b]
> > > >Group 3:   |--- TableScan R
> > > >Group 4:   +--- TableScan S
> > > >| Group No | Operator| Derived Traits | Required Traits |
> > > >| --- | - | --- | --- |
> > > >| Group 1  | Aggregate| Hash[a] | N/A|
> > > >| Group 2  | MergeJoin| Hash[a,b]  | Hash[a]  |
> > > >| Group 3  | TableScan R | None| Hash[a,b]   |
> > > >| Group 4  | TableScan S | None| Hash[a,b]   |
> > > >
> > > > We will add new interface PhysicalNode (or extending RelNode) with
> > > > 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-19 Thread Seliverstov Igor
Haisheng,

Ok, then such a notification isn't needed.

But in this case we don't have any control over how long planning takes.

For some systems it's necessary to get a good enough plan right now instead
of the best one after a while.

For example, we've been considering a case where a query is optimised several
times in short iterations when it's impossible to get the best plan in a
reasonable period of time (for example, there is an SLA on response time).

This means we need all the required physical implementations after each
logical transformation is applied.

Regards,
Igor


On Sun, Apr 19, 2020 at 13:55, Haisheng Yuan  wrote:

> Hi Igor,
>
> There will be no rule queue anymore. Y will be fully explored (logical
> rules are matched and applied) before it can be implemented and optimized.
>
> Thanks,
> Haisheng
>
> On 2020/04/19 10:11:45, Seliverstov Igor  wrote:
> > Hi Haisheng,
> >
> > Great explanation, thanks.
> >
> > One thing I'd like to cover in advance is trait propagation process (I
> need
> > it for Apache Ignite SQL engine implementation).
> >
> > For example:
> >
> > There are two nodes: Rel X and its child node Rel Y
> >
> > Both nodes are in Optimized state, and there is a Logical rule for Rel Y
> in
> > a rules queue matched,
> >
> > After logical rule is applied, there is a logical rel Rel Y' and
> containing
> > it set moves into Exploring state (since there is a logical node which
> has
> > to be implemented)
> >
> > After whole process Exploring -> ... -> Optimized we need to force parent
> > Rel X set to switch its state from Optimized to Optimizing to peek the
> > newly implemented child and (possible) produce a new Rel X' physical rel
> on
> > the basis of previously requested from the Rel X set traits.
> >
> > If a new Rel X' is produced, a Rel X' parent should move its state
> > Optimized -> Optimizing and repeat described above operations.
> >
> > Does it look like true?
> >
> > Regards,
> > Igor
> >
> >
> > On Sun, Apr 19, 2020 at 6:52, Haisheng Yuan  wrote:
> >
> > > Hi,
> > >
> > > In the past few months, we have discussed a lot about Cascades style
> > > top-down optimization, including on-demand trait derivation/request,
> rule
> > > apply, branch and bound space pruning. Now we think it is time to move
> > > towards these targets.
> > >
> > > We will separate it into several small issues, and each one can be
> > > integrated as a standalone, independent feature, and most importantly,
> > > meanwhile keep backward compatibility.
> > >
> > > 1. Top-down trait request
> > > In other words, pass traits requirements from parent nodes to child
> nodes.
> > > The trait requests happens after all the logical transformation rules
> and
> > > physical implementation rules are done, in a top-down manner, driven
> from
> > > root set. e.g.:
> > > SELECT a, sum(c) FROM
> > > (SELECT * FROM R JOIN S USING (a, b)) t
> > > GROUP BY a;
> > >
> > > Suppose we have the following plan tree in the MEMO, and let's only
> > > consider distribution for simplicity, each group represents a RelSet
> in the
> > > MEMO.
> > >
> > >Group 1: Agg on [a]
> > >Group 2:   +-- MergeJoin on [a, b]
> > >Group 3:   |--- TableScan R
> > >Group 4:   +--- TableScan S
> > >| Group No | Operator| Derived Traits | Required Traits |
> > >| --- | - | --- | --- |
> > >| Group 1  | Aggregate| Hash[a] | N/A|
> > >| Group 2  | MergeJoin| Hash[a,b]  | Hash[a]  |
> > >| Group 3  | TableScan R | None| Hash[a,b]   |
> > >| Group 4  | TableScan S | None| Hash[a,b]   |
> > >
> > > We will add a new interface PhysicalNode (or extend RelNode) with these
> > > methods:
> > >
> > > - Pair<RelTraitSet, List<RelTraitSet>> requireTraits(RelTraitSet required);
> > >   pair.left is the current operator's new traitset; pair.right is the
> > > list of the children's required traitsets.
> > >
> > > - RelNode passThrough(RelTraitSet required);
> > >   The default implementation will call the above requireTraits() and
> > > RelNode.copy() to create a new RelNode. It can be overridden by
> > > physical operators to customize the logic.
> > >
> > > The planner will call the above method on the MergeJoin operator to
> > > pass the required traits (Hash[a]) down to MergeJoin's child operators.
> > >
> > > We will get a new MergeJoin:
> > >  MergeJoin hash[a]
> > >   | TableScan R hash[a] (RelSubset)
> > >   + TableScan S hash[a] (RelSubset)
> > >
> > > Now the MEMO group looks like:
> > >
> > >    | Group No | Operator    | Derived Traits     | Required Traits    |
> > >    | -------- | ----------- | ------------------ | ------------------ |
> > >    | Group 1  | Aggregate   | Hash[a]            | N/A                |
> > >    | Group 2  | MergeJoin   | Hash[a,b], Hash[a] | Hash[a]            |
> > >    | Group 3  | TableScan R | None               | Hash[a,b], Hash[a] |
> 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-19 Thread Haisheng Yuan
Hi Igor,

There will be no rule queue anymore. Y will be fully explored (logical rules 
are matched and applied) before it can be implemented and optimized. 

Thanks,
Haisheng

On 2020/04/19 10:11:45, Seliverstov Igor  wrote: 
> Hi Haisheng,
> 
> Great explanation, thanks.
> 
> One thing I'd like to cover in advance is the trait propagation process (I
> need it for the Apache Ignite SQL engine implementation).
> 
> For example:
> 
> There are two nodes: Rel X and its child node Rel Y.
> 
> Both nodes are in the Optimized state, and a logical rule matching Rel Y is
> in the rule queue.
> 
> After the logical rule is applied, there is a logical rel Rel Y', and the
> set containing it moves into the Exploring state (since there is a logical
> node which has to be implemented).
> 
> After the whole Exploring -> ... -> Optimized process, we need to force the
> parent Rel X set to switch its state from Optimized to Optimizing, to pick
> up the newly implemented child and (possibly) produce a new physical rel
> Rel X' on the basis of the traits previously requested from the Rel X set.
> 
> If a new Rel X' is produced, the parent of Rel X' should move its state
> from Optimized to Optimizing and repeat the operations described above.
> 
> Does that sound right?
> 
> Regards,
> Igor
> 
> 
> On Sun, Apr 19, 2020 at 6:52, Haisheng Yuan  wrote:
> 
> > Hi,
> >
> > In the past few months, we have discussed a lot about Cascades style
> > top-down optimization, including on-demand trait derivation/request, rule
> > apply, branch and bound space pruning. Now we think it is time to move
> > towards these targets.
> >
> > We will separate it into several small issues, and each one can be
> > integrated as a standalone, independent feature, and most importantly,
> > meanwhile keep backward compatibility.
> >
> > 1. Top-down trait request
> > In other words, pass traits requirements from parent nodes to child nodes.
> > The trait requests happens after all the logical transformation rules and
> > physical implementation rules are done, in a top-down manner, driven from
> > root set. e.g.:
> > SELECT a, sum(c) FROM
> > (SELECT * FROM R JOIN S USING (a, b)) t
> > GROUP BY a;
> >
> > Suppose we have the following plan tree in the MEMO, and let's only
> > consider distribution for simplicity, each group represents a RelSet in the
> > MEMO.
> >
> >Group 1: Agg on [a]
> >Group 2:   +-- MergeJoin on [a, b]
> >Group 3:   |--- TableScan R
> >Group 4:   +--- TableScan S
> >| Group No | Operator| Derived Traits | Required Traits |
> >| --- | - | --- | --- |
> >| Group 1  | Aggregate| Hash[a] | N/A|
> >| Group 2  | MergeJoin| Hash[a,b]  | Hash[a]  |
> >| Group 3  | TableScan R | None| Hash[a,b]   |
> >| Group 4  | TableScan S | None| Hash[a,b]   |
> >
> > We will add a new interface PhysicalNode (or extend RelNode) with these
> > methods:
> >
> > - Pair<RelTraitSet, List<RelTraitSet>> requireTraits(RelTraitSet required);
> >   pair.left is the current operator's new traitset; pair.right is the
> > list of the children's required traitsets.
> >
> > - RelNode passThrough(RelTraitSet required);
> >   The default implementation will call the above requireTraits() and
> > RelNode.copy() to create a new RelNode. It can be overridden by
> > physical operators to customize the logic.
> >
> > The planner will call the above method on the MergeJoin operator to pass
> > the required traits (Hash[a]) down to MergeJoin's child operators.
> >
> > We will get a new MergeJoin:
> >  MergeJoin hash[a]
> >   | TableScan R hash[a] (RelSubset)
> >   + TableScan S hash[a] (RelSubset)
> >
> > Now the MEMO group looks like:
> >
> >    | Group No | Operator    | Derived Traits     | Required Traits    |
> >    | -------- | ----------- | ------------------ | ------------------ |
> >    | Group 1  | Aggregate   | Hash[a]            | N/A                |
> >    | Group 2  | MergeJoin   | Hash[a,b], Hash[a] | Hash[a]            |
> >    | Group 3  | TableScan R | None               | Hash[a,b], Hash[a] |
> >    | Group 4  | TableScan S | None               | Hash[a,b], Hash[a] |
> >
> > Calcite users may choose to ignore / not implement the interface to keep
> > the original behavior. Each physical operator, according to its own
> > logic, decides whether passThrough() should pass traits down or not by
> > returning:
> > - a non-null RelNode, which means it can pass down
> > - a null object, which means it can't pass down
> >
> > 2. Provide an option to disable AbstractConverter
> > Once the plan can request traits in a top-down way in the framework, many
> > systems won't need AbstractConverter anymore, since it is just an
> > intermediate operator used to generate a physical sort / exchange. For
> > those, we can provide an option to disable AbstractConverter and generate
> > the physical enforcer directly, by adding a method to the interface 

Re: [DISCUSS] Towards Cascades Optimizer

2020-04-19 Thread Seliverstov Igor
Hi Haisheng,

Great explanation, thanks.

One thing I'd like to cover in advance is the trait propagation process (I
need it for the Apache Ignite SQL engine implementation).

For example:

There are two nodes: Rel X and its child node Rel Y.

Both nodes are in the Optimized state, and a logical rule matching Rel Y is
in the rule queue.

After the logical rule is applied, there is a logical rel Rel Y', and the set
containing it moves into the Exploring state (since there is a logical node
which has to be implemented).

After the whole Exploring -> ... -> Optimized process, we need to force the
parent Rel X set to switch its state from Optimized to Optimizing, to pick up
the newly implemented child and (possibly) produce a new physical rel Rel X'
on the basis of the traits previously requested from the Rel X set.

If a new Rel X' is produced, the parent of Rel X' should move its state from
Optimized to Optimizing and repeat the operations described above.

Does that sound right?

Regards,
Igor


On Sun, Apr 19, 2020 at 6:52, Haisheng Yuan  wrote:

> Hi,
>
> In the past few months, we have discussed a lot about Cascades-style
> top-down optimization, including on-demand trait derivation/request, rule
> application, and branch-and-bound space pruning. Now we think it is time to
> move towards these targets.
>
> We will separate it into several small issues, each of which can be
> integrated as a standalone, independent feature while, most importantly,
> keeping backward compatibility.
>
> 1. Top-down trait request
> In other words, pass trait requirements from parent nodes to child nodes.
> Trait requests happen after all the logical transformation rules and
> physical implementation rules are done, in a top-down manner, driven from
> the root set. e.g.:
> SELECT a, sum(c) FROM
> (SELECT * FROM R JOIN S USING (a, b)) t
> GROUP BY a;
>
> Suppose we have the following plan tree in the MEMO, and let's only
> consider distribution for simplicity; each group represents a RelSet in the
> MEMO.
>
>    Group 1: Agg on [a]
>    Group 2:   +-- MergeJoin on [a, b]
>    Group 3:   |--- TableScan R
>    Group 4:   +--- TableScan S
>
>    | Group No | Operator    | Derived Traits | Required Traits |
>    | -------- | ----------- | -------------- | --------------- |
>    | Group 1  | Aggregate   | Hash[a]        | N/A             |
>    | Group 2  | MergeJoin   | Hash[a,b]      | Hash[a]         |
>    | Group 3  | TableScan R | None           | Hash[a,b]       |
>    | Group 4  | TableScan S | None           | Hash[a,b]       |
>
> We will add a new interface PhysicalNode (or extend RelNode) with these
> methods:
>
> - Pair<RelTraitSet, List<RelTraitSet>> requireTraits(RelTraitSet required);
>   pair.left is the current operator's new traitset; pair.right is the list
> of the children's required traitsets.
>
> - RelNode passThrough(RelTraitSet required);
>   The default implementation will call the above requireTraits() and
> RelNode.copy() to create a new RelNode. It can be overridden by
> physical operators to customize the logic.
>
> The planner will call the above method on the MergeJoin operator to pass
> the required traits (Hash[a]) down to MergeJoin's child operators.
>
> We will get a new MergeJoin:
>  MergeJoin hash[a]
>   | TableScan R hash[a] (RelSubset)
>   + TableScan S hash[a] (RelSubset)
>
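For illustration, a sketch of how a physical merge join might implement the
proposed passThrough(); the contract is the one proposed above (not yet a
Calcite API), and joinKeys() is a made-up helper returning the join key
ordinals (RelDistribution, RelDistributionTraitDef, and RelOptRule are
Calcite classes):

    // Sketch: inside a hypothetical physical MergeJoin class.
    // A merge join on keys [a, b] can satisfy a required Hash[a]
    // distribution by requiring the same Hash[a] from both inputs;
    // otherwise it returns null, meaning "cannot pass this down".
    @Override public RelNode passThrough(RelTraitSet required) {
      RelDistribution dist =
          required.getTrait(RelDistributionTraitDef.INSTANCE);
      if (dist == null
          || dist.getType() != RelDistribution.Type.HASH_DISTRIBUTED
          || !joinKeys().containsAll(dist.getKeys())) {
        return null;
      }
      RelTraitSet selfTraits = getTraitSet().replace(dist);
      List<RelNode> newInputs = new ArrayList<>();
      for (RelNode input : getInputs()) {
        // Each child's RelSubset is asked for the same distribution.
        newInputs.add(
            RelOptRule.convert(input, input.getTraitSet().replace(dist)));
      }
      return copy(selfTraits, newInputs);
    }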
> Now the MEMO group looks like:
>
>    | Group No | Operator    | Derived Traits     | Required Traits    |
>    | -------- | ----------- | ------------------ | ------------------ |
>    | Group 1  | Aggregate   | Hash[a]            | N/A                |
>    | Group 2  | MergeJoin   | Hash[a,b], Hash[a] | Hash[a]            |
>    | Group 3  | TableScan R | None               | Hash[a,b], Hash[a] |
>    | Group 4  | TableScan S | None               | Hash[a,b], Hash[a] |
>
> Calcite users may choose to ignore / not implement the interface to keep
> the original behavior. Each physical operator, according to its own logic,
> decides whether passThrough() should pass traits down or not by returning:
> - a non-null RelNode, which means it can pass down
> - a null object, which means it can't pass down
>
> 2. Provide an option to disable AbstractConverter
> Once the plan can request traits in a top-down way in the framework, many
> systems won't need AbstractConverter anymore, since it is just an
> intermediate operator used to generate a physical sort / exchange. For
> those, we can provide an option to disable AbstractConverter and generate
> the physical enforcer directly, by adding a method to the interface
> Convention:
> - RelNode enforce(RelNode node, RelTraitSet traits);
>
> The default implementation may just call changeTraitsUsingConverters(),
> but people may choose to override it if the system has special needs, like
> several traits that must be implemented together, or the position of
> collation in the RelTraitSet coming before distribution.
>
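As a sketch of such an override, for an engine where collation must be
enforced before distribution; MySort and MyExchange are stand-ins for the
engine's own enforcer operators, and everything else is Calcite API:

    // Enforce collation first (Sort), then distribution (Exchange).
    @Override public RelNode enforce(RelNode node, RelTraitSet required) {
      RelNode result = node;
      RelCollation collation =
          required.getTrait(RelCollationTraitDef.INSTANCE);
      if (collation != null && !collation.getFieldCollations().isEmpty()) {
        result = MySort.create(result, collation);    // stand-in enforcer
      }
      RelDistribution dist =
          required.getTrait(RelDistributionTraitDef.INSTANCE);
      if (dist != null && dist.getType() != RelDistribution.Type.ANY) {
        result = MyExchange.create(result, dist);     // stand-in enforcer
      }
      return result;
    }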
> 3. Top-down driven, on-demand rule apply
> For every RelNode in a RelSet, rules are matched and applied sequentially,
>