Re: Re: [DISCUSS] Refactor how planner rules are parameterized

2020-06-14 Thread Haisheng Yuan
Druid adapter rules are not used by Druid. As far as I know, only Hive uses 
these rules, I even think they should be moved to Hive project. :) If any other 
projects other than Hive are using them, please let us know. Basically, these 
Druid rules are not generally applicable. The PR changed many built-in rules to 
specifically accommodate Druid adapter rules, especially with third operand 
that can match any operator, even EnumerableTableScan is legit.

Most converter rules are not designed to be flexible, no one would extend them. 
RelOptRule#operand() methods are deprecated, which implies someday, they will 
be removed and migrated to the new Config. But most down stream projects don't 
need to worry about the their rules' extensibility and compatibility, because 
they can modify their own rules freely, anytime.

At the end of the day, we may find that only a small fraction of transformation 
rules need refactoring, e.g. ProjectFilterTransposeRule. Note that even 
FilterProjectTransposeRule doesn't need refactoring, we can safely remove 
copyFilter and copyProject, it is safe to always copy them, do we really want 
to match physical operators and generate logical alternatives?

Last but the most important, rule operands, IMO, should be out of Config's 
reach. How frequent do we need to change rule operands or its matching class? 
IMO, operands of rule constructor should remain unchanged, RelFactory and Rule 
description can also remain the same, or adapted by Config silently without 
changing the rule itself, if there are no other additional parameters. Other 
than that, everything can be put into Config. Therefore I don't think 
RelOptNewRule is needed, because RelOptRule should be able to integrate with 
Config seamlessly, without breaking anything.

All in all, I prefer the KISS principle, keep it simple, stupid.

So my opinion is +0.5.

Thanks,
Haisheng

On 2020/06/14 23:36:21, Stamatis Zampetakis  wrote: 
> Hello,
> 
> Thanks for putting this together Julian.
> 
> I went quickly over the PR just to grasp better the idea and here are some
> first thoughts.
> 
> I really like the fact that rules creation is now much more uniform than
> before.
> I also like the fact that in some cases subclassing can be avoided thanks
> to the flexible configuration. As a fan of the Liskov
> substitution principle I find it positive to be able to avoid classes such
> as DruidProjectFilterTransposeRule although to be honest I don't know why
> these Druid rules impose those additional restrictions.
> 
> On the other hand, I also feel that the new approach is a bit harder to
> follow than the previous one. The fact that we have more extension points
> gives more flexibility but at the same time complicates the implementation
> a bit. I guess regular committers will not have much trouble to adapt to
> the changes but for newcomers there may be more questions. For instance:
> What do we do when we want to customize the behavior of an existing rule?
> * Use an existing config with different parameters.
> * Extend the config.
> * Extend the rule.
> * Extend the config and the rule.
> * Create a new rule (if yes under which interface RelOptRule or
> RelOptNewRule).
> 
> As Haisheng mentioned, the fact that every rule can be configured with any
> RelOptRuleOperand is also a point possibly deserving more discussion.
> Ideally, the API should be able to guide developers to pass the correct
> configuration parameters and not fail at runtime.
> 
> Overall, I have mixed feelings about the proposed refactoring (definitely
> not something blocking), I guess cause I haven't seen as many use-cases as
> Julian.
> I'm curious to see what others think about the changes. It's a pity to take
> a decision based on the feedback of only two/three people.
> 
> Best,
> Stamatis
> 
> 
> 
> 
> On Sun, Jun 14, 2020 at 9:36 PM Haisheng Yuan  wrote:
> 
> > Hi Julian,
> >
> > Thanks for working on this.
> >
> > We haven't reached a consensus yet.
> >
> > Frankly speaking, I agree with what Stamatis said earlier. Flexibility
> > doesn't come with no cost. Admittedly, with this patch, any rule
> > constructor refactoring won't need to deprecate old constructors and break
> > backward compatibility, however, it also makes the rule definition much
> > more verbose, less readable and understandable. IMHO, it does more harm
> > than good.
> >
> > Let's see how CassandraFilterRule becomes before and after the change.
> >
> > Before this change:
> >
> >   private static class CassandraFilterRule extends RelOptRule {
> > private static final CassandraFilterRule INSTANCE = new
> > CassandraFilterRule();
> >
> > private CassandraFilterRule() {
> >   super(operand(LogicalFilter.class, operand(CassandraTableScan.class,
> > none())),
> >   "CassandraFilterRule");
> > }
> >   }
> >
> > After this change:
> >
> >   private static class CassandraFilterRule
> >   extends RelOptNewRule {
> > private static final CassandraFilterRule 

Re: Why migrate from Maven to Gradle

2020-06-14 Thread Danny Chan
Thanks, are there some building functionalities that grade can do but maven can 
not ? Gradle is not that user friendly for new uses but I works well, it has 
better flexibility than maven with many custom plugins.

Best,
Danny Chan
在 2020年6月13日 +0800 AM1:31,丁小冬(比古) ,写道:
>
> Hi, Calcite developers
>
> I would like to build calcite with Maven, but now calcite only support gradle 
> build. I have three questions:
> 1. Is there any reason that Calcite had to migrate from maven to gradle?
> 2. If I still insist on using maven to build calcite, is there any risk?
> 3. If I want to migrate from gradle back to maven, is there any quick and 
> correct method to do this?
>
> thank you for your attention, this is really important to me, looking forward 
> to your reply
>
> Great thanks
> Winter Ding


Re: Re: [DISCUSS] Refactor how planner rules are parameterized

2020-06-14 Thread Stamatis Zampetakis
Hello,

Thanks for putting this together Julian.

I went quickly over the PR just to grasp better the idea and here are some
first thoughts.

I really like the fact that rules creation is now much more uniform than
before.
I also like the fact that in some cases subclassing can be avoided thanks
to the flexible configuration. As a fan of the Liskov
substitution principle I find it positive to be able to avoid classes such
as DruidProjectFilterTransposeRule although to be honest I don't know why
these Druid rules impose those additional restrictions.

On the other hand, I also feel that the new approach is a bit harder to
follow than the previous one. The fact that we have more extension points
gives more flexibility but at the same time complicates the implementation
a bit. I guess regular committers will not have much trouble to adapt to
the changes but for newcomers there may be more questions. For instance:
What do we do when we want to customize the behavior of an existing rule?
* Use an existing config with different parameters.
* Extend the config.
* Extend the rule.
* Extend the config and the rule.
* Create a new rule (if yes under which interface RelOptRule or
RelOptNewRule).

As Haisheng mentioned, the fact that every rule can be configured with any
RelOptRuleOperand is also a point possibly deserving more discussion.
Ideally, the API should be able to guide developers to pass the correct
configuration parameters and not fail at runtime.

Overall, I have mixed feelings about the proposed refactoring (definitely
not something blocking), I guess cause I haven't seen as many use-cases as
Julian.
I'm curious to see what others think about the changes. It's a pity to take
a decision based on the feedback of only two/three people.

Best,
Stamatis




On Sun, Jun 14, 2020 at 9:36 PM Haisheng Yuan  wrote:

> Hi Julian,
>
> Thanks for working on this.
>
> We haven't reached a consensus yet.
>
> Frankly speaking, I agree with what Stamatis said earlier. Flexibility
> doesn't come with no cost. Admittedly, with this patch, any rule
> constructor refactoring won't need to deprecate old constructors and break
> backward compatibility, however, it also makes the rule definition much
> more verbose, less readable and understandable. IMHO, it does more harm
> than good.
>
> Let's see how CassandraFilterRule becomes before and after the change.
>
> Before this change:
>
>   private static class CassandraFilterRule extends RelOptRule {
> private static final CassandraFilterRule INSTANCE = new
> CassandraFilterRule();
>
> private CassandraFilterRule() {
>   super(operand(LogicalFilter.class, operand(CassandraTableScan.class,
> none())),
>   "CassandraFilterRule");
> }
>   }
>
> After this change:
>
>   private static class CassandraFilterRule
>   extends RelOptNewRule {
> private static final CassandraFilterRule INSTANCE =
> Config.EMPTY
> .withOperandSupplier(b0 ->
> b0.operand(LogicalFilter.class)
> .oneInput(b1 -> b1.operand(CassandraTableScan.class)
> .noInputs()))
> .as(Config.class)
> .toRule();
>
> /** Creates a CassandraFilterRule. */
> protected CassandraFilterRule(Config config) {
>   super(config);
> }
>
> /** Rule configuration. */
> public interface Config extends RelOptNewRule.Config {
>   @Override default CassandraFilterRule toRule() {
> return new CassandraFilterRule(this);
>   }
> }
>   }
>
>
> Intuitively, if we want to check what does the rule generally match or how
> it is defined, we just check the constructor. Now we checkout the
> constructor, only config is there, go to Config, there is still nothing
> interesting, we have to go to the INSTANCE. What is the difference with
> just using operand and optionConfig as the rule constructor?
>
>protected CassandraFilterRule(RelOptRuleOperand operand, Config config)
> {
>  super(operand, config);
>}
>
> Or even simply replace Config with int, with each bit represent an option,
> if all of them are boolean options.
>
> Nothing is more flexible than just using RelOptRuleOperand as the
> parameter, just like the base class RelOptRule does. But do we want it?
>
> At the same time, with the new approach, it is now legit to create the
> following instance:
>
>   private static final CassandraFilterRule INSTANCE2 =
> Config.EMPTY
> .withOperandSupplier(b0 ->
> b0.operand(LogicalProject.class)  // Even the is intended
> to match Filter, but it compiles
> .oneInput(b1 -> b1.operand(CassandraTableScan.class)
> .noInputs()))
> .as(Config.class)
> .toRule();
>
> Before we continue to the discussion and code review, we need to go back
> to the original problem, is it a real problem that is facing us? Is there
> any real demand or just artificial demand? How about we 

Re: Re: [DISCUSS] Refactor how planner rules are parameterized

2020-06-14 Thread Haisheng Yuan
Hi Julian,

Thanks for working on this.

We haven't reached a consensus yet.

Frankly speaking, I agree with what Stamatis said earlier. Flexibility doesn't 
come with no cost. Admittedly, with this patch, any rule constructor 
refactoring won't need to deprecate old constructors and break backward 
compatibility, however, it also makes the rule definition much more verbose, 
less readable and understandable. IMHO, it does more harm than good.

Let's see how CassandraFilterRule becomes before and after the change.

Before this change:

  private static class CassandraFilterRule extends RelOptRule {
private static final CassandraFilterRule INSTANCE = new 
CassandraFilterRule();

private CassandraFilterRule() {
  super(operand(LogicalFilter.class, operand(CassandraTableScan.class, 
none())),
  "CassandraFilterRule");
}
  }

After this change:

  private static class CassandraFilterRule
  extends RelOptNewRule {
private static final CassandraFilterRule INSTANCE =
Config.EMPTY
.withOperandSupplier(b0 ->
b0.operand(LogicalFilter.class)
.oneInput(b1 -> b1.operand(CassandraTableScan.class)
.noInputs()))
.as(Config.class)
.toRule();

/** Creates a CassandraFilterRule. */
protected CassandraFilterRule(Config config) {
  super(config);
}

/** Rule configuration. */
public interface Config extends RelOptNewRule.Config {
  @Override default CassandraFilterRule toRule() {
return new CassandraFilterRule(this);
  }
}
  }


Intuitively, if we want to check what does the rule generally match or how it 
is defined, we just check the constructor. Now we checkout the constructor, 
only config is there, go to Config, there is still nothing interesting, we have 
to go to the INSTANCE. What is the difference with just using operand and 
optionConfig as the rule constructor?

   protected CassandraFilterRule(RelOptRuleOperand operand, Config config) {
 super(operand, config);
   }

Or even simply replace Config with int, with each bit represent an option, if 
all of them are boolean options.

Nothing is more flexible than just using RelOptRuleOperand as the parameter, 
just like the base class RelOptRule does. But do we want it?

At the same time, with the new approach, it is now legit to create the 
following instance:

  private static final CassandraFilterRule INSTANCE2 =
Config.EMPTY
.withOperandSupplier(b0 ->
b0.operand(LogicalProject.class)  // Even the is intended to 
match Filter, but it compiles
.oneInput(b1 -> b1.operand(CassandraTableScan.class)
.noInputs()))
.as(Config.class)
.toRule();

Before we continue to the discussion and code review, we need to go back to the 
original problem, is it a real problem that is facing us? Is there any real 
demand or just artificial demand? How about we conduct a Calcite user survey to 
see how Calcite devs and users think? Then let's see how to move forward.

[+1] Hell yeah, the new approach is awesome, let's go with it.
[+0] Meh, I am OK with current approach, I don't see any burden or problem with 
it.
[-1] Hell no, current approach is bad, the new one is even worse.


Thanks,
Haisheng


On 2020/06/12 23:51:56, Julian Hyde  wrote: 
> There is now a PR: https://github.com/apache/calcite/pull/2024. Can
> people please review?
> 
> Here's the TL;DR:
> 
> Previously, it was not easy to customize, re-use or extend planner
> rules. If you wished to customize a rule (i.e. create a new instance
> of a rule with different properties) you would have to call the rule's
> constructor. Frequently the required constructor did not exist, so we
> would have to add a new constructor and deprecate the old one.
> 
> After this change, you start off from an instance of the rule, modify
> its configuration, and call toRule() on the configuration. (Rule
> constructors are now private, because only the configuration ever
> calls them.)
> 
> A good illustration of this is DruidRules, which used to contain many
> sub-classes. Those sub-classes are no longer needed. Old code:
> 
>   public static final DruidSortProjectTransposeRule SORT_PROJECT_TRANSPOSE =
>   new DruidSortProjectTransposeRule(RelFactories.LOGICAL_BUILDER);
> 
> public static class DruidSortProjectTransposeRule
> extends SortProjectTransposeRule {
>   public DruidSortProjectTransposeRule(RelBuilderFactory
> relBuilderFactory) {
> super(
> operand(Sort.class,
> operand(Project.class, operand(DruidQuery.class, none(,
> relBuilderFactory, null);
>   }
> }
> 
> New code:
> 
>   public static final SortProjectTransposeRule SORT_PROJECT_TRANSPOSE =
>   SortProjectTransposeRule.INSTANCE.config
>   .withOperandFor(Sort.class, Project.class, DruidQuery.class)
>