Hello,

Thanks for putting this together Julian.

I went quickly over the PR just to grasp better the idea and here are some
first thoughts.

I really like the fact that rules creation is now much more uniform than
before.
I also like the fact that in some cases subclassing can be avoided thanks
to the flexible configuration. As a fan of the Liskov
substitution principle I find it positive to be able to avoid classes such
as DruidProjectFilterTransposeRule although to be honest I don't know why
these Druid rules impose those additional restrictions.

On the other hand, I also feel that the new approach is a bit harder to
follow than the previous one. The fact that we have more extension points
gives more flexibility but at the same time complicates the implementation
a bit. I guess regular committers will not have much trouble to adapt to
the changes but for newcomers there may be more questions. For instance:
What do we do when we want to customize the behavior of an existing rule?
* Use an existing config with different parameters.
* Extend the config.
* Extend the rule.
* Extend the config and the rule.
* Create a new rule (if yes under which interface RelOptRule or
RelOptNewRule).

As Haisheng mentioned, the fact that every rule can be configured with any
RelOptRuleOperand is also a point possibly deserving more discussion.
Ideally, the API should be able to guide developers to pass the correct
configuration parameters and not fail at runtime.

Overall, I have mixed feelings about the proposed refactoring (definitely
not something blocking), I guess cause I haven't seen as many use-cases as
Julian.
I'm curious to see what others think about the changes. It's a pity to take
a decision based on the feedback of only two/three people.

Best,
Stamatis




On Sun, Jun 14, 2020 at 9:36 PM Haisheng Yuan <hy...@apache.org> wrote:

> Hi Julian,
>
> Thanks for working on this.
>
> We haven't reached a consensus yet.
>
> Frankly speaking, I agree with what Stamatis said earlier. Flexibility
> doesn't come with no cost. Admittedly, with this patch, any rule
> constructor refactoring won't need to deprecate old constructors and break
> backward compatibility, however, it also makes the rule definition much
> more verbose, less readable and understandable. IMHO, it does more harm
> than good.
>
> Let's see how CassandraFilterRule becomes before and after the change.
>
> Before this change:
>
>   private static class CassandraFilterRule extends RelOptRule {
>     private static final CassandraFilterRule INSTANCE = new
> CassandraFilterRule();
>
>     private CassandraFilterRule() {
>       super(operand(LogicalFilter.class, operand(CassandraTableScan.class,
> none())),
>           "CassandraFilterRule");
>     }
>   }
>
> After this change:
>
>   private static class CassandraFilterRule
>       extends RelOptNewRule<CassandraFilterRule.Config> {
>     private static final CassandraFilterRule INSTANCE =
>         Config.EMPTY
>             .withOperandSupplier(b0 ->
>                 b0.operand(LogicalFilter.class)
>                     .oneInput(b1 -> b1.operand(CassandraTableScan.class)
>                         .noInputs()))
>             .as(Config.class)
>             .toRule();
>
>     /** Creates a CassandraFilterRule. */
>     protected CassandraFilterRule(Config config) {
>       super(config);
>     }
>
>     /** Rule configuration. */
>     public interface Config extends RelOptNewRule.Config {
>       @Override default CassandraFilterRule toRule() {
>         return new CassandraFilterRule(this);
>       }
>     }
>   }
>
>
> Intuitively, if we want to check what does the rule generally match or how
> it is defined, we just check the constructor. Now we checkout the
> constructor, only config is there, go to Config, there is still nothing
> interesting, we have to go to the INSTANCE. What is the difference with
> just using operand and optionConfig as the rule constructor?
>
>    protected CassandraFilterRule(RelOptRuleOperand operand, Config config)
> {
>      super(operand, config);
>    }
>
> Or even simply replace Config with int, with each bit represent an option,
> if all of them are boolean options.
>
> Nothing is more flexible than just using RelOptRuleOperand as the
> parameter, just like the base class RelOptRule does. But do we want it?
>
> At the same time, with the new approach, it is now legit to create the
> following instance:
>
>   private static final CassandraFilterRule INSTANCE2 =
>         Config.EMPTY
>             .withOperandSupplier(b0 ->
>                 b0.operand(LogicalProject.class)  // Even the is intended
> to match Filter, but it compiles
>                     .oneInput(b1 -> b1.operand(CassandraTableScan.class)
>                         .noInputs()))
>             .as(Config.class)
>             .toRule();
>
> Before we continue to the discussion and code review, we need to go back
> to the original problem, is it a real problem that is facing us? Is there
> any real demand or just artificial demand? How about we conduct a Calcite
> user survey to see how Calcite devs and users think? Then let's see how to
> move forward.
>
> [+1] Hell yeah, the new approach is awesome, let's go with it.
> [+0] Meh, I am OK with current approach, I don't see any burden or problem
> with it.
> [-1] Hell no, current approach is bad, the new one is even worse.
>
>
> Thanks,
> Haisheng
>
>
> On 2020/06/12 23:51:56, Julian Hyde <jh...@apache.org> wrote:
> > There is now a PR: https://github.com/apache/calcite/pull/2024. Can
> > people please review?
> >
> > Here's the TL;DR:
> >
> > Previously, it was not easy to customize, re-use or extend planner
> > rules. If you wished to customize a rule (i.e. create a new instance
> > of a rule with different properties) you would have to call the rule's
> > constructor. Frequently the required constructor did not exist, so we
> > would have to add a new constructor and deprecate the old one.
> >
> > After this change, you start off from an instance of the rule, modify
> > its configuration, and call toRule() on the configuration. (Rule
> > constructors are now private, because only the configuration ever
> > calls them.)
> >
> > A good illustration of this is DruidRules, which used to contain many
> > sub-classes. Those sub-classes are no longer needed. Old code:
> >
> >   public static final DruidSortProjectTransposeRule
> SORT_PROJECT_TRANSPOSE =
> >       new DruidSortProjectTransposeRule(RelFactories.LOGICAL_BUILDER);
> >
> >     public static class DruidSortProjectTransposeRule
> >         extends SortProjectTransposeRule {
> >       public DruidSortProjectTransposeRule(RelBuilderFactory
> > relBuilderFactory) {
> >         super(
> >             operand(Sort.class,
> >                 operand(Project.class, operand(DruidQuery.class,
> none()))),
> >             relBuilderFactory, null);
> >       }
> >     }
> >
> > New code:
> >
> >   public static final SortProjectTransposeRule SORT_PROJECT_TRANSPOSE =
> >       SortProjectTransposeRule.INSTANCE.config
> >           .withOperandFor(Sort.class, Project.class, DruidQuery.class)
> >           .toRule();
> >
> > The change maintains backwards compatibility to a large degree. In a
> > few places, I had to change rule instances from type RelOptRule to
> > Supplier<RelOptRule>, to avoid deadlocks during class loading. For
> > instance, instead of writing FilterJoinRule.FILTER_ON_JOIN you must
> > now write FilterJoinRule.FILTER_ON_JOIN.get().
> >
> > Julian
> >
> >
> > On Thu, Apr 16, 2020 at 12:08 PM Julian Hyde <jh...@apache.org> wrote:
> > >
> > > I have now pushed a dev branch with a prototype. Please see
> > > https://issues.apache.org/jira/browse/CALCITE-3923 for details.
> > >
> > > Having built the prototype, I believe that this change is beneficial
> > > and we should do it. But I would like to get to consensus on the
> > > design before we pull the trigger.
> > >
> > > Julian
> > >
> > > On Tue, Apr 14, 2020 at 2:06 PM Julian Hyde <jh...@apache.org> wrote:
> > > >
> > > > Haisheng,
> > > >
> > > > I hear you. I agree that major changes to rules will require new rule
> > > > classes (not merely sub-classes). People should copy-paste, refactor,
> > > > and all that good stuff. But I think there are a lot of cases where
> we
> > > > need to make minor changes to rules (there are many of these in the
> > > > code base already), and this change will help.
> > > >
> > > > I have logged https://issues.apache.org/jira/browse/CALCITE-3923 and
> > > > am going to start working on a prototype. When we have a prototype we
> > > > will be able to assess how big an impact the API change will have.
> > > > (E.g. whether it will be a breaking change.)
> > > >
> > > > Julian
> > > >
> > > > On Sat, Mar 14, 2020 at 8:22 PM Haisheng Yuan <
> h.y...@alibaba-inc.com> wrote:
> > > > >
> > > > > I don't think it is worth the refactoring. People who want to
> customize the rule, in most cases, won't be satisfied by a different
> parameter, they most likely still need to rewrite (copy & paste) the rule
> with some slightly their own logic. For many Calcite users, the rule is not
> reusable even with flexible configurations.
> > > > >
> > > > > - Haisheng
> > > > >
> > > > > ------------------------------------------------------------------
> > > > > 发件人:Stamatis Zampetakis<zabe...@gmail.com>
> > > > > 日 期:2020年03月14日 22:54:04
> > > > > 收件人:<dev@calcite.apache.org>
> > > > > 主 题:Re: [DISCUSS] Refactor how planner rules are parameterized
> > > > >
> > > > > Hello,
> > > > >
> > > > > Apologies for the late reply but I just realised that I had
> written the
> > > > > mail and never pressed the send button.
> > > > >
> > > > > I think it is a nice idea and certainly a problem worth
> addressing. If I
> > > > > understood well you're thinking something like the current
> constructor of
> > > > > the RelBuilder [1] that accepts a Context parameter. Indeed it
> seems that
> > > > > with this change even rules that are not designed to be configured
> can be
> > > > > changed much more gracefully (without adding new constructors and
> breaking
> > > > > changes).
> > > > >
> > > > > On the other hand, some of the advantages that you mention can
> also be
> > > > > turned into disadvantages. For instance, copying a rule without
> knowing the
> > > > > values of the other parameters is a bit risky and might be harder
> to reason
> > > > > about its correctness. Moreover, private constructors, final
> classes, etc.,
> > > > > are primarily used for encapsulation purposes so allowing the
> state of the
> > > > > rule escape somehow breaks the original design of the rule.
> > > > >
> > > > > Another problem with respect to rules is cross convention matching
> and
> > > > > transformations [2]. Many rules should not fire for operands that
> are in
> > > > > different conventions; a typical example that comes in my mind is
> > > > > FilterProjectTransposeRule [3]. In the same spirit most rules
> should not
> > > > > generate mixed convention transformations. Although a different
> problem, I
> > > > > am mentioning it here since it could affect the design of the new
> API.
> > > > >
> > > > > Best,
> > > > > Stamatis
> > > > >
> > > > > [1]
> > > > >
> https://github.com/apache/calcite/blob/f11115a2fe9e360f38910f112288581040e0ced5/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L155
> > > > >
> > > > > [2]
> > > > >
> https://lists.apache.org/thread.html/da1860f99f8bfd6ec7d26626c428ce1c55480e7c61ae7f83060a40c2%40%3Cdev.calcite.apache.org%3E
> > > > > [3]
> > > > >
> https://github.com/apache/calcite/blob/7c27b147414c64505fa33c947100ece094caa15c/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L57
> > > > >
> > > > > On Thu, Feb 20, 2020 at 9:20 PM Michael Mior <mm...@apache.org>
> wrote:
> > > > >
> > > > > > This sounds reasonable to me. It also sounds like we could make
> this
> > > > > > backwards compatible by retaining (but deprecating) the existing
> > > > > > constructors and factory methods that will no longer be needed.
> > > > > > --
> > > > > > Michael Mior
> > > > > > mm...@apache.org
> > > > > >
> > > > > > Le jeu. 20 févr. 2020 à 13:11, Julian Hyde <jh...@apache.org> a
> écrit :
> > > > > > >
> > > > > > > I have an idea for a refactoring to RelOptRule. I haven’t
> fully thought
> > > > > > it through, but I’m going to sketch it out here to see whether
> folks agree
> > > > > > about the problems/solutions.
> > > > > > >
> > > > > > > It will be a breaking change (in the sense that people will
> have to
> > > > > > change their code in order to get it to compile) but relatively
> safe (in
> > > > > > that once the code compiles, it will have the same behavior as
> before).
> > > > > > Also it will give Calcite developers and users a lot more
> flexibility going
> > > > > > forward.
> > > > > > >
> > > > > > > The problem I see is that people often want different variants
> of
> > > > > > planner rules. An example is FilterJoinRule, which has a
> 'boolean smart’
> > > > > > parameter, a predicate (which returns whether to pull up filter
> > > > > > conditions), operands (which determine the precise sub-classes
> of RelNode
> > > > > > that the rule should match) and a relBuilderFactory (which
> controls the
> > > > > > type of RelNode created by this rule).
> > > > > > >
> > > > > > > Suppose you have an instance of FilterJoinRule and you want to
> change
> > > > > > ‘smart’ from true to false. The ‘smart’ parameter is immutable
> (good!) but
> > > > > > you can’t easily create a clone of the rule because you don’t
> know the
> > > > > > values of the other parameters. Your instance might even be
> (unbeknownst to
> > > > > > you) a sub-class with extra parameters and a private constructor.
> > > > > > >
> > > > > > > So, my proposal is to put all of the config information of a
> RelOptRule
> > > > > > into a single ‘config’ parameter that contains all relevant
> properties.
> > > > > > Each sub-class of RelOptRule would have one constructor with
> just a
> > > > > > ‘config’ parameter. Each config knows which sub-class of
> RelOptRule to
> > > > > > create. Therefore it is easy to copy a config, change one or more
> > > > > > properties, and create a new rule instance.
> > > > > > >
> > > > > > > Adding a property to a rule’s config does not require us to
> add or
> > > > > > deprecate any constructors.
> > > > > > >
> > > > > > > The operands are part of the config, so if you have a rule
> that matches
> > > > > > a EnumerableFilter on an EnumerableJoin and you want to make it
> match an
> > > > > > EnumerableFilter on an EnumerableNestedLoopJoin, you can easily
> create one
> > > > > > with one changed operand.
> > > > > > >
> > > > > > > The config is immutable and self-describing, so we can use it
> to
> > > > > > automatically generate a unique description for each rule
> instance.
> > > > > > >
> > > > > > > Julian
> > > > > > >
> > > > > > > [1]
> > > > > >
> https://github.com/apache/calcite/blob/5fa41609cb0fe310a0a11d86319d861423850a36/core/src/main/java/org/apache/calcite/rel/rules/FilterJoinRule.java#L93
> > > > > > <
> > > > > >
> https://github.com/apache/calcite/blob/5fa41609cb0fe310a0a11d86319d861423850a36/core/src/main/java/org/apache/calcite/rel/rules/FilterJoinRule.java#L93
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> >
>

Reply via email to