Hi,

Thanks, everybody, for your comments. I want to clarify that my concern is
not the ideas behind the changes (which are completely valid and valuable)
but that users are _forced_ to use the new approach even for simple
patterns. Walking through the peculiarities of RelRule.Config and
Immutables look pretty complicated to define a simple pattern like "A on
top of B".

The previous infrastructure with manually defined operand requirements is
still available via the RelOptRule, but some of the relevant methods are
deprecated, and the operand builder is hidden from the direct usage by the
Java visibility rules. What do you think if we refractor OperandBuilderImpl
and related classes to make them publicly available without the Config
object? This could allow for straightforward Rule construction without
RelRule.Config and static code-generation that could be used for the cases
when complex configuration is not needed, which is a common pattern for
physical rules in the downstream projects?

Regards,
Vladimir.

чт, 14 апр. 2022 г. в 16:06, Julian Hyde <jh...@apache.org>:

> Yanjing,
>
> That is a valid point. The API could perhaps allow people to define rules
> with nodes that are optional. Can you please log a JIRA for this? Include
> the code that you would like to write in order to define a rule and handle
> a match.
>
> Julian
>
> On Thu, Apr 14, 2022 at 2:27 AM Yanjing Wang <zhuangzixiao...@gmail.com>
> wrote:
>
> > Hi, all,
> >
> > I think the rule configuration is not convenient for optional operand,
> such
> > as a match operand tree
> >               Project
> >                   |
> >               Union
> >               /        \
> >             Filter    Filter
> > If I want to compose a rule that the Project operand is optional, I must
> > create two configuration and new two rule instance, if we can make the
> > operand optional, I think I just need one rule and configuration.
> >
> >
> > Viliam Durina <vil...@hazelcast.com.invalid> 于2022年4月14日周四 14:55写道:
> >
> > > I can say for myself that though the migration was a nuisance, after we
> > > figured out what to do, it was straightforward and the current approach
> > > with Immutables is more readable and easier to write than the previous
> > with
> > > `operand` and `operandJ`.
> > >
> > > Viliam
> > >
> > > On Wed, 13 Apr 2022 at 19:23, Julian Hyde <jhyde.apa...@gmail.com>
> > wrote:
> > >
> > > > 1. I agree with Jacques. Thanks, Vladimir, for asking the question
> ‘is
> > > > progress really progress?’. We may not back out these changes
> (probably
> > > > won’t) but it will inform future discussions. It’s valid to say ‘when
> > > did X
> > > > as part of the rule refactoring, that didn’t deliver on promises’.
> > > >
> > > > 2. Thanks, Thomas, for sending the link to the original discussion.
> It
> > is
> > > > still my position that many more people will use existing rules than
> > > write
> > > > their own.
> > > >
> > > > 3. We have effectively created a DSL [1] (especially for operand
> > patterns
> > > > [2]) embedded in Java. Java 8 is not a great language for writing
> DSLs:
> > > > Kotlin, Scala, Lisp and even later versions of Java are superior.
> But a
> > > > design option we should consider is a DSL with its own syntax and
> > parser.
> > > > CockroachDB’s OptGen [3] is an example of this, and there is much to
> > > like.
> > > >
> > > > Compared to DSLs embedded in other languages, languages with their
> own
> > > > parser are more intuitive for users, and give better error messages.
> It
> > > is
> > > > also possible to support multiple versions simultaneously.
> > > >
> > > > Julian
> > > >
> > > > [1] https://en.wikipedia.org/wiki/Domain-specific_language <
> > > > https://en.wikipedia.org/wiki/Domain-specific_language>
> > > >
> > > > [2] https://issues.apache.org/jira/browse/CALCITE-3923 <
> > > > https://issues.apache.org/jira/browse/CALCITE-3923>
> > > >
> > > > [3]
> > > >
> > >
> >
> https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optgen/lang/doc.go
> > > > <
> > > >
> > >
> >
> https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optgen/lang/doc.go
> > > >
> > > >
> > > >
> > > >
> > > > > On Apr 13, 2022, at 9:56 AM, Jacques Nadeau <jacq...@apache.org>
> > > wrote:
> > > > >
> > > > > Vladimir,
> > > > >
> > > > > It's always good to ask these kinds of questions. e.g. "Is progress
> > > > really
> > > > > progress?"
> > > > >
> > > > > I think moving to a configuration object was a substantial
> > improvement.
> > > > > Until Java has default parameters and named parameters, I don't
> know
> > of
> > > > > another good way to make rules sufficiently configurable for a wide
> > > range
> > > > > of users.
> > > > >
> > > > > To me, one of the main benefits of the new configuration objects
> is a
> > > > much
> > > > > less frequent need to subclass a rule. Back when we built original
> > > Drill
> > > > > and Dremio code, we were constantly pained by the need to control
> > more
> > > > > pieces of a rule than were exposed by public constructors. When
> > trying
> > > to
> > > > > introduce those options, we had to both: (1) update calcite and
> wait
> > > for
> > > > a
> > > > > release cycle and (2) create another constructor overload,
> sometimes
> > > > > leading to a large number of overloads. The config object almost
> > > entirely
> > > > > eliminated this challenge.
> > > > >
> > > > > I do agree that having config objects is more complex for the
> casual
> > > user
> > > > > so in some ways we're adding advanced functionality at the
> potential
> > > cost
> > > > > of usability for new users. That's not something great. I'm not
> sure
> > > it's
> > > > > really bad but it is something to be aware of. Having default
> configs
> > > for
> > > > > rules seems to make this a rather small additional challenge.
> > > > >
> > > > > For your comments on downstream projects and the examples you
> give: I
> > > > think
> > > > > that sounds more like advanced usage. My experience is somewhat
> > > different
> > > > > from yours: these changes have been net positive on the stuff I've
> > been
> > > > > working on.
> > > > >
> > > > > Two last thoughts wrt downstream projects:
> > > > >
> > > > >   - I think it is important to separate how to create config
> objects
> > > from
> > > > >   the way that Calcite does it. We did change from proxies to
> > > immutables
> > > > >   within Caclite but neither has ever been the required way to
> > > implement
> > > > >   config objects. Config objects are always defined as plain java
> > > > interfaces.
> > > > >   If a particular subproject wants to expose configuration in a
> > > > completely
> > > > >   different way, I don't know of any constraint the rules interface
> > > > forces.
> > > > >   - A nice thing that was done with the initial implementation is
> now
> > > of
> > > > >   the configuration concepts are exposed on the RelOptRule abstract
> > > > class. A
> > > > >   user must opt in to RelRule for them to have to worry about any
> of
> > > > these
> > > > >   concerns.
> > > > >
> > > > >
> > > > > In working with the config concepts and extensibility, I do think a
> > > > couple
> > > > > of things are relatively painful. I don't
> > > > >
> > > > >   1. The definition of configs as interfaces inside the rules
> causes
> > > some
> > > > >   issues with static initialization order. I think Julian even
> > > mentioned
> > > > some
> > > > >   of this in one of the original docs config docs.
> > > > >   2. The generics of RelRule specifically can be challenging when
> > > > extended
> > > > >   rules in some cases due to how generics and inheritance work.
> > > > >   3. The class definition of the config interfaces are not
> > > > >   self-referencing (one of my shelved reworks actually modified to
> > > > >   support this) meaning that there are a lot of ugly (my subjective
> > > > opinion)
> > > > >   uses of Config.as() in the code. This last one could be resolved
> by
> > > > >   exposing the concrete builders that Immutables generates but for
> at
> > > > least
> > > > >   the initial changes, I wanted to avoid exposing these as new
> public
> > > > >   interfaces (and thus all the immutables classes are marked
> package
> > > > private).
> > > > >   4. Merging construction and use feels like an anti-pattern (my
> > > > >   subjective opinion). The most common patterns I've seen treat a
> > > builder
> > > > >   separate from an immutable constructed object (and have something
> > > akin
> > > > to a
> > > > >   toBuilder() method to take an immutable config back to a
> builder).
> > In
> > > > the
> > > > >   Calcite config, these two concepts are merged. In some ways it
> > makes
> > > > things
> > > > >   simpler for trivial cases. However,  it is less formal and causes
> > > pain
> > > > when
> > > > >   you have required properties that have no defaults since there is
> > no
> > > > formal
> > > > >   model for "I'm done building, now check everything is complete".
> > This
> > > > means
> > > > >   in several places we have to put default values that are actually
> > > > invalid
> > > > >   rather than just rely on a builder's build validation step.
> > > > >
> > > > > One other note, I think if someone is working in Java 14+ (records)
> > or
> > > > > Kotlin, there are also several easy ways to produce config impls
> that
> > > are
> > > > > easy to use (without proxies and/or immutables).
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Apr 12, 2022 at 9:29 AM Thomas Rebele
> > > <treb...@tibco.com.invalid
> > > > >
> > > > > wrote:
> > > > >
> > > > >> Hello,
> > > > >>
> > > > >> The reasons for the planner rules configuration can be found here:
> > > > >> CALCITE-3923 <https://issues.apache.org/jira/browse/CALCITE-3923
> >.
> > > See
> > > > >> also
> > > > >> the email thread [DISCUSS] Refactor how planner rules are
> > > parameterized
> > > > >> <
> > > > >>
> > > >
> > >
> >
> https://lists.apache.org/thread.html/rfdf6f9b7821988bdd92b0377e3d293443a6376f4773c4c658c891cf9%40%3Cdev.calcite.apache.org%3E
> > > > >>>
> > > > >> .
> > > > >>
> > > > >> Cordialement / Best Regards,
> > > > >> *Thomas Rebele, PhD* | R&D Developer | Germany | www.tibco.com
> > > > >>
> > > > >>
> > > > >> On Tue, Apr 12, 2022 at 6:10 PM Gavin Ray <ray.gavi...@gmail.com>
> > > > wrote:
> > > > >>
> > > > >>> I don't have any weight behind my opinion or experience,
> > > > >>> but anything that lowers the barrier to entry to Calcite for
> > > newcomers
> > > > >> is a
> > > > >>> huge win in my mind.
> > > > >>>
> > > > >>> I assume the reason for the changes was because codegen improved
> > > > >>> performance?
> > > > >>>
> > > > >>> Could it make sense to allow both options, the
> easy/less-performant
> > > way
> > > > >> for
> > > > >>> people who want to experiment and learn the ropes,
> > > > >>> and the codegen path for productionizing the final rules you come
> > up
> > > > >> with?
> > > > >>>
> > > > >>> Or does this make matters worse, trying to support two API's
> > > > >>>
> > > > >>> On Tue, Apr 12, 2022 at 6:25 AM Vladimir Ozerov <
> > ppoze...@gmail.com>
> > > > >>> wrote:
> > > > >>>
> > > > >>>> Hi folks,
> > > > >>>>
> > > > >>>> Rules are an essential part of the Calcite-based query
> > optimizers. A
> > > > >>>> typical optimizer may require dozens of custom rules that are
> > > created
> > > > >> by
> > > > >>>> extending some Apache Calcite interfaces.
> > > > >>>>
> > > > >>>> During the last two years, there were two major revisions of how
> > > rules
> > > > >>> are
> > > > >>>> created:
> > > > >>>>
> > > > >>>>   1. In early 1.2x versions, the typical approach was to use
> > > > >>>>   RelOptRuleOperand with a set of helper methods in a
> builder-like
> > > > >>>>   pattern.
> > > > >>>>   2. Then, we switched to the runtime code generation.
> > > > >>>>   3. Finally, we switched to the compile-time code generation
> with
> > > the
> > > > >>>>   Immutables framework.
> > > > >>>>
> > > > >>>> Every such change requires the downstream projects to rewrite
> all
> > > > their
> > > > >>>> rules. Not only does this require time to understand the new
> > > approach,
> > > > >>> but
> > > > >>>> it may also compromise the correctness of the downstream
> optimizer
> > > > >>> because
> > > > >>>> the regression tracking in query optimizers is not trivial.
> > > > >>>>
> > > > >>>> I had the privilege to try all three approaches, and I cannot
> get
> > > rid
> > > > >> of
> > > > >>>> the feeling that every new approach is more complicated than the
> > > > >> previous
> > > > >>>> one. I understand that this is a highly subjective statement,
> but
> > > when
> > > > >> I
> > > > >>>> just started using Apache Calcite and knew very little about
> it, I
> > > was
> > > > >>> able
> > > > >>>> to write rule patterns by simply looking at the IDE JavaDoc
> > pop-ups
> > > > and
> > > > >>>> code completion. When the RuleConfig was introduced, every new
> > rule
> > > > >>> always
> > > > >>>> required me to look at some other rule as an example, yet it was
> > > > >> doable.
> > > > >>>> Now we also need to configure the project build system to write
> a
> > > > >> single
> > > > >>>> custom rule.
> > > > >>>>
> > > > >>>> At the same time, a significant fraction of the rules are pretty
> > > > >> simple.
> > > > >>>> E.g., "operator A on top of operator B". If some additional
> > > > >> configuration
> > > > >>>> is required, it could be added via plain rules fields, because
> at
> > > the
> > > > >> end
> > > > >>>> of the day the rule instance is not more than a plain Java
> object.
> > > > >>>>
> > > > >>>> A good example is the FilterProjectTransposeRule. What now takes
> > > tens
> > > > >> of
> > > > >>>> lines of code in the Config subclass [1] (that you hardly could
> > > write
> > > > >>>> without a reference example), and ~500 LOC in the generated code
> > > that
> > > > >> you
> > > > >>>> get through additional plugin configuration [2] in your build
> > > system,
> > > > >>> could
> > > > >>>> have been expressed in a dozen lines of code [3] in Apache
> Calcite
> > > > >>> 1.22.0.
> > > > >>>>
> > > > >>>> My question is - are we sure we are going in the right direction
> > in
> > > > >> terms
> > > > >>>> of complexity and the entry bar for the newcomers? Wouldn't it
> be
> > > > >> better
> > > > >>> to
> > > > >>>> follow the 80/20 rule, when simple rules could be easily created
> > > > >>>> programmatically with no external dependencies, while more
> > advanced
> > > > >>>> facilities like Immutables are used only for the complex rules?
> > > > >>>>
> > > > >>>> Regards,
> > > > >>>> Vladimir.
> > > > >>>>
> > > > >>>> [1]
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://github.com/apache/calcite/blob/calcite-1.30.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L208-L260
> > > > >>>> [2]
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://github.com/apache/calcite/blob/calcite-1.30.0/core/build.gradle.kts#L215-L224
> > > > >>>> [3]
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
> https://github.com/apache/calcite/blob/calcite-1.22.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L99-L110
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> > > --
> > > This message contains confidential information and is intended only for
> > > the
> > > individuals named. If you are not the named addressee you should not
> > > disseminate, distribute or copy this e-mail. Please notify the sender
> > > immediately by e-mail if you have received this e-mail by mistake and
> > > delete this e-mail from your system. E-mail transmission cannot be
> > > guaranteed to be secure or error-free as information could be
> > intercepted,
> > > corrupted, lost, destroyed, arrive late or incomplete, or contain
> > viruses.
> > > The sender therefore does not accept liability for any errors or
> > omissions
> > > in the contents of this message, which arise as a result of e-mail
> > > transmission. If verification is required, please request a hard-copy
> > > version. -Hazelcast
> > >
> >
>

Reply via email to