Hi folks,

Rules are an essential part of Calcite-based query optimizers. A typical
optimizer may require dozens of custom rules, each created by extending
Apache Calcite interfaces.

Over the last two years, the way rules are created has gone through two
major revisions, which gives three approaches in total:

   1. In early 1.2x versions, the typical approach was to use
   RelOptRuleOperand with a set of helper methods in a builder-like
   pattern.
   2. Then, we switched to runtime code generation.
   3. Finally, we switched to compile-time code generation with the
   Immutables framework.

Every such change requires downstream projects to rewrite all their rules.
Not only does this take time to understand the new approach, but it may
also compromise the correctness of the downstream optimizer, because
regression tracking in query optimizers is not trivial.

I had the privilege to try all three approaches, and I cannot get rid of
the feeling that every new approach is more complicated than the previous
one. I understand that this is a highly subjective statement, but when I
had just started using Apache Calcite and knew very little about it, I was
able to write rule patterns by simply looking at the IDE JavaDoc pop-ups
and code completion. Once RuleConfig was introduced, every new rule
required me to look at some other rule as an example, yet it was doable.
Now we also need to configure the project's build system just to write a
single custom rule.

At the same time, a significant fraction of the rules are pretty simple,
e.g., "operator A on top of operator B". If some additional configuration
is required, it could be added via plain rule fields, because at the end
of the day a rule instance is no more than a plain Java object.
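
As a rough illustration of that idea, here is a minimal sketch of an
"operator A on top of operator B" rule written as a plain Java object. It
deliberately does not use the Calcite API: Node, Pattern,
FilterOverProjectRule, and the copyFilter flag are all hypothetical names
invented for this example, and the matching logic is a simplification.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a relational operator; NOT a Calcite class.
class Node {
  final String kind;
  final List<Node> inputs;

  Node(String kind, Node... inputs) {
    this.kind = kind;
    this.inputs = Collections.unmodifiableList(Arrays.asList(inputs));
  }
}

// Hypothetical operand pattern: an operator kind plus patterns for its inputs.
// A pattern without child patterns accepts any inputs.
class Pattern {
  final String kind;
  final List<Pattern> children;

  private Pattern(String kind, Pattern... children) {
    this.kind = kind;
    this.children = Collections.unmodifiableList(Arrays.asList(children));
  }

  static Pattern of(String kind, Pattern... children) {
    return new Pattern(kind, children);
  }

  boolean matches(Node node) {
    if (!kind.equals(node.kind)) {
      return false;
    }
    if (children.isEmpty()) {
      return true; // no constraints on the inputs
    }
    if (children.size() != node.inputs.size()) {
      return false;
    }
    for (int i = 0; i < children.size(); i++) {
      if (!children.get(i).matches(node.inputs.get(i))) {
        return false;
      }
    }
    return true;
  }
}

// The rule itself: a pattern plus configuration kept in an ordinary field.
class FilterOverProjectRule {
  final Pattern pattern = Pattern.of("Filter", Pattern.of("Project"));
  final boolean copyFilter; // hypothetical configuration flag

  FilterOverProjectRule(boolean copyFilter) {
    this.copyFilter = copyFilter;
  }

  boolean matches(Node root) {
    return pattern.matches(root);
  }
}

class SimpleRuleSketch {
  public static void main(String[] args) {
    Node scan = new Node("Scan");
    FilterOverProjectRule rule = new FilterOverProjectRule(true);
    // Filter on top of Project: the pattern applies.
    System.out.println(rule.matches(new Node("Filter", new Node("Project", scan))));
    // Project on top of Scan: the pattern does not apply.
    System.out.println(rule.matches(new Node("Project", scan)));
  }
}
```

The point of the sketch is only that the pattern and the configuration fit
in a handful of lines with no code generation and no build-system changes;
a real rule would of course also need the transformation logic.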

A good example is the FilterProjectTransposeRule. What now takes tens of
lines of code in the Config subclass [1] (which you could hardly write
without a reference example), plus ~500 LOC of generated code that you
get only through additional plugin configuration [2] in your build system,
could be expressed in a dozen lines of code [3] in Apache Calcite 1.22.0.

My question is: are we sure we are going in the right direction in terms
of complexity and the entry barrier for newcomers? Wouldn't it be better
to follow the 80/20 rule, where simple rules can be created easily and
programmatically with no external dependencies, while more advanced
facilities like Immutables are used only for the complex rules?

Regards,
Vladimir.

[1]
https://github.com/apache/calcite/blob/calcite-1.30.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L208-L260
[2]
https://github.com/apache/calcite/blob/calcite-1.30.0/core/build.gradle.kts#L215-L224
[3]
https://github.com/apache/calcite/blob/calcite-1.22.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L99-L110
