yashmayya commented on code in PR #18513:
URL: https://github.com/apache/pinot/pull/18513#discussion_r3282648392
##########
pinot-query-planner/src/main/java/org/apache/pinot/calcite/rel/rules/PinotQueryRuleSets.java:
##########
@@ -141,9 +141,21 @@ private PinotQueryRuleSets() {
// push aggregate functions through join, disabled by default
AggregateJoinTransposeRule.Config.EXTENDED
.withDescription(PlannerRuleNames.AGGREGATE_JOIN_TRANSPOSE_EXTENDED).toRule(),
- // aggregate union rule
+ // AggregateUnionAggregate and AggregateUnionTranspose are inverses of
each other and the defaults are chosen to
+ // optimize for distributed-OLAP cost (shuffle dominates compute):
+ // - AggregateUnionTranspose (default-on) pushes the aggregate into
every UNION ALL branch, so each branch
+ // ships pre-aggregated rows across the next exchange instead of raw
rows. With low-cardinality group keys
+ // (the common analytics case) this trades a cheap extra per-branch
aggregate for a much smaller shuffle.
+ // - AggregateUnionAggregate (default-off) does the opposite: it
collapses Agg(Union(Agg(A), B)) into
+ // Agg(Union(A, B)), which loses the pre-aggregation on A and forces
raw rows through the union's exchange.
+ // We keep it available behind `usePlannerRules` for
embedded/in-memory deployments where shuffle is free,
+ // but enabling it alongside the default-on Transpose just undoes
Transpose's work (see the order below).
AggregateUnionAggregateRule.Config.DEFAULT
.withDescription(PlannerRuleNames.AGGREGATE_UNION_AGGREGATE).toRule(),
+ // Registered after AggregateUnionAggregate on purpose: if both are
enabled, this phase runs last so Transpose
+ // wins and any merge AggregateUnionAggregate did first gets pushed back
into the branches.
+ PinotAggregateUnionTransposeRule
Review Comment:
Thanks for taking another look. I want to engage with the contract framing
more precisely, because I think "preserve the old opt-in contract" is doing
more work than it appears.
The shape of this change mirrors #16515 exactly: there, deployments that had
relied on Calcite's default-on `AggregateUnionAggregate` had to add
`usePlannerRules='AggregateUnionAggregate'` post-upgrade to keep their plans.
Here, deployments currently carrying that same `usePlannerRules` line will need
to add `skipPlannerRules='AggregateUnionTranspose'` to keep theirs. Neither
case changes the rule's own behavior — both change the surrounding default
ruleset. We accepted the trade in #16515 on the grounds that the new default
was right for the dominant Pinot workload; this PR is making the same trade in
the same direction for the same reason.
I do see the asymmetry you're pointing at — users who set `usePlannerRules`
explicitly arguably did more than users who just inherited Calcite's stock
defaults. But the `usePlannerRules` semantic is "register this rule into the
planner's ruleset", not "pin the plan shape it produced against any subsequent
rule that runs in a later phase". A multi-phase HEP planner doesn't and can't
make the latter guarantee — if it did, we couldn't add any future rule that
touches an aggregate-over-union pattern.
The compat-shim alternative (auto-skip Transpose whenever
AggregateUnionAggregate is opted in, or vice versa) is what worries me more
than the explicit-`SET`-line ask:
- It encodes a runtime coupling between two notionally-independent rules
inside `PinotQueryRuleSets`. Future maintainers won't expect it and won't know
to extend it when adding the next aggregate-over-union rule.
- It silently disables the cluster-wide shuffle optimization for any
deployment that still has the historical opt-in sitting in config-as-code —
which is exactly the audience you're trying to protect, but the protection
lands by giving them the *old* (worse-for-most-workloads) plan without telling
them.
- It gets the policy wrong if anyone has a legitimate reason to want both,
or any future combination we haven't anticipated. There's no good place to
express that beyond what query options already allow.
The audience this affects is also narrow and self-selecting: only
deployments that *deliberately* opted into AggregateUnionAggregate because they
believed the merged shape was better for their workload. Those are precisely
the deployments for whom the new Transpose default is the wrong default — they
need `skipPlannerRules='AggregateUnionTranspose'` anyway. Asking them to write
the two-rule `SET` together isn't extra ceremony, it's the
configuration-as-code spelling of "I want the merged shape", which is the
decision they already made.
Concretely, in line with how #16515 was rolled out, I'll add an explicit
migration callout to the PR description and the eventual release notes:
"Deployments currently using `usePlannerRules='AggregateUnionAggregate'` should
add `skipPlannerRules='AggregateUnionTranspose'` to preserve the previous
plan." If you still feel strongly about the compat shim after this, I'd
appreciate hearing what specific production scenario you want to protect — that
might surface something I'm missing.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]