terrymanu commented on issue #37644:
URL:
https://github.com/apache/shardingsphere/issues/37644#issuecomment-3710032239
### Problem Understanding
Cross-shard aggregation where SUM is wrapped by IFNULL(SUM(...),0) returns
only the first shard’s value (0 when that shard is empty), while a top-level
COUNT still accumulates across shards.
### Root Cause
The current result-merge pipeline only performs cross-shard aggregation on
projections recognized as top-level aggregations. When an aggregation is nested
inside an expression (e.g., IFNULL), the entire expression is treated as a
normal column, no AggregationUnit is built for the inner SUM, so it is not
merged across shards and the first shard’s value “wins.” This aligns with
current implementation boundaries rather than user misuse.
### Problem Analysis
- Because only top-level aggregations are merged, nested aggregations fall
outside the supported shape and get returned as-is from the first shard. This
is a capability gap that can surprise users writing conventional SQL with
wrappers like IFNULL/COALESCE.
- To support this SQL shape, the engine would need to detect and merge
aggregations even when they are embedded inside expression projections.
### Problem Conclusion
This is a missing capability (feature to be enhanced), not a violation of
usage rules by the user. It should be addressed either by (a) enhancing result
merge to recognize and merge aggregations inside expressions, and/or (b)
clarifying the limitation in the docs until the enhancement lands.
### Design-Level Enhancement Proposal (welcome PRs!)
1. Parsing/context: When handling ExpressionProjectionSegment, recursively
extract any AggregationProjectionSegment, assign deterministic aliases, and
register them in the projections/merge context (similar to how AVG spawns
SUM/COUNT derived columns).
2. Result merge: For extracted aggregations, create AggregationUnits and
accumulate per-shard values; after merge, evaluate the original expression
(e.g., IFNULL) on the merged row.
3. Column alignment/rewrite: Ensure derived aggregation aliases are
available in the result set or via rewrite so merge can fetch them reliably.
4. Tests: Add unit tests for projection extraction and result-merge
accumulation; add integration tests with two shards (one empty, one with data)
to verify IFNULL(SUM(...),0) aggregates correctly while COUNT remains correct.
5. Documentation: Update the merge section to state the current limitation
and, once implemented, document support for aggregations nested inside
expressions.
We’d love to see a community PR implementing the above and adding the
tests/docs; we’re happy to help review and iterate.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]