terrymanu commented on issue #37644:
URL: 
https://github.com/apache/shardingsphere/issues/37644#issuecomment-3710032239

   ###  Problem Understanding
     Cross-shard aggregation where SUM is wrapped by IFNULL(SUM(...),0) returns 
only the first shard’s value (0 when that shard is empty), while a top-level 
COUNT still accumulates across shards.
   
   ###  Root Cause
     The current result-merge pipeline only performs cross-shard aggregation on 
projections recognized as top-level aggregations. When an aggregation is nested 
inside an expression (e.g., IFNULL), the entire expression is treated as a 
normal column, no AggregationUnit is built for the inner SUM, so it is not 
merged across shards and the first shard’s value “wins.” This aligns with 
current implementation boundaries rather than user misuse.
   
   ###  Problem Analysis
   
     - Because only top-level aggregations are merged, nested aggregations fall 
outside the supported shape and get returned as-is from the first shard. This 
is a capability gap that can surprise users writing conventional SQL with 
wrappers like IFNULL/COALESCE.
     - To support this SQL shape, the engine would need to detect and merge 
aggregations even when they are embedded inside expression projections.
   
   ###  Problem Conclusion
     This is a missing capability (feature to be enhanced), not a violation of 
usage rules by the user. It should be addressed either by (a) enhancing result 
merge to recognize and merge aggregations inside expressions, and/or (b) 
clarifying the limitation in the docs until the enhancement lands.
   
   ###  Design-Level Enhancement Proposal (welcome PRs!)
   
     1. Parsing/context: When handling ExpressionProjectionSegment, recursively 
extract any AggregationProjectionSegment, assign deterministic aliases, and 
register them in the projections/merge context (similar to how AVG spawns 
SUM/COUNT derived columns).
     2. Result merge: For extracted aggregations, create AggregationUnits and 
accumulate per-shard values; after merge, evaluate the original expression 
(e.g., IFNULL) on the merged row.
     3. Column alignment/rewrite: Ensure derived aggregation aliases are 
available in the result set or via rewrite so merge can fetch them reliably.
     4. Tests: Add unit tests for projection extraction and result-merge 
accumulation; add integration tests with two shards (one empty, one with data) 
to verify IFNULL(SUM(...),0) aggregates correctly while COUNT remains correct.
     5. Documentation: Update the merge section to state the current limitation 
and, once implemented, document support for aggregations nested inside 
expressions.
   
     We’d love to see a community PR implementing the above and adding the 
tests/docs; we’re happy to help review and iterate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to