yaooqinn opened a new pull request, #55957:
URL: https://github.com/apache/spark/pull/55957

   ### What changes were proposed in this pull request?
   
   Extend `DecimalAggregates` to peel a scale-preserving widening `Cast` around 
`Sum`/`Average` arguments, recovering the long-backed fast path when the inner 
expression's precision still fits the existing safety bounds.
   
   When the input is `Sum(Cast(inner: dec(p, s), dec(p', s)))`, or the analogous `Average`, with `p' >= p`:
   - SUM arm fires under `p + 10 <= 18`, identical to the existing SUM 
fast-path guard.
   - AVG arm fires under `p <= 7` (`AVG_PEEL_MAX_INNER_PRECISION`), strictly 
tighter than the existing AVG arm's `p + 4 <= 15` (= `p <= 11`), to avoid 
amplifying SPARK-37024 Double-regime precision loss.
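
The two guards above can be written out as plain predicates. This is a minimal sketch of the arithmetic only, not the code in the patch: the object and method names are hypothetical, and only the bounds themselves (`p + 10 <= 18`, `p + 4 <= 15`, `p <= 7`) come from the description.

```scala
// Hypothetical helper that restates the precision guards quoted in the PR text.
object PeelGuards {
  val MaxLongDigits = 18            // 18 decimal digits always fit in a signed 64-bit Long
  val MaxDoubleDigits = 15          // 15 decimal digits are represented exactly by a Double
  val AvgPeelMaxInnerPrecision = 7  // value given for AVG_PEEL_MAX_INNER_PRECISION

  // SUM arm: same bound as the existing SUM fast path, applied to the inner precision p.
  def sumPeelAllowed(p: Int): Boolean = p + 10 <= MaxLongDigits

  // Existing AVG arm (no Cast peeled), shown for comparison only.
  def existingAvgAllowed(p: Int): Boolean = p + 4 <= MaxDoubleDigits

  // AVG arm with a peeled Cast: deliberately tighter than the existing AVG bound.
  def avgPeelAllowed(p: Int): Boolean = p <= AvgPeelMaxInnerPrecision
}
```

Under these bounds the SUM arm admits inner precisions up to 8 and the peeled AVG arm up to 7, compared with 11 for the existing AVG arm.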
   
   Both arms share a `WidenedDecimalChild` extractor that refuses to unwrap 
`CheckOverflow`, which preserves row-level overflow semantics. The window arm is 
unchanged: `ExtractWindowExpressions` hoists the `Cast` into a preceding 
`Project`, so an expression-level rewrite cannot see it.
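
One way to picture the shared extractor is the toy model below. It is a sketch under stated assumptions, not Catalyst code: the expression classes are simplified stand-ins defined here, the body of `WidenedDecimalChild` is a guess at the behaviour described above, and "refuses to unwrap `CheckOverflow`" is read as "only a plain `Cast` is ever peeled".

```scala
// Toy stand-ins for Catalyst expressions; every definition here is hypothetical.
object PeelSketch {
  final case class DecimalType(precision: Int, scale: Int)

  sealed trait Expr { def dataType: DecimalType }
  final case class Attr(name: String, dataType: DecimalType) extends Expr
  final case class Cast(child: Expr, dataType: DecimalType) extends Expr
  final case class CheckOverflow(child: Expr, dataType: DecimalType) extends Expr

  object WidenedDecimalChild {
    // Yields (inner expression, inner precision p, shared scale s) when `e` is a
    // Cast that keeps the scale and does not narrow the precision.
    def unapply(e: Expr): Option[(Expr, Int, Int)] = e match {
      case Cast(inner, DecimalType(outerP, outerS)) =>
        val DecimalType(p, s) = inner.dataType
        if (s == outerS && outerP >= p) Some((inner, p, s)) else None
      case _ =>
        None // CheckOverflow and every other node are left untouched
    }
  }
}
```

In this model, `Cast(Attr("price", DecimalType(7, 2)), DecimalType(18, 2))` matches and yields the inner attribute with `p = 7`, while a scale-changing cast, a narrowing cast, or a `CheckOverflow` wrapper does not match.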
   
   ### Why are the changes needed?
   
   The existing fast path keys off the declared precision `p'` after a widening 
Cast, not the effective precision `p` of the inner expression. User patterns 
like `SUM(CAST(small_dec AS larger_dec))` — common from BI tools generating SQL 
with normalized types — fall off the fast path even though `p + 10 <= 18`. 
TPC-DS q18 exhibits this pattern.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   - `DecimalAggregatesSuite`, including invariant-guard tests that lock the 
SUM/AVG safety boundaries.
   - ScalaCheck property-based tests in `DataFrameAggregateSuite` for numerical 
equivalence of the peeled and un-peeled paths.
   - `TPCDSV1_4PlanStabilitySuite` and `TPCDSV1_4PlanStabilityWithStatsSuite` 
regenerated for q18.
   - `DecimalAggregatesBenchmark` added; results committed for JDK 17/21/25 
under `sql/core/benchmarks/`.
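
The equivalence that the property-based tests target can be illustrated without Spark. The sketch below is not the test code from the patch: it uses plain `scala.math.BigDecimal`, hypothetical names, and a fixed scale of 2, and it models the long-backed path simply as "sum the unscaled values as `Long`, then reattach the scale".

```scala
// Hypothetical stand-alone model of the two SUM paths for an inner type dec(7, 2).
object SumEquivalence {
  val Scale = 2

  // Un-peeled path: every row is treated as a decimal value before summing.
  def decimalSum(unscaled: Seq[Long]): BigDecimal =
    unscaled.map(u => BigDecimal(BigInt(u), Scale)).sum

  // Peeled path: sum the unscaled Long values, then reattach the scale once.
  def longBackedSum(unscaled: Seq[Long]): BigDecimal =
    BigDecimal(BigInt(unscaled.sum), Scale)
}
```

With inner precision `p <= 8`, each unscaled value is below 10^8 in magnitude, so the `Long` accumulator stays within 18 digits for up to roughly 10^10 rows; that headroom is what the `p + 10 <= 18` guard expresses.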
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Opus 4.7


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

