kim created SPARK-54750:
---------------------------

             Summary: [SQL] ROUND returns NULL for Decimal values with low 
runtime precision
                 Key: SPARK-54750
                 URL: https://issues.apache.org/jira/browse/SPARK-54750
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0, 4.1.0, 3.3.4, 3.0.0
            Reporter: kim


h4. Summary

{{ROUND(PERCENTILE_APPROX(...), N)}} incorrectly returns {{NULL}} for certain 
Decimal values (e.g., {{{}2.15{}}}) while working correctly for others (e.g., 
{{{}2.151{}}}).
h4. Reproduction
{code:java}
-- Returns NULL (BUG)

SELECT ROUND(PERCENTILE_APPROX(2150 / 1000.0, 0.95), 3) as p95

-- Returns 2.151 (OK)

SELECT ROUND(PERCENTILE_APPROX(2151 / 1000.0, 0.95), 3) as p95 {code}
h4. Affected Versions
 * {*}Tested and confirmed{*}: 3.0.x, 3.3.4
 * {*}Code review confirmed{*}: 4.0.0-preview, master branch 

The buggy code has been present since at least Spark 3.0 and remains unfixed in 
the current master branch.
h4. Root Cause Analysis

The bug is in {{RoundBase.nullSafeEval}} (mathExpressions.scala):

 
{code:java}
case DecimalType.Fixed(p, s) =>  
val decimal = input1.asInstanceOf[Decimal]  
if (_scale >= 0) {    
          decimal.toPrecision(decimal.precision, s, mode)  
          // BUG: uses decimal.precision  
} {code}
{*}Problem{*}: The code uses {{decimal.precision}} (runtime input value's 
precision) instead of {{p}} (schema-defined output precision).

{*}Why it fails{*}:
 # {{PERCENTILE_APPROX}} internally computes with Double, then converts back to 
Decimal using {{{}Decimal(doubleValue){}}}.
 # This creates a "compact" Decimal with minimal precision (e.g., {{2.15}} → 
precision=3).
 # When {{ROUND(..., 3)}} tries to expand to 3 decimal places, {{2.15}} becomes 
{{2.150}} (needs precision=4).
 # The buggy code limits precision to {{{}decimal.precision=3{}}}, causing {{4 
> 3}} overflow check to fail.
 # Result: {{{}NULL{}}}.

{*}Why some values work{*}:
 * {{2.151}} has runtime precision=4, which is sufficient for the output 
{{2.151}} (also precision=4).
 * {{2.15}} has runtime precision=3, but needs precision=4 for output 
{{{}2.150{}}}.

h4. Fix:  
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala

 
{code:java}
- decimal.toPrecision(decimal.precision, s, mode)
+ decimal.toPrecision(p, s, mode) {code}
h4. Workaround

Before this issue is fixed, users can work around the bug by casting to 
{{DOUBLE}} before applying {{{}ROUND{}}}:
{code:java}
-- BUG: Returns NULL
SELECT ROUND(PERCENTILE_APPROX(2150 / 1000.0, 0.95), 3) as p95

-- WORKAROUND: Cast to DOUBLE first, returns 2.15
SELECT ROUND(CAST(PERCENTILE_APPROX(2150 / 1000.0, 0.95) AS DOUBLE), 3) as 
p95{code}
 

 

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to