andygrove opened a new pull request, #4179:
URL: https://github.com/apache/datafusion-comet/pull/4179

   ## Which issue does this PR close?
   
   Closes #4124.
   
   ## Rationale for this change
   
   The new Spark 4.1 test `SPARK-53968 reading the view after 
allowPrecisionLoss is changed` in `SQLViewSuite` returns values 10x smaller 
than expected when run with Comet. The reproducer stores `DECIMAL(38, 18)` 
values in a view whose plan does `unit_price + COALESCE(shipping_price, 0)`, 
then re-reads the view after toggling 
`spark.sql.decimalOperations.allowPrecisionLoss`.
   
   Spark 4.1.1 (SPARK-53968) added a `NumericEvalContext` to 
`Add`/`Subtract`/`Multiply`/`Divide`/`IntegralDivide` that captures 
`allowDecimalPrecisionLoss` at construction time, so a view's analyzed plan 
keeps a stable decimal result type across config changes. Comet's 
`DecimalPrecision.promote` rule, however, recomputed the result type from the 
**live** `SQLConf` and wrapped the expression in `CheckOverflow` with that 
recomputed type. When the two disagreed (which is the SPARK-53968 scenario), 
the wrapper's target type no longer matched the child's `dataType`.
   
   Comet's native `CheckOverflow` 
(`native/spark-expr/src/math_funcs/internal/checkoverflow.rs`) only validates 
that values fit the target precision; it does **not** rescale. So a 
`Decimal128Array` produced at scale 17 ended up relabelled as scale 18 (or vice 
versa), shifting every value by a factor of 10. With the view created at 
`allowPrecisionLoss=true` and read back at `false`, the expected 
`100.00000000000000000` came back as `10.00000000000000000`.
   
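   The relabelling effect can be shown in miniature: a `Decimal128` stores an unscaled integer, and the scale lives only in the type metadata, so relabelling without rescaling shifts every value. This is an illustrative sketch, not the native `CheckOverflow` code:

```rust
// A Decimal128 value is an unscaled i128; its decimal point position comes
// from the type's scale. Relabel the same raw value from scale 17 to
// scale 18 and it reads 10x smaller. Illustrative only.
fn integer_part(unscaled: i128, scale: u32) -> i128 {
    unscaled / 10i128.pow(scale)
}

fn main() {
    // 100 stored at scale 17: raw = 100 * 10^17.
    let raw: i128 = 100 * 10i128.pow(17);
    assert_eq!(integer_part(raw, 17), 100); // correct label
    assert_eq!(integer_part(raw, 18), 10);  // relabelled: 10x smaller
}
```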
   ## What changes are included in this PR?
   
   - `DecimalPrecision.promote` now sets the `CheckOverflow` target type to 
`expr.dataType` directly. On Spark 3.4 through 4.0 this is equivalent to the previous 
recomputation; on Spark 4.1+ it preserves the per-expression 
`allowDecimalPrecisionLoss` captured at view creation time.
   - The unused `allowPrecisionLoss` parameter is dropped from the rule and its 
single call site in `QueryPlanSerde.exprToProto`.
   - The `IgnoreComet` annotation on `SPARK-53968 reading the view after 
allowPrecisionLoss is changed` is removed from `dev/diffs/4.1.1.diff`.
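   The shape of the fix can be sketched as a toy model (the real rule is Scala in `DecimalPrecision.promote`; all names below are illustrative, not Comet code). The old rule derived the `CheckOverflow` target from the session's current config; the fixed rule takes it from the child expression, whose type was fixed when the view was analyzed:

```rust
// Toy model contrasting the old and fixed promote behaviour.
// Illustrative only; not the actual Scala rule.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DecimalType { precision: i32, scale: i32 }

struct AddExpr {
    // Result type fixed when the expression was built (view creation time).
    data_type: DecimalType,
}

struct CheckOverflow { target: DecimalType }

// Old behaviour: recompute the target from the *live* config.
fn promote_old(live_allow_precision_loss: bool) -> CheckOverflow {
    let target = if live_allow_precision_loss {
        DecimalType { precision: 38, scale: 17 }
    } else {
        DecimalType { precision: 38, scale: 18 }
    };
    CheckOverflow { target }
}

// Fixed behaviour: the wrapper always matches the child's data type.
fn promote_fixed(expr: &AddExpr) -> CheckOverflow {
    CheckOverflow { target: expr.data_type }
}

fn main() {
    // View analyzed with allowPrecisionLoss=true, read with it set to false.
    let add = AddExpr { data_type: DecimalType { precision: 38, scale: 17 } };
    assert_ne!(promote_old(false).target, add.data_type); // mismatch: relabelled values
    assert_eq!(promote_fixed(&add).target, add.data_type); // wrapper tracks the child
}
```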
   
   ## How are these changes tested?
   
   - New unit test `CometDecimalArithmeticViewSuite` (under 
`spark/src/test/spark-4.1/`) constructs `Add` with a `NumericEvalContext` whose 
`allowDecimalPrecisionLoss` disagrees with the live `SQLConf`, and asserts that 
the promoted `CheckOverflow` target tracks `Add.dataType`. The test fails 
before the fix (target was `DecimalType(38,17)`/`DecimalType(38,18)` instead of 
the stored type) and passes after.
   - The existing `decimal division result type matches Spark` regression test 
in `CometExpressionSuite`, which already targeted a closely-related Spark 4 
path, continues to pass.
   - The `SQLViewSuite.SPARK-53968` Spark SQL test will be exercised by the 
Spark SQL CI workflow once it is no longer ignored.

