jianzhenwu commented on code in PR #12009:
URL: https://github.com/apache/gluten/pull/12009#discussion_r3207426153
##########
backends-velox/src/test/scala/org/apache/spark/sql/execution/VeloxExpandSuite.scala:
##########
@@ -77,4 +88,128 @@ class VeloxExpandSuite extends VeloxWholeStageTransformerSuite {
}
}
}
+
+ test("Expand with round(avg(decimal)) and multiple distinct aggregates") {
Review Comment:
I hit this problem on Spark 3.2, and I believe it also reproduces on
Spark 3.3. The explanation below was drafted with the help of an AI tool.
Spark 3.3 can reproduce the issue because its physical ExpandExec contains
this decimal expression shape:
```
CAST((case_dd_decimal26 + case_ccb_decimal26) AS DECIMAL(27,10)) +
case_fsv_decimal27
```
Spark declares the Expand output column as:
```
DECIMAL(27,10)
```
The null-filled projection rows for the same Expand column have the same type:
```
CAST(NULL AS DECIMAL(27,10))
```
But when Velox compiles the non-null decimal arithmetic row, it infers:
```
DECIMAL(28,10)
```
So native ExpandNode sees mixed types in the same output column:
```
row 0: DECIMAL(28,10)
row 1: DECIMAL(27,10)
```
Then Velox fails with:
```
The projections type does not match across different rows in the same column.
Got: DECIMAL(27, 10), DECIMAL(28, 10)
```
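To make the failure mode concrete, here is a minimal sketch of the kind of
per-column consistency check that would reject this plan. It is not Velox's
actual code: the function name `check_expand_column_types` and the
`(precision, scale)` tuple encoding are hypothetical, chosen only for
illustration.

```python
from typing import List, Tuple

# Hypothetical encoding: a decimal type is a (precision, scale) pair.
DecimalType = Tuple[int, int]


def check_expand_column_types(projections: List[List[DecimalType]]) -> None:
    """Raise if any output column has different types across projection rows.

    `projections[i][j]` is the type produced by projection row i for column j.
    """
    for col in range(len(projections[0])):
        types = {row[col] for row in projections}
        if len(types) > 1:
            raise TypeError(
                f"column {col} has mixed types across rows: {sorted(types)}"
            )


# Consistent column: both projection rows yield DECIMAL(27,10), so no error.
check_expand_column_types([[(27, 10)], [(27, 10)]])

# The shape described above: row 0 is inferred as DECIMAL(28,10) while the
# null row 1 stays DECIMAL(27,10), so the check raises.
try:
    check_expand_column_types([[(28, 10)], [(27, 10)]])
except TypeError as err:
    print(err)
```

Under that sketch, any fix has to make both rows agree, either by casting the
arithmetic row back to the declared type or by changing how its type is
inferred.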
Spark 3.5 does not reproduce it because the generated ExpandExec expression
is different:
```
(case_dd_decimal25 + case_ccb_decimal25) + case_fsv_decimal25
```
It does not insert the intermediate:
```
CAST(... AS DECIMAL(27,10))
```
that Spark 3.3 has. With this Spark 3.5 plan shape, Velox’s inferred type
stays compatible with Spark’s Expand output type, so all projection rows in the
Expand column remain consistent.
So the difference is not the SQL result type. Both Spark versions declare
the Expand output as DECIMAL(27,10). The difference is the internal decimal
expression tree Spark generates before Gluten/Velox conversion. Spark 3.3’s
tree causes Velox to widen one projection row to DECIMAL(28,10); Spark 3.5’s
tree does not.
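The widening is consistent with the usual Spark rule for decimal addition
(scale = max(s1, s2), precision = max(p1 - s1, p2 - s2) + scale + 1). Below is
a rough sketch that applies that rule to both trees. It rests on two
assumptions that are not confirmed above: that the numeric suffix in each
column name is its precision, and that every operand has scale 10. The helper
`add_result_type` is hypothetical, and the cap at precision 38 is simplified
(it ignores Spark's scale adjustment on overflow).

```python
from typing import Tuple

DecimalType = Tuple[int, int]  # (precision, scale); hypothetical encoding
MAX_PRECISION = 38


def add_result_type(left: DecimalType, right: DecimalType) -> DecimalType:
    """Result type of left + right under the standard decimal addition rule."""
    (p1, s1), (p2, s2) = left, right
    scale = max(s1, s2)
    precision = max(p1 - s1, p2 - s2) + scale + 1
    # Simplified cap; real engines may also reduce the scale here.
    return (min(precision, MAX_PRECISION), scale)


# Spark 3.3 shape (assumed operand types):
#   CAST((dd26 + ccb26) AS DECIMAL(27,10)) + fsv27
inner_33 = add_result_type((26, 10), (26, 10))  # (27, 10)
cast_33 = (27, 10)                              # explicit intermediate CAST
outer_33 = add_result_type(cast_33, (27, 10))   # (28, 10)

# Spark 3.5 shape (assumed operand types):
#   (dd25 + ccb25) + fsv25
inner_35 = add_result_type((25, 10), (25, 10))  # (26, 10)
outer_35 = add_result_type(inner_35, (25, 10))  # (27, 10)

print(outer_33, outer_35)
```

If Velox infers the type of the arithmetic row with the same rule, the Spark
3.3 tree lands one digit above the declared DECIMAL(27,10), while the Spark
3.5 tree lands exactly on it.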
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]