weiting-chen commented on code in PR #12142:
URL: https://github.com/apache/gluten/pull/12142#discussion_r3304527582
##########
shims/spark35/src/main/scala/org/apache/gluten/sql/shims/spark35/Spark35Shims.scala:
##########
@@ -80,7 +80,8 @@ class Spark35Shims extends SparkShims {
Sig[RegrSlope](ExpressionNames.REGR_SLOPE),
Sig[RegrIntercept](ExpressionNames.REGR_INTERCEPT),
Sig[RegrSXY](ExpressionNames.REGR_SXY),
- Sig[RegrReplacement](ExpressionNames.REGR_REPLACEMENT)
+ Sig[RegrReplacement](ExpressionNames.REGR_REPLACEMENT),
+ Sig[BitmapConstructAgg](ExpressionNames.BITMAP_CONSTRUCT_AGG)
Review Comment:
**Consider adding plan-shape assertion in tests**
**Problem:** `GlutenBitmapExpressionsQuerySuite` inherits upstream Spark
tests that only use `checkAnswer()` — they verify correctness but do not assert
that native Velox execution is actually active. A regression that silently
disables offloading would go undetected since vanilla Spark also produces
correct results.
**Evidence:**
```scala
class GlutenBitmapExpressionsQuerySuite
extends BitmapExpressionsQuerySuite
with GlutenSQLTestsTrait {}
// All upstream tests only call checkAnswer(df, expected)
```
**Suggested Fix:** Add at least one Gluten-specific test asserting the
native transformer is present:
```scala
test("bitmap_construct_agg routes to native") {
val df = spark.sql("SELECT bitmap_construct_agg(col) FROM values (1L),
(2L) AS t(col)")
val plan = df.queryExecution.executedPlan
assert(collect(plan) { case h: HashAggregateExecTransformer => h
}.nonEmpty,
"Expected native HashAggregateExecTransformer in plan")
}
```
##########
cpp/velox/substrait/SubstraitToVeloxPlanValidator.cc:
##########
@@ -1323,7 +1323,8 @@ bool SubstraitToVeloxPlanValidator::validate(const
::substrait::AggregateRel& ag
"regr_slope",
"regr_intercept",
"regr_sxy",
- "regr_replacement"};
+ "regr_replacement",
+ "bitmap_construct_agg"};
Review Comment:
**Follow-up opportunity: register bitmap_or_agg (and bitmap_and_agg for
Spark 4.1)**
**Problem:** `bitmap_or_agg` was introduced in the same Spark 3.5 commit as
`bitmap_construct_agg`, and `bitmap_and_agg` in Spark 4.1. If native Velox
implementations exist for these, registering them together would avoid
unnecessary columnar-to-row transitions when users combine bitmap functions in
the same query stage.
**Investigation Needed:** Confirm whether `bitmap_or_agg` and
`bitmap_and_agg` have native Velox implementations (look for registration in
`cpp/velox/udf/` or the Velox aggregate function registry). If yes, consider
adding them in a follow-up PR:
```cpp
"bitmap_construct_agg",
"bitmap_or_agg",
"bitmap_and_agg"};
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]