brijrajk commented on code in PR #12151:
URL: https://github.com/apache/gluten/pull/12151#discussion_r3500310173


##########
gluten-ut/spark40/src/test/resources/backends-velox/tpcds-plan-stability/gluten-approved-plans-modified/q59.sf100/simplified.txt:
##########
@@ -23,52 +23,42 @@ VeloxColumnarToRow
                                         InputIteratorTransformer
                                           InputAdapter
                                             ColumnarBroadcastExchange #2
-                                              WholeStageCodegenTransformer (3)
-                                                FilterExecTransformer 
[d_date_sk,d_week_seq]
-                                                  Subquery #1
-                                                    VeloxColumnarToRow
-                                                      
WholeStageCodegenTransformer (2)
-                                                        
RegularHashAggregateExecTransformer [bloom_filter_agg(xxhash64(d_week_seq, 42), 
335, 8990, 0, 0),bloomFilter,buf,buf]
-                                                          
InputIteratorTransformer
-                                                            InputAdapter
-                                                              ColumnarExchange 
#3
-                                                                
VeloxResizeBatches
-                                                                  
WholeStageCodegenTransformer (1)
-                                                                    
FlushableHashAggregateExecTransformer [_pre_x] [buf,buf,buf]
+                                              RowToVeloxColumnar
+                                                WholeStageCodegen (1)
+                                                  Filter [d_date_sk,d_week_seq]
+                                                    Subquery #1
+                                                      ObjectHashAggregate 
[buf] [bloom_filter_agg(xxhash64(d_week_seq, 42), 335, 8990, 0, 
0),bloomFilter,buf]
+                                                        VeloxColumnarToRow
+                                                          ColumnarExchange #3
+                                                            VeloxResizeBatches
+                                                              
RowToVeloxColumnar
+                                                                
ObjectHashAggregate [d_week_seq] [buf,buf]
+                                                                  
VeloxColumnarToRow
+                                                                    
WholeStageCodegenTransformer (1)

Review Comment:
   @zhztheplayer Yes, it's in the final executed plan. The `bloom_filter_agg` 
attribute name comes from `resultExpressions` (frozen when 
`ObjectHashAggregateExec` was first built from vanilla `BloomFilterAggregate`) 
— the subsequent `transformAllExpressions` only updates `aggregateExpressions`, 
not `resultExpressions`. The actual physical function is 
`VeloxBloomFilterAggregate`, not vanilla; it runs in JVM mode but `eval()` 
still calls `serialize(buffer)` unconditionally → Velox-format bytes. 
`VeloxBloomFilterMightContain` reads them correctly.
   
   The JVM mode is because `PlanSubqueries` (position 2) prepares the subquery 
before `ApplyColumnarRulesAndInsertTransitions` (position 10) runs. At that 
point the subquery has vanilla `BloomFilterAggregate` with no Substrait 
mapping, so `HeuristicTransform` leaves it as `ObjectHashAggregateExec`. Our 
rule substitutes `VeloxBloomFilterAggregate` into the already-prepared plan but 
`HeuristicTransform` doesn't re-run.
   
   Restoring native offloading would mean substituting during subquery 
preparation, which also fires for SPARK-54336 (`might_contain(empty_subquery, 
literal)`) — `VeloxBloomFilterAggregate` has no cardinality guard so empty 
input returns bytes instead of `null`.
   
   Added a test in the latest commit that directly inspects the executed plan's 
`ObjectHashAggregateExec` for `VeloxBloomFilterAggregate` and confirms correct 
results.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to