yaooqinn commented on PR #11837:
URL: https://github.com/apache/gluten/pull/11837#issuecomment-4144087793

   ## Root Cause Analysis
   
   Thanks @zhouyuan for catching this! I traced the issue:
   
   ### What changed
   
   **Before Velox PR #16416**: `collect_set` had **1 signature** → 
`hasSameIntermediateTypesAcrossSignatures()` returns `false` → companion 
function registered as `collect_set_merge_extract` (no suffix).
   
   **After PR #16416**: `collect_set` has **2 signatures** (1-arg + 2-arg with 
boolean), both with `intermediateType("array(T)")` → 
`hasSameIntermediateTypesAcrossSignatures()` returns `true` → companion 
function registered as `collect_set_merge_extract_array_T` (**with suffix**).
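
A minimal sketch of the naming rule described above, to make the before/after difference concrete. It is not Velox's actual code: `Signature`, `sharesIntermediateType`, `toSuffix` and `mergeExtractNames` are hypothetical stand-ins, and the suffix rule is simplified to cover only a type like `array(T)`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an aggregate function signature.
// numArgs is only here to tell the two signatures apart.
struct Signature {
  int numArgs;
  std::string intermediateType;
};

// Assumed behaviour: true when two or more signatures share an
// intermediate type (modelled on the description in this comment).
bool sharesIntermediateType(const std::vector<Signature>& signatures) {
  std::set<std::string> seen;
  for (const auto& sig : signatures) {
    if (!seen.insert(sig.intermediateType).second) {
      return true;
    }
  }
  return false;
}

// Simplified suffix rule: "array(T)" becomes "array_T".
std::string toSuffix(const std::string& type) {
  std::string out;
  for (char c : type) {
    if (c == '(') {
      out += '_';
    } else if (c != ')') {
      out += c;
    }
  }
  return out;
}

// Names under which the merge_extract companion would be registered.
std::set<std::string> mergeExtractNames(
    const std::string& baseName,
    const std::vector<Signature>& signatures) {
  std::set<std::string> names;
  const std::string plain = baseName + "_merge_extract";
  if (!sharesIntermediateType(signatures)) {
    names.insert(plain);  // single name, no suffix
    return names;
  }
  for (const auto& sig : signatures) {
    names.insert(plain + "_" + toSuffix(sig.intermediateType));
  }
  return names;
}
```

With one `array(T)` signature this yields only `collect_set_merge_extract`; with two signatures sharing `array(T)` it yields only `collect_set_merge_extract_array_T`, so the plain name disappears.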
   
   ### The mismatch
   
   Gluten's `SubstraitToVeloxPlan.cc::toAggregationFunctionName()` (lines 
280-294):
   1. First tries `collect_set_merge_extract` → **not found** (it is now 
registered with a suffix)
   2. Falls through, constructs suffix from concrete result type: 
`collect_set_merge_extract_array_row_VARCHAR_BIGINT_BIGINT_endrow`
   3. Looks up that name → **not found** (registered as generic 
`collect_set_merge_extract_array_T`)
   4. **Throws**: `Cannot find function signature`
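
The four steps above can be sketched as a two-stage name lookup. This is an illustration under stated assumptions, not the code in `SubstraitToVeloxPlan.cc`: `Registry` and `resolveMergeExtract` are hypothetical names, and the registry is reduced to a set of registered function names.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the aggregate function registry:
// just the set of registered names.
using Registry = std::set<std::string>;

// Tries the plain companion name first, then the name suffixed with the
// concrete result type. Returns an empty string when neither is registered,
// which is where the real code would throw "Cannot find function signature".
std::string resolveMergeExtract(
    const Registry& registry,
    const std::string& baseName,
    const std::string& concreteTypeSuffix) {
  const std::string plain = baseName + "_merge_extract";
  if (registry.count(plain) > 0) {
    return plain;
  }
  const std::string suffixed = plain + "_" + concreteTypeSuffix;
  if (registry.count(suffixed) > 0) {
    return suffixed;
  }
  return "";
}
```

Against a registry holding only `collect_set_merge_extract_array_T`, both probes miss: the plain name is gone, and the concrete suffix `array_row_VARCHAR_BIGINT_BIGINT_endrow` never equals the generic `array_T`.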
   
   ### Fix options
   
   1. **In Velox**: Keep the companion functions registered without a suffix, 
i.e. make sure `hasSameIntermediateTypesAcrossSignatures` still returns 
`false`. Could use `ignoreDuplicates` or a separate registration for the 2-arg 
signature.
   2. **In Gluten Substrait C++**: Update `toAggregationFunctionName` to first 
try the generic (no-suffix) companion name via 
`getAggregateFunctionSignatures(baseName + "_merge_extract")`, with type 
resolution, similar to how Velox's own planner resolves generic companion 
functions.
   
   I'll fix this in the Velox PR #16416 (or a follow-up) since the root cause 
is the signature change creating a suffix-based companion registration.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

