CuteChuanChuan commented on PR #19592:
URL: https://github.com/apache/datafusion/pull/19592#issuecomment-3734861197

   Thanks @Jefffrey  and @andygrove for the valuable feedback! I've revised the 
implementation based on your suggestions.
   
   Changes Made
   
     1. Stricter Type Signature
     2. Simplified return_field_from_args
     3. Removed argument count check in invoke_with_args
     4. Re-implemented by using Arrow kernels / OffsetBuffer::lengths()
     5. Follow Spark null behavior: null → -1 (Referenced @andygrove's [Comet 
implementation](https://github.com/apache/datafusion-comet/pull/2862))
     6. Moved tests and added tests to SLTs
   
   Currently, the detailed implementation is decided by null_count() Check
   ```rust
   if array.null_count() == 0 {
       // Fast path: use Arrow kernel directly
       Ok(arrow_length(array)?)
   } else {
       // Spark returns -1 for null, not null like Arrow's length kernel
       list_array.offsets().lengths().enumerate()
           .map(|(i, len)| if array.is_null(i) { -1 } else { len as i32 })
           .collect()
   }
   ```  
   This approach:
   - Uses optimized Arrow kernel for the common case (no nulls)
   - Only does per-element null checking when necessary
   - Single pass through data in both cases
   
   
   But I have some questions
   
   1. return_field_from_args nullability
   Currently returns args.arg_fields[0].is_nullable(), but since Spark returns 
-1 for null input (not null), the output is never actually null. Should this be 
false instead?
   ```rust
   // Current
   args.arg_fields[0].is_nullable()
   
   // Alternative
   false  // output is never null since null → -1
   ```
   
   2. Code duplication
   If this if-else pattern is acceptable, then this pattern is repeated for 
List, LargeList, and Map. Would you prefer extracting a helper function, or is 
the current explicit approach acceptable for readability?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to