CuteChuanChuan commented on PR #19592:
URL: https://github.com/apache/datafusion/pull/19592#issuecomment-3734861197
Thanks @Jefffrey and @andygrove for the valuable feedback! I've revised the
implementation based on your suggestions.
Changes Made
1. Stricter Type Signature
2. Simplified return_field_from_args
3. Removed argument count check in invoke_with_args
4. Re-implemented by using Arrow kernels / OffsetBuffer::lengths()
5. Follow Spark null behavior: null → -1 (Referenced @andygrove's [Comet
implementation](https://github.com/apache/datafusion-comet/pull/2862))
6. Moved tests and added tests to SLTs
Currently, the detailed implementation is decided by null_count() Check
```rust
if array.null_count() == 0 {
// Fast path: use Arrow kernel directly
Ok(arrow_length(array)?)
} else {
// Spark returns -1 for null, not null like Arrow's length kernel
list_array.offsets().lengths().enumerate()
.map(|(i, len)| if array.is_null(i) { -1 } else { len as i32 })
.collect()
}
```
This approach:
- Uses optimized Arrow kernel for the common case (no nulls)
- Only does per-element null checking when necessary
- Single pass through data in both cases
But I have some questions
1. return_field_from_args nullability
Currently returns args.arg_fields[0].is_nullable(), but since Spark returns
-1 for null input (not null), the output is never actually null. Should this be
false instead?
```rust
// Current
args.arg_fields[0].is_nullable()
// Alternative
false // output is never null since null → -1
```
2. Code duplication
If this if-else pattern is acceptable, then this pattern is repeated for
List, LargeList, and Map. Would you prefer extracting a helper function, or is
the current explicit approach acceptable for readability?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]