EdsonPetry opened a new pull request, #23265:
URL: https://github.com/apache/datafusion/pull/23265
## Which issue does this PR close?
- N/A. No dedicated tracking issue; this adds a single self-contained
higher-order array function. Happy to file one if preferred.
## Rationale for this change
DataFusion already provides higher-order array functions such as
`array_any_match`, `array_filter`, and `array_transform`, but there is no
direct way to retrieve the *first* element of an array that satisfies a
predicate. Today this requires `array_filter` followed by `array_element(...,
1)`, which materializes an intermediate filtered array. `array_first` expresses
this directly and rounds out the set of lambda-based array functions.
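As a rough plain-Rust analogy (not DataFusion code), the difference between the two-step workaround and a direct lookup can be sketched as below. The function names `first_via_filter` and `first_direct` are hypothetical and exist only for this illustration. The sketch makes no claim about whether the PR's implementation stops evaluating the predicate at the first match.

```rust
/// Two-step approach: build the whole filtered vector, then take its first
/// element. This mirrors `array_filter` followed by `array_element(..., 1)`.
fn first_via_filter(values: &[i64], pred: impl Fn(i64) -> bool) -> Option<i64> {
    let filtered: Vec<i64> = values.iter().copied().filter(|v| pred(*v)).collect();
    filtered.first().copied()
}

/// Direct approach: return the first matching element without building an
/// intermediate collection.
fn first_direct(values: &[i64], pred: impl Fn(i64) -> bool) -> Option<i64> {
    values.iter().copied().find(|v| pred(*v))
}
```

Both functions agree on every input; the second simply skips the intermediate `Vec`.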
## What changes are included in this PR?
- New higher-order function `array_first(array, predicate)` (alias
  `list_first`) in `datafusion-functions-nested`, returning the first element for
  which the lambda predicate returns `true`:
  - returns `null` when the array is empty or no element matches;
  - a predicate that evaluates to `null` for an element is treated as not
    matching;
  - a matched element that is itself `null` is returned as `null`.
- Implemented as a `HigherOrderUDFImpl` following the existing array-lambda
functions, including the standard fast paths (fully-null input) and correct
handling of sliced lists, null sublists, and captured outer columns.
- Registration in `functions-nested` (`expr_fn` re-export and the default
higher-order function list).
- Unit tests, sqllogictest coverage, and regenerated SQL function
documentation.
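
The null-handling rules listed above can be modeled on a single list with ordinary Rust `Option` values. This is a minimal sketch for one row, assuming `None` stands for SQL `null`; `array_first_model` is a hypothetical name, and the code does not reflect the Arrow-based `HigherOrderUDFImpl` in the PR.

```rust
/// Model of the described semantics for one list.
/// An element of `None` is a null element; a predicate result of `None`
/// is a null predicate result.
fn array_first_model<T: Clone>(
    elements: &[Option<T>],
    pred: impl Fn(&Option<T>) -> Option<bool>,
) -> Option<T> {
    elements
        .iter()
        // Only a predicate result of exactly `true` counts as a match;
        // `false` and null (`None`) are both skipped.
        .find(|e| pred(*e) == Some(true))
        // A matched null element comes back as null. An empty list or a list
        // with no match also yields `None`.
        .and_then(|e| e.clone())
}
```

In this model a matched null element and "no match" are indistinguishable in the result, which is consistent with both cases returning `null` in the description above.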
## Are these changes tested?
Yes:
- Unit tests in `array_first.rs` covering match/no-match, empty and null
arrays, null-predicate handling, matched-null elements, sliced lists, captured
outer columns, and non-primitive element types.
- sqllogictest cases in `test_files/array/array_first.slt`, including
`LargeList` and the `list_first` alias.
## Are there any user-facing changes?
Yes. A new array function `array_first` (alias `list_first`) is available in
SQL, with generated documentation under the Array Functions section. There are
no breaking changes to existing public APIs.