alamb commented on code in PR #9871:
URL: https://github.com/apache/arrow-rs/pull/9871#discussion_r3191647540
##########
arrow-string/src/like.rs:
##########
@@ -187,6 +189,30 @@ pub fn contains(left: &dyn Datum, right: &dyn Datum) -> Result<BooleanArray, Arr
like_op(Op::Contains, left, right)
}
+/// Perform equality check on two byte arrays using an ASCII case-insensitive match.
Review Comment:
I think the inputs can also be string / stringview / largestring arrays, not just byte arrays.
##########
arrow-string/src/like.rs:
##########
@@ -1394,6 +1425,22 @@ mod tests {
vec![true, false, true, true, true]
);
+ test_utf8!(
+ test_utf8_array_eq_ignore_ascii_case,
+ vec!["arrow", "arrow", "arrow", "parquet", "parquet"],
Review Comment:
Could we add a substring test too, e.g. comparing `arrow` to `arro` and
verifying that they don't match?
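
To make the substring case concrete, here is a minimal std-only sketch of the expected element-wise semantics. It assumes the new kernel behaves like `str::eq_ignore_ascii_case` applied pairwise; `eq_ignore_ascii_case_elementwise` is a hypothetical helper for illustration, not the kernel in this PR.

```rust
/// Hypothetical stand-in for the kernel under review: element-wise
/// ASCII case-insensitive equality over two equally sized slices.
/// Assumes the kernel matches `str::eq_ignore_ascii_case` semantics.
fn eq_ignore_ascii_case_elementwise(left: &[&str], right: &[&str]) -> Vec<bool> {
    assert_eq!(left.len(), right.len(), "inputs must have equal length");
    left.iter()
        .zip(right.iter())
        .map(|(l, r)| l.eq_ignore_ascii_case(r))
        .collect()
}

fn main() {
    // A prefix or a longer string must not count as a match;
    // only a full-length, case-insensitive match is true.
    let left = ["arrow", "arrow", "ARROW", "arro"];
    let right = ["arro", "arrows", "arrow", "ARROW"];
    let result = eq_ignore_ascii_case_elementwise(&left, &right);
    assert_eq!(result, vec![false, false, true, false]);
}
```

The same pairs could be added to the existing `test_utf8!` case so that a prefix-only match is covered in both directions.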
##########
arrow-string/src/like.rs:
##########
@@ -187,6 +189,30 @@ pub fn contains(left: &dyn Datum, right: &dyn Datum) -> Result<BooleanArray, Arr
like_op(Op::Contains, left, right)
}
+/// Perform equality check on two byte arrays using an ASCII case-insensitive match.
+///
+/// `left` and `right` must be the same type, and one of
+/// - Utf8
+/// - LargeUtf8
+/// - Utf8View
+///
+/// # Example
+/// ```
+/// # use arrow_array::{StringArray, BooleanArray};
+/// # use arrow_string::like::eq_ignore_ascii_case;
+/// let strings = StringArray::from(vec!["arrow", "rs", "arrow-rS", "Parquet"]);
+/// let patterns = StringArray::from(vec!["ARROW", "rS", "ARROW-rs", "arrow"]);
+///
+/// let result = eq_ignore_ascii_case(&strings, &patterns).unwrap();
+/// assert_eq!(result, BooleanArray::from(vec![true, true, true, false]));
+/// ```
+pub fn eq_ignore_ascii_case(
Review Comment:
Maybe we can change the module doc to be more general:
from
```rust
//! Provide SQL's LIKE operators for Arrow's string arrays
```
to something like:
```rust
//! String predicate kernels for Arrow arrays.
//!
//! Provides SQL `LIKE`/`ILIKE` kernels as well as related
//! string predicates such as `contains`, `starts_with`, `ends_with`, and
//! ASCII case-insensitive equality.
```
##########
arrow-string/src/like.rs:
##########
@@ -187,6 +189,30 @@ pub fn contains(left: &dyn Datum, right: &dyn Datum) -> Result<BooleanArray, Arr
like_op(Op::Contains, left, right)
}
+/// Perform equality check on two byte arrays using an ASCII case-insensitive match.
+///
+/// `left` and `right` must be the same type, and one of
+/// - Utf8
+/// - LargeUtf8
+/// - Utf8View
+///
+/// # Example
+/// ```
+/// # use arrow_array::{StringArray, BooleanArray};
+/// # use arrow_string::like::eq_ignore_ascii_case;
+/// let strings = StringArray::from(vec!["arrow", "rs", "arrow-rS", "Parquet"]);
+/// let patterns = StringArray::from(vec!["ARROW", "rS", "ARROW-rs", "arrow"]);
+///
+/// let result = eq_ignore_ascii_case(&strings, &patterns).unwrap();
+/// assert_eq!(result, BooleanArray::from(vec![true, true, true, false]));
+/// ```
+pub fn eq_ignore_ascii_case(
Review Comment:
I think this location is fine.
The other potential option would be the `eq` module of arrow-ord, but
that doesn't have string stuff, so I think this is actually better.