alamb opened a new issue #1069: URL: https://github.com/apache/arrow-rs/issues/1069
**Describe the bug**

Characters such as `[` and `.` are sometimes treated as regular expression metacharacters rather than as literals in SQL-style `LIKE` patterns.

The arrow regular expression kernels such as `like_utf8` https://github.com/apache/arrow-rs/blob/master/arrow/src/compute/kernels/comparison.rs#L311-L323 take limited SQL-style string matching patterns (e.g. `%`). Under the covers, however, a regular expression matching library is used, and special regular expression characters in the pattern are not escaped.

@ovr added code to handle `(` and `)` in https://github.com/apache/arrow-rs/pull/1042, but there are other special characters as well.

**To Reproduce**

```rust
use arrow::array::{BooleanArray, StringArray};

let array: StringArray = vec!["foo", "bar", "baz"]
    .into_iter()
    .map(Some)
    .collect();

// In a LIKE pattern only `%` is a wildcard here; `.` and `*` are literals.
let comparison = arrow::compute::like_utf8_scalar(&array, "foo%.*").unwrap();

// None of the values contain a literal ".*", so nothing should match.
let expected: BooleanArray = vec![false, false, false]
    .into_iter()
    .map(Some)
    .collect();

assert_eq!(comparison, expected);
```

**Expected behavior**

This test should pass; it matches what Postgres produces:

```sql
alamb=# select * from foo;
  x
-----
 foo
 bar
 baz
(3 rows)

alamb=# select x, x like 'foo%.*' from foo;
  x  | ?column?
-----+----------
 foo | f
 bar | f
 baz | f
(3 rows)
```

**Additional context**

Follow-on to https://github.com/apache/arrow-rs/pull/1042, where @ovr fixed the parenthesis issue.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
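
To illustrate the escaping described in the issue, here is a minimal sketch of one possible way to translate a `LIKE` pattern into an anchored regular expression string. The function `like_to_regex` is hypothetical and is not the arrow-rs implementation. The set of escaped characters is an assumption meant to cover common regex metacharacters, and `LIKE` escape sequences such as `\%` are not handled.

```rust
/// Hypothetical helper (not part of arrow-rs): translate a SQL LIKE pattern
/// into an anchored regular expression string.
/// `%` becomes `.*`, `_` becomes `.`, and regex metacharacters are escaped
/// so that they match themselves literally.
fn like_to_regex(pattern: &str) -> String {
    let mut out = String::with_capacity(pattern.len() + 2);
    out.push('^');
    for c in pattern.chars() {
        match c {
            // SQL wildcards
            '%' => out.push_str(".*"),
            '_' => out.push('.'),
            // Assumed set of regex metacharacters that need a backslash
            '\\' | '.' | '+' | '*' | '?' | '(' | ')' | '|' | '[' | ']' | '{' | '}' | '^'
            | '$' => {
                out.push('\\');
                out.push(c);
            }
            // Everything else is copied through unchanged
            _ => out.push(c),
        }
    }
    out.push('$');
    out
}

fn main() {
    // The pattern from the reproduction: only `%` is a wildcard.
    println!("{}", like_to_regex("foo%.*")); // prints ^foo.*\.\*$
}
```

With this translation, the pattern `foo%.*` only matches strings that start with `foo` and end with a literal `.*`, so `foo`, `bar`, and `baz` would all evaluate to false, as in the Postgres output.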
