jayzhan211 commented on issue #6973: URL: https://github.com/apache/arrow-datafusion/issues/6973#issuecomment-1638150991
I think we can have both.

For the Postgres version, which is the current one, my understanding of `array_contains` (`array_has_all`) is that we have `array_has_all(list, sub-list)` and iterate over the sub-list to check whether each element exists in the list. `any` is similar to bitwise OR and `all` is similar to bitwise AND. `array_has` is `array_has_all` but takes an element instead of a sub-list. In this case, it does not make sense to me to support a nested list (>1d) for the sub-list, since little changes if we limit the sub-list to a 1d array. For example, `array_has_all(list, [[[1,2,2,3],[3,4,4,4]]])` is the same as `array_has_all(list, [1,2,3,4])`, and the latter is much simpler.

The ClickHouse version, to me, is more like set operations and really does what `contains` does. We can have `array_has_all(list, sub-list)` where the sub-list should accept a nested array, and we check whether the list contains the sub-list. In the above example, we should check whether `[[[1,2,2,3],[3,4,4,4]]]` exists in some part of the list as a 3d array.

To combine these two, `array_has`/`array_contains` is the basic function with two arguments (list: nested_array, sub-list: nested_array with dimension <= list). `array_has_all` is the extended version, equivalent to `array_has(list, sub-list[0]) && array_has(list, sub-list[1]) .. && array_has(list, sub-list[n])`, i.e. the bitwise AND version. Similarly, `array_has_any` is the bitwise OR version. The arguments of `array_has_all` and `array_has_any` are (list: nested_array, sub-list: Vec<nested_array>).
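To make the combined proposal concrete, here is a minimal Rust sketch of `array_has`, `array_has_all` and `array_has_any` over a toy nested-array type. It is not DataFusion code: the `Nested` enum and the `ints` helper are hypothetical names made up for illustration, and the matching rule (the needle matches if it is structurally equal to the list itself or to any element at any nesting depth) is only one possible reading of "the list contains the sub-list".

```rust
/// Toy stand-in for a nested array (hypothetical, not an Arrow type).
#[derive(Debug, Clone, PartialEq)]
enum Nested {
    Scalar(i64),
    List(Vec<Nested>),
}

/// Helper (hypothetical): build a 1d list from a slice of integers.
fn ints(values: &[i64]) -> Nested {
    Nested::List(values.iter().copied().map(Nested::Scalar).collect())
}

/// Basic function: does `needle` occur in `list`?
/// Assumed rule: true if `needle` equals `list` itself or any
/// element reachable at any nesting depth.
fn array_has(list: &Nested, needle: &Nested) -> bool {
    if list == needle {
        return true;
    }
    match list {
        Nested::List(items) => items.iter().any(|item| array_has(item, needle)),
        Nested::Scalar(_) => false,
    }
}

/// AND version: every needle must be found in `list`.
fn array_has_all(list: &Nested, needles: &[Nested]) -> bool {
    needles.iter().all(|needle| array_has(list, needle))
}

/// OR version: at least one needle must be found in `list`.
fn array_has_any(list: &Nested, needles: &[Nested]) -> bool {
    needles.iter().any(|needle| array_has(list, needle))
}
```

Under these assumptions, with `list = [[1,2,2,3],[3,4,4,4]]`, `array_has(list, [3,4,4,4])` holds because that row is an element of the list, while `array_has(list, [1,2,3,4])` does not, since no single row equals `[1,2,3,4]`. This is where the sketch differs from the Postgres-style element-wise check.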
