andygrove commented on code in PR #11586:
URL: https://github.com/apache/datafusion/pull/11586#discussion_r1687090064


##########
datafusion/physical-expr/src/expressions/is_null.rs:
##########
@@ -117,6 +117,21 @@ pub(crate) fn compute_is_null(array: ArrayRef) -> Result<BooleanArray> {
     }
 }
 
+/// workaround <https://github.com/apache/arrow-rs/issues/6017>,
+/// this can be replaced with a direct call to `arrow::compute::is_not_null` once it's fixed.
+pub(crate) fn compute_is_not_null(array: ArrayRef) -> Result<BooleanArray> {
+    if let Some(union_array) = array.as_any().downcast_ref::<UnionArray>() {
+        let is_null = if let Some(offsets) = union_array.offsets() {
+            dense_union_is_null(union_array, offsets)?
+        } else {
+            sparse_union_is_null(union_array)?
+        };
+        compute::not(&is_null).map_err(Into::into)
+    } else {
+        compute::is_not_null(array.as_ref()).map_err(Into::into)

Review Comment:
   > This goes faster because it calls a single kernel (`compute::is_not_null`) rather than 2 (`is_null` and `not`)?
   
   Yes, exactly. It avoids creating an interim vector that is then discarded.
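
To illustrate the difference, here is a minimal sketch using only the standard library. It is a toy model, not the arrow-rs kernels: a column is assumed to be a slice of `Option<i32>` with `None` standing in for a null slot, and all four function names are hypothetical stand-ins for `compute::is_null`, `compute::not`, and `compute::is_not_null`.

```rust
// Toy model, not the arrow-rs implementation: a column is a slice of
// Option<i32>, where None stands for a null slot.

/// First kernel of the two-step path: true where the slot is null.
fn is_null(values: &[Option<i32>]) -> Vec<bool> {
    values.iter().map(|v| v.is_none()).collect()
}

/// Second kernel of the two-step path: element-wise negation.
fn not(mask: &[bool]) -> Vec<bool> {
    mask.iter().map(|b| !b).collect()
}

/// Two kernels: the `is_null` mask is allocated, read once, then dropped.
fn is_not_null_two_step(values: &[Option<i32>]) -> Vec<bool> {
    let interim = is_null(values);
    not(&interim)
}

/// One kernel: the result is built directly, with no interim mask.
fn is_not_null_direct(values: &[Option<i32>]) -> Vec<bool> {
    values.iter().map(|v| v.is_some()).collect()
}
```

Both paths return the same mask; the direct one just skips one allocation and one extra pass over the data.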
    
   > Could we add some basic tests for union? Perhaps following the model in #11321?
   
   We already have at least one test for `IS NOT NULL` on union, which was added in https://github.com/apache/datafusion/pull/11321.
   
   There is no functional change for union in this PR. The union branch of `compute_is_not_null` is copied from the `compute_is_null` function with a call to `not` added, so it does the same thing as before; only the control flow changed a little.
   
   Union is the only case that this PR does not optimize, because I didn't want to touch the temporary workaround that is in place.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

