Dandandan commented on code in PR #8951:
URL: https://github.com/apache/arrow-rs/pull/8951#discussion_r2701042297


##########
arrow-select/src/coalesce.rs:
##########
@@ -238,10 +245,112 @@ impl BatchCoalescer {
         batch: RecordBatch,
         filter: &BooleanArray,
     ) -> Result<(), ArrowError> {
-        // TODO: optimize this to avoid materializing (copying the results
-        // of filter to a new batch)
-        let filtered_batch = filter_record_batch(&batch, filter)?;
-        self.push_batch(filtered_batch)
+        // We only support primitve now, fallback to filter_record_batch for 
other types
+        // Also, skip optimization when filter is not very selective
+        if batch
+            .schema()
+            .fields()
+            .iter()
+            .any(|field| !field.data_type().is_primitive())

Review Comment:
   Because of the all-or-nothing check it has no effect on tpch/clickbench 
performance yet. I am not sure if we can somehow use the conventional filter 
for non-supported arrays 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to