Dandandan commented on a change in pull request #55:
URL: https://github.com/apache/arrow-datafusion/pull/55#discussion_r619873424



##########
File path: datafusion/src/physical_plan/hash_join.rs
##########
@@ -891,6 +898,36 @@ impl Stream for HashJoinStream {
                     }
                     Some(result)
                 }
+                // If maybe_batch is None and num_output_rows is 0, that means right side batch was
+                // empty and has been coalesced to None. Fill right side with Null if preserve_left
+                // is true.
+                None if self.preserve_left && self.num_output_rows == 0 => {

Review comment:
   I think this partially resolves a more general issue with the left join: it doesn't keep track of unmatched left rows across batches (https://issues.apache.org/jira/browse/ARROW-10971).
   Maybe we can add a TODO here, or open an issue, noting that we should generalize this to also produce the left rows that were never matched.
   This looks like a great start for that 👍
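
   To make the generalization concrete, here is a minimal sketch of tracking unmatched left rows across right-side batches. It is not DataFusion code: `LeftMatchTracker`, `probe_batch`, and `unmatched_left_rows` are hypothetical names, and plain `i64` keys and row indices stand in for Arrow arrays and record batches. The idea is to keep one "visited" flag per left row, set it whenever a probe matches, and emit the unflagged rows (padded with nulls on the right) once the right side is exhausted.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of DataFusion): remembers which left rows
/// have matched at least one right row in any batch probed so far.
struct LeftMatchTracker {
    /// Join key -> indices of the left rows holding that key.
    left_index: HashMap<i64, Vec<usize>>,
    /// visited[i] becomes true once left row i has matched at least once.
    visited: Vec<bool>,
}

impl LeftMatchTracker {
    /// Build the hash index over the (fully collected) left side.
    fn new(left_keys: &[i64]) -> Self {
        let mut left_index: HashMap<i64, Vec<usize>> = HashMap::new();
        for (row, key) in left_keys.iter().enumerate() {
            left_index.entry(*key).or_default().push(row);
        }
        Self {
            left_index,
            visited: vec![false; left_keys.len()],
        }
    }

    /// Probe one right batch. Returns the matched (left_row, right_row)
    /// pairs and marks every matched left row as visited.
    fn probe_batch(&mut self, right_keys: &[i64]) -> Vec<(usize, usize)> {
        let mut pairs = Vec::new();
        for (right_row, key) in right_keys.iter().enumerate() {
            if let Some(left_rows) = self.left_index.get(key) {
                for &left_row in left_rows {
                    self.visited[left_row] = true;
                    pairs.push((left_row, right_row));
                }
            }
        }
        pairs
    }

    /// Call once after the right side is exhausted: these are the left rows
    /// a left join would still have to emit with nulls on the right.
    fn unmatched_left_rows(&self) -> Vec<usize> {
        self.visited
            .iter()
            .enumerate()
            .filter(|(_, seen)| !**seen)
            .map(|(row, _)| row)
            .collect()
    }
}
```

   With this shape, an empty right batch is no longer a special case: it marks nothing, and every left row still shows up in the final unmatched set.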




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
