michalursa opened a new pull request, #13686:
URL: https://github.com/apache/arrow/pull/13686

   The hash join implementation in the HashJoinBasicImpl class was missing 
initialization when there were no batches on the build side.
   Initialization of a few data structures, mainly the two RowEncoder instances 
that hold the key and payload columns for build-side rows, was missing from 
BuildHashTable_exec_task, the method responsible for transforming the batches 
accumulated on the build side of the hash join into a hash table.
   
   The initialization of RowEncoder inserts a single special row containing 
null values for all columns. This special row is accessed when outputting 
probe-side rows that have no matches in a left outer or full outer join (in 
that case these joins are supposed to output nulls in place of all fields that 
would come from the build side).
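
   To illustrate why the missing null row matters, here is a minimal sketch 
of the idea. It is not Arrow's RowEncoder: ToyRow, ToyRowStore, kNullRowId and 
ToyLeftOuterOutput are hypothetical names made up for this example, and the 
assumption that the null row sits at index 0 is mine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an encoded row: one nullable value per column.
using ToyRow = std::vector<std::optional<int64_t>>;

// Hypothetical stand-in for a row store; NOT Arrow's RowEncoder.
class ToyRowStore {
 public:
  // Assumption for this sketch: the all-null row lives at index 0.
  static constexpr std::size_t kNullRowId = 0;

  // Reserve the special all-null row before any build-side row is added.
  void Init(std::size_t num_cols) {
    rows_.clear();
    rows_.push_back(ToyRow(num_cols, std::nullopt));
  }

  // Append a build-side row and return its id.
  std::size_t Append(ToyRow row) {
    assert(!rows_.empty() && "Init() must run before Append()");
    rows_.push_back(std::move(row));
    return rows_.size() - 1;
  }

  // Throws std::out_of_range if Init() never ran and id 0 is requested.
  const ToyRow& Get(std::size_t id) const { return rows_.at(id); }

 private:
  std::vector<ToyRow> rows_;
};

// Left outer join output for one probe row: an unmatched probe row is
// paired with the null row, so all build-side fields come out as nulls.
ToyRow ToyLeftOuterOutput(const ToyRow& probe, const ToyRowStore& build,
                          std::optional<std::size_t> match) {
  ToyRow out = probe;
  const ToyRow& build_row = build.Get(match.value_or(ToyRowStore::kNullRowId));
  out.insert(out.end(), build_row.begin(), build_row.end());
  return out;
}
```

   In this sketch, skipping Init() leaves the store empty, so looking up the 
null row for an unmatched probe row fails. That is the toy analogue of the 
situation described above.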
   
   Interestingly, the initialization was present in a similar case, where 
batches were present on the build side but all of them contained zero rows. I 
modified the code to use the same code path for these two logically equivalent 
cases: a) zero build-side batches and b) one or more batches with zero rows each.
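
   The shape of the change can be sketched roughly as follows. This is a 
simplified illustration under my own assumptions, not the actual code in 
BuildHashTable_exec_task: ToyBatch, ToyBuildState and ToyBuildHashTable are 
hypothetical names, and the real method does considerably more work.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical build-side batch: only its row count matters here.
struct ToyBatch {
  std::size_t num_rows;
};

// Hypothetical build state; stored_rows counts the null row as well.
struct ToyBuildState {
  bool initialized = false;
  std::size_t stored_rows = 0;

  // Stand-in for the encoder initialization that inserts the null row.
  void Init() {
    initialized = true;
    stored_rows = 1;
  }
};

// Initialize unconditionally, then process whatever batches exist.
// An empty batch list and a list of empty batches take the same path.
ToyBuildState ToyBuildHashTable(const std::vector<ToyBatch>& batches) {
  ToyBuildState state;
  state.Init();  // runs even when `batches` is empty
  for (const ToyBatch& batch : batches) {
    if (batch.num_rows == 0) continue;  // nothing to add for an empty batch
    state.stored_rows += batch.num_rows;
  }
  return state;
}
```

   With the initialization placed ahead of the loop rather than inside a 
branch that requires at least one batch, cases a) and b) end in the same 
state by construction.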

