houqp opened a new pull request #1023:
URL: https://github.com/apache/arrow-datafusion/pull/1023


   # Rationale for this change
   
   The current join constraint check is too strict and rejects some valid 
join queries. The current left/right/full outer join logic is also incorrect: 
it fills both join columns in the output with values taken from only one side 
of the join.
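
To illustrate the second point, here is a minimal sketch of a left join on a single key column that keeps both join columns in the output and fills each from its own side. It is not DataFusion code: the `left_join` function and the tuple-based row representation are assumptions made only for this example.

```rust
// Hypothetical illustration, not DataFusion code: a naive left join on one
// integer key that keeps both join columns in the output.
fn left_join(left: &[i32], right: &[i32]) -> Vec<(Option<i32>, Option<i32>)> {
    let mut out = Vec::new();
    for &l in left {
        let mut matched = false;
        for &r in right {
            if l == r {
                // Matched row: each join column comes from its own side.
                out.push((Some(l), Some(r)));
                matched = true;
            }
        }
        if !matched {
            // Unmatched left row: the right-side join column must be NULL,
            // not a copy of the left value.
            out.push((Some(l), None));
        }
    }
    out
}

fn main() {
    let rows = left_join(&[1, 2], &[2, 3]);
    println!("{:?}", rows);
}
```

Copying the left key into the right column for the unmatched row would yield `(Some(1), Some(1))`, which is the kind of result the rationale describes as incorrect.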
   
   # What changes are included in this PR?
   
   1. Allow duplicate column names in `check_join_set_is_valid`.
   2. Move the column index building logic into `build_join_schema` so it can 
support columns with duplicate names. This should also yield a minor 
performance improvement because the same column index is no longer rebuilt 
for every partition when executing a join.
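
The idea behind the second change can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in this PR: `JoinSide`, `ColumnIndex`, and `build_join_schema_sketch` are hypothetical names, and plain strings stand in for schema fields. The point is that each output column records its source by position and side rather than by name, so duplicate names are unambiguous and the index can be built once alongside the schema.

```rust
// Hypothetical types for illustration only; the real DataFusion definitions
// may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum JoinSide {
    Left,
    Right,
}

#[derive(Debug, Clone, PartialEq)]
struct ColumnIndex {
    index: usize,   // position of the column within its source schema
    side: JoinSide, // which input the column is read from
}

// Build the output field names and the column index in one pass. Columns are
// resolved by (side, position), so a name shared by both inputs is fine.
fn build_join_schema_sketch(left: &[&str], right: &[&str]) -> (Vec<String>, Vec<ColumnIndex>) {
    let mut fields = Vec::new();
    let mut column_indices = Vec::new();
    for (i, name) in left.iter().enumerate() {
        fields.push(name.to_string());
        column_indices.push(ColumnIndex { index: i, side: JoinSide::Left });
    }
    for (i, name) in right.iter().enumerate() {
        fields.push(name.to_string());
        column_indices.push(ColumnIndex { index: i, side: JoinSide::Right });
    }
    (fields, column_indices)
}

fn main() {
    // Both inputs have an `id` column; the index keeps them apart.
    let (fields, column_indices) = build_join_schema_sketch(&["id", "a"], &["id", "b"]);
    for (field, ci) in fields.iter().zip(&column_indices) {
        println!("{} <- {:?}[{}]", field, ci.side, ci.index);
    }
}
```

Because the index depends only on the two input schemas, it can be computed when the join is planned and reused for every partition at execution time.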
   
   # Are there any user-facing changes?
   
   No.
   


