houqp opened a new pull request #1023: URL: https://github.com/apache/arrow-datafusion/pull/1023
# Rationale for this change

The current join constraint check is too strict and rejects valid join queries (false positives). The current outer/full/right/left join logic is also incorrect: it generates row values for both of the join columns from only one side of the join.

# What changes are included in this PR?

1. Allow duplicate column names in `check_join_set_is_valid`.
2. Move the column index building logic into `build_join_schema` so it can support columns with duplicate names. This should also yield a minor performance improvement, because the same column index is no longer rebuilt for every partition when executing a join.

# Are there any user-facing changes?

No.
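
For readers unfamiliar with the code, the two changes described above can be illustrated with a small standalone sketch. This is not the DataFusion implementation: `JoinSide`, `ColumnIndex`, `check_join_keys`, `build_output_row`, and the simplified `build_join_schema` signature below are hypothetical stand-ins that use plain string slices instead of Arrow schemas. The sketch assumes that each output column records which side it came from and its position on that side, so a name shared by both inputs stays unambiguous and an unmatched side yields nulls instead of values copied from the other side.

```rust
/// Which input of the join an output column comes from (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinSide {
    Left,
    Right,
}

/// Position of one output column within its source input (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ColumnIndex {
    pub index: usize,
    pub side: JoinSide,
}

/// Checks only that each join key exists on its own side.
/// A column name that appears on both sides is not treated as an error.
pub fn check_join_keys(
    left: &[&str],
    right: &[&str],
    on: &[(&str, &str)],
) -> Result<(), String> {
    for (l, r) in on {
        if !left.contains(l) {
            return Err(format!("left join key `{}` not found", l));
        }
        if !right.contains(r) {
            return Err(format!("right join key `{}` not found", r));
        }
    }
    Ok(())
}

/// Builds the output field names and the column index once, at plan time,
/// so it does not need to be rebuilt for every partition.
pub fn build_join_schema(left: &[&str], right: &[&str]) -> (Vec<String>, Vec<ColumnIndex>) {
    let mut fields = Vec::with_capacity(left.len() + right.len());
    let mut indices = Vec::with_capacity(left.len() + right.len());
    for (index, name) in left.iter().enumerate() {
        fields.push(name.to_string());
        indices.push(ColumnIndex { index, side: JoinSide::Left });
    }
    for (index, name) in right.iter().enumerate() {
        fields.push(name.to_string());
        indices.push(ColumnIndex { index, side: JoinSide::Right });
    }
    (fields, indices)
}

/// Materializes one output row. Each column reads from its own side;
/// a side with no matching row (`None`) produces nulls for its columns.
pub fn build_output_row(
    indices: &[ColumnIndex],
    left_row: Option<&[i64]>,
    right_row: Option<&[i64]>,
) -> Vec<Option<i64>> {
    indices
        .iter()
        .map(|column| {
            let row = match column.side {
                JoinSide::Left => left_row,
                JoinSide::Right => right_row,
            };
            row.map(|values| values[column.index])
        })
        .collect()
}
```

In this sketch, joining two inputs that both have an `id` column produces two distinct `id` output columns, one tagged `Left` and one tagged `Right`. For a left join row with no match on the right, the right-hand `id` comes out as null rather than repeating the left-hand value.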