Jefffrey commented on code in PR #4840:
URL: https://github.com/apache/arrow-datafusion/pull/4840#discussion_r1063978094


##########
datafusion/expr/src/utils.rs:
##########
@@ -150,13 +150,23 @@ pub fn expand_wildcard(schema: &DFSchema, plan: &LogicalPlan) -> Result<Vec<Expr
     let using_columns = plan.using_columns()?;
     let columns_to_skip = using_columns
         .into_iter()
-        // For each USING JOIN condition, only expand to one column in projection
+        // For each USING JOIN condition, only expand to one of each join column in projection
         .flat_map(|cols| {
             let mut cols = cols.into_iter().collect::<Vec<_>>();
             // sort join columns to make sure we consistently keep the same
             // qualified column
             cols.sort();
-            cols.into_iter().skip(1)
+            let mut out_column_names: HashSet<String> = HashSet::new();
+            cols.into_iter()
+                .filter_map(|c| {
+                    if out_column_names.contains(&c.name) {
+                        Some(c)
+                    } else {
+                        out_column_names.insert(c.name);
+                        None
+                    }
+                })
+                .collect::<Vec<_>>()

Review Comment:
   The main fix is here. The old `.skip(1)` dropped only the first sorted
   column from the skip list, which assumes a USING join on a single column.
   The new code tracks which column names have already been kept and skips
   every later column with the same name, so exactly one set of the join
   columns is output.
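
   To illustrate the difference, here is a minimal standalone sketch of the
   deduplication step. It is not the DataFusion code: the `Column` struct is
   a simplified, hypothetical stand-in for DataFusion's type, and the helper
   names `col` and `columns_to_skip` are made up for this example. It assumes
   columns sort by qualifier first and then by name.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for DataFusion's `Column`; only the fields used here.
// Derived `Ord` compares `relation` first, then `name`.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct Column {
    relation: Option<String>,
    name: String,
}

// Illustrative helper for building a qualified column.
fn col(relation: &str, name: &str) -> Column {
    Column {
        relation: Some(relation.to_string()),
        name: name.to_string(),
    }
}

// For each USING join, keep the first column seen for every distinct name
// and return all later columns with an already-seen name as "to skip".
fn columns_to_skip(using_columns: Vec<HashSet<Column>>) -> Vec<Column> {
    using_columns
        .into_iter()
        .flat_map(|cols| {
            let mut cols = cols.into_iter().collect::<Vec<_>>();
            // Sort so the same qualified column is kept on every run.
            cols.sort();
            let mut seen: HashSet<String> = HashSet::new();
            cols.into_iter()
                // `insert` returns false when the name was already present,
                // so this keeps only the duplicates.
                .filter(|c| !seen.insert(c.name.clone()))
                .collect::<Vec<_>>()
        })
        .collect()
}
```

   For a join such as `t1 JOIN t2 USING (a, b)`, the sorted columns are
   t1.a, t1.b, t2.a, t2.b. A plain `.skip(1)` would mark t1.b, t2.a and t2.b
   as skipped, losing `b` from the output entirely, whereas the sketch above
   marks only t2.a and t2.b.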



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
