Jefffrey commented on code in PR #4840: URL: https://github.com/apache/arrow-datafusion/pull/4840#discussion_r1063978094
##########
datafusion/expr/src/utils.rs:
##########
@@ -150,13 +150,23 @@ pub fn expand_wildcard(schema: &DFSchema, plan: &LogicalPlan) -> Result<Vec<Expr
     let using_columns = plan.using_columns()?;
     let columns_to_skip = using_columns
         .into_iter()
-        // For each USING JOIN condition, only expand to one column in projection
+        // For each USING JOIN condition, only expand to one of each join column in projection
         .flat_map(|cols| {
             let mut cols = cols.into_iter().collect::<Vec<_>>();
             // sort join columns to make sure we consistently keep the same
             // qualified column
             cols.sort();
-            cols.into_iter().skip(1)
+            let mut out_column_names: HashSet<String> = HashSet::new();
+            cols.into_iter()
+                .filter_map(|c| {
+                    if out_column_names.contains(&c.name) {
+                        Some(c)
+                    } else {
+                        out_column_names.insert(c.name);
+                        None
+                    }
+                })
+                .collect::<Vec<_>>()

Review Comment:
   The main fix is here. Instead of skipping only the first column (which assumes a USING join on a single column), this tracks which column names have already been kept and skips every later column with the same name, so only one set of the join columns is output.
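
To illustrate the behavior described in the comment, here is a minimal standalone sketch. It is not the DataFusion code: the `Column` struct is a simplified, hypothetical stand-in for DataFusion's type, and `col` and `columns_to_skip` are helper names invented for this example. It assumes each USING join contributes one set of qualified columns, as in the hunk above.

```rust
use std::collections::HashSet;

// Hypothetical, simplified stand-in for DataFusion's `Column`.
// Field order matters: the derived `Ord` compares `relation` first, then `name`.
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct Column {
    relation: Option<String>,
    name: String,
}

// Hypothetical helper to build a qualified column.
fn col(relation: &str, name: &str) -> Column {
    Column {
        relation: Some(relation.to_string()),
        name: name.to_string(),
    }
}

// For each USING join, return every qualified column except the first one
// (in sorted order) seen for each distinct column name.
fn columns_to_skip(using_columns: Vec<HashSet<Column>>) -> Vec<Column> {
    using_columns
        .into_iter()
        .flat_map(|cols| {
            let mut cols = cols.into_iter().collect::<Vec<_>>();
            // Sort so the same qualified column is kept on every run.
            cols.sort();
            let mut seen: HashSet<String> = HashSet::new();
            cols.into_iter()
                // `insert` returns false when the name was already present,
                // so only repeated names pass the filter and get skipped.
                .filter(|c| !seen.insert(c.name.clone()))
                .collect::<Vec<_>>()
        })
        .collect()
}
```

For a join such as `t1 JOIN t2 USING (id, name)`, the sorted columns are `t1.id`, `t1.name`, `t2.id`, `t2.name`. The sketch skips `t2.id` and `t2.name`, whereas `skip(1)` would also have skipped `t1.name` and left no `name` column in the expanded wildcard.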