Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21737#discussion_r201164293

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -738,6 +738,10 @@ class Analyzer(
                 if findAliases(aggregateExpressions).intersect(conflictingAttributes).nonEmpty =>
               (oldVersion, oldVersion.copy(aggregateExpressions = newAliases(aggregateExpressions)))

    +        case oldVersion @ FlatMapGroupsInPandas(_, _, output, _)
    +            if AttributeSet(output).intersect(conflictingAttributes).nonEmpty =>
    --- End diff --

    cc @maryannxue

    Deduplicating conflicting attributes in this function is easy to break. This is not the right way to handle it in the long term; we should consider a more fundamental fix.
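
    To illustrate the concern above, here is a toy sketch of per-operator deduplication. It does not use Spark's Catalyst classes: `Attr`, `Plan`, `Relation`, `Aggregate`, `PandasGroupMap`, `dedup` and `conflicts` are all hypothetical stand-ins, and the logic is a simplification, not the Analyzer's actual code. It only shows how an operator without its own case keeps its conflicting attributes.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
object DedupSketch {
  // An attribute is identified by its id, loosely like an expression ID.
  final case class Attr(name: String, id: Long)

  sealed trait Plan { def output: Seq[Attr] }
  final case class Relation(output: Seq[Attr]) extends Plan
  final case class Aggregate(output: Seq[Attr], child: Plan) extends Plan
  final case class PandasGroupMap(output: Seq[Attr], child: Plan) extends Plan

  private var nextId = 100L
  private def fresh(a: Attr): Attr = { nextId += 1; a.copy(id = nextId) }

  // Give new ids to the conflicting attributes only.
  private def renew(out: Seq[Attr], conflicting: Set[Attr]): Seq[Attr] =
    out.map(a => if (conflicting(a)) fresh(a) else a)

  // Per-operator rewrite: every operator that produces attributes needs its own case.
  def dedup(plan: Plan, conflicting: Set[Attr]): Plan = plan match {
    case r @ Relation(out) if out.exists(conflicting) =>
      r.copy(output = renew(out, conflicting))
    case a @ Aggregate(out, _) if out.exists(conflicting) =>
      a.copy(output = renew(out, conflicting))
    // No case for PandasGroupMap, so its conflict passes through unchanged.
    case other => other
  }

  // Attributes produced by both sides of a (self-)join.
  def conflicts(left: Plan, right: Plan): Set[Attr] =
    left.output.toSet.intersect(right.output.toSet)

  def main(args: Array[String]): Unit = {
    val k = Attr("k", 1L)
    val agg = Aggregate(Seq(k), Relation(Seq(k)))
    val pandas = PandasGroupMap(Seq(k), Relation(Seq(k)))

    // Handled operator: the right side gets a fresh id, so no conflict remains.
    println(conflicts(agg, dedup(agg, Set(k))).isEmpty)
    // Unhandled operator: the right side is returned as-is and still conflicts.
    println(conflicts(pandas, dedup(pandas, Set(k))).nonEmpty)
  }
}
```

    In this sketch, adding a case for `PandasGroupMap` fixes that one operator, but the next operator introduced without a case would fail in the same way, which is why a case-by-case fix is easy to break.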