dongjoon-hyun commented on a change in pull request #28876: URL: https://github.com/apache/spark/pull/28876#discussion_r443291919
########## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/AggUtils.scala ########## @@ -144,11 +145,16 @@ object AggUtils { // [COUNT(DISTINCT foo), MAX(DISTINCT foo)], but [COUNT(DISTINCT bar), COUNT(DISTINCT foo)] is // disallowed because those two distinct aggregates have different column expressions. val distinctExpressions = functionsWithDistinct.head.aggregateFunction.children - val namedDistinctExpressions = distinctExpressions.map { - case ne: NamedExpression => ne - case other => Alias(other, other.toString)() + val normalizedNamedDistinctExpressions = distinctExpressions.map { e => + // Ideally this should be done in `NormalizeFloatingNumbers`, but we do it here because + // `groupingExpressions` is not extracted during logical phase. + NormalizeFloatingNumbers.normalize(e) match { + case ne: NamedExpression => ne + case other => Alias(other, other.toString)() Review comment: If we broaden the scope, `SparkStrategies` already is looking at the detail of `functionsWithDistinct` like the following. ```scala val (functionsWithDistinct, functionsWithoutDistinct) = aggregateExpressions.partition(_.isDistinct) if (functionsWithDistinct.map(_.aggregateFunction.children.toSet).distinct.length > 1) { // This is a sanity check. We should not reach here when we have multiple distinct // column sets. Our `RewriteDistinctAggregates` should take care this case. sys.error("You hit a query analyzer bug. Please report your query to " + "Spark user mailing list.") } ``` And the very next line is the same logic block for `groupingExpression`. ```scala // Ideally this should be done in `NormalizeFloatingNumbers`, but we do it here because // `groupingExpressions` is not extracted during logical phase. val normalizedGroupingExpressions = groupingExpressions.map { e => NormalizeFloatingNumbers.normalize(e) match { case n: NamedExpression => n case other => Alias(other, e.name)(exprId = e.exprId) } } ``` Given the above, I guess what you concerned is only one line source code, `val distinctExpressions = functionsWithDistinct.head.aggregateFunction.children`. And, the comment is about the definition of `functionsWithDistinct` which is generated from the above. The comment is not a detail hidden from `SparkStrategies`. For me, it seems to be an assumption given by `SparkStrategies`. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org