beliefer commented on code in PR #44184: URL: https://github.com/apache/spark/pull/44184#discussion_r1419842677
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Mode.scala:
##########
@@ -18,22 +18,21 @@
 package org.apache.spark.sql.catalyst.expressions.aggregate
 import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
-import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{DataTypeMismatch, TypeCheckSuccess}
-import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionDescription, ImplicitCastInputTypes, Literal}
-import org.apache.spark.sql.catalyst.trees.{BinaryLike, UnaryLike}
+import org.apache.spark.sql.catalyst.analysis.{ExpressionBuilder, UnresolvedWithinGroup}
+import org.apache.spark.sql.catalyst.expressions.{Ascending, Descending, Expression, ExpressionDescription, ImplicitCastInputTypes, SortOrder}
+import org.apache.spark.sql.catalyst.trees.UnaryLike
 import org.apache.spark.sql.catalyst.types.PhysicalDataType
 import org.apache.spark.sql.catalyst.util.GenericArrayData
-import org.apache.spark.sql.catalyst.util.TypeUtils.toSQLExpr
-import org.apache.spark.sql.errors.DataTypeErrors.{toSQLId, toSQLType}
+import org.apache.spark.sql.errors.QueryCompilationErrors
 import org.apache.spark.sql.types.{AbstractDataType, AnyDataType, ArrayType, BooleanType, DataType}
 import org.apache.spark.util.collection.OpenHashMap
 // scalastyle:off line.size.limit
 @ExpressionDescription(
   usage = """
-    _FUNC_(col[, deterministic]) - Returns the most frequent value for the values within `col`. NULL values are ignored. If all the values are NULL, or there are 0 rows, returns NULL.
-      When multiple values have the same greatest frequency then either any of values is returned if `deterministic` is false or is not defined, or the lowest value is returned if `deterministic` is true.""",
+    _FUNC_(col[, reverse]) - Returns the most frequent value for the values within `col`. NULL values are ignored. If all the values are NULL, or there are 0 rows, returns NULL.
+      When multiple values have the same greatest frequency only one value will be returned. The value will be chosen based on optional reverse value. Return the smallest value
+      if reverse is false or the largest value if reverse is true from multiple values with the same frequency. If reverse is not specified the chosen value is not determined.""",

Review Comment:
This is to keep compatibility with the original `deterministicExpr`. The `false` in `mode(col, false)` and the `true` in `mode(col, true)` are both values of `deterministicExpr`. After an offline discussion, @cloud-fan suggested keeping this compatibility temporarily. I don't know whether users really need `deterministicExpr`.
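For reference, the tie-breaking behaviour described in the new usage string can be sketched in plain Scala as below. This is only an illustration of the documented semantics, not the code in `Mode.scala`: `ModeSketch`, its `mode` method, and the use of `Option` to model NULL and an unspecified `reverse` are all hypothetical names and choices made for this example. It also follows the wording of the usage string rather than the `deterministicExpr` compatibility behaviour discussed in the comment.

```scala
// Hypothetical sketch, not Spark's implementation: illustrates the tie-breaking
// rule from the usage string using ordinary Scala collections.
object ModeSketch {
  // values:  input column, with NULL modelled as None (assumption for this sketch)
  // reverse: Some(false) -> smallest tied value, Some(true) -> largest tied value,
  //          None -> any tied value (the choice is not determined)
  def mode[T](values: Seq[Option[T]], reverse: Option[Boolean])(
      implicit ord: Ordering[T]): Option[T] = {
    // NULL values are ignored; count the occurrences of each remaining value.
    val counts: Map[T, Int] =
      values.flatten.groupBy(identity).map { case (v, vs) => v -> vs.size }

    if (counts.isEmpty) {
      // All values are NULL, or there are 0 rows.
      None
    } else {
      val maxCount = counts.values.max
      // Every value that shares the greatest frequency.
      val candidates = counts.collect { case (v, c) if c == maxCount => v }.toSeq
      reverse match {
        case Some(false) => Some(candidates.min)
        case Some(true)  => Some(candidates.max)
        case None        => candidates.headOption // unspecified which tied value
      }
    }
  }
}
```

With the input `1, 1, 2, 2, NULL`, this sketch returns `1` for `reverse = false` and `2` for `reverse = true`; with `reverse` unspecified it may return either.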