skambha commented on a change in pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
URL: https://github.com/apache/spark/pull/27627#discussion_r380941152
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
##########
@@ -60,38 +60,104 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast
   private lazy val sumDataType = resultType
   private lazy val sum = AttributeReference("sum", sumDataType)()
+  private lazy val overflow = AttributeReference("overflow", BooleanType, false)()
   private lazy val zero = Literal.default(resultType)
-  override lazy val aggBufferAttributes = sum :: Nil
+  override lazy val aggBufferAttributes = sum :: overflow :: Nil
   override lazy val initialValues: Seq[Expression] = Seq(
-    /* sum = */ Literal.create(null, sumDataType)
+    /* sum = */ Literal.create(null, sumDataType),
+    /* overflow = */ Literal.create(false, BooleanType)

Review comment:
We track overflow with an additional aggregation buffer attribute, overflow, which records whether any of the intermediate add operations in updateExpressions or mergeExpressions overflowed. If overflow is true and the spark.sql.ansi.enabled flag is false, evaluateExpression returns null for the sum.
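To make the buffer layout concrete, here is a minimal standalone sketch of the idea in the review comment. It is plain Scala, not the Catalyst expressions from Sum.scala: SumBuffer, DecimalSumSketch, maxPrecision, update, merge, and evaluate are hypothetical names chosen for illustration, and the precision bound is deliberately tiny so that overflow is easy to trigger. The comment only says what happens when the flag is set and ANSI mode is off; throwing in ANSI mode is an assumption of this sketch.

```scala
// Hypothetical stand-in for the two-field aggregation buffer (sum, overflow).
// This is an illustration only, not the code in Sum.scala.
final case class SumBuffer(sum: Option[BigDecimal], overflow: Boolean)

object DecimalSumSketch {
  // Assumed result type: a decimal with at most 5 digits (illustrative value).
  val maxPrecision: Int = 5
  private val bound: BigDecimal = BigDecimal(10).pow(maxPrecision)

  // Mirrors initialValues in the diff: sum starts as null, overflow as false.
  val initial: SumBuffer = SumBuffer(sum = None, overflow = false)

  private def fits(v: BigDecimal): Boolean = v.abs < bound

  // Adds two nullable partial sums; the Boolean reports whether the add overflowed.
  private def add(a: Option[BigDecimal], b: Option[BigDecimal]): (Option[BigDecimal], Boolean) =
    (a, b) match {
      case (Some(x), Some(y)) =>
        val s = x + y
        if (fits(s)) (Some(s), false) else (None, true)
      case (x, None) => (x, false) // null input leaves the sum unchanged
      case (None, y) => (y, false) // first non-null value seen
    }

  // Analogue of updateExpressions: fold one input row into the buffer.
  // The flag is sticky: once set, later rows cannot clear it.
  def update(buf: SumBuffer, input: Option[BigDecimal]): SumBuffer =
    if (buf.overflow) buf
    else {
      val (s, o) = add(buf.sum, input)
      SumBuffer(s, o)
    }

  // Analogue of mergeExpressions: combine two partial buffers.
  def merge(left: SumBuffer, right: SumBuffer): SumBuffer =
    if (left.overflow || right.overflow) SumBuffer(None, overflow = true)
    else {
      val (s, o) = add(left.sum, right.sum)
      SumBuffer(s, o)
    }

  // Analogue of evaluateExpression: null (None) on overflow when ANSI mode is off.
  // Assumption: with ANSI mode on, this sketch throws instead.
  def evaluate(buf: SumBuffer, ansiEnabled: Boolean): Option[BigDecimal] =
    if (buf.overflow) {
      if (ansiEnabled) throw new ArithmeticException("decimal sum overflow")
      else None
    } else buf.sum
}
```

Because sum already starts as null, a null sum on its own cannot distinguish "no non-null input rows" from "an intermediate add overflowed"; the separate Boolean carries that information through update and merge so that the final evaluation can decide what to return.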