cloud-fan commented on a change in pull request #21599: [SPARK-26218][SQL] Overflow on arithmetic operations returns incorrect result
URL: https://github.com/apache/spark/pull/21599#discussion_r308722618
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
##########
@@ -89,5 +92,11 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast
     )
   }
-  override lazy val evaluateExpression: Expression = sum
+  override lazy val evaluateExpression: Expression = {
+    if (sumDataType == resultType) {
+      sum
+    } else {
+      Cast(sum, resultType)

Review comment:
After some more thought, I think we should still use long as the buffer type when summing long values, at least by default. Adding long values is faster than adding decimal values, and we shouldn't introduce this performance regression silently. We can add an option to use decimal as the buffer type, which reduces the chance of overflow, but it should be opt-in.
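
To make the trade-off in the comment above concrete, here is a minimal sketch in plain Scala. It is not Spark code: `SumBufferSketch`, `sumWithLongBuffer`, and `sumWithDecimalBuffer` are hypothetical names chosen for illustration, and `scala.math.BigDecimal` stands in for Spark's decimal type as an assumption. It only shows that a long accumulator wraps around silently on overflow, while a decimal accumulator keeps the exact value at the cost of slower additions.

```scala
object SumBufferSketch {
  // Long buffer: cheap primitive additions, but the result wraps around
  // silently when the true sum exceeds Long.MaxValue.
  def sumWithLongBuffer(values: Seq[Long]): Long =
    values.foldLeft(0L)(_ + _)

  // Decimal buffer (BigDecimal used here as a stand-in, an assumption):
  // each addition allocates and is slower, but this sum does not wrap.
  def sumWithDecimalBuffer(values: Seq[Long]): BigDecimal =
    values.foldLeft(BigDecimal(0))((acc, v) => acc + BigDecimal(v))

  def main(args: Array[String]): Unit = {
    val values = Seq(Long.MaxValue, 1L)
    println(s"long buffer:    ${sumWithLongBuffer(values)}")
    println(s"decimal buffer: ${sumWithDecimalBuffer(values)}")
  }
}
```

With the input `Seq(Long.MaxValue, 1L)`, the long buffer yields `Long.MinValue`, whereas the decimal buffer yields 9223372036854775808. This is the overflow that an opt-in decimal buffer would make less likely.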