[ https://issues.apache.org/jira/browse/SPARK-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584448#comment-14584448 ]
Holman Lan commented on SPARK-5680:
-----------------------------------

Thanks. On a closer look at the statement in the description, I realized that it differs from our test cases. Our test cases are:

    select sum(c2) from sum_test
    select c1, sum(c2) from sum_test group by c1

where c1 is an int column with non-NULL values and c2 is an int column with all NULL values.

Spark 1.3.0 and 1.3.1 return NULL for sum(c2), whereas Spark 1.4.0 returns 0. Hive, Impala and SQL Server return NULL in both cases.

For the statement "select sum('a') from src", Hive does indeed return 0; my bad.

The title of this JIRA caught my attention. Could the change in behavior for the sum of all NULL values be related to the changes made for this JIRA?

> Sum function on all null values, should return zero
> ---------------------------------------------------
>
>                 Key: SPARK-5680
>                 URL: https://issues.apache.org/jira/browse/SPARK-5680
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Venkata Ramana G
>            Assignee: Venkata Ramana G
>            Priority: Minor
>             Fix For: 1.3.1, 1.4.0
>
>
> SELECT sum('a'), avg('a'), variance('a'), std('a') FROM src;
> Current output:
> NULL NULL NULL NULL
> Expected output:
> 0.0 NULL NULL NULL
> This fixes hive udaf_number_format.q
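
For anyone who wants to reproduce the shape of the two test cases from the comment without a cluster, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in engine; it is not Spark, Hive, Impala or SQL Server, and it only illustrates the "NULL for all-NULL input" behavior that the comment reports for those engines. The row values and the helper name run_sum_tests are made up for illustration; only the table name, column names and the two queries come from the comment.

```python
import sqlite3


def run_sum_tests():
    """Run the two sum(c2) test queries against an in-memory SQLite table.

    Assumption: SQLite stands in for the engines named in the comment.
    """
    conn = sqlite3.connect(":memory:")
    # c1: int column with non-NULL values; c2: int column with all NULL values.
    conn.execute("CREATE TABLE sum_test (c1 INTEGER NOT NULL, c2 INTEGER)")
    conn.executemany(
        "INSERT INTO sum_test (c1, c2) VALUES (?, ?)",
        [(1, None), (1, None), (2, None)],  # illustrative rows only
    )

    # Test case 1: global aggregate over an all-NULL column.
    total = conn.execute("SELECT sum(c2) FROM sum_test").fetchone()[0]

    # Test case 2: grouped aggregate over an all-NULL column.
    grouped = conn.execute(
        "SELECT c1, sum(c2) FROM sum_test GROUP BY c1 ORDER BY c1"
    ).fetchall()

    conn.close()
    return total, grouped


if __name__ == "__main__":
    total, grouped = run_sum_tests()
    print(total)    # None, i.e. SQL NULL
    print(grouped)  # [(1, None), (2, None)]
```

In this stand-in, both queries yield NULL (None in Python) for sum(c2), which matches what the comment reports for Spark 1.3.0 and 1.3.1; the comment reports that Spark 1.4.0 returns 0 for the same queries.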