[
https://issues.apache.org/jira/browse/SPARK-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008629#comment-14008629
]
Cheng Lian commented on SPARK-1914:
---
Added steps to reproduce this bug in {{sbt hive/console}}.
Simplify CountFunction not to traverse to evaluate all child expressions.
-
Key: SPARK-1914
URL: https://issues.apache.org/jira/browse/SPARK-1914
Project: Spark
Issue Type: Bug
Components: SQL
Reporter: Takuya Ueshin
{{CountFunction}} should count up only if the child's evaluated value is not
null.
Because it traverses and evaluates all of the child's descendant expressions,
it counts up whenever any one of them is not null, even if the child itself
evaluates to null.
To reproduce this bug in {{sbt hive/console}}:
{code}
scala> hql("SELECT COUNT(*) FROM src1").collect()
res1: Array[org.apache.spark.sql.Row] = Array([25])
scala> hql("SELECT COUNT(*) FROM src1 WHERE key IS NULL").collect()
res2: Array[org.apache.spark.sql.Row] = Array([10])
scala> hql("SELECT COUNT(key + 1) FROM src1").collect()
res3: Array[org.apache.spark.sql.Row] = Array([25])
{code}
{{res3}} should be 15 since there are 10 null keys.
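As a rough illustration of the behavior described above, the following self-contained Scala sketch contrasts the two counting strategies on a toy expression tree. It is only a model under stated assumptions: {{Expr}}, {{Attr}}, {{Lit}}, {{Add}}, {{buggyCount}}, {{fixedCount}}, and the five-row data set are hypothetical names and data invented for this example, not Spark's Catalyst classes or the contents of {{src1}}.

```scala
object CountSketch {
  // Hypothetical miniature expression tree; NOT Spark's Catalyst API.
  sealed trait Expr {
    def children: Seq[Expr]
    def eval(row: Map[String, Any]): Any
  }
  case class Attr(name: String) extends Expr {
    val children: Seq[Expr] = Nil
    def eval(row: Map[String, Any]): Any = row.getOrElse(name, null)
  }
  case class Lit(value: Int) extends Expr {
    val children: Seq[Expr] = Nil
    def eval(row: Map[String, Any]): Any = value
  }
  case class Add(left: Expr, right: Expr) extends Expr {
    val children: Seq[Expr] = Seq(left, right)
    // SQL-style null propagation: null + anything is null.
    def eval(row: Map[String, Any]): Any = (left.eval(row), right.eval(row)) match {
      case (l: Int, r: Int) => l + r
      case _                => null
    }
  }

  // The expression itself followed by all of its descendants.
  def allNodes(e: Expr): Seq[Expr] = e +: e.children.flatMap(allNodes)

  // Models the reported behavior: a row is counted if ANY node in the
  // tree evaluates to non-null (here the literal 1 always does).
  def buggyCount(child: Expr, rows: Seq[Map[String, Any]]): Long =
    rows.count(row => allNodes(child).exists(_.eval(row) != null)).toLong

  // Models the expected behavior: only the child's own value matters.
  def fixedCount(child: Expr, rows: Seq[Map[String, Any]]): Long =
    rows.count(row => child.eval(row) != null).toLong

  // Toy data (assumption): five rows, two of which have a null key.
  val rows: Seq[Map[String, Any]] = Seq(
    Map("key" -> 1), Map("key" -> null), Map("key" -> 3),
    Map("key" -> null), Map("key" -> 5))

  // Stand-in for the argument of COUNT(key + 1).
  val keyPlusOne: Expr = Add(Attr("key"), Lit(1))

  def main(args: Array[String]): Unit = {
    println(buggyCount(keyPlusOne, rows)) // prints 5
    println(fixedCount(keyPlusOne, rows)) // prints 3
  }
}
```

With this toy data, the first strategy counts every row because the literal {{1}} is never null, which mirrors {{res3}} matching {{res1}} in the session above.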