[ https://issues.apache.org/jira/browse/SPARK-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977084#comment-14977084 ]
Apache Spark commented on SPARK-10707: -------------------------------------- User 'mbautin' has created a pull request for this issue: https://github.com/apache/spark/pull/9308 > Set operation output columns may have incorrect nullability > ----------------------------------------------------------- > > Key: SPARK-10707 > URL: https://issues.apache.org/jira/browse/SPARK-10707 > Project: Spark > Issue Type: Bug > Components: SQL > Reporter: Mikhail Bautin > Assignee: Mark Hamstra > > If "dual" is a one-row table, the query > SELECT count(v) FROM ( > SELECT v FROM ( > SELECT NULL AS v FROM dual > UNION ALL > SELECT 'foo' AS v FROM dual > ) my_union WHERE isnull(v) > ) my_subview; > returns 0 and the same query with union'ed sub-queries switched returns 1: > SELECT count(v) FROM ( > SELECT v FROM ( > SELECT 'foo' AS v FROM dual > UNION ALL > SELECT NULL AS v FROM dual > ) my_union WHERE isnull(v) > ) my_subview; > Example output (with Catalyst tracing turned on): > https://gist.githubusercontent.com/mbautin/c916a2a7ce733d039137/raw > This is caused by the behavior in set operation implementation where only the > nullability of the first child is taken into account: > SetOperation in org.apache.spark.sql.catalyst.plans.logical: > https://github.com/apache/spark/blob/82268f07abfa658869df2354ae72f8d6ddd119e8/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala#L93 > Union in org.apache.spark.sql.execution: > https://github.com/apache/spark/blob/e626ac5f5c27dcc74113070f2fec03682bcd12bd/sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala#L178 > Intersect in org.apache.spark.sql.execution: > https://github.com/apache/spark/blob/e626ac5f5c27dcc74113070f2fec03682bcd12bd/sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala#L320 > Except in org.apache.spark.sql.execution: > https://github.com/apache/spark/blob/e626ac5f5c27dcc74113070f2fec03682bcd12bd/sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala#L307 -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org