[ 
https://issues.apache.org/jira/browse/SPARK-10740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-10740:
------------------------------------

    Assignee:     (was: Apache Spark)

> handle nondeterministic expressions correctly for set operations
> ----------------------------------------------------------------
>
>                 Key: SPARK-10740
>                 URL: https://issues.apache.org/jira/browse/SPARK-10740
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Wenchen Fan
>
> We should push down only deterministic filter conditions through set operators.
> For Union, suppose we apply a non-deterministic filter to (1...5) union (1...5),
> and it keeps 1,3 on the left side and 2,4 on the right side; the
> result should then be 1,3,2,4. If we push this filter down, we get 1,3 on both
> sides (we create a new random object with the same seed on each side) and the
> result would be 1,3,1,3.
> For Intersect, suppose a non-deterministic condition accepts a row with
> probability 0.5 and a row is present on both sides of an Intersect. Once we
> push this condition down, the row must pass it on both sides, so the
> probability of accepting this row becomes 0.25.
> For Except, suppose a row is present on both sides of an
> Except. This row should not be in the final output. However, if we push down a
> non-deterministic condition, the row may be rejected on the right side but
> kept on the left, and then we output a row that should not be part of the result.
> Likewise, we should push down only deterministic projections through Union.

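The Union case in the description can be sketched in plain Python. This is only a simulation, not Spark code: the helper `nondeterministic_filter` and the constant `SEED` are hypothetical names, and the helper assumes, as the description does, that each pushed-down copy of the filter creates a new random object with the same seed.

```python
import random

SEED = 42  # hypothetical seed, chosen only for illustration


def nondeterministic_filter(rows, seed):
    """Keep each row with probability 0.5, using a fresh generator per call."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < 0.5]


left = [1, 2, 3, 4, 5]
right = [1, 2, 3, 4, 5]

# Filter evaluated above the Union: one generator walks over all ten rows,
# so the two sides see different random draws.
not_pushed = nondeterministic_filter(left + right, SEED)

# Filter pushed below the Union: each side gets its own generator with the
# same seed, so both sides see identical draws and keep the same rows.
pushed = nondeterministic_filter(left, SEED) + nondeterministic_filter(right, SEED)

print("filter above Union:", not_pushed)
print("filter pushed down:", pushed)
```

In the pushed-down variant the second half of the output always repeats the first half, which is the 1,3,1,3 pattern described above.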

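The Intersect and Except arguments can be checked by enumerating the outcomes for a single row that is present on both sides. This is a sketch under the assumption that the pushed-down condition is evaluated independently on each side with acceptance probability 0.5; `P_ACCEPT` and `prob` are hypothetical names introduced for the illustration.

```python
from fractions import Fraction
from itertools import product

P_ACCEPT = Fraction(1, 2)  # probability that the condition accepts a row


def prob(row_is_output):
    """Sum the probability of every (left, right) outcome that emits the row."""
    total = Fraction(0)
    for left_ok, right_ok in product([True, False], repeat=2):
        weight = (P_ACCEPT if left_ok else 1 - P_ACCEPT) * (
            P_ACCEPT if right_ok else 1 - P_ACCEPT
        )
        if row_is_output(left_ok, right_ok):
            total += weight
    return total


# Filter above the set operator: it is evaluated once, on the operator's output.
intersect_above = P_ACCEPT   # the row survives Intersect, then passes with 1/2
except_above = Fraction(0)   # Except already removed the row

# Filter pushed below the set operator: it is evaluated once per side.
intersect_pushed = prob(lambda l, r: l and r)    # must pass on both sides
except_pushed = prob(lambda l, r: l and not r)   # kept left, rejected right

print(intersect_above, intersect_pushed)  # 1/2 1/4
print(except_above, except_pushed)        # 0 1/4
```

Under these assumptions the pushed-down Except emits the row a quarter of the time, although the correct probability is zero.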

