[ 
https://issues.apache.org/jira/browse/SPARK-18966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826390#comment-15826390
 ] 

Nattavut Sutyanyong edited comment on SPARK-18966 at 1/17/17 5:19 PM:
----------------------------------------------------------------------

A semantically equivalent rewrite of the above query is:

{code}
select *
from   t1 left anti join t2
       on  b2=b1
       and (isnull(a2=a1) or (a2=a1))
{code}
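
To see why the extra isnull(a2=a1) term is needed, the two forms can be compared in a small model of SQL three-valued logic. This is only a sketch in plain Scala, not Spark's implementation: the object NotInRewrite, its helpers (sqlEq, tvAnd, tvOr, tvNot), and the use of Option to stand in for NULL/UNKNOWN are all assumptions made for illustration.

```scala
// Hypothetical model, not Spark code: None stands for SQL NULL / UNKNOWN.
object NotInRewrite {
  type Row = (Option[Int], Option[Int]) // (a, b) columns of t1 or t2
  type Tri = Option[Boolean]            // TRUE, FALSE, or UNKNOWN (None)

  // SQL equality: UNKNOWN if either operand is NULL.
  def sqlEq(x: Option[Int], y: Option[Int]): Tri =
    for (l <- x; r <- y) yield l == r

  // Three-valued AND / OR / NOT.
  def tvAnd(p: Tri, q: Tri): Tri = (p, q) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }
  def tvOr(p: Tri, q: Tri): Tri = (p, q) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }
  def tvNot(p: Tri): Tri = p.map(!_)

  // Reference semantics:
  //   select * from t1 where a1 not in (select a2 from t2 where b2 = b1)
  def notInSubquery(t1: Seq[Row], t2: Seq[Row]): Seq[Row] =
    t1.filter { case (a1, b1) =>
      // The correlated filter keeps only rows where b2 = b1 is TRUE.
      val sub = t2.filter { case (_, b2) => sqlEq(b2, b1) == Some(true) }.map(_._1)
      // a1 IN (...) is the OR of a1 = a2 over the subquery rows; FALSE if empty.
      val in = sub.foldLeft[Tri](Some(false))((acc, a2) => tvOr(acc, sqlEq(a1, a2)))
      tvNot(in) == Some(true)
    }

  // Rewritten form:
  //   t1 left anti join t2 on b2=b1 and (isnull(a2=a1) or (a2=a1))
  def antiJoin(t1: Seq[Row], t2: Seq[Row]): Seq[Row] =
    t1.filter { case (a1, b1) =>
      // An anti join keeps a t1 row only if no t2 row makes the condition TRUE.
      !t2.exists { case (a2, b2) =>
        val cmp = sqlEq(a2, a1)
        val isNull: Tri = Some(cmp.isEmpty) // isnull(...) is never UNKNOWN
        tvAnd(sqlEq(b2, b1), tvOr(isNull, cmp)) == Some(true)
      }
    }
}
```

In this model, with t1 = (1, 2) and t2 = (1, null) from the issue description, both functions keep the row (1, 2), because b2 = b1 is UNKNOWN and so no t2 row qualifies. With t2 = (null, 2) both return no rows, which is the case the isnull(a2=a1) term exists to cover.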


> NOT IN subquery with correlated expressions may return incorrect result
> -----------------------------------------------------------------------
>
>                 Key: SPARK-18966
>                 URL: https://issues.apache.org/jira/browse/SPARK-18966
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Nattavut Sutyanyong
>              Labels: correctness
>
> {code}
> Seq((1, 2)).toDF("a1", "b1").createOrReplaceTempView("t1")
> Seq[(java.lang.Integer, java.lang.Integer)]((1, null)).toDF("a2", "b2").createOrReplaceTempView("t2")
> // Expected: 1 row of (1,2), as returned by the second statement below. The first statement incorrectly returns no rows.
> sql("select * from t1 where a1 not in (select a2 from t2 where b2 = b1)").show
> +---+---+
> | a1| b1|
> +---+---+
> +---+---+
> sql("select * from t1 where a1 not in (select a2 from t2 where b2 = 2)").show
> +---+---+
> | a1| b1|
> +---+---+
> |  1|  2|
> +---+---+
> {code}
> The two SQL statements above should return the same result.


