[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

Dilip Biswal (JIRA) Sun, 06 May 2018 01:05:59 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465008#comment-16465008
 ]


Dilip Biswal commented on SPARK-21274:
--------------------------------------

[~viirya] Thank you. I will study this more tomorrow. From a quick peek, it 
looks reasonable to me. Since our current EXCEPT ALL rewrite is simpler, we can 
stay with that (given both hive and db2 do it the same way) ? And change the 
INTERSECT ALL rewrite after studying it a little more.

> Implement EXCEPT ALL and INTERSECT ALL
> --------------------------------------
>
>                 Key: SPARK-21274
>                 URL: https://issues.apache.org/jira/browse/SPARK-21274
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.0.0, 2.1.0, 2.2.0
>            Reporter: Ruslan Dautkhanov
>            Priority: Major
>
> 1) *EXCEPT ALL* / MINUS ALL :
> {code}
> SELECT a,b,c FROM tab1
>  EXCEPT ALL 
> SELECT a,b,c FROM tab2
> {code}
> can be rewritten as following outer join:
> {code}
> SELECT a,b,c
> FROM    tab1 t1
>      LEFT OUTER JOIN 
>         tab2 t2
>      ON (
>         (t1.a, t1.b, t1.c) = (t2.a, t2.b, t2.c)
>      )
> WHERE
>     COALESCE(t2.a, t2.b, t2.c) IS NULL
> {code}
> (register as a temp.view this second query under "*t1_except_t2_df*" name 
> that can be also used to find INTERSECT ALL below):
> 2) *INTERSECT ALL*:
> {code}
> SELECT a,b,c FROM tab1
>  INTERSECT ALL 
> SELECT a,b,c FROM tab2
> {code}
> can be rewritten as following anti-join using t1_except_t2_df we defined 
> above:
> {code}
> SELECT a,b,c
> FROM    tab1 t1
> WHERE 
>    NOT EXISTS
>    (    SELECT 1
>         FROM    t1_except_t2_df e
>         WHERE (t1.a, t1.b, t1.c) = (e.a, e.b, e.c)
>    )
> {code}
> So the suggestion is just to use above query rewrites to implement both 
> EXCEPT ALL and INTERSECT ALL sql set operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

Reply via email to