[ 
https://issues.apache.org/jira/browse/IMPALA-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-8945:
--------------------------------
    Description: 
Reported by [~icook]

The Impala docs entry for the IS DISTINCT FROM operator states:

The <=> operator, used like an equality operator in a join query, is more 
efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
<=> operator can use a hash join, while the OR expression cannot.

But this expression is not equivalent to A <=> B: when exactly one of A and B 
is NULL, it evaluates to NULL, whereas A <=> B evaluates to false. See the 
attached screenshot demonstrating their non-equivalence. An expression that is 
equivalent to A <=> B is this:

(A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))

This expression should replace the existing incorrect expression.

Another expression that is equivalent to A <=> B is:

if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)

This one is a bit easier to follow. If you use this one in the docs, also 
replace the sentence that follows the expression with:

The <=> operator can use a hash join, while the if expression cannot.
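
The non-equivalence can also be checked without an Impala cluster. The 
following is a minimal sketch, not taken from the Impala docs: it uses Python's 
sqlite3 module as a stand-in engine, on the assumption that SQLite's null-safe 
IS operator behaves like Impala's <=> and that both engines follow standard SQL 
three-valued logic. CASE replaces Impala's if() function, and the names 
EXPRESSIONS, PAIRS, evaluate, and truth_table are illustrative only.

```python
import sqlite3

# Assumption: SQLite's `a IS b` plays the role of Impala's `A <=> B`.
EXPRESSIONS = {
    "null_safe": "a IS b",
    "docs_or": "a = b OR (a IS NULL AND b IS NULL)",
    "proposed_or": "(a IS NULL AND b IS NULL) OR "
                   "((a IS NOT NULL AND b IS NOT NULL) AND (a = b))",
    # CASE is used here in place of Impala's if(cond, then, else).
    "proposed_if": "CASE WHEN a IS NULL OR b IS NULL "
                   "THEN a IS NULL AND b IS NULL ELSE a = b END",
}

# Every combination that matters: equal, unequal, one NULL, both NULL.
PAIRS = [(1, 1), (1, 2), (1, None), (None, 1), (None, None)]


def evaluate(expr, a, b):
    """Return the SQL value of expr for one (a, b) pair: 1, 0, or None (NULL)."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(
            f"SELECT {expr} FROM (SELECT ? AS a, ? AS b)", (a, b)
        ).fetchone()
        return row[0]
    finally:
        conn.close()


def truth_table(expr):
    """Evaluate expr over all PAIRS, in order."""
    return [evaluate(expr, a, b) for a, b in PAIRS]


if __name__ == "__main__":
    for name, expr in EXPRESSIONS.items():
        print(f"{name:12} {truth_table(expr)}")
```

In a WHERE or join condition both NULL and false reject the row, so the 
difference only shows when the value of the expression is used directly, for 
example in a select list or under NOT.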

  was:
The Impala docs entry for the IS DISTINCT FROM operator states:

The <=> operator, used like an equality operator in a join query, is more 
efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
<=> operator can use a hash join, while the OR expression cannot.

But this expression is not equivalent to A <=> B. See the attached screenshot 
demonstrating their non-equivalence. An expression that is equivalent to A <=> 
B is this:

(A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))

 This expression should replace the existing incorrect expression.

Another expression that is equivalent to A <=> B is:

if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)

This one is a bit easier to follow. If you use this one in the docs, just 
replace the following line with:

The <=> operator can use a hash join, while the if expression cannot.


> Impala Doc: Incorrect Claim of Equivalence in Impala Docs
> ---------------------------------------------------------
>
>                 Key: IMPALA-8945
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8945
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Docs
>            Reporter: Alex Rodoni
>            Assignee: Alex Rodoni
>            Priority: Major
>
> Reported by [~icook]
> The Impala docs entry for the IS DISTINCT FROM operator states:
> The <=> operator, used like an equality operator in a join query, is more 
> efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
> <=> operator can use a hash join, while the OR expression cannot.
> But this expression is not equivalent to A <=> B: when exactly one of A and B 
> is NULL, it evaluates to NULL, whereas A <=> B evaluates to false. See the 
> attached screenshot demonstrating their non-equivalence. An expression that is 
> equivalent to A <=> B is this:
> (A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))
> This expression should replace the existing incorrect expression.
> Another expression that is equivalent to A <=> B is:
> if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)
> This one is a bit easier to follow. If you use this one in the docs, also 
> replace the sentence that follows the expression with:
> The <=> operator can use a hash join, while the if expression cannot.


