[ 
https://issues.apache.org/jira/browse/IMPALA-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norbert Luksa resolved IMPALA-6660.
-----------------------------------
    Fix Version/s: Impala 3.4.0
       Resolution: Fixed

> -0/+0 floating point do not compare as equal in hash table
> ----------------------------------------------------------
>
>                 Key: IMPALA-6660
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6660
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, 
> Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
>            Reporter: Tim Armstrong
>            Assignee: Norbert Luksa
>            Priority: Major
>              Labels: correctness, ramp-up
>             Fix For: Impala 3.4.0
>
>
> This can happen because we hash the binary representation of the numbers. 
> -0/0 should be treated as equal for hash joins.
> {noformat}
> [localhost:21000] > select * from (select cast("-0" as float) c1) v1,  
> (select cast("0" as float) c2) v2 where v1.c1 = v2.c2;
> Fetched 0 row(s) in 0.12s
> [localhost:21000] > select * from (select cast("0" as float) c1) v1,  (select 
> cast("0" as float) c2) v2 where v1.c1 = v2.c2;
> +----+----+
> | c1 | c2 |
> +----+----+
> | 0  | 0  |
> +----+----+
> Fetched 1 row(s) in 0.11s
> [localhost:21000] > select * from (select cast("-0" as float) c1) v1,  
> (select cast("-0" as float) c2) v2 where v1.c1 = v2.c2;
> +----+----+
> | c1 | c2 |
> +----+----+
> | -0 | -0 |
> +----+----+
> Fetched 1 row(s) in 0.11s
> {noformat}
> With aggregations, we get separate groups. I could see the argument either 
> way on whether this is the preferred behaviour for group by, since group by 
> already handles equality of NULL differently. The behaviour here is tied to 
> the behaviour in the join right now, so we should make sure to add a test for 
> this case when fixing the join.
> {noformat}
> [localhost:21000] > select distinct * from (values(cast("-0" as float)), 
> (cast("0" as float))) v;
> +---------------------+
> | cast('-0' as float) |
> +---------------------+
> | -0                  |
> | 0                   |
> +---------------------+
> {noformat}
> *Workaround*
> Casting the floating point numbers to decimal fixes the problem.
> *Proposed solution*
> The frontend could wrap floating point expressions in the hash join or hash 
> aggregation in a normalisation function that converts -0 to +0.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to