Gerald Sangudi created PHOENIX-5353:
---------------------------------------

             Summary: Incorrect results when JOINs project array elements
                 Key: PHOENIX-5353
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5353
             Project: Phoenix
          Issue Type: Bug
            Reporter: Gerald Sangudi


There are two errors observed below, one with HASH JOIN and the other with SORT 
MERGE JOIN.
 
Here are the steps for reproducing the bug in array element projection when 
using sort merge join. 
I also noticed there is an issue with hash-joins (see the last example).
 * Create tables and upsert test data.
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> create table d (c1 integer 
primary key, c2 integer, c3 integer[]);{quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into d values (1, 1, 
array[1,2,3]);{quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into d values (2, 10, 
array[2,3,4]);{quote}
{quote} {quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> create table t2 (c1 integer 
primary key, c2 integer);{quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into t2 values (1, 
10);{quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into t2 values (2, 
20);{quote}{quote}
 * Verify data in the tables
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select * from d;
+-----+-----+------------+
| C1  | C2  |     C3     |
+-----+-----+------------+
| 1   | 1   | [1, 2, 3]  |
| 2   | 10  | [2, 3, 4]  |
+-----+-----+------------+
2 rows selected (0.027 seconds)
0: jdbc:phoenix:localhost:2181:/hbase> select * from t2;
+-----+-----+
| C1  | C2  |
+-----+-----+
| 1   | 10  |
| 2   | 20  |
+-----+-----+
2 rows selected (0.021 seconds){quote}{quote}
 * Perform sort merge join without projecting array elements. This works fine.
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select /*+ 
use_sort_merge_join */ d.c1, d.c2, d.c3, t2.c1, t2.c2 from d join t2 on d.c1 = 
t2.c1;{quote}
{quote}+-------+-------+------------+--------+--------+{quote}
{quote}| D.C1  | D.C2  |    D.C3    | T2.C1  | T2.C2  |{quote}
{quote}+-------+-------+------------+--------+--------+{quote}
{quote}| 1     | 1     | [1, 2, 3]  | 1      | 10     |{quote}
{quote}| 2     | 10    | [2, 3, 4]  | 2      | 20     |{quote}
{quote}+-------+-------+------------+--------+--------+{quote}
{quote}2 rows selected (0.054 seconds){quote}{quote}
 * Perform sort merge join by projecting array elements. *Returns junk data for 
array elements.*
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select /*+ 
use_sort_merge_join */ d.c1, d.c2, d.c3[1], d.c3[2], d.c3[3], t2.c1, t2.c2 from 
d join t2 on d.c1 = t2.c1;{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------+--------+{quote}
{quote}| D.C1  | D.C2  | ARRAY_ELEM(D.C3, 1)  | ARRAY_ELEM(D.C3, 2)  | 
ARRAY_ELEM(D.C3, 3)  | T2.C1  | T2.C2  |{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------+--------+{quote}
{quote}| 1     | 1     | {color:#ff0000}-1937768448{color}          | 
{color:#ff0000}-2122317824{color}          | {color:#ff0000}-2105540608{color}  
        | 1      | 10     |{quote}
{quote}| 2     | 10    | {color:#ff0000}-1937768448{color}          | 
{color:#ff0000}-2105540608{color}          | {color:#ff0000}-2088763392{color}  
        | 2      | 20     |{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------+--------+{quote}
{quote}2 rows selected (0.043 seconds){quote}{quote}
 * Array element projection works fine when using hash-join but *columns from 
the non-array table are messed up*.
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select  d.c1, d.c2, 
d.c3[1], d.c3[2], d.c3[3], t2.c1, t2.c2 from d join t2 on d.c1 = t2.c1;{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------------+--------------+{quote}
{quote}| D.C1  | D.C2  | ARRAY_ELEM(D.C3, 1)  | ARRAY_ELEM(D.C3, 2)  | 
ARRAY_ELEM(D.C3, 3)  |    T2.C1     |    T2.C2     |{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------------+--------------+{quote}
{quote}| 1     | 1     | 1                    | 2                    | 3        
            | {color:#ff0000}-2146795520{color}  | 
{color:#ff0000}-2147319808{color}  |{quote}
{quote}| 2     | 10    | 2                    | 3                    | 4        
            | {color:#ff0000}-2146140160{color}  | 
{color:#ff0000}-2147319808{color}  |{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------------+--------------+{quote}
{quote}2 rows selected (0.067 seconds){quote}{quote}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to