Gerald Sangudi created PHOENIX-5353: ---------------------------------------
Summary: Incorrect results when JOINs project array elements Key: PHOENIX-5353 URL: https://issues.apache.org/jira/browse/PHOENIX-5353 Project: Phoenix Issue Type: Bug Reporter: Gerald Sangudi There are two errors observed below, one with HASH JOIN and the other with SORT MERGE JOIN. Here are the steps for reproducing the bug in array element projection when using sort merge join. I also noticed there is an issue with hash-joins (see the last example). * Create tables and upsert test data. {quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> create table d (c1 integer primary key, c2 integer, c3 integer[]);{quote} {quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into d values (1, 1, array[1,2,3]);{quote} {quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into d values (2, 10, array[2,3,4]);{quote} {quote} {quote} {quote}0: jdbc:phoenix:localhost:2181:/hbase> create table t2 (c1 integer primary key, c2 integer);{quote} {quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into t2 values (1, 10);{quote} {quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into t2 values (2, 20);{quote}{quote} * Verify data in the tables {quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select * from d; +-----+-----+------------+ | C1 | C2 | C3 | +-----+-----+------------+ | 1 | 1 | [1, 2, 3] | | 2 | 10 | [2, 3, 4] | +-----+-----+------------+ 2 rows selected (0.027 seconds) 0: jdbc:phoenix:localhost:2181:/hbase> select * from t2; +-----+-----+ | C1 | C2 | +-----+-----+ | 1 | 10 | | 2 | 20 | +-----+-----+ 2 rows selected (0.021 seconds){quote}{quote} * Perform sort merge join without projecting array elements. This works fine. {quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select /*+ use_sort_merge_join */ d.c1, d.c2, d.c3, t2.c1, t2.c2 from d join t2 on d.c1 = t2.c1;{quote} {quote}+-------+-------+------------+--------+--------+{quote} {quote}| D.C1 | D.C2 | D.C3 | T2.C1 | T2.C2 |{quote} {quote}+-------+-------+------------+--------+--------+{quote} {quote}| 1 | 1 | [1, 2, 3] | 1 | 10 |{quote} {quote}| 2 | 10 | [2, 3, 4] | 2 | 20 |{quote} {quote}+-------+-------+------------+--------+--------+{quote} {quote}2 rows selected (0.054 seconds){quote}{quote} * Perform sort merge join by projecting array elements. *Returns junk data for array elements.* {quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select /*+ use_sort_merge_join */ d.c1, d.c2, d.c3[1], d.c3[2], d.c3[3], t2.c1, t2.c2 from d join t2 on d.c1 = t2.c1;{quote} {quote}+-------+-------+----------------------+----------------------+----------------------+--------+--------+{quote} {quote}| D.C1 | D.C2 | ARRAY_ELEM(D.C3, 1) | ARRAY_ELEM(D.C3, 2) | ARRAY_ELEM(D.C3, 3) | T2.C1 | T2.C2 |{quote} {quote}+-------+-------+----------------------+----------------------+----------------------+--------+--------+{quote} {quote}| 1 | 1 | {color:#ff0000}-1937768448{color} | {color:#ff0000}-2122317824{color} | {color:#ff0000}-2105540608{color} | 1 | 10 |{quote} {quote}| 2 | 10 | {color:#ff0000}-1937768448{color} | {color:#ff0000}-2105540608{color} | {color:#ff0000}-2088763392{color} | 2 | 20 |{quote} {quote}+-------+-------+----------------------+----------------------+----------------------+--------+--------+{quote} {quote}2 rows selected (0.043 seconds){quote}{quote} * Array element projection works fine when using hash-join but *columns from the non-array table are messed up*. {quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select d.c1, d.c2, d.c3[1], d.c3[2], d.c3[3], t2.c1, t2.c2 from d join t2 on d.c1 = t2.c1;{quote} {quote}+-------+-------+----------------------+----------------------+----------------------+--------------+--------------+{quote} {quote}| D.C1 | D.C2 | ARRAY_ELEM(D.C3, 1) | ARRAY_ELEM(D.C3, 2) | ARRAY_ELEM(D.C3, 3) | T2.C1 | T2.C2 |{quote} {quote}+-------+-------+----------------------+----------------------+----------------------+--------------+--------------+{quote} {quote}| 1 | 1 | 1 | 2 | 3 | {color:#ff0000}-2146795520{color} | {color:#ff0000}-2147319808{color} |{quote} {quote}| 2 | 10 | 2 | 3 | 4 | {color:#ff0000}-2146140160{color} | {color:#ff0000}-2147319808{color} |{quote} {quote}+-------+-------+----------------------+----------------------+----------------------+--------------+--------------+{quote} {quote}2 rows selected (0.067 seconds){quote}{quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)