Hi all,
I've tried BigTop test for join_filters:
CREATE TABLE myinput1(key int, value int);
LOAD DATA LOCAL INPATH 'seed_data_files/in3.txt' INTO TABLE myinput1;
where seed_data_files/in3.txt:
12 35
NULL 40
48 NULL
100 100
I've tried:
SELECT * FROM myinput1 a FULL OUTER JOIN myinput1 b on a.key > 40 AND a.value > 50 AND a.key =
a.value AND b.key > 40 AND b.value > 50 AND b.key = b.value ORDER BY a.key, a.value, b.key, b.value;
and expected result in test is:
NULL NULL NULL 40
NULL NULL NULL 40
NULL NULL NULL 40
NULL NULL NULL 40
NULL NULL 12 35
NULL NULL 12 35
NULL NULL 12 35
NULL NULL 12 35
NULL NULL 48 NULL
NULL NULL 48 NULL
NULL NULL 48 NULL
NULL NULL 48 NULL
NULL 40 NULL NULL
NULL 40 NULL NULL
NULL 40 NULL NULL
NULL 40 NULL NULL
12 35 NULL NULL
12 35 NULL NULL
12 35 NULL NULL
12 35 NULL NULL
48 NULL NULL NULL
48 NULL NULL NULL
48 NULL NULL NULL
48 NULL NULL NULL
100 100 NULL NULL
100 100 NULL NULL
100 100 NULL NULL
100 100 100 100
but real hive result is:
NULL NULL NULL 40
NULL NULL 12 35
NULL NULL 48 NULL
NULL 40 NULL NULL
12 35 NULL NULL
48 NULL NULL NULL
100 100 100 100
btw. result from postgresql is:
(SELECT *
FROM myinput1 a
LEFT JOIN
myinput1 b on
a.key > 40 AND
a.value > 50 AND
a.key = a.value AND
b.key > 40 AND
b.value > 50 AND
b.key = b.value ORDER BY a.key, a.value, b.key, b.value)
UNION (SELECT *
FROM myinput1 a
RIGHT JOIN
myinput1 b on
a.key > 40 AND
a.value > 50 AND
a.key = a.value AND
b.key > 40 AND
b.value > 50 AND
b.key = b.value
ORDER BY a.key, a.value, b.key, b.value);
| | 12 | 35
12 | 35 | |
| | 48 |
48 | | |
| 40 | |
| | | 40
100 | 100 | 100 | 100
so it's the same like in hive.
What is the right result for this full outer join in HiveQL, please?
--
Best Regards,
Martin Kudlej.
MRG/Grid & RHS-Hadoop Senior Quality Assurance Engineer
Red Hat Czech s.r.o.
Phone: +420 532 294 155
E-mail:mkudlej at redhat.com
IRC: mkudlej at #brno, #messaging, #grid, #rhs, #distcomp