[
https://issues.apache.org/jira/browse/HIVE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899044#action_12899044
]
Ning Zhang commented on HIVE-741:
---------------------------------
@Amareshwari, aside from adding new test cases for sort merge join, this patch
also has some bugs.
For example in your test data:
{code}
hive> select * from myinput1
NULL 356
484 NULL
10 10
-- incorrect result below
hive> select * FROM myinput1 a left outer JOIN myinput1 b ON a.value = b.value;
484 NULL 484 NULL
10 10 10 10
NULL 356 NULL NULL
hive> select * FROM myinput1 a right outer JOIN myinput1 b ON a.value = b.value;
484 NULL 484 NULL
10 10 10 10
NULL NULL NULL 356
hive> select * FROM myinput1 a left outer JOIN myinput1 b right outer join
myinput1 c ON a.value = b.value and b.value = c.value;
NULL NULL NULL NULL 484 NULL
NULL NULL NULL NULL 10 10
NULL NULL NULL NULL NULL 356
{code}
Can you take a look? I'm not sure whether ending a group and starting a new
group for each null-keyed row works for all cases particularly in joins
involving more than 2 tables and mixture of left and right outer joins.
> NULL is not handled correctly in join
> -------------------------------------
>
> Key: HIVE-741
> URL: https://issues.apache.org/jira/browse/HIVE-741
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Ning Zhang
> Assignee: Amareshwari Sriramadasu
> Attachments: patch-741.txt
>
>
> With the following data in table input4_cb:
> Key Value
> ------ --------
> NULL 325
> 18 NULL
> The following query:
> {code}
> select * from input4_cb a join input4_cb b on a.key = b.value;
> {code}
> returns the following result:
> NULL 325 18 NULL
> The correct result should be empty set.
> When 'null' is replaced by '' it works.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.