[ https://issues.apache.org/jira/browse/HIVE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899044#action_12899044 ]
Ning Zhang commented on HIVE-741: --------------------------------- @Amareshwari, aside from adding new test cases for sort merge join, this patch also has some bugs. For example in your test data: {code} hive> select * from myinput1 NULL 356 484 NULL 10 10 -- incorrect result below hive> select * FROM myinput1 a left outer JOIN myinput1 b ON a.value = b.value; 484 NULL 484 NULL 10 10 10 10 NULL 356 NULL NULL hive> select * FROM myinput1 a right outer JOIN myinput1 b ON a.value = b.value; 484 NULL 484 NULL 10 10 10 10 NULL NULL NULL 356 hive> select * FROM myinput1 a left outer JOIN myinput1 b right outer join myinput1 c ON a.value = b.value and b.value = c.value; NULL NULL NULL NULL 484 NULL NULL NULL NULL NULL 10 10 NULL NULL NULL NULL NULL 356 {code} Can you take a look? I'm not sure whether ending a group and starting a new group for each null-keyed row works for all cases particularly in joins involving more than 2 tables and mixture of left and right outer joins. > NULL is not handled correctly in join > ------------------------------------- > > Key: HIVE-741 > URL: https://issues.apache.org/jira/browse/HIVE-741 > Project: Hadoop Hive > Issue Type: Bug > Reporter: Ning Zhang > Assignee: Amareshwari Sriramadasu > Attachments: patch-741.txt > > > With the following data in table input4_cb: > Key Value > ------ -------- > NULL 325 > 18 NULL > The following query: > {code} > select * from input4_cb a join input4_cb b on a.key = b.value; > {code} > returns the following result: > NULL 325 18 NULL > The correct result should be empty set. > When 'null' is replaced by '' it works. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.