[
https://issues.apache.org/jira/browse/HIVE-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amareshwari Sriramadasu updated HIVE-741:
-----------------------------------------
Attachment: patch-741-2.txt
The patch also fixes SMBMapJoinOperator. I modified compareKeys(ArrayList<Object>
k1, ArrayList<Object> k2) to do the following before the existing comparison:
{code}
if (hasNullElements(k1) && hasNullElements(k2)) {
  return -1; // both keys contain nulls: just treat k1 as smaller than k2
} else if (hasNullElements(k1)) {
  return (0 - k2.size());
} else if (hasNullElements(k2)) {
  return k1.size();
}
... // the existing code
{code}
Does the above make sense?
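To make the question easier to review, here is a minimal stand-alone sketch of that ordering. It is not the patch itself: the class name NullKeySketch, the key(...) helper, and the body of hasNullElements are assumptions for illustration, and the final element-wise comparison is only a placeholder for the existing code, which is not shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical stand-alone class for illustration; not part of Hive.
final class NullKeySketch {

    // Assumed behavior of hasNullElements: true if any key column is null.
    static boolean hasNullElements(ArrayList<Object> key) {
        for (Object column : key) {
            if (column == null) {
                return true;
            }
        }
        return false;
    }

    // Null handling as described in the comment above.
    static int compareKeys(ArrayList<Object> k1, ArrayList<Object> k2) {
        if (hasNullElements(k1) && hasNullElements(k2)) {
            return -1; // both keys contain nulls: treat k1 as smaller
        } else if (hasNullElements(k1)) {
            return 0 - k2.size();
        } else if (hasNullElements(k2)) {
            return k1.size();
        }
        // Placeholder for the existing comparison (an assumption, not
        // Hive's code): compare the string form of each column in order.
        for (int i = 0; i < Math.min(k1.size(), k2.size()); i++) {
            int c = k1.get(i).toString().compareTo(k2.get(i).toString());
            if (c != 0) {
                return c;
            }
        }
        return k1.size() - k2.size();
    }

    // Hypothetical helper to build a key from its column values.
    static ArrayList<Object> key(Object... columns) {
        return new ArrayList<Object>(Arrays.asList(columns));
    }

    public static void main(String[] args) {
        System.out.println(compareKeys(key(null, 325), key(18, null))); // -1
        System.out.println(compareKeys(key(null, 325), key(18, 325)));  // -2
        System.out.println(compareKeys(key(18, 325), key(18, null)));   // 2
        System.out.println(compareKeys(key(18, 325), key(18, 325)));    // 0
    }
}
```

In this sketch the result is never 0 for non-empty keys when either key contains a null, so such keys are never treated as equal. Note that when both keys contain nulls the result is -1 regardless of argument order.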
I updated the test case with SMB join queries.
When I run the SMB join on my local machine (pseudo-distributed mode), I get
different results. I think that is mostly because of HIVE-1561. I will
update the issue with my findings.
> NULL is not handled correctly in join
> -------------------------------------
>
> Key: HIVE-741
> URL: https://issues.apache.org/jira/browse/HIVE-741
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Ning Zhang
> Assignee: Amareshwari Sriramadasu
> Attachments: patch-741-1.txt, patch-741-2.txt, patch-741.txt,
> smbjoin_nulls.q.txt
>
>
> With the following data in table input4_cb:
>   Key     Value
>   ------  --------
>   NULL    325
>   18      NULL
> The following query:
> {code}
> select * from input4_cb a join input4_cb b on a.key = b.value;
> {code}
> returns the following result:
> NULL 325 18 NULL
> The correct result should be an empty set.
> When 'null' is replaced by '' it works.