[ 
https://issues.apache.org/jira/browse/HIVE-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709399#comment-13709399
 ] 

Ashutosh Chauhan commented on HIVE-4845:
----------------------------------------

I meant your previous two patches on this same jira (the second one looked 
like it was on top of the first one, instead of including it). But now that you 
have regenerated the patch, the new one supersedes the earlier two.
                
> Correctness issue with MapJoins using the null safe operator
> ------------------------------------------------------------
>
>                 Key: HIVE-4845
>                 URL: https://issues.apache.org/jira/browse/HIVE-4845
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Brock Noland
>            Assignee: Brock Noland
>            Priority: Critical
>         Attachments: HIVE-4845.patch, HIVE-4845.patch, HIVE-4845.patch
>
>
> I found a correctness issue while working on HIVE-4838. The following query 
> from join_nullsafe.q gives different results depending on whether it is 
> executed map-side or reduce-side:
> {noformat}
> SELECT /*+ MAPJOIN(a) */ * FROM smb_input1 a JOIN smb_input1 b ON a.key <=> 
> b.key AND a.value <=> b.value ORDER BY a.key, a.value, b.key, b.value;
> {noformat}
> For that query, on the map side, rows which should be joined are not. For 
> example, the reduce side outputs this row:
> {noformat}
> a.key   a.value   b.key   b.value
> 148     NULL      148     NULL
> {noformat}
> This row is correct: under the null-safe operator, a.key is equal to b.key 
> and a.value is equal to b.value (NULL <=> NULL is true). The current map-side 
> code omits this row, however, because MapJoinDoubleKey, which is used for the 
> map-side join, doesn't properly compare null values.
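
For illustration, the null-safe comparison the issue description calls for can be sketched as follows. This is a minimal sketch only, not the code from any of the attached patches: the class DoubleKey and the helpers sqlEquals and nullSafeEquals are hypothetical names invented for this example and are not Hive's MapJoinDoubleKey. It assumes a two-column join key whose fields may be null, and contrasts ordinary SQL equality (null never matches) with <=> semantics (null matches null).

```java
import java.util.Objects;

public class NullSafeKeyDemo {

    // Hypothetical two-column join key; NOT Hive's MapJoinDoubleKey.
    static final class DoubleKey {
        private final Object first;
        private final Object second;

        DoubleKey(Object first, Object second) {
            this.first = first;
            this.second = second;
        }

        // Null-safe equality: two null fields compare as equal,
        // mirroring the <=> operator.
        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof DoubleKey)) {
                return false;
            }
            DoubleKey other = (DoubleKey) o;
            return Objects.equals(first, other.first)
                && Objects.equals(second, other.second);
        }

        // Objects.hash tolerates null elements, so keys that are
        // equal under equals() also hash to the same bucket.
        @Override
        public int hashCode() {
            return Objects.hash(first, second);
        }
    }

    // Ordinary SQL '=' as used in a join condition: a null operand never matches.
    static boolean sqlEquals(Object a, Object b) {
        return a != null && b != null && a.equals(b);
    }

    // Null-safe '<=>': null matches null, otherwise compare the values.
    static boolean nullSafeEquals(Object a, Object b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        DoubleKey left = new DoubleKey(148, null);
        DoubleKey right = new DoubleKey(148, null);

        System.out.println(left.equals(right));          // true
        System.out.println(sqlEquals(null, null));       // false
        System.out.println(nullSafeEquals(null, null));  // true
    }
}
```

With a key like this, the row (148, NULL) from the example above would find its match in a hash table lookup, which is the behaviour the reduce-side join already shows.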

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
