[ 
https://issues.apache.org/jira/browse/HIVE-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740838#action_12740838
 ] 

Namit Jain commented on HIVE-742:
---------------------------------

Pasted from http://issues.apache.org/jira/browse/HIVE-734 as Ning corrrectly 
found out.

The issue is not null vs. null comparison - the join happens at the reducer 
where the keys are not even taken into consideration.
NULL hashes to the same value, which explains this behavior.

The correct fix would be to filter the rows with NULL keys at the mapper itself.

However, we need to figure out the following:

1. Expected semantics when join key consists of a null and a non-null column. 
If that is also same (4, NULL) does not join with (4, NULL), then we can safely
    filter all rows even if one key is NULL.
2. Expected behavior of outer joins.



> joins dont handle null correctly 
> ---------------------------------
>
>                 Key: HIVE-742
>                 URL: https://issues.apache.org/jira/browse/HIVE-742
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Namit Jain
>
> Ning Zhang added a comment - 07/Aug/09 10:46 AM
> BTW, I tried joins and it doesn't handle NULL semantics correctly. Here's the 
> data and results:
> Table:
> Key Value
> ------- -------
> NULL 325
> 18 NULL
> Query:
> select * from input4_cb a join input4_cb b on a.key = b.value;
> Result:
> NULL 325 18 NULL
> The correct result should be empty set.
> I guess the NULL vs. NULL comparison is incorrectly evaluated.
> [ Show ยป ]
> Ning Zhang added a comment - 07/Aug/09 10:46 AM BTW, I tried joins and it 
> doesn't handle NULL semantics correctly. Here's the data and results: Table: 
> Key Value ------- ------- NULL 325 18 NULL Query: select * from input4_cb a 
> join input4_cb b on a.key = b.value; Result: NULL 325 18 NULL The correct 
> result should be empty set. I guess the NULL vs. NULL comparison is 
> incorrectly evaluated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to