[jira] Commented: (HIVE-262) outer join gets some duplicate rows in some scenarios

Namit Jain (JIRA) Sat, 31 Jan 2009 15:30:23 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669286#action_12669286
 ]


Namit Jain commented on HIVE-262:
---------------------------------

Ashish suggested the following approach:

Based on the join conditions, build the set of tables that are joined to each 
other without an outer join. If any table in that set is null (has no row) for 
a given key value, treat every table in the set as null for that value.

For example,

A join B on A.c1=B.c1 join C on A.c1=C.c1 right outer join D on A.c1=D.c1

A,B,C belong to the same group (since A joins with B and A joins with C).
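
As an illustration only (this is not code from Hive or from either patch), the
grouping step could be sketched as a small union-find over the join list. The
function name inner_join_groups, the (left, right, kind) tuple layout, and the
"inner" / "right_outer" labels are assumptions made for this sketch.

```python
def inner_join_groups(tables, joins):
    """Group tables that are connected by inner (non-outer) join conditions.

    `joins` is a list of (left_table, right_table, kind) tuples; this layout
    is hypothetical and exists only for the sketch.
    """
    parent = {t: t for t in tables}

    def find(t):
        # Walk up to the representative of t's group, compressing the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for left, right, kind in joins:
        if kind == "inner":
            # Only inner joins merge groups; outer joins keep tables apart.
            parent[find(right)] = find(left)

    groups = {}
    for t in tables:
        groups.setdefault(find(t), []).append(t)
    return list(groups.values())


# The example from this comment: A join B, A join C, right outer join D.
joins = [("A", "B", "inner"), ("A", "C", "inner"), ("A", "D", "right_outer")]
print(inner_join_groups(["A", "B", "C", "D"], joins))  # [['A', 'B', 'C'], ['D']]
```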

So, for a given key (c1), if any of A, B, or C has no matching row, assume 
that none of them has a row for that key.
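
A minimal sketch of that rule for a single key value, assuming every join
condition uses the same key and the last table is attached by a right outer
join. The function join_for_key and its arguments are hypothetical names for
this illustration; they do not correspond to the Hive join operator.

```python
from itertools import product


def join_for_key(rows_by_table, inner_group, outer_table):
    """Join the rows that share one key value.

    `rows_by_table` maps a table name to its list of rows for this key.
    Tables in `inner_group` are inner-joined with each other; the result is
    then RIGHT OUTER JOINed with `outer_table`, whose rows are all preserved.
    """
    group_rows = [rows_by_table.get(t, []) for t in inner_group]

    # The rule from the comment: if any table in the group has no row for
    # this key, assume that none of them has one (a single all-null side).
    if any(len(rows) == 0 for rows in group_rows):
        group_rows = [[None] for _ in inner_group]

    outer_rows = rows_by_table.get(outer_table, [])
    return [combo + (d,) for combo in product(*group_rows) for d in outer_rows]


# B has no row for this key, so A, B and C are all treated as null and the
# single D row is emitted once.
rows = {"A": ["a1"], "B": [], "C": ["c1"], "D": ["d1"]}
result = join_for_key(rows, ["A", "B", "C"], "D")
```

Because the whole group collapses to one all-null combination, each row of D
appears exactly once for that key instead of once per surviving row of A or C.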

That works for the example above, and the approach is different from the one 
in the patch.

> outer join gets some duplicate rows in some scenarios
> -----------------------------------------------------
>
>                 Key: HIVE-262
>                 URL: https://issues.apache.org/jira/browse/HIVE-262
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>            Priority: Blocker
>             Fix For: 0.2.0
>
>         Attachments: patch.262.1.txt, patch262.2.txt
>
>
> SELECT * FROM src src1 JOIN src src2 ON (src1.key = src2.key AND src1.key < 
> 10) RIGHT OUTER JOIN src src3 ON (src1.key = src3.key AND src3.key < 20);
> returns duplicate rows for outer join

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
