[ https://issues.apache.org/jira/browse/HADOOP-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543331 ]
udanax edited comment on HADOOP-2021 at 11/17/07 7:16 PM:
---------------------------------------------------------------
{code}
r1
      a    b    c
======================
row1  a1   b1   c1
row2  a2   b2   c2

r2
      e    f
==================
row1  e1   a1
row2  e2   f2
row3  e3   f3
row4  e4   a1

r1 = table('r1');
r2 = table('r2');
r3 = r1.join(r1.a = r2.f) and r2;

r3
      a    b    c    row   e    f
=====================================
row1  a1   b1   c1   row1  e1   a1
row1  a1   b1   c1   row4  e4   a1
{code}
was (Author: udanax):
r1
      a    b    c
================
row1  a1   b1   c1
row2  a2   b2   c2

r2
      e    f
============
row1  e1   a1
row2  e2   f2
row3  e3   f3
row4  e4   a1

{code}
r1 = table('r1');
r2 = table('r2');
r3 = r1.join(r1.a = r2.f) and r2;
{code}

r3
      a    b    c    row   e    f
=========================
row1  a1   b1   c1   row1  e1   a1
row1  a1   b1   c1   row4  e4   a1
> Sort Join Implementation
> ------------------------
>
> Key: HADOOP-2021
> URL: https://issues.apache.org/jira/browse/HADOOP-2021
> Project: Hadoop
> Issue Type: Sub-task
> Components: contrib/hbase
> Affects Versions: 0.14.1
> Environment: all environments
> Reporter: Edward Yoon
> Priority: Minor
> Fix For: 0.16.0
>
> Attachments: 2021_v01.patch
>
>
> If no index exists on a join column, we can still improve on the nested-loop
> join by using a sort join: sort both inputs on the join column, then merge them.
> {code}
> R1 = table('movieLog_table');
> R2 = table('stockCompany_info');
> result = R1.join(R1.studioName = R2.corporation) and R2;
> {code}
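
To make the sort join idea concrete, here is a minimal Python sketch. It is an illustration only and is not taken from 2021_v01.patch; the function name `sort_merge_join` and the list-of-dicts table representation are assumptions made for this example. It sorts both inputs on the join column and then merges them in one pass, using the r1 and r2 tables from the comment above as data.

```python
from operator import itemgetter


def sort_merge_join(left, right, left_key, right_key):
    """Equi-join two lists of dicts: sort both on the join column, then merge.

    Hypothetical helper for illustration; returns (left_row, right_row) pairs.
    """
    left_sorted = sorted(left, key=itemgetter(left_key))
    right_sorted = sorted(right, key=itemgetter(right_key))
    pairs = []
    i = j = 0
    while i < len(left_sorted) and j < len(right_sorted):
        lval = left_sorted[i][left_key]
        rval = right_sorted[j][right_key]
        if lval < rval:
            i += 1
        elif lval > rval:
            j += 1
        else:
            # Find the end of the run of equal keys on each side.
            i_end = i
            while i_end < len(left_sorted) and left_sorted[i_end][left_key] == lval:
                i_end += 1
            j_end = j
            while j_end < len(right_sorted) and right_sorted[j_end][right_key] == lval:
                j_end += 1
            # Emit every combination of the two matching runs.
            for lrow in left_sorted[i:i_end]:
                for rrow in right_sorted[j:j_end]:
                    pairs.append((lrow, rrow))
            i, j = i_end, j_end
    return pairs


# Example data copied from the r1 and r2 tables in the comment.
r1 = [
    {"row": "row1", "a": "a1", "b": "b1", "c": "c1"},
    {"row": "row2", "a": "a2", "b": "b2", "c": "c2"},
]
r2 = [
    {"row": "row1", "e": "e1", "f": "a1"},
    {"row": "row2", "e": "e2", "f": "f2"},
    {"row": "row3", "e": "e3", "f": "f3"},
    {"row": "row4", "e": "e4", "f": "a1"},
]

# Join on r1.a = r2.f, as in the r3 example.
r3 = sort_merge_join(r1, r2, "a", "f")
for lrow, rrow in r3:
    print(lrow["row"], lrow["a"], lrow["b"], lrow["c"], rrow["row"], rrow["e"], rrow["f"])
```

With n and m input rows, the two sorts cost O(n log n + m log m) and the merge is a single pass over both inputs plus the size of the output, whereas a nested-loop join compares all n * m row pairs.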