[ 
https://issues.apache.org/jira/browse/PIG-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy reassigned PIG-3727:
---------------------------------------

    Assignee: Rohini Palaniswamy  (was: Cheolsoo Park)

Also need to add a ONE_ONE edge and POIdentityInOutTez, similar to PIG-3732.
Without that, order-by with the round-robin partitioner took far longer than
MR (I aborted the run when it had already taken 30 minutes longer than MR and
still had the last 40 reducers pending).

> Fix split + skewed join
> -----------------------
>
>                 Key: PIG-3727
>                 URL: https://issues.apache.org/jira/browse/PIG-3727
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>    Affects Versions: tez-branch
>            Reporter: Cheolsoo Park
>            Assignee: Rohini Palaniswamy
>             Fix For: tez-branch
>
>
> The e2e SkewedJoin_6 test runs the following query:
> {code}
> a = load ':INPATH:/singlefile/studenttab10k';
> b = filter a by $1 > 25;
> c = join a by $0, b by $0 using 'skewed' parallel 7;
> store c into ':OUTPATH:';
> {code}
> Currently, this fails with a compilation error in TezCompiler. Basically, 
> visitSkewedJoin() doesn't handle the POSplit that is inserted between load 
> and join.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
