[
https://issues.apache.org/jira/browse/PIG-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Dai updated PIG-3292:
----------------------------
Attachment: PIG-3292-1.patch
Looks good. Note that the issue occurs only in a nested cross. A self-cross at
the top level is not a problem, since LOSplit takes care of reassigning the uids.
The interplay with ColumnPruner is fine here, since the nested plan includes the
entire required plan branch, so there is no need to track the lineage of LOCross
in a nested cross. To be more specific, check the "nested" flag around
"fixDuplicateUids". Also add a test case.
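
To make the duplicate-uid condition concrete, here is a toy model in Python. It
is only a sketch and not Pig's implementation: the schema is reduced to
(alias, uid) pairs, and find_duplicate_uids and reassign_duplicate_uids are
hypothetical helpers that merely mirror the idea suggested by the name
"fixDuplicateUids". The aliases and uid 16 are taken from the error message in
the report.

```python
from itertools import count


def find_duplicate_uids(schema):
    """Return the set of uids that occur more than once in the schema."""
    seen, duplicates = set(), set()
    for _alias, uid in schema:
        if uid in seen:
            duplicates.add(uid)
        seen.add(uid)
    return duplicates


def reassign_duplicate_uids(schema):
    """Keep the first occurrence of each uid; give later repeats a fresh uid.

    Hypothetical helper: fresh uids simply start above the largest uid present.
    """
    fresh = count(max(uid for _alias, uid in schema) + 1)
    seen, fixed = set(), []
    for alias, uid in schema:
        if uid in seen:
            uid = next(fresh)
        seen.add(uid)
        fixed.append((alias, uid))
    return fixed


# Both inputs of the nested cross derive from a.x, so they share one uid.
cross_schema = [("1-7::x", 16), ("y::x", 16)]
fixed_schema = reassign_duplicate_uids(cross_schema)
```

In this model the schema of the nested cross is invalid before the reassignment
and valid after it; whether the real fix works this way is an assumption.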
> Logical plan invalid state: duplicate uid in schema during self-join to get
> cross product
> -----------------------------------------------------------------------------------------
>
> Key: PIG-3292
> URL: https://issues.apache.org/jira/browse/PIG-3292
> Project: Pig
> Issue Type: Bug
> Components: parser
> Affects Versions: 0.10.0
> Environment: CDH 4.2
> Reporter: Sergey
> Assignee: Cheolsoo Park
> Labels: newbie
> Fix For: 0.12.0, 0.11.2
>
> Attachments: PIG-3292-1.patch, PIG-3292.patch
>
>
> Hi.
> Looks like PIG-3020
> but works in a different way.
> Our pig version is:
> Apache Pig version 0.10.0-cdh4.2.0 (rexported)
> compiled Feb 15 2013, 12:20:54
> According to the release notes, PIG-3020 is included in the CDH 4.2 distribution:
> http://archive.cloudera.com/cdh4/cdh/4/pig-0.10.0-cdh4.2.0.CHANGES.txt
> The problem:
> We want to do a self-join to get a cross product:
> {code}
> a = load '/input' as (key, x);
> a_group = group a by key;
> b = foreach a_group {
>     y = a.x;
>     pair = cross a.x, y;
>     generate flatten(pair);
> }
> dump b;
> {code}
> And an error:
> {code}
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2270: Logical plan invalid
> state: duplicate uid in schema : 1-7::x#16:bytearray,y::x#16:bytearray
> {code}
> Here is a workaround :)
> {code}
> a = load '/input' as (key, x:int);
> a_group = group a by key;
> b = foreach a_group {
>     y = foreach a generate -(-x);
>     pair = cross a.x, y;
>     generate flatten(pair);
> }
> dump b;
> {code}
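
For reference, the result the scripts above are aiming for is, for each key,
every ordered pair of x values in that group. The following Python sketch shows
that intended semantics only; the function name nested_self_cross and the
sample rows are made up for illustration, and Pig itself does not guarantee any
particular output order.

```python
from collections import defaultdict
from itertools import product


def nested_self_cross(rows):
    """Group (key, x) rows by key, then pair every x with every x in its group."""
    groups = defaultdict(list)
    for key, x in rows:
        groups[key].append(x)
    pairs = []
    for xs in groups.values():
        # Cartesian product of the group's x values with themselves.
        pairs.extend(product(xs, repeat=2))
    return pairs


# Made-up sample input: two rows for key "k1", one row for key "k2".
rows = [("k1", 1), ("k1", 2), ("k2", 3)]
pairs = nested_self_cross(rows)
```

A group with n rows contributes n * n pairs, so the sample input yields five
pairs in total.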