[ https://issues.apache.org/jira/browse/PIG-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678589#comment-13678589 ]
Johnny Zhang commented on PIG-3292:
-----------------------------------
Thanks for opening the JIRA, Sergey. I took a look, and I am not sure whether this is by design.
Take a simpler case: a self-cross works outside of foreach
{noformat}
a = load 'input' as (key, x);
b = cross a, a;
dump b;
{noformat}
but doing the same inside a foreach fails
{noformat}
a = load 'input' as (key, x);
a_group = group a by key;
b = foreach a_group { y = cross a, a; generate y; }
dump b;
{noformat}
My simplified script and yours fail with different errors (mine reports
"Duplicate schema alias"), but overall I think it is important to figure out
why this kind of restriction exists within foreach.
> Logical plan invalid state: duplicate uid in schema during self-join to get
> cross product
> -----------------------------------------------------------------------------------------
>
> Key: PIG-3292
> URL: https://issues.apache.org/jira/browse/PIG-3292
> Project: Pig
> Issue Type: Bug
> Components: parser
> Affects Versions: 0.10.0
> Environment: CDH 4.2
> Reporter: Sergey
> Labels: newbie
>
> Hi.
> This looks like PIG-3020, but it behaves differently.
> Our pig version is:
> Apache Pig version 0.10.0-cdh4.2.0 (rexported)
> compiled Feb 15 2013, 12:20:54
> According to the release notes, PIG-3020 is included in the CDH 4.2 distribution:
> http://archive.cloudera.com/cdh4/cdh/4/pig-0.10.0-cdh4.2.0.CHANGES.txt
> The problem:
> We want to do a self-join to get a cross product:
> {code}
> a = load '/input' as (key, x);
> a_group = group a by key;
> b = foreach a_group {
>     y = a.x;
>     pair = cross a.x, y;
>     generate flatten(pair);
> }
> dump b;
> {code}
> This fails with the error:
> {code}
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2270: Logical plan invalid
> state: duplicate uid in schema : 1-7::x#16:bytearray,y::x#16:bytearray
> {code}
> Here is a workaround :)
> {code}
> a = load '/input' as (key, x:int);
> a_group = group a by key;
> b = foreach a_group {
>     y = foreach a generate -(-x);
>     pair = cross a.x, y;
>     generate flatten(pair);
> }
> dump b;
> {code}
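For reference, the result the nested self-cross in the description is meant to produce can be modelled outside Pig. The following is only a rough Python sketch of the intended semantics, not of Pig's implementation: the function name `self_cross_per_key` and the sample rows are hypothetical, and the output order is an artifact of Python lists (Pig bags are unordered).

```python
from collections import defaultdict
from itertools import product


def self_cross_per_key(rows):
    """Group (key, x) rows by key, then pair every x with every x of the same key.

    Hypothetical helper that mimics: group a by key, cross a.x with itself
    inside the foreach, then flatten the resulting bag of pairs.
    """
    groups = defaultdict(list)
    for key, x in rows:
        groups[key].append(x)

    pairs = []
    for xs in groups.values():
        # Cross product of the group's x values with themselves.
        pairs.extend(product(xs, xs))
    return pairs


# Made-up sample input: two rows under key "k1", one row under key "k2".
rows = [("k1", 1), ("k1", 2), ("k2", 3)]
print(self_cross_per_key(rows))
```

A group with n rows contributes n * n pairs, so the key with two rows yields four pairs and the key with one row yields one.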