[ 
https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891032#comment-13891032
 ] 

Daniel Dai commented on PIG-3347:
---------------------------------

[~knoguchi], in the "B = foreach A generate a as col1, a as col2;" case, we 
will need to generate a new uid for col2 to avoid a uid conflict (using a UDF, 
IdentityColumn). The downside is that this will break the lineage chain. The 
uid is mostly used in the optimizer, and there are several holes when we use 
it for pure lineage. Optimizer rules are expected to live with these holes by 
skipping the optimization (e.g., PushUpFilter skips a foreach with a UDF, 
which includes the IdentityColumn added to fix the uid conflict).
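
As a rough illustration of the case above (a sketch only; the relation names, 
the input path 'input', and the filter condition are made up for this example 
and are not taken from the patch):

{code}
-- 'a' is projected twice, so col1 and col2 would otherwise share a's uid
A = load 'input' as (a:int);
B = foreach A generate a as col1, a as col2;
-- with a fresh uid on col2, the optimizer no longer sees col2 as A.a,
-- so PushUpFilter is expected to leave this filter below the foreach
-- instead of pushing it above B
C = filter B by col2 > 0;
dump C;
{code}

Since col2 receives a new uid, it cannot be traced back to A.a through uids 
alone, which is the kind of lineage hole mentioned above.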

> Store invocation brings side effect
> -----------------------------------
>
>                 Key: PIG-3347
>                 URL: https://issues.apache.org/jira/browse/PIG-3347
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>    Affects Versions: 0.11
>         Environment: local mode
>            Reporter: Sergey
>            Assignee: Daniel Dai
>            Priority: Critical
>             Fix For: 0.12.1
>
>         Attachments: PIG-3347-1.patch, PIG-3347-2-testonly.patch, 
> PIG-3347-3.patch, PIG-3347-4-testonly.patch, PIG-3347-5.patch
>
>
> The problem is that an intermediate 'store' invocation "changes" the final 
> store output. It looks like it brings some kind of side effect. We used 
> 'local' mode to run the script.
> Here is the input data:
> 1
> 1
> Here is the script:
> {code}
> a = load 'test';
> a_group = group a by $0;
> b = foreach a_group {
>   a_distinct = distinct a.$0;
>   generate group, a_distinct;
> }
> --store b into 'b';
> c = filter b by SIZE(a_distinct) == 1;
> store c into 'out';
> {code}
> We expect output to be:
> 1 1
> The output is an empty file.
> Uncomment the {code}--store b into 'b';{code} line and see the difference.
> You would get the expected output.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
