[ 
https://issues.apache.org/jira/browse/PIG-3347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692379#comment-13692379
 ] 

Daniel Dai commented on PIG-3347:
---------------------------------

This is caused by an incorrect PushUpFilter optimization. It can be worked around by disabling the PushUpFilter rule:
pig -t PushUpFilter -x local xxx.pig

Look at the logical plan:
{code}
c: (Name: LOStore Schema: group#28:bytearray,a_distinct#29:bag{#30:tuple(#31:bytearray)})
|
|---b: (Name: LOForEach Schema: group#28:bytearray,a_distinct#29:bag{#30:tuple(#31:bytearray)})
    |   |
    |   (Name: LOGenerate[false,false] Schema: group#28:bytearray,a_distinct#29:bag{#30:tuple(#31:bytearray)})ColumnPrune:InputUids=[29, 28]ColumnPrune:OutputUids=[29, 28]
    |   |   |
    |   |   group:(Name: Project Type: bytearray Uid: 28 Input: 0 Column: (*))
    |   |   |
    |   |   a_distinct:(Name: Project Type: bag Uid: 29 Input: 1 Column: (*))
    |   |
    |   |---(Name: LOInnerLoad[0] Schema: group#28:bytearray)
    |   |
    |   |---a_distinct: (Name: LODistinct Schema: #31:bytearray)
    |       |
    |       |---1-7: (Name: LOForEach Schema: #31:bytearray)
    |           |   |
    |           |   (Name: LOGenerate[false] Schema: #31:bytearray)
    |           |   |   |
    |           |   |   (Name: Project Type: bytearray Uid: 31 Input: 0 Column: (*))
    |           |   |
    |           |   |---(Name: LOInnerLoad[0] Schema: #31:bytearray)
    |           |
    |           |---a: (Name: LOInnerLoad[1] Schema: null)
    |
    |---c: (Name: LOFilter Schema: group#28:bytearray,a#29:bag{#36:tuple()})
        |   |
        |   (Name: Equal Type: boolean Uid: 35)
        |   |
        |   |---(Name: UserFunc(org.apache.pig.builtin.BagSize) Type: long Uid: 32)
        |   |   |
        |   |   |---a:(Name: Project Type: bag Uid: 29 Input: 0 Column: 1)
        |   |
        |   |---(Name: Cast Type: long Uid: 33)
        |       |
        |       |---(Name: Constant Type: int Uid: 33)
        |
        |---a_group: (Name: LOCogroup Schema: group#28:bytearray,a#29:bag{#36:tuple()})
            |   |
            |   (Name: Project Type: bytearray Uid: 28 Input: 0 Column: 0)
            |
            |---a: (Name: LOLoad Schema: null)RequiredFields:null
{code}

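The filter (c) has been pushed in front of the foreach (b), which is wrong: it now applies SIZE to bag a coming straight from the cogroup (Input: 0 Column: 1) rather than to a_distinct. With the input from the report, a holds two tuples, so SIZE(a) == 1 is false, the group is dropped, and the output is empty.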
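
To make the effect of the reordering concrete, here is a minimal sketch in plain Python (not Pig code) that models the three operators on the two-line input from the report. The function names cogroup, foreach_distinct and filter_size_one are made up for illustration and only approximate what the Pig operators do.

```python
from collections import defaultdict


def cogroup(rows):
    """Rough model of `a_group = group a by $0`: (group, bag a) pairs."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return list(groups.items())


def foreach_distinct(grouped):
    """Rough model of the nested foreach: bag a becomes distinct a.$0."""
    return [(key, sorted({row[0] for row in bag})) for key, bag in grouped]


def filter_size_one(records):
    """Rough model of the filter: keep records whose column 1 has size 1."""
    return [rec for rec in records if len(rec[1]) == 1]


rows = [("1",), ("1",)]  # the two input lines from the report

# Order written in the script: foreach (distinct) first, then filter.
intended = filter_size_one(foreach_distinct(cogroup(rows)))

# Order in the plan above: the filter sits directly on the cogroup output,
# so it sees the raw bag a (two tuples) and drops the only group.
pushed = foreach_distinct(filter_size_one(cogroup(rows)))

print(intended)  # [('1', ['1'])]
print(pushed)    # []
```

The two orderings are only interchangeable if the foreach leaves the filtered column untouched; here the distinct changes the bag size, so they are not.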
                
> Store invocation in local mode brings side effect
> -------------------------------------------------
>
>                 Key: PIG-3347
>                 URL: https://issues.apache.org/jira/browse/PIG-3347
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>    Affects Versions: 0.11
>         Environment: local mode
>            Reporter: Sergey
>
> The problem is that an intermediate 'store' invocation changes the final store
> output. It looks like the intermediate store has some kind of side effect. We
> ran the script in 'local' mode.
> Here is the input data:
> 1
> 1
> Here is the script:
> {code}
> a = load 'test';
> a_group = group a by $0;
> b = foreach a_group {
>   a_distinct = distinct a.$0;
>   generate group, a_distinct;
> }
> --store b into 'b';
> c = filter b by SIZE(a_distinct) == 1;
> store c into 'out';
> {code}
> We expect the output to be:
> 1 1
> The actual output is an empty file.
> Uncomment the {code}--store b into 'b';{code} line and see the difference.
> You will get the expected output.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
