Hello,
I have just noticed that the implicit split seems to be added in the wrong 
place in the logical plan. I am examining the plan for the Pig script given in 
the JIRA issue https://issues.apache.org/jira/browse/PIG-627:

A = load 'data' as (a, b, c);
B = filter A by a > 5;
store B into 'output1';
C = group B by b;
store C into 'output2';

The logical plan is below. I think the split operator 
should be placed before the filter, so that the filter is 
performed on only one branch rather than on both.

Store 1-14 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: Unknown
|
|---SplitOutput[B] 1-21 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
    |   |
    |   Const 1-20 FieldSchema: boolean Type: boolean
    |
    |---Split 1-19 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
        |
        |---Filter 1-13 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
            |   |
            |   GreaterThan 1-12 FieldSchema: boolean Type: boolean
            |   |
            |   |---Const 1-11 FieldSchema: int Type: int
            |   |
            |   |---Cast 1-18 FieldSchema: int Type: int
            |       |
            |       |---Project 1-10 Projections: [0] Overloaded: false FieldSchema: a: bytearray Type: bytearray
            |           Input: Load 1-9
            |
            |---Load 1-9 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag

Store 1-17 Schema: {group: bytearray,B: {a: bytearray,b: bytearray,c: bytearray}} Type: Unknown
|
|---CoGroup 1-16 Schema: {group: bytearray,B: {a: bytearray,b: bytearray,c: bytearray}} Type: bag
    |   |
    |   Project 1-15 Projections: [1] Overloaded: false FieldSchema: b: bytearray Type: bytearray
    |   Input: SplitOutput[B] 1-23
    |
    |---SplitOutput[B] 1-23 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
        |   |
        |   Const 1-22 FieldSchema: boolean Type: boolean
        |
        |---Split 1-19 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
            |
            |---Filter 1-13 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
                |   |
                |   GreaterThan 1-12 FieldSchema: boolean Type: boolean
                |   |
                |   |---Const 1-11 FieldSchema: int Type: int
                |   |
                |   |---Cast 1-18 FieldSchema: int Type: int
                |       |
                |       |---Project 1-10 Projections: [0] Overloaded: false FieldSchema: a: bytearray Type: bytearray
                |           Input: Load 1-9
                |
                |---Load 1-9 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
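
To make the question concrete, here is a toy sketch in Python. It is not Pig's 
planner code: the statement list is a hand-written model of the script above, 
the names store1 and store2 are made up for the two store statements, and both 
helper functions are hypothetical. It only counts which aliases are read by 
more than one statement, which is where a plan has to fan out. Whether the 
split should instead sit before the filter is the open question.

```python
from collections import defaultdict

# Hand-written model of the script: (alias, operator, input alias).
# store1/store2 are hypothetical names for the two store statements.
script = [
    ("A", "load", None),
    ("B", "filter", "A"),
    ("store1", "store", "B"),
    ("C", "group", "B"),
    ("store2", "store", "C"),
]


def consumers(statements):
    """Map each alias to the statements that read it, in script order."""
    readers = defaultdict(list)
    for name, _op, source in statements:
        if source is not None:
            readers[source].append(name)
    return dict(readers)


def fan_out_points(statements):
    """Return the aliases that are read by more than one statement."""
    return [
        alias
        for alias, names in consumers(statements).items()
        if len(names) > 1
    ]


print(fan_out_points(script))  # prints ['B']
```

Under this simple reading, B (the filter output) is the only alias with two 
readers, store1 and C. In the dump above, Filter 1-13 and Load 1-9 carry the 
same operator ids in both trees.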

Thanks,
Iman.



