Hello,
While examining the logical plan for the Pig script from the JIRA issue
https://issues.apache.org/jira/browse/PIG-627, I noticed that the implicit
split appears to be inserted in the wrong place. The script is:
A = load 'data' as (a, b, c);
B = filter A by a > 5;
store B into 'output1';
C = group B by b;
store C into 'output2';
The logical plan is below. I think the split operator should be
placed before the filter, so that the filter is performed on
only one branch rather than on both.
Store 1-14 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: Unknown
|
|---SplitOutput[B] 1-21 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
    |   |
    |   Const 1-20 FieldSchema: boolean Type: boolean
    |
    |---Split 1-19 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
        |
        |---Filter 1-13 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
            |   |
            |   GreaterThan 1-12 FieldSchema: boolean Type: boolean
            |   |
            |   |---Const 1-11 FieldSchema: int Type: int
            |   |
            |   |---Cast 1-18 FieldSchema: int Type: int
            |       |
            |       |---Project 1-10 Projections: [0] Overloaded: false FieldSchema: a: bytearray Type: bytearray
            |           Input: Load 1-9
            |
            |---Load 1-9 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
Store 1-17 Schema: {group: bytearray,B: {a: bytearray,b: bytearray,c: bytearray}} Type: Unknown
|
|---CoGroup 1-16 Schema: {group: bytearray,B: {a: bytearray,b: bytearray,c: bytearray}} Type: bag
    |   |
    |   Project 1-15 Projections: [1] Overloaded: false FieldSchema: b: bytearray Type: bytearray
    |   Input: SplitOutput[B] 1-23
    |
    |---SplitOutput[B] 1-23 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
        |   |
        |   Const 1-22 FieldSchema: boolean Type: boolean
        |
        |---Split 1-19 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
            |
            |---Filter 1-13 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
                |   |
                |   GreaterThan 1-12 FieldSchema: boolean Type: boolean
                |   |
                |   |---Const 1-11 FieldSchema: int Type: int
                |   |
                |   |---Cast 1-18 FieldSchema: int Type: int
                |       |
                |       |---Project 1-10 Projections: [0] Overloaded: false FieldSchema: a: bytearray Type: bytearray
                |           Input: Load 1-9
                |
                |---Load 1-9 Schema: {a: bytearray,b: bytearray,c: bytearray} Type: bag
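To make it easier to check where a split could go, here is a rough Python sketch of the script's dataflow. It is only a hand-written model, not Pig code: the names STATEMENTS, consumers and split_points are made up for illustration, and it assumes that an implicit split is a candidate wherever an alias is read by more than one statement.

```python
from collections import defaultdict

# Dataflow of the script above as (name, operator, input alias).
# Hand-transcribed from the script for illustration; not produced by Pig.
STATEMENTS = [
    ("A", "load", None),
    ("B", "filter", "A"),
    ("store:output1", "store", "B"),
    ("C", "group", "B"),
    ("store:output2", "store", "C"),
]


def consumers(statements):
    """Map each alias to the names of the statements that read it."""
    readers = defaultdict(list)
    for name, _op, source in statements:
        if source is not None:
            readers[source].append(name)
    return dict(readers)


def split_points(statements):
    """Aliases read by more than one statement (assumed split candidates)."""
    return [alias for alias, users in consumers(statements).items()
            if len(users) > 1]


if __name__ == "__main__":
    for alias, users in consumers(STATEMENTS).items():
        print(alias, "->", users)
    print("split candidates:", split_points(STATEMENTS))
```

In this model B is the only alias with two readers (the store into 'output1' and the group), while A has a single reader (the filter).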
Thanks,
Iman.