Better MergeForEach rule
------------------------
Key: PIG-2009
URL: https://issues.apache.org/jira/browse/PIG-2009
Project: Pig
Issue Type: Improvement
Affects Versions: 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.10
The MergeForEach rule does not merge two consecutive ForEach operators if the
second ForEach has an inner relational plan. This prevents some optimizations.
For example:
{code}
A = LOAD 'input1' AS (a0, a1, a2);
B = LOAD 'input2' AS (b0, b1, b2);
C = cogroup A by a0, B by b0;
D = foreach C {
    E = limit A 10;
    F = E.a1;
    G = DISTINCT F;
    generate group, COUNT(G);
};
explain D;
{code}
Pig inserts a ForEach after the cogroup to prune B, which D never uses. However,
because D has an inner relational plan, the rule cannot merge this ForEach with
D. Secondary key optimization is therefore disabled for this query.
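
As a rough illustration only, the plan after pruning behaves as if the script
had been written with two consecutive ForEach statements, as sketched below.
The alias C1 is hypothetical and chosen for this example; the optimizer works
on the logical plan and does not rewrite the script text.
{code}
A = LOAD 'input1' AS (a0, a1, a2);
B = LOAD 'input2' AS (b0, b1, b2);
C = cogroup A by a0, B by b0;
-- Hypothetical alias: stands in for the ForEach inserted to prune B.
C1 = foreach C generate group, A;
-- Second ForEach with an inner relational plan (limit, projection, distinct).
D = foreach C1 {
    E = limit A 10;
    F = E.a1;
    G = DISTINCT F;
    generate group, COUNT(G);
};
explain D;
{code}
An improved rule would collapse C1 and D into a single ForEach directly over C,
which is the shape the secondary key optimization expects.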