[ 
https://issues.apache.org/jira/browse/PIG-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-750:
------------------------------

    Release Note: 
With the changes in this patch, queries that use algebraic functions within 
expressions will also use the combiner, as long as the bags produced by the 
group-by are used only as input to algebraic functions. If a bag is projected, 
or if a non-algebraic expression or UDF takes a bag as input, the combiner 
will not be used.
The combiner will be used for the following foreach statements (that follow a 
group):
describe B ;
B: {group: int, A: {c1 : int, c2 : int, c3 : int}}

1) foreach B generate SUM(A.c2) * AVG(A.c3), ...
2) foreach B generate 1 / SUM(A.c2)
3) foreach B generate EXP(AVG(A.c2))
4) foreach B generate group + SUM(A.c2)
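
To illustrate why statement 1 above can use the combiner, here is a minimal 
Python sketch (not Pig's actual implementation) of the general idea: each map 
task keeps only partial results per group, and the surrounding expression is 
evaluated after the partials are merged. The sample rows and the names 
combine and reduce_all are hypothetical, chosen only for this example.

```python
from collections import defaultdict

# Hypothetical input rows (group_key, c2, c3), split across two map tasks.
map_task_1 = [(1, 2, 10), (1, 4, 20), (2, 6, 30)]
map_task_2 = [(1, 6, 30), (2, 2, 10)]


def combine(rows):
    """Map-side combiner: keep only partial results per group key.

    SUM(A.c2) needs a running sum; AVG(A.c3) needs a (sum, count) pair.
    """
    partial = defaultdict(lambda: [0, 0, 0])  # [sum_c2, sum_c3, count]
    for key, c2, c3 in rows:
        p = partial[key]
        p[0] += c2
        p[1] += c3
        p[2] += 1
    return dict(partial)


def reduce_all(partials):
    """Reducer: merge the partials, then evaluate SUM(A.c2) * AVG(A.c3)."""
    merged = defaultdict(lambda: [0, 0, 0])
    for partial in partials:
        for key, (s2, s3, n) in partial.items():
            m = merged[key]
            m[0] += s2
            m[1] += s3
            m[2] += n
    # The multiplication is applied only after both aggregates are final.
    return {key: s2 * (s3 / n) for key, (s2, s3, n) in merged.items()}


result = reduce_all([combine(map_task_1), combine(map_task_2)])
print(result)
```

Only three numbers per group leave each map task, regardless of how many rows 
the group has. A statement that projects the bag itself (such as A.c2) cannot 
be reduced to partials in this way, because every row must reach the reducer.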


The following statements will not use the combiner:
1) foreach B generate A.c2, ...
2) foreach B generate EXP(A.c2), SUM(A.c2), ... (where EXP is a non-algebraic 
function)

In the case of a nested foreach statement, if it contains limit, order, or 
filter, the combiner is not used (as before).

This patch also fixes PIG-490: foreach statements that access elements of the 
group key now also use the combiner, for example:
1) foreach B generate group.$0, group.$1, COUNT(A);
2) foreach B generate group.c1, group.c2, COUNT(A);


> Use combiner when algebraic UDFs are used in expressions
> --------------------------------------------------------
>
>                 Key: PIG-750
>                 URL: https://issues.apache.org/jira/browse/PIG-750
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Amir Youssefi
>            Assignee: Thejas M Nair
>            Priority: Minor
>         Attachments: PIG-750.1.patch
>
>
> Currently Pig uses the combiner when all of a, b, c, ... are algebraic (e.g. 
> SUM, AVG, etc.) in a foreach:
> foreach X generate a,b,c,... 
> It would be a performance improvement if the combiner were also used when a 
> mix of algebraic and non-algebraic functions is present.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
