[ 
https://issues.apache.org/jira/browse/PIG-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1735:
-------------------------------

    Assignee:     (was: Thejas M Nair)
    
> Use combiner in cogroup
> -----------------------
>
>                 Key: PIG-1735
>                 URL: https://issues.apache.org/jira/browse/PIG-1735
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Thejas M Nair
>
> As reported by Scott Carey in PIG-479, the combiner does not get used for 
> co-group, even if the functions applied on the bags are algebraic.
> Quoting from the comment:
> "For example, I'm not quite sure why this one doesn't use a combiner - it 
> reads ~350x as much input bytes from HDFS as its reduce output, a combiner 
> would be very effective:
> J = COGROUP
> UV BY (s, d, h, g, p, pa, st) OUTER,
> UC BY (s, d, h, g, p, pa, st) OUTER,
> AT BY (s, d, h, g, p, pa, st) OUTER,
> V BY (s, d, h, g, p, pa, st) OUTER,
> C BY (s, d, h, g, p, pa, st) OUTER;
> OUTPUT = FOREACH J GENERATE
> FLATTEN(group) as (s, d, h, g, p, pa, st),
> COUNT_STAR(C) as c,
> COUNT_STAR(V) as v,
> SUM(AT.p1) as p1,
> SUM(AT.p2) as p2,
> SUM(AT.p3) as p3,
> SUM(UC.q) as ucq,
> SUM(UC.r) as ucr,
> SUM(UV.q) as uvq,
> SUM(UV.r) as uvr;
> "
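
To illustrate why a combiner could help here, the following is a minimal sketch in plain Python, not Pig's implementation. It assumes each map task can pre-aggregate a (count, sum) pair per (key, input) before the shuffle, which is possible because COUNT_STAR and SUM are algebraic. The record layout, the function names combine and reduce_partials, and the sample data are all hypothetical.

```python
from collections import defaultdict

# Hypothetical record layout: (input_tag, key, value).
# The tag stands in for the cogroup input (e.g. "AT", "C") a record came from.

def combine(records):
    """Map-side partial aggregation: one (count, sum) pair per (key, tag)."""
    partial = defaultdict(lambda: [0, 0])
    for tag, key, value in records:
        acc = partial[(key, tag)]
        acc[0] += 1        # partial COUNT_STAR
        acc[1] += value    # partial SUM
    return {k: tuple(v) for k, v in partial.items()}

def reduce_partials(partials_per_map):
    """Reduce side: merge the partial pairs produced by every map task."""
    final = defaultdict(lambda: [0, 0])
    for partial in partials_per_map:
        for k, (count, total) in partial.items():
            final[k][0] += count
            final[k][1] += total
    return {k: tuple(v) for k, v in final.items()}

# Two hypothetical map tasks, each seeing records from inputs AT and C.
map1 = [("AT", "k1", 3), ("AT", "k1", 4), ("C", "k1", 0)]
map2 = [("AT", "k1", 5), ("C", "k1", 0), ("C", "k2", 0)]

result = reduce_partials([combine(map1), combine(map2)])
```

In this sketch each map task ships at most one pair per (key, input) instead of every record, which is the kind of reduction the quoted comment expects when input is far larger than the reduce output.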

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
