[ https://issues.apache.org/jira/browse/PIG-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-1735: ------------------------------- Assignee: (was: Thejas M Nair) > Use combiner in cogroup > ----------------------- > > Key: PIG-1735 > URL: https://issues.apache.org/jira/browse/PIG-1735 > Project: Pig > Issue Type: Improvement > Reporter: Thejas M Nair > > As reported by Scott Carey in PIG-479, combiner does not get used for > co-group, even if the functions applied on the bags are algebraic . - > Quoting from the comment - > "For example, I'm not quite sure why this one doesn't use a combiner - it > reads ~350x as much input bytes from HDFS as its reduce output, a combiner > would be very effective: > J = COGROUP > UV BY (s, d, h, g, p, pa, st) OUTER, > UC BY (s, d, h, g, p, pa, st) OUTER, > AT BY (s, d, h, g, p, pa, st) OUTER, > V BY (s, d, h, g, p, pa, st) OUTER, > C BY (s, d, h, g, p, pa, st) OUTER; > OUTPUT = FOREACH J GENERATE > FLATTEN(group) as (s, d, h, g, p, pa, st), > COUNT_STAR(C) as c, > COUNT_STAR(V) as v, > SUM(AT.p1) as p1, > SUM(AT.p2) as p2, > SUM(AT.p3) as p3, > SUM(UC.q) as ucq, > SUM(UC.r) as ucr, > SUM(UV.q) as uvq, > SUM(UV.r) as uvr; > " -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira