Haley Thrapp created PIG-5356:
---------------------------------
Summary: Add keyword on reducing tasks to allow combiner hints
Key: PIG-5356
URL: https://issues.apache.org/jira/browse/PIG-5356
Project: Pig
Issue Type: Wish
Reporter: Haley Thrapp
Create a keyword that would allow the programmer to tell Pig that a particular
DISTINCT/GROUP BY statement should run with the combiner disabled.
Often we have pieces of code that ensure data meets expectations, even if we
are pretty sure it already does. One example is when we do a DISTINCT on data
to ensure we do not have duplicates, or when we re-GROUP data to a certain
grain before a join. In these cases the combiner takes extra time without
actually giving us a benefit, since the data is already (nearly) unique at that
grain and there is little for it to collapse. We can gain a significant
performance improvement in these cases if the combiner is simply not run. In
some cases we can do this at the job level, but in others we may only want the
combiner shut off for particular statements.
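
To illustrate why the combiner does not help in these cases, here is a minimal
Python sketch that simulates a map-side DISTINCT combiner. It is not Pig code
and does not reflect Pig's actual implementation. The function names and the
sample splits are made up for illustration. It only counts how many records
enter and leave the combiner for already-unique data versus duplicate-heavy
data.

```python
def combine_distinct(split):
    """Hypothetical map-side combiner for DISTINCT: drop duplicates in one split."""
    # dict.fromkeys keeps the first occurrence of each record, in order.
    return list(dict.fromkeys(split))


def combiner_savings(splits):
    """Return (records_in, records_out) summed over all map splits."""
    records_in = sum(len(split) for split in splits)
    records_out = sum(len(combine_distinct(split)) for split in splits)
    return records_in, records_out


# Made-up sample data: each inner list stands for one map task's output.
already_unique = [["a", "b", "c"], ["d", "e", "f"]]
duplicate_heavy = [["a", "a", "b"], ["b", "b", "b"]]

unique_in, unique_out = combiner_savings(already_unique)
dup_in, dup_out = combiner_savings(duplicate_heavy)

print(unique_in, unique_out)  # 6 6: the combiner removed nothing
print(dup_in, dup_out)        # 6 3: the combiner halved the shuffle input
```

In the first case every record still goes to the reducers, so the combiner pass
is pure overhead. That is the situation a per-statement hint would target.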