[ https://issues.apache.org/jira/browse/PIG-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-1810:
--------------------------------
    Fix Version/s:     (was: 0.9.0)

Unlinking from the release since there is no activity and it is too late for new functionality to be added to 0.9.

> Prioritize hadoop parameter "mapred.reduce.tasks" above estimation of reducer number
> ------------------------------------------------------------------------------------
>
>                 Key: PIG-1810
>                 URL: https://issues.apache.org/jira/browse/PIG-1810
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>
> Anup pointed out this problem in PIG-1249:
> {quote}
> Anup added a comment - 18/Jan/11 07:46 PM
> One thing that we didn't take care of is the use of the hadoop parameter "mapred.reduce.tasks".
> If I specify the hadoop parameter -Dmapred.reduce.tasks=450 for all the MR jobs, it is overwritten by estimateNumberOfReducers(conf, mro), which in my case is 15.
> I am not specifying any default_parallel or PARALLEL statements.
> Ideally, the number of reducers should be 450.
> I think we should prioritize this parameter above the estimated-reducer calculation.
> The priority list should be:
> 1. PARALLEL statement
> 2. default_parallel statement
> 3. mapred.reduce.tasks hadoop parameter
> 4. estimateNumberOfReducers()
> {quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
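The priority order proposed in the quoted comment could be sketched as below. This is a minimal illustration, not Pig's actual implementation; the class and method names (ReducerParallelism, chooseReducers) are hypothetical, and only the fallback ordering reflects the proposal in the issue.

```java
// Hypothetical sketch of the proposed reducer-parallelism priority:
// PARALLEL > default_parallel > mapred.reduce.tasks > estimateNumberOfReducers().
// None of these identifiers are Pig's real API; they only illustrate the ordering.
public class ReducerParallelism {

    static int chooseReducers(Integer parallelClause,     // from a PARALLEL statement, or null
                              Integer defaultParallel,    // from "set default_parallel", or null
                              Integer mapredReduceTasks,  // from -Dmapred.reduce.tasks, or null
                              int estimated) {            // result of the reducer estimation
        if (parallelClause != null)    return parallelClause;
        if (defaultParallel != null)   return defaultParallel;
        if (mapredReduceTasks != null) return mapredReduceTasks;
        return estimated;
    }

    public static void main(String[] args) {
        // Anup's scenario: only -Dmapred.reduce.tasks=450 is set, estimate is 15.
        // Under the proposed priority the explicit setting wins.
        System.out.println(chooseReducers(null, null, 450, 15)); // prints 450
    }
}
```

With this ordering, the user's explicit -Dmapred.reduce.tasks=450 would no longer be silently overwritten by the estimate of 15; the estimate applies only when none of the three explicit settings are present.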