[ https://issues.apache.org/jira/browse/PIG-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-4162:
------------------------------------
    Attachment: PIG-4162-1.patch

The following changes were made:
- Intermediate reducer parallelism is always estimated, even if the user has specified PARALLEL.
- intermediate reducer parallelism = Min(2 * userparallelism, Math.max(userparallelism, Math.max(estimatedparallelism, Math.max(2999, PigReducerEstimator.MAX_REDUCER_COUNT_PARAM)))), i.e. the estimated parallelism is limited to no more than 2x userparallelism or 2999. The value 2999 is hardcoded for now; it differs from the final reducer max parallelism default of 999 and applies only to intermediate reducers. It will be made configurable later if needed.
- ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_DESIRED_TASK_INPUT_SIZE is set to the block size for intermediate tasks (same as mapper behaviour) instead of InputSizeReducerEstimator.DEFAULT_BYTES_PER_REDUCER, which defaults to 1G.

The patch has a few other minor unrelated fixes as well.

> Intermediate reducer parallelism in Tez should be higher
> --------------------------------------------------------
>
>                 Key: PIG-4162
>                 URL: https://issues.apache.org/jira/browse/PIG-4162
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.14.0
>
>         Attachments: PIG-4162-1.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
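
For readers who want to try the numbers, the intermediate reducer parallelism expression quoted in the comment can be sketched as below. This is a literal transcription of the expression as posted, not code taken from PIG-4162-1.patch, and the patch itself may differ. The class name, method name and the maxReducerCount parameter are hypothetical; maxReducerCount is assumed to be the integer configured under the key PigReducerEstimator.MAX_REDUCER_COUNT_PARAM.

```java
// Literal transcription of the expression quoted in the comment.
// Not code from PIG-4162-1.patch; all names here are hypothetical.
final class IntermediateParallelismSketch {

    // Hardcoded value for intermediate reducers mentioned in the comment.
    static final int INTERMEDIATE_VALUE = 2999;

    // userParallelism: value given by the user via PARALLEL.
    // estimatedParallelism: value produced by the reducer estimator.
    // maxReducerCount: assumed to be the integer configured under
    // PigReducerEstimator.MAX_REDUCER_COUNT_PARAM.
    static int intermediateParallelism(int userParallelism,
                                       int estimatedParallelism,
                                       int maxReducerCount) {
        int inner = Math.max(estimatedParallelism,
                Math.max(INTERMEDIATE_VALUE, maxReducerCount));
        return Math.min(2 * userParallelism,
                Math.max(userParallelism, inner));
    }

    public static void main(String[] args) {
        System.out.println(intermediateParallelism(100, 500, 999));
        System.out.println(intermediateParallelism(2000, 500, 999));
        System.out.println(intermediateParallelism(2000, 5000, 999));
    }
}
```

With PARALLEL 100, an estimate of 500 and a configured maximum of 999, this transcription yields 200, because 2 * userparallelism is the smaller of the two bounds.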