[ 
https://issues.apache.org/jira/browse/PIG-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4162:
------------------------------------
    Attachment: PIG-4162-1.patch

The following changes were made:

- Always estimate intermediate reducer parallelism, even if the user has 
specified PARALLEL.
- intermediate reducer parallelism = Math.min(2 * userparallelism, 
Math.max(userparallelism, Math.max(estimatedparallelism, 
Math.max(2999, PigReducerEstimator.MAX_REDUCER_COUNT_PARAM)))), i.e. the 
estimated parallelism is limited to no more than 2x userparallelism or 2999. 
The value 2999 is hardcoded for now; it applies only to intermediate reducers 
and differs from the final reducer max parallelism default of 999. It will be 
made configurable later if needed.
- ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_DESIRED_TASK_INPUT_SIZE is 
set to the block size for intermediate tasks (same as the mapper behaviour) 
instead of InputSizeReducerEstimator.DEFAULT_BYTES_PER_REDUCER, which defaults 
to 1G.
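
As a rough illustration only, the expression in the second item can be transcribed literally into a small standalone Java method. This is a sketch, not the code in the patch: the class name, method name and parameter names are hypothetical, and maxReducerCount is assumed to stand for the value configured under PigReducerEstimator.MAX_REDUCER_COUNT_PARAM. The attached patch may combine the terms differently.

```java
public class IntermediateParallelismSketch {

    // Hardcoded cap for intermediate reducers, as described in the comment.
    static final int INTERMEDIATE_MAX = 2999;

    /**
     * Literal transcription of the expression in the comment (hypothetical
     * helper, not taken from the patch):
     *   min(2 * user, max(user, max(estimated, max(2999, maxReducerCount))))
     */
    static int intermediateParallelism(int userParallelism,
                                       int estimatedParallelism,
                                       int maxReducerCount) {
        int cap = Math.max(INTERMEDIATE_MAX, maxReducerCount);
        int estimate = Math.max(estimatedParallelism, cap);
        int lowerBounded = Math.max(userParallelism, estimate);
        // Never exceed twice what the user asked for with PARALLEL.
        return Math.min(2 * userParallelism, lowerBounded);
    }

    public static void main(String[] args) {
        // PARALLEL 10, estimate 500, max reducer setting 999
        System.out.println(intermediateParallelism(10, 500, 999));
        // PARALLEL 2000, estimate 5000, max reducer setting 999
        System.out.println(intermediateParallelism(2000, 5000, 999));
    }
}
```

With these inputs the first call yields 20 and the second 4000, since in both cases the 2x userparallelism bound is the smaller term.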

The patch also includes a few other minor, unrelated fixes.

> Intermediate reducer parallelism in Tez should be higher
> --------------------------------------------------------
>
>                 Key: PIG-4162
>                 URL: https://issues.apache.org/jira/browse/PIG-4162
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.14.0
>
>         Attachments: PIG-4162-1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
