[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903562#comment-14903562
 ] 

Rohini Palaniswamy commented on TEZ-2850:
-----------------------------------------

bq. A reducer vertex task fetches around 200000 map outputs, each of around 
~100 odd bytes.
   In case the question comes up of why there are so many map outputs: it is 
because of auto parallelism. Consider the case where auto parallelism estimates 
999 tasks for both the source and the target vertex, which is very common with 
Pig (999 is the default upper limit for the estimate). If the source then 
produces less data than estimated, the target vertex parallelism is reduced to 
1 (it may be a higher number in Saikat's case). Each of the 999 source tasks 
has already partitioned its output 999 ways, so that 1 task can end up 
fetching 999*999 = 998001 map outputs. 
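
The arithmetic above can be checked with a small sketch. This is only an illustration of the counting argument, not Tez code: the function name and parameters are hypothetical, and it assumes the planned partitions are split evenly across the remaining tasks after parallelism is reduced.

```python
def fetches_per_task(source_tasks, planned_partitions, actual_tasks):
    """Upper bound on map outputs fetched by one downstream task.

    Assumption (for illustration only): the planned partitions are divided
    evenly among the tasks that remain after parallelism is reduced.
    """
    # Ceiling division: partitions handled by the most heavily loaded task.
    partitions_per_task = -(-planned_partitions // actual_tasks)
    # Every source task contributes one output per partition.
    return source_tasks * partitions_per_task


print(fetches_per_task(999, 999, 1))    # 998001, the case described above
print(fetches_per_task(999, 999, 5))    # 199800, a hypothetical reduction to 5 tasks
print(fetches_per_task(999, 999, 999))  # 999, no reduction
```

Even when each output is only around 100 bytes, the number of fetches grows with the product of the two estimates, so the per-output bookkeeping rather than the data size is what adds up.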

> Tez MergeManager OOM for small Map Outputs
> ------------------------------------------
>
>                 Key: TEZ-2850
>                 URL: https://issues.apache.org/jira/browse/TEZ-2850
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Saikat
>         Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
