[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908260#comment-14908260
 ] 

Saikat commented on TEZ-2850:
-----------------------------

[~sseth] I have some questions about the approach you mention:
1. "We should try capping the value based on a rough estimate of the size of 
segments."
How do we estimate the size of the segments, since it may vary for each map 
output?
And what percentage should be set as the default?

2. What should be the default number of segments? Should it be 0, so that 0 
means "ignore this setting"? The merge condition would then be:
(commitmemory > mergethreshold || (inMemMergeSegmentsThreshold != 0 && 
inMemoryMapOutputs.size() > inMemMergeSegmentsThreshold))


3. What should be the flag name? Hadoop has something like 
"mapreduce.reduce.merge.inmem.threshold".
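
To make the condition in question 2 concrete, here is a minimal standalone sketch of how the check could behave when 0 means "ignore this setting". The class name, method name, and parameter names are hypothetical and chosen only for illustration. This is not the actual MergeManager code, just the boolean logic from the expression above.

```java
// Hypothetical sketch, not the Tez MergeManager implementation.
final class MergeTriggerSketch {
    private MergeTriggerSketch() {}

    /**
     * Returns true when an in-memory merge should start.
     *
     * @param commitMemory      bytes currently committed to in-memory map outputs
     * @param mergeThreshold    byte threshold that triggers a merge
     * @param inMemorySegments  number of map outputs currently held in memory
     * @param segmentsThreshold segment-count threshold; 0 disables this check (assumed convention)
     */
    static boolean shouldStartInMemMerge(long commitMemory, long mergeThreshold,
                                         int inMemorySegments, int segmentsThreshold) {
        boolean memoryExceeded = commitMemory > mergeThreshold;
        // A threshold of 0 means the segment-count check is ignored entirely.
        boolean segmentCheckEnabled = segmentsThreshold != 0;
        boolean tooManySegments = segmentCheckEnabled && inMemorySegments > segmentsThreshold;
        return memoryExceeded || tooManySegments;
    }

    public static void main(String[] args) {
        // Many small segments, memory below threshold, segment check disabled.
        System.out.println(shouldStartInMemMerge(10, 100, 5000, 0));    // false
        // Same load, but with a segment threshold of 1000.
        System.out.println(shouldStartInMemMerge(10, 100, 5000, 1000)); // true
        // Memory threshold exceeded; segment check disabled.
        System.out.println(shouldStartInMemMerge(200, 100, 1, 0));      // true
    }
}
```

With the default at 0, existing jobs would keep today's memory-only behaviour, and only jobs that set the new flag would get the segment-count trigger.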


> Tez MergeManager OOM for small Map Outputs
> ------------------------------------------
>
>                 Key: TEZ-2850
>                 URL: https://issues.apache.org/jira/browse/TEZ-2850
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Saikat
>            Assignee: Saikat
>         Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
