[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-21 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967407#comment-14967407
 ] 

Siddharth Seth commented on TEZ-2850:
-

Not from me. I think this should go back to 0.6. I don't believe any more 
releases are planned from the 0.5 line - so I haven't been backporting patches 
to the 0.5 branch.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Jonathan Eagles
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850.3.patch, TEZ-2850_test.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-19 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963548#comment-14963548
 ] 

Jonathan Eagles commented on TEZ-2850:
--

[~sseth], [~gopalv], any other comments/concerns before this goes in? Also, 
should this go back to 0.5/0.6 or just to 0.7?

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Jonathan Eagles
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850.3.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-16 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961627#comment-14961627
 ] 

Siddharth Seth commented on TEZ-2850:
-

+1. This looks good. The null in close would have shown up as an NPE in 
ChecksumInputStream if it were invoked.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850.3.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961474#comment-14961474
 ] 

Gopal V commented on TEZ-2850:
--

[~jeagles]: good call on the in == null check, that's a valid assumption for 
InMemoryReader.

I notice that there's some possibility of an NPE in close() if checksumIn is 
null, but I'm pretty sure it doesn't get called in normal operation.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850.3.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-16 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961446#comment-14961446
 ] 

TezQA commented on TEZ-2850:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12767127/TEZ-2850.3.patch
  against master revision 25f0247.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.TestSpeculation

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1232//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1232//console

This message is automatically generated.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850.3.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-16 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961265#comment-14961265
 ] 

Jonathan Eagles commented on TEZ-2850:
--

TEZ-2901 was filed to take over the limit on the maximum number of in-memory 
segments. This ticket will cover removing the unnecessary IFileInputStream.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-16 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961167#comment-14961167
 ] 

Saikat commented on TEZ-2850:
-

[~jeagles] unassigning myself due to the time-critical nature of this bug. Will 
take up the bug if still unresolved.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955306#comment-14955306
 ] 

Siddharth Seth commented on TEZ-2850:
-

Yes, that should take care of not allocating the buffer. An alternate 
constructor may be required in the main Reader.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-13 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955234#comment-14955234
 ] 

Saikat commented on TEZ-2850:
-

[~sseth] if my understanding is correct, when we call the InMemoryReader 
constructor, which in turn calls the IFile.Reader superclass constructor, we 
should pass some indication that the IFileInputStream object should not be 
allocated, since checksumIn is not used when the data is already in memory.
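
One way the constructor arrangement described above could look is sketched below. This is only an illustration under stated assumptions, not the actual patch: ChecksumStreamSketch, ReaderSketch and InMemoryReaderSketch are made-up stand-ins for IFileInputStream, IFile.Reader and InMemoryReader, and the use of a null input as the "do not allocate" signal is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for a checksumming stream that eagerly allocates a read buffer. */
final class ChecksumStreamSketch {
    final byte[] buffer = new byte[4096];
    private final InputStream in;

    ChecksumStreamSketch(InputStream in) {
        this.in = in;
    }

    void close() throws IOException {
        in.close();
    }
}

/** Hypothetical stand-in for the base reader: wraps the input only when there is one. */
class ReaderSketch {
    protected final ChecksumStreamSketch checksumIn;

    ReaderSketch(InputStream in) {
        // Assumption: a null input means the data is already in memory,
        // so neither the stream nor its 4096-byte buffer is created.
        this.checksumIn = (in == null) ? null : new ChecksumStreamSketch(in);
    }

    boolean hasChecksumStream() {
        return checksumIn != null;
    }

    void close() throws IOException {
        // Guard against the null stream so close() stays safe for in-memory readers.
        if (checksumIn != null) {
            checksumIn.close();
        }
    }
}

/** Hypothetical stand-in for the in-memory reader: serves bytes from an array. */
final class InMemoryReaderSketch extends ReaderSketch {
    private final byte[] data;

    InMemoryReaderSketch(byte[] data) {
        super(null); // no checksum stream for data that is already in memory
        this.data = data;
    }

    int length() {
        return data.length;
    }
}
```

With this shape, an InMemoryReaderSketch carries only its byte array, while a ReaderSketch built over a real stream still gets the buffered checksum stream.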

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-12 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954018#comment-14954018
 ] 

TezQA commented on TEZ-2850:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12766181/TEZ-2850.2.patch
  against master revision 822bc69.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1210//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1210//console

This message is automatically generated.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-12 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954009#comment-14954009
 ] 

Siddharth Seth commented on TEZ-2850:
-

[~saikatr], as [~gopalv] pointed out in the previous comment - the 4K chunks 
are not required for in-memory segments. Fixing that would be a lot simpler - 
and less error prone (races on when to trigger a spill). We should target that 
as the fix for this issue.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850.2.patch, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-10-11 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952646#comment-14952646
 ] 

TezQA commented on TEZ-2850:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12766061/TEZ-2850.1.patch
  against master revision ba63219.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   
org.apache.tez.runtime.library.api.TestTezRuntimeConfiguration

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1208//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1208//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1208//console

This message is automatically generated.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850.1.patch, 
> TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-29 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936021#comment-14936021
 ] 

Siddharth Seth commented on TEZ-2850:
-

That is a very good point. The checksum has already been computed/verified 
while writing the segment to a buffer. Looks like setting up the constructors 
correctly will take care of this.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-29 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935959#comment-14935959
 ] 

Gopal V commented on TEZ-2850:
--

I've been trying to understand why we even have a reference to IFileInputStream 
from the Segment.

The shuffleToMemory() should throw away the IFileInputStream as soon as it 
copies the data into memory.

From my understanding of the merger code, for in-memory segments, this buffer 
is assumed to be already thrown away after the reader pulls it into memory.

Only disk segments should have 4 KB chunks attached to them (about 400 KB in 
total with a sort factor of 100).

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-28 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933962#comment-14933962
 ] 

Siddharth Seth commented on TEZ-2850:
-

bq. How do we estimate the size of the segments, since it may vary for each map 
output?
I mean the size of the segment data structure in memory, which should be 
independent of the data size. Looking at the heap images you've posted, this 
is approximately 5.5K?
Take 3% of the memory allocated for shuffle; that comes to about 1024 segments 
for a 200MB allocation.
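
As a rough check of that estimate, here is a small sketch. The class and method names are made up for illustration and none of the parameters is an existing Tez setting; it simply divides a fixed share of the shuffle allocation by an assumed per-segment footprint.

```java
final class SegmentCapSketch {
    /**
     * Estimates how many in-memory segments fit in a fixed share of the
     * shuffle allocation. All parameters are illustrative assumptions.
     */
    static long maxSegments(long shuffleBytes, int overheadPercent, long bytesPerSegment) {
        if (bytesPerSegment <= 0) {
            throw new IllegalArgumentException("bytesPerSegment must be positive");
        }
        // Integer arithmetic keeps the budget exact for whole-number percentages.
        long overheadBudget = shuffleBytes * overheadPercent / 100;
        return overheadBudget / bytesPerSegment;
    }
}
```

For a 200MB allocation, 3% and 5.5K (5632 bytes) per segment, SegmentCapSketch.maxSegments(200L * 1024 * 1024, 3, 5632) gives 1117, which is in the same ballpark as the figure of about 1024 quoted above.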

bq. What should be the default number of segments (should it be 0, so that 0 
means ignore this setting)?
A high number. Something like 4096. 0 would disable the checks.


mapreduce.reduce.merge.inmem.threshold in hadoop corresponds to 
tez.runtime.shuffle.memory-to-memory.segments - which indicates the number of 
segments after which an in-mem merge will be triggered, if enabled. This is 
slightly different - it's a limit on the segments, but triggers a disk merge 
instead of an in-mem merge. It'll have to be consolidated with in-mem merge 
once that is tested properly with Tez.
The property could be named tez.runtime.shuffle.in-memory.segments.max

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-25 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908260#comment-14908260
 ] 

Saikat commented on TEZ-2850:
-

[~sseth] some questions about the approach that you mention:
1. "We should try capping the value based on a rough estimate of the size of 
segments."
How do we estimate the size of the segments, since it may vary for each map 
output?
And what percent should be set as the default?

2. What should be the default number of segments (should it be 0, so that 0 
means ignore this setting)?
(commitMemory > mergeThreshold || (inMemMergeSegmentsThreshold != 0 && 
inMemoryMapOutputs.size() > inMemMergeSegmentsThreshold))


3. What should be the flag name? hadoop has something like 
"mapreduce.reduce.merge.inmem.threshold".


> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-23 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904810#comment-14904810
 ] 

Siddharth Seth commented on TEZ-2850:
-

I think it'll be better to add / compute another parameter which would limit 
the number of segments which are retained in memory.
i.e. Spill to disk if 1) the memory threshold is exceeded, or 2) If #segments 
limit is reached. 
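
The two spill conditions above can be written out as a small predicate. This is a sketch only: the class, method and parameter names are hypothetical, and treating a threshold of 0 as "check disabled" follows the convention suggested elsewhere in this thread rather than any existing setting.

```java
final class SpillDecisionSketch {
    /**
     * Returns true when in-memory map outputs should be merged to disk:
     * either the committed memory exceeds the merge threshold, or the number
     * of in-memory segments exceeds the segment limit.
     * A segment limit of 0 disables the segment-count check.
     */
    static boolean shouldSpill(long commitMemory, long mergeThreshold,
                               int inMemorySegments, int segmentsThreshold) {
        boolean memoryExceeded = commitMemory > mergeThreshold;
        boolean tooManySegments =
                segmentsThreshold != 0 && inMemorySegments > segmentsThreshold;
        return memoryExceeded || tooManySegments;
    }
}
```

With roughly 20 MB committed against a 500 MB threshold, the memory check alone never fires; a segment limit such as 4096 would trigger the spill once more than 4096 segments are held in memory.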

This could be a configurable parameter - which serves more as an upper limit. 
We should try capping the value based on a rough estimate of the size of 
segments.
The JVM size cannot be used as an available memory parameter, since multiple 
Inputs/Outputs could be running in the same JVM. We could limit this to a small 
fraction of the allocated memory for the shuffle.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
>Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-23 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904597#comment-14904597
 ] 

Saikat commented on TEZ-2850:
-

Thanks [~gopalv] [~sseth] for the explanation. So how do we go about handling 
this scenario?
Can we have a Tez config flag to turn this optimization on/off? If so, 
what name should be used for the flag? I can submit a patch for review.

Either this IFileInputStream optimization flag, tweaking the 
shuffle.merge.percent flag, or both could resolve this problem.

(Unless this optimization is turned off, we might need to set a very low value 
of around 0.01 for shuffle.merge.percent.)

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903821#comment-14903821
 ] 

Gopal V commented on TEZ-2850:
--

Good catch [~saikatr], that's 4kb of space overhead for 100 bytes of data.

The perf fix was meant to reduce the total # of JNI calls to libhadoop.so 
CRC32. Without this fix, the Writable deserialization is unbuffered - so an 
IntWritable will trigger 1 JNI call out to libhadoop.so per 4 byte Integer read 
(also see HADOOP-10778).
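
To make the trade-off concrete, here is a rough sketch that only counts checksum invocations for different read sizes. It uses the pure-Java java.util.zip.CRC32 as a stand-in for the native libhadoop.so CRC32, so it shows the call count rather than any JNI cost; the class and method names are made up for illustration.

```java
import java.util.zip.CRC32;

final class ChecksumCallSketch {
    /** Checksums the data in fixed-size pieces and returns how many update() calls were made. */
    static int countUpdateCalls(byte[] data, int chunkSize, CRC32 crc) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int calls = 0;
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            // With a native CRC32, each of these calls would cross into native code.
            crc.update(data, off, len);
            calls++;
        }
        return calls;
    }
}
```

For a 4096-byte array, a chunk size of 4 makes 1024 update calls and a chunk size of 4096 makes one, while the resulting checksum is the same either way. The price of the larger chunk is the 4 KB buffer per stream, which is what hurts when each segment holds only about 100 bytes.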

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903787#comment-14903787
 ] 

Siddharth Seth commented on TEZ-2850:
-

Nice find!

I believe this change was made to reduce the number of times the checksum is 
computed, and to try and compute it in chunks of 4096 for better performance. 
cc [~gopalv]

Other than the 4K buffer, there are a number of other objects, references, etc. 
per Segment - I wouldn't be surprised if this adds up to a KB. Along with the 
memory spill limit, adding a limit on the number of in-memory segments would help.

The memory-to-memory merger would normally have helped in this case, but that's 
not tested and should not be enabled.

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-22 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903562#comment-14903562
 ] 

Rohini Palaniswamy commented on TEZ-2850:
-

bq. A reducer vertex task fetches around 200,000 map outputs, each of around 
~100 odd bytes.
   In case the question comes up of why there are so many map outputs: it is 
because of auto parallelism. Consider the case where auto parallelism estimates 
999 for both the source and target vertex, which is very common with Pig (999 
is the default upper limit for estimation). But the source produces less data, 
which changes the target vertex parallelism to 1 (it may be a higher number in 
Saikat's case). So 1 task can end up fetching 999*999 = 998001 map outputs. 

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903501#comment-14903501
 ] 

Hitesh Shah commented on TEZ-2850:
--

cc [~rajesh.balamohan] [~sseth] [~gopalv] 

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-22 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903498#comment-14903498
 ] 

Saikat commented on TEZ-2850:
-

adding [~jeagles] [~rohini] for watch

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-22 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903497#comment-14903497
 ] 

Saikat commented on TEZ-2850:
-

[~hitesh] I was going through Hadoop's IFileInputStream implementation 
(hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/IFileInputStream.java) 
and found that the buffer[4096] is present only in Tez's implementation, not 
in Hadoop's. I submitted a tentative patch in which Tez's IFileInputStream 
behaves exactly like Hadoop's. Can you please shed some light on what this 
added buffer does?


 

> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>






[jira] [Commented] (TEZ-2850) Tez MergeManager OOM for small Map Outputs

2015-09-22 Thread Saikat (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903481#comment-14903481
 ] 

Saikat commented on TEZ-2850:
-

This is a unique scenario that we faced while running a Tez job.
A reducer vertex task fetches around 200,000 map outputs, each of around ~100 
odd bytes.
So the total map output size is around 200,000 * 100 bytes ~ 20 MB.
The MergeManager has a merge threshold check: if the committed memory crosses 
this threshold, the InMemoryMerger is triggered and spills the in-memory 
fetched map outputs to disk to free up memory.

In our scenario, mergeThreshold (~500 MB) >> commitMemory (~20 MB), so the 
in-memory merger never gets triggered.
Finally, when finalMerge() is called in close(), MergeManager calls 
createInMemorySegments() to do the final merge.
In this step, when Tez creates an IFileInputStream object for the 
InMemoryReader, the IFileInputStream allocates a buffer of size 4096 (hard-coded).
Thus the total size of a single in-memory segment comes to around 5 KB, even 
though the data in this segment is only on the order of 100 bytes. So, for 
200,000 map outputs, the total size is 200,000 * 5000 bytes ~ 1 GB, which 
causes the OOM!
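
The arithmetic above can be checked with a short sketch. It assumes roughly 200,000 fetched map outputs (the count implied by the 20 MB and 1 GB totals), about 100 bytes of payload per output, and about 5,000 bytes of heap per in-memory segment. The class and method names are made up for illustration and the figures are approximations, not measurements.

```java
final class SegmentOverheadSketch {
    // Assumed figures taken from the description above, not measured values.
    static final long MAP_OUTPUTS = 200_000L;       // assumed number of fetched map outputs
    static final long PAYLOAD_BYTES = 100L;         // approximate data per map output
    static final long SEGMENT_HEAP_BYTES = 5_000L;  // approximate heap per in-memory segment

    /** Bytes the merge threshold check sees: the payload only. */
    static long accountedBytes() {
        return MAP_OUTPUTS * PAYLOAD_BYTES;
    }

    /** Bytes actually held on the heap once every segment carries its own buffer. */
    static long heapBytes() {
        return MAP_OUTPUTS * SEGMENT_HEAP_BYTES;
    }

    /** How many times larger the heap footprint is than the accounted payload. */
    static long overheadFactor() {
        return heapBytes() / accountedBytes();
    }
}
```

Under these assumptions accountedBytes() is 20,000,000 (about 20 MB), heapBytes() is 1,000,000,000 (about 1 GB), and the overhead factor is 50, which is why a threshold based on payload size alone never triggers a spill here.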

Attached is a snapshot of the heap dump which shows this scenario.




> Tez MergeManager OOM for small Map Outputs
> --
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png
>
>



