[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-09 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120715#comment-16120715
 ] 

TezQA commented on TEZ-3813:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12881081/TEZ-3813.006.patch
  against master revision 8dcf8a1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.rm.TestTaskScheduler

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2609//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2609//console

This message is automatically generated.

> Reduce Object size of MemoryFetchedInput for large jobs
> ---
>
> Key: TEZ-3813
> URL: https://issues.apache.org/jira/browse/TEZ-3813
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Muhammad Samir Khan
> Assignee: Muhammad Samir Khan
> Attachments: TEZ-3813.001.patch, TEZ-3813.002.patch, 
> TEZ-3813.003.patch, TEZ-3813.004.patch, TEZ-3813.005.patch, TEZ-3813.006.patch
>
>
> Same as TEZ-3752 for the unordered case. MemoryFetchedInput has a 
> BoundedByteArrayOutputStream that is not used (only the underlying byte[] is 
> used).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-09 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120652#comment-16120652
 ] 

TezQA commented on TEZ-3813:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12881063/TEZ-3813.005.patch
  against master revision 8dcf8a1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in :
 org.apache.tez.test.TestTezJobs

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2607//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2607//console

This message is automatically generated.



[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-09 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120590#comment-16120590
 ] 

Jonathan Eagles commented on TEZ-3813:
--

A couple of minor nits.

It seems we can remove this commented-out code:
{code:title=FetchedInput.java}
+//  public long getActualSize() {
+//return this.actualSize;
+//  }
+//
+//  public long getCompressedSize() {
+//return this.compressedSize;
+//  }
{code}

We should add @Override to this method and to any others that override getSize:
{code:title=MemoryFetchedInput.java}
public long getSize()
{code}
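
To illustrate the second nit, here is a minimal sketch of why the annotation is worth adding. The classes are simplified stand-ins invented for this example, not the actual Tez sources. With @Override in place, the compiler rejects the subclass if getSize is ever renamed or its signature changes in the base class, instead of silently leaving an unrelated method behind.

```java
// Simplified stand-ins for illustration only; not the real Tez classes.
abstract class SketchFetchedInput {
  // Size in bytes of the fetched data.
  public abstract long getSize();
}

class SketchMemoryFetchedInput extends SketchFetchedInput {
  private final byte[] byteArray;

  SketchMemoryFetchedInput(int size) {
    this.byteArray = new byte[size];
  }

  // @Override makes the compiler verify that this method really overrides
  // a method declared in the superclass.
  @Override
  public long getSize() {
    return byteArray.length;
  }
}
```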



[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-09 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120166#comment-16120166
 ] 

Jonathan Eagles commented on TEZ-3813:
--

[~samirkhan], can we try removing the MemoryFetchedInput#size member? That 
would bring this object down by one more 8-byte alignment boundary. We will 
have to avoid the null pointer exception in 
SimpleFetchedInputAllocator#cleanup, perhaps just by moving byteArray = null; 
below the notifyFreedResource call.
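
The ordering concern can be sketched as follows. This is a hypothetical, simplified model: the class names, the callback method, and the usedMemory counter are assumptions made for the example and do not reproduce the Tez code. Once the size is read from the array instead of a separate field, the allocator has to be notified before the array reference is cleared.

```java
// Hypothetical, simplified stand-ins; not the actual Tez implementation.
interface SketchCallback {
  void notifyFreed(SketchInput input);
}

class SketchInput {
  private byte[] byteArray;
  private final SketchCallback callback;

  SketchInput(int size, SketchCallback callback) {
    this.byteArray = new byte[size];
    this.callback = callback;
  }

  // No separate size field: the size is read from the array itself.
  long getSize() {
    return byteArray.length;
  }

  void free() {
    // Notify first, while getSize() can still read byteArray.length ...
    callback.notifyFreed(this);
    // ... and only then drop the reference so the buffer can be collected.
    byteArray = null;
  }
}

class SketchAllocator implements SketchCallback {
  long usedMemory;

  SketchInput allocate(int size) {
    usedMemory += size;
    return new SketchInput(size, this);
  }

  @Override
  public void notifyFreed(SketchInput input) {
    // Would throw NullPointerException if byteArray had already been nulled.
    usedMemory -= input.getSize();
  }
}
```

Swapping the two statements in free() makes getSize() dereference a null array inside the callback, which is the failure mode described above.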



[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-08 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118859#comment-16118859
 ] 

TezQA commented on TEZ-3813:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12880868/TEZ-3813.004.patch
  against master revision 8dcf8a1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.client.TestTezClient

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2604//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2604//console

This message is automatically generated.



[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-07 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117457#comment-16117457
 ] 

TezQA commented on TEZ-3813:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12880711/TEZ-3813.003.patch
  against master revision 8dcf8a1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2603//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2603//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2603//console

This message is automatically generated.



[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-07 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117354#comment-16117354
 ] 

TezQA commented on TEZ-3813:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12880696/TEZ-3813.002.patch
  against master revision 8dcf8a1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2602//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2602//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2602//console

This message is automatically generated.



[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-04 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114738#comment-16114738
 ] 

Jonathan Eagles commented on TEZ-3813:
--

[~samirkhan], can you fix the findbugs warning? I'm not sure whether the 
exception is already present in the findbugs exception file or missing from 
it. There is some extra code, since getOutputStream is only used for Type.DISK 
and never for Type.MEMORY, but it does no harm. If you have time you can 
refactor that, but it is fine the way it is.




[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-03 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113843#comment-16113843
 ] 

TezQA commented on TEZ-3813:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12880300/TEZ-3813.001.patch
  against master revision 614937c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2599//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2599//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2599//console

This message is automatically generated.


[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-03 Thread Muhammad Samir Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113454#comment-16113454
 ] 

Muhammad Samir Khan commented on TEZ-3813:
--

Tested with filterLinesByWord and compared output before and after.


[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs

2017-08-03 Thread Muhammad Samir Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113451#comment-16113451
 ] 

Muhammad Samir Khan commented on TEZ-3813:
--

*JOL Dump:*
+Before:+
Internals:
{code}
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Instantiated the sample instance via public 
org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput(long,long,org.apache.tez.runtime.library.common.InputAttemptIdentifier,org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback)

org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput object 
internals:
 OFFSET  SIZE  TYPE                                                                DESCRIPTION                          VALUE
      0     4                                                                      (object header)                      01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4     4                                                                      (object header)                      00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8     4                                                                      (object header)                      7a 12 01 f8 (01111010 00010010 00000001 11111000) (-134147462)
     12     4  int                                                                 FetchedInput.id                      0
     16     8  long                                                                FetchedInput.actualSize              0
     24     8  long                                                                FetchedInput.compressedSize          0
     32     4  org.apache.tez.runtime.library.common.InputAttemptIdentifier        FetchedInput.inputAttemptIdentifier  null
     36     4  org.apache.tez.runtime.library.common.shuffle.FetchedInput.Type     FetchedInput.type                    (object)
     40     4  org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback  FetchedInput.callback                null
     44     4  org.apache.tez.runtime.library.common.shuffle.FetchedInput.State    FetchedInput.state                   (object)
     48     4  org.apache.hadoop.io.BoundedByteArrayOutputStream                   MemoryFetchedInput.byteStream        (object)
     52     4                                                                      (loss due to the next object alignment)
Instance size: 56 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
{code}
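
The instance size in the dump above follows from simple layout arithmetic, sketched here as a check. The helper class and its method names are illustrative assumptions, not part of JOL or Tez. The field counts and sizes are taken from the dump: a 12-byte header, one int, two longs, and five 4-byte compressed references.

```java
// Back-of-the-envelope layout arithmetic; the helper is illustrative only.
class LayoutSketch {
  // "Objects are 8 bytes aligned" in the JOL output.
  static final int ALIGNMENT = 8;

  // Raw size: header plus fields (int = 4, long = 8, compressed ref = 4).
  static long rawSize(int header, int ints, int longs, int refs) {
    return header + 4L * ints + 8L * longs + 4L * refs;
  }

  // Round a raw size up to the next multiple of the alignment.
  static long align(long rawSize) {
    return (rawSize + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
  }
}
```

With the numbers from the dump, rawSize(12, 1, 2, 5) is 52 and align(52) is 56, which matches the reported instance size and the 4 bytes of external loss. As plain arithmetic, an object with one fewer 4-byte field would need 48 bytes and lose nothing to alignment.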

Footprint:
{code}
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Instantiated the sample instance via public 
org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput(long,long,org.apache.tez.runtime.library.common.InputAttemptIdentifier,org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback)

org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput@215be6bbd 
footprint:
 COUNT   AVG   SUM   DESCRIPTION
     1    16    16   [B
     2    32    64   [C
     2    24    48   java.lang.String
     1    32    32   org.apache.hadoop.io.BoundedByteArrayOutputStream
     1    24    24   org.apache.tez.runtime.library.common.shuffle.FetchedInput$State
     1    24    24   org.apache.tez.runtime.library.common.shuffle.FetchedInput$Type
     1    56    56   org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput
     9         264   (total)
{code}

+After:+
Internals:
{code}
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Instantiated the sample instance via public 
org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput(long,long,org.apache.tez.runtime.library.common.InputAttemptIdentifier,org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback)

org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput object 
internals:
 OFFSET  SIZE  TYPE  DESCRIPTION      VALUE
      0     4        (object header)  01 00 00 00 (00000001 00000000 00000000 00000000) (1)