[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120715#comment-16120715 ] TezQA commented on TEZ-3813: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12881081/TEZ-3813.006.patch against master revision 8dcf8a1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.dag.app.rm.TestTaskScheduler Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2609//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2609//console This message is automatically generated. > Reduce Object size of MemoryFetchedInput for large jobs > --- > > Key: TEZ-3813 > URL: https://issues.apache.org/jira/browse/TEZ-3813 > Project: Apache Tez > Issue Type: Bug >Reporter: Muhammad Samir Khan >Assignee: Muhammad Samir Khan > Attachments: TEZ-3813.001.patch, TEZ-3813.002.patch, > TEZ-3813.003.patch, TEZ-3813.004.patch, TEZ-3813.005.patch, TEZ-3813.006.patch > > > Same as TEZ-3752 for the unordered case. MemoryFetchedInput has a > BoundedByteArrayOutputStream that is not used (only the underlying byte[] is > used). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Failed: TEZ-3159 PreCommit Build #2608
Jira: https://issues.apache.org/jira/browse/TEZ-3159 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2608/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 111.79 KB...] Running tests /home/jenkins/tools/maven/latest/bin/mvn clean install -fn -DTezPatchProcess cat: /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build@2/../patchprocess/testrun.txt: No such file or directory awk: cannot open /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build@2/../patchprocess/testrun.txt (No such file or directory) {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12881071/TEZ-3159.002.patch against master revision 8dcf8a1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2608//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2608//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. e62c680fd4763289a8719ca5fdec3d5345a170eb logged out == == Finished build. == == Archiving artifacts ERROR: No artifacts found that match the file pattern "patchprocess/*.*". Configuration error? ERROR: 'patchprocess/*.*' doesn't match anything, but '*.*' does. Perhaps that's what you mean?
Build step 'Archive the artifacts' changed build result to FAILURE [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Updated] (TEZ-3816) Ability to automatically speculate single-task vertices
[ https://issues.apache.org/jira/browse/TEZ-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Muhammad Samir Khan updated TEZ-3816: - Attachment: TEZ-3816.001.patch Added the ability for LegacyTaskRuntimeEstimator to speculate based on a timeout for single-task vertices. > Ability to automatically speculate single-task vertices > --- > > Key: TEZ-3816 > URL: https://issues.apache.org/jira/browse/TEZ-3816 > Project: Apache Tez > Issue Type: Improvement >Reporter: Muhammad Samir Khan >Assignee: Muhammad Samir Khan > Attachments: TEZ-3816.001.patch > > > When a single-task vertex is unlucky, it lands on a very slow node. > Speculation doesn't currently apply when there are no other tasks to compare > with. It would be good to have a configurable timeout after which the > tasks automatically speculate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (TEZ-3816) Ability to automatically speculate single-task vertices
Muhammad Samir Khan created TEZ-3816: Summary: Ability to automatically speculate single-task vertices Key: TEZ-3816 URL: https://issues.apache.org/jira/browse/TEZ-3816 Project: Apache Tez Issue Type: Improvement Reporter: Muhammad Samir Khan Assignee: Muhammad Samir Khan When a single-task vertex is unlucky, it lands on a very slow node. Speculation doesn't currently apply when there are no other tasks to compare with. It would be good to have a configurable timeout after which the tasks automatically speculate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
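The timeout rule proposed in TEZ-3816 could be sketched roughly as follows. This is only an illustration of the idea, not the code in TEZ-3816.001.patch: the class name, the method, its parameters, and the convention that a non-positive timeout disables the feature are all assumptions made for this example.

```java
/**
 * Hypothetical sketch of timeout-based speculation for single-task vertices.
 * None of these names come from Tez; they are invented for illustration.
 */
final class SingleTaskSpeculationSketch {

    private SingleTaskSpeculationSketch() { }

    /**
     * Decides whether the only task of a vertex should get a speculative attempt.
     *
     * @param tasksInVertex     number of tasks in the vertex
     * @param attemptStartMs    start time of the running attempt, in milliseconds
     * @param nowMs             current time, in milliseconds
     * @param timeoutMs         assumed configurable timeout; values <= 0 disable the rule
     * @param alreadySpeculated true if a speculative attempt was already launched
     */
    static boolean shouldSpeculate(int tasksInVertex, long attemptStartMs, long nowMs,
                                   long timeoutMs, boolean alreadySpeculated) {
        if (tasksInVertex != 1) {
            return false; // vertices with several tasks can be compared against their peers
        }
        if (alreadySpeculated || timeoutMs <= 0) {
            return false; // nothing to do, or the feature is switched off
        }
        // No peer runtimes exist, so elapsed wall-clock time is the only signal.
        return nowMs - attemptStartMs >= timeoutMs;
    }
}
```

With a timeout of 60000 ms, an attempt that started at t=0 would be speculated at t=60000 but not at t=59999.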
Failed: TEZ-3813 PreCommit Build #2607
Jira: https://issues.apache.org/jira/browse/TEZ-3813 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2607/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 340.44 KB...] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 55:35 min [INFO] Finished at: 2017-08-09T21:04:43Z [INFO] Final Memory: 89M/1406M [INFO] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12881063/TEZ-3813.005.patch against master revision 8dcf8a1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in : org.apache.tez.test.TestTezJobs Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2607//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2607//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 6d519af3e9fe8f9b413f229ac903c532d19835a3 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120652#comment-16120652 ] TezQA commented on TEZ-3813: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12881063/TEZ-3813.005.patch against master revision 8dcf8a1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in : org.apache.tez.test.TestTezJobs Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2607//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2607//console This message is automatically generated. > Reduce Object size of MemoryFetchedInput for large jobs > --- > > Key: TEZ-3813 > URL: https://issues.apache.org/jira/browse/TEZ-3813 > Project: Apache Tez > Issue Type: Bug >Reporter: Muhammad Samir Khan >Assignee: Muhammad Samir Khan > Attachments: TEZ-3813.001.patch, TEZ-3813.002.patch, > TEZ-3813.003.patch, TEZ-3813.004.patch, TEZ-3813.005.patch, TEZ-3813.006.patch > > > Same as TEZ-3752 for the unordered case. MemoryFetchedInput has a > BoundedByteArrayOutputStream that is not used (only the underlying byte[] is > used). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Muhammad Samir Khan updated TEZ-3813: - Attachment: TEZ-3813.006.patch Fixed > Reduce Object size of MemoryFetchedInput for large jobs > --- > > Key: TEZ-3813 > URL: https://issues.apache.org/jira/browse/TEZ-3813 > Project: Apache Tez > Issue Type: Bug >Reporter: Muhammad Samir Khan >Assignee: Muhammad Samir Khan > Attachments: TEZ-3813.001.patch, TEZ-3813.002.patch, > TEZ-3813.003.patch, TEZ-3813.004.patch, TEZ-3813.005.patch, TEZ-3813.006.patch > > > Same as TEZ-3752 for the unordered case. MemoryFetchedInput has a > BoundedByteArrayOutputStream that is not used (only the underlying byte[] is > used). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3159) Reduce memory utilization while serializing keys and values
[ https://issues.apache.org/jira/browse/TEZ-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Muhammad Samir Khan updated TEZ-3159: - Attachment: TEZ-3159.002.patch Fixed findbugs and release audit warnings. > Reduce memory utilization while serializing keys and values > --- > > Key: TEZ-3159 > URL: https://issues.apache.org/jira/browse/TEZ-3159 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Muhammad Samir Khan > Attachments: TEZ-3159.001.patch, TEZ-3159.002.patch > > > Currently DataOutputBuffer is used for serializing. The underlying buffer > keeps doubling in size when it reaches capacity. In some of the Pig scripts > which serialize big bags, we end up with OOM in Tez as there is no space to > double the array size. Mapreduce mode runs fine in those cases with 1G heap. > The scenarios are > - When combiner runs in reducer and some of the fields after combining > are still big bags (For eg: distinct). Currently with mapreduce combiner does > not run in reducer - MAPREDUCE-5221. Since input sort buffers hold good > amount of memory at that time it can easily go OOM. >- While serializing output with bags when there are multiple inputs and > outputs and the sort buffers for those take up space. > It is a pain especially after buffer size hits 128MB. Doubling at 128MB will > require 128MB (existing array) +256MB (new array). Any doubling after that > requires even more space. But most of the time the data is probably not going > to fill up that 256MB leading to wastage. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
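The arithmetic in the TEZ-3159 description (128MB existing array plus 256MB new array) can be checked with a few lines. The helper names below are invented for this note, and the 130MB payload in the usage remark is a made-up figure, not a measurement from the issue.

```java
/** Illustrative arithmetic only; the class and method names are hypothetical. */
final class DoublingCostSketch {

    static final long MB = 1024L * 1024L;

    private DoublingCostSketch() { }

    /** Bytes that must be live at once while a full array is copied into one twice its size. */
    static long peakBytesWhileDoubling(long currentBytes) {
        return currentBytes + 2 * currentBytes; // old array + new array
    }

    /** Bytes allocated but never written if only usedBytes end up in the doubled array. */
    static long unusedBytesAfterDoubling(long currentBytes, long usedBytes) {
        return 2 * currentBytes - usedBytes;
    }
}
```

For a 128MB buffer the peak is 384MB, and if the serialized data stops at, say, 130MB, 126MB of the new 256MB array stays unused.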
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120590#comment-16120590 ] Jonathan Eagles commented on TEZ-3813: -- A couple of minor nits. It seems we can remove this commented-out code {code:title=FetchedInput.java} +// public long getActualSize() { +//return this.actualSize; +// } +// +// public long getCompressedSize() { +//return this.compressedSize; +// } {code} We should add \@Override to this method and to the others that override getSize {code:title=MemoryFetchedInput.java} public long getSize() {code} > Reduce Object size of MemoryFetchedInput for large jobs > --- > > Key: TEZ-3813 > URL: https://issues.apache.org/jira/browse/TEZ-3813 > Project: Apache Tez > Issue Type: Bug >Reporter: Muhammad Samir Khan >Assignee: Muhammad Samir Khan > Attachments: TEZ-3813.001.patch, TEZ-3813.002.patch, > TEZ-3813.003.patch, TEZ-3813.004.patch, TEZ-3813.005.patch > > > Same as TEZ-3752 for the unordered case. MemoryFetchedInput has a > BoundedByteArrayOutputStream that is not used (only the underlying byte[] is > used). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Muhammad Samir Khan updated TEZ-3813: - Attachment: TEZ-3813.005.patch Removed int size from MemoryFetchedInput.

*JOL Dump:*

+After:+

Internals:
{code}
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Instantiated the sample instance via public org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput(long,org.apache.tez.runtime.library.common.InputAttemptIdentifier,org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback)

org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput object internals:
 OFFSET  SIZE  TYPE  DESCRIPTION  VALUE
      0     4        (object header)  01 00 00 00 (0001 ) (1)
      4     4        (object header)  00 00 00 00 ( ) (0)
      8     4        (object header)  7d 12 01 f8 (0101 00010010 0001 1000) (-134147459)
     12     4  int  FetchedInput.id  0
     16     1  byte  FetchedInput.state  0
     17     3        (alignment/padding gap)
     20     4  org.apache.tez.runtime.library.common.InputAttemptIdentifier  FetchedInput.inputAttemptIdentifier  null
     24     4  org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback  FetchedInput.callback  null
     28     4  byte[]  MemoryFetchedInput.byteArray  []
Instance size: 32 bytes
Space losses: 3 bytes internal + 0 bytes external = 3 bytes total
{code}

Footprint:
{code}
# Running 64-bit HotSpot VM.
# Using compressed oop with 3-bit shift.
# Using compressed klass with 3-bit shift.
# Objects are 8 bytes aligned.
# Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
# Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]

Instantiated the sample instance via public org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput(long,org.apache.tez.runtime.library.common.InputAttemptIdentifier,org.apache.tez.runtime.library.common.shuffle.FetchedInputCallback)

org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput@215be6bbd footprint:
 COUNT  AVG  SUM  DESCRIPTION
     1   16   16  [B
     1   32   32  org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput
     2        48  (total)
{code}

> Reduce Object size of MemoryFetchedInput for large jobs > --- > > Key: TEZ-3813 > URL: https://issues.apache.org/jira/browse/TEZ-3813 > Project: Apache Tez > Issue Type: Bug >Reporter: Muhammad Samir Khan >Assignee: Muhammad Samir Khan > Attachments: TEZ-3813.001.patch, TEZ-3813.002.patch, > TEZ-3813.003.patch, TEZ-3813.004.patch, TEZ-3813.005.patch > > > Same as TEZ-3752 for the unordered case. MemoryFetchedInput has a > BoundedByteArrayOutputStream that is not used (only the underlying byte[] is > used). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
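The effect of dropping the int field can be worked through with the 8-byte alignment rule the dump reports. In the "After" layout the last field ends at offset 28 + 4 = 32, which is already a multiple of 8. The "before" figure in this sketch is an assumption (one extra 4-byte int appended after byteArray); no "before" dump appears in this email, and the helper name is invented.

```java
/** Hypothetical helper illustrating 8-byte object alignment; not part of JOL or Tez. */
final class ObjectAlignmentSketch {

    private ObjectAlignmentSketch() { }

    /** Rounds bytes up to the next multiple of alignment (alignment must be positive). */
    static long alignUp(long bytes, long alignment) {
        return (bytes + alignment - 1) / alignment * alignment;
    }
}
```

Under that assumption, `alignUp(32 + 4, 8)` gives 40 for the old layout and `alignUp(32, 8)` gives 32 for the new one, so removing a 4-byte field would save a full 8 bytes per instance.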
[jira] [Created] (TEZ-3815) allow plugins to be aware of each other
Sergey Shelukhin created TEZ-3815: - Summary: allow plugins to be aware of each other Key: TEZ-3815 URL: https://issues.apache.org/jira/browse/TEZ-3815 Project: Apache Tez Issue Type: Bug Reporter: Sergey Shelukhin Given that many sets of plugins (e.g. LLAP) come as a package deal and do not work without each other, it would make sense for them to be aware of each other. Not sure yet of the best way for this to work, we probably don't want too much complexity, dependency systems, etc. Perhaps after all the plugins are initialized fully, an optional-to-implement call could be made to each of them passing a map from plugin type (communicator, scheduler, etc.) to the instance that is going to be used for this DAG. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
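One way to read the suggestion above is an optional callback with a no-op default, invoked once every plugin has finished initializing. The sketch below is a guess at that shape and nothing more: the enum, the interface, the method name, and the demo classes are all hypothetical and do not correspond to existing Tez plugin APIs.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical plugin categories; the issue only names "communicator, scheduler, etc." */
enum SketchPluginType { TASK_SCHEDULER, CONTAINER_LAUNCHER, TASK_COMMUNICATOR }

/** Hypothetical plugin contract with an optional-to-implement awareness hook. */
interface SketchPlugin {
    /** Called after all plugins are fully initialized; the default ignores the peers. */
    default void onPluginsInitialized(Map<SketchPluginType, SketchPlugin> peers) { }
}

/** A plugin that opts in and remembers the communicator it will be paired with. */
final class SketchScheduler implements SketchPlugin {
    SketchPlugin communicator;

    @Override
    public void onPluginsInitialized(Map<SketchPluginType, SketchPlugin> peers) {
        communicator = peers.get(SketchPluginType.TASK_COMMUNICATOR);
    }
}

/** A plugin that does not care about its peers and inherits the no-op default. */
final class SketchCommunicator implements SketchPlugin { }

final class SketchPluginWiring {
    private SketchPluginWiring() { }

    /** Hands every plugin a read-only view of the instances chosen for this DAG. */
    static void announce(Map<SketchPluginType, SketchPlugin> plugins) {
        Map<SketchPluginType, SketchPlugin> view = Collections.unmodifiableMap(plugins);
        for (SketchPlugin plugin : plugins.values()) {
            plugin.onPluginsInitialized(view);
        }
    }
}
```

A default method keeps existing plugins source-compatible, which matches the stated wish to avoid dependency systems and extra complexity.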
[jira] [Commented] (TEZ-3815) allow plugins to be aware of each other
[ https://issues.apache.org/jira/browse/TEZ-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120487#comment-16120487 ] Sergey Shelukhin commented on TEZ-3815: --- cc [~sseth] > allow plugins to be aware of each other > --- > > Key: TEZ-3815 > URL: https://issues.apache.org/jira/browse/TEZ-3815 > Project: Apache Tez > Issue Type: Bug >Reporter: Sergey Shelukhin > > Given that many sets of plugins (e.g. LLAP) come as a package deal and do not > work without each other, it would make sense for them to be aware of each > other. > Not sure yet of the best way for this to work, we probably don't want too > much complexity, dependency systems, etc. Perhaps after all the plugins are > initialized fully, an optional-to-implement call could be made to each of > them passing a map from plugin type (communicator, scheduler, etc.) to the > instance that is going to be used for this DAG. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3159) Reduce memory utilization while serializing keys and values
[ https://issues.apache.org/jira/browse/TEZ-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120484#comment-16120484 ] TezQA commented on TEZ-3159: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12881049/TEZ-3159.001.patch against master revision 8dcf8a1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 3.0.1) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//artifact/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//artifact/patchprocess/newPatchFindbugsWarningstez-common.html Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//console This message is automatically generated. > Reduce memory utilization while serializing keys and values > --- > > Key: TEZ-3159 > URL: https://issues.apache.org/jira/browse/TEZ-3159 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Muhammad Samir Khan > Attachments: TEZ-3159.001.patch > > > Currently DataOutputBuffer is used for serializing. 
The underlying buffer > keeps doubling in size when it reaches capacity. In some of the Pig scripts > which serialize big bags, we end up with OOM in Tez as there is no space to > double the array size. Mapreduce mode runs fine in those cases with 1G heap. > The scenarios are > - When combiner runs in reducer and some of the fields after combining > are still big bags (For eg: distinct). Currently with mapreduce combiner does > not run in reducer - MAPREDUCE-5221. Since input sort buffers hold good > amount of memory at that time it can easily go OOM. >- While serializing output with bags when there are multiple inputs and > outputs and the sort buffers for those take up space. > It is a pain especially after buffer size hits 128MB. Doubling at 128MB will > require 128MB (existing array) +256MB (new array). Any doubling after that > requires even more space. But most of the time the data is probably not going > to fill up that 256MB leading to wastage. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Failed: TEZ-3159 PreCommit Build #2606
Jira: https://issues.apache.org/jira/browse/TEZ-3159 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2606/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 340.31 KB...] [INFO] Finished at: 2017-08-09T19:11:26Z [INFO] Final Memory: 81M/1454M [INFO] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12881049/TEZ-3159.001.patch against master revision 8dcf8a1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 findbugs{color}. The patch appears to introduce 4 new Findbugs (version 3.0.1) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//artifact/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//artifact/patchprocess/newPatchFindbugsWarningstez-common.html Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2606//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. c3e48adbf4a6411b3b063fc0171b7b39fa429449 logged out == == Finished build. 
== == Build step 'Execute shell' marked build as failure Archiving artifacts [Fast Archiver] Compressed 3.31 MB of artifacts by 12.3% relative to #2605 [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Updated] (TEZ-3159) Reduce memory utilization while serializing keys and values
[ https://issues.apache.org/jira/browse/TEZ-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Muhammad Samir Khan updated TEZ-3159: - Attachment: TEZ-3159.001.patch Added a new class that grows buffers via "doubling" until it hits a threshold and then adds a new buffer to a list. Also modified IFile.Writer to work with it. > Reduce memory utilization while serializing keys and values > --- > > Key: TEZ-3159 > URL: https://issues.apache.org/jira/browse/TEZ-3159 > Project: Apache Tez > Issue Type: Improvement >Reporter: Rohini Palaniswamy >Assignee: Muhammad Samir Khan > Attachments: TEZ-3159.001.patch > > > Currently DataOutputBuffer is used for serializing. The underlying buffer > keeps doubling in size when it reaches capacity. In some of the Pig scripts > which serialize big bags, we end up with OOM in Tez as there is no space to > double the array size. Mapreduce mode runs fine in those cases with 1G heap. > The scenarios are > - When combiner runs in reducer and some of the fields after combining > are still big bags (For eg: distinct). Currently with mapreduce combiner does > not run in reducer - MAPREDUCE-5221. Since input sort buffers hold good > amount of memory at that time it can easily go OOM. >- While serializing output with bags when there are multiple inputs and > outputs and the sort buffers for those take up space. > It is a pain especially after buffer size hits 128MB. Doubling at 128MB will > require 128MB (existing array) +256MB (new array). Any doubling after that > requires even more space. But most of the time the data is probably not going > to fill up that 256MB leading to wastage. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
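The growth strategy described for TEZ-3159.001.patch (double until a threshold, then append further buffers to a list) might look roughly like the sketch below. It is written from that one-sentence description only; the class name, constructor parameters, and methods are assumptions and are not taken from the patch or from IFile.Writer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a buffer that doubles up to a cap, then chains fixed-size chunks. */
final class ChunkedByteBufferSketch {
    private final int maxChunkSize;                        // threshold where doubling stops
    private final List<byte[]> fullChunks = new ArrayList<>();
    private byte[] current;                                // chunk currently being filled
    private int pos;                                       // bytes used in current

    ChunkedByteBufferSketch(int initialSize, int maxChunkSize) {
        if (initialSize <= 0 || maxChunkSize < initialSize) {
            throw new IllegalArgumentException("need 0 < initialSize <= maxChunkSize");
        }
        this.maxChunkSize = maxChunkSize;
        this.current = new byte[initialSize];
    }

    void write(byte[] src, int off, int len) {
        while (len > 0) {
            if (pos == current.length) {
                grow();
            }
            int n = Math.min(len, current.length - pos);
            System.arraycopy(src, off, current, pos, n);
            pos += n;
            off += n;
            len -= n;
        }
    }

    private void grow() {
        if (current.length < maxChunkSize) {
            // Below the threshold: double (capped at the threshold) and copy the old bytes.
            int newSize = (int) Math.min(2L * current.length, maxChunkSize);
            byte[] bigger = new byte[newSize];
            System.arraycopy(current, 0, bigger, 0, pos);
            current = bigger;
        } else {
            // At the threshold: keep the full chunk and start a fresh one, with no copy.
            fullChunks.add(current);
            current = new byte[maxChunkSize];
            pos = 0;
        }
    }

    /** Total number of bytes written so far. */
    long size() {
        return (long) fullChunks.size() * maxChunkSize + pos;
    }

    /** Number of chunks in use, counting the one being filled. */
    int chunkCount() {
        return fullChunks.size() + 1;
    }

    /** Copies the contents into one array; intended for small buffers and tests. */
    byte[] toByteArray() {
        byte[] out = new byte[Math.toIntExact(size())];
        int offset = 0;
        for (byte[] chunk : fullChunks) {
            System.arraycopy(chunk, 0, out, offset, chunk.length);
            offset += chunk.length;
        }
        System.arraycopy(current, 0, out, offset, pos);
        return out;
    }
}
```

Once the cap is reached, each growth step allocates only one more chunk of `maxChunkSize` bytes instead of an array twice the size of everything written so far, which is the memory saving the issue is after.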
[jira] [Commented] (TEZ-3813) Reduce Object size of MemoryFetchedInput for large jobs
[ https://issues.apache.org/jira/browse/TEZ-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120166#comment-16120166 ] Jonathan Eagles commented on TEZ-3813: -- [~samirkhan], can we try removing the MemoryFetchedInput#size member? That would bring this object down by one more 8-byte alignment boundary. We will have to avoid the null pointer exception in SimpleFetchedInputAllocator#cleanup, perhaps by just moving byteArray = null; below the notifyFreedResource call. > Reduce Object size of MemoryFetchedInput for large jobs > --- > > Key: TEZ-3813 > URL: https://issues.apache.org/jira/browse/TEZ-3813 > Project: Apache Tez > Issue Type: Bug >Reporter: Muhammad Samir Khan >Assignee: Muhammad Samir Khan > Attachments: TEZ-3813.001.patch, TEZ-3813.002.patch, > TEZ-3813.003.patch, TEZ-3813.004.patch > > > Same as TEZ-3752 for the unordered case. MemoryFetchedInput has a > BoundedByteArrayOutputStream that is not used (only the underlying byte[] is > used). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
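The ordering concern raised in this comment can be shown with a stripped-down model. If the size is derived from the array rather than stored in its own field, the free notification has to run before the reference is cleared. All types below are simplified stand-ins invented for this note; they are not the real FetchedInput, MemoryFetchedInput, or SimpleFetchedInputAllocator classes.

```java
/** Hypothetical stand-in for the allocator callback that is told how many bytes were freed. */
interface SketchFreeCallback {
    void freed(long bytes);
}

/** Hypothetical stand-in for the abstract base class. */
abstract class SketchFetchedInput {
    abstract long getSize();
}

/** Simplified model: no separate size field, the array length is the size. */
final class SketchMemoryFetchedInput extends SketchFetchedInput {
    private byte[] byteArray;
    private final SketchFreeCallback callback;

    SketchMemoryFetchedInput(int size, SketchFreeCallback callback) {
        this.byteArray = new byte[size];
        this.callback = callback;
    }

    @Override
    long getSize() {
        return byteArray.length; // would throw NullPointerException after free()
    }

    void free() {
        // Notify first, while getSize() can still read the array length...
        callback.freed(getSize());
        // ...and only then drop the reference so the array can be collected.
        byteArray = null;
    }
}
```

Swapping the two statements in `free()` would make the callback's size lookup fail, which is the null pointer exception the comment warns about.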
[jira] [Commented] (TEZ-3617) TestHistoryParser#testParserWithSuccessfulJob fails intermittently
[ https://issues.apache.org/jira/browse/TEZ-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119905#comment-16119905 ] TezQA commented on TEZ-3617: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12869290/TEZ-3617.1.patch against master revision 8dcf8a1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2605//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2605//console This message is automatically generated. > TestHistoryParser#testParserWithSuccessfulJob fails intermittently > -- > > Key: TEZ-3617 > URL: https://issues.apache.org/jira/browse/TEZ-3617 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.0 > Environment: Ubuntu 14.04 >Reporter: Sonia Garudi >Assignee: Jonathan Eagles > Labels: ppc64le, x86 > Attachments: org.apache.tez.history.TestHistoryParser-output.txt, > TEZ-3617.1.patch > > > The TestHistoryParser#testParserWithSuccessfulJob test fails intermittently > in tez-history-parser project. > Error message : > testParserWithSuccessfulJob(org.apache.tez.history.TestHistoryParser) Time > elapsed: 29.952 sec <<< FAILURE! 
> java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.tez.history.TestHistoryParser.verifyJobSpecificInfo(TestHistoryParser.java:266) > at > org.apache.tez.history.TestHistoryParser.testParserWithSuccessfulJob(TestHistoryParser.java:212) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
Success: TEZ-3617 PreCommit Build #2605
Jira: https://issues.apache.org/jira/browse/TEZ-3617 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2605/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 339.20 KB...] [INFO] Tez SUCCESS [ 0.034 s] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 54:36 min [INFO] Finished at: 2017-08-09T13:34:12Z [INFO] Final Memory: 92M/1322M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12869290/TEZ-3617.1.patch against master revision 8dcf8a1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2605//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2605//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. befc9f17f215a29beff1d0bf73bec34556ff3952 logged out == == Finished build. == == Archiving artifacts [description-setter] Description set: TEZ-3617 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-3617) TestHistoryParser#testParserWithSuccessfulJob fails intermittently
[ https://issues.apache.org/jira/browse/TEZ-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119784#comment-16119784 ] Sneha Kanekar commented on TEZ-3617: [~aplusplus] any update on this? > TestHistoryParser#testParserWithSuccessfulJob fails intermittently > -- > > Key: TEZ-3617 > URL: https://issues.apache.org/jira/browse/TEZ-3617 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.9.0 > Environment: Ubuntu 14.04 >Reporter: Sonia Garudi >Assignee: Jonathan Eagles > Labels: ppc64le, x86 > Attachments: org.apache.tez.history.TestHistoryParser-output.txt, > TEZ-3617.1.patch > > > The TestHistoryParser#testParserWithSuccessfulJob test fails intermittently > in tez-history-parser project. > Error message : > testParserWithSuccessfulJob(org.apache.tez.history.TestHistoryParser) Time > elapsed: 29.952 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.tez.history.TestHistoryParser.verifyJobSpecificInfo(TestHistoryParser.java:266) > at > org.apache.tez.history.TestHistoryParser.testParserWithSuccessfulJob(TestHistoryParser.java:212) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3431) Add unit tests for container release
[ https://issues.apache.org/jira/browse/TEZ-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119516#comment-16119516 ] Taklon Stephen Wu commented on TEZ-3431: [~ssreenivasan] can you assign this to me? Otherwise, may I clone this issue? > Add unit tests for container release > > > Key: TEZ-3431 > URL: https://issues.apache.org/jira/browse/TEZ-3431 > Project: Apache Tez > Issue Type: Bug >Reporter: Sushmitha Sreenivasan > Labels: newbie > > * Add unit tests to verify that the scheduler releases a container after its expiry > time (HeldContainer.containerExpiryTime). -- This message was sent by Atlassian JIRA (v6.4.14#64029)