[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176511#comment-15176511 ]

TezQA commented on TEZ-3115:
----------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12791002/TEZ-3115.4.patch
  against master revision ac0fd8b.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. There were no new javadoc warning messages.
    +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests: org.apache.tez.test.TestFaultTolerance

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1536//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1536//console

This message is automatically generated.

> Shuffle string handling adds significant memory overhead
> --------------------------------------------------------
>
>                 Key: TEZ-3115
>                 URL: https://issues.apache.org/jira/browse/TEZ-3115
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>            Assignee: Jonathan Eagles
>             Fix For: 0.7.1, 0.8.3
>
>         Attachments: TEZ-3115.1.patch, TEZ-3115.2.patch, TEZ-3115.3-branch-0.7.patch, TEZ-3115.3.patch, TEZ-3115.4-branch-0.7.patch, TEZ-3115.4.patch
>
> While investigating the OOM heap dump from TEZ-3114 I noticed that the
> ShuffleManager and other shuffle-related objects were holding onto many
> strings that added up to over a hundred megabytes of memory.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176472#comment-15176472 ]

TezQA commented on TEZ-3115:
----------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12791002/TEZ-3115.4.patch
  against master revision ac0fd8b.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. There were no new javadoc warning messages.
    +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests.

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1535//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1535//console

This message is automatically generated.
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176305#comment-15176305 ]

Siddharth Seth commented on TEZ-3115:
-------------------------------------

+1. Thanks, [~jeagles]. I wasn't aware of the improvements to interning in Java 7 etc. I suppose either can be used in that case.
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174281#comment-15174281 ]

Siddharth Seth commented on TEZ-3115:
-------------------------------------

I think the interning needs to be done via StringInterner.weakIntern()? The rest looks good.

Minor: could you please add a toString method to the new classes - HostPort, PathPartition, HostPortPartition?
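For readers unfamiliar with weak interning, here is a minimal sketch of the idea behind the suggestion above. It is not Hadoop's StringInterner implementation: SimpleWeakInterner and its intern method are hypothetical names, and the sketch relies only on java.util.WeakHashMap. The point is that equal strings collapse to one canonical instance, while the pool itself does not keep that instance alive once nothing else refers to it.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical illustration of a weak interner; not the Hadoop class.
final class SimpleWeakInterner {
    // Keys are held weakly by WeakHashMap. The value must also be a
    // WeakReference, otherwise it would strongly pin its own key.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    // Returns the canonical instance equal to s, registering s if none exists.
    synchronized String intern(String s) {
        if (s == null) {
            return null;
        }
        WeakReference<String> ref = pool.get(s);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            pool.put(s, new WeakReference<>(s));
            canonical = s;
        }
        return canonical;
    }
}
```

Unlike a strong pool, entries in this sketch become collectable once the last outside reference to the canonical string goes away, which matters for long-running processes that see many short-lived identifiers.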
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172321#comment-15172321 ]

TezQA commented on TEZ-3115:
----------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12790511/TEZ-3115.3.patch
  against master revision 18398c8.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. There were no new javadoc warning messages.
    +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests: org.apache.tez.dag.app.rm.TestContainerReuse

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1531//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1531//console

This message is automatically generated.
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171663#comment-15171663 ]

Rajesh Balamohan commented on TEZ-3115:
---------------------------------------

Minor:
- Should FetcherOrderedGrouped call ShuffleUtils.constructBaseURIForShuffleHandler directly instead of going through another level of indirection (to avoid the additional string created for host + ":" + port)?
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169463#comment-15169463 ]

Jonathan Eagles commented on TEZ-3115:
--------------------------------------

[~sseth], can you review this patch? The findbugs warnings are due to TEZ-1911 and TEZ-3077. The javac warning is expected for this scenario.
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169414#comment-15169414 ]

TezQA commented on TEZ-3115:
----------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12790149/TEZ-3115.2.patch
  against master revision 923f7b4.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 1 new or modified test files.
    -1 javac. The applied patch generated 31 javac compiler warnings (more than the master's current 30 warnings).
    +1 javadoc. There were no new javadoc warning messages.
    -1 findbugs. The patch appears to introduce 2 new Findbugs (version 3.0.1) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests.

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1520//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1520//artifact/patchprocess/newPatchFindbugsWarningstez-api.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1520//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1520//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1520//console

This message is automatically generated.
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169266#comment-15169266 ]

Jonathan Eagles commented on TEZ-3115:
--------------------------------------

Patch 2 summary.
- Host and attempt are now the fundamental storage types. Created several subtypes that allow us to intern the host and path component immediately after processing the DataMovementEvent. This lets us reduce to one copy not only the exact strings, but also the string derivatives (host -> host, host-port, host-port-partition), (path component -> path component, path component-partition).

There are a few non-string-handling scenarios that still need improvement (extremely large auto-reduce parallelism and a large number of empty partitions). Filed TEZ-3144 and TEZ-3145 to address those scenarios.
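To make the "string derivatives" point concrete, here is a rough sketch of how typed keys can replace derived strings. It is not the code from the patch: HostPortKey and HostPortPartitionKey are hypothetical names, and their fields are assumptions. Each key keeps a reference to one interned host string plus integers, so host-port and host-port-partition no longer need their own concatenated copies.

```java
// Hypothetical sketch; class names and fields are assumptions, not the patch.
final class HostPortKey {
    final String host; // interned once, shared by every key for this host
    final int port;

    HostPortKey(String host, int port) {
        this.host = host.intern();
        this.port = port;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof HostPortKey)) return false;
        HostPortKey other = (HostPortKey) o;
        return port == other.port && host.equals(other.host);
    }

    @Override
    public int hashCode() {
        return 31 * host.hashCode() + port;
    }

    // The string form is built only when requested, e.g. for logging.
    @Override
    public String toString() {
        return host + ":" + port;
    }
}

final class HostPortPartitionKey {
    final HostPortKey hostPort;
    final int partition;

    HostPortPartitionKey(HostPortKey hostPort, int partition) {
        this.hostPort = hostPort;
        this.partition = partition;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof HostPortPartitionKey)) return false;
        HostPortPartitionKey other = (HostPortPartitionKey) o;
        return partition == other.partition && hostPort.equals(other.hostPort);
    }

    @Override
    public int hashCode() {
        return 31 * hostPort.hashCode() + partition;
    }

    @Override
    public String toString() {
        return hostPort + ":" + partition;
    }
}
```

Because both classes define equals and hashCode, they can be used directly as map keys in place of the concatenated strings, and many partition keys can share a single HostPortKey instance.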
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149130#comment-15149130 ]

Siddharth Seth commented on TEZ-3115:
-------------------------------------

Some strings which could be interned:
- Intern pathComponent (ShuffleInputEventHandler, InputAttemptIdentifier, MapHost, etc.)
- Intern hostIdentifier in MapHost (and wherever else it is created). Can potentially avoid storing this - however, it doesn't seem like there's an explosion of strings here, since it's just host:port
- Intern the hostname

The biggest offender will, however, continue to be host:port:partition when reduce parallelism kicks in. That should not be linked to the host in any way - however, I think that change should be in a separate jira, since it affects functionality quite a bit.

Another side effect of using the partitionId to identify the host is that we can end up with multiple parallel fetches from the same host, which is otherwise explicitly avoided in the Ordered Shuffle. That could be leading to overloaded nodes as well.
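As a small illustration of why interning helps with the strings listed above: values parsed out of separate events are equal but are separate objects, each with its own storage. The sketch below uses only String.intern() from the JDK; InternDemo and parseHost are hypothetical names, and the "host:port" input format is an assumption for the example.

```java
// Illustration only: InternDemo and parseHost are hypothetical, not Tez code.
final class InternDemo {
    // Extracts the host part of "host:port". substring() allocates a new
    // String on each call here, so equal hosts end up as distinct objects.
    static String parseHost(String hostPort) {
        return hostPort.substring(0, hostPort.indexOf(':'));
    }

    public static void main(String[] args) {
        String a = parseHost("node1.example.com:13562");
        String b = parseHost("node1.example.com:13562");
        System.out.println(a == b);                   // false: two separate copies
        System.out.println(a.intern() == b.intern()); // true: one shared copy
    }
}
```

If only the interned reference is stored, the per-event copies become garbage immediately, so the retained cost is one string per distinct host rather than one per event.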
[jira] [Commented] (TEZ-3115) Shuffle string handling adds significant memory overhead
[ https://issues.apache.org/jira/browse/TEZ-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142893#comment-15142893 ]

Jason Lowe commented on TEZ-3115:
---------------------------------

When auto-parallelism kicks in we're going to see many copies of the same upstream task attempt IDs, host:port, etc. We should at least consider interning or otherwise sharing these, or potentially just storing the raw ID and generating the string when necessary on-the-fly.

MapHost is another example of many redundancies, since it stores the fully qualified host name and port at least three times (as part of baseUrl, identifier, and hostIdentifier). I wonder if it would be better overall to have MapHost be more efficiently stored and generate the URLs and identifiers on-demand.
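The on-demand idea in the comment above could look roughly like the following. This is only a sketch under stated assumptions, not the actual MapHost class: CompactMapHost, its fields, and the URL layout in baseUrl are all hypothetical. The object stores the host name once and the port and partition as ints, and builds the derived strings when a caller asks for them.

```java
// Hypothetical sketch; not the Tez MapHost class or its real URL format.
final class CompactMapHost {
    private final String host;   // stored once
    private final int port;      // kept as an int, not embedded in three strings
    private final int partition;

    CompactMapHost(String host, int port, int partition) {
        this.host = host;
        this.port = port;
        this.partition = partition;
    }

    // Derived on demand instead of being held as a field.
    String hostIdentifier() {
        return host + ":" + port;
    }

    // Derived on demand; the host:port:partition layout follows the thread.
    String identifier() {
        return host + ":" + port + ":" + partition;
    }

    // Derived on demand. The "http://" scheme and the caller-supplied suffix
    // are assumptions made for this example.
    String baseUrl(String pathAndQuery) {
        return "http://" + host + ":" + port + pathAndQuery;
    }
}
```

The trade-off is a short-lived allocation on each call in exchange for a smaller retained footprint per object, which tends to pay off when there are many such objects and the strings are needed rarely.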