[ https://issues.apache.org/jira/browse/YARN-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531471#comment-14531471 ]
Hudson commented on YARN-3491:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #7750 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7750/])
YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter) (rkanter: rev b72507810aece08e17ab4b5aae1f7eae1fe98609)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt


> PublicLocalizer#addResource is too slow.
> ----------------------------------------
>
>                 Key: YARN-3491
>                 URL: https://issues.apache.org/jira/browse/YARN-3491
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>            Priority: Critical
>             Fix For: 2.8.0
>
>         Attachments: YARN-3491.000.patch, YARN-3491.001.patch,
> YARN-3491.002.patch, YARN-3491.003.patch, YARN-3491.004.patch
>
>
> Based on profiling, the bottleneck in PublicLocalizer#addResource is
> getInitializedLocalDirs. getInitializedLocalDirs calls checkLocalDir,
> which is very slow: each call takes 10+ ms.
> The total delay is approximately (number of local dirs) * 10+ ms.
> This delay is added to each public resource localization.
> Because PublicLocalizer#addResource is slow, the thread pool can't be fully
> utilized. Instead of running in parallel (multithreaded), public resource
> localization is serialized most of the time.
> PublicLocalizer#addResource also runs in the Dispatcher thread,
> so it blocks the Dispatcher thread for a long time.
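
To make the arithmetic in the description concrete, here is a rough back-of-the-envelope model in Java. It is only a sketch: the class and method names are hypothetical and do not exist in Hadoop, and the directory count, pool size, and download time are assumed values chosen for illustration. Only the 10 ms figure comes from the report, and it is the lower bound of "10+ ms". The model assumes the Dispatcher thread submits one download per addResource call and that each call pays one checkLocalDir per local dir.

```java
// Hypothetical model of the delay described in YARN-3491; not Hadoop code.
final class LocalizationDelayModel {

    /** Delay added to one addResource call: one checkLocalDir per local dir. */
    static long addResourceDelayMs(int numLocalDirs, long checkLocalDirMs) {
        return numLocalDirs * checkLocalDirMs;
    }

    /** Time the Dispatcher thread spends blocked while submitting a batch of resources. */
    static long dispatcherBlockedMs(int numResources, int numLocalDirs, long checkLocalDirMs) {
        return numResources * addResourceDelayMs(numLocalDirs, checkLocalDirMs);
    }

    /**
     * Average number of busy pool threads when one download is submitted every
     * submitDelayMs and each download runs for downloadMs (arrival rate times
     * service time), capped at the pool size.
     */
    static double averageBusyThreads(int poolSize, long submitDelayMs, long downloadMs) {
        double offered = (double) downloadMs / submitDelayMs;
        return Math.min(poolSize, offered);
    }

    public static void main(String[] args) {
        int numLocalDirs = 12;      // assumed, not from the report
        long checkLocalDirMs = 10;  // lower bound of "10+ ms" from the report
        int poolSize = 4;           // assumed pool size
        long downloadMs = 60;       // assumed time to download one public resource
        int numResources = 100;     // assumed batch size

        long perCall = addResourceDelayMs(numLocalDirs, checkLocalDirMs);
        System.out.println("delay per addResource (ms): " + perCall);
        System.out.println("dispatcher blocked (ms): "
                + dispatcherBlockedMs(numResources, numLocalDirs, checkLocalDirMs));
        System.out.println("average busy threads: "
                + averageBusyThreads(poolSize, perCall, downloadMs));
    }
}
```

With these assumed inputs the model gives 120 ms per addResource call and an average of 0.5 busy threads out of 4, which is consistent with the report's observation that localization ends up serialized whenever a download finishes faster than the next one can be submitted.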