[ https://issues.apache.org/jira/browse/YARN-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949707#comment-16949707 ]
Miklos Szegedi edited comment on YARN-9863 at 10/11/19 6:39 PM:
----------------------------------------------------------------

[~belugabehr], thank you for the patch. Could you explain the motivation for this change a bit more? AFAIK, the order is better decided in HDFS. Also, since you are using a random number, why not use SecureRandom? My third question is whether localization runs in parallel, in which case the order does not matter as much. All in all, my experience with YARN and localization suggests that if HDFS is the bottleneck, you would rather make a suitable replica increase in HDFS, even if it is temporary. HDFS is much better at creating replicas for localization, since it can stream them and avoid bottlenecks. Localization then goes to the local instance, which makes it practically painless.

was (Author: szegedim):
[~belugabehr], could you explain the motivation for this change a bit more? AFAIK, the order is better decided in HDFS. Also, since you are using a random number, why not use SecureRandom? My third question is whether localization runs in parallel, in which case the order does not matter as much. All in all, my experience with YARN and localization suggests that if HDFS is the bottleneck, you would rather make a suitable replica increase in HDFS, even if it is temporary. HDFS is much better at creating replicas for localization, since it can stream them and avoid bottlenecks. Localization then goes to the local instance, which makes it practically painless.

> Randomize List of Resources to Localize
> ---------------------------------------
>
>                 Key: YARN-9863
>                 URL: https://issues.apache.org/jira/browse/YARN-9863
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Minor
>         Attachments: YARN-9863.1.patch, YARN-9863.2.patch
>
>
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/LocalResourceBuilder.java
> Add a new parameter to {{LocalResourceBuilder}} that allows the list of
> resources to be shuffled randomly. This will allow the Localizer to spread
> the load of requests so that not all of the NodeManagers are requesting to
> localize the same files, in the same order, from the same DataNodes.
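
For readers following the thread, the idea in the issue description can be sketched roughly as below. This is only an illustration, not the code in the attached patches: the class name `ShuffledLocalizationOrder`, the method `shuffledCopy`, and the HDFS paths are hypothetical, and resources are modeled as plain strings rather than YARN `LocalResource` objects. It uses only `java.util.Collections.shuffle(List, Random)` from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: each NodeManager shuffles its own copy of the
// resource list so that requests are not issued in the same order everywhere.
public class ShuffledLocalizationOrder {

  // Returns a shuffled copy; the caller's list is left untouched.
  // Any java.util.Random can be supplied, including java.security.SecureRandom,
  // which is a subclass of java.util.Random.
  static List<String> shuffledCopy(List<String> resources, Random random) {
    List<String> copy = new ArrayList<>(resources);
    Collections.shuffle(copy, random);
    return copy;
  }

  public static void main(String[] args) {
    // Hypothetical resource paths, for illustration only.
    List<String> resources = Arrays.asList(
        "hdfs:///apps/job_0001/job.jar",
        "hdfs:///apps/job_0001/job.xml",
        "hdfs:///apps/job_0001/libs/dep-a.jar",
        "hdfs:///apps/job_0001/libs/dep-b.jar");

    // Simulate three NodeManagers, each with its own random source.
    for (int nm = 0; nm < 3; nm++) {
      List<String> order = shuffledCopy(resources, new Random());
      System.out.println("NodeManager " + nm + " -> " + order);
    }
  }
}
```

Because the random source is a parameter, the sketch does not take a side on the `Random` versus `SecureRandom` question raised in the comment. Whether shuffling helps at all still depends on the other points raised there, such as whether localization already runs in parallel.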