[ https://issues.apache.org/jira/browse/MAPREDUCE-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409323#comment-13409323 ]
Junping Du commented on MAPREDUCE-4309: --------------------------------------- Hello Nicholas, Thanks for your carefully review. Please see my comments and questions below: 1. We need to add new element to JobCounter and NodeType at least in pluggin patch (MAPREDUCE-4310). For enum of JobCounter and NodeType, it is hard to extend even in plug-in code as we know. If adding new element to base Enum really hurts, then I am OK with using type-safe enum class (effective java 1st version, http://developer.java.sun.com/developer/Books/effectivejava/Chapter5.pdf) to replace enum if we decide to do it. If we decide to the latter one, it will make patch in-compatible but could be easily extended in future. Which way are we prefer here? 2. I will put some unused code like: NODEGROUP_LOCAL_MAPS, YarnConfiguration.NET_TOPOLOGY_WITH_NODEGROUP to the plug-in patch. Also, NodeType.NODEGROUP_LOCAL is used twice in this pluggable patch but can be refactored to avoid to appear in this patch: 1) For one usage, which is in FifoScheduler, we can override getMaxAllocatableContainers() to make NodeType.NODEGROUP_LOCAL go to FifoSchedulerWithNodeGroup. 2) For the other usage, which is in AppSchedulingInfo, we can use the standard pluggable way to create a subclass of AppSchedulingInfo i.e: AppSchedulingWithNodeGroupInfo and use ReflectionUtils to construct one at round time. However, as we already add some pluggable classes for nodegroup-aware in configuration as following, I am wondering if we are on the right direction to make things pluggable. Does it looks over-complexity for user to configure? <code> + /** Implementation class of ScheduledRequests.*/ + public static final String RM_SCHEDULED_REQUESTS_CLASS_KEY = + RM_PREFIX + "scheduled.requests.class"; + + /** The boolean value if network topology with nodegroup.*/ + public static final String NET_TOPOLOGY_WITH_NODEGROUP = + "net.topology.with.nodegroup"; + + /** Implementation class of SchedulerNode.*/ + public static final String RM_SCHEDULER_NODE_CLASS_KEY = + RM_PREFIX + "scheduler.node.class"; + + /** LeafQueue implementation class for Capacity Scheduler.*/ + public static final String RM_CAPACITY_SCHEDULER_LEAFQUEUE_CLASS_KEY = + RM_PREFIX + "scheduler.leafqueue.class"; <code> > Make locatlity in YARN's container assignment and task scheduling pluggable > for other deployment topology > --------------------------------------------------------------------------------------------------------- > > Key: MAPREDUCE-4309 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4309 > Project: Hadoop Map/Reduce > Issue Type: Bug > Affects Versions: 1.0.0, 2.0.0-alpha > Reporter: Junping Du > Assignee: Junping Du > Attachments: > HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, > MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, MAPREDUCE-4309-v4.patch, > MAPREDUCE-4309.patch > > > There are several classes in YARN’s container assignment and task scheduling > algorithms that relate to data locality which were updated to give preference > to running a container on other locality besides node-local and rack-local > (like nodegroup-local). This propose to make these data structure/algorithms > pluggable, like: SchedulerNode, RMNodeImpl, etc. The inner class > ScheduledRequests was made a package level class to it would be easier to > create a subclass, ScheduledRequestsWithNodeGroup. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira