[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736672#comment-15736672 ] Konstantinos Karanasos commented on YARN-1042: -- [~leftnoteasy], I think you are right that exposing min/max cardinality might be complicated for the user... So, you are suggesting to expose expose to the user explicit affinity and anti-affinity constraints, but interpret them under the hood as more complex unified constraints? I think it is a good approach. This way we keep it simple for the end user, and we create the hooks for more complicated cardinality and tag constraints as future work. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Steve Loughran >Assignee: Wangda Tan > Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, > YARN-1042-global-scheduling.poc.1.patch, YARN-1042.001.patch, > YARN-1042.002.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736084#comment-15736084 ] Wangda Tan commented on YARN-1042: -- Thanks [~kkaranasos] for these suggestions. I agree to merge the implementation in scheduler to support affinity and anti-affinity, we should not have duplicated code to support both of (anti-)affinity. However, personally I'm not in favor of min-cardinality / max-cardinality, two reasons: 1) Comparing to affinity-to / anti-affinity-to, it is not straightforward enough to a normal YARN user. (of course, it is easy for a PhD :p) 2) We have to tradeoff between syntax flexibility and feature availability. Users will be happy if YARN allow them to specify as many constraints as they want, but they will be soon frustrated if we cannot give them in time. For example, user can specify allocate min=10/max=10 MPI tasks each host, but in reality, it could be quite hard in a busy cluster. And it gonna be hard for YARN to optimize such constraints, for example, how to preempt containers to satisfy min=max=10 cardinality. I have a simpler suggestion to handle most possible use cases as the first step. Which is: - Extend existing ResourceRequest, by adding a new {{placementConstraintExpression}} - Simple anti-affinity (min=max=1) - Simple affinity (min=2,max=infinity) - Within app and between app - Once tags are allowed to specified in ResourceRequest, we can support (anti-)affinity for tags The API could looks like: - {{affinity-to app=app_1234_0002}} - {{anti-affinity-to app=app_1234_0002}} - {{anti-affinity-to tag="Hbase-master"}} - {{anti-affinity-to app=app_1234_0002,tag="HBase-master"}} And more rich syntax (like affinity to rack / cluster, etc.) we can think more when doing YARN-4902. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Steve Loughran >Assignee: Wangda Tan > Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, > YARN-1042-global-scheduling.poc.1.patch, YARN-1042.001.patch, > YARN-1042.002.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733903#comment-15733903 ] Konstantinos Karanasos commented on YARN-1042: -- Hi [~leftnoteasy]. Thanks for working on this. Being able to specify (anti-)affinity constraints is a feature that we are very interested in as well. I gave a quick look at your patch. As I mentioned to you in the past, we also have a working prototype for defining and imposing similar constraints (I had uploaded an initial version of it at YARN-4902 some time back). We can share more code if it helps. For now let's focus on the API -- we can later discuss how to make the scheduling efficient (e.g., using the global scheduling and so on). I would suggest to unify the affinity and anti-affinity constraints and make them a bit more general too. For instance, we can specify "cardinality" constraints of the form "put no more than 4 HBase masters at each rack". This constraint is useful but cannot be captured by pure affinity or anti-affinity constraints. I assume that its task can be associated with a set of labels (similar to the ones we are discussing in YARN-3409). This is not there at the moment, but can be added. Then, we could use the following placement constraint form: {noformat} {task, target-tasks, cluster-scope, min-cardinality, max-cardinality}. {noformat} In the above constraint, {{task}} is a specific task, {{target-tasks}} is a label (e.g., "HBase-master", "app_000123" or "latency-critical") that specifies a set of tasks that are already scheduled in the cluster, {{cluster-scope}} is something among NODE and RACK (and possibly others in the future), {{min-cardinality}} is the minimum number of tasks with label {{target-tasks}} that can appear at the same {{cluster-scope}} with {{task}}, and {{max-cardinality}} is the respective maximum cardinality. Using the {{min-}} and {{max-cardinality}}, we can specify affinity, anti-affinity and other cardinality constraints. For example, {task_001, "HBase-master", NODE, 1, 1} is an anti-affinity constraint (don't put more than one HBase master at a node. Likewise, {task_001, "HBase-region-server", NODE, 2, MAX_INT} is an affinity constraint for region servers. Let me know how it looks. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Steve Loughran >Assignee: Wangda Tan > Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, > YARN-1042-global-scheduling.poc.1.patch, YARN-1042.001.patch, > YARN-1042.002.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15674879#comment-15674879 ] Wangda Tan commented on YARN-1042: -- {color:red} Opened YARN-5907 as umbrella JIRA of this ticket. {color} > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Steve Loughran >Assignee: Wangda Tan > Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, > YARN-1042-global-scheduling.poc.1.patch, YARN-1042.001.patch, > YARN-1042.002.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352072#comment-15352072 ] Wangda Tan commented on YARN-1042: -- Hi [~cheersyang], thanks for the design doc, patch and ideas. I think there're two major problems that we need to solve. 1) What the API should look like? 2) How to make it efficient enough? For 1), I'm agree with most of the comments from [~cchen317], existing ResourceRequest API has some flaws that we may need to think how to fix them. We have a design doc attached to YARN-4902 with a lot of thoughts regarding to this, you can read it if you have interests. And for anti-affinity/affinity API itself, we may need to consider anti-affinity/affinity between applications/priorities. Sometimes different services can be launched by one single YARN application, different roles from different services could possibly have different affinity/anti-affinity requirements. And one service/job can affinity/anti-affinity to another service/job. For 2), Existing scheduler is blindly looping all the nodes in the cluster, we need a global view to schedule them for better performance, I have opened YARN-5139 for the global scheduling framework and attached POC patch to it. Currently I'm working on a prototype patch for this, I want to make this feature based on YARN-5139 and probably based on YARN-4902. I'm interested in taking this forward, assigning it to myself. Would like to hear your thoughts. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Steve Loughran >Assignee: Wangda Tan > Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, > YARN-1042.001.patch, YARN-1042.002.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616522#comment-14616522 ] Weiwei Yang commented on YARN-1042: --- Hello [~cchen317] Thanks for your thoughts. {quote} Basically, one application/attempt may include multiple group of container requests, and each group includes multiple container requests. {quote} This seems like the idea to support specify a container allocation rule per request, e.g app1 asks for 3 containers with policy affinity then asks for 4 container with anti-affinity policy. This requires AM-RM protocol change and that's why it was not in my patch yet. What I have done is to let you be able to specify a rule per app. {quote} Fundamentally, it is hard for scheduler to make a right judgement without knowing the raw container request. The situation will get worse when dealing with affinity and anti-affinity or even gang scheduling etc. {quote} I do not fully understand the meaning of "raw container request" in your comments, but I think I understand your point. While implementing container allocation policies. The hard part for me is, scheduler is not aware of the *context* when it tries to allocate a container. Ideally, it needs to know what are the corresponding containers this container related to (they are not independent, like the group you mentioned), also it needs to know the scheduling details such as how long a request is being waiting for and how many requests are waiting, etc ... These information is very helpful to help scheduler to make more complex decisions. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, > YARN-1042.001.patch, YARN-1042.002.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606934#comment-14606934 ] chong chen commented on YARN-1042: -- A few comments and thoughts: 1) YARN Application Model The current YARN job model only has two layers of requests, application/application attempt and container request. Container requests are independent from each other during scheduling phase. This model seems to work fine for existing simple scenario. However, when dealing with gang scheduling or affinity/anti-affinity etc, this is not enough. Essentially, all these requests introduce relationship among container requests. To easily express relationship, it seems to be nature to have a group concept here. Basically, one application/attempt may include multiple group of container requests, and each group includes multiple container requests. There will be policies within the group. For instance, you can say for group1 of container requests, application prefers to have anti-affinity policies over host, however, for group2, it would be nice to have affinity policy within rack. Furthermore, with group concept, you can also express to say for group3, scheduler needs to make sure all container requests be satisfied before making the allocation decision or even more advanced, group4 needs to meet minimum 4 container requests at the same time before making the allocation decision etc. There can be policies among groups. For instance, if group1 represents hBase masters, group2 represents region servers, I may not want two groups sharing the same host. The bottom line is, each container cannot be considered independently any more, there could be cases scheduler needs to consider them together when making the optimum placement decision. 2) Container Request and Scheduling Flaw To really handle affinity/gang scheduling properly, one should deal with this logic in scheduling and consider all container requests in the group as a whole. The logic should be in scheduler rather than RM layer and scheduler needs to know individual container requests within each group, so it can make proper scheduling decision. This leads to another potential design issue in current YARN. Currently, when AM sends container requests to RM and scheduler, it expands individual container requests into host/rack/any format. For instance, if I am asking for container request with preference "host1, host2, host3", assuming all are in the same rack rack1, instead of sending one raw container request to RM/Scheduler with raw preference list, it basically expand it to become 5 different objects with host1, host2, host3, rack1 and any in there. When scheduler receives information, it basically already lost the raw request. This is ok for single container request, but it will cause trouble when dealing with multiple container requests from the same application. Consider this case: 6 hosts, two racks: rack1 (host1, host2, host3) rack2 (host4, host5, host6) When application requests two containers with different data locality preference: c1: host1, host2, host4 c2: host2, host3, host5 This will end up with following container request list when client sending request to RM/Scheduler: host1: 1 instance host2: 2 instances host3: 1 instance host4: 1 instance host5: 1 instance rack1: 2 instances rack2: 2 instances any: 2 instances During scheduling, assume when host1 heartbeat comes, scheduler assigns container to host1, before Application master receives the container and updates its requests, if next host heartbeat is host4, scheduler will assign again even though they belong to the same container. Fundamentally, it is hard for scheduler to make a right judgement without knowing the raw container request. The situation will get worse when dealing with affinity and anti-affinity or even gang scheduling etc. Ideally, YARN resource allocation request should be changed to send in raw container requests instead of expanded one. It should be scheduler module responsibility to interpret raw request and build up optimized data structure to use it. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, > YARN-1042.001.patch, YARN-1042.002.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to wa
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585494#comment-14585494 ] Weiwei Yang commented on YARN-1042: --- Hi Steve >From your comments, I think you want yarn API to support requesting containers >with given rules per request. Guess you want it to have the API looks like in AMRMClient.java, ContainerRequest supports following arguments * Resource capability * String[] nodes * String[] racks * Priority priority * boolean relaxLocality * String nodeLabelExpression * *ContainerAllocateRule containerAllocateRule* the last one (in bold text) is the new argument for you to specify a particular rule (we will discuss rule details later). Problem here is, RM needs to know which application (or role in slider's context) the rule applies for, only then RM can assign containers by obeying that rule when dealing with the request coming from the same application. However, if you only specify the rule per container request, how can RM know what are the containers need to be considered to when applies the rule ? Let me give an example to explain {code} ContainerRequest containerReq1 = new ContainerRequest(capability1, nodes, racks, priority, affinityRequiredRule); amClient.addContainerRequest(containerReq1); AllocateResponse allocResponse = amClient.allocate(0.1f) {code} The AllocationRequest AM sent to RM only tells RM that these container requests need to use affinityRequiredRule, but RM does not know which containers this request affine with, so RM cannot place the rule during the allocation. This is the reason why I propose to register the mapping about {code} application <-> allocation-rule {code} when client submits the application, and keep it in the RM context, so RM can apply the rule when there is a request coming from AM. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, > YARN-1042.001.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581762#comment-14581762 ] Weiwei Yang commented on YARN-1042: --- Thanks Steve I have updated my patch based on your comment 2 and 3. But for 1, we need some more discussion. I am glad for all the suggestions. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch, YARN-1042-design-doc.pdf, > YARN-1042.001.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581638#comment-14581638 ] Steve Loughran commented on YARN-1042: -- A bit more detail on how slider uses placement, specifically, we have different policies for different container roles. We have the notion of different component types (region servers, hbase masters, kafka nodes), with different placement policies related to affinity/anti-affinity and whether we use the history of previous locations. Examples * HBase masters: don't care where they come up, as long as they have anti-affinity (that's a goal) * region servers: ideally, anti-affinity + history. We remember where they were last run and ask for one RS on every host that had one before (if there were 2 region servers on a host, only one is asked for, for better distribution) * Kafka nodes:ask for the previous locations, wait 20+ minutes for YARN to satisfy the request, because it is so expensive to rebuild this gives us the following policies * anywhere * anywhere but use history * anti-affinity-preferred * antiti-affinity required (not yet implemented) * strict: wherever used before is where needed next. There's no attempt to escalate placement if the request is unsatisfied. We assign one role per YARN request priority, which is how we map a granted container to a requested role. This implies that we have one placement policy per request priority. I'd be happy if YARN supported that, rather than allowing us to specify a policy per request, which would make aggregation of requests unaggregateable and hence unscaleable > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch, YARN-1042.001.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577079#comment-14577079 ] Steve Loughran commented on YARN-1042: -- thanks for this. I'm afraid I don't know enough about the YARN scheduling to review it properly —hopefully the experts will. meanwhile, # does this let us specify different rules for different container requests? The uses we see in slider do differentiate different requests (e.g. we care more about affinity of hbase masters than we do about workers). # how would we set up the enum for if we ever want to add different placement policies in future? I think the strategy would be to have a "no-preferences" policy, which would be the default, and different from affinity (==I really want them on the same node) and anti-affinity. Then we could have a {{switch}} statement to choose placement, rather than just an {{isAffinity()}} predicate. # hadoop's [code rules|https://github.com/steveloughran/formality/blob/master/styleguide/styleguide.md#code-style] are "two spaces, no tabs". > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-001.patch, YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576703#comment-14576703 ] Weiwei Yang commented on YARN-1042: --- Hi Steve Thanks for the comments. I correct the words. And regarding to {quote} That way the AM can choose to wait 1 minute or more for an anti-affine placement before giving up and accepting a node already in use. {quote} this is exactly the reason I proposed the PREFERRED rules, you can set a max time await before compromising the rule. For example, you use ANTI_AFFINITY rule and set 1 minute to the max wait time, then RM will wait for at least 1 minute before assigning a container to a node which already has a container running on it. Or ... forget about REQUIRED or PREFERRED, we can directly define these preference in ContainerAllocateRule class, with an attribute like *maxTimeAwaitBeforeCompromise*, default it is 0, which means never compromise the rule (REQUIRED). > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567291#comment-14567291 ] Steve Loughran commented on YARN-1042: -- I like this, though I'd also like PREFERRED to have two Rs in the middle :). Thinking about how I'd use this in slider, I'd probably want to keep the escalation logic, when to decide when to accept shared-note placement, in my own code. That way the AM can choose to wait 1 minute or more for an anti-affine placement before giving up and accepting a node already in use. We already do that when asking for a container back on the host where an instance ran previously. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560776#comment-14560776 ] Weiwei Yang commented on YARN-1042: --- I am thinking about following approach, appreciate for suggestions : ) In ApplicationSubmissionContext class, add a new argument to indicate the container allocation rule in terms of affinity/anti-affinity. RM will follow the certain rules to allocate containers for this application. The argument is an instance of class ContainerAllocationRule(new), this class defines several types allocation rule, such as * AFFINITY_REQUIRED: containers MUST be allocated on the same host/rack * AFFINITY_PREFERED: prefer to allocate containers on same host/rack if possible * ANTI_AFFINITY_REQUIRED: containers MUST be allocated on different hosts/racks * ANTI_AFFINITY_PREFERED: prefer to allocate containers on different hosts/racks if possible Each of these rules will have a handler on the RM side to add some control on container allocation. When a client submits an application with a certain ContainerAllocationRule to RM, this information will be added into ApplicationAttemptId (because the allocation rule is defined per application), when RM uses registered scheduler to allocate containers, it can retrieve the rule from ApplicationAttemptId and call particular handler during the allocation. The code can be added into SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens so to avoid modifying all schedulers. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541088#comment-14541088 ] Vinod Kumar Vavilapalli commented on YARN-1042: --- I have a proposal document that I am going to post shortly. But the shorter answer is that the YARN effort is complex and will take a while. The SLIDER-82 could take a shortcut by implementing it in the AM - it won't be fool-proof without RM support, but it should get you off the ground. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539686#comment-14539686 ] Steve Loughran commented on YARN-1042: -- I now think that SLIDER-82 could be done without this; we just use the blacklisting feature of the YARN request API to blacklist all nodes on which we have a running instance, expand the cluster gradually & release those nodes where there's affinity conflict. this doesn't mean that it'd be convenient to have this in the YARN scheduler, only that I think we can get by without it > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539482#comment-14539482 ] Weiwei Yang commented on YARN-1042: --- Sorry, not an alternative, SLIDER-82 is depending on this jira. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539481#comment-14539481 ] Weiwei Yang commented on YARN-1042: --- Sorry, not an alternative, SLIDER-82 is depending on this jira. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539467#comment-14539467 ] Yang Weiwei commented on YARN-1042: --- There is no update for a long time, anything new here ? This looks like a nice feature that should be helping us a lot, is this a correct direction that we should put some more efforts to get this done in RM side ? I noticed there are alternatives in slider project, e.g SLIDER-82. Please advise. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248179#comment-14248179 ] Yang Hao commented on YARN-1042: A configuration should be add for yarn-site.xml > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Arun C Murthy > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805531#comment-13805531 ] Junping Du commented on YARN-1042: -- Sure. Arun, thanks for working on this. Please go ahead! > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Junping Du > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805491#comment-13805491 ] Arun C Murthy commented on YARN-1042: - [~djp] Do you mind if I take this over? I can do this concurrently with YARN-796 (which I already have a patch). Tx! > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Junping Du > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764142#comment-13764142 ] Junping Du commented on YARN-1042: -- Are you mean YARN-624? I didn't follow up this JIRA before, and looks like it has been quiet for a while. Do you know what's status there? Shall we add dependency to that JIRA? > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Junping Du > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764131#comment-13764131 ] Steve Loughran commented on YARN-1042: -- I envisage it as RM-side. Once gang scheduling is in, you could then request of the AM that you get three containers on separate Nodes or (after a time period), fail with an error. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Junping Du > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763848#comment-13763848 ] Junping Du commented on YARN-1042: -- Thanks for comments. Luke! bq. Although you can do a lot at the app side with container filtering, protocol and scheduler support will make it more efficient. I guess the intention of the jira is more for the latter, that affinity support should be app independent. Oh. It remind me that intention of JIRA may be on RM side (so the title may be replaced from container requests to resource request) as long lived services may have different AppMaster from default one that I change here. Also, I agree that do it in RM side may be more efficient as no need to return containers in app side for against affinity rules. However, my concern is it may take extra complexity to RM as it make RM aware the affinity/anti-affinity group of tasks (or resource request). IMO, one simplicity and beauty for YARN is: RM only take care abstracted resource request, and do container allocation accordingly. I am not sure if putting resource request into affinity/anti-affinity groups and tracking resource request relationship hurt this beauty. Thoughts? > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Junping Du > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763273#comment-13763273 ] Luke Lu commented on YARN-1042: --- bq. BTW, it seems the effort is more on application side. Do we think it is better to move to MAPREDUCE project? Although you can do a lot at the app side with container filtering, protocol and scheduler support will make it more efficient. I guess the intention of the jira is more for the latter, that affinity support should be app independent. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Junping Du > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763080#comment-13763080 ] Junping Du commented on YARN-1042: -- Hi [~ste...@apache.org], as you are the creator of this jira and probably consume this API in HOYA project. It is great if you can provide some input here. Thx! > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Junping Du > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762928#comment-13762928 ] Junping Du commented on YARN-1042: -- BTW, it seems the effort is more on application side. Do we think it is better to move to MAPREDUCE project? > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Junping Du > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762779#comment-13762779 ] Junping Du commented on YARN-1042: -- Attach a patch to showcase above proposal. This is only a demo patch and haven't included any unit tests so far. I think there are several open questions here before we move on next step: 1. this affinity/anti-affinity rule is bi-direction or not? If task A.affinity(B) is true then B.affinity(A) is always true or not? I guess it is not as A may prefer a list of nodes which makes the relationship non-symmetric. Also, that is how we can differ A prefer to live with B and C from A prefer to live B or C. Isn't it? 2. which rule's priority is higher in case affinity rule against with anti-affinity rule? In demo patch, affinity rule plays as higher priority but I am not sure if this is true in real case. Do we want to make it configurable? Or we just make sure rules updated later can override previous one if conflict. 3. Currently, the affinity/anti-affinity is only considered in node level, do we want to expand it to other level i.e. rack level in future? 4. The API now is to add a list of taskId as affinity/anti-affinity tasks. Is that easy to consume in application prospective? 5. the affinity/anti-affinity rules is a *must* conform rule in current implementation which may cause task starve for longer time, do we think about more leisure rule? Welcome to comments. Thx! > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Junping Du > Attachments: YARN-1042-demo.patch > > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756114#comment-13756114 ] Junping Du commented on YARN-1042: -- My initial proposal is: add affinity and anti-affinity list of TaskAttemptId in ContainerRequest, then in assigning map task to allocated containers, take affinity/anti-affinity relationship into consideration by looking at assignedRequests (tid -> container -> nodeID). Any comments? > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736520#comment-13736520 ] Junping Du commented on YARN-1042: -- Agree. IMO, YARN-796 just address more admin labels for each node, but failure group info is already expressed in network topology. Here we need a stateful allocation for a bunch of containers rather than do assignment for each container statelessly, and we need a way to express the requirement of topology relation between containers. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735386#comment-13735386 ] Steve Loughran commented on YARN-1042: -- Maybe -but now that Hadoop topologies can express failure domains, we could pick that up to say "spread these containers across >1 failure domain" without having to execute any cluster-specific tuning. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735281#comment-13735281 ] Alejandro Abdelnur commented on YARN-1042: -- it seems much of this could be achieved via YARN-796 > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733249#comment-13733249 ] Junping Du commented on YARN-1042: -- Yes. It is pretty useful in cases specified in description. With this info of affinity and anti-affinity, AM should have knowledge to translate container request to resource request and ask for RM. > add ability to specify affinity/anti-affinity in container requests > --- > > Key: YARN-1042 > URL: https://issues.apache.org/jira/browse/YARN-1042 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Affects Versions: 3.0.0 >Reporter: Steve Loughran > > container requests to the AM should be able to request anti-affinity to > ensure that things like Region Servers don't come up on the same failure > zones. > Similarly, you may be able to want to specify affinity to same host or rack > without specifying which specific host/rack. Example: bringing up a small > giraph cluster in a large YARN cluster would benefit from having the > processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira