[ https://issues.apache.org/jira/browse/MESOS-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15317763#comment-15317763 ]
Fan Du edited comment on MESOS-5545 at 6/7/16 3:35 AM:
-------------------------------------------------------

[~vinodkone] Thanks for the comments.

Rack topology information does not fall within the scope of the network isolator, because it is not something that can or should be isolated.

Here is why I think rack topology information can be updated safely: the state can only transition from "no rack information" to "valid rack information". In other words, tasks may be using resources that carry no rack information, and later on the agents report a rack id to the master. The logic could then follow one or all of these design decisions:
a) notify the corresponding frameworks of the updated rack id for resources already in use,
b) tag subsequent allocations with the rack id of their agents,
c) tag resources freed by a framework with the rack id for the next allocation round.
This scenario is simpler and cleaner than attribute updates. Alternatively, agents could be activated for resource allocation only once they have a valid rack id.

Using attributes is one way to export the rack information, but I don't think that is practical in production: at a scale of 10,000+ servers, would we really set rack attributes from third-party logic before starting each agent? Automatically exposing the rack information could save a lot of deployment and maintenance effort.

Apologies, it seems I don't quite get the meaning of "first class field". Influencing allocation decisions is not the intention of this ticket; I believe that part of the work is out of scope, so I put it in the Future section of the design doc.

The allocation strategy DOES honor DRF. The current implementation allocates on a per-agent basis, and we could investigate different allocation modes. In addition, I'd prefer arranging agents on a per-rack basis, because randomly shuffling agents on every allocation iteration does not scale well to 10,000+ nodes. IIRC, this number has grown.

All in all, IMHO, it's a good feature for Mesos; the question is how to do it elegantly.
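To make the one-way transition and the per-rack arrangement concrete, here is a minimal Python sketch. It is not code from Mesos or from the design doc: `Agent`, `report_rack_id` and `group_by_rack` are hypothetical names, and rejecting a changed rack id is an assumption drawn from the "no rack information to valid rack information" rule described above.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class Agent:
    """Hypothetical stand-in for an agent record held by the master."""

    def __init__(self, agent_id: str, rack_id: Optional[str] = None) -> None:
        self.agent_id = agent_id
        self.rack_id = rack_id  # None means "no rack information yet"

    def report_rack_id(self, rack_id: str) -> bool:
        """Apply a reported rack id; return True if the state changed.

        Only the transition "no rack id" -> "valid rack id" is allowed
        (an assumption of this sketch).
        """
        if not rack_id:
            raise ValueError("rack id must be non-empty")
        if self.rack_id is None:
            self.rack_id = rack_id
            return True
        if self.rack_id != rack_id:
            raise ValueError("rack id is already set and cannot change")
        return False  # same id reported again: nothing to do


def group_by_rack(agents: List[Agent]) -> Dict[Optional[str], List[str]]:
    """Bucket agent ids by rack id; agents without one land under None."""
    racks: Dict[Optional[str], List[str]] = defaultdict(list)
    for agent in agents:
        racks[agent.rack_id].append(agent.agent_id)
    return dict(racks)
```

An allocator built along these lines could iterate over the rack buckets instead of shuffling the full agent list, and could skip the `None` bucket if agents are only activated once they have a valid rack id.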
:)

> Add rack awareness support for Mesos resources
> ----------------------------------------------
>
>                 Key: MESOS-5545
>                 URL: https://issues.apache.org/jira/browse/MESOS-5545
>             Project: Mesos
>          Issue Type: Story
>          Components: hadoop, master
>            Reporter: Fan Du
>
> Resources managed by the Mesos master carry no topology information about
> the cluster, for example rack topology, while many data center applications
> have a rack awareness feature that provides data locality, fault tolerance
> and intelligent task placement. This ticket investigates how to add rack
> awareness to Mesos resources.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)