Hi everyone

Let me summarize the community discussion about rack awareness so far. First, thanks for all the comments, advice, and challenges! :)

#1. Stick with attributes for rack awareness

For compatibility with existing frameworks, I'm OK with using attributes to convey the rack information, with the goals of setting them automatically, keeping them easy to maintain, and having a well-defined attribute schema. This brings up the question below, where the controversy starts.

#2. Scripts vs programmatic way

Both can be used to set attributes. I've made my arguments in the JIRA ticket and the design doc, so I won't argue further here. But please first take a look at the discussion in MESOS-3366, which proposes resources/attributes discovery.

A module implementing the *slaveAttributesDecorator* hook would work like
a charm here for the static case; updating attributes still needs to be justified separately.
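To make the static case concrete, here is a minimal sketch (Python, purely illustrative; the field names and the helper are hypothetical, not Mesos APIs) of how a discovery step could map LLDP neighbor data to a `rack` attribute:

```python
def rack_attribute(lldp_neighbor):
    """Build a Mesos-style attribute string like 'rack:tor-sw-12' from
    the chassis name of the top-of-rack switch seen via LLDP.

    'lldp_neighbor' is an assumed record shape, e.g. as a discovery
    script might collect from the local LLDP daemon."""
    chassis = lldp_neighbor.get("chassis_name")
    if not chassis:
        return None  # no valid LLDP info yet
    return "rack:%s" % chassis

# Example neighbor record (hypothetical data).
neighbor = {"chassis_name": "tor-sw-12", "port": "Ethernet1/3"}
print(rack_attribute(neighbor))  # rack:tor-sw-12
```

A decorator-style module would compute something like this once at agent startup, which is exactly why the update cases below matter.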

#3. Allow updating attributes

Several cases need to be covered here:

a). Mesos runs inside VMs or containers, where live migration can happen, so the rack information needs to be updated.

b). LLDP packets are broadcast at a vendor-specific interval, typically 10s~30s, and rack information is usually stored in the LLDP daemon to be queried. In the worst cases (a fresh node reboot, or a daemon restart), the Mesos slave would have to wait 10s~30s for valid rack information before registering with the master. Allowing attribute updates would mitigate this problem.
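The waiting behavior in case b) can be sketched as a simple poll-with-deadline loop (Python, illustrative only; `query_lldp` is a hypothetical callable that returns None until the daemon has data):

```python
import time

def wait_for_rack_info(query_lldp, timeout=30.0, poll=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll the LLDP daemon until it returns rack info or the timeout
    expires. Returns the rack string, or None on timeout."""
    deadline = clock() + timeout
    while True:
        rack = query_lldp()
        if rack is not None:
            return rack
        if clock() >= deadline:
            return None
        sleep(poll)

# Simulated daemon that only has data on the third query.
answers = iter([None, None, "rack:tor-sw-12"])
print(wait_for_rack_info(lambda: next(answers),
                         timeout=5.0, sleep=lambda s: None))
# rack:tor-sw-12
```

With attribute updates allowed, the agent could instead register immediately and fill in the rack attribute once the daemon answers, avoiding this startup stall.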

c). Framework affinity

Framework X prefers to run on the same nodes as another framework Y.
For example, it's desirable for Shark or Spark SQL to reside on the
*worker* nodes where Alluxio (formerly Tachyon) runs to gain a performance boost; see the SPARK-6707 ticket, e.g. {tachyon=true;us-east-1=false}.

If frameworks could advertise desired agent attributes in the resource offer process, awesome!
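A framework-side sketch of the affinity idea, using the {tachyon=true} style constraint from SPARK-6707 (Python, illustrative; the offer/attribute shapes are assumptions, not the Mesos API):

```python
def offer_matches(offer_attributes, required):
    """Check whether an offer's attribute map satisfies a framework's
    required attribute values, e.g. {'tachyon': 'true'}."""
    return all(offer_attributes.get(k) == v for k, v in required.items())

offers = [
    {"id": "o1", "attributes": {"tachyon": "true", "rack": "r1"}},
    {"id": "o2", "attributes": {"tachyon": "false", "rack": "r2"}},
]
wanted = {"tachyon": "true"}
print([o["id"] for o in offers if offer_matches(o["attributes"], wanted)])
# ['o1']
```

The framework accepts only offers from agents co-located with Alluxio; everything else is declined.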


#4. Rearrange agents in a more scalable manner, like per rack basis

Randomly offering agent resources to frameworks does not improve data locality: imagine the likelihood of a framework getting resources from underneath the same rack at a scale of 30000+ nodes. Moreover, the time to randomly shuffle the agents also grows with scale.

How about rearranging the agents on a per-rack basis? A minor change to the way resources are allocated would fix this.
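The per-rack arrangement could look like the following (Python, a minimal illustration of the grouping, not the allocator's actual data structures):

```python
from collections import defaultdict

def group_agents_by_rack(agents):
    """Group agents by their 'rack' attribute; agents without one fall
    into an 'unknown' bucket."""
    racks = defaultdict(list)
    for agent in agents:
        rack = agent.get("attributes", {}).get("rack", "unknown")
        racks[rack].append(agent["id"])
    return dict(racks)

agents = [
    {"id": "agent-1", "attributes": {"rack": "r1"}},
    {"id": "agent-2", "attributes": {"rack": "r2"}},
    {"id": "agent-3", "attributes": {"rack": "r1"}},
]
print(group_agents_by_rack(agents))
# {'r1': ['agent-1', 'agent-3'], 'r2': ['agent-2']}
```

With agents bucketed like this, the allocator could offer a whole rack's worth of resources to one framework at a time, instead of shuffling the flat agent list.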


I might not see the whole picture here, so comments are welcome!


On 2016/6/6 17:17, Du, Fan wrote:
Hi, Mesos folks

I've been thinking about Mesos rack awareness support for a while; it's a common interest for lots of data center applications to provide data locality, fault tolerance, and better task placement. I've created MESOS-5545 to track the story, and here is the initial design doc [1] to support rack awareness in Mesos.

Looking forward to hearing comments from end users and other developers.

Thanks!

[1]:
https://docs.google.com/document/d/1rql_LZSwtQzBPALnk0qCLsmxcT3-zB7X7aJp-H3xxyE/edit?usp=sharing
