Re: Rack awareness support for Mesos

Joris Van Remoortere Wed, 15 Jun 2016 00:45:32 -0700

>> Since your interest is in the determination of the values, as opposed to
>> their propagation, I would just urge that you keep in mind that we may
>> (as a project) not want to support this information as the current
>> string attributes.
>
> Huh? Why not? If the attributes change, why can't this sub-project just
> change with those changing string attributes? Maybe some elaboration how
> this might not naturally be able to evolve is a warranted detail of
> discussion?


Sorry, I should clarify what I meant by support. By support I mean that we
may not want to promise that those values will be there (support as a
feature), or to standardize whatever schemas are mangled into the free-form
strings that we currently call attributes. I did not mean that we wouldn't
allow users to inject their own values if they wanted to. We just wouldn't
control the standard or schema as a project and therefore couldn't support it.

Any random collection of strings that has previously had no reserved
keywords is notoriously difficult to build new schemas in.
This is why we may want to instead introduce a typed structure that is
dedicated to fault domain information. This:

   - Prevents us from colliding with current users' attributes.
   - Allows us to have more control over the types (YAY) and ranges of
   values.
   - Allows us to introduce explicit structure such as dependency or
   hierarchy.
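
To make the three points above concrete, here is a minimal Python sketch. It
is an illustration only: `FaultDomain`, its `region`/`zone`/`rack` fields, and
`parse_attributes` are hypothetical names chosen for this example, not an
existing or proposed Mesos API, and the attribute parser is a simplified
stand-in for the `key:value;key:value` text form.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


def parse_attributes(text: str) -> Dict[str, str]:
    """Parse 'key:value;key:value' text into a dict of untyped strings."""
    parsed = {}
    for item in text.split(";"):
        if item.strip():
            key, _, value = item.partition(":")
            parsed[key.strip()] = value.strip()
    return parsed


@dataclass(frozen=True)
class FaultDomain:
    """Hypothetical typed fault-domain record; not an actual Mesos type."""
    region: str
    zone: str
    rack: Optional[str] = None

    def __post_init__(self) -> None:
        # Typed fields let the project validate values up front.
        if not self.region or not self.zone:
            raise ValueError("region and zone must be non-empty")
        if self.rack is not None and not self.rack:
            raise ValueError("rack must be non-empty when given")

    def path(self) -> Tuple[str, ...]:
        """Return the explicit hierarchy, widest domain first."""
        levels = (self.region, self.zone)
        return levels if self.rack is None else levels + (self.rack,)

    def same_rack(self, other: "FaultDomain") -> bool:
        """True only if this agent reports a rack and the full paths match."""
        return self.rack is not None and self.path() == other.path()


# Two operators encode "rack" differently in free-form attributes; the parser
# has no way to tell which convention is in use.
site_a = parse_attributes("rack:r12;kernel:2.6.44")
site_b = parse_attributes("rack:us-east/az1/r12")
print(site_a["rack"], "vs", site_b["rack"])

# The typed form keeps the hierarchy explicit and comparable.
agent_1 = FaultDomain("us-east", "az1", "r12")
agent_2 = FaultDomain("us-east", "az2", "r12")
print(agent_1.same_rack(agent_2))
```

In this sketch the two agents share the rack label `r12` but sit in different
zones, so comparing the full path avoids treating them as co-located, which a
bare `rack` string attribute cannot express without an agreed schema.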

The fact that users have already encoded information in attributes is not a
reason for us to limit ourselves to that scope when better structures may
be available. This is why we shouldn't assume that the project will
*provide support for* using attributes (as opposed to merely allowing users
to use them).

As you said, it is their prerogative to join the design discussion to
ensure that any formalized structure or schema we introduce is one that
they are agreeable with.



—
*Joris Van Remoortere*
Mesosphere

On Tue, Jun 14, 2016 at 6:31 PM, james <gar...@verizon.net> wrote:

> On 06/14/2016 08:14 AM, Joris Van Remoortere wrote:
>
>>> On the condition of compatibility with existing frameworks that already
>>> rely on parsing attributes for rack information.
>>>
>> There is currently nothing in Mesos that specifies the format or
>> structure for rack information in attributes.
>> The fact that operators / frameworks have decided to add this
>> information out of band is their problem to solve.
>> We don't need to be backwards compatible with something we never
>> published to begin with. This is why it's ok for us to consider adding a
>> typed form of failure domain information that is separate from the
>> typeless string attributes.
>>
>
> True. But you have to start somewhere, knowing that the schema and codes
> will morph over time to maintain relevance and usefulness. In that vein, if
> folks have established interesting and useful parameters for this work,
> then it is most beneficial that those methods and codes are considered
> carefully.  AKA:: speak up now. Diversity and inclusion are keenly
> beneficial, where practical.
>
>
> Since your interest is in the determination of the values, as opposed to
>> their propagation, I would just urge that you keep in mind that we may
>> (as a project) not want to support this information as the current
>> string attributes.
>>
>
> Huh? Why not? If the attributes change, why can't this sub-project just
> change with those changing string attributes? Maybe some elaboration how
> this might not naturally be able to evolve is a warranted detail of
> discussion?
>
>
> I would venture that both 'determination of the values and propagation
> (delays)' are inherently important in a cluster of many things:: hardware,
> resources, frameworks, security codes, etc etc. The author
> and others seem to be keenly aware that a tight focus is not going to
> work, at this stage, so a broad appeal to a multitude of needs is best.
> And in fact, until some idea is proven to be useless or too difficult to
> implement, the bigger the tent, the more useful the codes that define this
> project/idea become.  Personally, I'm very excited that someone has stepped
> up in this area; hoping they keep an open mind and flexibility geared
> toward multiplicative usage, in the future. Most mature hardware folks who
> build ideas into robust systems do exactly that, to motivate a
> multiplicative usage for organizing hardware, performance and state
> metrics, and timing signals, gregariously. All of this is routine semantics
> from a hardware perspective.
>
> At some point, folks will realize that kernel configuration, testing and
> tweaks are critical to cluster performance, regardless of the codes
> running on top of the cluster. So this project could easily use cgroups
> and such to achieve robustness in many areas of need.
>
>
> Like it or not, large amounts of hardware need schema, planning, and
> architectural robustness to stay pristinely available, so that software
> efficiency can come anywhere near optimal in deployment.
> This really becomes critical when a mix of different CPU types, GPUs, and
> RAM is to be considered in future deployments, regardless of whether you
> outsource or run your own cluster. Hardware vendors are going to want to
> sell their products to as wide a customer base as possible, and customers
> are going to demand seamless management for expansion of resources.
> Furthermore, as a consultant, my experience is that much of the future
> market is going to demand outsourced, hybrid, and in-house options as a
> fundamental tenet of cluster resource adoption.
>
> hth,
> James
>
>
>> *Joris Van Remoortere*
>> Mesosphere
>>
>> On Tue, Jun 14, 2016 at 3:02 PM, Du, Fan <fan...@intel.com> wrote:
>>
>>
>>
>>     On 2016/6/14 20:32, Joris Van Remoortere wrote:
>>
>>              #1. Stick with attributes for rack awareness
>>
>>         I don't think this is the right approach; however, there seem to
>>         be 2 components to this discussion:
>>
>>         1. How the values are presented (Attributes vs. a new type-aware
>>         structure)
>>         2. How the values are determined (scripts vs. automation vs.
>>         modules)
>>
>>         It seems you are more interested in working on #2. If that's the
>>         case, please make sure that you don't assume anything about #1, as
>>         not everyone agrees that we will use the existing attributes in the
>>         future.
>>
>>
>>     On the condition of compatibility with existing frameworks that already
>>     rely on parsing attributes for rack information.
>>
>>     Quotes from my original statements:
>>     > For compatibility with existing frameworks, I tend to be OK with
>>     > using attributes to convey the rack information
>>
>>     By all means, no matter what internal structures are used, current
>>     behavior should be honored. By the way, I'm also thinking about #1,
>>     but it's too early to bring up the details before the ticket gets
>>     ACCEPTED.
>>
>>     Anyway, I'm always open to all kinds of discussion. Thanks for your
>>     comments, Joris!
>>
>>         For #2, you should focus on an API (module or script results)
>>         that will support all the different methods the community wants to
>>         use to generate this data.
>>
>>         As you mentioned, updating the values for a running agent is not
>>         straightforward. A lot of design work will need to go into how
>>         these values are propagated to frameworks that have made
>>         assumptions about them, and which values are allowed to change vs.
>>         not.
>>
>>         —
>>         *Joris Van Remoortere*
>>         Mesosphere
>>
>>         On Tue, Jun 14, 2016 at 10:04 AM, Aaron Carey <aca...@ilm.com>
>>         wrote:
>>
>>              #3 would be very helpful for us. Also related:
>>
>>         https://issues.apache.org/jira/browse/MESOS-3059
>>
>>              --
>>
>>              Aaron Carey
>>              Production Engineer - Cloud Pipeline
>>              Industrial Light & Magic
>>              London
>>              020 3751 9150
>>
>>              ________________________________________
>>              From: Du, Fan [fan...@intel.com]
>>              Sent: 14 June 2016 07:24
>>              To: user@mesos.apache.org; d...@mesos.apache.org
>>              Cc: Joris Van Remoortere; vinodk...@apache.org
>>
>>
>>              Subject: Re: Rack awareness support for Mesos
>>
>>              Hi everyone
>>
>>              Let me summarize the discussion about rack awareness in the
>>              community so far. First, thanks for all the comments, advice,
>>              and challenges! :)
>>
>>              #1. Stick with attributes for rack awareness
>>
>>              For compatibility with existing frameworks, I tend to be OK
>>              with using attributes to convey the rack information, but with
>>              the goal of doing it automatically, in a way that is easy to
>>              maintain and has a good attribute schema. This brings up the
>>              question below, where the controversy starts.
>>
>>              #2. Scripts vs programmatic way
>>
>>              Both can be used to set attributes. I've made my arguments in
>>              the Jira and the design doc, so I'm not going to argue more
>>              here. But please first take a look at the discussion in
>>              MESOS-3366, which allows resources/attributes discovery.
>>
>>              A module implementing the *slaveAttributesDecorator* hook will
>>              work like a charm here in a static way. We still need to
>>              justify updating attributes.
>>
>>              #3. Allow updating attributes
>>              Several cases need to be covered here:
>>
>>              a). Mesos runs inside VMs or containers, where live migration
>>              happens, so rack information needs to be updated.
>>
>>              b). LLDP packets are broadcast at an interval of 10s~30s,
>>              depending on the vendor-specific implementation, and rack
>>              information is usually stored in the LLDP daemon to be queried.
>>              In the worst case (a freshly rebooted node or a daemon
>>              restart), the Mesos slave has to wait 10s~30s for valid rack
>>              information before registering with the master. Allowing
>>              attribute updates will mitigate this problem.
>>
>>              c). Framework affinity
>>
>>              Framework X prefers to run on the same nodes as another
>>              framework Y. For example, it's desirable for Shark or Spark-SQL
>>              to reside on the *worker* node where Alluxio (formerly Tachyon)
>>              runs, to gain a performance boost, as in the SPARK-6707 ticket
>>              message {tachyon=true;us-east-1=false}
>>
>>              If a framework could advertise agent attributes in the
>>              ResourcesOffer process, awesome!
>>
>>
>>              #4. Rearrange agents in a more scalable manner, e.g. on a
>>              per-rack basis
>>
>>              Randomly offering agents' resources to frameworks does not
>>              improve data locality; imagine the likelihood of a framework
>>              getting resources underneath the same rack at the scale of
>>              30000+ nodes. Moreover, the time to randomly shuffle the agents
>>              also grows.
>>
>>              How about rearranging the agents on a per-rack basis? Together
>>              with a minor change to the way resources are allocated, this
>>              would fix the problem.
>>
>>
>>              I might not see the whole picture here, so comments are
>>              welcome!
>>
>>
>>              On 2016/6/6 17:17, Du, Fan wrote:
>>               > Hi, Mesos folks
>>               >
>>               > I’ve been thinking about Mesos rack awareness support for a
>>               > while. It’s a common interest for lots of data center
>>               > applications to provide data locality, fault tolerance, and
>>               > better task placement. I created MESOS-5545 to track the
>>               > story, and here is the initial design doc [1] to support
>>               > rack awareness in Mesos.
>>               >
>>               > Looking forward to hearing any comments from end users and
>>               > other developers.
>>               >
>>               > Thanks!
>>               >
>>               > [1]:
>>               > https://docs.google.com/document/d/1rql_LZSwtQzBPALnk0qCLsmxcT3-zB7X7aJp-H3xxyE/edit?usp=sharing
>>               >
>>
>>
>>
>>
>
