Re: Rack awareness support for Mesos

james Tue, 14 Jun 2016 08:45:46 -0700

On 06/14/2016 08:14 AM, Joris Van Remoortere wrote:

On the condition that it is compatible with existing frameworks, which
already rely on parsing attributes for rack information.

There is currently nothing in Mesos that specifies the format or
structure for rack information in attributes.
The fact that operators / frameworks have decided to add this
information out of band is their problem to solve.
We don't need to be backwards compatible with something we never
published to begin with. This is why it's ok for us to consider adding a
typed form of failure domain information that is separate from the
typeless string attributes.

True. But you have to start somewhere, knowing that the schema and codes
will morph over time to maintain relevance and usefulness. In that vein,
if folks have established interesting and useful parameters for this
work, then it is most beneficial that those methods and codes are
considered carefully. In other words: speak up now. Diversity and
inclusion are keenly beneficial, where practical.

Since your interest is in the determination of the values, as opposed to
their propagation, I would just urge that you keep in mind that we may
(as a project) not want to support this information as the current
string attributes.

Huh? Why not? If the attributes change, why can't this sub-project just
change with those changing string attributes? Maybe some elaboration on
how this might not naturally be able to evolve is a warranted detail of
discussion?

I would venture that both 'determination of the values and propagation
(delays)' are inherently important in a cluster of many things:
hardware, resources, frameworks, security codes, etc. The author and
others seem to be keenly aware that a tight focus is not going to work
at this stage, so a broad appeal to a multitude of needs is best.

And in fact, until some idea is proven to be useless or too difficult to
implement, the bigger the tent, the more useful the codes that define
this project/idea become. Personally, I'm very excited that someone has
stepped up in this area; hoping they keep an open mind and flexibility
geared toward multiplicative usage in the future. Most mature hardware
folks who build ideas into robust systems do exactly that, to motivate a
multiplicative usage for organizing hardware, performance and state
metrics, and timing signals, gregariously. All of this is routine
semantics from a hardware perspective.

At some point, folks will realize that kernel configuration, testing and
tweaks are critical to cluster performance, regardless of the codes
running on top of the cluster. So this project could easily use cgroups
and such to achieve robustness in many areas of need.

Like it or not, large amounts of hardware need to have schema, planning
and architectural robustness to keep that hardware pristinely available
for software efficiency to be anywhere near optimal deployment. This
really becomes critical when the mix of different CPU types, GPUs and
RAM is to be considered in future deployments, regardless of whether you
outsource or run your own cluster. Hardware vendors are going to want to
sell their products to as wide a customer base as possible, and
customers are going to demand seamless management for expansion of
resources. Furthermore, as a consultant, my experience is that much of
the future market is going to demand outsourced, hybrid and in-house
options as a fundamental tenet of cluster resource adoption.


hth,
James

*Joris Van Remoortere*
Mesosphere

On Tue, Jun 14, 2016 at 3:02 PM, Du, Fan <fan...@intel.com> wrote:



    On 2016/6/14 20:32, Joris Van Remoortere wrote:

             #1. Stick with attributes for rack awareness

        I don't think this is the right approach; however, there seem to be 2
        components to this discussion:

        1. How the values are presented (Attributes vs. a new type-aware
        structure)
        2. How the values are determined (scripts vs. automation vs.
        modules)

        It seems you are more interested in working on #2. If that's the case,
        please make sure that you don't assume anything about #1, as not
        everyone agrees that we will use the existing attributes in the
        future.


    On the condition that it is compatible with existing frameworks, which
    already rely on parsing attributes for rack information.

    Quotes from my original statements:
    > For compatibility with existing framework, I tend to be ok with using
    > attributes to convey the rack information

    By all means, no matter what internal structures are used, current
    behavior should be honored. By the way, I'm also thinking about #1,
    but it's too early to bring up the details before the ticket gets
    ACCEPTED.

    Anyway, I'm always open to all kinds of discussion. Thanks for your
    comments, Joris!

        For #2, you should focus on an API (module or script results) that will
        support all the different methods the community wants to use to generate
        this data.

        As you mentioned, updating the values for a running agent is not
        straightforward. A lot of design work will need to go into how these
        values are propagated to frameworks that have made assumptions about
        them, and which values are allowed to change vs. not.
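To make the "allowed to change vs. not" question concrete, here is a minimal Python sketch. It is not Mesos code and not a proposed design: the function name, the plain dict representation of attributes, and the idea of a per-key `mutable_keys` policy are all assumptions made only for illustration.

```python
from typing import Dict, Optional, Set, Tuple

AttrMap = Dict[str, str]
Changes = Dict[str, Optional[str]]


def split_attribute_update(
    old: AttrMap, new: AttrMap, mutable_keys: Set[str]
) -> Tuple[Changes, Changes]:
    """Compare two attribute maps and sort the differences by policy.

    Returns (accepted, rejected). Each maps a changed key to its new
    value, or to None when the key was removed. `mutable_keys` is a
    hypothetical policy naming the keys that may change on a running agent.
    """
    accepted: Changes = {}
    rejected: Changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before == after:
            continue  # unchanged keys need no propagation
        target = accepted if key in mutable_keys else rejected
        target[key] = after
    return accepted, rejected
```

Under this sketch, only the accepted changes would need to be propagated to frameworks; what to do with rejected ones (ignore, log, or refuse the update) is exactly the design work mentioned above.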

        —
        *Joris Van Remoortere*
        Mesosphere

        On Tue, Jun 14, 2016 at 10:04 AM, Aaron Carey <aca...@ilm.com> wrote:

             #3 would be very helpful for us. Also related:

        https://issues.apache.org/jira/browse/MESOS-3059

             --

             Aaron Carey
             Production Engineer - Cloud Pipeline
             Industrial Light & Magic
             London
             020 3751 9150

             ________________________________________
             From: Du, Fan [fan...@intel.com]
             Sent: 14 June 2016 07:24
             To: user@mesos.apache.org; d...@mesos.apache.org
             Cc: Joris Van Remoortere; vinodk...@apache.org

             Subject: Re: Rack awareness support for Mesos

             Hi everyone

             Let me summarize the discussion about rack awareness in the
             community so far. First, thanks for all the comments, advice
             and challenges! :)

             #1. Stick with attributes for rack awareness

             For compatibility with existing frameworks, I tend to be OK with
             using attributes to convey the rack information, but with the
             goal of doing it automatically, keeping it easy to maintain, and
             using a good attribute schema. This brings up the question
             below, where the controversy starts.

             #2. Scripts vs programmatic way

             Both can be used to set attributes. I've made my arguments in the
             Jira and the design doc, so I'm not going to argue more here. But
             please take a look at the earlier discussion in MESOS-3366, which
             allows resource/attribute discovery.

             A module that implements the *slaveAttributesDecorator* hook will
             work like a charm here in a static way. What remains is to
             justify attribute updating.

             #3. Allow updating attributes
             Several cases need to be covered here:

             a). Mesos runs inside VMs or containers, where live migration
             happens, so rack information needs to be updated.

             b). LLDP packets are broadcast at an interval of 10s~30s
             (vendor-specific), and rack information is usually stored in the
             LLDP daemon to be queried. In the worst case (a freshly rebooted
             node, or a daemon restart), the Mesos slave has to wait 10s~30s
             for valid rack information before registering with the master.
             Allowing attribute updates will mitigate this problem.

             c). Framework affinity

             Framework X prefers to run on the same nodes as another
             framework Y. For example, it's desirable for Shark or Spark-SQL
             to reside on the *worker* node where Alluxio (formerly Tachyon)
             runs, to gain a performance boost, as in the SPARK-6707 ticket
             message {tachyon=true;us-east-1=false}

             If a framework could advertise agent attributes in the
             ResourcesOffer process, awesome!
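The LLDP wait in case b) can be sketched as a bounded poll. This is a minimal Python illustration, not Mesos or LLDP daemon code: `wait_for_rack`, its parameters, and the `query` callable (standing in for whatever asks the local LLDP daemon) are hypothetical names.

```python
import time
from typing import Callable, Optional


def wait_for_rack(
    query: Callable[[], Optional[str]],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll `query` until it yields a non-empty rack name or time runs out.

    `query` is a hypothetical stand-in for asking the LLDP daemon; it
    returns None or "" while no LLDP packet has been received yet.
    Returns the rack name, or None if nothing arrived within `timeout_s`.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        rack = query()
        if rack:
            return rack
        if time.monotonic() >= deadline:
            return None  # caller decides: register without rack info, or fail
        sleep(interval_s)
```

Without attribute updates, an agent would have to block on something like this before registering. With updates allowed, it could register immediately when the result is None and fill in the rack later.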


             #4. Rearrange agents in a more scalable manner, e.g. on a
             per-rack basis

             Randomly offering agents' resources to frameworks does not
             improve data locality; imagine the likelihood of a framework
             getting resources underneath the same rack at the scale of
             30000+ nodes. Moreover, the time to randomly shuffle the agents
             also grows.

             How about rearranging the agents on a per-rack basis? A minor
             change to the way resources are allocated will fix this.
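A minimal Python sketch of the per-rack idea follows. It is not the Mesos allocator; the function names, the `(agent_id, rack)` tuple representation, and the "largest rack first" ordering are assumptions chosen only to show how grouping changes which agents get offered together.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_rack(agents: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Index (agent_id, rack) pairs by rack, keeping input order per rack."""
    racks: Dict[str, List[str]] = defaultdict(list)
    for agent_id, rack in agents:
        racks[rack].append(agent_id)
    return dict(racks)


def pick_agents(racks: Dict[str, List[str]], wanted: int) -> List[str]:
    """Pick up to `wanted` agents, exhausting one rack before the next.

    Racks are visited largest first (ties broken by rack name), so the
    selection touches as few racks as possible.
    """
    chosen: List[str] = []
    for rack in sorted(racks, key=lambda r: (-len(racks[r]), r)):
        for agent_id in racks[rack]:
            if len(chosen) == wanted:
                return chosen
            chosen.append(agent_id)
    return chosen
```

Compared with shuffling all agents, a selection like this keeps a framework's resources under the same rack whenever a single rack can satisfy the request.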


             I might not see the whole picture here, so comments are welcome!


             On 2016/6/6 17:17, Du, Fan wrote:
              > Hi, Mesos folks
              >
              > I’ve been thinking about Mesos rack awareness support for a
              > while; it’s a common interest for lots of data center
              > applications to provide data locality, fault tolerance and
              > better task placement. I created MESOS-5545 to track the
              > story, and here is the initial design doc [1] to support rack
              > awareness in Mesos.
              >
              > Looking forward to hearing any comments from end users and
              > other developers.
              >
              > Thanks!
              >
              > [1]: https://docs.google.com/document/d/1rql_LZSwtQzBPALnk0qCLsmxcT3-zB7X7aJp-H3xxyE/edit?usp=sharing
              >
