On 06/14/2016 08:14 AM, Joris Van Remoortere wrote:
On the condition that it stays compatible with existing frameworks, which
already rely on parsing attributes for rack information.
There is currently nothing in Mesos that specifies the format or
structure for rack information in attributes.
The fact that operators / frameworks have decided to add this
information out of band is their problem to solve.
We don't need to be backwards compatible with something we never
published to begin with. This is why it's ok for us to consider adding a
typed form of failure domain information that is separate from the
typeless string attributes.
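To make the distinction concrete, here is a minimal sketch contrasting the two representations. It assumes the `name:value;name:value` form that the agent's attributes flag accepts; the `FailureDomain` class and the `rack` key are hypothetical names chosen for illustration, not an existing Mesos message or a published convention.

```python
from dataclasses import dataclass
from typing import Dict, Optional


def parse_attributes(text: str) -> Dict[str, str]:
    """Parse an attribute string of the form 'name:value;name:value'."""
    attributes = {}
    for pair in text.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name, _, value = pair.partition(":")
        attributes[name.strip()] = value.strip()
    return attributes


def rack_from_attributes(attributes: Dict[str, str],
                         key: str = "rack") -> Optional[str]:
    # With untyped attributes, a framework has to guess which key (if any)
    # carries the rack; "rack" here is only an out-of-band convention.
    return attributes.get(key)


@dataclass(frozen=True)
class FailureDomain:
    """Hypothetical typed failure domain, separate from string attributes."""
    rack: str
    datacenter: Optional[str] = None
```

With the string form, `rack_from_attributes(parse_attributes("location:r1"))` silently yields nothing because the operator picked a different key, whereas a typed field cannot be misnamed by the operator.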
True. But you have to start somewhere, knowing that the schema and codes
will morph over time to maintain relevance and usefulness. In that
vein, if folks have established interesting and useful parameters for
this work, then it is most beneficial that those methods and codes are
considered carefully. In other words: speak up now. Diversity and
inclusion are keenly beneficial, where practical.
Since your interest is in the determination of the values, as opposed to
their propagation, I would just urge that you keep in mind that we may
(as a project) not want to support this information as the current
string attributes.
Huh? Why not? If the attributes change, why can't this sub-project just
change with those changing string attributes? Maybe some elaboration on
why this could not evolve naturally is warranted?
I would venture that both 'determination of the values and propagation
(delays)' are inherently important in a cluster of many things:
hardware, resources, frameworks, security codes, etc. The author
and others seem to be keenly aware that a tight focus is not going to
work at this stage, so a broad appeal to a multitude of needs is best.
And in fact, until some idea is proven to be useless or too difficult to
implement, the bigger the tent, the more useful the codes that define
this project/idea become. Personally, I'm very excited that someone has
stepped up in this area; hoping they keep an open mind and flexibility
geared toward multiplicative usage, in the future. Most mature hardware
folks who build ideas into robust systems do exactly that, to motivate a
multiplicative usage for organizing hardware, performance and state
metrics, and timing signals, gregariously. All of this is routine
semantics from a hardware perspective.
At some point, folks will realize that kernel configuration, testing and
tweaks are critical to cluster performance, regardless of the codes
running on top of the cluster. So this project could easily use cgroups
and such to achieve robustness in many areas of need.
Like it or not, large amounts of hardware need schema, planning
and architectural robustness to stay pristinely available, if software
efficiency is to be anywhere near optimal. This really becomes critical
when a mix of different CPU types, GPUs and RAM is to be considered in
future deployments, regardless of whether you outsource or run your own
cluster.
Hardware vendors are going to want to sell their products to as wide
a customer base as possible, and customers are going to demand seamless
management for expansion of resources. Furthermore, as a consultant, my
experience is that much of the future market is going to demand
outsourced, hybrid and in-house options as a fundamental tenet of
cluster resource adoption.
hth,
James
*Joris Van Remoortere*
Mesosphere
On Tue, Jun 14, 2016 at 3:02 PM, Du, Fan <fan...@intel.com> wrote:
On 2016/6/14 20:32, Joris Van Remoortere wrote:
#1. Stick with attributes for rack awareness
I don't think this is the right approach; however, there seem to be 2
components to this discussion:
1. How the values are presented (Attributes vs. a new type-aware structure)
2. How the values are determined (scripts vs. automation vs. modules)
It seems you are more interested in working on #2. If that's the case,
please make sure that you don't assume anything about #1, as not
everyone agrees that we will use the existing attributes in the future.
On the condition that it stays compatible with existing frameworks, which
already rely on parsing attributes for rack information.
Quotes from my original statements:
> For compatibility with existing frameworks, I tend to be ok with using
> attributes to convey the rack information
By all means, no matter what internal structures are used, current
behavior should be honored. BTW, I'm also thinking about #1, but it's
too early to bring up the details before the ticket gets ACCEPTED.
Anyway, I'm always open to all kinds of discussion. Thanks for your
comments, Joris!
For #2, you should focus on an API (module or script results) that will
support all the different methods the community wants to use to generate
this data.
As you mentioned, updating the values for a running agent is not
straightforward. A lot of design work will need to go into how these
values are propagated to frameworks that have made assumptions about
them, and which values are allowed to change vs. not.
--
*Joris Van Remoortere*
Mesosphere
On Tue, Jun 14, 2016 at 10:04 AM, Aaron Carey <aca...@ilm.com> wrote:
#3 would be very helpful for us. Also related:
https://issues.apache.org/jira/browse/MESOS-3059
--
Aaron Carey
Production Engineer - Cloud Pipeline
Industrial Light & Magic
London
020 3751 9150
________________________________________
From: Du, Fan [fan...@intel.com]
Sent: 14 June 2016 07:24
To: user@mesos.apache.org; d...@mesos.apache.org
Cc: Joris Van Remoortere; vinodk...@apache.org
Subject: Re: Rack awareness support for Mesos
Hi everyone
Let me summarize the discussion about rack awareness in the community so
far. First, thanks for all the comments, advice and challenges! :)
#1. Stick with attributes for rack awareness
For compatibility with existing frameworks, I tend to be ok with using
attributes to convey the rack information, but with the goal of doing it
automatically, in a way that is easy to maintain and has a good attribute
schema. This raises the question below, where the controversy starts.
#2. Scripts vs programmatic way
Both can be used to set attributes. I've made my arguments in the Jira
and the design doc, so I'm not going to argue more here. But please take a
look at the earlier discussion in MESOS-3366, which allows
resources/attributes discovery.
A module implementing the *slaveAttributesDecorator* hook will work like
a charm here in a static way. Updating attributes, however, still needs
to be justified.
#3. Allow updating attributes
Several cases need to be covered here:
a). Mesos runs inside VMs or containers, where live migration happens, so
rack information needs to be updated.
b). LLDP packets are broadcast at an interval of 10s~30s (a vendor
specific implementation), and rack information is usually stored in the
LLDP daemon to be queried. In the worst case (a node freshly rebooted, or
the daemon restarted), the Mesos slave has to wait 10s~30s for valid rack
information before registering with the master. Allowing attributes to be
updated will mitigate this problem.
c). Framework affinity
Framework X prefers to run on the same nodes as another framework Y.
For example, it's desirable for Shark or Spark-SQL to reside on the
*worker* nodes where Alluxio (formerly Tachyon) runs, to gain a
performance boost, as in the SPARK-6707 ticket message:
{tachyon=true;us-east-1=false}
If a framework could advertise agent attributes in the ResourcesOffer
process, awesome!
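Case b) above amounts to a bounded wait before registration. A minimal sketch of that wait follows; `query` is a hypothetical callable standing in for however the LLDP daemon is asked for the rack (no real LLDP or Mesos API is assumed), and the 30 second default simply mirrors the upper end of the interval mentioned above.

```python
import time
from typing import Callable, Optional


def wait_for_rack(query: Callable[[], Optional[str]],
                  timeout: float = 30.0,
                  interval: float = 1.0,
                  clock: Callable[[], float] = time.monotonic,
                  sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Poll `query` until it returns a rack name or `timeout` seconds pass.

    Returns the rack name, or None if no value arrived before the deadline.
    """
    deadline = clock() + timeout
    while True:
        rack = query()
        if rack:
            return rack
        if clock() >= deadline:
            return None
        sleep(interval)
```

If this returns None, the agent either keeps blocking or registers without rack information. Being allowed to update attributes later is what would make the second option safe.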
#4. Rearrange agents in a more scalable manner, e.g. on a per-rack basis
Randomly offering agents' resources to frameworks does not improve data
locality. Imagine the likelihood of a framework getting resources
underneath the same rack at the scale of 30000+ nodes. Moreover, the time
to randomly shuffle the agents also grows.
How about rearranging the agents on a per-rack basis? A minor change
to the way resources are allocated would fix this.
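One way to read the per-rack idea, as a rough sketch only: group agents by rack first, then shuffle racks and the agents inside each rack, so that agents under the same rack stay adjacent in the offer order. This is not the Mesos allocator; the function names and the (agent_id, rack) input shape are assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_rack(agents: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (agent_id, rack) pairs into a rack -> [agent_id, ...] mapping."""
    racks = defaultdict(list)
    for agent_id, rack in agents:
        racks[rack].append(agent_id)
    return dict(racks)


def offer_order(agents: Iterable[Tuple[str, str]], rng=random) -> List[str]:
    """Return agent ids with racks shuffled and each rack kept contiguous."""
    racks = group_by_rack(agents)
    rack_names = list(racks)
    rng.shuffle(rack_names)          # randomize which rack is offered first
    order = []
    for name in rack_names:
        members = racks[name][:]
        rng.shuffle(members)         # keep fairness among agents in a rack
        order.extend(members)
    return order
```

Walking this order, consecutive offers to a framework tend to come from one rack, and each shuffle touches one rack's agents rather than the whole agent list at once.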
I might not see the whole picture here, so comments are welcome!
On 2016/6/6 17:17, Du, Fan wrote:
> Hi, Mesos folks
>
> I’ve been thinking about Mesos rack awareness support for a while.
> It’s a common interest for lots of data center applications to provide
> data locality, fault tolerance and better task placement. I created
> MESOS-5545 to track the story, and here is the initial design doc [1]
> to support rack awareness in Mesos.
>
> Looking forward to hearing any comments from end users and other
> developers.
>
> Thanks!
>
> [1]: https://docs.google.com/document/d/1rql_LZSwtQzBPALnk0qCLsmxcT3-zB7X7aJp-H3xxyE/edit?usp=sharing
>