Hi,
thanks for your idea and design doc!
Just a few thoughts:
a) The scheduling part would be implemented in a framework scheduler and
not in the Mesos core, right?
b) As James mentioned, this needs to be very flexible (and not
necessarily based on network structure). AFAIK, people are using labels
on the agents to identify different fault domains, which the framework
scheduler can then interpret. Maybe it would make sense (instead of
identifying the network structure) to come up with a common label naming
scheme that all/different frameworks can understand; see the sketch below.
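
To make (b) concrete, here is a minimal sketch in plain Python of how a
framework scheduler could interpret such labels. Everything in it is an
assumption for illustration: offers are modeled as dicts rather than
real Mesos protobufs, and the label names (dc, rack) are a strawman,
not an agreed scheme.

    # Strawman scheme (an assumption, not a standard): operators start
    # agents with e.g. --attributes="dc:dc1;rack:r42" and all
    # frameworks interpret the same well-known attribute names.
    from collections import defaultdict

    def fault_domain(offer, level):
        # A real framework would read offer.attributes from the Mesos
        # protobuf; here offers are plain dicts for illustration.
        return offer["attributes"].get(level, "unknown")

    def spread_by(offers, level="rack"):
        # Group offers by one fault-domain level so a scheduler can
        # place replicas in distinct domains (one pick per group).
        groups = defaultdict(list)
        for offer in offers:
            groups[fault_domain(offer, level)].append(offer)
        return groups

    offers = [
        {"agent": "a1", "attributes": {"dc": "dc1", "rack": "r1"}},
        {"agent": "a2", "attributes": {"dc": "dc1", "rack": "r2"}},
        {"agent": "a3", "attributes": {"dc": "dc1", "rack": "r1"}},
    ]
    for rack, rack_offers in spread_by(offers).items():
        print(rack, [o["agent"] for o in rack_offers])

The point being: frameworks would only need to agree on the names, not
on how those domains map onto the physical network.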

Looking forward to your thoughts on this!

On Mon, Jun 6, 2016 at 3:27 PM, james <gar...@verizon.net> wrote:

> Hello,
>
>
> @Stephen:: I guess Stephen is bringing up the 'security' aspect of who
> gets access to the information, particularly cluster/cloud devops,
> customers, or interlopers...?
>
>
> @Fan:: As a consultant, I find most of my customers either have or are
> planning hybrid installations, where some code runs on a local cluster
> and some uses 'the cloud' for dynamic load requirements. I would think
> your proposed scheme needs to be very flexible, whether applied to a
> campus or Metropolitan Area Network, if not massively distributed
> around the globe. What about different resource types (racks of arm64,
> GPU-centric hardware, DSPs, FPGAs, etc.)? Hardware diversity brings
> many benefits to cluster/cloud capabilities.
>
>
> This also raises the question of hardware management
> (boot/config/online) of the various hardware, such as what is built
> into CoreOS. Are several applications going to be supported? A
> standards track? Or just Mesos/DC/OS centric?
>
>
> TIMING DATA:: This is the main issue I see. Once you start 'vectoring
> in resources' you need to add timing (latency) data to encourage robust
> and diversified use of this data. For HPC, this could be very valuable
> for RDMA-heavy algorithms, where memory-constrained workloads need not
> only knowledge of additional nearby memory resources, but also the
> approximated (based on previously collected data) latency and bandwidth
> constraints for using those additional resources.
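>
> A rough sketch of the kind of decision this would enable (everything
> here is made up for illustration: the measurements, rack names, and
> the simple cost model):
>
>     # Hypothetical (from_rack, to_rack) -> (latency_us, bandwidth_gbps)
>     # estimates, as previously collected by the cluster.
>     MEASURED = {
>         ("r1", "r1"): (1, 100.0),
>         ("r1", "r2"): (5, 40.0),
>         ("r1", "r3"): (12, 10.0),
>     }
>
>     def cost_us(path, nbytes):
>         # Estimated microseconds to move nbytes over a path: fixed
>         # latency plus transfer time at the measured bandwidth.
>         latency_us, bw_gbps = MEASURED[path]
>         return latency_us + nbytes * 8 / (bw_gbps * 1e3)
>
>     def best_remote_rack(local, racks, nbytes):
>         # Cheapest rack to borrow memory from, for an RDMA-heavy,
>         # memory-constrained workload running in `local`.
>         return min((r for r in racks if r != local),
>                    key=lambda r: cost_us((local, r), nbytes))
>
>     print(best_remote_rack("r1", ["r1", "r2", "r3"], 64 * 2**20))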
>
>
> Great idea. I do like it very much.
>
> hth,
> James
>
>
>
> On 06/06/2016 05:06 AM, Stephen Gran wrote:
>
>> Hi,
>>
>> This looks potentially interesting.  How does it work in a public cloud
>> deployment scenario?  I assume you would just have to disable this
>> feature, or not enable it?
>>
>> Cheers,
>>
>> On 06/06/16 10:17, Du, Fan wrote:
>>
>>> Hi, Mesos folks
>>>
>>> I’ve been thinking about Mesos rack awareness support for a while.
>>> It is a common interest for lots of data center applications to
>>> provide data locality, fault tolerance, and better task placement.
>>> I created MESOS-5545 to track the story, and here is the initial
>>> design doc [1] to support rack awareness in Mesos.
>>>
>>> Looking forward to hearing comments from end users and other
>>> developers.
>>>
>>> Thanks!
>>>
>>> [1]:
>>> https://docs.google.com/document/d/1rql_LZSwtQzBPALnk0qCLsmxcT3-zB7X7aJp-H3xxyE/edit?usp=sharing
>>>
>>>
>>
>
