There are a lot of things in Mesos which require a priori communication between an agent and a framework in order to properly set resource usage expectations (for example: what does 1 CPU mean?). I don't see how adding customizations to core Mesos for each "way of looking at resources" is scalable or future-proof.
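For what it's worth, the label convention Jörg mentions below already covers a lot of this without any core changes. A rough sketch, assuming the old mesos.interface Python bindings and an operator-chosen convention of starting agents with --attributes=rack:r2;dc:dc1 (attribute names that mean nothing to Mesos itself):

# Sketch only: a framework-side view of the label convention discussed
# below. Assumes agents were started with operator-chosen attributes,
# e.g. --attributes=rack:r2;dc:dc1 -- Mesos passes these through in
# offers but attaches no meaning to them.
from mesos.interface import mesos_pb2  # old Python bindings


def fault_domain(offer):
    """Extract a (dc, rack) pair from an offer's agent attributes."""
    attrs = {a.name: a.text.value
             for a in offer.attributes
             if a.type == mesos_pb2.Value.TEXT}
    return attrs.get('dc', 'unknown'), attrs.get('rack', 'unknown')


def spread_across_domains(offers, max_per_domain):
    """Accept at most max_per_domain offers per (dc, rack) so that
    replicas land in distinct fault domains."""
    chosen, used = [], {}
    for offer in offers:
        domain = fault_domain(offer)
        if used.get(domain, 0) < max_per_domain:
            used[domain] = used.get(domain, 0) + 1
            chosen.append(offer)
    return chosen

The entire scheme lives in a naming convention plus scheduler logic; core Mesos just passes the attributes through.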
On Mon, Jun 6, 2016 at 8:48 AM Jörg Schad <jo...@mesosphere.io> wrote:

> Hi,
> thanks for your idea and design doc!
> Just a few thoughts:
> a) The scheduling part would be implemented in a framework scheduler and
> not in Mesos core, right?
> b) As mentioned by James, this needs to be very flexible (and not
> necessarily based on network structure). As far as I know, people are
> using labels on the agents to identify different fault domains, which can
> then be interpreted by the framework scheduler. Maybe it would make sense
> (instead of identifying the network structure) to come up with a common
> label naming scheme which can be understood by all/different frameworks.
>
> Looking forward to your thoughts on this!
>
> On Mon, Jun 6, 2016 at 3:27 PM, james <gar...@verizon.net> wrote:
>
>> Hello,
>>
>> @Stephen: I guess Stephen is bringing up the 'security' aspect of who
>> gets access to the information, particularly cluster/cloud devops,
>> customers, or interlopers....?
>>
>> @Fan: As a consultant, most of my customers either have or are planning
>> hybrid installations, where some codes run on a local cluster and 'the
>> cloud' is used for dynamic load requirements. I would think your proposed
>> scheme needs to be very flexible, whether applied to a campus, a
>> Metropolitan Area Network, or something massively distributed around the
>> globe. What about different resource types (racks of arm64, GPU-centric
>> hardware, DSPs, FPGAs, etc.)? Hardware diversity brings many benefits to
>> cluster/cloud capabilities.
>>
>> This also begs the question of hardware management (boot/config/online)
>> of the various hardware, such as is built into CoreOS. Are several
>> applications going to be supported? Standards track? Just Mesos/DC/OS
>> centric?
>>
>> TIMING DATA: This is the main issue I see. Once you start 'vectoring in
>> resources' you need to add timing (latency) data to encourage robust and
>> diversified use of this data. For HPC, this could be very valuable for
>> RDMA-intensive algorithms, where memory-constrained workloads need not
>> only knowledge of additional nearby memory resources, but also the
>> approximate latency and bandwidth constraints (based on previously
>> collected data) on using those resources.
>>
>> Great idea. I do like it very much.
>>
>> hth,
>> James
>>
>> On 06/06/2016 05:06 AM, Stephen Gran wrote:
>>
>>> Hi,
>>>
>>> This looks potentially interesting. How does it work in a public cloud
>>> deployment scenario? I assume you would just have to disable this
>>> feature, or not enable it?
>>>
>>> Cheers,
>>>
>>> On 06/06/16 10:17, Du, Fan wrote:
>>>
>>>> Hi, Mesos folks
>>>>
>>>> I've been thinking about Mesos rack awareness support for a while.
>>>> It's a common interest for lots of data center applications to provide
>>>> data locality, fault tolerance, and better task placement. I've created
>>>> MESOS-5545 to track the story, and here is the initial design doc [1]
>>>> for supporting rack awareness in Mesos.
>>>>
>>>> Looking forward to hearing comments from end users and other
>>>> developers.
>>>>
>>>> Thanks!
>>>>
>>>> [1]:
>>>> https://docs.google.com/document/d/1rql_LZSwtQzBPALnk0qCLsmxcT3-zB7X7aJp-H3xxyE/edit?usp=sharing
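One more note, on James's TIMING DATA point above: latency/bandwidth estimates could likewise stay out of core Mesos. A purely hypothetical sketch of a framework folding measured inter-rack timings into placement scoring (the measurement table and weights are invented for illustration; nothing like this exists in Mesos today):

# Purely hypothetical: fold measured inter-rack latency/bandwidth into
# placement scoring in the framework scheduler. Neither the table nor
# the scoring function is part of Mesos.

# (source_rack, dest_rack) -> (latency_us, bandwidth_gbps), e.g. from
# periodic probes run by the framework or an external agent.
LINK_ESTIMATES = {
    ('r1', 'r1'): (2, 100.0),
    ('r1', 'r2'): (15, 40.0),
    ('r1', 'r3'): (40, 10.0),
}

def placement_score(task_rack, memory_rack,
                    latency_weight=1.0, bandwidth_weight=0.5):
    """Lower is better: penalize high latency and low bandwidth
    between a task and the remote memory it would borrow."""
    key = (task_rack, memory_rack)
    if key not in LINK_ESTIMATES:
        return float('inf')  # no estimate yet: avoid this pairing
    latency_us, bandwidth_gbps = LINK_ESTIMATES[key]
    return latency_weight * latency_us + bandwidth_weight / bandwidth_gbps

# Pick the rack whose spare memory is "closest" to a task on r1:
best = min(['r1', 'r2', 'r3'], key=lambda r: placement_score('r1', r))

Collecting those estimates (periodic probes, switch telemetry, etc.) would be the framework's or an external agent's job in this sketch, which again keeps the semantics out of core Mesos.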