There are a lot of things in Mesos which require a priori communication between an agent and a framework in order to properly set resource usage expectations (for example: what does 1 CPU mean?). I don't see how adding customizations to core Mesos for each "way of looking at resources" is scalable or future-proof.
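For what it's worth, the label convention Jörg mentions below already covers a lot of this without any core changes. A rough sketch, assuming the old mesos.interface Python bindings and an operator-chosen convention of starting agents with --attributes=rack:r2;dc:dc1 (attribute names that mean nothing to Mesos itself):

# Sketch only: a framework-side view of the label convention discussed
# below. Assumes agents were started with operator-chosen attributes,
# e.g. --attributes=rack:r2;dc:dc1 -- Mesos passes these through in
# offers but attaches no meaning to them.
from mesos.interface import mesos_pb2  # old Python bindings


def fault_domain(offer):
    """Extract a (dc, rack) pair from an offer's agent attributes."""
    attrs = {a.name: a.text.value
             for a in offer.attributes
             if a.type == mesos_pb2.Value.TEXT}
    return attrs.get('dc', 'unknown'), attrs.get('rack', 'unknown')


def spread_across_domains(offers, max_per_domain):
    """Accept at most max_per_domain offers per (dc, rack) so that
    replicas land in distinct fault domains."""
    chosen, used = [], {}
    for offer in offers:
        domain = fault_domain(offer)
        if used.get(domain, 0) < max_per_domain:
            used[domain] = used.get(domain, 0) + 1
            chosen.append(offer)
    return chosen

The entire scheme lives in a naming convention plus scheduler logic; core Mesos just passes the attributes through.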
On Mon, Jun 6, 2016 at 8:48 AM Jörg Schad <jo...@mesosphere.io> wrote:

> Hi,
> thanks for your idea and design doc!
> Just a few thoughts:
> a) The scheduling part would be implemented in a framework scheduler and
> not in Mesos core, right?
> b) As mentioned by James, this needs to be very flexible (and not
> necessarily based on network structure). As far as I know, people are
> using labels on the agents to identify different fault domains, which can
> then be interpreted by the framework scheduler. Maybe it would make sense
> (instead of identifying the network structure) to come up with a common
> label naming scheme which can be understood by all/different frameworks.
>
> Looking forward to your thoughts on this!
>
> On Mon, Jun 6, 2016 at 3:27 PM, james <gar...@verizon.net> wrote:
>
>> Hello,
>>
>> @Stephen: I guess Stephen is bringing up the 'security' aspect of who
>> gets access to the information, particularly cluster/cloud devops,
>> customers, or interlopers....?
>>
>> @Fan: As a consultant, most of my customers either have or are planning
>> hybrid installations, where some codes run on a local cluster and 'the
>> cloud' is used for dynamic load requirements. I would think your proposed
>> scheme needs to be very flexible, whether applied to a campus, a
>> Metropolitan Area Network, or something massively distributed around the
>> globe. What about different resource types (racks of arm64, GPU-centric
>> hardware, DSPs, FPGAs, etc.)? Hardware diversity brings many benefits to
>> cluster/cloud capabilities.
>>
>> This also begs the question of hardware management (boot/config/online)
>> of the various hardware, such as is built into CoreOS. Are several
>> applications going to be supported? Standards track? Just Mesos/DC/OS
>> centric?
>>
>> TIMING DATA: This is the main issue I see. Once you start 'vectoring in
>> resources' you need to add timing (latency) data to encourage robust and
>> diversified use of this data. For HPC, this could be very valuable for
>> RDMA-intensive algorithms, where memory-constrained workloads need not
>> only knowledge of additional nearby memory resources, but also the
>> approximate latency and bandwidth constraints (based on previously
>> collected data) on using those resources.
>>
>> Great idea. I do like it very much.
>>
>> hth,
>> James
>>
>> On 06/06/2016 05:06 AM, Stephen Gran wrote:
>>
>>> Hi,
>>>
>>> This looks potentially interesting. How does it work in a public cloud
>>> deployment scenario? I assume you would just have to disable this
>>> feature, or not enable it?
>>>
>>> Cheers,
>>>
>>> On 06/06/16 10:17, Du, Fan wrote:
>>>
>>>> Hi, Mesos folks
>>>>
>>>> I've been thinking about Mesos rack awareness support for a while.
>>>> It's a common interest for lots of data center applications to provide
>>>> data locality, fault tolerance, and better task placement. I've created
>>>> MESOS-5545 to track the story, and here is the initial design doc [1]
>>>> for supporting rack awareness in Mesos.
>>>>
>>>> Looking forward to hearing comments from end users and other
>>>> developers.
>>>>
>>>> Thanks!
>>>>
>>>> [1]:
>>>> https://docs.google.com/document/d/1rql_LZSwtQzBPALnk0qCLsmxcT3-zB7X7aJp-H3xxyE/edit?usp=sharing
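One more note, on James's TIMING DATA point above: latency/bandwidth estimates could likewise stay out of core Mesos. A purely hypothetical sketch of a framework folding measured inter-rack timings into placement scoring (the measurement table and weights are invented for illustration; nothing like this exists in Mesos today):

# Purely hypothetical: fold measured inter-rack latency/bandwidth into
# placement scoring in the framework scheduler. Neither the table nor
# the scoring function is part of Mesos.

# (source_rack, dest_rack) -> (latency_us, bandwidth_gbps), e.g. from
# periodic probes run by the framework or an external agent.
LINK_ESTIMATES = {
    ('r1', 'r1'): (2, 100.0),
    ('r1', 'r2'): (15, 40.0),
    ('r1', 'r3'): (40, 10.0),
}

def placement_score(task_rack, memory_rack,
                    latency_weight=1.0, bandwidth_weight=0.5):
    """Lower is better: penalize high latency and low bandwidth
    between a task and the remote memory it would borrow."""
    key = (task_rack, memory_rack)
    if key not in LINK_ESTIMATES:
        return float('inf')  # no estimate yet: avoid this pairing
    latency_us, bandwidth_gbps = LINK_ESTIMATES[key]
    return latency_weight * latency_us + bandwidth_weight / bandwidth_gbps

# Pick the rack whose spare memory is "closest" to a task on r1:
best = min(['r1', 'r2', 'r3'], key=lambda r: placement_score('r1', r))

Collecting those estimates (periodic probes, switch telemetry, etc.) would be the framework's or an external agent's job in this sketch, which again keeps the semantics out of core Mesos.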