+dev. @Fan, I responded on the JIRA with some next steps. Thanks for bringing this up!
— *Joris Van Remoortere*
Mesosphere

On Tue, Jun 7, 2016 at 12:58 PM, james <gar...@verizon.net> wrote:

> On 06/07/2016 09:57 AM, Du, Fan wrote:
>
>> On 2016/6/6 21:27, james wrote:
>>
>>> Hello,
>>>
>>> @Stephen:: I guess Stephen is bringing up the 'security' aspect of who
>>> gets access to the information, particularly cluster/cloud devops,
>>> customers, or interlopers...?
>>
>> ACLs should play a part here to address the security concern.
>
> YES, and so much more! I know folks whose primary (in-house cluster)
> usage is deep packet inspection on the cluster....
> With a cluster (inside) there is no limit to new tools that can be
> judiciously altered to benefit from cluster codes....
>
>>> @Fan:: As a consultant, most of my customers either have or are
>>> planning hybrid installations, where some codes run on a local cluster
>>> or use 'the cloud' for dynamic load requirements. I would think your
>>> proposed scheme needs to be very flexible, both in application to a
>>> campus or Metropolitan Area Network, if not massively distributed around
>>> the globe. What about different resource types (racks of arm64,
>>> GPU-centric hardware, DSPs, FPGAs, etc.)? Hardware diversity brings many
>>> benefits to cluster/cloud capabilities.
>>>
>>> This also begs the question of hardware management (boot/config/online)
>>> of the various hardware, such as is built into CoreOS. Are several
>>> applications going to be supported? Standards track? Just Mesos DC/OS
>>> centric?
>>
>> It depends on whether this proposal is accepted by Mesos. If you think
>> this feature is useful, let's discuss detailed requirements under
>> MESOS-5545.
>
> OK. Take a look at 'Rackview' on SourceForge:
> 'http://rackview.sourceforge.net/'
>
> Do I have access to the JIRA system by default on joining this list,
> or do I have to request permission somewhere? (Sorry, JIRA is new to me,
> so recommendations on JIRA, per Mesos, in a document, would be keen.)
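[For readers following the ACL point above: Mesos master access control is configured as JSON passed via the master's `--acls` flag. A minimal sketch of the shape (the principal and role names here are purely illustrative, not from this thread):

```json
{
  "permissive": false,
  "register_frameworks": [
    {
      "principals": { "values": ["analytics-team"] },
      "roles": { "values": ["analytics"] }
    }
  ]
}
```

With `permissive` set to `false`, anything not explicitly allowed is denied; this is how per-principal visibility concerns like the ones raised above are usually addressed.]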
>
>> btw, I have limited knowledge of CoreOS; will look into it.
>
> CoreOS has some great ideas. But many of their codes are not current
> (when compared to the Gentoo portage tree) and thus many are suspect
> for security/function.
>
> I thought the purpose was to get more folks involved here in discussions
> so that better-formulated ideas can migrate to the ticket (5545) and
> repos.
>
>>> TIMING DATA:: This is the main issue I see. Once you start 'vectoring
>>> in resources' you need to add timing (latency) data to encourage robust
>>> and diversified use of this data. For HPC, this could be very
>>> valuable for RDMA-abusive algorithms, where memory-constrained workloads
>>> not only need knowledge of additional nearby memory resources, but also
>>> the approximate (based on previously collected data) latency and
>>> bandwidth constraints for using those additional resources.
>>
>> Out of curiosity, which open-sourced Mesos framework do you/your
>> customers use to run MPI?
>
> Easy, dude. Most of this work is tightly held, with nothing to publish
> or open up yet. It's a mess (my professional opinion) right now, and
> I'm testing a variety of tools just to be able to have better
> instrumentation on these codes. Still, RDMA is very attractive, so it
> does warrant much attention and extreme, internal, excitement.
>
>> Mesos can support an MPI framework, but AFAIK, it's immature [1][2].
>
> YEP.
>
>> I think this part of the work should be investigated in the future.
>>
>> [1]: https://github.com/apache/mesos/tree/master/mpi <- mpd ring version
>> [2]: https://github.com/mesosphere/mesos-hydra <- hydra version
>
> Many codes floating around. Much excitement about new compiler features.
> Lots of hard work and testing going on. That said, the point I was trying
> to make is that "vectoring in" resources, with a variety of parameters as
> a companion to your idea, is warranted for these aforementioned use cases
> and other opportunities.
>
>>> Great idea.
>>> I do like it very much.
>>>
>>> hth,
>>> James
>>>
>>> On 06/06/2016 05:06 AM, Stephen Gran wrote:
>>>
>>>> Hi,
>>>>
>>>> This looks potentially interesting. How does it work in a public cloud
>>>> deployment scenario? I assume you would just have to disable this
>>>> feature, or not enable it?
>>>>
>>>> Cheers,
>>>>
>>>> On 06/06/16 10:17, Du, Fan wrote:
>>>>
>>>>> Hi, Mesos folks,
>>>>>
>>>>> I've been thinking about Mesos rack awareness support for a while.
>>>>> It's a common interest for lots of data center applications to
>>>>> provide data locality, fault tolerance, and better task placement.
>>>>> I created MESOS-5545 to track the story, and here is the initial
>>>>> design doc [1] to support rack awareness in Mesos.
>>>>>
>>>>> Looking forward to hearing comments from end users and other
>>>>> developers. Thanks!
>>>>>
>>>>> [1]: https://docs.google.com/document/d/1rql_LZSwtQzBPALnk0qCLsmxcT3-zB7X7aJp-H3xxyE/edit?usp=sharing
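[To make the placement idea in the proposal concrete: a minimal sketch in plain Python, not the Mesos API. The offer dicts and the `rack` attribute name are assumptions for illustration; in practice such labels could be carried today via agent `--attributes`. A rack-aware framework would prefer offers from the rack holding its data, falling back to any offer otherwise:

```python
def pick_offer(offers, preferred_rack):
    """Return an offer in preferred_rack if one exists, else any offer.

    offers: list of dicts like {"id": ..., "rack": ...} (hypothetical shape).
    """
    if not offers:
        return None
    # Partition: offers co-located with the data's rack come first.
    same_rack = [o for o in offers if o.get("rack") == preferred_rack]
    return (same_rack or offers)[0]

offers = [
    {"id": "o1", "rack": "r2"},
    {"id": "o2", "rack": "r1"},
]
print(pick_offer(offers, "r1")["id"])  # -> o2 (same-rack offer wins)
```

The same skeleton extends naturally to the latency/bandwidth scoring James raises above: replace the boolean rack match with a cost function over measured link metrics.]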