What I want: write a daemon that queries the Mesos framework API, gets the cluster statistics from it, and then invokes the IaaS API to scale the cluster size.
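A rough sketch of such a daemon, only to make the idea concrete. It assumes the master serves its metrics as JSON at /metrics/snapshot with keys such as master/cpus_total and master/cpus_used. The thresholds, the polling interval, the function names and the scale_cluster() placeholder are all made up for illustration; the real IaaS call depends on your provider.

```python
import json
import time
import urllib.request


def fetch_snapshot(master_url):
    """Fetch the master's metrics as a dict (assumed endpoint: /metrics/snapshot)."""
    url = master_url.rstrip("/") + "/metrics/snapshot"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def utilization(snapshot, resource):
    """Return used/total for a resource, or None if the total is missing or zero."""
    total = snapshot.get("master/%s_total" % resource)
    used = snapshot.get("master/%s_used" % resource, 0)
    if not total:
        return None
    return used / total


def decide(snapshot, high=0.8, low=0.3):
    """Return "up", "down" or "hold" based on the busiest of cpus and mem."""
    values = [utilization(snapshot, r) for r in ("cpus", "mem")]
    if any(v is None for v in values):
        return "hold"  # incomplete data: do nothing rather than guess
    busiest = max(values)
    if busiest >= high:
        return "up"
    if busiest <= low:
        return "down"
    return "hold"


def scale_cluster(action):
    """Hypothetical placeholder: call your IaaS API here to add or remove a node."""
    print("would scale the cluster %s" % action)


def run(master_url, interval=60):
    """Poll the master forever and scale when a threshold is crossed."""
    while True:
        action = decide(fetch_snapshot(master_url))
        if action != "hold":
            scale_cluster(action)
        time.sleep(interval)
```

Something like run("http://mesos-master.example.com:5050") would start the loop (the host name is a placeholder). Keeping decide() free of network calls makes the policy easy to test and to swap out later.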

2015-08-02 22:32 GMT+08:00 Alex Rukletsov <[email protected]>:

> I agree with Vinod that the Master accumulates a lot of statistics that
> can be used for smarter decisions about cluster scaling. However, I'm
> not sure this feature should reside in Mesos. I would rather expose
> statistics and / or recommendations and let external tooling or an
> operator do the job.
>
> On 31 Jul 2015 7:15 pm, "Vinod Kone" <[email protected]> wrote:
>
> > Thanks for pinging again Mathieu!
> >
> > I think auto-scaling of a Mesos cluster is a nifty feature to have.
> > The only question in my mind (and likely others) is whether this
> > functionality should reside in Mesos, or a framework or an operator.
> > As you mentioned, Netflix took the framework way but it doesn't
> > necessarily work in a multi-framework environment. If the
> > functionality lies with an operator it has to be a library (likely a
> > service) so that more people can take advantage of it.
> >
> > In my mind, it is not hard to imagine having this functionality in
> > Mesos. Since Mesos is in the best position to know the (current and
> > perhaps projected) state of the cluster it could make smart decisions
> > about the shape and size of the new nodes that can be added. This also
> > becomes interesting in the face of the quota
> > <https://issues.apache.org/jira/browse/MESOS-1791> work that we are
> > currently doing.
> >
> > Having said that, I think you can do this today by writing an
> > allocator module. Note that Mesos already provides a
> > requestResources() API call (similar to Wish in your ppt) that is
> > passed to the allocator. You should be able to write an allocator
> > module that takes this signal and talks to your favorite IaaS API to
> > spin up new node(s) if necessary.
> >
> > On Fri, Jul 31, 2015 at 8:29 AM, Roger Ignazio <[email protected]>
> > wrote:
> >
> > > With the number of IaaS providers out there, and the fact that Mesos
> > > doesn't really concern itself with where it's running (IaaS,
> > > bare-metal, on-prem, in the cloud), this sounds more like an
> > > operations problem than a feature that should be in Mesos core.
> > >
> > > By any chance, have you had a chance to look at
> > > https://github.com/thefactory/autoscale-python? I'd venture to guess
> > > that project (or a homegrown solution talking to your IaaS' API),
> > > combined with some custom AWS AMIs (or vSphere templates or
> > > OpenStack images or ...), would satisfy your use-case.
> > >
> > > -- Roger
> > >
> > > On Fri, Jul 31, 2015 at 5:37 AM, VELTEN, MATHIEU <[email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I am currently working for some projects using Mesos at Atos
> > > > Toulouse and we are using it on top of a classical IaaS.
> > > >
> > > > After playing with Mesos and looking at some code it appears to me
> > > > that there is no elasticity mechanism in place. I opened an issue
> > > > in Jira some months ago here, which contains most of the content
> > > > of this email :
> > > > https://issues.apache.org/jira/browse/MESOS-2453
> > > >
> > > > Here is what I have in mind (ppt in the following link for the
> > > > detailed and visual version ☺ ) :
> > > > - Add the possibility for a framework to signal that it has some
> > > >   work pending (with or without further semantics regarding what
> > > >   resources is wished ?)
> > > > - Modify the Mesos algo to call a pluggable driver when no
> > > >   resource is available and at least one framework has some work
> > > >   to do. In this case the driver should scale up the Mesos cluster
> > > >   by launching VMs. How much and of which size is a little tricky
> > > >   here without adding semantics to the framework signal.
> > > > - We should also add a flag somewhere to mark the slave as
> > > >   "volatile" so we can prefer the use of static resources, and
> > > >   shut down the volatile slaves after some time left unused.
> > > >
> > > > https://docs.google.com/presentation/d/1eNQSvDQ64gPNbmf0YVPq9tIWLMCbAHExos5WXrm0uqI/edit?usp=sharing
> > > >
> > > > Does it look doable to you ? what do you think about the
> > > > principle ?
> > > > Do you think we can add some semantics to the "I have work to do"
> > > > framework signal without breaking the two-level scheduling
> > > > principle ?
> > > > I don't think it violates it since both mechanisms (signaling a
> > > > need and effectively take a resource from an offer) are fully
> > > > independent in my proposal but I feel a little out of my league to
> > > > be sure.
> > > >
> > > > This proposal currently doesn't specifically address bin packing,
> > > > however with the aforementioned modifications in place it should
> > > > be easy to add since we know which resources are volatile.
> > > >
> > > > I have seen some other work (by Netflix for example) address this
> > > > problem however it always seems to be at the framework level and
> > > > not inside the core Mesos architecture, is there a reason for that
> > > > except lack of time for specification/contribution ?
> > > >
> > > > http://fr.slideshare.net/spodila/aws-reinvent-2014-talk-scheduling-using-apache-mesos-in-the-cloud
> > > >
> > > > Regards,
> > > >
> > > > Mathieu Velten
> > > >
> > > > This message and any attachments (the "message") are intended
> > > > solely for the addressee(s). It contains confidential information,
> > > > that may be privileged. If you receive this message in error,
> > > > please notify the sender immediately and delete the message. Any
> > > > use of the message in violation of its purpose, any dissemination
> > > > or disclosure, either wholly or partially is strictly prohibited,
> > > > unless it has been explicitly authorized by the sender. As its
> > > > integrity cannot be secured on the internet, Atos and its
> > > > subsidiaries decline any liability for the content of this
> > > > message. Although the sender endeavors to maintain a computer
> > > > virus-free network, the sender does not warrant that this
> > > > transmission is virus-free and will not be liable for any damages
> > > > resulting from any virus transmitted.

-- 
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com
