Hi,

I am currently working on several projects using Mesos at Atos Toulouse, where 
we run it on top of a classical IaaS.

After playing with Mesos and reading some of the code, it appears to me that 
there is no elasticity mechanism in place. I opened a Jira issue some months 
ago, which contains most of the content of this email:
https://issues.apache.org/jira/browse/MESOS-2453

Here is what I have in mind (see the slides linked below for the detailed and 
visual version ☺):
- Add the possibility for a framework to signal that it has some work pending 
(with or without further semantics describing which resources it wants?).
- Modify the Mesos algorithm to call a pluggable driver when no resource is 
available and at least one framework has some work to do.
   In this case the driver should scale up the Mesos cluster by launching VMs. 
Deciding how many VMs to launch, and of which size, is tricky without adding 
semantics to the framework signal.
- Add a flag somewhere to mark a slave as "volatile", so that we can prefer 
static resources and shut down volatile slaves after they have been left 
unused for some time.

https://docs.google.com/presentation/d/1eNQSvDQ64gPNbmf0YVPq9tIWLMCbAHExos5WXrm0uqI/edit?usp=sharing
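
To make the three points above more concrete, here is a minimal Python sketch 
of how they could fit together. It is only an illustration under my own 
assumptions: ScaleDriver, ElasticAllocator, signal_pending and the "volatile" 
field are hypothetical names and do not exist in Mesos, and the fixed 4-CPU VM 
size stands in for the sizing question that is still open.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    name: str
    cpus: float
    volatile: bool = False      # hypothetical flag: slave was launched on demand
    used_cpus: float = 0.0
    idle_since: float = 0.0     # time at which the slave last became unused

    @property
    def free_cpus(self):
        return self.cpus - self.used_cpus


class ScaleDriver:
    """Hypothetical pluggable driver; a real one would call the IaaS API."""

    def __init__(self):
        self.launched = 0
        self.stopped = []

    def scale_up(self, now):
        # Assumption: one fixed-size VM per call, since the signal carries
        # no information about the resources the framework wants.
        self.launched += 1
        return Slave(f"volatile-{self.launched}", cpus=4.0,
                     volatile=True, idle_since=now)

    def scale_down(self, slave):
        self.stopped.append(slave.name)


class ElasticAllocator:
    """Hypothetical allocator hook, not the real Mesos allocator."""

    def __init__(self, slaves, driver, idle_timeout=300.0):
        self.slaves = list(slaves)
        self.driver = driver
        self.idle_timeout = idle_timeout
        self.pending = set()    # frameworks that signalled pending work

    def signal_pending(self, framework):
        self.pending.add(framework)

    def clear_pending(self, framework):
        self.pending.discard(framework)

    def tick(self, now):
        # Scale up only if some framework has work and no slave has free CPU.
        no_free = all(s.free_cpus <= 0 for s in self.slaves)
        if self.pending and no_free:
            self.slaves.append(self.driver.scale_up(now))
        # Shut down volatile slaves that have stayed unused for too long.
        for s in list(self.slaves):
            idle_for = now - s.idle_since
            if s.volatile and s.used_cpus == 0 and idle_for >= self.idle_timeout:
                self.driver.scale_down(s)
                self.slaves.remove(s)
```

Note that the signal and the offers stay independent here: signal_pending only 
influences whether the driver is called, and never hands a resource to a 
framework.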

Does this look doable to you? What do you think of the principle?
Do you think we can add some semantics to the "I have work to do" framework 
signal without breaking the two-level scheduling principle?
I don't think it violates it, since the two mechanisms (signaling a need and 
actually taking a resource from an offer) are fully independent in my 
proposal, but I feel a little out of my league here and cannot be sure.

This proposal does not currently address bin packing specifically. However, 
with the modifications above in place it should be easy to add, since we would 
know which resources are volatile.
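
As a rough illustration of what I mean, a placement rule could prefer static 
slaves and then pack tasks onto the fullest slave that still fits, so that 
unused volatile slaves stay empty and can be shut down. The sketch below is 
only one possible heuristic (best fit); pick_slave and the dictionary fields 
are hypothetical and not part of Mesos.

```python
def pick_slave(slaves, cpus_needed):
    """Return the preferred slave for a task, or None if nothing fits."""
    candidates = [s for s in slaves if s["cpus"] - s["used"] >= cpus_needed]
    if not candidates:
        return None
    # Static slaves first (False sorts before True), then the slave with the
    # least free CPU that still fits, so idle volatile slaves stay empty.
    return min(candidates,
               key=lambda s: (s["volatile"], s["cpus"] - s["used"]))


slaves = [
    {"name": "v1", "cpus": 4, "used": 0, "volatile": True},
    {"name": "v2", "cpus": 4, "used": 2, "volatile": True},
    {"name": "s1", "cpus": 4, "used": 3, "volatile": False},
]
small = pick_slave(slaves, 1)   # fits on the static slave s1
large = pick_slave(slaves, 2)   # s1 is too full, so the busier volatile v2
```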

I have seen other work (by Netflix, for example) that addresses this problem, 
but it always seems to be done at the framework level rather than inside the 
core Mesos architecture. Is there a reason for that, other than a lack of time 
for specification/contribution?
http://fr.slideshare.net/spodila/aws-reinvent-2014-talk-scheduling-using-apache-mesos-in-the-cloud

Regards,

Mathieu Velten

