Thanks Gabriel, that makes sense. It sounds like labels on static reservations might be the most expedient path toward a solution to this problem, but that is not without its complications, as suggested in the related ticket which Neil filed a while back: https://issues.apache.org/jira/browse/MESOS-4476
Povilas, also see this related ticket that Gabriel pointed me to: https://issues.apache.org/jira/browse/MESOS-6939

It sounds like this is a real issue for stateful framework developers, so hopefully we will find some time soon to implement a solution. In the meantime, Povilas, I'm afraid to say I don't know exactly what solution to recommend. If anybody else in the community has some ideas, it would be great to hear them :)

Cheers,
Greg

On Tue, Jan 17, 2017 at 2:52 PM, Gabriel Hartmann <gabr...@mesosphere.io> wrote:

> @Greg: The reason people use static reservation is to enforce that
> particular resources (usually disks) can only be consumed by a particular
> framework. They also don't necessarily know when the stateful service is
> going to be installed, so they don't want to race with other frameworks to
> consume those special resources. So static reservation is desirable.
> However, all stateful services also need more information about reserved
> resources than is natively provided by Mesos in the static reservation case
> (i.e. the labels he describes). `dcos-commons` does the same thing.
> Various workarounds exist, but none are able to provide resource
> allocation enforcement because only roles do that. An alternate resource
> allocation enforcement mechanism is needed. Usually this is the part where
> people start talking about quota.
>
> Neither option 1 nor option 2 provides a race-proof way to get fully
> labeled reserved resources. It's been proposed in the past that it be
> allowed to add labels to statically reserved resources. That's kind of
> fine except now you have these things that can't really be UNRESERVEd but
> look exactly like dynamic resources which can...
>
> Quota w/ chunks as a step in the deployment of stateful services is very
> desirable in an adversarial environment. However, if you're in a
> cooperative environment (i.e.
> you're not in an adversarial relationship
> with other frameworks) and you had resources (particularly disk resources)
> with attributes on them, you could have frameworks voluntarily choose not to
> consume resources not meant for them.
>
> e.g. Disk resource has attribute `CASSANDRA`. Ok, since I'm a Kafka
> framework I won't go use that disk.
>
> On Tue, Jan 17, 2017 at 11:24 AM Greg Mann <g...@mesosphere.io> wrote:
>
>> Hi Povilas,
>> Another approach you could try is to use dynamic reservations only. You
>> could either:
>>
>> 1. Alter your stateful framework to dynamically reserve the resources
>> that it needs, or
>> 2. Add a script to your cluster tooling that would make use of the
>> operator endpoint for dynamic reservations [1] to dynamically reserve
>> the stateful framework's resources when your cluster is initially
>> provisioned. This would have a similar effect to static reservations,
>> but would allow you to set labels.
>>
>> Approach #1 makes sense to me; is there a reason that it's not feasible
>> for your stateful framework to dynamically reserve its own resources? This
>> is the typical workflow that I would recommend. I'm not too familiar with
>> Aurora, so perhaps it's adding some complexity that I'm unaware of?
>>
>> Cheers,
>> Greg
>>
>> [1] http://mesos.apache.org/documentation/latest/reservation/
>>
>> On Tue, Jan 17, 2017 at 12:28 AM, Povilas Versockas <p.versoc...@gmail.com> wrote:
>>
>> Hey,
>>
>> Thanks for writing me back!
>>
>> Maybe there is some other method to solve this problem on a statically
>> reserved cluster? The solution could be making an agent's resources appear
>> as unreserved resources only to a selected framework. I can see that
>> mesos-agent has an --acls flag, so maybe tinkering with this could help me.
>> Of course it is possible to implement this in the framework scheduler, but
>> this will add way more clunkiness to the code.
>> It feels like this kind of resource
>> management should be part of Mesos. Maybe I'm missing something?
>>
>> On Mon, Jan 16, 2017 at 4:58 PM, haosdent <haosd...@gmail.com> wrote:
>>
>> Hi, @Povilas It is possible to dynamically reserve unreserved resources on
>> those agents.
>>
>> On Fri, Jan 13, 2017 at 2:47 PM, Povilas Versockas <p.versoc...@gmail.com> wrote:
>>
>> Hi,
>>
>> Maybe someone can help me with a problem I'm having. Short version of the
>> question is:
>> Is it possible to use dynamic reservation on statically reserved Mesos
>> agents?
>>
>> The current situation is that we have a Mesos cluster which runs many
>> frameworks (aurora, spark, cassandra) and we are developing a custom
>> framework for stateful tasks. Our framework manages stateful tasks for many
>> users. Currently we have statically reserved our hardware which has good
>> disks so that it is only used by our framework (via the --resources flag on
>> Mesos agents).
>>
>> The problem we are facing is that if one stateful task fails we would
>> like to relaunch it on the same host with the same port, cpu, disk and
>> memory.
>> With dynamic reservations we would put a label with the task id on a
>> reservation and on failure would simply reuse the reserved offer.
>> On the other hand, with statically reserved Mesos agents we cannot put any
>> labels, and so we cannot distinguish offers which should have been reserved
>> for a task from a new offer.
>> This leaves us in the situation that if one stateful task fails and there
>> are new stateful tasks, the new tasks can be scheduled on the failed task's
>> Mesos agent, filling it up and taking its port, cpu and memory.
>>
>> --
>> Regards
>> Povilas Versockas
>>
>> --
>> Best Regards,
>> Haosdent Huang
>>
>> --
>> Regards
>> Povilas Versockas
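
For reference, the label-based relaunch described in the original question (put a task id label on a dynamic reservation, then reuse that reserved offer when the task fails) could be sketched roughly as below. This is only an illustration in plain Python and does not use any Mesos client library: the `Offer` class, the `task_id` label key, and the `assign_offers` function are hypothetical names chosen for the example, and a real scheduler would read the labels from the reservation info on each offered resource.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple


@dataclass
class Offer:
    """Hypothetical, simplified stand-in for a Mesos resource offer."""
    offer_id: str
    agent_id: str
    # Labels attached to the (dynamic) reservation; empty if there are none.
    reservation_labels: Dict[str, str] = field(default_factory=dict)


def assign_offers(
    offers: Iterable[Offer],
    failed_task_ids: Set[str],
    new_task_ids: List[str],
    label_key: str = "task_id",  # assumed label key, not a Mesos convention
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (relaunch, fresh) mappings of task id -> offer id.

    Offers whose reservation is labeled with a failed task's id are kept
    for relaunching that task. Offers labeled for any other task are left
    alone. Only unlabeled offers are handed to new tasks.
    """
    relaunch: Dict[str, str] = {}
    fresh: Dict[str, str] = {}
    waiting = list(new_task_ids)

    for offer in offers:
        owner = offer.reservation_labels.get(label_key)
        if owner is not None:
            # Labeled reservation: usable only by the task that owns it.
            if owner in failed_task_ids and owner not in relaunch:
                relaunch[owner] = offer.offer_id
            continue
        if waiting:
            # Unlabeled resources can go to the next new task.
            fresh[waiting.pop(0)] = offer.offer_id

    return relaunch, fresh
```

With statically reserved resources every offer would fall into the unlabeled branch, which is exactly the situation described above: a new task can take the resources that a failed task was supposed to get back.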