Re: Question on dynamic reservations

2017-01-17 Thread Greg Mann
Thanks Gabriel, that makes sense. It sounds like labels on static reservations might be the most expedient path toward a solution to this problem, but that is not without its complications, as suggested in the related ticket which Neil filed a while back: https://issues.apache.org/jira/browse/MESOS

Re: Question on dynamic reservations

2017-01-17 Thread Gabriel Hartmann
@Greg: The reason people use static reservation is to enforce that particular resources (usually disks) can only be consumed by a particular framework. They also don't necessarily know when the stateful service is going to be installed, so they don't want to race with other frameworks to consume th

Re: Question on dynamic reservations

2017-01-17 Thread Greg Mann
Hi Povilas, Another approach you could try is to use dynamic reservations only. You could either: 1. Alter your stateful framework to dynamically reserve the resources that it needs, or 2. Add a script to your cluster tooling that would make use of the operator endpoint for dynamic res

Re: CUDA support makes slave receiving no jobs

2017-01-17 Thread Kevin Klues
If you are running on standalone mesos+marathon, make sure you enable the marathon flag for '--enable_features=gpu_resources' (and make sure you have a version of marathon that supports this, i.e. 1.3). If you are on DC/OS, then make sure you are running a very recent build (no version that's been

Re: Default executor grace period

2017-01-17 Thread Tomek Janiszewski
Created issue for this: https://issues.apache.org/jira/browse/MESOS-6933 On Mon, 16 Jan 2017 at 17:13, Tomek Janiszewski wrote: > It looks like it's supported because the executor prints the grace period[1]. On > the other hand, the executor launches sh, which launches the command, and the shell executes > faster

CUDA support makes slave receiving no jobs

2017-01-17 Thread Cecile, Adam
Hello, I just tried to enable CUDA support, but once it's enabled the slave refuses to start anything (the marathon job is stuck in the deploying state). If I change the isolation setting from "cgroups/cpu,cgroups/mem,cgroups/devices,gpu/nvidia" to "cgroups/cpu,cgroups/mem,cgroups/devices" jobs get started agai

Re: Question on dynamic reservations

2017-01-17 Thread Povilas Versockas
Hey, Thanks for writing back! Maybe there is some other method to solve this problem on a statically reserved cluster? The solution could be to make the agent's resources appear as unreserved resources only to a selected framework. I can see that mesos-agent has an --acls flag, so maybe tinkering with thi