Hi:

tl;dr: We are proposing to add two new V1 scheduler calls, unsuppress and
clear_filter, to decouple the dual semantics of the current revive
call.

As pointed out in the Mesos framework scalability guide
<http://mesos.apache.org/documentation/latest/app-framework-development-guide/#multi-scheduler-scalability>,
using the suppress
<http://mesos.apache.org/documentation/latest/scheduler-http-api/#suppress>
call is key to scaling your cluster to a large number of frameworks
<https://schd.ws/hosted_files/mesoscon18/84/Scaling%20Mesos%20to%20Thousands%20of%20Frameworks.pdf>.
In short, when a framework is idle with no intention of launching any tasks,
it should suppress to tell Mesos to stop sending it offers. The framework
should then revive
<http://mesos.apache.org/documentation/latest/scheduler-http-api/#revive>
when new work arrives. While a framework is suppressed, the allocator skips
it when performing resource allocations. As a result, thorny issues such as
offer starvation and resource fragmentation are greatly mitigated.
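
To make the suppress/revive pattern concrete, here is a minimal Python sketch
that only builds the JSON bodies a framework would POST to the V1 scheduler
endpoint. The message shape follows the scheduler HTTP API docs linked above;
the helper names and the framework ID "fw-1" are made up for illustration,
and actually sending the request over the subscribed connection is omitted.

```python
import json

# Path of the V1 scheduler HTTP API on the master.
SCHEDULER_ENDPOINT = "/api/v1/scheduler"


def build_call(framework_id, call_type):
    """Return the JSON body for a scheduler call that needs no extra fields."""
    return json.dumps({
        "framework_id": {"value": framework_id},
        "type": call_type,
    })


def call_for_transition(had_work, has_work, framework_id):
    """Pick the call to send when the framework's pending-work state changes.

    Hypothetical helper: returns a JSON body, or None if nothing should be sent.
    """
    if had_work and not has_work:
        return build_call(framework_id, "SUPPRESS")  # went idle: stop offers
    if not had_work and has_work:
        return build_call(framework_id, "REVIVE")    # new work: resume offers
    return None                                      # no change, send nothing


if __name__ == "__main__":
    print(call_for_transition(True, False, "fw-1"))
    print(call_for_transition(False, True, "fw-1"))
```

Sending a call only on a state change (rather than on every scheduling loop)
keeps the number of requests to the master small.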

That said, the suppress/revive calls are currently a bit unwieldy because
of MESOS-9028
<https://issues.apache.org/jira/browse/MESOS-9028>:

The revive call has two semantics. It unsuppresses the framework AND clears
all the existing filters. The latter makes the revive call non-idempotent.
Also, users may sometimes want to keep the existing filters when reviving,
which is not possible at the moment.
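
As an illustration of the problem, here is a toy Python model of the
per-framework state that revive touches. This is not Mesos code: the class,
its fields and the agent IDs are invented for the sketch, and filters are
reduced to a set of agent IDs. It shows that a repeated revive is not a
harmless no-op, because it wipes any filters installed since the first one.

```python
class FrameworkState:
    """Toy model of one framework's allocator-side state (not Mesos code)."""

    def __init__(self):
        self.suppressed = False
        self.filters = set()  # agent IDs the framework declined with a filter

    def suppress(self):
        self.suppressed = True

    def decline(self, agent_id):
        # Declining an offer with a filter makes Mesos hold back that agent.
        self.filters.add(agent_id)

    def revive(self):
        # Current semantics: unsuppress AND clear all filters.
        self.suppressed = False
        self.filters.clear()


fw = FrameworkState()
fw.suppress()
fw.revive()                 # first revive: the framework gets offers again
fw.decline("agent-2")       # the framework filters out an unsuitable agent
filters_before_retry = set(fw.filters)

fw.revive()                 # duplicate or retried revive
filters_after_retry = set(fw.filters)

# The retry changed nothing about suppression, but the filter is gone.
print(filters_before_retry, filters_after_retry)
```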

To decouple the semantics, as suggested in the ticket, we propose to add
two new V1 scheduler calls:

(1) the `UNSUPPRESS` call requests Mesos to resume sending offers;
(2) the `CLEAR_FILTER` call explicitly clears all the existing filters.
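
A rough Python sketch of the intended split, again as a toy model rather than
the actual implementation: the class, method names and agent ID below are
assumptions made for illustration. Each new call touches exactly one piece of
state, and revive remains the combination of the two.

```python
class FrameworkState:
    """Toy model of the proposed decoupled semantics (not Mesos code)."""

    def __init__(self):
        self.suppressed = False
        self.filters = set()  # agent IDs the framework declined with a filter

    def suppress(self):
        self.suppressed = True

    def decline(self, agent_id):
        self.filters.add(agent_id)

    def unsuppress(self):
        # Proposed UNSUPPRESS: resume offers, leave filters alone.
        self.suppressed = False

    def clear_filter(self):
        # Proposed CLEAR_FILTER: drop all filters, leave suppression alone.
        self.filters.clear()

    def revive(self):
        # Existing REVIVE keeps its combined behaviour.
        self.unsuppress()
        self.clear_filter()


fw = FrameworkState()
fw.decline("agent-1")
fw.suppress()

fw.unsuppress()
fw.unsuppress()             # repeating the call leaves the state unchanged
state_after_unsuppress = (fw.suppressed, set(fw.filters))

fw.suppress()
fw.clear_filter()           # filters go away, the framework stays suppressed
state_after_clear = (fw.suppressed, set(fw.filters))

print(state_after_unsuppress, state_after_clear)
```

In this model a framework that wants to keep its filters when new work
arrives simply calls unsuppress instead of revive.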

To make life easier, both calls will return 200 OK (as opposed to 202
returned by most existing scheduler calls, including `SUPPRESS` and
`REVIVE`).

We will keep the revive call and its semantics (i.e. unsuppress AND clear
filters) for backward compatibility.

Note that the changes are proposed for the V1 API only. Thus, once the
changes land, framework developers are encouraged to move to the V1 API to
take advantage of the new calls (among many other benefits).

Any feedback/comments are welcome.

-Meng
