Re: Review Request 35702: Added /reserve HTTP endpoint to the master.

Michael Park Fri, 31 Jul 2015 10:03:55 -0700


> On June 22, 2015, 1:32 p.m., Alexander Rukletsov wrote:
> > src/master/master.cpp, line 749
> > <https://reviews.apache.org/r/35702/diff/6/?file=989449#file989449line749>
> >
> >     I think reserve is too abstract and may collide with future actions 
> > (think quota). How about `/dynamic/reserve`?
> 
> Alexander Rukletsov wrote:
>     Though we currently do not support slashes in endpoints, I think we 
> should fix that first before introducing a `/reserve` endpoint, given these 
> endpoints are not targeted for 0.23.
> 
> Joris Van Remoortere wrote:
>     Cody had some patches for enabling sub namespaces in endpoints (as in 
> enabling slashes). Might be worth pulling those in.
> 
> Alexander Rukletsov wrote:
>     Yep, it's https://issues.apache.org/jira/browse/MESOS-2130, I plan to 
> bring up the discussion today at the community sync.
> 
> Michael Park wrote:
>     The consensus for now seems to be that (1) we introduce the allocator changes, 
> but address the allocator refactor sooner rather than later, (2) go with 
> `/reserve` for now and update them once the HTTP API folks get to supporting 
> the nested endpoint stuff.
> 
> Alexander Rukletsov wrote:
>     And (3) we update the endpoint names before the following release, i.e. 
> there is no Mesos release in which `/reserve` exists. Correct?
> 
> Michael Park wrote:
>     That is the ideal outcome. But if we commit this now/soon, whether we can 
> update the names before 0.24.0 gets out depends entirely on whether the 
> nested endpoint name support gets committed in time.
> 
> Alexander Rukletsov wrote:
>     I think we agreed that we should, in order to avoid a deprecation cycle for 
> "old" endpoints.


I synced with BenH and Jie regarding this topic. They both suggested that we 
should get this in as is and update it later. Jie suggested that if the HTTP API 
doesn't make it in time, we can either not mention it in the `CHANGELOG` or 
mention it as an alpha feature that is subject to change.


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35702/#review88781
-----------------------------------------------------------


On July 28, 2015, 9:03 p.m., Michael Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35702/
> -----------------------------------------------------------
> 
> (Updated July 28, 2015, 9:03 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, Joris 
> Van Remoortere, and Vinod Kone.
> 
> 
> Bugs: MESOS-2600
>     https://issues.apache.org/jira/browse/MESOS-2600
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This involved a lot more challenges than I anticipated. I've captured the 
> various approaches, along with their limitations and deal-breakers, 
> here: [Master Endpoint Implementation 
> Challenges](https://docs.google.com/document/d/1cwVz4aKiCYP9Y4MOwHYZkyaiuEv7fArCye-vPvB2lAI/edit#)
> 
> Key points:
> 
> * This is a stop-gap solution until we shift the offer creation/management 
> logic from the master to the allocator.
> * `updateAvailable` and `updateSlave` are kept separate because
>   (1) `updateAvailable` is allowed to fail whereas `updateSlave` must not.
>   (2) `updateAvailable` returns a `Future` whereas `updateSlave` does not.
>   (3) `updateAvailable` never leaves the allocator in an over-allocated state 
> (and must not), whereas `updateSlave` can and does.
> * The algorithm:
>     * Initially, the master pessimistically assumes that what seem like 
> "available" resources will be gone.
>       This is due to the race between the allocator scheduling an `allocate` 
> call to itself and the master's `allocator->updateAvailable` invocation.
>       As such, we first try to satisfy the request only with the offered 
> resources.
>     * We greedily rescind one offer at a time until we've rescinded 
> sufficiently many offers.
>       IMPORTANT: We perform `recoverResources(..., Filters())` rather than 
> `recoverResources(..., None())` so that we can pretty much always win the 
> race against `allocate`.
>                  In the case that we lose, no disaster occurs. We simply fail 
> to satisfy the request.
>     * If we still don't have enough resources after rescinding all offers, we 
> optimistically forward the request to the allocator, since there may be 
> available resources to satisfy the request.
>     * If the allocator returns a failure, report the error to the user with 
> `PreconditionFailed`. This could be changed to `Forbidden`, or perhaps 
> `Conflict`. We'll pick one eventually.
> 
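The rescind-then-forward steps in the list above can be sketched roughly as follows. This is a toy model, not the code under review: `Offer`, `ToyAllocator`, `Status`, and `reserve` are hypothetical names, resources are reduced to a single CPU count, and the race against `allocate` (and therefore the `Filters()` detail) is not modelled at all. The toy allocator is also consulted on every request, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are Mesos types.
struct Offer {
  int id;
  double cpus;  // Resources are reduced to one scalar for brevity.
};

struct ToyAllocator {
  double available;  // Resources not currently offered to any framework.

  // Loosely models `recoverResources`: rescinded resources rejoin the pool.
  void recover(double cpus) { available += cpus; }

  // Loosely models `updateAvailable`: allowed to fail, never over-allocates.
  bool updateAvailable(double requested) {
    if (requested > available) {
      return false;
    }
    available -= requested;
    return true;
  }
};

enum class Status { OK, PreconditionFailed };

// Greedily rescinds one offer at a time until the rescinded total covers
// the request (or no offers remain), then lets the allocator decide.
Status reserve(
    ToyAllocator& allocator,
    std::vector<Offer>& offers,
    double requested,
    std::vector<int>& rescinded) {
  assert(requested >= 0.0);

  double recovered = 0.0;
  while (recovered < requested && !offers.empty()) {
    Offer offer = offers.back();
    offers.pop_back();
    allocator.recover(offer.cpus);
    recovered += offer.cpus;
    rescinded.push_back(offer.id);
  }

  // If the offers fell short, this is the optimistic path: resources that
  // were never offered may still cover the remainder.
  return allocator.updateAvailable(requested)
    ? Status::OK
    : Status::PreconditionFailed;
}
```

With 1 CPU unoffered and three 2-CPU offers outstanding, a 3-CPU request rescinds two offers and succeeds in this model; the third offer is left alone. A request larger than everything combined rescinds every offer and still fails, which is the cost the next paragraph refers to.
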
> This approach is clearly not ideal, since we would prefer to rescind as 
> few offers as possible.
> The challenges of implementing the ideal solution in the current state are 
> described in the document above.
> 
> TODO(mpark): Add more comments and test cases.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 3a1598fad4db03e5f62fd4a6bd26b2bedeee4070 
>   src/master/master.hpp 827d0d599912b2936beb9615610f627f6c9a2d43 
>   src/master/master.cpp 5b5e3c37d4433c8524db267866aebc0a35a181f1 
>   src/master/validation.hpp 469d6f56c3de28a34177124aae81ce24cb4ad160 
>   src/master/validation.cpp 9d128aa1b349b018b8e4a1916434d848761ca051 
> 
> Diff: https://reviews.apache.org/r/35702/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Michael Park
> 
>
