> On July 24, 2014, 6:27 p.m., Ben Mahler wrote:
> > src/master/master.cpp, lines 3451-3457
> > <https://reviews.apache.org/r/22796/diff/7/?file=634636#file634636line3451>
> >
> >     I forgot to mention the bug here in my comment!
> >     
> >     With an offerTimeout function, you can properly get the resources
> >     back from the allocator.
> >     
> >     This current patch removes the offer but doesn't tell the allocator!
> 
> Ben Mahler wrote:
>     Ideally we could capture the allocator expectations in the test, which
>     would have caught this issue.
> 
> Timothy Chen wrote:
>     Not sure I understand. I thought the removeOffer call already handles
>     rescinding offers, which also gives allocated resources back to the
>     slave and framework?
>     This patch simply adds a timeout to call rescind when the offer is not
>     claimed?

Take a look at the other calls to removeOffer; this one is in the same vein as
this review: https://github.com/apache/mesos/blob/0.19.1/src/master/master.cpp#L1857

This trickiness was the motivation for: 
https://issues.apache.org/jira/browse/MESOS-1452

Let's improve the test here! We should expect that, after the timeout, the
scheduler receives another offer for the same resources. That will not happen
with the current diff.
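
To make the point above concrete, here is a rough, self-contained sketch. It
is a toy model, not Mesos code: ToyAllocator, ToyMaster, sendOffer, and recover
are all hypothetical names invented for illustration, and resources are reduced
to a single CPU count. It only shows why forgetting an offer without telling
the allocator means the same resources never get offered again.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the allocator: tracks CPUs free to be offered.
struct ToyAllocator {
  int freeCpus;

  // Hands out up to `cpus` CPUs for a new offer; returns the amount granted.
  int allocate(int cpus) {
    assert(cpus >= 0);
    int granted = cpus <= freeCpus ? cpus : freeCpus;
    freeCpus -= granted;
    return granted;
  }

  // Takes back CPUs from an unused offer so they can be offered again.
  void recover(int cpus) { freeCpus += cpus; }
};

// Hypothetical stand-in for the master: tracks outstanding offers.
struct ToyMaster {
  ToyAllocator allocator;
  std::map<int, int> offers;  // offer id -> CPUs held by that offer
  int nextOfferId = 0;

  explicit ToyMaster(int cpus) : allocator{cpus} {}

  // Creates an offer; returns its id, or -1 if nothing is free to offer.
  int sendOffer(int cpus) {
    int granted = allocator.allocate(cpus);
    if (granted == 0) return -1;
    offers[nextOfferId] = granted;
    return nextOfferId++;
  }

  // Only forgets the offer. The allocator is NOT told, so the CPUs stay
  // accounted as allocated.
  void removeOffer(int offerId) { offers.erase(offerId); }

  // Timeout handler: return the CPUs to the allocator first, then forget the
  // offer. An offer that is already gone is a no-op, so a late timer cannot
  // remove the same offer twice.
  void offerTimeout(int offerId) {
    auto it = offers.find(offerId);
    if (it == offers.end()) return;
    allocator.recover(it->second);
    removeOffer(offerId);
  }
};
```

In this model, calling removeOffer alone leaves freeCpus at zero, so a later
sendOffer has nothing to hand out. Calling offerTimeout instead restores the
count, and the next sendOffer covers the same resources, which is the behavior
the improved test should expect.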


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22796/#review48670
-----------------------------------------------------------


On July 28, 2014, 8:34 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22796/
> -----------------------------------------------------------
> 
> (Updated July 28, 2014, 8:34 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-186
>     https://issues.apache.org/jira/browse/MESOS-186
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Based on Kapil's patch (https://reviews.apache.org/r/22066/), this adds a
> timeout to each offer from the master so the offer is removed when it's no
> longer used.
> 
> 
> Diffs
> -----
> 
>   src/master/flags.hpp 32704ce 
>   src/master/master.hpp d8a4d9e 
>   src/master/master.cpp 273a516 
>   src/tests/master_tests.cpp 5a1cf7f 
> 
> Diff: https://reviews.apache.org/r/22796/diff/
> 
> 
> Testing
> -------
> 
> Added three more unit tests from Kapil's patch: offer not rescinded after a
> task is launched, and offer not rescinded when the framework or slave is
> unregistered.
> The tests exposed a race condition that can lead to a segfault if removeOffer
> is called twice on the same offer.
> 
> make check.
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>
