> On Jan. 24, 2014, 7:47 a.m., Benjamin Hindman wrote:
> > src/launcher/executor.cpp, line 298
> > <https://reviews.apache.org/r/17305/diff/1/?file=447761#file447761line298>
> >
> >     I think it makes sense to have a "global" reaper just like we do with 
> > statistics. Having multiple libprocess processes be delaying every 1 second 
> > to call waitpid is a bit wasteful.
> 
> Ben Mahler wrote:
>     That's true, at the same time we would be doing a 1 second no-op event 
> loop on all libprocess binaries, unless we added the ability to start/stop 
> the Reaper.
>     
>     I was thinking of a library-style API:
>     
>     namespace process {
>     Future<Option<int> > reap(pid_t);
>     }
>     
>     This would use a global Reaper under the hood, I'll be updating the diffs 
> with this cleaner interface.
> 
> Ben Mahler wrote:
>     One side-effect of the global reaper that I discovered while testing is 
> that when running the reaping related tests in repetition, they gradually 
> become slower. This is because each test that advances time for the global 
> reaper to run it's wait() loop pushes the next wait() loop further into the 
> future, requiring more and more calls to Clock::advance + Clock::settle.
>     
>     Before these changes, since a Reaper was re-constructed across each test 
> and each component, we were immune to this. I'll leave a note about this in 
> the code. We can fix this if we were to implement the approach where we have 
> a thread per pid, each blocking on waitpid().
> 
> Benjamin Hindman wrote:
>     Why isn't resuming the clock sufficient for this?

Because the global ReaperProcess lives across the tests.

>From its point of view, it is scheduling every 1 second. However, we're 
>locally pausing and advancing to force a schedule of the wait() function. Once 
>wait() is run, it will get re-scheduled 1 second further from when it was run 
>**in the advanced time-frame**. So the next test that wants to pause and 
>advance the clock, now must reach ~2 seconds into the advanced time-frame 
>rather than 1 second.

Resuming would be sufficient if it made time appear to continue in the advanced 
time frame, but it will actually result in time going _back_ to what it would 
have been had the clock had never been paused in the first place. Sorry this is 
a bit tricky to explain clearly. Let me know if I'm missing something or if 
it's still not clear.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17305/#review32706
-----------------------------------------------------------


On Jan. 27, 2014, 11:25 p.m., Ben Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17305/
> -----------------------------------------------------------
> 
> (Updated Jan. 27, 2014, 11:25 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Vinod Kone.
> 
> 
> Bugs: MESOS-943
>     https://issues.apache.org/jira/browse/MESOS-943
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> This removes the Mesos Reaper in preference of using the libprocess Reaper, 
> which no longer reaps non-monitored processes.
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am d58b46e99e0a041cf2a26abe44bbd1504a9539c0 
>   src/launcher/executor.cpp b73ab479500a7347a38ba53acecfab9229f1080d 
>   src/slave/cgroups_isolator.hpp e86062e9abaaa263c32c55e9dbfefd700f605886 
>   src/slave/cgroups_isolator.cpp 80155a3f0cd684a6aad9a37f61c5204d86032735 
>   src/slave/process_isolator.hpp 4ae093fe65775a2b9bec42071961dd58aa0c3d8b 
>   src/slave/process_isolator.cpp 0bc698f04f7c8eaad166dc9d646e13310129dd01 
>   src/slave/reaper.hpp 9a31c754475ecbce5299d8f18f38253c542404e5 
>   src/slave/reaper.cpp 5eabbc3911584cf47c353bcf4ca660c47c2c17be 
>   src/tests/environment.cpp 6edce4552ef9a12b7b58cefea97ebacc9224ab04 
>   src/tests/reaper_tests.cpp 608ec0eff4eaae115d75621937a39b22e3bdb068 
>   src/tests/slave_recovery_tests.cpp 5a4c4fc4f687a37409d1afbda4c0d07fcdc3a4c7 
> 
> Diff: https://reviews.apache.org/r/17305/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> I've also added an orphan check in the testing environment tear down.
> 
> 
> Thanks,
> 
> Ben Mahler
> 
>

Reply via email to