> On Nov. 19, 2015, 6:16 p.m., Neil Conway wrote:
> > Good find. I wonder:
> > 
> > (a) is there some general advice we should give to people implementing 
> > Processes (e.g., "always provide a destructor that does terminate/wait" -- 
> > that is probably too broad though). Would be nice to add this to the 
> > libprocess README.
> > (b) does this problem occur anywhere else? and/or is there a way to detect 
> > it?
> 
> Bernd Mathiske wrote:
>     [~neilc] a) I agree that we should provide documentation in this regard. 
> This kind of pattern is too easily overlooked, also by reviewers, as 
> exemplified when the bug got introduced: https://reviews.apache.org/r/27483/
>     b) For test code (such as is the case here), we could put something into 
> the inherited fixture that scans for orphaned processes at TearDown.
> 
> Benjamin Bannier wrote:
>     Note that there is some assymmetry in the API: when *not giving* a 
> `manage` arg to `spawn`. you get the default value `false`; but you then *do 
> need to add code* to terminate and wait for the process.
>     
>     I personally would naively have expected an API to either (i) require me 
> to be explicitly in both places (explictly set `manage=false`, and to manual 
> `terminate` and `wait`), or (ii) it just do The Right Thing if I was brief (I 
> called `spawn` with default args, and do not have to worry about cleanup).
>     
>     I.e., wouldn't it in the long run make more sense the change `spawn` to 
> `PID<T> spawn(T& t, bool manage = true)` than to add more documentation?

[~neilc] 
a) I agree that documentation is a good direction.   (I'll draft something.)

[~bernd-mesos]
b) We already have an orphaned process detector in `src/tests/environment.cpp` 
`Environment::TearDown`.  However, this only catches **orphaned** processes 
(whose parents have been killed :( ).  This HTTP process's parent is the test 
process.
There is an endpoint `/__processes__` which we might be able to use to check 
for extraneous processes.  But this might be messy, since we have a great deal 
of global processes.

[~bbannier]
The `manage` argument will only change ownership of the process pointer.  If 
you dig into the `GarbageCollector`, you'll see that all it does is `delete` 
the pointer when the process terminates.  The original process is still in 
charge of its own termination.


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40501/#review107290
-----------------------------------------------------------


On Nov. 19, 2015, 1:51 p.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40501/
> -----------------------------------------------------------
> 
> (Updated Nov. 19, 2015, 1:51 p.m.)
> 
> 
> Review request for mesos, Bernd Mathiske, Artem Harutyunyan, and Joris Van 
> Remoortere.
> 
> 
> Bugs: MESOS-3753
>     https://issues.apache.org/jira/browse/MESOS-3753
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Two of the fetcher tests will spawn a process which is stored in the stack 
> (i.e. local variable in the test).  `spawn` will store a pointer to the 
> process in libprocess's `ProcessManager`.  When the test finishes, the 
> process goes out of scope and is therefore lost.  However, the process is 
> **not** terminated.
> 
> Failing to terminate this process will lead to an infinite loop in 
> `~ProcessManager`, which is called in `process::finalize`.  In 
> `ProcessManager` 's destructor, we will loop and try to kill all processes.  
> The process spawned in the test will be running.  However, since the pointer 
> lives in the stack, the `ProcessManager` will be unable to find the process 
> and will thereby be stuck trying to kill a process it cannot find.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_tests.cpp 04079964b3539f555351d1444f3635c64700a1a8 
> 
> Diff: https://reviews.apache.org/r/40501/diff/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Additional testing:
> 
> Insert a `process::finalize` in `src/test/main.cpp`.  i.e.
> ```
>   // Replace `return RUN_ALL_TESTS();` with this:
>   int ret = RUN_ALL_TESTS();
>   process::finalize();
>   return ret;
> ```
> 
> Then `make check 
> GTEST_FILTER="*FetcherTest.OSNetUriTest*:*FetcherTest.OSNetUriSpaceTest*"`.
> The test program should not stall or segfault or abort in some weird way.
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>

Reply via email to