> On May 3, 2019, 11:31 p.m., Chun-Hung Hsiao wrote: > > 3rdparty/libprocess/src/tests/process_tests.cpp > > Lines 2092-2108 (patched) > > <https://reviews.apache.org/r/70595/diff/1/?file=2142838#file2142838line2092> > > > > I tried this unit test w/o the fix in the following settings: > > 1. `--gtest_repeat=100`, which resulted in 3 passes and 97 failures. > > 2. Invoked the test w/o repetition in a for-loop, which resulted in 60 > > passes and 40 failures. > > > > So it seems to me that this test passes fairly easily if it only runs > > once. > > > > Then I replace this snippet with the following code, which just used 1 > > actor but failed every time in the above two settings: > > ``` > > UPID pid = spawn(new ProcessBase(), true); > > > > dispatch(pid, [] { os::sleep(Milliseconds(50)); }); > > > > http::URL url = http::URL( > > "http", > > process::address().ip, > > process::address().port, > > "/__processes__"); > > > > Future<http::Response> response = http::get(url); > > > > terminate(pid, true); > > ``` > > Alexander Rukletsov wrote: > Agree with Chun, we should avoid races as much as we can. Several > suggestions to further improve Chun's version: > - Pause the clock in the beginnig. We can then sleep for 42 minutes to > make sure the first message is still being processes when `http::get()` is > invoked (will of course require `settle()` before). > - `settle()` after `http::get` to ensure `/__processes__` handler is put > into the `pid`'s mailbaox. > - Check that the resulted JSON does not include the entry for `pid`, > i.e., is empty. > > Chun-Hung Hsiao wrote: > I'm not sure if adding settles works: > 1. Adding a `settle` before `http::get` makes it wait for the completion > of the sleep. > 2. Adding a `settle` after `http::get` makes `terminate` wait for the > completion of the sleep. > IIUC both will increase the likelihood of the test passing w/o che fix.
Thanks guys, I will upload a test that doesn't rely on racing, we could also go with Chun's simpler test which relies on racing but will only rarely not exercise what we want. > On May 3, 2019, 11:31 p.m., Chun-Hung Hsiao wrote: > > 3rdparty/libprocess/src/tests/process_tests.cpp > > Lines 2110-2112 (patched) > > <https://reviews.apache.org/r/70595/diff/1/?file=2142838#file2142838line2110> > > > > `AWAIT_EXPECT_RESPONSE_STATUS_EQ(http::OK().status, response);` Now that I look at it, it looks less clear/readable with the long macro. - Benjamin ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70595/#review215035 ----------------------------------------------------------- On May 3, 2019, 7:58 p.m., Benjamin Mahler wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70595/ > ----------------------------------------------------------- > > (Updated May 3, 2019, 7:58 p.m.) > > > Review request for mesos, Alexander Rukletsov and Chun-Hung Hsiao. > > > Bugs: MESOS-9766 > https://issues.apache.org/jira/browse/MESOS-9766 > > > Repository: mesos > > > Description > ------- > > This test fails on master prior to applying the fix for MESOS-9766. > It attempts to ensure that processes are terminated after the > /__processes__ handler dispatches to them. > > > Diffs > ----- > > 3rdparty/libprocess/src/tests/process_tests.cpp > 60f3dd653153c2b2ccf9c3a7eae8c75fd6ff025c > > > Diff: https://reviews.apache.org/r/70595/diff/1/ > > > Testing > ------- > > Ran in repetition, although it appears to consistently fail on master without > repetition needed. > > > Thanks, > > Benjamin Mahler > >