----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70728/#review216741 -----------------------------------------------------------
src/tests/master_tests.cpp Line 9017 (original), 9017 (patched) <https://reviews.apache.org/r/70728/#comment303953> There are two data races here: 1. `updateSlaveMessage` being ready doesn't mean that the `SUBSCRIBED` has been received by the resource provider process. 2. This line and the `subscribed` function accesses `info` in two differen threads without a proper synchronization. I'd suggest that we have a `Future<ResourceProviderInfo>` or `Future<ResourceProviderID>` or `Future<Subscribed>` in `MockResourceProvider` that is set up in the `subscribed` callback, and we wait for the future here, then avoid accessing `process->info` directly in the remaining of this test. If you don't want a copy, a reference that is set up after the wait is also okay since it would be clear that we have validated the reference. Maybe something like ``` AWAIT_READY(resourceProvider->info()); const ResourceProviderID& resourceProviderId = resourceProvider->info().id(); ``` src/tests/master_tests.cpp Line 9246 (original), 9247 (patched) <https://reviews.apache.org/r/70728/#comment303955> Let's not run the actor's member functions in a different context: ``` dispatch(resourceProvider.process, &MockResourceProviderProcess::operationDefault, operation.get()); ``` src/tests/master_tests.cpp Line 9376 (original), 9377 (patched) <https://reviews.apache.org/r/70728/#comment303956> We need to have a proper synchronization here to avoid data races. See my comments in `UpdateSlaveMessageWithPendingOffers`. Also, let's avoid accessing `process->info` directly in the remaining of this test. src/tests/master_tests.cpp Line 9472 (original), 9473 (patched) <https://reviews.apache.org/r/70728/#comment303957> Let's not run the actor's member functions in a different context: ``` dispatch(resourceProvider.process, &MockResourceProviderProcess::operationDefault, applyOperation.get()); ``` src/tests/mesos.hpp Line 3030 (original), 3041 (patched) <https://reviews.apache.org/r/70728/#comment303942> No need to have a type alias here if you use `Self`. src/tests/mesos.hpp Line 3037 (original), 3048 (patched) <https://reviews.apache.org/r/70728/#comment303947> Not yours, but this template parameter is not used in the class template. src/tests/mesos.hpp Lines 3062 (patched) <https://reviews.apache.org/r/70728/#comment303943> Let's use `Self::connectDefault` instead. Ditto below. src/tests/mesos.hpp Line 3152 (original), 3168 (patched) <https://reviews.apache.org/r/70728/#comment303949> s/`MockResourceProviderProcessT`/`Self`/. Ditto below. src/tests/mesos.hpp Line 3270 (original), 3269 (patched) <https://reviews.apache.org/r/70728/#comment303948> Not yours, but since `Source` can be deduced from `Resoruce`, let's use `Resource::DiskInfo::Source` here and remove the `Source` template parameter above. src/tests/mesos.hpp Lines 3322 (patched) <https://reviews.apache.org/r/70728/#comment303951> Just a naming suggestion. Since `MockResourceProvider` is no longer a "mock" class (i.e., has no mock function), how about renaming it to `TestResourceProvider`, and the backing process becomes `TestResourceProviderProcess`? This is the pattern I used for `TestDiskProfileServer`. Feel free to drop this though. src/tests/mesos.hpp Lines 3335 (patched) <https://reviews.apache.org/r/70728/#comment303974> Let's do `process::terminate(*process)` instead and remove the `terminate` function. See below. src/tests/mesos.hpp Lines 3350 (patched) <https://reviews.apache.org/r/70728/#comment303973> The use of this `terminate` function in other tests are error prone to me. Instead, let's have a `stop` function in `MockResourceProviderProcess`: ``` void stop() { driver.reset(); } ``` And get rid of this function. See my comments in `ResubscribeResourceProvider`. Also, this function provides a second way to stop the resource provider other than `resourceProvider.reset()`, and we use both in tests. I'd suggesting unify them by removing this function. As of the name, I'd prefer `stop` over `terminate` to match the `start` call. src/tests/mesos.hpp Lines 3361-3372 (patched) <https://reviews.apache.org/r/70728/#comment303950> No need to introduce this type alias in this class template since the class template `MockResourceProviderProcess` is already public. src/tests/operation_reconciliation_tests.cpp Line 1760 (original), 1760 (patched) <https://reviews.apache.org/r/70728/#comment303958> We need to have a proper synchronization here to avoid data races. See my comments in `UpdateSlaveMessageWithPendingOffers`. src/tests/resource_provider_manager_tests.cpp Line 1167 (original), 1167 (patched) <https://reviews.apache.org/r/70728/#comment303959> We need to have a proper synchronization here to avoid data races. See my comments in `UpdateSlaveMessageWithPendingOffers`. src/tests/resource_provider_manager_tests.cpp Line 1195 (original), 1195 (patched) <https://reviews.apache.org/r/70728/#comment303970> We're accessing the local variable `resourceProvider` in its process context, which seems error prone. How about having a `MockResourceProviderProcess::stop` function that resets its driver, and invoke it here: ``` Invoke(*resourceProvder->process, &MockResourceProviderProcess::stop) ``` src/tests/resource_provider_manager_tests.cpp Line 1272 (original), 1272 (patched) <https://reviews.apache.org/r/70728/#comment303971> Let's avoid accessing the local variable `resourceProvider` in its process context: ``` Invoke(*resourceProvder->process, &MockResourceProviderProcess::stop) ``` See my comments in `ResubscribeResourceProvider`. src/tests/resource_provider_manager_tests.cpp Line 1342 (original), 1342 (patched) <https://reviews.apache.org/r/70728/#comment303960> We need to have a proper synchronization here to avoid data races. See my comments in `UpdateSlaveMessageWithPendingOffers`. src/tests/resource_provider_manager_tests.cpp Line 1421 (original), 1420 (patched) <https://reviews.apache.org/r/70728/#comment303972> Let's avoid accessing the local variable `resourceProvider1` in its process context: ``` Invoke(*resourceProvder1->process, &MockResourceProviderProcess::stop) ``` See my comments in `ResubscribeResourceProvider`. src/tests/slave_tests.cpp Line 10835 (original), 10836 (patched) <https://reviews.apache.org/r/70728/#comment303962> In this test, `applyOperation` being ready could indicate that `subscribed` has been called, which means that `info` has been set up, so no data race here. However, it would be nice to make these more explicit and avoid direct accesses to `process->info` in different threads. See my comments in `UpdateSlaveMessageWithPendingOffers`. src/tests/slave_tests.cpp Lines 10870 (patched) <https://reviews.apache.org/r/70728/#comment303964> Let's use `resourceProviderId` here instead. src/tests/slave_tests.cpp Line 11256 (original), 11258 (patched) <https://reviews.apache.org/r/70728/#comment303965> In this test, `operation` being ready could indicate that `subscribed` has been called, which means that `info` has been set up, so no data race here. However, it would be nice to make these more explicit and avoid direct accesses to `process->info` in the remaining of this test. See my comments in `UpdateSlaveMessageWithPendingOffers`. src/tests/slave_tests.cpp Line 11402 (original), 11406 (patched) <https://reviews.apache.org/r/70728/#comment303967> s/`CHECK`/`ASSERT_TRUE`/. We need to have a proper synchronization here to avoid data races. See my comments in `UpdateSlaveMessageWithPendingOffers`. Also, let's avoid accessing `process->info` directly in the remaining of this test. src/tests/slave_tests.cpp Line 11769 (original), 11775 (patched) <https://reviews.apache.org/r/70728/#comment303975> Let's make `resourceProvider` an `Owned` or `unique_ptr`, and do `resourceProvider.reset()` instead. - Chun-Hung Hsiao On May 27, 2019, 4:32 p.m., Benjamin Bannier wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70728/ > ----------------------------------------------------------- > > (Updated May 27, 2019, 4:32 p.m.) > > > Review request for mesos, Chun-Hung Hsiao and Jan Schlicht. > > > Bugs: MESOS-9560 > https://issues.apache.org/jira/browse/MESOS-9560 > > > Repository: mesos > > > Description > ------- > > We were passing callbacks into `MockResourceProvider` to the HTTP > driver. Since the lifecycle of the callbacks and the mock provider were > decoupled and these callbacks were binding the mock provider instance > the code was not safe as written as the driver could invoke the callback > after the provider had been destructed. > > This patch makes sure that the callbacks are defered to a process. We > also dispatch a number of other functions which are strongly coupled to > the lifecycle of the provider. We still do not hide the provider away > completely so the provider can be mocked in tests. > > > Diffs > ----- > > src/tests/api_tests.cpp af1d215f00c8c2224e807677afb4af2d3521235a > src/tests/master_slave_reconciliation_tests.cpp > 7b6ac50adacc8416b91c0dde55ff7ba46a20515c > src/tests/master_tests.cpp 097f1b77a59e29c6690210773d1556ebf2bb701e > src/tests/mesos.hpp d0c82fa4c1800ed2a1825d563a9462ecd41b77fd > src/tests/operation_reconciliation_tests.cpp > eae318da2273faae904f0155e49bb23cdb24f007 > src/tests/resource_provider_manager_tests.cpp > 7d48f18e89f046df6c92e52edeef592acfb13b10 > src/tests/slave_tests.cpp c2035976713abb31b3646c0d23771fa40df93271 > > > Diff: https://reviews.apache.org/r/70728/diff/2/ > > > Testing > ------- > > `make check` > > > Thanks, > > Benjamin Bannier > >