[ https://issues.apache.org/jira/browse/MESOS-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Rojas reassigned MESOS-6907:
--------------------------------------

    Assignee: Alexander Rojas

> FutureTest.After3 is flaky
> --------------------------
>
>                 Key: MESOS-6907
>                 URL: https://issues.apache.org/jira/browse/MESOS-6907
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess
>            Reporter: Alexander Rojas
>            Assignee: Alexander Rojas
>
> There is apparently a race condition between the time an instance of
> {{Future<T>}} goes out of scope and when the enclosing data is actually
> deleted, if {{Future<T>::after(Duration, lambda::function<Future<T>(const
> Future<T>&)>)}} is called.
> The issue is more likely to occur if the machine is under load or if it is
> not a very powerful one. The easiest way to reproduce it is to run:
> {code}
> $ stress -c 4 -t 2600 -d 2 -i 2 &
> $ ./libprocess-tests --gtest_filter="FutureTest.After3" --gtest_repeat=-1
> --gtest_break_on_failure
> {code}
> An exploratory fix for the issue is to change the test to:
> {code}
> TEST(FutureTest, After3)
> {
>   Future<Nothing> future;
>   process::WeakFuture<Nothing> weak_future(future);
>   EXPECT_SOME(weak_future.get());
>   {
>     Clock::pause();
>     // The original future disappears here. After this call the
>     // original future goes out of scope and should not be reachable
>     // anymore.
>     future = future
>       .after(Milliseconds(1), [](Future<Nothing> f) {
>         f.discard();
>         return Nothing();
>       });
>     Clock::advance(Seconds(2));
>     Clock::settle();
>     AWAIT_READY(future);
>   }
>   if (weak_future.get().isSome()) {
>     os::sleep(Seconds(1));
>   }
>   EXPECT_NONE(weak_future.get());
>   EXPECT_FALSE(future.hasDiscard());
> }
> {code}
> The interesting thing of the fix is that both extra snippets are needed
> (either one or the other is not enough) to prevent the issue from happening.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)