-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38645/#review100066
-----------------------------------------------------------

For the sake of repro'ing, maybe you could add a sleep before waiting on the future? Obviously not something we want in the actual patch, though.

- Neil Conway


On Sept. 22, 2015, 8:46 p.m., Anand Mazumdar wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38645/
> -----------------------------------------------------------
>
> (Updated Sept. 22, 2015, 8:46 p.m.)
>
>
> Review request for mesos, Isabel Jimenez and Vinod Kone.
>
>
> Repository: mesos
>
>
> Description
> -------
>
> This showed up on ASF CI. From the logs:
>
> `I0922 17:31:49.819221 28463 slave.cpp:4104] Finished recovery`
> Then ....
> `../../src/tests/executor_http_api_tests.cpp:290: Failure`
> `Failed to wait 15secs for __recover`
>
> Instead of doing a `FUTURE_DISPATCH` after `StartSlave()`, we should be doing
> it before starting the slave. In some cases, the slave would have already
> recovered by the time we invoke `FUTURE_DISPATCH`, leading to the flakiness.
>
>
> Diffs
> -----
>
>   src/tests/executor_http_api_tests.cpp 9dbc5191b5950df2faa693720f3740e97c7df758
>
> Diff: https://reviews.apache.org/r/38645/diff/
>
>
> Testing
> -------
>
> I was not able to reproduce it before or after this change, but looking at the
> logs it is quite obvious what the issue was. Ran in a loop 100 times.
>
> ASF CI error log:
> https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose,OS=ubuntu%3A14.04,label_exp=docker%7C%7CHadoop/839/changes
>
>
> Thanks,
>
> Anand Mazumdar
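
To make the ordering issue in the description concrete, here is a minimal standalone sketch. It does not use libprocess or the Mesos test harness: `ToyDispatcher`, `Expectation` and `startToySlave` are hypothetical names invented for illustration, and the model assumes only what the description states, namely that an expectation set up after the dispatch has already happened is never satisfied. `startToySlave` plays the worst case deterministically (recovery finishes before control returns to the test), which is roughly the interleaving a sleep would be meant to provoke in the real test.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a one-shot expectation: it can only observe
// dispatches that happen AFTER it has been registered.
struct Expectation {
  std::string name;
  bool satisfied = false;
};

// Hypothetical toy dispatcher; this is NOT libprocess.
class ToyDispatcher {
public:
  // Registers interest in a future dispatch called `name`.
  std::shared_ptr<Expectation> expect(const std::string& name) {
    auto expectation = std::make_shared<Expectation>();
    expectation->name = name;
    pending_.push_back(expectation);
    return expectation;
  }

  // Delivers a dispatch. Only expectations registered so far can see it;
  // nothing is recorded for expectations registered later.
  void dispatch(const std::string& name) {
    for (auto& expectation : pending_) {
      if (expectation->name == name) {
        expectation->satisfied = true;
      }
    }
  }

private:
  std::vector<std::shared_ptr<Expectation>> pending_;
};

// Stand-in for starting the slave in the worst case: recovery completes
// (and `__recover` is dispatched) before this call returns.
void startToySlave(ToyDispatcher& dispatcher) {
  dispatcher.dispatch("__recover");
}

// Racy ordering: start first, then set up the expectation.
bool expectAfterStart() {
  ToyDispatcher dispatcher;
  startToySlave(dispatcher);
  auto recover = dispatcher.expect("__recover");
  return recover->satisfied;  // The dispatch was missed.
}

// Safe ordering: set up the expectation first, then start.
bool expectBeforeStart() {
  ToyDispatcher dispatcher;
  auto recover = dispatcher.expect("__recover");
  startToySlave(dispatcher);
  return recover->satisfied;  // The dispatch was observed.
}
```

In this toy model, `expectAfterStart()` returns `false` and `expectBeforeStart()` returns `true`. In the real test the first ordering only fails when recovery happens to win the race, which would explain why it passes in a 100-iteration loop locally and still times out occasionally on CI.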