> On March 6, 2014, 11:58 p.m., Ben Mahler wrote: > > src/log/recover.cpp, lines 132-134 > > <https://reviews.apache.org/r/18584/diff/2/?file=512999#file512999line132> > > > > How do you feel about pulling the watch and broadcast apart: > > > > chain = watch() > > .then(defer(self(), &Self::broadcast)) > > .then(defer(self(), &Self::receive, lambda::_1)) > > .onAny(defer(self(), &Self::finished, lambda::_1)); > > > > > > Then we could make '_broadcast' only do the setting > > > > Future<Nothing> broadcasted() > > { > > responses = _responses; > > > > // Reset the counters. > > responsesReceived.clear(); > > lowestBeginPosition = None(); > > highestEndPosition = None(); > > > > return Nothing(); > > } > > > > This allows 'handle' to become 'receive' and '_handle' to become > > 'received': > > > > Future<Option<RecoverResponse> > receive() > > { > > // Instead of using a for loop here, we use select to process > > // responses one after another so that we can ignore the rest if > > // we have collected enough responses. > > return select(responses) > > .then(defer(self(), &Self::received, lambda::_1)); > > } > > > >
Liked it. Done. > On March 6, 2014, 11:58 p.m., Ben Mahler wrote: > > src/log/recover.cpp, line 179 > > <https://reviews.apache.org/r/18584/diff/2/?file=512999#file512999line179> > > > > Is there a race between watch() and broadcast() that can cause us to > > select() on an empty set and get stuck? > > > > watch() -> returns size == quorum > > // Now ZK blip, 0 network members. > > broadcast() -> returns empty set > > select(empty_set) -> stuck pending? > > > > Is this safe? Is there a timeout built-in at a higher layer to bail if > > we get stuck? Solved. > On March 6, 2014, 11:58 p.m., Ben Mahler wrote: > > src/log/recover.cpp, lines 415-418 > > <https://reviews.apache.org/r/18584/diff/2/?file=512999#file512999line415> > > > > Do we still need this now that finalize() is a no-op? I prefer to add a LOG line indicating the termination of the recover process. > On March 6, 2014, 11:58 p.m., Ben Mahler wrote: > > src/log/recover.cpp, lines 496-511 > > <https://reviews.apache.org/r/18584/diff/2/?file=512999#file512999line496> > > > > Do we need to have these helpers that implicitly set 'status' instead > > of just returning the Metadata::Status? > > > > It looks like we can remove the optimization in updateReplicaStatus and > > avoid the need to store a 'status' member. > > > > If the optimization is important, Replica::update(Metadata::Status) > > could avoid the write if there's no change, right? Liked it. Done. - Jie ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18584/#review36447 ----------------------------------------------------------- On March 6, 2014, 10:24 p.m., Jie Yu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/18584/ > ----------------------------------------------------------- > > (Updated March 6, 2014, 10:24 p.m.) > > > Review request for mesos, Benjamin Hindman and Ben Mahler. > > > Bugs: MESOS-984 > https://issues.apache.org/jira/browse/MESOS-984 > > > Repository: mesos-git > > > Description > ------- > > Previously, we use onAny to connect most of the steps. As a result, we see a > lot of the error checking code like the following: > > void checked(const Future<Metadata::Status>& future) > { > if (!future.isReady()) { > promise.fail( > future.isFailed() ? > "Failed to get replica status: " + future.failure() : > "Not expecting discarded future"); > terminate(self()); > return; > } > ... > } > > Another drawback is: discarding becomes difficult and hard to reason. > > Now, I cleaned up the code to use the continuation style (i.e., chaining > using Future.then). The code becomes much more clear now! > > Also, I factored out the recover protocol part from the log recover process > to make the log recover process less cumbersome. > > This patch does not change any semantics, but is for MESOS-984 (log auto > initialization for registrar). Will follow up with a patch to integrate the > log auto initialization based on this patch. > > > Diffs > ----- > > src/log/recover.cpp e611a4e > > Diff: https://reviews.apache.org/r/18584/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Jie Yu > >
