Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Ben Mahler Thu, 06 Mar 2014 15:59:31 -0800

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18584/#review36447
-----------------------------------------------------------




src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67422>

    How do you feel about pulling the watch and broadcast apart:
    
    chain = watch()
      .then(defer(self(), &Self::broadcast))
      .then(defer(self(), &Self::receive, lambda::_1))
      .onAny(defer(self(), &Self::finished, lambda::_1));
    
    
    Then we could make '_broadcast' only do the setting
    
    Future<Nothing> broadcasted()
    {
        responses = _responses;
    
        // Reset the counters.
        responsesReceived.clear();
        lowestBeginPosition = None();
        highestEndPosition = None();
    
        return Nothing();
    }
    
    This allows 'handle' to become 'receive' and '_handle' to become 'received':
    
      Future<Option<RecoverResponse> > receive()
      {
        // Instead of using a for loop here, we use select to process
        // responses one after another so that we can ignore the rest if
        // we have collected enough responses.
        return select(responses)
          .then(defer(self(), &Self::received, lambda::_1));
      }
    
    



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67427>

    Is there a race between watch() and broadcast() that can cause us to 
select() on an empty set and get stuck?
    
    watch() -> returns size == quorum
    // Now ZK blip, 0 network members.
    broadcast() -> returns empty set
    select(empty_set) -> stuck pending?
    
    Is this safe? Is there a timeout built-in at a higher layer to bail if we 
get stuck?



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67393>

    How about: "Starting replica recovery"



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67394>

    Do we still need this now that finalize() is a no-op?



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67406>

    Do we need to have these helpers that implicitly set 'status' instead of 
just returning the Metadata::Status?
    
    It looks like we can remove the optimization in updateReplicaStatus and 
avoid the need to store a 'status' member.
    
    If the optimization is important, Replica::update(Metadata::Status) could 
avoid the write if there's no change, right?



src/log/recover.cpp
<https://reviews.apache.org/r/18584/#comment67408>

    Can you move discard up to the beginning of the private block?


- Ben Mahler


On March 6, 2014, 10:24 p.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18584/
> -----------------------------------------------------------
> 
> (Updated March 6, 2014, 10:24 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Bugs: MESOS-984
>     https://issues.apache.org/jira/browse/MESOS-984
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Previously, we use onAny to connect most of the steps. As a result, we see a 
> lot of the error checking code like the following:
> 
> void checked(const Future<Metadata::Status>& future)  
> {
>   if (!future.isReady()) {
>     promise.fail(
>         future.isFailed() ?
>         "Failed to get replica status: " + future.failure() :
>         "Not expecting discarded future");
>     terminate(self());
>     return;
>   }
>   ...
> }
> 
> Another drawback is: discarding becomes difficult and hard to reason.
> 
> Now, I cleaned up the code to use the continuation style (i.e., chaining 
> using Future.then). The code becomes much more clear now!
> 
> Also, I factored out the recover protocol part from the log recover process 
> to make the log recover process less cumbersome.
> 
> This patch does not change any semantics, but is for MESOS-984 (log auto 
> initialization for registrar). Will follow up with a patch to integrate the 
> log auto initialization based on this patch.
> 
> 
> Diffs
> -----
> 
>   src/log/recover.cpp e611a4e 
> 
> Diff: https://reviews.apache.org/r/18584/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>

Re: Review Request 18584: Cleaned up log recovery code to use continuation style.

Reply via email to