-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30514/#review71021
-----------------------------------------------------------


Very nice! First high level pass, haven't looked at the tests yet.


src/master/master.cpp
<https://reviews.apache.org/r/30514/#comment116535>

    Short-circuiting with a pre-increment seems pretty tricky, how about:
    
    ```
    if (pinged) {
      // No pong was received within the timeout!
      timeouts++;
      
      if (timeouts >= MAX_SLAVE_PING_TIMEOUTS) {
        shutdown();
      }
    }
    ```



src/master/master.cpp
<https://reviews.apache.org/r/30514/#comment116545>

    It looks like we only log when we're shutting down the slave, when there's 
no rate limiter?
    
    Have you considered having a single shutdown code path? This gets you a 
single dispatch, and a single log statement and it looks like it's less control 
flow?
    
    ```
      void shutdown()
      {
        if (shuttingDown.isNone()) {
          Future<Nothing> future = Nothing();
    
          if (limiter.isSome()) {
            LOG(INFO) << "Scheduling shutdown of slave " << slaveId
                      << " due to health check timeout";
    
            future = limiter.get()->acquire();
          }
    
          shuttingDown = future.onAny(defer(self(), &Self::_shutdown));
    
          ++metrics->slave_shutdowns_scheduled;
        }
      }
    
      void _shutdown()
      {
        CHECK_SOME(shuttingDown);
    
        const Future<Nothing>& future = shuttingDown.get();
    
        CHECK(!future.isFailed()) << future.failure();
    
        if (future.isReady()) {
          LOG(INFO) << "Shutting down slave " << slaveId
                    << " due to health check timeout";
    
          dispatch(master,
                   &Master::shutdownSlave,
                   slaveId,
                   "health check timed out");
        } else if (future.isDiscarded()) {
          // Shutdown was canceled.
          LOG(INFO) << "Canceling shutdown of slave " << slaveId
                    << " since a pong is received!";
    
          ++metrics->slave_shutdowns_canceled;
        }
    
        shuttingDown = None();
      }
    ```



src/master/master.cpp
<https://reviews.apache.org/r/30514/#comment116548>

    Whoops, indentation?


- Ben Mahler


On Feb. 4, 2015, 1:51 a.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30514/
> -----------------------------------------------------------
> 
> (Updated Feb. 4, 2015, 1:51 a.m.)
> 
> 
> Review request for mesos, Ben Mahler, David Robinson, and Jie Yu.
> 
> 
> Bugs: MESOS-1148
>     https://issues.apache.org/jira/browse/MESOS-1148
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The algorithm is simple. All the slave observers share a rate limiter whose 
> rate is configured via command line. When a slave times out on health check, 
> a permit is acquired to shutdown the slave *but* the pings are continued to 
> be sent. If a pong arrives before the permit is satisifed, the shutdown is 
> cancelled.
> 
> 
> Diffs
> -----
> 
>   src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d 
>   src/master/master.hpp cd37ee9d3c57bcd91f08cd402ec1e4a09d9e42ee 
>   src/master/master.cpp d04b2c4041d8fe8978b877f07579a6f907903e1b 
>   src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 
>   src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 
> 
> Diff: https://reviews.apache.org/r/30514/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> Ran the new tests 100 times
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>

Reply via email to