-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44258/#review121643
-----------------------------------------------------------




src/master/http.cpp (lines 1999 - 2000)
<https://reviews.apache.org/r/44258/#comment183392>

    I think we can just remove `master->updateUnavailability(id, updated[id]);` 
here, so other machine will `UP` and scheduler will be updated in next loop.
    
    Futhur more, we can avoid the loop for update. The draft code will be:
    
    ```
    hashmap<MachineID, Unavailability> updated;
    
    // delete the loop for "updated[id] = window.unavailability();"
    
    foreach (const mesos::maintenance::Window& window, schedule.windows()) {
      foreach (const MachineID& id, window.machine_ids()) {
        ...
        master->updateUnavailability(id, window.unavailability());
        updated[id] = window.unavailability();
      }
    }
    
    foreachkey (const MachineID& id, utils::copy(master->machines)) {
      if (updated.contains(id)) {
        continue;
      }
    
      // Transition each removed machine back to the `UP` mode and remove the
      // unavailability.
      master->machines[id].info.set_mode(MachineInfo::UP);
      master->updateUnavailability(id, None());
    }
    ```


- Klaus Ma


On March 2, 2016, 3:45 p.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44258/
> -----------------------------------------------------------
> 
> (Updated March 2, 2016, 3:45 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar, Joris Van Remoortere, and Joseph Wu.
> 
> 
> Bugs: MESOS-4831
>     https://issues.apache.org/jira/browse/MESOS-4831
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> There is a bug when setting host maintain with http endpoint: 
> https://github.com/apache/mesos/blob/master/src/master/http.cpp#L1987-L2021
> The logic is as this:
> 1) Get all host list from maintain window and put it to updated hashmap.
> 2) If the machine in was in updated was also in master->machines, call master 
> updateUnavailability to trigger recoverResources, updateUnavailability etc in 
> allocator
> 3) Otherwise, clear the unavailabity time window for the machine.
> 4) Update each new machines in updated to call master updateUnavailability
> 
> But the logic in step 4) is getting all machines from the schedule windows 
> but not the machines that is new to the cluster, this caused master get two 
> updateUnavailability calls for a machine in the updated hashmap.
> 
> The fix is filter machines in updated hashmap when handling new machines.
> 
> 
> Diffs
> -----
> 
>   src/master/http.cpp 5e9e28e904ba0045ee27eb828f47231632a91d74 
>   src/tests/master_maintenance_tests.cpp 
> 3faa8136cf57276295553910319480028f433e4c 
> 
> Diff: https://reviews.apache.org/r/44258/diff/
> 
> 
> Testing
> -------
> 
> make
> make check
>  ./bin/mesos-tests.sh --gtest_filter="MasterMaintenanceTest.*" --verbose
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>

Reply via email to