-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25035/#review52048
-----------------------------------------------------------


Mind writing a test for this in allocator_tests.cpp?


src/master/hierarchical_allocator_process.hpp
<https://reviews.apache.org/r/25035/#comment90782>

    I suggest deleting this comment altogether, because frameworks can use
offers with either no memory or no cpus, depending on how they split resources
between executors and tasks. Also, change the code to:
    
    ```
    return (cpus.isSome() && cpus.get() >= MIN_CPUS) || 
           (mem.isSome() && mem.get() >= MIN_MEM);
    ```
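
    As a rough, self-contained illustration of how the check above behaves on
the scenario from the review request (1 CPU free, 0 MB memory free), here is a
sketch that contrasts it with a rule requiring both resources. The `Available`
struct, the function names, and the threshold values are assumptions made up
for this example; they are not the Mesos definitions of `Resources`,
`MIN_CPUS`, or `MIN_MEM`.

```cpp
#include <optional>

// Hypothetical thresholds for illustration only; the real MIN_CPUS and
// MIN_MEM live in the Mesos source and may differ.
constexpr double MIN_CPUS = 0.1;
constexpr double MIN_MEM_MB = 32.0;

// Hypothetical stand-in for the unallocated resources on a slave.
// std::nullopt means the resource is absent altogether.
struct Available {
  std::optional<double> cpus;
  std::optional<double> memMB;
};

// Stricter rule: offer only when both cpus and mem reach their minimum.
bool allocatableBoth(const Available& r) {
  return r.cpus.has_value() && *r.cpus >= MIN_CPUS &&
         r.memMB.has_value() && *r.memMB >= MIN_MEM_MB;
}

// Suggested rule: offer when either cpus or mem reaches its minimum.
bool allocatableEither(const Available& r) {
  return (r.cpus.has_value() && *r.cpus >= MIN_CPUS) ||
         (r.memMB.has_value() && *r.memMB >= MIN_MEM_MB);
}
```

    With `Available{1.0, 0.0}`, `allocatableBoth` returns false, so the
scheduler would never see the idle CPU, while `allocatableEither` returns true.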
    
    The important thing to note here is that executors should be launched with
both cpus *and* memory. Mind adding a TODO in ResourceUsageChecker in
master.cpp to that effect and logging a warning? We are adding a TODO and a
warning rather than fixing ResourceUsageChecker right away so that frameworks
(e.g., Spark) have time to update their code to these new semantics. We will
enforce this in the next release. Sounds good?
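
    To make the "warn now, enforce later" idea concrete, here is a minimal
sketch of a warn-only check. It is not the actual ResourceUsageChecker code:
`ExecutorResources` and `checkExecutorResources` are hypothetical names, and
the caller is assumed to log the returned message instead of rejecting the
launch.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the resources attached to an executor.
struct ExecutorResources {
  std::optional<double> cpus;
  std::optional<double> memMB;
};

// Returns a warning message when the executor lacks cpus or mem, and
// std::nullopt otherwise. It never rejects anything; the caller only logs.
// TODO: Turn this warning into a validation error in the next release.
std::optional<std::string> checkExecutorResources(const ExecutorResources& e) {
  const bool hasCpus = e.cpus.has_value() && *e.cpus > 0.0;
  const bool hasMem = e.memMB.has_value() && *e.memMB > 0.0;

  if (hasCpus && hasMem) {
    return std::nullopt;
  }

  // Name exactly the resources that are missing.
  std::string missing;
  if (!hasCpus && !hasMem) {
    missing = "cpus and mem";
  } else if (!hasCpus) {
    missing = "cpus";
  } else {
    missing = "mem";
  }

  return "Executor launched without " + missing +
         "; this will be rejected in a future release";
}
```

    Returning the message rather than logging inside the function keeps the
sketch free of any logging dependency and makes the later switch to a hard
error a change at the call site only.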


- Vinod Kone


On Sept. 2, 2014, 5:52 p.m., Martin Weindel wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25035/
> -----------------------------------------------------------
> 
> (Updated Sept. 2, 2014, 5:52 p.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-1688
>     https://issues.apache.org/jira/browse/MESOS-1688
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> As already explained in JIRA MESOS-1688, some schedulers allocate memory
> only for the executor and not for tasks. In this case, tasks are allocated
> CPU resources only.
> Such a scheduler does not get offered any idle CPUs if the slave has used up
> nearly all of its memory.
> This can easily lead to a deadlock (in the application, not in Mesos).
> 
> Simple example:
> 1. Scheduler allocates all memory of a slave for an executor
> 2. Scheduler launches a task for this executor (allocating 1 CPU)
> 3. Task finishes: 1 CPU, 0 MB memory allocatable.
> 4. No offers are made, as no memory is left. Scheduler will wait for offers
> forever. Deadlock in the application.
> 
> To fix this problem, offers must be made whenever CPU resources are
> allocatable, regardless of allocatable memory.
> 
> 
> Diffs
> -----
> 
>   src/master/hierarchical_allocator_process.hpp 
> 34f8cd658920b36b1062bd3b7f6bfbd1bcb6bb52 
> 
> Diff: https://reviews.apache.org/r/25035/diff/
> 
> 
> Testing
> -------
> 
> Deployed patched Mesos 0.19.1 on a small cluster with 3 slaves and tested
> running multiple parallel Spark jobs in "fine-grained" mode to saturate
> allocatable memory. The jobs run fine now. With the unpatched Mesos, this
> load always caused a deadlock in all Spark jobs within one minute.
> 
> 
> Thanks,
> 
> Martin Weindel
> 
>
