-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25035/#review52550
-----------------------------------------------------------

Patch looks great!

Reviews applied: [25035]

All tests passed.

- Mesos ReviewBot


On Sept. 6, 2014, 10:03 p.m., Martin Weindel wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25035/
> -----------------------------------------------------------
>
> (Updated Sept. 6, 2014, 10:03 p.m.)
>
>
> Review request for mesos and Vinod Kone.
>
>
> Bugs: MESOS-1688
>     https://issues.apache.org/jira/browse/MESOS-1688
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> As already explained in JIRA MESOS-1688, some schedulers allocate memory
> only for the executor and not for its tasks. In this case, tasks are
> allocated CPU resources only.
> Such a scheduler is not offered any idle CPUs once nearly all memory on
> the slave is in use.
> This can easily lead to a deadlock (in the application, not in Mesos).
>
> Simple example:
> 1. Scheduler allocates all memory of a slave for an executor.
> 2. Scheduler launches a task for this executor (allocating 1 CPU).
> 3. Task finishes: 1 CPU, 0 MB memory allocatable.
> 4. No offers are made, as no memory is left. Scheduler will wait for offers
>    forever. Deadlock in the application.
>
> To fix this problem, offers must be made whenever CPU resources are
> allocatable, without considering allocatable memory.
>
>
> Diffs
> -----
>
>   src/common/resources.cpp edf36b1
>   src/master/constants.hpp ce7995b
>   src/master/constants.cpp faa1503
>   src/master/hierarchical_allocator_process.hpp 34f8cd6
>   src/master/master.cpp 18464ba
>   src/tests/allocator_tests.cpp 774528a
>
> Diff: https://reviews.apache.org/r/25035/diff/
>
>
> Testing
> -------
>
> Deployed patched Mesos 0.19.1 on a small cluster with 3 slaves and tested
> running multiple parallel Spark jobs in "fine-grained" mode to saturate
> allocatable memory. The jobs run fine now.
> With the unpatched Mesos, this load always caused a deadlock in all Spark
> jobs within one minute.
>
>
> Thanks,
>
> Martin Weindel
>
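
For readers following the thread without the diff open, the change in offer eligibility described in the request can be sketched roughly as below. This is only an illustration, not the code from the patch: the struct, the function names and the threshold values are all hypothetical, and treating free memory alone as sufficient for an offer is an assumption (the description only states the CPU case).

```cpp
#include <cassert>

// Hypothetical stand-in for the unallocated resources on one slave.
struct FreeResources {
  double cpus;
  double memMB;
};

// Illustrative thresholds only; the values used by the actual patch
// are not stated in this thread.
constexpr double kMinCpus = 0.1;
constexpr double kMinMemMB = 32.0;

// Behaviour described as the problem: an offer needs BOTH enough CPU
// and enough memory, so idle CPUs are never offered once memory is gone.
inline bool allocatableRequiringBoth(const FreeResources& free) {
  assert(free.cpus >= 0.0 && free.memMB >= 0.0);
  return free.cpus >= kMinCpus && free.memMB >= kMinMemMB;
}

// Relaxed rule in the spirit of the request: free CPU is enough for an
// offer regardless of free memory. Accepting memory alone as well is an
// assumption of this sketch.
inline bool allocatableEitherResource(const FreeResources& free) {
  assert(free.cpus >= 0.0 && free.memMB >= 0.0);
  return free.cpus >= kMinCpus || free.memMB >= kMinMemMB;
}
```

With the state from step 3 of the example (1 CPU and 0 MB free), the first check rejects the slave, so no offer is made and the scheduler waits forever; the second check accepts it, so the idle CPU is offered again.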