-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25035/
-----------------------------------------------------------

(Updated Sept. 17, 2014, 6:36 p.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Updated the summary. Also edited the CHANGELOG to point to a new ticket regarding deprecation. I'll commit this now.


Summary (updated)
-----------------

Updated allocator to offer CPU-only or memory-only resources.


Bugs: MESOS-1688
    https://issues.apache.org/jira/browse/MESOS-1688


Repository: mesos-git


Description
-------

As already explained in JIRA MESOS-1688, some schedulers allocate memory only for the executor and not for its tasks. In this case, tasks are allocated CPU resources only.
Such a scheduler is not offered any idle CPUs once nearly all of the slave's memory is in use.
This can easily lead to a deadlock (in the application, not in Mesos).

Simple example:
1. Scheduler allocates all memory of a slave for an executor.
2. Scheduler launches a task for this executor (allocating 1 CPU).
3. Task finishes: 1 CPU, 0 MB memory allocatable.
4. No offers are made, as no memory is left. The scheduler will wait for offers forever. Deadlock in the application.

To fix this problem, offers must be made whenever CPU resources are allocatable, regardless of how much memory is allocatable.


Diffs
-----

  CHANGELOG a822cc4
  src/common/resources.cpp edf36b1
  src/master/constants.cpp faa1503
  src/master/hierarchical_allocator_process.hpp 34f8cd6
  src/master/master.cpp 18464ba
  src/tests/allocator_tests.cpp 774528a

Diff: https://reviews.apache.org/r/25035/diff/


Testing
-------

Deployed patched Mesos 0.19.1 on a small cluster with 3 slaves and tested running multiple parallel Spark jobs in "fine-grained" mode to saturate allocatable memory. The jobs run fine now. With unpatched Mesos, this load always caused a deadlock in all Spark jobs within one minute.


Thanks,

Martin Weindel
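
For illustration, the offer condition discussed in the description can be sketched roughly as below. This is a minimal standalone sketch, not the code from the diff: the `FreeResources` struct, the function names, and the two threshold constants are hypothetical and chosen only to contrast "CPU and memory" with "CPU or memory".

```cpp
// Illustrative sketch only. All names and thresholds here are
// assumptions made for this example, not taken from the patch.

// Unallocated resources remaining on one slave (hypothetical type).
struct FreeResources {
  double cpus;   // idle CPUs
  double memMB;  // idle memory in MB
};

// Assumed minimums below which a resource is not worth offering.
constexpr double kMinCpus = 0.01;
constexpr double kMinMemMB = 32.0;

// Behavior described as the problem: an offer needs BOTH enough
// CPU and enough memory, so idle CPUs are never offered once
// memory is used up.
constexpr bool allocatableBefore(const FreeResources& r) {
  return r.cpus >= kMinCpus && r.memMB >= kMinMemMB;
}

// Behavior described as the fix: an offer is made if EITHER
// enough CPU or enough memory is available.
constexpr bool allocatableAfter(const FreeResources& r) {
  return r.cpus >= kMinCpus || r.memMB >= kMinMemMB;
}
```

Applied to step 3 of the example (1 CPU and 0 MB free), `allocatableBefore` yields false, so no offer is made, while `allocatableAfter` yields true and the scheduler receives the idle CPU.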