Re: Framework Starvation

2014-06-17 Thread Claudiu Barbura
Hi Vinod, You are looking at logs I had posted before we implemented our fix (files attached in my last email). I will write a detailed blog post on the issue … after the Spark Summit at the end of this month. What would happen before is that frameworks with the same share (0) would also have th

Re: Framework Starvation

2014-06-17 Thread Vinod Kone
Hey Claudiu, I spent some time trying to understand the logs you posted. What's strange to me is that in the very beginning, when frameworks 1 and 2 are registered, only one framework gets offers for a period of 9s. It's not clear why this happens. I even wrote a test ( https://reviews.apache.org/r

Re: cgroups memory isolation

2014-06-17 Thread Ian Downes
Hello Thomas, Your impression is mostly correct: the kernel will *try* to reclaim memory by writing out dirty pages before killing processes in a cgroup but if it's unable to reclaim sufficient pages within some interval (I don't recall this off-hand) then it will start killing things. We observe

Re: cgroups memory isolation

2014-06-17 Thread Benjamin Mahler
+Ian Downes who is knowledgeable about the OOMing behavior in the kernel. From your logs, it doesn't look related to MESOS-762, which was a bug in 0.14.0. On Tue, Jun 17, 2014 at 2:23 PM, Thomas Petr wrote: > Hello, > > We're running Mesos 0.18.0 with cgroups isolation, and have run into > situ

cgroups memory isolation

2014-06-17 Thread Thomas Petr
Hello, We're running Mesos 0.18.0 with cgroups isolation, and have run into situations where lots of file I/O causes tasks to be killed due to exceeding memory limits. Here's an example: https://gist.github.com/tpetr/ce5d80a0de9f713765f0 We were under the impression that if cache was using a lot

Re: ZooKeeper C++ Client

2014-06-17 Thread Yan Xu
Hi Scott, Yes, Apache Mesos uses ZooKeeper for leader election and membership detection. You can take a look at https://github.com/apache/mesos/tree/master/src/zookeeper to see if it suits your needs and send your questions to user@mesos.apache.org. Even though we don't currently distribute our zoo