Re: Spark (Streaming?) holding on to Mesos resources

2015-01-29 Thread Gerard Maas
Thanks a lot. After reading MESOS-1688, I still don't understand how or why a job would hoard and hold on to so many resources, even in the presence of that bug. Looking at the release notes, I think this ticket could be relevant to preventing the behavior we're seeing: [MESOS-186] - Resource offers

Re: Spark (Streaming?) holding on to Mesos resources

2015-01-27 Thread Tim Chen
Hi Gerard, As others have mentioned, I believe you're hitting MESOS-1688. Can you upgrade to the latest Mesos release (0.21.1) and let us know if it resolves your problem? Thanks, Tim On Tue, Jan 27, 2015 at 10:39 AM, Sam Bessalah samkiller@gmail.com wrote: Hi Gerard, isn't this the same

Re: Spark (Streaming?) holding on to Mesos resources

2015-01-27 Thread Sam Bessalah
Hi Gerard, isn't this the same issue as this? https://issues.apache.org/jira/browse/MESOS-1688 On Mon, Jan 26, 2015 at 9:17 PM, Gerard Maas gerard.m...@gmail.com wrote: Hi, We are observing with certain regularity that our Spark jobs, running as Mesos frameworks, are hoarding resources and not

Re: Spark (Streaming?) holding on to Mesos Resources

2015-01-27 Thread Adam Bordelon
Hopefully some very bad ugly bug that has been fixed already and that will urge us to upgrade our infra? Mesos 0.20 + Marathon 0.7.4 + Spark 1.1.0 Could be https://issues.apache.org/jira/browse/MESOS-1688 (fixed in Mesos 0.21) On Mon, Jan 26, 2015 at 2:45 PM, Gerard Maas gerard.m...@gmail.com

Spark (Streaming?) holding on to Mesos resources

2015-01-26 Thread Gerard Maas
Hi, We are observing with certain regularity that our Spark jobs, running as Mesos frameworks, are hoarding resources and not releasing them, resulting in resource starvation for all jobs running on the Mesos cluster. For example: This is a job that has spark.cores.max=4 and spark.executor.memory=3g

Spark (Streaming?) holding on to Mesos Resources

2015-01-26 Thread Gerard Maas
(Looks like the list didn't like an HTML table in the previous email. My apologies for any duplicates.) Hi, We are observing with certain regularity that our Spark jobs, running as Mesos frameworks, are hoarding resources and not releasing them, resulting in resource starvation for all jobs running on the

Re: Spark (Streaming?) holding on to Mesos Resources

2015-01-26 Thread Jörn Franke
Hi, What do your jobs do? Ideally post source code, but some description would already be helpful to support you. Memory leaks can have several causes - it may not be Spark at all. Thank you. On 26 Jan 2015 22:28, Gerard Maas gerard.m...@gmail.com wrote: (looks like the list didn't like

Re: Spark (Streaming?) holding on to Mesos Resources

2015-01-26 Thread Gerard Maas
Hi Jörn, A memory leak in the job would be contained within the resources reserved for it, wouldn't it? And the job holding resources is not always the same. Sometimes it's one of the Streaming jobs, sometimes it's a heavy batch job that runs every hour. It looks to me like whatever is causing the