Re: dynamic allocation w/ spark streaming on mesos?

2015-11-11 Thread PhuDuc Nguyen
Dean, thanks for the reply. I'm searching (via the Spark mailing list archive and Google) and can't find the previous thread you mentioned. I've stumbled upon a few, but they may not be the thread you're referring to. I'm very interested in reading that discussion and any links/keywords would be greatly

Re: dynamic allocation w/ spark streaming on mesos?

2015-11-11 Thread Dean Wampler
Dynamic allocation doesn't work yet with Spark Streaming in any cluster scenario. There was a previous thread on this topic which discusses the issues that need to be resolved. Dean Wampler, Ph.D. Author: Programming Scala, 2nd Edition (O'Reilly)

dynamic allocation w/ spark streaming on mesos?

2015-11-11 Thread PhuDuc Nguyen
I'm trying to get Spark Streaming to scale up/down its number of executors within Mesos based on workload. It's not scaling down. I'm using Spark 1.5.1 reading from Kafka using the direct (receiver-less) approach. Based on this ticket https://issues.apache.org/jira/browse/SPARK-6287 with the
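
For readers following along, a setup like the one described would typically involve settings along the following lines. This is a minimal sketch, not the poster's actual configuration: the property names are standard Spark settings, but every value is an illustrative assumption, and the helper `to_submit_args` is hypothetical. In the Spark 1.5 line, dynamic allocation on Mesos applies to coarse-grained mode and needs the external shuffle service running on each agent.

```python
# Illustrative values only; none of them come from the original message.
DYNAMIC_ALLOCATION_CONF = {
    "spark.mesos.coarse": "true",                    # coarse-grained Mesos mode
    "spark.shuffle.service.enabled": "true",         # external shuffle service
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "2",     # assumed lower bound
    "spark.dynamicAllocation.maxExecutors": "20",    # assumed upper bound
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}


def to_submit_args(conf):
    """Render a settings dict as spark-submit `--conf key=value` arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Prints the flags in a form that can be pasted after `spark-submit`.
    print(" ".join(to_submit_args(DYNAMIC_ALLOCATION_CONF)))
```

The idle timeout is the setting most relevant to the "not scaling down" symptom, since an executor is only released after it has been idle for that long.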

Re: dynamic allocation w/ spark streaming on mesos?

2015-11-11 Thread Saisai Shao
I think that for receiver-less streaming connectors such as the direct Kafka input stream or the HDFS connector, dynamic allocation could work, unlike with receiver-based streaming connectors, because with receiver-less connectors the streaming app behaves more like a normal Spark app, so dynamic

Re: dynamic allocation w/ spark streaming on mesos?

2015-11-11 Thread Saisai Shao
Yeah, agreed. Dynamic allocation would work very well only for some extreme streaming workloads designed to fit its pattern. In normal cases, no executor will remain idle for a long time, so frequently scaling executors up and down will bring large overhead and latency to

Re: dynamic allocation w/ spark streaming on mesos?

2015-11-11 Thread Tathagata Das
The reason the existing dynamic allocation does not work out of the box for Spark Streaming is that the heuristics used for deciding when to scale up/down are not the right ones for micro-batch workloads. It works great for typical batch workloads. However, you can use the underlying developer API
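
The preview is cut off before the API is named, so the following is only an illustration of the point about heuristics, not the approach described in the thread. One possible micro-batch-aware rule compares recent batch processing times with the batch interval instead of looking at executor idle time. The function name `scaling_decision` and both thresholds are hypothetical assumptions. In a real application the result would be passed to executor request/kill calls; the Scala `SparkContext` exposes `requestExecutors` and `killExecutors` as developer API methods.

```python
def scaling_decision(processing_times_s, batch_interval_s,
                     scale_up_ratio=0.9, scale_down_ratio=0.5):
    """Return +1 (add an executor), -1 (remove one), or 0 (no change).

    processing_times_s: processing times of recent batches, in seconds.
    batch_interval_s:   the streaming batch interval, in seconds.
    The two ratio thresholds are assumed values, not tuned recommendations.
    """
    if not processing_times_s:
        return 0  # no completed batches yet, so nothing to base a decision on

    average = sum(processing_times_s) / len(processing_times_s)
    ratio = average / batch_interval_s

    if ratio > scale_up_ratio:
        return 1   # batches barely finish within the interval: scale up
    if ratio < scale_down_ratio:
        return -1  # plenty of headroom: scale down
    return 0
```

For example, with a 10-second batch interval, recent batches averaging 9.65 seconds would trigger a scale-up, while batches averaging 2.5 seconds would trigger a scale-down.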

Re: dynamic allocation w/ spark streaming on mesos?

2015-11-11 Thread PhuDuc Nguyen
Awesome, thanks for the tip! On Wed, Nov 11, 2015 at 2:25 PM, Tathagata Das wrote: > The reason the existing dynamic allocation does not work out of the box for Spark Streaming is that the heuristics used for deciding when to scale up/down are not the right ones for