Dean,

Thanks for the reply. I'm searching (via spark mailing list archive and
google) and can't find the previous thread you mentioned. I've stumbled
upon a few but may not be the thread you're referring to. I'm very
interested in reading that discussion and any links/keywords would be
greatly appreciated.

I can see it's a non-trivial problem to solve for every use case in
streaming and thus not yet supported in general. However, I think (maybe
naively) it can be solved for specific use cases. If I use the available
features to create a fault tolerant design - i.e. failures/dead nodes can
occur on master nodes, driver node, or executor nodes without data loss and
"at-least-once" semantics is acceptable - then can't I safely scale down in
streaming by killing executors? If this is not true, then are we saying
that streaming is not fault tolerant?

I know it won't be as simple as setting a config like
spark.dyanmicAllocation.enabled=true and magically we'll have elastic
streaming, but I'm interested if anyone else has attempted to solve this
for their specific use case with extra coding involved? Pitfalls? Thoughts?

thanks,
Duc




On Wed, Nov 11, 2015 at 8:36 AM, Dean Wampler <deanwamp...@gmail.com> wrote:

> Dynamic allocation doesn't work yet with Spark Streaming in any cluster
> scenario. There was a previous thread on this topic which discusses the
> issues that need to be resolved.
>
> Dean Wampler, Ph.D.
> Author: Programming Scala, 2nd Edition
> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> Typesafe <http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com
>
> On Wed, Nov 11, 2015 at 8:09 AM, PhuDuc Nguyen <duc.was.h...@gmail.com>
> wrote:
>
>> I'm trying to get Spark Streaming to scale up/down its number of
>> executors within Mesos based on workload. It's not scaling down. I'm using
>> Spark 1.5.1 reading from Kafka using the direct (receiver-less) approach.
>>
>> Based on this ticket https://issues.apache.org/jira/browse/SPARK-6287
>> with the right configuration, I have a simple example working with the
>> spark-shell connected to a Mesos cluster. By working I mean the number of
>> executors scales up/down based on workload. However, the spark-shell is not
>> a streaming example.
>>
>> What is that status of dynamic resource allocation with Spark Streaming
>> on Mesos? Is it supported at all? Or supported but with some caveats to
>> ensure no data loss?
>>
>> thanks,
>> Duc
>>
>
>

Reply via email to