Hi Aviemzur,

As of now, the Spark workers just run indefinitely in a loop regardless of
whether the data source (Kafka) is active or has lost its connection,
because each receiver only reads ZooKeeper for the offset of the data to be
consumed. So when your DStream receiver is lost, it's lost for good.
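
For context, here is a minimal sketch (Scala, Spark 1.x receiver-based API)
of the setup you describe below; the ZooKeeper quorum, group id, topic, and
stream count are placeholders. Each createStream call plants a receiver on
an executor, and it is exactly that receiver which nothing recreates when
its worker dies:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("kafka-consumer")
val ssc = new StreamingContext(conf, Seconds(10))

// One receiver-based input stream per worker; each receiver is scheduled
// onto an executor and tracks its offsets in ZooKeeper.
val numStreams = 4 // placeholder: the configured number of workers
val kafkaStreams = (1 to numStreams).map { _ =>
  KafkaUtils.createStream(ssc, "zk1:2181", "my-group", Map("my-topic" -> 1))
}

// Union into a single DStream, as in your setup. Note that this code runs
// once at startup; if a receiver's worker later dies, nothing recreates it.
val unified = ssc.union(kafkaStreams)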

As mentioned in the Kafka/Spark Streaming integration tutorial
(http://www.michael-noll.com/blog/2014/10/01/kafka-spark-streaming-integration-example-tutorial/):

"One crude workaround is to restart your streaming application whenever it
runs into an upstream data source failure or a receiver failure. This
workaround may not help you though if your use case requires you to set the
Kafka configuration option auto.offset.reset to “smallest” – because of a
known bug in Spark Streaming the resulting behavior of your streaming
application may not be what you want."
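
In practice, that crude workaround amounts to supervising the driver
yourself. A rough sketch, assuming it is acceptable to rebuild the
StreamingContext from scratch on every failure (createContext is a
hypothetical factory, and the count().print() is just a placeholder output
operation):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Hypothetical factory that rebuilds the context and its Kafka stream.
def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("kafka-consumer")
  val ssc = new StreamingContext(conf, Seconds(10))
  val stream = KafkaUtils.createStream(ssc, "zk1:2181", "my-group", Map("my-topic" -> 1))
  stream.count().print() // placeholder output operation
  ssc
}

// Crude supervision loop: on any failure, tear everything down and restart.
// Per the caveat above, with auto.offset.reset set to "smallest" this may
// not behave as expected because of SPARK-1340.
while (true) {
  val ssc = createContext()
  try {
    ssc.start()
    ssc.awaitTermination()
  } catch {
    case e: Exception =>
      ssc.stop(stopSparkContext = true)
      Thread.sleep(30000) // back off before restarting
  }
}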

The jira "https://spark-project.atlassian.net/browse/SPARK-1340";
corresponding to this bug is yet to be resolved.
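
For reference, auto.offset.reset is passed via the Kafka parameter map in
the more general createStream overload. A minimal sketch, with placeholder
addresses and names:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val ssc = new StreamingContext(
  new SparkConf().setAppName("kafka-consumer"), Seconds(10))

val kafkaParams = Map(
  "zookeeper.connect" -> "zk1:2181",
  "group.id" -> "my-group",
  "auto.offset.reset" -> "smallest" // the option affected by SPARK-1340
)

// Explicitly typed overload that accepts the full Kafka consumer config.
val stream = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Map("my-topic" -> 1), StorageLevel.MEMORY_AND_DISK_SER)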

Also have a look at
http://apache-spark-user-list.1001560.n3.nabble.com/spark-streaming-and-the-spark-shell-td3347.html

Thanks,
Ashwin


On Sun, Jul 26, 2015 at 9:29 AM, aviemzur <aviem...@gmail.com> wrote:

> Hi all,
>
> I have a question about long-running streaming applications and workers
> that act as consumers.
>
> Specifically, my program runs on a Spark standalone cluster with a small
> number of workers acting as Kafka consumers via Spark Streaming.
>
> What I noticed was that in a long-running application, if one of the
> workers dies for some reason and a new worker registers to replace it, we
> have effectively lost that worker as a consumer.
>
> When the driver first runs, I create a configured number of
> KafkaInputDStream instances (in my case, the same number as the number of
> workers in the cluster), and Spark distributes these among the workers, so
> each of my workers consumes from Kafka.
>
> I then unify the streams into a single stream using StreamingContext
> union.
>
> This code never runs again, though, and there is no code that monitors
> that we have X consumers at all times.
>
> So when a worker dies, we effectively lose a consumer and never create a
> new one, and the lag in Kafka starts growing.
>
> Does anybody have a solution / ideas regarding this issue?
>


-- 
Thanks & Regards,
Ashwin Giridharan
