Spark Streaming handling Kafka exceptions

2017-07-17 Thread Jean-Francois Gosselin
How can I handle Kafka errors with my DirectStream (network issues, ZooKeeper or a broker going down)? For example, when the consumer fails to connect to Kafka (at startup) I only get a DEBUG log (not even an ERROR) and no exception is thrown ... I'm using Spark 2.1.1 and
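
For context, a minimal sketch of the setup being asked about (spark-streaming-kafka-0-10 on Spark 2.1.1). The broker address, topic, and group id are placeholders, and the try/catch only surfaces failures that actually propagate to the driver, which, as the question notes, a silent connection failure may not:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object DirectStreamStartup {
      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(new SparkConf().setAppName("kafka-direct"), Seconds(10))

        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "broker1:9092",            // placeholder
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "example-group",           // placeholder
          "auto.offset.reset"  -> "latest")

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

        stream.foreachRDD(rdd => println(s"batch size: ${rdd.count()}"))

        ssc.start()
        try {
          // driver-side failures surface here, not at createDirectStream
          ssc.awaitTermination()
        } catch {
          case e: Exception =>
            // log loudly instead of relying on DEBUG-level client logs
            System.err.println(s"Streaming terminated with: $e")
            ssc.stop(stopSparkContext = true, stopGracefully = false)
            throw e
        }
      }
    }

Since the question notes the connection failure only shows up at DEBUG, dropping the log4j threshold for org.apache.kafka to DEBUG is one way to at least make the failure visible in the logs.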

Re: Kafka Exceptions

2016-06-13 Thread Cody Koeninger
Is the exception on the driver or the executor? To be clear, you're going to see a task fail if a partition changes leader while the task is running, regardless of configuration settings. The task should be retried up to maxFailures, though. What are maxRetries and maxFailures set to? How
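
A quick way to check what those two settings resolve to at runtime (a sketch, assuming an existing SparkContext `sc`; the fallbacks shown are Spark's documented defaults):

    // spark.task.maxFailures: executor-side task retries (default 4)
    println(sc.getConf.get("spark.task.maxFailures", "4"))
    // spark.streaming.kafka.maxRetries: driver-side retries when looking up
    // leader offsets for the direct stream (default 1)
    println(sc.getConf.get("spark.streaming.kafka.maxRetries", "1"))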

Re: Kafka Exceptions

2016-06-13 Thread Bryan Jeffrey
Cody, We already set maxRetries. We're still seeing the issue: when the leader is shifted, for example, it does not appear that the direct stream reader correctly handles this. We're running 1.6.1. Bryan Jeffrey
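
A workaround pattern for this situation, offered here as a sketch rather than anything from the thread: supervise the StreamingContext and rebuild it when the job dies on an unrecoverable fetch error. `createContext` is a hypothetical factory the application would supply (it must reuse the same SparkContext, since only one StreamingContext can be active at a time):

    import org.apache.spark.streaming.StreamingContext

    def runSupervised(createContext: () => StreamingContext): Unit = {
      while (true) {
        val ssc = createContext() // hypothetical factory building the direct stream
        ssc.start()
        try {
          ssc.awaitTermination()
          return // clean shutdown was requested; stop supervising
        } catch {
          case e: Exception =>
            System.err.println(s"Stream died (${e.getMessage}); restarting")
            ssc.stop(stopSparkContext = false, stopGracefully = false)
            Thread.sleep(5000) // back off before reconnecting to Kafka
        }
      }
    }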

Re: Kafka Exceptions

2016-06-13 Thread Cody Koeninger
http://spark.apache.org/docs/latest/configuration.html

spark.streaming.kafka.maxRetries
spark.task.maxFailures
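
In code, both settings go on the SparkConf before the contexts are created; the values below are illustrative, not recommendations from the thread:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("kafka-direct")
      // executor side: how many times a task (e.g. one that hit
      // NotLeaderForPartitionException) is retried before the job fails
      .set("spark.task.maxFailures", "8")
      // driver side: retries when the direct stream looks up leader offsets
      .set("spark.streaming.kafka.maxRetries", "3")

The same pair can be passed at launch with --conf spark.task.maxFailures=8 --conf spark.streaming.kafka.maxRetries=3 on spark-submit.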

Kafka Exceptions

2016-06-13 Thread Bryan Jeffrey
All, We're running a Spark job that is consuming data from a large Kafka cluster using the Direct Stream receiver. We're seeing intermittent NotLeaderForPartitionExceptions when the leader is moved to another broker. Currently, even with retry enabled, we're seeing the job fail on this exception.
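
For reference, a minimal sketch of the kind of setup being described (Spark 1.6.x with the Kafka 0.8 direct stream; the broker list and topic are placeholders, and an existing StreamingContext `ssc` is assumed). Note that `refresh.leader.backoff.ms` is a standard Kafka 0.8 consumer setting; that the connector honors it while re-resolving a moved leader is an assumption, not something confirmed in the thread:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    val kafkaParams = Map[String, String](
      "metadata.broker.list" -> "broker1:9092,broker2:9092", // placeholder
      // assumed: back off before re-resolving the partition leader after
      // a NotLeaderForPartitionException
      "refresh.leader.backoff.ms" -> "2000")

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events")) // "events" is a placeholder topic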