Yes, auto-restart is enabled in my low-level consumer: the receiver is restarted whenever an unhandled exception occurs.
If you look at KafkaConsumer.java, for some cases (broker failure, Kafka leader changes, etc.) it can even refresh the consumer (the coordinator that talks to a leader), which recovers from those failures.

Dib

On Mon, Mar 16, 2015 at 1:40 PM, Jun Yang <yangjun...@gmail.com> wrote:

> I have checked Dibyendu's code, and it looks like his implementation has an
> auto-restart mechanism:
>
> ------------------------------------------------------------------
> src/main/java/consumer/kafka/client/KafkaReceiver.java:
>
>   private void start() {
>
>     // Start the thread that receives data over a connection
>     KafkaConfig kafkaConfig = new KafkaConfig(_props);
>     ZkState zkState = new ZkState(kafkaConfig);
>     _kConsumer = new KafkaConsumer(kafkaConfig, zkState, this);
>     _kConsumer.open(_partitionId);
>
>     Thread.UncaughtExceptionHandler eh = new Thread.UncaughtExceptionHandler() {
>       public void uncaughtException(Thread th, Throwable ex) {
>         restart("Restarting Receiver for Partition " + _partitionId, ex, 5000);
>       }
>     };
>
>     _consumerThread = new Thread(_kConsumer);
>     _consumerThread.setDaemon(true);
>     _consumerThread.setUncaughtExceptionHandler(eh);
>     _consumerThread.start();
>   }
> ------------------------------------------------------------------
>
> I also checked Spark's native Kafka receiver implementation, and it does not
> appear to have any auto-restart support.
>
> Any comments from Dibyendu?
>
> On Mon, Mar 16, 2015 at 3:39 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
>> As I have seen, once I kill my receiver on one machine, it will automatically
>> spawn another receiver on another machine or on the same machine.
>>
>> Thanks
>> Best Regards
>>
>> On Mon, Mar 16, 2015 at 1:08 PM, Jun Yang <yangjun...@gmail.com> wrote:
>>
>>> Dibyendu,
>>>
>>> Thanks for the reply.
>>>
>>> I am reading your project homepage now.
>>>
>>> One quick question I care about is:
>>>
>>> If the receivers fail for some reason (for example, killed brutally by
>>> someone else), is there any mechanism for the receivers to fail over
>>> automatically?
>>>
>>> On Mon, Mar 16, 2015 at 3:25 PM, Dibyendu Bhattacharya <
>>> dibyendu.bhattach...@gmail.com> wrote:
>>>
>>>> Which version of Spark are you running?
>>>>
>>>> You can try this low-level consumer:
>>>> http://spark-packages.org/package/dibbhatt/kafka-spark-consumer
>>>>
>>>> It is designed to recover from various failures and has a very good
>>>> fault-recovery mechanism built in. It is used by many users, and at
>>>> present we at Pearson have been running this receiver in production
>>>> for almost 3 months without any issue.
>>>>
>>>> You can give it a try.
>>>>
>>>> Regards,
>>>> Dibyendu
>>>>
>>>> On Mon, Mar 16, 2015 at 12:47 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>>>
>>>>> You need to figure out why the receivers failed in the first place.
>>>>> Look in your worker logs and see what really happened. When you run a
>>>>> streaming job continuously for a longer period, there will usually be
>>>>> a lot of logs (you can enable log rotation, etc.), and if you are doing
>>>>> groupBy, join, etc. types of operations, there will be a lot of shuffle
>>>>> data. So check the worker logs and see what happened (whether the disk
>>>>> was full, etc.). We have streaming pipelines running for weeks without
>>>>> any issues.
>>>>>
>>>>> Thanks
>>>>> Best Regards
>>>>>
>>>>> On Mon, Mar 16, 2015 at 12:40 PM, Jun Yang <yangjun...@gmail.com> wrote:
>>>>>
>>>>>> Guys,
>>>>>>
>>>>>> We have a project that builds upon Spark Streaming.
>>>>>>
>>>>>> We use Kafka as the input stream and create 5 receivers.
>>>>>>
>>>>>> When this application had been running for around 90 hours, all 5
>>>>>> receivers failed for some unknown reason.
>>>>>>
>>>>>> In my understanding, it is not guaranteed that Spark Streaming
>>>>>> receivers will do fault recovery automatically.
>>>>>>
>>>>>> So I just want to figure out a way to do fault recovery to deal
>>>>>> with receiver failure.
>>>>>>
>>>>>> There is a JIRA comment that mentions using StreamingListener to
>>>>>> monitor the status of the receivers:
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/SPARK-2381?focusedCommentId=14056836&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14056836
>>>>>>
>>>>>> However, I haven't found any open doc about how to do this.
>>>>>>
>>>>>> Has anyone met the same issue and dealt with it?
>>>>>>
>>>>>> Our environment:
>>>>>> Spark 1.3.0
>>>>>> Dual Master Configuration
>>>>>> Kafka 0.8.2
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> --
>>>>>> yangjun...@gmail.com
>>>>>> http://hi.baidu.com/yjpro
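The auto-restart the thread discusses boils down to the Thread.UncaughtExceptionHandler pattern shown in the quoted KafkaReceiver.start(): when the consumer thread dies with an unhandled exception, the handler spawns a fresh one. Here is a minimal, self-contained sketch of that pattern in plain Java. The class and constant names (RestartingWorker, MAX_RESTARTS) are illustrative, the simulated failures stand in for real receiver errors, and the 5000 ms restart delay used by the actual receiver is omitted for brevity:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class RestartingWorker {
    static final int MAX_RESTARTS = 3;                 // fail this many times before succeeding
    static final AtomicInteger attempts = new AtomicInteger();
    static final CountDownLatch done = new CountDownLatch(1);

    static void startWorker() {
        Thread worker = new Thread(() -> {
            int n = attempts.incrementAndGet();
            if (n <= MAX_RESTARTS) {
                // Simulate the kind of unhandled failure a receiver might hit.
                throw new RuntimeException("simulated receiver failure #" + n);
            }
            done.countDown();                          // this run "succeeded"
        });
        worker.setDaemon(true);
        // Same idea as the handler in KafkaReceiver: on any uncaught
        // exception, log it and start a fresh worker thread.
        worker.setUncaughtExceptionHandler((t, ex) -> {
            System.err.println("Restarting worker after: " + ex.getMessage());
            startWorker();
        });
        worker.start();
    }

    public static void main(String[] args) throws InterruptedException {
        startWorker();
        done.await();                                  // block until a run succeeds
        System.out.println("succeeded after " + attempts.get() + " attempts");
        // prints "succeeded after 4 attempts"
    }
}
```

Because the handler runs on the dying thread itself, each restart creates a brand-new Thread object; the real receiver additionally delays before restarting so a persistently failing broker does not cause a tight restart loop.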