Thanks for the info, guys. For now I'm using the high-level consumer; I will give this one a try.
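For reference, my current high-level-consumer setup looks roughly like the sketch below (the app name, ZK quorum, group id, topic, and checkpoint path are placeholders, not our real values):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf()
      .setAppName("kafka-hbase-streaming")  // placeholder app name
      // cap each receiver at 10 records/sec (the rate limit I mention below)
      .set("spark.streaming.receiver.maxRate", "10")
      // the WAL Akhil mentions below would be turned on with:
      // .set("spark.streaming.receiver.writeAheadLog.enable", "true")

    val ssc = new StreamingContext(conf, Seconds(60))   // 1 min batches
    ssc.checkpoint("hdfs:///tmp/spark/checkpoints")     // placeholder path

    // high-level (ZooKeeper-based) consumer; offsets are auto-committed to
    // ZK, which is why in-flight data can be lost when a receiver dies
    val stream = KafkaUtils.createStream(
      ssc,
      "zk1:2181,zk2:2181",    // placeholder ZK quorum
      "my-consumer-group",    // placeholder group id
      Map("my-topic" -> 1))   // placeholder topic -> receiver threads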
As far as the queries are concerned, checkpointing helps. I'm still not sure what's the best way to gracefully stop the application in yarn-cluster mode; the closest I've come is the sketch below.
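One idea I'm experimenting with (not yet verified in yarn-cluster mode, and the marker path is made up): have the driver poll HDFS for a marker file instead of blocking on awaitTermination, and stop the context gracefully when the marker appears:

    import org.apache.hadoop.fs.{FileSystem, Path}

    val shutdownMarker = new Path("/tmp/stop-streaming-app")  // hypothetical marker file

    ssc.start()
    var stopRequested = false
    while (!stopRequested) {
      Thread.sleep(10000)  // poll every 10s
      val fs = FileSystem.get(ssc.sparkContext.hadoopConfiguration)
      stopRequested = fs.exists(shutdownMarker)
    }
    // wait for received/queued batches to finish so the YARN app ends as
    // SUCCEEDED and keeps its logs & history
    ssc.stop(stopSparkContext = true, stopGracefully = true)

Creating the marker (e.g. hadoop fs -touchz /tmp/stop-streaming-app) would then trigger the graceful stop, instead of yarn application -kill.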
On 5 Feb 2015 09:38, "Dibyendu Bhattacharya" <dibyendu.bhattach...@gmail.com> wrote:

> Thanks Akhil for mentioning this low-level consumer
> (https://github.com/dibbhatt/kafka-spark-consumer). Yes, it has a better
> fault-tolerance mechanism than any existing Kafka consumer available. It
> has no data loss on receiver failure and can replay or restart itself in
> case of failure. You can definitely give it a try.
>
> Dibyendu
>
> On Thu, Feb 5, 2015 at 1:04 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
>> AFAIK, from Spark 1.2.0 you can have WAL (Write Ahead Logs) for fault
>> tolerance, which means it can handle receiver/driver failures. You can
>> also look at the low-level Kafka consumer
>> <https://github.com/dibbhatt/kafka-spark-consumer>, which has a better
>> fault-tolerance mechanism for receiver failures. This low-level consumer
>> pushes the offset of the message being read into ZooKeeper for fault
>> tolerance. In your case I think mostly the "in-flight data" would be
>> lost if you aren't using any of the fault-tolerance mechanisms.
>>
>> Thanks
>> Best Regards
>>
>> On Wed, Feb 4, 2015 at 5:24 PM, Mukesh Jha <me.mukesh....@gmail.com> wrote:
>>
>>> Hello Sparkans,
>>>
>>> I'm running a Spark Streaming app which reads data from a Kafka topic,
>>> does some processing, and then persists the results in HBase.
>>>
>>> I am using Spark 1.2.0 running on a YARN cluster with 3 executors
>>> (2 GB, 8 cores each). I've enabled checkpointing, and I am also
>>> rate-limiting my kafkaReceivers so that the number of items read is not
>>> more than 10 records per sec.
>>> The kafkaReceiver I'm using is *not* ReliableKafkaReceiver.
>>>
>>> This app was running fine for ~3 days; then there was an increased
>>> load on the HBase server because some other process was querying HBase
>>> tables. This led to an increase in the batch processing time of the
>>> Spark batches (a 1 min batch that previously finished in 20 sec took
>>> 10 min), which in turn led to the shutdown of the Spark application.
>>> PFA the executor logs.
>>>
>>> From the logs I'm getting the exceptions *[1]* and *[2]* below. It
>>> looks like there were some outstanding jobs that didn't get processed,
>>> or the job couldn't find the input data. It also seems that the
>>> shutdown hook gets invoked but cannot process the in-flight blocks.
>>>
>>> I have a couple of queries on this:
>>> 1) Does this mean that these jobs failed and the *in-flight data* is
>>> lost?
>>> 2) Does the Spark job *buffer the Kafka* input data while a job is in
>>> the processing state for 10 min, and on shutdown is that lost too? (I
>>> do not see any OOM errors in the logs.)
>>> 3) Can we have *explicit commits* enabled in the kafkaReceiver so that
>>> the offsets get committed only when the RDD(s) are successfully
>>> processed?
>>>
>>> Also I'd like to know if there is a *graceful way to shut down a Spark
>>> app running on YARN*. Currently I'm killing the YARN app to stop it,
>>> which leads to the loss of that job's history, whereas a graceful stop
>>> would let the application finish and succeed, preserving the logs &
>>> history.
>>>
>>> *[1]* 15/02/02 19:30:11 ERROR client.TransportResponseHandler: Still
>>> have 1 requests outstanding when connection from
>>> hbase28.usdc2.cloud.com/10.193.150.221:43189 is closed
>>> *[2]* java.lang.Exception: Could not compute split, block
>>> input-2-1422901498800 not found
>>> *[3]* org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>>> No lease on /tmp/spark/realtime-failover/msg_2378481654720966.avro
>>> (inode 879488): File does not exist. Holder
>>> DFSClient_NONMAPREDUCE_-148264920_63 does not have any open files.
>>>
>>> --
>>> Thanks & Regards,
>>>
>>> *Mukesh Jha <me.mukesh....@gmail.com>*
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org