Hi Cody,

Thanks for your reply.

Is there a way, with the Spark Kafka direct API, to stop updating the
checkpoint if an exception occurs while writing to Cassandra?

That way no messages would be lost: once Cassandra comes back up, we can
resume reading from the point we left off.
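
For reference, here's a minimal sketch of the pattern I have in mind, assuming
the Spark 1.x direct-stream API, with ssc and kafkaParams already in scope, and
hypothetical loadOffsets/saveOffsets helpers backed by a store of your choosing
(ZooKeeper, a Cassandra table, etc.). The stored offsets advance only after the
Cassandra write succeeds:

    import com.datastax.spark.connector._
    import kafka.common.TopicAndPartition
    import kafka.message.MessageAndMetadata
    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils, OffsetRange}

    // Hypothetical helpers: persist offsets in a store you control.
    def loadOffsets(): Map[TopicAndPartition, Long] = ???
    def saveOffsets(ranges: Array[OffsetRange]): Unit = ???

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder,
        StringDecoder, (String, String)](
      ssc, kafkaParams, loadOffsets(),
      (mmd: MessageAndMetadata[String, String]) => (mmd.key, mmd.message))

    stream.foreachRDD { rdd =>
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      rdd.saveToCassandra("my_keyspace", "my_table") // throws if Cassandra is down
      saveOffsets(ranges) // reached only if the write above succeeded
    }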

Regards,
Sam

From: Cody Koeninger [mailto:c...@koeninger.org]
Sent: Thursday, September 10, 2015 1:13 AM
To: Samya MAITI <samya.ma...@amadeus.com>
Cc: user@spark.apache.org
Subject: Re: Spark streaming -> cassandra : Fault Tolerance

It's been a while since I've looked at the Cassandra connector, so I can't give
you specific advice on it.

But in general, if a Spark task fails (uncaught exception), it will be retried
automatically. In the case of the Kafka direct stream RDD, the retry will see
exactly the same messages as the first attempt (as long as they're still in the
Kafka log).

If you or the Cassandra connector catch the exception, the task won't be
retried automatically, and it's up to you to deal with it.
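
A minimal sketch of that distinction, assuming a foreachRDD output action
(keyspace and table names are placeholders):

    stream.foreachRDD { rdd =>
      // Let a failed write propagate: Spark retries the failed tasks
      // automatically (up to spark.task.maxFailures), and the direct
      // stream replays exactly the same Kafka offsets on the retry.
      rdd.saveToCassandra("my_keyspace", "my_table")

      // Catching and swallowing the exception instead (here or inside
      // your own per-partition write code) marks the batch as handled,
      // so those messages are never rewritten unless you deal with it.
    }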



On Wed, Sep 9, 2015 at 2:09 PM, Samya <samya.ma...@amadeus.com> wrote:
Hi Team,

I have a sample Spark application which reads from Kafka using the direct API,
then does some transformation and stores the result to Cassandra (using
saveToCassandra(...)).

If Cassandra goes down, the application logs a NoHostAvailableException (as
expected). But in the meantime the new incoming messages are lost, as the
direct API creates a new checkpoint and deletes the previous ones.
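
For context, a minimal sketch of the pipeline described above (the topic,
keyspace, table, and parse function are placeholders):

    import com.datastax.spark.connector.streaming._
    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    val stream = KafkaUtils.createDirectStream[String, String,
        StringDecoder, StringDecoder](ssc, kafkaParams, Set("my_topic"))

    stream
      .map { case (_, value) => parse(value) }    // some transformation
      .saveToCassandra("my_keyspace", "my_table") // NoHostAvailableException
                                                  // surfaces here when
                                                  // Cassandra is down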

Does that mean I should handle the exception on the application side?

Or is there another hook to handle this?

Thanks in advance.

Regards,
Sam


