Spark Kinesis Checkpointing/Processing Delay

2015-08-10 Thread Phil Kallos
Hi! Sorry if this is a repost. I'm using Spark + Kinesis ASL to process and persist stream data to ElasticSearch. For the most part it works nicely. There is a subtle issue I'm running into about how failures are handled. For example's sake, let's say I am processing a Kinesis stream that

Re: Spark Kinesis Checkpointing/Processing Delay

2015-08-10 Thread Tathagata Das
You are correct. The earlier Kinesis receiver (as of Spark 1.4) was not saving checkpoints correctly and was in general not reliable (even with WAL enabled). We have improved this in Spark 1.5 with an updated Kinesis receiver that keeps track of the Kinesis sequence numbers as part of the Spark

Re: Spark Kinesis Checkpointing/Processing Delay

2015-08-06 Thread Patanachai Tangchaisin
is successfully written to ES. Thanks, Phil -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Kinesis-Checkpointing-Processing-Delay-tp24157.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Spark Kinesis Checkpointing/Processing Delay

2015-08-06 Thread phibit
would be able to configure the process to submit Kinesis checkpoints only after my data is successfully written to ES. Thanks, Phil
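
The behaviour asked for in this thread (checkpoint only after the sink write succeeds) can be illustrated with a small, library-free sketch. This is not the Spark or Kinesis ASL API; `CheckpointStore`, `process_batches`, and the `write_to_sink` callback are hypothetical names invented for illustration, and the in-memory store merely stands in for a durable one. It shows the at-least-once pattern only: a failed write leaves the checkpoint untouched, so a restart re-reads the unwritten records.

```python
from typing import Callable, List, Optional, Tuple

Record = Tuple[int, str]  # (sequence number, payload)


class CheckpointStore:
    """Hypothetical in-memory stand-in for a durable checkpoint store."""

    def __init__(self) -> None:
        self.last_seq: Optional[int] = None

    def commit(self, seq: int) -> None:
        self.last_seq = seq


def process_batches(
    records: List[Record],
    write_to_sink: Callable[[List[Record]], None],
    store: CheckpointStore,
    batch_size: int = 2,
) -> None:
    """Write each batch to the sink, then checkpoint it.

    If the sink raises, the exception propagates and the checkpoint
    keeps pointing at the last batch that was written successfully.
    """
    # Resume after the last committed sequence number, if there is one.
    pending = [
        r for r in records
        if store.last_seq is None or r[0] > store.last_seq
    ]
    for i in range(0, len(pending), batch_size):
        batch = pending[i:i + batch_size]
        write_to_sink(batch)        # may raise; the commit below is then skipped
        store.commit(batch[-1][0])  # reached only after a successful write
```

Calling `process_batches` again with the same store after a failure replays only the records past the last committed sequence number. Because a batch can be partly written before the sink fails, the sink would need to tolerate duplicates (for example by using the sequence number as a document ID, which is an assumption and not something stated in the thread).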