Hi! Sorry if this is a repost.
I'm using Spark + Kinesis ASL to process and persist stream data to
ElasticSearch. For the most part it works nicely.
There is a subtle issue I'm running into with how failures are handled.
For example's sake, let's say I am processing a Kinesis stream that
You are correct. The earlier Kinesis receiver (as of Spark 1.4) was not
saving checkpoints correctly and was, in general, not reliable (even with the
WAL enabled). We have improved this in Spark 1.5 with an updated Kinesis
receiver that keeps track of the Kinesis sequence numbers as part of the Spark
is successfully written to ES.
Thanks,
Phil
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Kinesis-Checkpointing-Processing-Delay-tp24157.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
would be able to configure the process to submit Kinesis
checkpoints only after my data is successfully written to ES.
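
To make the requested ordering concrete, here is a minimal, library-free sketch in Python of "persist first, checkpoint second". It does not use the Spark or Kinesis Client Library APIs: process_batch, write_to_es, checkpoint_store, and the sequence numbers are all hypothetical names and values invented for illustration, so treat it as a model of the idea rather than a description of how the receiver works.

```python
from typing import Callable, Dict, List, Tuple

# (sequence_number, payload); both values are made up for this sketch.
Record = Tuple[str, str]


def process_batch(
    shard_id: str,
    records: List[Record],
    write_to_es: Callable[[List[str]], None],
    checkpoint_store: Dict[str, str],
) -> bool:
    """Write a batch, then advance the shard checkpoint only on success.

    Returns True if the batch was persisted and the checkpoint advanced.
    """
    if not records:
        return True  # nothing to write, nothing to checkpoint
    try:
        write_to_es([payload for _, payload in records])
    except Exception:
        # The sink failed: leave the checkpoint untouched so the same
        # records are read again and retried later.
        return False
    # The sink succeeded: record the last sequence number of this batch.
    checkpoint_store[shard_id] = records[-1][0]
    return True


def failing_sink(payloads: List[str]) -> None:
    """Hypothetical sink that simulates an outage."""
    raise IOError("simulated Elasticsearch outage")


def working_sink(payloads: List[str]) -> None:
    """Hypothetical stand-in for a real bulk index call."""
    return None


checkpoints: Dict[str, str] = {}
batch = [("seq-001", "a"), ("seq-002", "b")]

# First attempt fails, so no checkpoint is recorded.
failed = process_batch("shard-0", batch, failing_sink, checkpoints)
checkpoints_after_failure = dict(checkpoints)

# Retry succeeds, so the checkpoint moves to the end of the batch.
succeeded = process_batch("shard-0", batch, working_sink, checkpoints)
```

With this ordering, a crash between the write and the checkpoint means the batch is replayed, so the result is at-least-once delivery. The sink would need to tolerate duplicates, for example by indexing each record under a deterministic document ID.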