Hello,

I'm planning to use the fileStream Spark Streaming API to stream data from
HDFS. My Spark job would essentially process these files and post the
results to an external endpoint.
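For context, here is roughly what I have in mind (a minimal sketch; the
HDFS paths and the postToEndpoint helper are placeholders, and I'm using
textFileStream, the text-file convenience wrapper around fileStream):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object FileStreamJob {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("FileStreamJob")
        val ssc = new StreamingContext(conf, Seconds(30))

        // Monitor the directory and pick up newly created files
        val lines = ssc.textFileStream("hdfs:///data/incoming") // placeholder path

        lines.foreachRDD { rdd =>
          rdd.foreachPartition { records =>
            // Post each processed record to the external endpoint
            records.foreach(postToEndpoint) // placeholder helper
          }
        }

        ssc.start()
        ssc.awaitTermination()
      }

      // Placeholder for the actual HTTP/endpoint client call
      def postToEndpoint(record: String): Unit = ()
    }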

How does the fileStream API handle checkpointing of the files it has
processed? In other words, if my Spark job fails while posting the results
to the external endpoint, I want that same original file to be picked up
and reprocessed.
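For reference, the recovery pattern I'd expect to rely on is metadata
checkpointing plus StreamingContext.getOrCreate, sketched below (the
checkpoint path is a placeholder). My question is whether this actually
causes an already-seen input file to be reprocessed after a failure during
the post:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object RecoverableFileStreamJob {
      val checkpointDir = "hdfs:///checkpoints/filestream-job" // placeholder path

      def createContext(): StreamingContext = {
        val conf = new SparkConf().setAppName("RecoverableFileStreamJob")
        val ssc = new StreamingContext(conf, Seconds(30))
        ssc.checkpoint(checkpointDir) // enable metadata checkpointing
        // ... build the textFileStream pipeline from the sketch above ...
        ssc
      }

      def main(args: Array[String]): Unit = {
        // On restart, getOrCreate rebuilds the context from the checkpoint
        // instead of calling createContext again
        val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }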

Thanks much!
