Hello,

I'm planning to use the fileStream Spark Streaming API to stream data from
HDFS. My Spark job would essentially process these files and post the
results to an external endpoint.
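For context, here is roughly what I have in mind (a minimal sketch; the
HDFS paths and the postToEndpoint helper are placeholders, and I'm using
textFileStream, the text-file convenience wrapper around fileStream):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object FileStreamJob {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("FileStreamJob")
        val ssc = new StreamingContext(conf, Seconds(30))

        // Monitor the directory and pick up newly created files
        val lines = ssc.textFileStream("hdfs:///data/incoming") // placeholder path

        lines.foreachRDD { rdd =>
          rdd.foreachPartition { records =>
            // Post each processed record to the external endpoint
            records.foreach(postToEndpoint) // placeholder helper
          }
        }

        ssc.start()
        ssc.awaitTermination()
      }

      // Placeholder for the actual HTTP/endpoint client call
      def postToEndpoint(record: String): Unit = ()
    }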

How does the fileStream API handle checkpointing of the files it has
processed? In other words, if my Spark job fails while posting the results
to the external endpoint, I want that same original file to be picked up
and reprocessed.
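For reference, the recovery pattern I'd expect to rely on is metadata
checkpointing plus StreamingContext.getOrCreate, sketched below (the
checkpoint path is a placeholder). My question is whether this actually
causes an already-seen input file to be reprocessed after a failure during
the post:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object RecoverableFileStreamJob {
      val checkpointDir = "hdfs:///checkpoints/filestream-job" // placeholder path

      def createContext(): StreamingContext = {
        val conf = new SparkConf().setAppName("RecoverableFileStreamJob")
        val ssc = new StreamingContext(conf, Seconds(30))
        ssc.checkpoint(checkpointDir) // enable metadata checkpointing
        // ... build the textFileStream pipeline from the sketch above ...
        ssc
      }

      def main(args: Array[String]): Unit = {
        // On restart, getOrCreate rebuilds the context from the checkpoint
        // instead of calling createContext again
        val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }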

Thanks much!
