I am fairly new to Spark Streaming and I have a basic question about how it works with an S3 bucket that periodically receives new files, roughly once every 10 minutes. When I use Spark Streaming to process the files in this S3 path, will it process all the files in the path (old and new) on every batch? Or is there a way to make it process only the new files, leaving the already-processed old files in the same path?
Thanks