I am fairly new to Spark Streaming, and I have a basic question about how it
works with an S3 bucket that periodically receives new files, roughly once
every 10 minutes.
When I use Spark Streaming to process the files in this S3 path, will it
reprocess all the files in the path (old + new) on every batch, or is there
a way to make it process only the new files, leaving the already processed
old files in the same path?
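For reference, here is a minimal sketch of the kind of job I mean, using the Structured Streaming file source (the bucket name, schema, and checkpoint path below are placeholders, not my real values):

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.appName("s3-file-stream").getOrCreate()

# A schema must be supplied up front for streaming file sources.
schema = StructType([StructField("value", StringType())])

# Read the S3 path as a stream of JSON files.
stream = (spark.readStream
          .schema(schema)
          .json("s3a://my-bucket/incoming/"))

# Write out as Parquet; the checkpoint location is where Spark keeps
# its record of which input files it has already processed.
query = (stream.writeStream
         .format("parquet")
         .option("path", "s3a://my-bucket/output/")
         .option("checkpointLocation", "s3a://my-bucket/checkpoints/job1")
         .start())

query.awaitTermination()
```

My understanding is that the checkpoint is what lets Spark track already-seen files, but I would like to confirm whether that means old files in the same path are skipped on later batches.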

Thanks
