I am fairly new to Spark Streaming and I have a basic question about how it works with an S3 bucket that periodically receives new files, roughly once every 10 minutes. When I use Spark Streaming to process the files in this S3 path, will it process all the files in the path (old and new) on every batch? Or is there a way to make it process only the new files, leaving the already-processed old files in the same path?
Thanks