What is the best practice for processing files from an S3 bucket in Spark file
streaming? I keep receiving new files in an S3 path and have to process them in
batches, but more files may arrive while a batch is still being processed. In
this streaming job, do I have to move files to another location at the end of
each streaming batch, or is there another way to handle this?
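To make the setup concrete, here is a minimal sketch of the kind of job I mean, using DStream-based file streaming (the bucket and paths are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Minutes, StreamingContext}

    object S3FileStreamJob {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("S3FileStreamJob")
        // 15-minute batch interval, as in the scenario below
        val ssc = new StreamingContext(conf, Minutes(15))
        ssc.checkpoint("s3a://my-bucket/checkpoints/") // placeholder path

        // textFileStream monitors the directory and, as I understand it,
        // only picks up files that appear after the job starts, so files
        // from earlier batches may not need to be moved out of the path.
        val lines = ssc.textFileStream("s3a://my-bucket/incoming/") // placeholder path

        lines.foreachRDD { rdd =>
          // per-batch processing goes here
          println(s"Batch record count: ${rdd.count()}")
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }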

Let's say the batch interval is 15 minutes and the current batch takes more
than 15 minutes to finish. Does the next batch get started regardless of the
previous batch still being processed? Is there a way to hold off the next
batch while the current one is still running?
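For context, my current understanding is that Spark Streaming queues batches and runs them one at a time unless spark.streaming.concurrentJobs is raised above its default of 1; is that correct? A sketch of the setting I mean:

    import org.apache.spark.SparkConf

    // With the default of 1, batches should run serially, so a batch whose
    // predecessor overruns is queued rather than started in parallel.
    val conf = new SparkConf()
      .setAppName("S3FileStreamJob")
      .set("spark.streaming.concurrentJobs", "1") // default value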

Thanks,
Asmath