What is the best practice for processing files from an S3 bucket in a Spark file-streaming job? Files keep arriving in the S3 path, and I have to process them in batches, but new files may land while the current batch is still being processed. In this streaming job, should I move files to another location at the end of each streaming batch, or is there another way to handle this? A rough sketch of the kind of job I mean follows.
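For reference, here is a minimal sketch of what I have in mind, written against the DStream file source; the app name, bucket, and prefix are placeholders, not my real setup:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, StreamingContext}

object S3FileStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("s3-file-stream") // placeholder name
    // 15-minute batch interval: each batch should pick up the files that
    // landed under the prefix since the previous batch was formed
    val ssc = new StreamingContext(conf, Minutes(15))

    // placeholder bucket/prefix; textFileStream only sees files created
    // after the job starts
    val lines = ssc.textFileStream("s3a://my-bucket/incoming/")

    lines.foreachRDD { (rdd, batchTime) =>
      if (!rdd.isEmpty()) {
        // real per-batch processing would go here; a count is a stand-in
        println(s"batch $batchTime: ${rdd.count()} lines")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}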
Also: let's say the batch interval is 15 minutes and the current batch takes more than 15 minutes to finish. Does the next batch get started regardless of the previous one still being processed? Is there a way to hold off the next batch while the previous one is still running?

Thanks,
Asmath
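P.S. The only related knob I have found so far is spark.streaming.concurrentJobs; my understanding (which may well be wrong) is that at its default of 1, an overrunning batch makes later batches queue up rather than run concurrently with it. A spark-shell-style snippet of what I mean:

import org.apache.spark.SparkConf

// default is 1: later batches should queue behind an overrunning batch
// instead of running concurrently with it (my understanding; please correct)
val conf = new SparkConf()
  .setAppName("s3-file-stream") // placeholder name
  .set("spark.streaming.concurrentJobs", "1")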