Thanks.
A workaround I can think of is to rename/move the objects which have been
processed to a different prefix (one which is not monitored). But with the
StreamingContext.textFileStream method there doesn't seem to be a way to
know which object each record came from. Is there another way to do this?
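
To make the rename/move idea concrete, here is a rough sketch of the key
mapping only. The prefixes "incoming/" and "processed/" and both function
names are made up for illustration, and it assumes the list of processed
object keys is already known, which is exactly the part I can't get from
textFileStream. S3 has no real rename, so each pair would still need a copy
followed by a delete.

```python
def processed_key(key, src_prefix="incoming/", dst_prefix="processed/"):
    """Map a key under the watched prefix to the same key under the unwatched prefix.

    Both prefixes are hypothetical placeholders, not names from the real bucket.
    """
    if not key.startswith(src_prefix):
        raise ValueError(f"{key!r} is not under {src_prefix!r}")
    # Keep everything after the watched prefix and re-root it.
    return dst_prefix + key[len(src_prefix):]


def plan_moves(keys, src_prefix="incoming/", dst_prefix="processed/"):
    """Return (source, destination) pairs for the keys under src_prefix.

    Keys outside src_prefix are skipped. Performing each move on S3 would be
    a copy to the destination followed by a delete of the source.
    """
    return [
        (k, processed_key(k, src_prefix, dst_prefix))
        for k in keys
        if k.startswith(src_prefix)
    ]
```

For example, plan_moves(["incoming/a/b.csv"]) pairs that object with
"processed/a/b.csv", so the sub-path under the prefix is preserved.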
On 25 Jun 2018, at 23:59, Farshid Zavareh <fhzava...@gmail.com> wrote:
I'm writing a Spark Streaming application where the input data is put into
an S3 bucket in small batches (using Database Migration Service - DMS). The
Spark application is the only consumer. I'm considering two possible
architectures:
Have Spark Streaming watch an S3 prefix and pick up new objects