Spark Structured Streaming checkpointing with S3 data source

2018-08-30 Thread sherif98
I have data that is continuously pushed to multiple S3 buckets. I want to set up a Structured Streaming application that uses the S3 buckets as its data source and performs stream-stream joins. My question is: if the application goes down for some reason, will restarting the application continue

Spark Structured Streaming using S3 as data source

2018-08-26 Thread sherif98
I have data that is continuously pushed to multiple S3 buckets. I want to set up a Structured Streaming application that uses the S3 buckets as its data source and performs stream-stream joins. My question is: if the application goes down for some reason, will restarting the application continue