I would like to have a Spark Streaming SQS Receiver that deletes SQS messages only after they have been successfully stored on S3.
For this, a Custom Receiver can be implemented with the semantics of a Reliable Receiver: the store(multiple-records) call blocks until the given records have been stored and replicated inside Spark. If write-ahead logs are enabled, all data received from a receiver is also written to a write-ahead log in the configured checkpoint directory, and that checkpoint directory can point to S3.

After the blocking store(multiple-records) call returns, are the records already persisted in the checkpoint directory (and can they therefore be safely deleted from SQS)?
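As a rough sketch of what I have in mind (assuming the AWS SDK v1 SQS client; the class name ReliableSqsReceiver, the queue URL, and the polling parameters are just placeholders), the receiver would delete messages only after the blocking store() call returns:

import scala.collection.JavaConverters._
import scala.collection.mutable.ArrayBuffer

import com.amazonaws.services.sqs.{AmazonSQS, AmazonSQSClientBuilder}
import com.amazonaws.services.sqs.model.ReceiveMessageRequest

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Sketch of a reliable SQS receiver: messages are deleted from the queue
// only after the blocking store(ArrayBuffer) call has returned, i.e. after
// Spark reports the records as stored (and written to the WAL if enabled).
// With the WAL enabled, a non-replicated storage level is usually enough.
class ReliableSqsReceiver(queueUrl: String)
  extends Receiver[String](StorageLevel.MEMORY_AND_DISK_SER) {

  override def onStart(): Unit = {
    new Thread("SQS Receiver") {
      override def run(): Unit = receive()
    }.start()
  }

  override def onStop(): Unit = {
    // Nothing to do: receive() checks isStopped() and exits on its own.
  }

  private def receive(): Unit = {
    val sqs: AmazonSQS = AmazonSQSClientBuilder.defaultClient()
    try {
      while (!isStopped()) {
        val request = new ReceiveMessageRequest(queueUrl)
          .withMaxNumberOfMessages(10)
          .withWaitTimeSeconds(20) // long polling
        val messages = sqs.receiveMessage(request).getMessages.asScala

        if (messages.nonEmpty) {
          val records = ArrayBuffer(messages.map(_.getBody): _*)
          // Blocks until Spark has stored (and, with WAL enabled, logged)
          // these records.
          store(records)
          // Only now delete the messages from SQS.
          messages.foreach(m => sqs.deleteMessage(queueUrl, m.getReceiptHandle))
        }
      }
    } catch {
      case t: Throwable => restart("Error receiving from SQS", t)
    }
  }
}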
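On the driver side I would enable the receiver write-ahead log and point the checkpoint directory at S3, roughly like this (bucket name, queue URL, and batch interval are placeholders):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SqsStreamApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("sqs-reliable-receiver")
      .setIfMissing("spark.master", "local[2]")
      // Write all received data to the write-ahead log before the
      // blocking store() call is acknowledged.
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")

    val ssc = new StreamingContext(conf, Seconds(10))
    // Write-ahead logs live under the checkpoint directory, which can be on S3.
    ssc.checkpoint("s3a://my-bucket/spark-checkpoints")

    val lines = ssc.receiverStream(
      new ReliableSqsReceiver("https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"))
    lines.print()

    ssc.start()
    ssc.awaitTermination()
  }
}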