I would like to have a Spark Streaming SQS receiver that deletes SQS
messages only after they have been successfully stored on S3.

For this, a custom receiver can be implemented with the semantics of
a reliable receiver.

The store(multiple-records) call blocks until the given records have
been stored and replicated inside Spark.
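As a minimal sketch of such a receiver (assuming the AWS Java SDK v1
SQS client; the class name, queue URL, and batch size below are my own
placeholders, not an existing implementation):

    import scala.collection.JavaConverters._

    import com.amazonaws.services.sqs.AmazonSQSClientBuilder
    import com.amazonaws.services.sqs.model.ReceiveMessageRequest
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    // Hypothetical reliable SQS receiver: messages are acknowledged
    // (deleted) in SQS only after the blocking store(...) returns.
    class ReliableSQSReceiver(queueUrl: String)
        extends Receiver[String](StorageLevel.MEMORY_AND_DISK_SER) {

      def onStart(): Unit = {
        new Thread("SQS Receiver") {
          override def run(): Unit = receive()
        }.start()
      }

      // The receiving thread checks isStopped() and exits on its own.
      def onStop(): Unit = {}

      private def receive(): Unit = {
        try {
          // Create the client on the executor; it is not serializable.
          val sqs = AmazonSQSClientBuilder.defaultClient()
          while (!isStopped()) {
            val messages = sqs.receiveMessage(
              new ReceiveMessageRequest(queueUrl)
                .withMaxNumberOfMessages(10)).getMessages.asScala

            if (messages.nonEmpty) {
              // Reliable, blocking store(multiple-records) variant:
              // returns only after Spark has stored the records (and,
              // with the WAL enabled, written them to the log).
              store(messages.map(_.getBody).toIterator)

              // Only now delete the messages from SQS.
              messages.foreach(m =>
                sqs.deleteMessage(queueUrl, m.getReceiptHandle))
            }
          }
        } catch {
          case e: Throwable => restart("Error receiving from SQS", e)
        }
      }
    }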

If write-ahead logs are enabled, all the data received from a
receiver is written into a write-ahead log under the configured
checkpoint directory. The checkpoint directory can point to S3.
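For reference, a sketch of the corresponding driver-side setup (the
application name, batch interval, S3 path, and queue URL below are
placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("SQSToS3")
      // Enable the receiver write-ahead log.
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")

    val ssc = new StreamingContext(conf, Seconds(10))

    // The WAL is written under the checkpoint directory, which may
    // be an S3 path.
    ssc.checkpoint("s3a://my-bucket/spark/checkpoints")

    val stream = ssc.receiverStream(new ReliableSQSReceiver(
      "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"))

With the WAL enabled, the Spark docs also suggest a non-replicated
storage level such as StorageLevel.MEMORY_AND_DISK_SER for the
receiver, since the log already provides durability.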

After the blocking store(multiple-records) call returns, have the
records already been written to the write-ahead log in the checkpoint
directory, so that the corresponding messages can be safely deleted
from SQS?
