Dmitry Gorbatsevich created BAHIR-233:
-----------------------------------------
Summary: Add SNS message support for SQS structured streaming connector
Key: BAHIR-233
URL: https://issues.apache.org/jira/browse/BAHIR-233
Project: Bahir
Issue Type: New Feature
Components: Spark Structured Streaming Connectors
Reporter: Dmitry Gorbatsevich
h3. Motivation
The current implementation of the SQS streaming connector handles the following
"route" for an S3 notification event:
1. S3 -> SQS -> Spark
This approach works fine until you need multiple listeners (consumers) for the
same S3 path. When multiple applications need to listen to and process the same
S3 path, the following approach is recommended:
2. S3 -> SNS -> SQS -> Spark
In this case we can route messages from one SNS topic to multiple different SQS
queues, which lets multiple applications listen to the same S3 path. With
approach #2, the original S3 notification is wrapped in an SNS message and then
delivered to the SQS queue (see the [AWS
docs|https://docs.aws.amazon.com/sns/latest/dg/sns-message-and-json-formats.html]
describing the SNS message format).
To extract the original S3 event from the SNS message, one needs to look at the
"Message" field in the JSON document.
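
A minimal sketch of this unwrapping step, written in Python for brevity rather than in the connector's own language. It assumes the SQS message body is the standard SNS JSON envelope with raw message delivery disabled, so that "Message" holds the original S3 event as a JSON-encoded string. The function name {{unwrap_sns_message}} and the sample payload are hypothetical, not part of the connector.

```python
import json


def unwrap_sns_message(sqs_body: str) -> dict:
    """Return the S3 event carried inside an SNS envelope (hypothetical helper)."""
    envelope = json.loads(sqs_body)
    # "Message" is itself a JSON-encoded string, so it needs a second decode.
    return json.loads(envelope["Message"])


# Illustrative payload: a trimmed S3 event wrapped in a trimmed SNS envelope.
s3_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "data/part-0000.json"},
            },
        }
    ]
}
sns_body = json.dumps({"Type": "Notification", "Message": json.dumps(s3_event)})

unwrapped = unwrap_sns_message(sns_body)
print(unwrapped["Records"][0]["s3"]["object"]["key"])
```

The double decode is the key point: parsing the SQS body once yields only the SNS envelope, and the S3 "Records" array becomes visible after "Message" is parsed again.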
h4. Proposed approach
# Add an option to the s3-sqs connector: "messageWrapper"
# It can be 'None' or 'SNS'
# The default value is 'None'
If 'SNS' is specified, "unwrap" the original S3 notification event from the
SNS message and continue processing.
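
The proposed option handling could look roughly like the following sketch, written in Python for brevity rather than in the connector's own language. The names {{extract_s3_event}} and {{SUPPORTED_WRAPPERS}} are hypothetical, and exact-match (case-sensitive) validation of the option value is an assumption, since the proposal does not specify it.

```python
import json

# Values proposed for the "messageWrapper" option; 'None' is the default.
SUPPORTED_WRAPPERS = ("None", "SNS")


def extract_s3_event(sqs_body: str, message_wrapper: str = "None") -> dict:
    """Return the S3 event from an SQS body, unwrapping SNS if requested.

    Hypothetical helper; case-sensitive matching is an assumption.
    """
    if message_wrapper not in SUPPORTED_WRAPPERS:
        raise ValueError(
            f"messageWrapper must be one of {SUPPORTED_WRAPPERS}, "
            f"got {message_wrapper!r}"
        )
    payload = json.loads(sqs_body)
    if message_wrapper == "SNS":
        # The S3 event is the JSON string stored in the envelope's "Message".
        payload = json.loads(payload["Message"])
    return payload


# Illustrative bodies for both routes (trimmed, not real AWS output).
event = {"Records": [{"s3": {"object": {"key": "data/part-0000.json"}}}]}
direct_body = json.dumps(event)                        # S3 -> SQS
wrapped_body = json.dumps({"Message": direct_body})    # S3 -> SNS -> SQS

from_direct = extract_s3_event(direct_body)
from_wrapped = extract_s3_event(wrapped_body, message_wrapper="SNS")
```

With the default of 'None', bodies on the existing S3 -> SQS route are parsed exactly as before, so current users would not need to change their configuration.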
--
This message was sent by Atlassian Jira
(v8.3.4#803005)