I'll need someone else to chime in here for a definitive answer (cc'd
Gordon), so I'm really just guessing here.
For the partitioning, it looks like you can use a custom partitioner; see
FlinkKinesisProducer#setCustomPartitioner.
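A rough sketch of what such a partitioner could look like. Note this is a
guess, not a tested job: the abstract class below is a simplified stand-in
for Flink's org.apache.flink.streaming.connectors.kinesis.KinesisPartitioner
(in a real job you would extend the class from flink-connector-kinesis and
pass the instance to FlinkKinesisProducer#setCustomPartitioner), and the
event type and key field are made up for illustration:

```java
import java.io.Serializable;

// Simplified stand-in for Flink's KinesisPartitioner abstract class;
// extend the real one from flink-connector-kinesis in an actual job.
abstract class KinesisPartitioner<T> implements Serializable {
    public abstract String getPartitionId(T element);
}

// Hypothetical event type with a routing key field.
class Event implements Serializable {
    final String routingKey;
    final String payload;
    Event(String routingKey, String payload) {
        this.routingKey = routingKey;
        this.payload = payload;
    }
}

// Partitions records by their routing key, so records that share a key
// end up on the same Kinesis shard.
class KeyFieldPartitioner extends KinesisPartitioner<Event> {
    @Override
    public String getPartitionId(Event e) {
        return e.routingKey; // derived from record content, not random
    }
}
```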
Have you looked at the KinesisSerializationSchema described in the
documentation
<https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/connectors/kinesis.html#kinesis-producer>?
It allows you to write to a specific stream based on incoming events,
but I'm not sure whether this translates to S3 buckets and keyspaces.
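To make the idea concrete, here is a minimal sketch of a
KinesisSerializationSchema that picks the target stream per record. Again,
hedged: the interface below is a simplified local stand-in for Flink's
org.apache.flink.streaming.connectors.kinesis.serialization.KinesisSerializationSchema
(implement the real one from flink-connector-kinesis in an actual job), and
the event type and stream-naming convention are assumptions:

```java
import java.io.Serializable;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Simplified stand-in for Flink's KinesisSerializationSchema interface.
interface KinesisSerializationSchema<T> extends Serializable {
    ByteBuffer serialize(T element);
    String getTargetStream(T element);
}

// Hypothetical event carrying a tenant id used for routing.
class TenantEvent implements Serializable {
    final String tenant;
    final String body;
    TenantEvent(String tenant, String body) {
        this.tenant = tenant;
        this.body = body;
    }
}

// Routes each record to a per-tenant stream, e.g. "events-acme".
class TenantRoutingSchema implements KinesisSerializationSchema<TenantEvent> {
    @Override
    public ByteBuffer serialize(TenantEvent e) {
        return ByteBuffer.wrap(e.body.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public String getTargetStream(TenantEvent e) {
        // Assumed naming convention: one stream per tenant.
        return "events-" + e.tenant;
    }
}
```

Whether the downstream Firehose/S3 side can fan out to thousands of
buckets from those streams is the part I can't vouch for.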
On 08.02.2019 16:43, Sandybayev, Turar (CAI - Atlanta) wrote:
Hi all,
I wonder whether it’s possible to use Flink for the following
requirement. We need to process a Kinesis stream and, based on values
in each record, route those records to different S3 buckets and
keyspaces, with support for batching files and control over the
partitioning scheme (so preferably through Firehose).
I know it’s straightforward to have a Kinesis source and a Kinesis
sink, and then hook up Firehose to the sink from AWS, but I need a “fan
out” to potentially thousands of different buckets, based on the content
of each event.
Thanks!
Turar