Help with a stream processing use case

2019-02-08 Thread Sandybayev, Turar (CAI - Atlanta)
Hi all, I wonder whether it’s possible to use Flink for the following requirement. We need to process a Kinesis stream and based on values in each record, route those records to different S3 buckets and keyspaces, with support for batching up of files and control over partitioning scheme (so pr…

Re: Help with a stream processing use case

2019-02-10 Thread Chesnay Schepler
I'll need someone else to chime in here for a definitive answer (cc'd Gordon), so I'm really just guessing here. For the partitioning it looks like you can use a custom partitioner, see FlinkKinesisProducer#setCustomPartitioner. Have you looked at the KinesisSerializationSchema described in the…
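
A minimal sketch of what that custom partitioner could look like, assuming a hypothetical Record type with a category field used for routing (the field name and routing rule are illustrative, not from the thread):

import org.apache.flink.streaming.connectors.kinesis.KinesisPartitioner;

// Hypothetical record type carrying the value used for routing/partitioning.
public class Record implements java.io.Serializable {
    public String category;
    public String payload;
}

// Records with the same category get the same partition key,
// so they end up on the same Kinesis shard.
public class CategoryPartitioner extends KinesisPartitioner<Record> {
    @Override
    public String getPartitionId(Record element) {
        return element.category;
    }
}

The partitioner is then registered on the producer via FlinkKinesisProducer#setCustomPartitioner.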

Re: Help with a stream processing use case

2019-02-10 Thread Tzu-Li (Gordon) Tai
Hi, If Firehose already supports sinking records from a Kinesis stream to an S3 bucket, then yes, Chesnay's suggestion would work. You route each record to a specific Kinesis stream depending on some value in the record using the KinesisSerializationSchema, and sink each Kinesis stream to their…
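
A sketch of that per-record stream routing with KinesisSerializationSchema, re-using the hypothetical Record type from the earlier sketch; the stream names and routing rule are assumptions, and each target stream would then have its own Firehose delivery stream writing into the corresponding S3 bucket:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer;
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;
import org.apache.flink.streaming.connectors.kinesis.serialization.KinesisSerializationSchema;

public class RoutingSchema implements KinesisSerializationSchema<Record> {
    @Override
    public ByteBuffer serialize(Record element) {
        // Serialize the record payload as UTF-8 bytes.
        return ByteBuffer.wrap(element.payload.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public String getTargetStream(Record element) {
        // Hypothetical rule: pick the target Kinesis stream from the record's category.
        return "orders".equals(element.category) ? "orders-stream" : "default-stream";
    }
}

// Producer setup (credentials resolved via the default AWS provider chain):
Properties config = new Properties();
config.setProperty(AWSConfigConstants.AWS_REGION, "us-east-1");
FlinkKinesisProducer<Record> producer = new FlinkKinesisProducer<>(new RoutingSchema(), config);
producer.setDefaultStream("default-stream");   // used when getTargetStream returns null
producer.setDefaultPartition("0");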