Re: End-to-end exactly once from kafka source to S3 sink

2018-01-31 Thread Hung
"Flink will only commit the kafka offsets when the data has been saved to S3" -> no, you can check the BucketingSink code, and it would mean BucketingSink depends on Kafka which is not reasonable. Flink stores checkpoint in disk of each worker, not Kafka. (KafkaStream, the other streaming API prov

End-to-end exactly once from kafka source to S3 sink

2018-01-28 Thread chris snow
I’m working with a Kafka environment where I’m limited to 100 partitions @ 1GB log.retention.bytes each. I’m looking to implement exactly-once processing from this Kafka source to an S3 sink. If I have understood correctly, Flink will only commit the Kafka offsets when the data has been saved to S3.
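
A sketch of the kind of pipeline described here, assuming Flink 1.4-era APIs (FlinkKafkaConsumer011 and BucketingSink); the topic, bucket path, and connection properties are placeholders, not a definitive setup.

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class KafkaToS3Job {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpointing is what ties the source's offsets and the sink's
        // file state together into one consistent snapshot.
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder
        props.setProperty("group.id", "kafka-to-s3");         // placeholder

        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer011<>("events", new SimpleStringSchema(), props));

        // Write hourly buckets under the given S3 path (placeholder bucket).
        BucketingSink<String> sink = new BucketingSink<>("s3://my-bucket/flink-output");
        sink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HH"));
        sink.setBatchSize(128L * 1024 * 1024); // roll part files at ~128 MB

        events.addSink(sink);
        env.execute("kafka-to-s3-sketch");
    }
}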