Hello,

I'm wondering whether fault-tolerant state management in Kafka Streams
keeps working seamlessly when the number of partitions is increased.  My
understanding is that this is a problem that stateful stream processing
frameworks need to solve, and that:

With Samza, this is not a solved problem (though I understand it's being
worked on, based on a conversation I had yesterday at the Kafka Summit
with someone who works on Samza).

With Flink, there's a plan to solve this:  "The way we plan to implement
this in Flink is by shutting the dataflow down with a checkpoint, and
bringing the dataflow back up with a different parallelism."
http://www.confluent.io/blog/real-time-stream-processing-the-next-step-for-apache-flink/

With Kafka Streams, I haven't been able to find a solid answer on whether
this problem is already solved for users, or whether we need to handle it
ourselves.
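
To make the concern concrete, here is a minimal sketch of why I expect
adding partitions to disturb local state. It assumes keys are assigned to
partitions by a stable hash modulo the partition count; zlib.crc32 is only
a stand-in for whatever hash the real partitioner uses, and partition_for
and the "user-N" keys are hypothetical names made up for illustration.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Assumed scheme: stable hash of the key, modulo the partition count.

    crc32 is a stand-in here, not the hash any real partitioner uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical keys, for illustration only.
keys = [f"user-{i}" for i in range(100)]

# Where each key lands before and after scaling from 4 to 8 partitions.
before = {k: partition_for(k, 4) for k in keys}
after = {k: partition_for(k, 8) for k in keys}

# Keys whose partition changed: new records for these keys would arrive at
# a task that does not hold the state built up from their earlier records.
moved = [k for k in keys if before[k] != after[k]]

print(f"{len(moved)} of {len(keys)} keys now map to a different partition")
```

If the framework keeps one local state store per partition, every key in
"moved" would have its history in one store and its new records routed to
another, which is the situation I'm asking about.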

Thanks,
Ryan
