Scaling up partitions with kafka streams and other stateful stream processing frameworks

Ryan Thompson Wed, 27 Apr 2016 10:01:25 -0700

Hello,

I'm wondering whether fault-tolerant state management with Kafka Streams
keeps working seamlessly when partitions are scaled up.  My understanding is
that this is indeed a problem that stateful stream processing frameworks
need to solve, and that:


with Samza, this is not a solved problem (though I also understand it's
being worked on, based on a conversation I had yesterday at the Kafka
Summit with someone who works on Samza)

with Flink, there's a plan to solve this:  "The way we plan to implement
this in Flink is by shutting the dataflow down with a checkpoint, and
bringing the dataflow back up with a different parallelism."
http://www.confluent.io/blog/real-time-stream-processing-the-next-step-for-apache-flink/

with Kafka Streams, I haven't been able to find a solid answer on whether
or not this problem is solved for users, or if we need to handle it
ourselves.
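
To make the concern concrete, here is a minimal sketch of why I think adding
partitions can be a problem for keyed state. It assumes keys are assigned by
hash modulo the partition count; the CRC32 hash, the partition_for and
moved_keys helpers, and the "user-N" keys are all stand-ins of mine for
illustration, not Kafka's actual partitioner or anyone's real data.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Hypothetical stand-in for a hash partitioner: CRC32 modulo the
    # partition count. Kafka's default partitioner uses a different hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def moved_keys(keys, old_count, new_count):
    # Keys whose partition changes when the partition count changes.
    # Any local state built for these keys now sits with the wrong task.
    return [
        k for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    ]


# Hypothetical keys, growing a topic from 4 to 6 partitions.
keys = [f"user-{i}" for i in range(1000)]
moved = moved_keys(keys, 4, 6)
print(f"{len(moved)} of {len(keys)} keys map to a different partition")
```

If state is stored per partition, every key in that moved list would have
its history in one place and its new records arriving in another, which is
the situation I'd like to know whether the framework handles.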

Thanks,
Ryan
