Hi Mathieu,

We've been deploying stateful Kafka Streams apps on Kubernetes, autoscaling on CPU and consumer lag, successfully for about a year now. We also automate releases of new app versions using Flux.

Kafka Streams is really good at doing the right thing: for example, it uses Kafka's consumer groups to balance work across multiple instances of the same service and to retain consuming progress. Persistent state stores have their changelogs persisted in Kafka. I'm not entirely sure how Kafka Streams handles a change to the serialisation format or aggregation of a persistent store, so I would be cautious: if you are fundamentally altering a store, rename it so that a new one is created.
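To make the rename suggestion concrete, here is a rough sketch in plain Java (no Kafka dependency). It relies on the Kafka Streams convention that a store's changelog topic is named <application.id>-<store name>-changelog, so a new store name means a new, empty changelog. The class StoreNaming, its methods, the "-vN" suffix scheme, and the names "orders-app" and "order-totals" are all hypothetical, chosen only for illustration.

```java
// Hypothetical helper: class, method, and store names are illustrative only.
final class StoreNaming {
    private StoreNaming() {}

    // Append a version suffix so an incompatible change gets a brand-new store.
    public static String versioned(String baseName, int version) {
        if (version < 1) {
            throw new IllegalArgumentException("version must be >= 1");
        }
        return baseName + "-v" + version;
    }

    // Kafka Streams names a store's changelog topic
    // <application.id>-<store name>-changelog.
    public static String changelogTopic(String applicationId, String storeName) {
        return applicationId + "-" + storeName + "-changelog";
    }

    public static void main(String[] args) {
        String appId = "orders-app"; // assumed application.id
        String oldStore = versioned("order-totals", 1);
        String newStore = versioned("order-totals", 2);
        // Different store names map to different changelog topics,
        // so the new store starts empty instead of restoring old-format data.
        System.out.println(changelogTopic(appId, oldStore)); // orders-app-order-totals-v1-changelog
        System.out.println(changelogTopic(appId, newStore)); // orders-app-order-totals-v2-changelog
    }
}
```

In a real topology the versioned name would be what you pass when materialising the store, for example via Materialized.as(...). As far as I know the old changelog topic is not removed for you, so cleaning it up would be a separate step.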
In terms of rolling out fundamental changes to the data API emitted by a streaming app, the usual principles of backwards-compatible APIs apply: don't remove fields that downstream consumers expect, ensure downstream consumers can ignore "unknown" new fields, etc. If you're changing fundamental semantics, we typically do a managed switch-over: the new version of the producing app writes to a new topic, which is consumed by the new version of the consuming app, while the old versions of the consuming app remain and drain the old topic until it is empty.

I'm afraid I'm not available for voice chats etc., but if you can provide more examples of your pipeline composition and the changes you envisage, I can give you more focused advice.

Cheers,

Liam Clarke

On Mon, Apr 20, 2020 at 10:18 AM Mathieu D <[email protected]> wrote:

> Hey Kafka lovers,
>
> I have a lot of questioning around the roll-out and the update of kafka
> streams app (continuous deployment fashion).
>
> We are used to ship our apps in dockers quite frequently (our current
> orchestrator is AWS ECS).
> In the context of streams app, i understand we should be much more careful
> to avoid messing up internal state. Each app update could potentially
> change a computation of an internal state, change how we compute some
> output values, etc.
>
> What do you usually do ? change the app.id ? use the application reset
> tool ?
> Application reset only manages internal states. What about all the rest
> downstream ? (We will have a couple of databases downstream). Do you
> manually reset all this stuff ?
> How do you manage orchestration of app shutdown, downstream cleanup, start
> of the new version in a distributed context ?
> Is there a way to query kafka broker about running apps to wait the
> shutdown of the previous ? How do you automate all this ? With a bunch of
> scripts ?
> Are you able to test (like in automated CI test) that kind of update ?
>
> I know that's a lot of questions ... :-/
>
> If you're experienced in the topic and have 10-15mn of time to spare, I'd
> be happy to have a quick talk over google-meet or anything. This could be
> easier to discuss than to write a long email.
>
> If you have any resources, books, posts, conference talks, I'd be happy (if
> they are about real-life™ apps... not really interested in hello-worlds...)
>
> Thanks a lot !
> Mathieu D
