Hi folks,

I'm hoping to get some deeper clarification on which framework, Flink or
KStreams, to use in a given scenario. I've read over the following blog
article, which I think sets a great baseline understanding of the
differences between those frameworks, but I would like to get some outside
opinions:
https://www.confluent.io/blog/apache-flink-apache-kafka-streams-comparison-guideline-users/

My understanding of this article is that KStreams works well as an embedded
library in a microservice or API layer, or as a standalone application at a
company with centralized deployment infrastructure, such as a shared
Kubernetes cluster.

In this case, there is discussion around deploying KStreams as a standalone
application stack backed by EC2 or ECS, and whether or not Flink is better
suited to serve as the data transformation layer. We already run Flink
applications on EMR.

The point against Flink is that it requires a cluster whereas KStreams does
not, and can be deployed on ECS or EC2. We do not have a centralized
deployment team, and will still have to maintain either the CloudFormation
stack/Auto Scaling group or the EMR cluster ourselves.

What are some of the advantages of using Flink over KStreams standalone?

The job management UI is one that comes to mind, and another is the set of
more advanced APIs such as CEP. But I would really love to hear
the opinions of people who are familiar with both. In what scenarios would
you choose one over the other? Is it advisable or even preferable to
attempt to deploy KStreams as its own stack and avoid the complexity of
maintaining a cluster?

Thanks,
Peter
