Hello Miguel and Samson,

Just to add to what Dave just stated, and to summarize the use cases a bit:
The Kafka documentation has a good summary of the APIs that are provided within the project: https://kafka.apache.org/documentation/#api

To add to what's in the documentation, we have a few scenarios for you to consider:

(a) Just fetching events from a Kafka topic and processing them using your own mechanisms and logic. This is when you use the Consumer API: https://kafka.apache.org/documentation/#consumerapi With this API, you provide the logic to filter, process, or merge events across multiple topics if you need to. You can also use the Confluent Parallel Consumer for this, especially when you don't want to limit your scaling capabilities to the number of partitions in the topics you are consuming from: https://www.confluent.io/blog/introducing-confluent-parallel-message-processing-client/ (There is a small Consumer sketch after this list.)

(b) Fetching data from one or more Kafka topics; joining, filtering, transforming, and enriching the events; and then putting the events back into another topic. This is when the Kafka Streams API shines, as it allows you to scale horizontally to support the processing of the events. This API simplifies a lot of things for the end user when it comes to joins, transformations, filtering, and scaling up to keep pace with the rate at which events are ingested into the topics: https://kafka.apache.org/documentation/#streamsapi (See the Streams sketch after this list.)

(c) Pulling events out of data stores (like MongoDB, Cassandra, MySQL, PostgreSQL) into Kafka, or pushing events from Kafka into data stores (like Elasticsearch). This is the job of the Connect API: https://kafka.apache.org/documentation/#connectapi (A sample connector configuration appears after this list.)

All these APIs have libraries you can use via Maven or Gradle.

Then there is another scenario where you may not have the time or skills to write Java/Scala logic with the Streams or Consumer API, and you just want to perform simple transformations, filters, and merging of events from multiple topics. In that case, you can use the ksqlDB framework to process your events in near real time using SQL-like syntax: https://ksqldb.io/ (A short ksqlDB example closes out the sketches below.)
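To make (a) concrete, here is a minimal sketch of a Consumer API poll loop. The broker address, topic name ("orders"), group id, and String deserializers are all assumptions for illustration, not requirements:

    // Sketch only: "orders", the group id, and localhost:9092 are made up.
    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class OrderConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "order-processors");
            props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("orders"));
                while (true) {
                    // Fetch a batch of events and apply your own logic to each one
                    ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("key=%s value=%s%n",
                            record.key(), record.value());
                    }
                }
            }
        }
    }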
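And a comparable sketch for (b) using Kafka Streams. Again, the topic names and the filter/transform steps are placeholders; the point is how little code the filtering, transforming, and scaling take compared to hand-rolling them on the Consumer API:

    // Sketch only: topic names and the transform logic are illustrative.
    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    public class OrderEnricher {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-enricher");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG,
                Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG,
                Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> orders = builder.stream("orders");
            orders
                .filter((key, value) -> value != null && !value.isEmpty()) // drop empty events
                .mapValues(String::toUpperCase)                            // stand-in transform
                .to("orders-enriched");                                    // write back to Kafka

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            // Scale horizontally by running more instances with the same application.id
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }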
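For (c), you don't write code at all; you deploy a connector with a configuration. As an example, a standalone-mode properties file for the Confluent Elasticsearch sink connector might look something like this (the connector class and property names come from that connector; the name, topic, and URL are placeholders, and the exact properties you need will depend on your connector version):

    # Illustrative config; adjust topic, URL, and options for your setup
    name=orders-es-sink
    connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
    tasks.max=1
    topics=orders-enriched
    connection.url=http://localhost:9200
    key.ignore=true
    schema.ignore=true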
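Finally, for the ksqlDB scenario, roughly the same filtering as the Streams example can be expressed entirely in SQL-like statements (stream names, columns, and formats are again made up for illustration):

    -- Declare a stream over an existing topic
    CREATE STREAM orders (id VARCHAR, amount DOUBLE)
      WITH (KAFKA_TOPIC='orders', VALUE_FORMAT='JSON');

    -- Continuously filter into a new stream/topic, no Java/Scala required
    CREATE STREAM big_orders AS
      SELECT id, amount
      FROM orders
      WHERE amount > 100;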
These types of questions are a recurring theme, so I think I am going to create a video tutorial soon to illustrate the use cases, similarities, and differences.

I hope this was helpful.

On Sat, Jun 26, 2021 at 10:35 AM Dave Klein <davekl...@usa.net> wrote:

> Yes, Kafka Consumer and Kafka Streams are just libraries. My point with
> that is that it's not difficult to switch from one to the other as your
> needs evolve.
>
> There are several ways that Kafka Streams aids in processing. It provides
> a rich set of functions for transforming, filtering, branching, etc. Also,
> it manages state for any stateful processing, like aggregations, joins,
> etc. If you don't need any of these and are just consuming events and
> writing them to a database, Kafka Consumer will work fine. But if your
> needs change, you can switch to Kafka Streams later.
>
> Also, if you really are just consuming to write to a DB, you may want to
> consider Kafka Connect.
>
> Let me know if this is unclear.
>
> Thanks,
> Dave
>
> > On Jun 26, 2021, at 7:05 AM, SuarezMiguelC
> > <suarezmigu...@protonmail.com.invalid> wrote:
> >
> > DaveKlein, in the reply email on "Kafka Streams", on the question of
> > whether to use Kafka Streams or just a consumer, you specified:
> >
> >> But Streams and Consumer are just libraries, so start with Consumer and
> >> if you find yourself doing more processing, consider moving to Kafka
> >> Streams.
> >
> > I thought Kafka Consumer was also just a library, and didn't know Streams
> > helped with processing; can you elaborate on this?
> >
> > Thanks for sharing your knowledge!
> >
> > Miguel Suárez