Hello Miguel and Samson,

Just to add to what Dave stated and to summarize the use cases a bit:

The Kafka documentation has a good summary of the APIs provided within
the project:
https://kafka.apache.org/documentation/#api

To add to what's in the documentation, we have a couple of scenarios for
you to consider:

(a) Just fetching events from a Kafka topic and processing them using your
own mechanisms and logic. This is when you use the Consumer API:
https://kafka.apache.org/documentation/#consumerapi
With this API, you provide the logic to filter, process, or merge events
across multiple topics if you need to.
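
To make (a) concrete, here is a minimal sketch using the plain Consumer
API. This is illustrative only: the broker address, group id, and the
topic names "orders" and "refunds" are placeholders.

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker address and consumer group id
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-processing-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Subscribe to one or more topics; merging and filtering across
            // them is entirely your own logic with this API
            consumer.subscribe(List.of("orders", "refunds"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Your own processing logic goes here
                    System.out.printf("topic=%s key=%s value=%s%n",
                            record.topic(), record.key(), record.value());
                }
            }
        }
    }
}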

You can also use the Confluent Parallel Consumer for this, especially when
you don't want to limit your scaling capabilities to the number of
partitions in the topics you are consuming from.
https://www.confluent.io/blog/introducing-confluent-parallel-message-processing-client/

(b) Fetching data from one or more Kafka topics; joining, filtering,
transforming, or enriching the events; and then writing the results back
into another topic. This is where the Kafka Streams API shines, as it
allows you to scale horizontally to support the processing of the events.
It simplifies a lot for the end user when it comes to joins,
transformations, filtering, and scaling out to keep up with the rate at
which events are ingested into the topics.
https://kafka.apache.org/documentation/#streamsapi
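
As a rough sketch of (b), here is a tiny Streams topology that filters
and transforms events from one topic into another. The topic names and
the trivial uppercase "transformation" are made up for illustration:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class SimpleStreamsApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder application id and broker address
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "simple-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("orders");
        events
            .filter((key, value) -> value != null && !value.isEmpty()) // drop empty events
            .mapValues(value -> value.toUpperCase())                   // stand-in for real enrichment
            .to("orders-enriched");                                    // write back to another topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        // Close the topology cleanly on shutdown
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

The horizontal scaling mentioned above comes from running more instances
of this same application with the same application id; Streams spreads
the input partitions across the instances for you.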

(c) Pulling events out of data stores (like MongoDB, Cassandra, MySQL,
PostgreSQL) into Kafka, or pushing events from Kafka into data stores
(like Elasticsearch). This is what the Kafka Connect API is for:
https://kafka.apache.org/documentation/#connectapi
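
Connect is driven by configuration rather than code you write yourself.
As a sketch, a standalone file sink using the FileStreamSink connector
that ships with Kafka is just a properties file (the topic name and file
path are placeholders):

# file-sink.properties -- a hypothetical sink connector config
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
topics=orders
file=/tmp/orders.txt

You would then start it with bin/connect-standalone.sh, passing the
worker config and this properties file.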

All of these APIs ship as libraries you can pull in via Maven or Gradle.

Then there is another scenario, where you may not have the time or skills
to write Java/Scala logic with the Streams or Consumer API and you just
want to perform simple transformations, filters, and merges of events from
multiple topics. In that case you can use the ksqlDB framework to process
your events in near real time using SQL-like syntax:
https://ksqldb.io/
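
For example, filtering and reshaping events from one topic into a new
one looks roughly like this in ksqlDB (the topic, stream, and column
names are invented for illustration):

CREATE STREAM orders (order_id VARCHAR, amount DOUBLE, region VARCHAR)
  WITH (KAFKA_TOPIC = 'orders', VALUE_FORMAT = 'JSON');

CREATE STREAM big_eu_orders AS
  SELECT order_id, amount
  FROM orders
  WHERE region = 'EU' AND amount > 100
  EMIT CHANGES;

ksqlDB runs the second statement as a persistent query and keeps writing
matching events to the new stream's backing topic as they arrive.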

These kinds of questions are a recurring theme, so I am planning to
create a video tutorial soon to illustrate the use cases, similarities,
and differences.

I hope this was helpful.


On Sat, Jun 26, 2021 at 10:35 AM Dave Klein <davekl...@usa.net> wrote:

> Yes, Kafka Consumer and Kafka Streams are just libraries. My point with
> that is that it’s not difficult to switch from one to the other as your
> needs evolve.
>
> There are several ways that Kafka Streams aids in processing. It provides
> a rich set of functions for transforming, filtering, branching, etc. It
> also manages state for any stateful processing, like aggregations, joins,
> etc.  If you don’t need any of these and are just consuming events and
> writing them to a database, Kafka Consumer will work fine.  But if your
> needs change, you can switch to Kafka Streams later.
>
> Also, if you really are just consuming to write to a DB, you may want to
> consider Kafka Connect.
>
> Let me know if this is unclear.
>
> Thanks,
> Dave
>
>
> > On Jun 26, 2021, at 7:05 AM, SuarezMiguelC
> > <suarezmigu...@protonmail.com.invalid> wrote:
> >
> > DaveKlein, in the reply email on "Kafka Streams", on the question of
> > whether to use Kafka Streams or just a consumer, you wrote:
> >
> >> But Streams and Consumer are just libraries, so start with Consumer and
> >> if you find yourself doing more processing, consider moving to Kafka
> >> Streams.
> >
> > I thought Kafka Consumer was also just a library, and didn't know Streams
> > helped with processing. Can you elaborate on this?
> >
> > Thanks for sharing your knowledge!
> >
> > Miguel Suárez
>
>
