The way I see it, if you are sure there is no room for perf optimization of the processing itself, you can only do a few things:

1. Speed up your processing per consumer thread, which you already tried by splitting your logic into a 2-step pipeline instead of 1-step and delegating the work of writing to the DB to the second step. Make sure your second, intermediate Kafka topic is created with many more partitions (I'd say at least 40) so that you can parallelize that work much more.

2. If you can change the incoming topic, I would create it with many more partitions as well, say at least 40 or so, to parallelize your first-step service processing more.

3. If you can't increase partitions for the original topic, you could artificially achieve the same by adding one more step (service) to your pipeline that just reads data from the original 7-partition topic1 and pushes it unchanged into a new topic2 with, say, 40 partitions, and then have your other services pick up from this topic2.
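A pass-through repartitioner like the one in option 3 can be sketched with plain kafka-clients roughly as below. This is only a sketch, not a drop-in implementation: the broker address, group id, topic names (topic1/topic2), and the String serializers are assumptions; since your payload is JSON text, forwarding it as an opaque string avoids any deserialization cost in this hop.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class Repartitioner {
    public static void main(String[] args) {
        Properties c = new Properties();
        c.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        c.put(ConsumerConfig.GROUP_ID_CONFIG, "repartitioner");           // assumption
        c.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
              StringDeserializer.class.getName());
        c.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
              StringDeserializer.class.getName());

        Properties p = new Properties();
        p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
              StringSerializer.class.getName());
        p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
              StringSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c);
             KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            consumer.subscribe(List.of("topic1")); // the 7-partition source topic
            while (true) {
                for (ConsumerRecord<String, String> r :
                        consumer.poll(Duration.ofMillis(500))) {
                    // Forward the record unchanged; the producer's partitioner
                    // spreads records across all partitions of topic2.
                    producer.send(new ProducerRecord<>("topic2", r.key(), r.value()));
                }
            }
        }
    }
}
```

Note that topic2 has to exist with the higher partition count before this runs (e.g. created by your ops team, via `kafka-topics.sh --create --partitions 40 ...`, or with the AdminClient API), and your downstream consumer group can then scale up to 40 instances.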
good luck,
Marina

Sent with ProtonMail Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, December 19, 2020 6:46 PM, Yana K <yanak1...@gmail.com> wrote:

> Hi
>
> I am new to the Kafka world and am running into this scale problem. I
> thought of reaching out to the community to see if someone can help.
> The problem is that I am trying to consume from a Kafka topic that can
> have a peak of 12 million messages/hour. That topic is not under my
> control: it has 7 partitions and carries a JSON payload.
> I have written a consumer (using Java and the Spring-Kafka library) that
> reads that data, filters it, and then loads it into a database. I ran
> into a huge consumer lag that would take 10-12 hours to catch up. I have
> 7 instances of my application running to match the 7 partitions, and I
> am using auto commit. Then I thought of splitting the write logic into a
> separate layer. So now my architecture has a component that reads,
> filters, and produces the data to an internal topic (I've given it 7
> partitions, but as you see, this topic is under my control). Then a
> consumer picks up data from that topic and writes it to the database.
> It's better, but it still takes 3-5 hours for the consumer lag to catch
> up.
> Am I missing something fundamental? Are there any other ideas for
> optimization that can help overcome this scale challenge? Any pointers
> or articles would help too.
>
> Appreciate your help with this.
>
> Thanks
> Yana