Hi, I'm new to the Kafka world and have run into a scaling problem, so I thought I'd reach out to the community to see if someone can help.

I'm trying to consume from a Kafka topic that peaks at about 12 million messages/hour. That topic is not under my control: it has 7 partitions and carries JSON payloads. I wrote a consumer (Java with the Spring-Kafka library) that reads the data, filters it, and loads it into a database. I run 7 instances of my application to match the 7 partitions, with auto-commit enabled, but I ran into huge consumer lag that took 10-12 hours to catch up.

I then split the write logic into a separate layer. Now my architecture has one component that reads, filters, and produces the data to an internal topic (also 7 partitions, but as you can see this one is under my control), and a second consumer that picks the data up from that topic and writes it to the database. That's better, but the consumer lag still takes 3-5 hours to catch up.

Am I missing something fundamental? Are there any other optimization ideas that could help overcome this scale challenge? Any pointers or articles would help too.
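In case it helps to see what I mean by the write path: right now each filtered record ends up as its own database insert. One direction I'm considering is buffering filtered records and flushing them in batches (e.g. via JDBC `addBatch`/`executeBatch`). Here is a minimal plain-Java sketch of that idea, with the Kafka and JDBC parts abstracted away; the class and method names (`BatchingWriter`, `onRecord`, `flush`) are just mine for illustration, not from any library:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Sketch: instead of one INSERT per record, buffer the records that pass the
// filter and hand them to the database in fixed-size batches.
public class BatchingWriter<T> {
    private final int batchSize;
    private final Predicate<T> filter;
    private final Consumer<List<T>> flushToDb; // stand-in for a JDBC batch insert
    private final List<T> buffer = new ArrayList<>();

    public BatchingWriter(int batchSize, Predicate<T> filter, Consumer<List<T>> flushToDb) {
        this.batchSize = batchSize;
        this.filter = filter;
        this.flushToDb = flushToDb;
    }

    // Called for every record polled from the topic.
    public void onRecord(T record) {
        if (!filter.test(record)) {
            return; // drop records that fail the filter
        }
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Flush any remainder, e.g. on a timer or before committing offsets.
    public void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        flushToDb.accept(new ArrayList<>(buffer));
        buffer.clear();
    }

    public static void main(String[] args) {
        List<Integer> flushSizes = new ArrayList<>();
        // Keep even numbers only, write in batches of 500.
        BatchingWriter<Integer> writer =
            new BatchingWriter<>(500, n -> n % 2 == 0, batch -> flushSizes.add(batch.size()));
        for (int i = 0; i < 2000; i++) {
            writer.onRecord(i); // 1000 records pass the filter
        }
        writer.flush();
        System.out.println(flushSizes); // prints [500, 500]
    }
}
```

Does that direction make sense, or is the bigger win somewhere else (e.g. `max.poll.records`, batch listeners, or more partitions on my internal topic)?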
Appreciate your help with this.
Thanks,
Yana