Brian Byrne created KAFKA-8904:
----------------------------------

             Summary: Reduce metadata lookups when producing to a large number 
of topics
                 Key: KAFKA-8904
                 URL: https://issues.apache.org/jira/browse/KAFKA-8904
             Project: Kafka
          Issue Type: Improvement
          Components: controller, producer 
            Reporter: Brian Byrne


Per [~lbradstreet]:
 
"The problem was that the producer starts with no knowledge of topic metadata. 
So they start the producer up, and then they start sending messages to any of 
the thousands of topics that exist. Each time a message is sent to a new topic, 
it'll trigger a metadata request if the producer doesn't know about it. These 
metadata requests are done in serial such that if you send 2000 messages to 
2000 topics, it will trigger 2000 new metadata requests.
 
Each successive metadata request will include every topic seen so far, so the 
first metadata request will include 1 topic, the second will include 2 topics, 
etc.
 
An additional problem is that this can take a while, and metadata expiry (for 
metadata that has not been recently used) is hard-coded to 5 minutes, so if the 
initial fetches take long enough you can end up evicting the metadata before 
you send another message to a topic.

So the approaches above are:
1. We can linger for a bit before making a metadata request, allow more sends 
to go through, and then batch the topics we need into a single metadata 
request.
2. We can allow pre-seeding the producer with metadata for a list of topics you 
care about.

I prefer 1 if we can make it work."
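
To make the reported behavior concrete, the following is a minimal, illustrative sketch (not taken from this ticket) of a producer that sends one record to each of 2000 topics. The broker address and topic names are hypothetical. Because the producer starts with no topic metadata, the first send() to each unseen topic blocks (up to max.block.ms) while a metadata request for that topic is issued, so the fetches happen one after another.

{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ManyTopicsProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 2000; i++) {
                String topic = "topic-" + i; // hypothetical topic names
                long start = System.nanoTime();
                // The first send to a previously unseen topic blocks while the
                // producer fetches metadata for it; each new topic adds another
                // serial metadata round trip.
                producer.send(new ProducerRecord<>(topic, "key", "value"));
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.printf("first send to %s took %d ms%n", topic, elapsedMs);
            }
        }
    }
}
{code}

Timing the individual send() calls this way makes the per-topic metadata stalls visible, which is the effect described above.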


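Approach 2 (pre-seeding) has no dedicated producer API today, but it can be approximated from the application side with the existing partitionsFor() call, which blocks until metadata for the given topic is available. A hedged sketch, again with a hypothetical broker address and topic names, is shown below. Note that the lookups are still issued one topic at a time, so this only moves the cost to startup rather than batching the requests; repeating the calls periodically would also keep the metadata from being evicted by the five-minute idle expiry mentioned above.

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.serialization.StringSerializer;

public class MetadataPreseed {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Hypothetical list of topics the application knows it will produce to.
        List<String> knownTopics = Arrays.asList("topic-0", "topic-1", "topic-2");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Warm the metadata cache before any records are sent, so later
            // send() calls do not stall on per-topic metadata fetches.
            for (String topic : knownTopics) {
                List<PartitionInfo> partitions = producer.partitionsFor(topic);
                System.out.printf("warmed %s: %d partitions%n", topic, partitions.size());
            }
            // ... regular send() calls follow ...
        }
    }
}
{code}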

