[GitHub] [druid] cloventt opened a new pull request, #12842: Change Kafka Lookup Extractor to not register consumer group

GitBox Sun, 31 Jul 2022 21:09:02 -0700


cloventt opened a new pull request, #12842:
URL: https://github.com/apache/druid/pull/12842


   ### Description
   
   The Kafka lookup extractor has to consume an entire topic from the beginning 
in order to build the internal lookup map. Previously, the extractor would 
always use a randomly generated Kafka `group.id`. This meant that the service 
would register a new consumer group every time it started, essentially 
"forgetting" its previously committed consumer offsets. This guaranteed that 
the service would always consume the entire topic.
   
   This had the unintended side effect of leaving a lot of "ghost" consumer 
groups registered with the Kafka cluster. These consumer groups will never 
be used again, so they just hang around on the broker until Kafka decides to 
delete them (by default, after 2 days). This needlessly adds bloat to the Kafka 
broker.
   
   This has been fixed by setting the Kafka consumer config 
`enable.auto.commit` to `false`. This means that the consumer never attempts to 
commit offsets, achieving the same result as before without leaving a bunch of 
"ghost" consumer groups registered on the broker.
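   
   As a rough illustration of the config change described above, here is a 
minimal sketch that builds consumer properties with auto-commit disabled. The 
class `LookupConsumerConfigSketch` and the method `buildConsumerProps` are 
hypothetical names used only for this example and are not taken from 
`KafkaLookupExtractorFactory`. The `auto.offset.reset` value is also an 
assumption about how the consumer ends up reading from the start of the topic 
when no offsets have been committed.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not the code in this PR.
final class LookupConsumerConfigSketch {

    // Builds Kafka consumer properties that never auto-commit offsets.
    static Properties buildConsumerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // The change described in this PR: disable automatic offset commits,
        // so no committed offsets are left behind on the broker.
        props.setProperty("enable.auto.commit", "false");
        // Assumption: with no committed offsets, start from the earliest
        // available record so the whole topic is consumed.
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildConsumerProps("localhost:9092");
        // Print the resulting configuration (iteration order is unspecified).
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

   The resulting `Properties` object could then be passed to a Kafka consumer 
constructor. The broker address above is a placeholder.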
   
   I also took the chance to flesh out the documentation a whole bunch.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `org.apache.druid.query.lookup.KafkaLookupExtractorFactory`
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [x] been tested in a test Druid cluster.
   


