[ 
https://issues.apache.org/jira/browse/KAFKA-13593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Schlansker resolved KAFKA-13593.
---------------------------------------
    Resolution: Fixed

> ThrottledChannelReaper slows broker shutdown by multiple seconds
> ----------------------------------------------------------------
>
>                 Key: KAFKA-13593
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13593
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 3.0.0
>            Reporter: Steven Schlansker
>            Priority: Minor
>
> We run an embedded KRaft broker in integration tests, to test that our 
> Producer / Consumers are all hooked up and verify our overall pipeline.
> While trying to deliver CI speed improvements, we noticed that the majority 
> of time for a small test is actually spent in Kafka broker shutdown. From the 
> logs, it looks like the ClientQuotaManager has multiple 
> ThrottledChannelReaper threads and each of them takes up to a second to 
> shutdown.
> {code:java}
> 2022-01-12T15:26:31.932Z [main] INFO  kafka.log.LogManager - Shutdown 
> complete.
> 2022-01-12T15:26:31.934Z [main] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-Fetch]: Shutting down
> 2022-01-12T15:26:32.311Z [main] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-Fetch]: Shutdown completed
> 2022-01-12T15:26:32.311Z [ThrottledChannelReaper-Fetch] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-Fetch]: Stopped
> 2022-01-12T15:26:32.311Z [main] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-Produce]: Shutting down
> 2022-01-12T15:26:33.312Z [ThrottledChannelReaper-Produce] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-Produce]: Stopped
> 2022-01-12T15:26:33.312Z [main] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-Produce]: Shutdown completed
> 2022-01-12T15:26:33.312Z [main] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-Request]: Shutting down
> 2022-01-12T15:26:34.315Z [ThrottledChannelReaper-Request] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-Request]: Stopped
> 2022-01-12T15:26:34.315Z [main] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-Request]: Shutdown completed
> 2022-01-12T15:26:34.315Z [main] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-ControllerMutation]: Shutting down
> 2022-01-12T15:26:35.317Z [main] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-ControllerMutation]: Shutdown completed
> 2022-01-12T15:26:35.317Z [ThrottledChannelReaper-ControllerMutation] INFO  
> k.s.ClientQuotaManager$ThrottledChannelReaper - 
> [ThrottledChannelReaper-ControllerMutation]: Stopped{code}
> Inspecting the code, the ThrottledChannelReaper threads are marked as not 
> interruptible, so ShutdownableThread does not interrupt the worker on 
> shutdown. Unfortunately, the doWork method polls with a 1 second timeout. So 
> you wait up to 1s for every type of quota, in this case for a total of almost 
> 4s.
>  
> While this is not a problem for production Kafka brokers, where instances are 
> expected to stay up for months, in tests that expect to spin up and down it 
> is easily noticed and adds real overhead to CI.
>  
> Suggested possible remediations:
>  * Allow ThrottledChannelReaper to be interrupted by ShutdownableThread
>  * ThrottledChannelReaper overrides initiateShutdown and after calling 
> {{super.initiateShutdown}} then enqueues a {{null}} element on the delayQueue 
> to force worker thread to notice shutdown state
>  * Reduce 1s poll timeout to something small (carries overhead penalty for 
> all users though, so this is less desirable), or make it configurable so we 
> can set it to e.g. 10ms in unit tests



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to