nicktelford opened a new pull request, #17018: URL: https://github.com/apache/kafka/pull/17018
We currently use a `CountDownLatch` to signal to the blocking `shutdown` method that a thread has completed shutdown. However, this latch triggers _before_ the thread has fully exited. Depending on OS thread scheduling, the thread may still be "alive" after the latch has unblocked the `shutdown` method.

In practice, this is mostly a problem for `StreamThreadTest`, which now checks that there are no `TaskExecutor` or `StateUpdater` threads immediately after shutting them down. Sometimes, after `shutdown` returns, we find that these threads are still "alive", usually completing execution of the "thread shutdown" log message, or even the `Thread#exit` JVM method that's invoked to clean up threads just before they exit. This causes sporadic test failures, even though these threads did indeed shut down correctly.

Instead of using a `CountDownLatch`, let's wait for the thread to exit directly, using `Thread#join`. Just as before, we set a timeout, and if the thread is still alive after the timeout, we throw a `StreamsException`, maintaining the contract of the `shutdown` method.

There should be no measurable impact on production code here. This will mostly just improve the reliability of tests that require these threads to have fully exited after calling `shutdown`.
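
For illustration, here is a minimal, self-contained sketch of the join-based approach described above. It is not the code from this PR: `JoinShutdownSketch`, `Worker` and `requestShutdown` are hypothetical names, and `IllegalStateException` stands in for `StreamsException` so the example runs without Kafka on the classpath. The latch is kept only to show where the old signal would fire.

```java
import java.util.concurrent.CountDownLatch;

public class JoinShutdownSketch {

    /** Hypothetical worker; stands in for a TaskExecutor or StateUpdater thread. */
    static final class Worker extends Thread {
        private volatile boolean running = true;

        // Old-style signal: counted down by the worker itself,
        // so it necessarily fires before the thread has exited.
        final CountDownLatch shutdownLatch = new CountDownLatch(1);

        @Override
        public void run() {
            try {
                while (running) {
                    try {
                        Thread.sleep(10); // placeholder for real work
                    } catch (InterruptedException e) {
                        // Woken by requestShutdown(); the loop re-checks the flag.
                    }
                }
            } finally {
                shutdownLatch.countDown();
                // The thread is still alive here. Anything after this point
                // (logging, JVM thread cleanup) races with a caller that
                // only awaited the latch.
            }
        }

        void requestShutdown() {
            running = false;
            interrupt();
        }
    }

    /**
     * Blocks until the worker thread has actually terminated.
     * timeoutMs must be positive: Thread.join(0) waits indefinitely.
     */
    static void shutdown(Worker worker, long timeoutMs) throws InterruptedException {
        worker.requestShutdown();
        worker.join(timeoutMs);
        if (worker.isAlive()) {
            // Assumption: the real code throws StreamsException here.
            throw new IllegalStateException(
                "Worker did not exit within " + timeoutMs + " ms");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Worker worker = new Worker();
        worker.start();
        shutdown(worker, 5_000);
        System.out.println("alive after shutdown: " + worker.isAlive());
    }
}
```

Because `Thread#join` returns only once the target thread has terminated (or the timeout elapses), a successful return from `shutdown` in this sketch guarantees that `isAlive()` is `false`, which is the property the tests rely on.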