ableegoldman commented on a change in pull request #9984: URL: https://github.com/apache/kafka/pull/9984#discussion_r565797234
##########
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##########
@@ -997,19 +1002,63 @@ private StreamThread createAndAddStreamThread(final long cacheSizePerThread, fin
      * no stream threads are alive
      */
     public Optional<String> removeStreamThread() {
+        return removeStreamThread(Long.MAX_VALUE);
+    }
+
+    /**
+     * Removes one stream thread out of the running stream threads from this Kafka Streams client.
+     * <p>
+     * The removed stream thread is gracefully shut down. This method does not specify which stream
+     * thread is shut down.
+     * <p>
+     * Since the number of stream threads decreases, the sizes of the caches in the remaining stream
+     * threads are adapted so that the sum of the cache sizes over all stream threads equals the total
+     * cache size specified in configuration {@link StreamsConfig#CACHE_MAX_BYTES_BUFFERING_CONFIG}.
+     *
+     * @param timeout The the length of time to wait for the thread to shutdown
+     * @throws TimeoutException if the thread does not stop in time
+     * @return name of the removed stream thread or empty if a stream thread could not be removed because
+     *         no stream threads are alive
+     */
+    public Optional<String> removeStreamThread(final Duration timeout) throws TimeoutException {
+        final String msgPrefix = prepareMillisCheckFailMsgPrefix(timeout, "timeout");
+        final long timeoutMs = validateMillisecondDuration(timeout, msgPrefix);
+        return removeStreamThread(timeoutMs);
+    }
+
+    private Optional<String> removeStreamThread(final long timeoutMs) throws TimeoutException {
+        final long begin = time.milliseconds();
         if (isRunningOrRebalancing()) {
             synchronized (changeThreadCount) {
                 // make a copy of threads to avoid holding lock
                 for (final StreamThread streamThread : new ArrayList<>(threads)) {
                     if (streamThread.isAlive() && (!streamThread.getName().equals(Thread.currentThread().getName()) || threads.size() == 1)) {
+                        final Optional<String> groupInstanceID = streamThread.getGroupInstanceID();
                         streamThread.shutdown();
                         if (!streamThread.getName().equals(Thread.currentThread().getName())) {
-                            streamThread.waitOnThreadState(StreamThread.State.DEAD);
+                            if (!streamThread.waitOnThreadState(StreamThread.State.DEAD, timeoutMs)) {
+                                log.warn("Thread " + streamThread.getName() + " did not stop in the allotted time");
+                                throw new TimeoutException("Thread " + streamThread.getName() + " did not stop in the allotted time");

Review comment:
It does seem like a gray area. Still, the TimeoutException doesn't necessarily mean the shutdown failed, only that we didn't wait long enough for it to finish. We have definitely initiated the shutdown at that point. And if the thread really is stuck in its shutdown, it's probably beneficial to go ahead with the `removeMembersFromConsumerGroup` call so that it gets kicked out of the group all the sooner.

In the end, though, we make no guarantees about the state of the application if a user chooses to ignore the TimeoutException (though they absolutely can). I can imagine some users choosing to just swallow it because they don't care that the shutdown is taking a long time. It's hard to say.
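
To make the trade-off in this comment concrete, here is a minimal, self-contained sketch of the ordering being argued for: initiate the shutdown, wait up to the timeout, do the group cleanup either way, and only then report the timeout. It does not use any Kafka classes. `Worker`, `removeWorker`, `removedMembers`, and `ShutdownTimeoutException` are hypothetical stand-ins for the stream thread, `removeStreamThread(Duration)`, the effect of `removeMembersFromConsumerGroup`, and the TimeoutException in the diff. The exception is modeled as unchecked so a caller is free to swallow it.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class ShutdownTimeoutSketch {
    /** Hypothetical unchecked timeout, standing in for the exception thrown in the diff. */
    static final class ShutdownTimeoutException extends RuntimeException {
        ShutdownTimeoutException(final String message) {
            super(message);
        }
    }

    /** Hypothetical record of members removed from the group; not a Kafka API. */
    static final List<String> removedMembers = new ArrayList<>();

    /** Hypothetical worker whose cleanup takes cleanupMs once shutdown is requested. */
    static final class Worker extends Thread {
        private final CountDownLatch stopRequested = new CountDownLatch(1);
        private final CountDownLatch dead = new CountDownLatch(1);
        private final long cleanupMs;

        Worker(final String name, final long cleanupMs) {
            super(name);
            this.cleanupMs = cleanupMs;
            setDaemon(true); // a stuck worker must not keep the JVM alive
        }

        @Override
        public void run() {
            try {
                stopRequested.await();   // run until shutdown is requested
                Thread.sleep(cleanupMs); // simulated slow shutdown work
            } catch (final InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                dead.countDown();        // signal that the worker has finished
            }
        }

        /** Only initiates the shutdown; does not wait for it to complete. */
        void shutdown() {
            stopRequested.countDown();
        }

        /** Returns true if the worker finished within timeoutMs, false otherwise. */
        boolean waitUntilDead(final long timeoutMs) {
            try {
                return dead.await(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (final InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Shuts the worker down, removes it from the group regardless, then reports a timeout. */
    static Optional<String> removeWorker(final Worker worker, final Duration timeout) {
        worker.shutdown();
        final boolean stopped = worker.waitUntilDead(timeout.toMillis());
        // Cleanup happens even on timeout: the shutdown has been initiated either way.
        removedMembers.add(worker.getName());
        if (!stopped) {
            throw new ShutdownTimeoutException(
                "Thread " + worker.getName() + " did not stop in the allotted time");
        }
        return Optional.of(worker.getName());
    }

    public static void main(final String[] args) {
        final Worker slow = new Worker("slow-worker", 2_000);
        slow.start();
        try {
            removeWorker(slow, Duration.ofMillis(50));
        } catch (final ShutdownTimeoutException e) {
            // A caller that does not care about a slow shutdown can swallow the exception.
            System.out.println("Ignoring slow shutdown: " + e.getMessage());
        }
        System.out.println("Removed from group: " + removedMembers);
    }
}
```

In this sketch a caller that swallows the exception still ends up with the worker removed from the (simulated) group, which is the behavior the comment leans toward. Whether the real method should behave this way is exactly the open question in the review.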