ableegoldman commented on a change in pull request #9984:
URL: https://github.com/apache/kafka/pull/9984#discussion_r565611900



##########
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##########
@@ -91,6 +93,7 @@
 import java.util.concurrent.Executors;
 import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;

Review comment:
       There's actually a Kafka-specific version of `TimeoutException` that you 
should use to stay consistent with other Kafka APIs: 
`org.apache.kafka.common.errors.TimeoutException`
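
To illustrate why the import matters: the two classes share a simple name but are unrelated types, so a catch clause written for one never sees the other. The sketch below uses a local stand-in class (`KafkaStyleTimeoutException`, hypothetical) in place of `org.apache.kafka.common.errors.TimeoutException` so that it compiles without the Kafka client jar. It assumes the Kafka class is unchecked, which I believe is the case, whereas `java.util.concurrent.TimeoutException` is checked.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

class TimeoutExceptionMixup {

    // Hypothetical stand-in for org.apache.kafka.common.errors.TimeoutException,
    // modelled as unchecked so the example runs without the Kafka jar.
    static class KafkaStyleTimeoutException extends RuntimeException {
        KafkaStyleTimeoutException(final String message) {
            super(message);
        }
    }

    // A caller that only knows about java.util.concurrent.TimeoutException.
    static String classify(final Callable<Void> action) {
        try {
            action.call();
            return "completed";
        } catch (final TimeoutException e) {
            return "handled as java.util.concurrent.TimeoutException";
        } catch (final Exception e) {
            // The Kafka-style timeout lands here because the types are unrelated.
            return "not handled as a timeout: " + e.getClass().getSimpleName();
        }
    }
}
```

A caller who catches the wrong one gets no compiler warning; the timeout simply falls through to whatever generic handler sits further out.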

##########
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##########
@@ -1005,11 +1008,60 @@ private StreamThread createAndAddStreamThread(final 
long cacheSizePerThread, fin
                             || threads.size() == 1)) {
                         streamThread.shutdown();
                         if 
(!streamThread.getName().equals(Thread.currentThread().getName())) {
-                            
streamThread.waitOnThreadState(StreamThread.State.DEAD);
+                            
streamThread.waitOnThreadState(StreamThread.State.DEAD, -1);
                         }
                         threads.remove(streamThread);
                         final long cacheSizePerThread = 
getCacheSizePerThread(threads.size());
                         resizeThreadCache(cacheSizePerThread);
+                        final Collection<MemberToRemove> membersToRemove = 
Collections.singletonList(new 
MemberToRemove(streamThread.getGroupInstanceID()));
+                        
adminClient.removeMembersFromConsumerGroup(config.getString(StreamsConfig.APPLICATION_ID_CONFIG),
 new RemoveMembersFromConsumerGroupOptions(membersToRemove));
+                        return Optional.of(streamThread.getName());
+                    }
+                }
+            }
+            log.warn("There are no threads eligible for removal");
+        } else {
+            log.warn("Cannot remove a stream thread when Kafka Streams client 
is in state  " + state());
+        }
+        return Optional.empty();
+    }
+
+    /**
+     * Removes one stream thread out of the running stream threads from this 
Kafka Streams client.
+     * <p>
+     * The removed stream thread is gracefully shut down. This method does not 
specify which stream
+     * thread is shut down.
+     * <p>
+     * Since the number of stream threads decreases, the sizes of the caches 
in the remaining stream
+     * threads are adapted so that the sum of the cache sizes over all stream 
threads equals the total
+     * cache size specified in configuration {@link 
StreamsConfig#CACHE_MAX_BYTES_BUFFERING_CONFIG}.
+     *
+     * @param timeout The the length of time to wait for the thread to shutdown
+     * @throws TimeoutException if the thread does not stop in time
+     * @return name of the removed stream thread or empty if a stream thread 
could not be removed because
+     *         no stream threads are alive
+     */
+    public Optional<String> removeStreamThread(final Duration timeout) throws 
TimeoutException {
+        final String msgPrefix = prepareMillisCheckFailMsgPrefix(timeout, 
"timeout");
+        final long timeoutMs = validateMillisecondDuration(timeout, msgPrefix);
+        if (isRunningOrRebalancing()) {
+            synchronized (changeThreadCount) {
+                // make a copy of threads to avoid holding lock
+                for (final StreamThread streamThread : new 
ArrayList<>(threads)) {
+                    if (streamThread.isAlive() && 
(!streamThread.getName().equals(Thread.currentThread().getName())
+                            || threads.size() == 1)) {
+                        streamThread.shutdown();
+                        if 
(!streamThread.getName().equals(Thread.currentThread().getName())) {
+                            if 
(!streamThread.waitOnThreadState(StreamThread.State.DEAD, timeoutMs)) {
+                                log.warn("Thread " + streamThread.getName() + 
" did not stop in the allotted time");
+                                throw new TimeoutException("Thread " + 
streamThread.getName() + " did not stop in the allotted time");
+                            }
+                        }
+                        threads.remove(streamThread);
+                        final long cacheSizePerThread = 
getCacheSizePerThread(threads.size());
+                        resizeThreadCache(cacheSizePerThread);
+                        Collection<MemberToRemove> membersToRemove = 
Collections.singletonList(new 
MemberToRemove(streamThread.getGroupInstanceID()));

Review comment:
       I'm not sure how `removeMembersFromConsumerGroup` would behave if you 
passed in `""` as the `group.instance.id`. Do you know? If not, let's just 
be safe: check what `streamThread.getGroupInstanceID()` returns, and skip 
this call if there is no `group.instance.id` (i.e. if the member is not static).
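
A minimal sketch of the guard I have in mind, using only JDK types. The `removeMember` callback is a hypothetical placeholder for the `adminClient.removeMembersFromConsumerGroup(...)` call, and I'm assuming a missing `group.instance.id` shows up as either `null` or `""`.

```java
import java.util.function.Consumer;

class StaticMemberGuard {

    // Invokes removeMember only for static members, i.e. when a non-empty
    // group.instance.id is present. Returns whether the call was made.
    // removeMember is a placeholder for the real admin-client call.
    static boolean removeIfStatic(final String groupInstanceId, final Consumer<String> removeMember) {
        if (groupInstanceId == null || groupInstanceId.isEmpty()) {
            // Dynamic member: nothing to remove explicitly, so skip the call.
            return false;
        }
        removeMember.accept(groupInstanceId);
        return true;
    }
}
```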

##########
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##########
@@ -1005,11 +1008,60 @@ private StreamThread createAndAddStreamThread(final 
long cacheSizePerThread, fin
                             || threads.size() == 1)) {
                         streamThread.shutdown();
                         if 
(!streamThread.getName().equals(Thread.currentThread().getName())) {
-                            
streamThread.waitOnThreadState(StreamThread.State.DEAD);
+                            
streamThread.waitOnThreadState(StreamThread.State.DEAD, -1);
                         }
                         threads.remove(streamThread);
                         final long cacheSizePerThread = 
getCacheSizePerThread(threads.size());
                         resizeThreadCache(cacheSizePerThread);
+                        final Collection<MemberToRemove> membersToRemove = 
Collections.singletonList(new 
MemberToRemove(streamThread.getGroupInstanceID()));
+                        
adminClient.removeMembersFromConsumerGroup(config.getString(StreamsConfig.APPLICATION_ID_CONFIG),
 new RemoveMembersFromConsumerGroupOptions(membersToRemove));
+                        return Optional.of(streamThread.getName());
+                    }
+                }
+            }
+            log.warn("There are no threads eligible for removal");
+        } else {
+            log.warn("Cannot remove a stream thread when Kafka Streams client 
is in state  " + state());
+        }
+        return Optional.empty();
+    }
+
+    /**
+     * Removes one stream thread out of the running stream threads from this 
Kafka Streams client.
+     * <p>
+     * The removed stream thread is gracefully shut down. This method does not 
specify which stream
+     * thread is shut down.
+     * <p>
+     * Since the number of stream threads decreases, the sizes of the caches 
in the remaining stream
+     * threads are adapted so that the sum of the cache sizes over all stream 
threads equals the total
+     * cache size specified in configuration {@link 
StreamsConfig#CACHE_MAX_BYTES_BUFFERING_CONFIG}.
+     *
+     * @param timeout The the length of time to wait for the thread to shutdown
+     * @throws TimeoutException if the thread does not stop in time
+     * @return name of the removed stream thread or empty if a stream thread 
could not be removed because
+     *         no stream threads are alive
+     */
+    public Optional<String> removeStreamThread(final Duration timeout) throws 
TimeoutException {
+        final String msgPrefix = prepareMillisCheckFailMsgPrefix(timeout, 
"timeout");
+        final long timeoutMs = validateMillisecondDuration(timeout, msgPrefix);
+        if (isRunningOrRebalancing()) {
+            synchronized (changeThreadCount) {
+                // make a copy of threads to avoid holding lock
+                for (final StreamThread streamThread : new 
ArrayList<>(threads)) {
+                    if (streamThread.isAlive() && 
(!streamThread.getName().equals(Thread.currentThread().getName())
+                            || threads.size() == 1)) {
+                        streamThread.shutdown();
+                        if 
(!streamThread.getName().equals(Thread.currentThread().getName())) {
+                            if 
(!streamThread.waitOnThreadState(StreamThread.State.DEAD, timeoutMs)) {
+                                log.warn("Thread " + streamThread.getName() + 
" did not stop in the allotted time");
+                                throw new TimeoutException("Thread " + 
streamThread.getName() + " did not stop in the allotted time");
+                            }
+                        }
+                        threads.remove(streamThread);
+                        final long cacheSizePerThread = 
getCacheSizePerThread(threads.size());
+                        resizeThreadCache(cacheSizePerThread);
+                        Collection<MemberToRemove> membersToRemove = 
Collections.singletonList(new 
MemberToRemove(streamThread.getGroupInstanceID()));
+                        
adminClient.removeMembersFromConsumerGroup(config.getString(StreamsConfig.APPLICATION_ID_CONFIG),
 new RemoveMembersFromConsumerGroupOptions(membersToRemove));

Review comment:
       Ok, this is going to be a little tricky... 
`removeMembersFromConsumerGroup` is async, so we have two options: (1) 
just ignore the returned result and hope that it succeeded, or (2) check the 
returned `KafkaFuture` and wait to make sure that it succeeded.
   
   We should probably go with (2) and just apply the remaining time of the 
timeout. If you haven't mucked around with the `KafkaFuture` class before, I 
believe `KafkaFuture#get(long timeout, TimeUnit unit)` is what you'd need here.
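
Roughly what "apply the remaining time" could look like. This is a sketch against the plain `java.util.concurrent.Future` interface (which I believe `KafkaFuture` implements); the helper name `getWithinBudget` and its parameters are made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class RemainingTimeout {

    // Waits on the future using only what is left of the overall budget.
    // startNanos is the System.nanoTime() reading taken when the whole
    // operation (shutdown + wait + group removal) began.
    static <T> T getWithinBudget(final Future<T> future, final long timeoutMs, final long startNanos)
            throws InterruptedException, ExecutionException, TimeoutException {
        final long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
        // Never pass a negative wait; an exhausted budget becomes a zero wait.
        final long remainingMs = Math.max(0L, timeoutMs - elapsedMs);
        return future.get(remainingMs, TimeUnit.MILLISECONDS);
    }
}
```

Note that `Future#get` throws the `java.util.concurrent` flavour of `TimeoutException`, so the caller would need to catch it and rethrow the Kafka one to keep the public API consistent.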

##########
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##########
@@ -1005,11 +1008,60 @@ private StreamThread createAndAddStreamThread(final 
long cacheSizePerThread, fin
                             || threads.size() == 1)) {
                         streamThread.shutdown();
                         if 
(!streamThread.getName().equals(Thread.currentThread().getName())) {
-                            
streamThread.waitOnThreadState(StreamThread.State.DEAD);
+                            
streamThread.waitOnThreadState(StreamThread.State.DEAD, -1);

Review comment:
       To be consistent with the semantics of `KafkaStreams#close`, the 
overload with no parameter should probably default to fully blocking, i.e. 
a timeout of `Long.MAX_VALUE`. Also, to avoid duplicate code, I would just 
have this method call `removeStreamThread(final Duration timeout)` instead of 
doing everything twice. Again, something like what we do for `#close`.
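
The delegation could be as small as this. `DelegatingRemover` and its `lastTimeoutMs` field are hypothetical and exist only so the sketch can be run on its own; the real method body (shutdown, wait, cache resize, group removal) is elided.

```java
import java.time.Duration;
import java.util.Optional;

class DelegatingRemover {

    // Records the timeout the timed overload received, purely for illustration.
    long lastTimeoutMs = -1L;

    // No-argument overload: fully blocking, expressed as the largest timeout.
    Optional<String> removeStreamThread() {
        return removeStreamThread(Duration.ofMillis(Long.MAX_VALUE));
    }

    // Timed overload: the single place where the actual work would live.
    Optional<String> removeStreamThread(final Duration timeout) {
        lastTimeoutMs = timeout.toMillis();
        // ... shut down the thread, wait up to lastTimeoutMs, resize caches ...
        return Optional.empty();
    }
}
```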

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -610,17 +610,32 @@ public void setStreamsUncaughtExceptionHandler(final 
java.util.function.Consumer
         this.streamsUncaughtExceptionHandler = streamsUncaughtExceptionHandler;
     }
 
-    public void waitOnThreadState(final StreamThread.State targetState) {
+    public boolean waitOnThreadState(final StreamThread.State targetState, 
long timeoutMs) {
+        if (timeoutMs < 0) {

Review comment:
       I think if you fix the semantics of `removeStreamThread()` to match that 
of `close()`, then there's no need for a `-1` sentinel, in which case we should 
just throw an `IllegalArgumentException` here (or it's probably better to check 
and throw that in the actual `removeStreamThread(timeout)` call to fail fast).
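
For the fail-fast variant, something along these lines at the top of the public method would do. This is a sketch with a made-up helper name (`validateTimeoutMs`); the existing `validateMillisecondDuration` helper in the diff may already cover part of this, I haven't checked exactly what it rejects.

```java
import java.time.Duration;
import java.util.Objects;

class TimeoutValidation {

    // Rejects null and negative timeouts up front, before any thread is touched.
    // Zero is allowed and means "do not block at all".
    static long validateTimeoutMs(final Duration timeout) {
        Objects.requireNonNull(timeout, "timeout must not be null");
        if (timeout.isNegative()) {
            throw new IllegalArgumentException("timeout must not be negative: " + timeout);
        }
        return timeout.toMillis();
    }
}
```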

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
##########
@@ -610,17 +610,32 @@ public void setStreamsUncaughtExceptionHandler(final 
java.util.function.Consumer
         this.streamsUncaughtExceptionHandler = streamsUncaughtExceptionHandler;
     }
 
-    public void waitOnThreadState(final StreamThread.State targetState) {
+    public boolean waitOnThreadState(final StreamThread.State targetState, 
long timeoutMs) {
+        if (timeoutMs < 0) {
+            timeoutMs = 0;
+        } else if (timeoutMs == 0) {
+            timeoutMs = Long.MAX_VALUE;

Review comment:
       We definitely shouldn't modify the passed-in timeout like this: a user 
should be able to pass in `0` to mean "don't block at all". Mysteriously 
blocking forever when they do so would be pretty weird.
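
Here is one way a timed wait can honour `0` as "check once and return". It's a standalone sketch, not the actual `StreamThread` code: `StateWaiter` and its two-value `State` enum are invented for the example. The thing to watch is that `Object.wait(0)` means "wait indefinitely", so the loop must return before ever calling `wait` with a zero remainder.

```java
import java.util.concurrent.TimeUnit;

class StateWaiter {

    enum State { RUNNING, DEAD }

    private State state = State.RUNNING;

    synchronized void setState(final State newState) {
        state = newState;
        notifyAll();
    }

    // Returns true if the target state was reached within timeoutMs.
    // A timeout of 0 checks the state once and returns without blocking.
    synchronized boolean waitOnState(final State target, final long timeoutMs) throws InterruptedException {
        final long beginNanos = System.nanoTime();
        long elapsedMs = 0L;
        while (state != target) {
            if (elapsedMs >= timeoutMs) {
                return false;
            }
            // The remainder is strictly positive here, so this never becomes wait(0).
            wait(timeoutMs - elapsedMs);
            elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - beginNanos);
        }
        return true;
    }
}
```

With this shape, `Long.MAX_VALUE` still gives the fully blocking behaviour without any special-casing of the argument.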

##########
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##########
@@ -91,6 +93,7 @@
 import java.util.concurrent.Executors;
 import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;

Review comment:
       (Tbh that drives me crazy, I once spent like 4 hours debugging something 
only to realize that I wasn't using the correct TimeoutException 😠 )



