[GitHub] [kafka] cadonna commented on a change in pull request #9615: KAFKA-10500: Add thread option

GitBox Wed, 02 Dec 2020 07:34:56 -0800


cadonna commented on a change in pull request #9615:
URL: https://github.com/apache/kafka/pull/9615#discussion_r534261843




##########
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##########
@@ -870,43 +899,73 @@ private KafkaStreams(final InternalTopologyBuilder internalTopologyBuilder,
                 cacheSizePerThread,
                 stateDirectory,
                 delegatingStateRestoreListener,
-                i + 1,
+                threadIdx,
                 KafkaStreams.this::closeToError,
-                this::defaultStreamsUncaughtExceptionHandler
-            );
-            threads.add(streamThread);
-            threadState.put(streamThread.getId(), streamThread.state());
-            storeProviders.add(new StreamThreadStateStoreProvider(streamThread));
-        }
+                streamsUncaughtExceptionHandler
+        );
+        streamThread.setStateListener(streamStateListener);
+        threads.add(streamThread);
+        threadState.put(streamThread.getId(), streamThread.state());
+        storeProviders.add(new StreamThreadStateStoreProvider(streamThread));
+        return streamThread;
+    }
 
-        ClientMetrics.addNumAliveStreamThreadMetric(streamsMetrics, (metricsConfig, now) ->
-            Math.toIntExact(threads.stream().filter(thread -> thread.state().isAlive()).count()));
+    /**
+     * Adds and starts a stream thread in addition to the stream threads that are already running in this
+     * Kafka Streams client.
+     * <p>
+     * Since the number of stream threads increases, the sizes of the caches in the new stream thread
+     * and the existing stream threads are adapted so that the sum of the cache sizes over all stream
+     * threads does not exceed the total cache size specified in configuration
+     * {@link StreamsConfig#CACHE_MAX_BYTES_BUFFERING_CONFIG}.
+     * <p>
+     * Stream threads can only be added if this Kafka Streams client is in state RUNNING or REBALANCING.
+     *
+     * @return name of the added stream thread or empty if a new stream thread could not be added
+     */
+    public Optional<String> addStreamThread() {
+        if (isRunningOrRebalancing()) {
+            final int threadIdx = getNextThreadIndex();
+            final long cacheSizePerThread = getCacheSizePerThread(threads.size() + 1);
+            resizeThreadCache(cacheSizePerThread);
+            final StreamThread streamThread = createStreamThread(cacheSizePerThread, threadIdx);
+            synchronized (stateLock) {
+                if (isRunningOrRebalancing()) {
+                    streamThread.start();
+                    return Optional.of(streamThread.getName());
+                } else {
+                    threads.remove(streamThread);

Review comment:
       Good point about the shutdown of the stream thread!
   
   Actually, I did not want to put everything in the synchronized block because I thought holding the client state lock longer than needed was not a good idea. I thought decreasing the size of the cache might be costly if the evicted records are forwarded downstream.
   Now that you mention synchronizing on a separate lock, I noticed that we probably need to put resize, start, and cleanup in the same synchronized block. The reason is that if two threads call `addStreamThread()` one after the other and the later thread passes
   
   ```
   final long cacheSizePerThread = getCacheSizePerThread(threads.size() + 1);
   ```
   
   before the earlier thread adds the new stream thread to `threads` in `createStreamThread()`, the later thread would compute the cache size from a stale thread count and therefore get the wrong value.
   
   So, I am in favor of having a separate lock that just synchronizes the threads calling `addStreamThread()`. Maybe we can simply synchronize the whole method (which means it would also synchronize with `start()` and `close()`).
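
   To illustrate the idea of serializing callers, here is a minimal sketch of a toy model, not the actual `KafkaStreams` code. The class `AddThreadSketch`, its fields, and its methods are hypothetical; only the name `addStreamThread()` is borrowed from the diff. It assumes the total cache is split evenly across threads, and it puts the size computation, the resize of existing caches, and the registration of the new thread under one lock:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: a list of cache sizes stands in for the real stream threads.
final class AddThreadSketch {
    private final long totalCacheBytes;
    private final List<Long> threadCacheSizes = new ArrayList<>();

    AddThreadSketch(final long totalCacheBytes, final int initialThreads) {
        this.totalCacheBytes = totalCacheBytes;
        for (int i = 0; i < initialThreads; i++) {
            threadCacheSizes.add(totalCacheBytes / initialThreads);
        }
    }

    // Compute, resize, and register under the same monitor, so a later caller
    // always sees the thread count left behind by an earlier caller.
    synchronized long addStreamThread() {
        final long cacheSizePerThread = totalCacheBytes / (threadCacheSizes.size() + 1);
        for (int i = 0; i < threadCacheSizes.size(); i++) {
            threadCacheSizes.set(i, cacheSizePerThread); // shrink existing caches
        }
        threadCacheSizes.add(cacheSizePerThread);        // register the new thread
        return cacheSizePerThread;
    }

    synchronized long sumOfCacheSizes() {
        long sum = 0;
        for (final long size : threadCacheSizes) {
            sum += size;
        }
        return sum;
    }

    synchronized int numThreads() {
        return threadCacheSizes.size();
    }
}
```

   In this model, if the computation of `cacheSizePerThread` happened outside the lock, two callers could both divide by the same thread count and the sum of the cache sizes could exceed the configured total.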
   
   A minor issue that remains is the synchronization between `isRunningOrRebalancing()` and `streamThread.start()`. If the Streams client transitions to `ERROR` between these two calls (because the global stream thread died), an `IllegalStateException` would be thrown from the `StreamStateListener` because the Streams client would try to transition from `ERROR` to `REBALANCING`. But I guess that would also happen if the Streams client transitions to `ERROR` before the new stream thread transitions to `PARTITIONS_ASSIGNED` and calls the `StreamStateListener`, which would then try to transition the Streams client to `REBALANCING`. So it needs to be fixed somewhere else.
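
   The check-then-act gap described above can be sketched with a simplified state machine. This is a toy under stated assumptions: `ClientStateSketch` and `transitionTo()` are hypothetical names, and the only transition rule modeled is that nothing in this sketch may leave `ERROR`. The real client has more states and rules.

```java
// Toy state machine, not the real KafkaStreams state handling.
final class ClientStateSketch {
    enum State { RUNNING, REBALANCING, ERROR }

    private State state = State.RUNNING;

    synchronized boolean isRunningOrRebalancing() {
        return state == State.RUNNING || state == State.REBALANCING;
    }

    // Assumption for this sketch: once in ERROR, any other target state is rejected.
    synchronized void transitionTo(final State newState) {
        if (state == State.ERROR && newState != State.ERROR) {
            throw new IllegalStateException("Cannot transition from " + state + " to " + newState);
        }
        state = newState;
    }

    synchronized State state() {
        return state;
    }
}
```

   A caller that checks `isRunningOrRebalancing()`, then loses the race to another thread calling `transitionTo(State.ERROR)`, and then requests `REBALANCING` gets the exception even though its own check passed. Each method is synchronized on its own, but the check and the later transition are not one atomic step.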
   
   Did I miss something? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
