cadonna commented on a change in pull request #9615: URL: https://github.com/apache/kafka/pull/9615#discussion_r534261843
##########
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##########
@@ -870,43 +899,73 @@ private KafkaStreams(final InternalTopologyBuilder internalTopologyBuilder,
                 cacheSizePerThread,
                 stateDirectory,
                 delegatingStateRestoreListener,
-                i + 1,
+                threadIdx,
                 KafkaStreams.this::closeToError,
-                this::defaultStreamsUncaughtExceptionHandler
-            );
-            threads.add(streamThread);
-            threadState.put(streamThread.getId(), streamThread.state());
-            storeProviders.add(new StreamThreadStateStoreProvider(streamThread));
-        }
+                streamsUncaughtExceptionHandler
+        );
+        streamThread.setStateListener(streamStateListener);
+        threads.add(streamThread);
+        threadState.put(streamThread.getId(), streamThread.state());
+        storeProviders.add(new StreamThreadStateStoreProvider(streamThread));
+        return streamThread;
+    }
-        ClientMetrics.addNumAliveStreamThreadMetric(streamsMetrics, (metricsConfig, now) ->
-            Math.toIntExact(threads.stream().filter(thread -> thread.state().isAlive()).count()));
+    /**
+     * Adds and starts a stream thread in addition to the stream threads that are already running in this
+     * Kafka Streams client.
+     * <p>
+     * Since the number of stream threads increases, the sizes of the caches in the new stream thread
+     * and the existing stream threads are adapted so that the sum of the cache sizes over all stream
+     * threads does not exceed the total cache size specified in configuration
+     * {@link StreamsConfig#CACHE_MAX_BYTES_BUFFERING_CONFIG}.
+     * <p>
+     * Stream threads can only be added if this Kafka Streams client is in state RUNNING or REBALANCING.
+     *
+     * @return name of the added stream thread or empty if a new stream thread could not be added
+     */
+    public Optional<String> addStreamThread() {
+        if (isRunningOrRebalancing()) {
+            final int threadIdx = getNextThreadIndex();
+            final long cacheSizePerThread = getCacheSizePerThread(threads.size() + 1);
+            resizeThreadCache(cacheSizePerThread);
+            final StreamThread streamThread = createStreamThread(cacheSizePerThread, threadIdx);
+            synchronized (stateLock) {
+                if (isRunningOrRebalancing()) {
+                    streamThread.start();
+                    return Optional.of(streamThread.getName());
+                } else {
+                    threads.remove(streamThread);

Review comment:
Good point about the shutdown of the stream thread! Actually, I did not want to put everything in the synchronized block because I thought that blocking the client state longer than needed was not a good idea. In particular, I thought that decreasing the size of the cache might be costly if the evicted records are forwarded downstream. Now that you mention synchronizing on a separate lock, I noticed that we probably need to put resize, start, and cleanup in the same synchronized block. The reason is that if two threads call `addStreamThread()` one after the other and the later thread passes
```
final long cacheSizePerThread = getCacheSizePerThread(threads.size() + 1);
```
before the earlier thread adds the new stream thread to `threads` in `createStreamThread()`, the later thread would compute the wrong cache size, because both callers would see the same value of `threads.size()`. So, I am in favor of having a separate lock that just synchronizes the threads calling `addStreamThread()`. Maybe we can simply synchronize the whole method (which means synchronizing with `start()` and `close()`). Still, a minor issue seems to be the synchronization between `isRunningOrRebalancing()` and `streamThread.start()`.
If the Streams client transitions to `ERROR` between these two calls (because the global stream thread died), an `IllegalStateException` would be thrown from the `StreamStateListener`, because the Streams client would try to transition from `ERROR` to `REBALANCING`. But I guess that would also happen if the Streams client transitions to `ERROR` before the new stream thread transitions to `PARTITIONS_ASSIGNED` and calls the `StreamStateListener`, which would then transition the Streams client to `REBALANCING`. So it needs to be fixed somewhere else. Did I miss something?
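To make the cache-size race and the separate-lock idea concrete, here is a minimal sketch. It is not the code from this PR: `AddThreadSketch`, `Worker`, `addWorker()`, and `addThreadLock` are hypothetical stand-ins for the Streams client, `StreamThread`, `addStreamThread()`, and the proposed lock, and the sketch only tracks cache sizes instead of starting real threads. It assumes that the state check, the resize, and the registration of the new thread all happen under one dedicated lock, so that a second caller cannot read the thread count before the first caller has added its thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for the Streams client; not the real KafkaStreams class. */
class AddThreadSketch {

    /** Hypothetical stand-in for a stream thread; it only records its cache size. */
    static final class Worker {
        final String name;
        long cacheSize;

        Worker(final String name, final long cacheSize) {
            this.name = name;
            this.cacheSize = cacheSize;
        }
    }

    // Dedicated lock that only serializes callers of addWorker() (assumption:
    // it is separate from the lock guarding the client state).
    private final Object addThreadLock = new Object();
    private final List<Worker> workers = new ArrayList<>();
    private final long totalCacheSize;
    private volatile boolean runningOrRebalancing = true;

    AddThreadSketch(final long totalCacheSize) {
        this.totalCacheSize = totalCacheSize;
    }

    /** Returns the name of the added worker, or empty if the client no longer accepts workers. */
    Optional<String> addWorker() {
        synchronized (addThreadLock) {
            if (!runningOrRebalancing) {
                return Optional.empty();
            }
            // The size is computed from a worker count that no other caller
            // can change until this caller has registered its own worker.
            final long cacheSizePerWorker = totalCacheSize / (workers.size() + 1);
            for (final Worker existing : workers) {
                existing.cacheSize = cacheSizePerWorker; // resize existing caches
            }
            final Worker added = new Worker("worker-" + (workers.size() + 1), cacheSizePerWorker);
            workers.add(added);
            return Optional.of(added.name);
        }
    }

    /** Sum of the cache sizes currently assigned to all workers. */
    long totalAssignedCache() {
        synchronized (addThreadLock) {
            long sum = 0;
            for (final Worker worker : workers) {
                sum += worker.cacheSize;
            }
            return sum;
        }
    }

    int workerCount() {
        synchronized (addThreadLock) {
            return workers.size();
        }
    }

    /** Simulates the client leaving RUNNING/REBALANCING. */
    void shutDown() {
        runningOrRebalancing = false;
    }
}
```

With the whole sequence under one lock, the sum of the assigned cache sizes never exceeds the configured total, no matter how calls to `addWorker()` interleave. The sketch deliberately leaves out the `ERROR` to `REBALANCING` transition problem, which, as noted above, would need to be addressed elsewhere.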