cadonna commented on a change in pull request #9615:
URL: https://github.com/apache/kafka/pull/9615#discussion_r534261843
##########
File path: streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java
##########
@@ -870,43 +899,73 @@ private KafkaStreams(final InternalTopologyBuilder internalTopologyBuilder,
cacheSizePerThread,
stateDirectory,
delegatingStateRestoreListener,
- i + 1,
+ threadIdx,
KafkaStreams.this::closeToError,
- this::defaultStreamsUncaughtExceptionHandler
- );
- threads.add(streamThread);
- threadState.put(streamThread.getId(), streamThread.state());
- storeProviders.add(new StreamThreadStateStoreProvider(streamThread));
- }
+ streamsUncaughtExceptionHandler
+ );
+ streamThread.setStateListener(streamStateListener);
+ threads.add(streamThread);
+ threadState.put(streamThread.getId(), streamThread.state());
+ storeProviders.add(new StreamThreadStateStoreProvider(streamThread));
+ return streamThread;
+ }
- ClientMetrics.addNumAliveStreamThreadMetric(streamsMetrics, (metricsConfig, now) ->
- Math.toIntExact(threads.stream().filter(thread -> thread.state().isAlive()).count()));
+ /**
+ * Adds and starts a stream thread in addition to the stream threads that are already running in this
+ * Kafka Streams client.
+ * <p>
+ * Since the number of stream threads increases, the sizes of the caches in the new stream thread
+ * and the existing stream threads are adapted so that the sum of the cache sizes over all stream
+ * threads does not exceed the total cache size specified in configuration
+ * {@link StreamsConfig#CACHE_MAX_BYTES_BUFFERING_CONFIG}.
+ * <p>
+ * Stream threads can only be added if this Kafka Streams client is in state RUNNING or REBALANCING.
+ *
+ * @return name of the added stream thread or empty if a new stream thread could not be added
+ */
+ public Optional<String> addStreamThread() {
+ if (isRunningOrRebalancing()) {
+ final int threadIdx = getNextThreadIndex();
+ final long cacheSizePerThread = getCacheSizePerThread(threads.size() + 1);
+ resizeThreadCache(cacheSizePerThread);
+ final StreamThread streamThread = createStreamThread(cacheSizePerThread, threadIdx);
+ synchronized (stateLock) {
+ if (isRunningOrRebalancing()) {
+ streamThread.start();
+ return Optional.of(streamThread.getName());
+ } else {
+ threads.remove(streamThread);
Review comment:
Good point about the shutdown of the stream thread!
Actually, I did not want to put everything in the synchronized block
because I thought blocking the client state longer than needed was not a good
idea. I thought decreasing the size of the cache might be costly if the evicted
records are forwarded downstream.
Now that you mention synchronizing on a separate lock, I noticed that we
probably need to put resize, start, and cleanup in the same synchronized block.
The reason is that if two threads call `addStreamThread()` one after the other
and the later thread passes
```
final long cacheSizePerThread = getCacheSizePerThread(threads.size() + 1);
```
before the earlier thread adds the new stream thread to `threads` in
`createStreamThread()`, the later thread would compute the cache size from a
stale thread count.
So, I am in favor of having a separate lock that just synchronizes the
threads calling `addStreamThread()`. Maybe we can simply synchronize the whole
method (which means synchronizing with `start()` and `close()`).
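To illustrate the separate-lock idea, here is a minimal sketch, not the actual
`KafkaStreams` code. The class `AddThreadSketch`, the lock name
`changeThreadCount`, and the helper `addConcurrently()` are hypothetical; the
sketch only assumes that the cache size per thread is the total cache size
divided by the thread count. The point is that reading `threads.size()` and
adding to `threads` happen under the same lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the Streams client. It is not the
// actual KafkaStreams implementation; names and structure are assumptions.
class AddThreadSketch {
    private final long totalCacheBytes;
    private final List<String> threads = new ArrayList<>();
    // Dedicated lock that only serializes callers of addStreamThread().
    private final Object changeThreadCount = new Object();
    private long cacheSizePerThread;

    AddThreadSketch(final long totalCacheBytes) {
        this.totalCacheBytes = totalCacheBytes;
        this.cacheSizePerThread = totalCacheBytes;
    }

    String addStreamThread() {
        synchronized (changeThreadCount) {
            // Reading threads.size() and adding to threads happen under the
            // same lock, so no caller can size the caches from a stale count.
            final int newCount = threads.size() + 1;
            cacheSizePerThread = totalCacheBytes / newCount;
            final String name = "StreamThread-" + newCount;
            threads.add(name);
            return name;
        }
    }

    int threadCount() {
        synchronized (changeThreadCount) {
            return threads.size();
        }
    }

    long cacheSizePerThread() {
        synchronized (changeThreadCount) {
            return cacheSizePerThread;
        }
    }

    // Calls addStreamThread() from several callers at once and waits for them.
    static AddThreadSketch addConcurrently(final int callers, final long totalCacheBytes) {
        final AddThreadSketch client = new AddThreadSketch(totalCacheBytes);
        final List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < callers; i++) {
            final Thread worker = new Thread(client::addStreamThread);
            workers.add(worker);
            worker.start();
        }
        for (final Thread worker : workers) {
            try {
                worker.join();
            } catch (final InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return client;
    }
}
```

With 8 concurrent callers and a total of 800 bytes, the sketch always ends with
8 registered threads and 100 bytes per thread. Without the lock, two callers
could both read the same `threads.size()` and compute the same (too large) size.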
A minor issue still seems to be the synchronization between
`isRunningOrRebalancing()` and `streamThread.start()`. If between these
two calls the Streams client transitions to `ERROR` (the global stream thread
died), an `IllegalStateException` would be thrown from the `StreamStateListener`
because the Streams client would try to transition from `ERROR` to `REBALANCING`.
But I guess that would also happen if the Streams client transitions to `ERROR`
before the new stream thread transitions to `PARTITION_ASSIGNED` and calls the
`StreamStateListener`, which would transition the Streams client to `REBALANCING`.
So it needs to be fixed somewhere else.
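The failing transition can be sketched with a reduced state machine. This is a
hypothetical simplification, not the real `KafkaStreams.State`: it has only
three states, and the class `ClientStateSketch` and its transition table are
assumptions. The only property it relies on from the discussion above is that
`ERROR` to `REBALANCING` is not a valid transition.

```java
import java.util.EnumSet;
import java.util.Set;

// Simplified, hypothetical client state machine. The real KafkaStreams.State
// has more states and transitions; only the ERROR case matters here.
class ClientStateSketch {
    enum State { REBALANCING, RUNNING, ERROR }

    private State state = State.RUNNING;

    // Assumed transition table: ERROR has no way back to REBALANCING.
    private static Set<State> validNext(final State from) {
        switch (from) {
            case RUNNING:
                return EnumSet.of(State.REBALANCING, State.ERROR);
            case REBALANCING:
                return EnumSet.of(State.RUNNING, State.ERROR);
            default:
                return EnumSet.noneOf(State.class);
        }
    }

    synchronized void transitionTo(final State next) {
        if (!validNext(state).contains(next)) {
            throw new IllegalStateException("Invalid transition from " + state + " to " + next);
        }
        state = next;
    }

    synchronized State state() {
        return state;
    }
}
```

In this sketch, a listener that calls `transitionTo(State.REBALANCING)` after
the client has already moved to `ERROR` gets the `IllegalStateException`, no
matter whether the move to `ERROR` happened before or after the new thread was
started.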
Did I miss something?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]