[GitHub] [kafka] cadonna commented on a change in pull request #10646: KAFKA-8897 Follow-up: Consolidate the global state stores
cadonna commented on a change in pull request #10646: URL: https://github.com/apache/kafka/pull/10646#discussion_r635024188 ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -128,8 +129,7 @@ public void setGlobalProcessorContext(final InternalProcessorContext globalProce } final Set changelogTopics = new HashSet<>(); -for (final StateStore stateStore : globalStateStores) { -globalStoreNames.add(stateStore.name()); +for (final StateStore stateStore : topology.globalStateStores()) { Review comment: Now, I see what you mean. However, I am not sure it is a good idea to rely on the code in `GlobalStreamThread` that catches the fatal exception to clean up state stores (and all the rest). If we know, we throw a fatal exception, then we should clean up immediately before we throw. That makes the `GlobalStateManagerImpl` less error-prone, because it does not need to rely on a different class for its clean up , IMO. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on a change in pull request #10646: KAFKA-8897 Follow-up: Consolidate the global state stores
[GitHub] [kafka] cadonna commented on a change in pull request #10646: KAFKA-8897 Follow-up: Consolidate the global state stores
cadonna commented on a change in pull request #10646: URL: https://github.com/apache/kafka/pull/10646#discussion_r633518578 ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -128,8 +129,7 @@ public void setGlobalProcessorContext(final InternalProcessorContext globalProce } final Set changelogTopics = new HashSet<>(); -for (final StateStore stateStore : globalStateStores) { -globalStoreNames.add(stateStore.name()); +for (final StateStore stateStore : topology.globalStateStores()) { final String sourceTopic = storeToChangelogTopic.get(stateStore.name()); changelogTopics.add(sourceTopic); stateStore.init((StateStoreContext) globalProcessorContext, stateStore); Review comment: > Hmm, for production, do we ever restart a thread even for illegal-state or illegal-argument? If the user decides to restart a stream thread in its exception handler it is possible. ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/ProcessorStateManager.java ## @@ -328,7 +328,8 @@ public void registerStore(final StateStore store, final StateRestoreCallback sta converterForStore(store)) : new StateStoreMetadata(store); - +// register the store first, so that if later an exception is thrown then eventually while we call `close` +// on the state manager this state store would be closed as well stores.put(storeName, storeMetadata); Review comment: See my comment above. ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -128,8 +129,7 @@ public void setGlobalProcessorContext(final InternalProcessorContext globalProce } final Set changelogTopics = new HashSet<>(); -for (final StateStore stateStore : globalStateStores) { -globalStoreNames.add(stateStore.name()); +for (final StateStore stateStore : topology.globalStateStores()) { Review comment: If I put ``` assertThat(store1.isOpen(), is(false)); assertThat(store2.isOpen(), is(false)); assertThat(store3.isOpen(), is(false)); assertThat(store4.isOpen(), is(false)); ``` on line 202 in `shouldThrowStreamsExceptionForOldTopicPartitions()` the test fails. Hence, we leak a state store. ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -174,13 +176,15 @@ public void registerStore(final StateStore store, final StateRestoreCallback sta throw new IllegalArgumentException(String.format("Trying to register store %s that is not a known global store", store.name())); } +// register the store first, so that if later an exception is thrown then eventually while we call `close` Review comment: I agree that we would not be able to book-keep both, but the state store in `store` that we just opened is still open in line 172. So we need to close the state store in `store` before throwing the exception otherwise we will leak it. The same applies to line 176. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on a change in pull request #10646: KAFKA-8897 Follow-up: Consolidate the global state stores
cadonna commented on a change in pull request #10646: URL: https://github.com/apache/kafka/pull/10646#discussion_r633518578 ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -128,8 +129,7 @@ public void setGlobalProcessorContext(final InternalProcessorContext globalProce } final Set changelogTopics = new HashSet<>(); -for (final StateStore stateStore : globalStateStores) { -globalStoreNames.add(stateStore.name()); +for (final StateStore stateStore : topology.globalStateStores()) { final String sourceTopic = storeToChangelogTopic.get(stateStore.name()); changelogTopics.add(sourceTopic); stateStore.init((StateStoreContext) globalProcessorContext, stateStore); Review comment: > Hmm, for production, do we ever restart a thread even for illegal-state or illegal-argument? If the user decides to restart a stream thread in its exception handler it is possible. ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/ProcessorStateManager.java ## @@ -328,7 +328,8 @@ public void registerStore(final StateStore store, final StateRestoreCallback sta converterForStore(store)) : new StateStoreMetadata(store); - +// register the store first, so that if later an exception is thrown then eventually while we call `close` +// on the state manager this state store would be closed as well stores.put(storeName, storeMetadata); Review comment: See my comment above. ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -128,8 +129,7 @@ public void setGlobalProcessorContext(final InternalProcessorContext globalProce } final Set changelogTopics = new HashSet<>(); -for (final StateStore stateStore : globalStateStores) { -globalStoreNames.add(stateStore.name()); +for (final StateStore stateStore : topology.globalStateStores()) { Review comment: If I put ``` assertThat(store1.isOpen(), is(false)); assertThat(store2.isOpen(), is(false)); assertThat(store3.isOpen(), is(false)); assertThat(store4.isOpen(), is(false)); ``` on line 202 in `shouldThrowStreamsExceptionForOldTopicPartitions()` the test fails. Hence, we leak a state store. ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -174,13 +176,15 @@ public void registerStore(final StateStore store, final StateRestoreCallback sta throw new IllegalArgumentException(String.format("Trying to register store %s that is not a known global store", store.name())); } +// register the store first, so that if later an exception is thrown then eventually while we call `close` Review comment: I agree that we would not be able to book-keep both, but the state store in `store` that we just opened is still open in line 172. So we need to close the state store in `store` before throwing the exception otherwise we will leak it. The same applies to line 176. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on a change in pull request #10646: KAFKA-8897 Follow-up: Consolidate the global state stores
cadonna commented on a change in pull request #10646: URL: https://github.com/apache/kafka/pull/10646#discussion_r632371505 ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -128,8 +129,7 @@ public void setGlobalProcessorContext(final InternalProcessorContext globalProce } final Set changelogTopics = new HashSet<>(); -for (final StateStore stateStore : globalStateStores) { -globalStoreNames.add(stateStore.name()); +for (final StateStore stateStore : topology.globalStateStores()) { final String sourceTopic = storeToChangelogTopic.get(stateStore.name()); changelogTopics.add(sourceTopic); stateStore.init((StateStoreContext) globalProcessorContext, stateStore); Review comment: On a second thought, it might also be relevant for production code since we now can restart the stream thread after a fatal error. This is not yet possible for a global stream thread, but it might be possible in future. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on a change in pull request #10646: KAFKA-8897 Follow-up: Consolidate the global state stores
cadonna commented on a change in pull request #10646: URL: https://github.com/apache/kafka/pull/10646#discussion_r629968874 ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -128,8 +129,7 @@ public void setGlobalProcessorContext(final InternalProcessorContext globalProce } final Set changelogTopics = new HashSet<>(); -for (final StateStore stateStore : globalStateStores) { -globalStoreNames.add(stateStore.name()); +for (final StateStore stateStore : topology.globalStateStores()) { Review comment: Not really related to this line. Could you verify that the state store is closed in the unit test that tests line 148? The name of the test is `shouldThrowStreamsExceptionForOldTopicPartitions()`. ## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStateManagerImpl.java ## @@ -128,8 +129,7 @@ public void setGlobalProcessorContext(final InternalProcessorContext globalProce } final Set changelogTopics = new HashSet<>(); -for (final StateStore stateStore : globalStateStores) { -globalStoreNames.add(stateStore.name()); +for (final StateStore stateStore : topology.globalStateStores()) { final String sourceTopic = storeToChangelogTopic.get(stateStore.name()); changelogTopics.add(sourceTopic); stateStore.init((StateStoreContext) globalProcessorContext, stateStore); Review comment: There are a a `IllegalStateException` and a couple of `IllegalArgumentException`s on the path from opening the state store within `stateStore.init()` to line 182 in `this.registerStore()`. We do not close the state stores before we throw. I do not think this is relevant for production code, but we could leak state stores in unit tests if we do not explicitly close the state stores in the unit tests. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org