[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-858453218 Rebased onto the latest trunk. cc/ @vvcephei -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-847886290 Hi @guozhangwang @vvcephei, Could you have a look now? :smiley: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-834361327 Hi @vvcephei, Could you review this PR now? :bow: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-822991690 > I know @vvcephei was quite swamped in the past months with pretty heavy release management duties. Totally agree. It's also why I did not press him. :smile: I rebased the PR into the latest trunk and checked it passes all tests! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-822452651 Hi @guozhangwang, Thanks for your mention. I completed applying @vvcephei's comment (on March 25, 2021) and awaiting review and merging, with maintaining the PR to the latest trunk. As you can see here, this issue was started much earlier but It seems like I misunderstood @vvcephei's intention. And I recognized it this March. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-806739608 Add to this, I also updated `EOSUncleanShutdownIntegrationTest` also. I inspected the history of this test suite precisely and found the following: 1. This suite was introduced in c2ec974 by @abbccdda. It asserts that the Task's statestore directory (not root one in KafkaStreams's `state.dir` property) is deleted after an unclean shutdown. However, IMHO this assertion is a little bit inaccurate since the Task's statestore directory may not be deleted, for example, with an empty checkpoint file. 2. A comment is added on L160 in d3c067f by @guozhangwang - 'the state directory should still exist with the empty checkpoint file'. However, IMHO this comment seems to be a mistake, since `assertFalse(stateDir.exists());` on L161 asserts that the Task's statestore directory does not exist. There are some other commits on this file later, but all of them are just minor improvements, not changing the testing logic. With this PR, Kafka Streams now deletes the empty application statestore directory (i.e., `{state.dir}/{application.id}`). So, I improved this suite like this: 1. After initialization, wait until the Task's statestore directory is populated. (i.e., `{state.dir}/{application.id}/0_0/*`) 2. Assert that one of the following is satisfied: - The Task's statestore directory is empty, so deleted. - The Task's statestore directory is not empty but without a checkpoint file. - The Task's statestore directory is not empty but with an empty checkpoint file. (Wait, then should we separate the modifications on `EOSUncleanShutdownIntegrationTest` into an independent PR?) Please have a look when you are free. Thanks again for reviewing my PR! :smiley: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-806736975 Hi @vvcephei, It seems like I have failed to understand your intention correctly. Here is the fix. I reorganized the PR into two commits, one for `StateDirectory` and the other one for `EOSUncleanShutdownIntegrationTest`. The following summarizes the updates: 1. Rebase the PR onto the latest trunk, resolving all broken syntaxes. 2. Remove redundant assertion in `StateDirectoryTest#shouldNotDeleteAppDirWhenCleanUpIfNotEmpty`; it now validates logging message only. 3. Make `StateDirectoryIntegrationTest` to use default timeout (`IntegrationTestUtils.DEFAULT_TIMEOUT`) with its latches, keeping the suite from being flaky. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-762079038 Rebased onto the latest trunk. cc/ @vvcephei @mjsax This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-739132524 Rebased onto the latest trunk. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-733525231 Retest this please. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-731208645 Hi @vvcephei, I re-based the branch onto the latest trunk, reorganized the changes into two commits, and fixed `EOSUncleanShutdownIntegrationTest#shouldWorkWithUncleanShutdownWipeOutStateStore` - Sure, its contents do not follow the description of its name. :wink: This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-724725724 Hi @vvcephei, Here is the fix. The reason for the broken integration tests was: `KafkaStreams#cleanUp` can be called regardless of the `StreamThread`s are terminated. That is, if `KafkaStreams#cleanUp` is called when the `KafkaStreams` is still running, the `{state-store-directory}/{application-id}` directory is deleted and `StreamThread` crashes with the exception from `StateDirectory#directoryForTask` - since it fails to create the task state store directory, i.e., `{state-store-directory}/{application-id}/{task-id}`. Moreover, `StateDirectory#directoryForTask` method has two additional vulnerabilities: 1. When it creates the task state store directory, it does not create its parent directories automatically - if there is not `{state-store-directory}/{application-id}`, `taskDir.mkdir()` returns false and it throws an exception. For this behavior breaks the integration tests, I modified `taskDir.mkdir()` to `taskDir.mkdirs()` to create `{state-store-directory}/{application-id}` automatically. 2. This method only checks whether there is a `File` at `{state-store-directory}/{application-id}/{task-id}`, regardless of it is actually a directory or not. I added an additional check condition for this case and `StateDirectoryTest#shouldThrowProcessorStateException` is updated accordingly. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-721135460 Hi @vvcephei, Sorry for being late. Defending myself, I have been too busy to finalize my project last week, and it ended yesterday. Sure, I am now resolving the broken tests. There are 21 broken tests with the latest trunk, and I just resolved 16 of them. It will be completed soon. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-714293384 @vvcephei Here it is; I deleted a outdated condition statement in `EOSUncleanShutdownIntegrationTest`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-713251214 @vvcephei Here it is; I updated the timeout to `IntegrationTestUtils.DEFAULT_TIMEOUT` and rebased onto the latest trunk also. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] dongjinleekr commented on pull request #9414: KAFKA-10585: Kafka Streams should clean up the state store directory from cleanup
dongjinleekr commented on pull request #9414: URL: https://github.com/apache/kafka/pull/9414#issuecomment-707498871 Hi @vvcephei, 1. Sure. I added some validations in `StateDirectoryTest`. - If `StateDirectory#clean` is called, the empty `appDir` is also deleted. (see `StateDirectoryTest#shouldLogManualUserCallMessage`.) - If `StateDirectory#clean` is not called, the global state directory and it parent, `appDir` is not deleted. (see `StateDirectoryTest#shouldLogStateDirCleanerMessage`.) Please note the difference in `StateDirectoryTest#shouldCleanupAllTaskDirectoriesIncludingGlobalOne`; the `appDir` was an empty directory before but it is now deleted. 2. Since the goal of this modification is deleting the empty directory, we don't need a recursive delete; it is also why I called `File#delete` here, since it works and returns `true` iff the target directory is empty. And one more thing: I added an exception handling for `SecurityException`. :smile: This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org