[GitHub] [kafka] cadonna commented on pull request #10953: MINOR: Default GRACE with Old API should set as 24H minus window-size / inactivity-gap
cadonna commented on pull request #10953: URL: https://github.com/apache/kafka/pull/10953#issuecomment-880699127 Opened the following PR for the session window bug: https://issues.apache.org/jira/browse/KAFKA-13094 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on pull request #10953: MINOR: Default GRACE with Old API should set as 24H minus window-size / inactivity-gap
cadonna commented on pull request #10953: URL: https://github.com/apache/kafka/pull/10953#issuecomment-879704540 I do not agree. If users deploy Streams apps in 2.8 we guarantee retention time T, but when users upgrade the Streams apps to 3.0 we guarantee retention time T+X. We do not change the retention time of the changelog topic during the upgrade. Hence, we miss to guarantee T+X. I think that is an issue, because we might get different results if we run the same upgraded Streams app multiple times on the same data without reset. Note, that it is not only the retention time. In 2.8, we guarantee a different default grace period than in 3.0. I am aware that the default grace period is deprecated in 3.0, but the behavior should still match the old. Anyways, we need to come to a decision about this PR. The 3.0 release is around the corner. \cc @guozhangwang @vvcephei @ableegoldman -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on pull request #10953: MINOR: Default GRACE with Old API should set as 24H minus window-size / inactivity-gap
cadonna commented on pull request #10953: URL: https://github.com/apache/kafka/pull/10953#issuecomment-878999564 I think we had a bug in the `SessionWindows` because we have never considered a user set grace period when we computed the retention time. That should be fixed with the new API, but it might also cause incompatibilities regarding the retention time of the changelog topic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on pull request #10953: MINOR: Default GRACE with Old API should set as 24H minus window-size / inactivity-gap
cadonna commented on pull request #10953: URL: https://github.com/apache/kafka/pull/10953#issuecomment-878965687 > However, looking into the code I am a little bit confused about maintainMs() in TimeWindows: > ``` public long maintainMs() { return Math.max(maintainDurationMs, sizeMs + gracePeriodMs()); } ``` This code is indeed a bit complicated. I did some investigation and came to the following results: For 2.8 the retention times with default grace period are: - `TimeWindows`: - **24h** if window size is less than 24h, - **window size** if window size is greater than or equal to 24h. - `SessionWindows`: - **24h** if the inactivity gap is less than 24h, - **inactivity gap** if the inactivity gap is greater than or equal to 24h. - `JoinWindows`: - **24h** if window size is less than 24h, - **window size** if window size is greater than or equal to 24h. For 3.0 the retention times with default grace period are: - `TimeWindows`: - **window size + grace windows** - `SessionWindows`: - **inactivity gap + grace windows** - `JoinWindows`: - **window size + grace windows** -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on pull request #10953: MINOR: Default GRACE with Old API should set as 24H minus window-size / inactivity-gap
cadonna commented on pull request #10953: URL: https://github.com/apache/kafka/pull/10953#issuecomment-878835507 > I don't see how there could be data loss? There could be data loss, because locally the windowed state store would store records for a longer period of time than in the changelog topic. If a Kafka Streams client is restarted with wiped out state it might restore less records into the state store than the state store had before the restart. In other words, if the Kafka Streams state store were not restarted, it would have more records than after the restart. This might happen because records that are outside the shorter retention time of the changelog topic might have already been removed from the changelog topic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on pull request #10953: MINOR: Default GRACE with Old API should set as 24H minus window-size / inactivity-gap
cadonna commented on pull request #10953: URL: https://github.com/apache/kafka/pull/10953#issuecomment-878149711 > Are these changes only necessary for SessionWindows and TimeWindows classes? What about the JoinWindows and SlidingWindows classes? As far as I can see `SlidingWindows` do not have a default grace period. I think we need also to fix `JoinWindows`, but I would like confirmation from @mjsax. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on pull request #10953: MINOR: Default GRACE with Old API should set as 24H minus window-size / inactivity-gap
cadonna commented on pull request #10953: URL: https://github.com/apache/kafka/pull/10953#issuecomment-873033160 @showuon It is not different. With this PR it is again equal. PR #10378 made it different, but PR #10378 has been never released. It was merged to trunk after 2.8 was released. If we merge this PR before 3.0 is released everything is fine and the grace period computation never changed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [kafka] cadonna commented on pull request #10953: MINOR: Default GRACE with Old API should set as 24H minus window-size / inactivity-gap
cadonna commented on pull request #10953: URL: https://github.com/apache/kafka/pull/10953#issuecomment-872811076 Some details for the other reviewers. In 2.8 and before, we computed the default grace period with `Math.max(maintainDurationMs - sizeMs, 0);` in method `gracePeriodMs()` in `TimeWindows`, `SessionWindows`, and `JoinWindows`. That means that the default grace period has never been 24 hours but 24 hours - window size. Since `gracePeriodMs()` is used to compute the retention time of the changelog topic for the corresponding state store it is important to keep the same computation for the deprecated methods. Otherwise, Streams app that run with 2.8 and before might not be compatible with Streams 3.0 because the retention time of the changelog topics created with older Streams apps will be smaller than the assumed retention time for Streams apps in 3.0. For example, with a window size of 10 hours, an old Streams app would have created a changelog topic with retention time 10 hours (window size) + 14 hours (default grace period, 24 hours - 10 hours). A 3.0 Streams app would assume a retention time of 10 hours (window size) + 24 hours (deprecated default grace period as currently specified on trunk). In the presence of failures, where a state store needs to recreated, records might get lost, because before the failure the state store of a 3.0 Streams app contained 10 hours + 24 hours of records whereas the changelog topic that was created with the old Streams app would only contain 10 hours + 14 hours of records. All this happened due to us always stating that the default grace period was 24 hours although it was not completely correct and a connected and unfortunate misunderstanding when we removed deprecated windows APIs (#10378). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org