[ https://issues.apache.org/jira/browse/KAFKA-13499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sagar Rao reassigned KAFKA-13499:
---------------------------------

    Assignee:     (was: Sagar Rao)

> Avoid restoring outdated records
> --------------------------------
>
>                 Key: KAFKA-13499
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13499
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Priority: Major
>
> Kafka Streams has the config `windowstore.changelog.additional.retention.ms`
> to allow for an increased retention time.
> While an increased retention time can be useful, it can also lead to
> unnecessary restore cost, especially for stream-stream joins. Assume a
> stream-stream join with a 1h window size and a grace period of 1h. For this
> case, we only need 2h of data to restore. If we lag, the
> `windowstore.changelog.additional.retention.ms` config helps to prevent the
> broker from truncating data too early. However, if we don't lag and we need
> to restore, we still restore everything from the changelog, including
> records older than the 2h we actually need.
> Instead of doing a seek-to-beginning, we could use the timestamp index to
> seek to the first offset older than the 2h "window" of data that we need to
> restore, to avoid unnecessary work.
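
A minimal sketch of the idea in the description, not the Kafka Streams restore code. It computes the oldest timestamp still needed (window size plus grace period, 2h in the example) and looks up the earliest offset at or after that boundary. The class `RestoreStartSketch`, its methods, and the in-memory `long[]` changelog are hypothetical stand-ins for illustration. Against a real broker, `KafkaConsumer.offsetsForTimes` performs a lookup with the same "earliest offset whose timestamp is greater than or equal to the target" semantics, and the result could be passed to `seek` in place of a seek-to-beginning. Whether the restore should begin exactly at that offset or slightly before it is left open here.

```java
import java.time.Duration;
import java.util.OptionalLong;

public class RestoreStartSketch {

    // Oldest timestamp that still matters for the join: now - (window size + grace period).
    static long restoreStartTimestamp(long nowMs, Duration windowSize, Duration gracePeriod) {
        return nowMs - windowSize.plus(gracePeriod).toMillis();
    }

    // Hypothetical stand-in for a timestamp-index lookup: the array index is the
    // offset, the value is the record timestamp. Returns the earliest offset whose
    // timestamp is >= targetMs, or empty if no such record exists.
    static OptionalLong firstOffsetAtOrAfter(long[] timestampsByOffset, long targetMs) {
        for (int offset = 0; offset < timestampsByOffset.length; offset++) {
            if (timestampsByOffset[offset] >= targetMs) {
                return OptionalLong.of(offset);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        long hour = Duration.ofHours(1).toMillis();
        long now = 10 * hour;

        // Changelog records written at 1h, 5h, 7h, 8h, and 9h (offsets 0..4).
        long[] changelog = {1 * hour, 5 * hour, 7 * hour, 8 * hour, 9 * hour};

        // 1h window + 1h grace: only the last 2h of data is needed.
        long target = restoreStartTimestamp(now, Duration.ofHours(1), Duration.ofHours(1));
        OptionalLong start = firstOffsetAtOrAfter(changelog, target);

        // If nothing falls inside the 2h span, there is nothing to restore in this sketch.
        System.out.println("restore from offset: " + (start.isPresent() ? start.getAsLong() : "none"));
    }
}
```

With the sample data, offsets 0 to 2 fall outside the 2h span and would be skipped, where a seek-to-beginning would replay all five records.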