[jira] [Commented] (KAFKA-4325) Improve processing of late records for window operations
[ https://issues.apache.org/jira/browse/KAFKA-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036512#comment-16036512 ]

Guozhang Wang commented on KAFKA-4325:
--------------------------------------

I think the timestamp computation needs some further thought: https://issues.apache.org/jira/browse/KAFKA-3514. So I'd suggest we not rush to fix this one piece at a time; I will upload a summary / proposal for the timestamp computation under KAFKA-3514 for people to discuss and then propose a KIP.

> Improve processing of late records for window operations
> --------------------------------------------------------
>
>                 Key: KAFKA-4325
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4325
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Priority: Minor
>
> Windows are kept until their retention time has passed. If a late-arriving record
> is processed that is older than any window kept, a new window is created
> containing this single late-arriving record, the aggregation is computed, and
> the window is immediately discarded afterward (as it is older than the retention
> time).
> This behavior might cause problems for downstream applications, as the original
> window aggregate might be overwritten with the late single-record aggregate
> value. Thus, we should rather not process the late-arriving record in this
> case.
> However, data loss might not be acceptable for all use cases. In order to
> enable the user to not lose any data, window operators should allow
> registering a handler function that is called instead of just dropping the late-
> arriving record.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
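For readers following the thread, the behavior proposed in the issue description (skip a record whose window has already fallen out of retention, and pass it to a user-registered handler instead of silently dropping it) could be sketched roughly as below. This is only an illustration in plain Java and does not use the Kafka Streams API; the class `LateRecordSketch`, its methods, the tumbling-window counting, and the exact expiry rule (a window is expired once its start is more than the retention time behind the highest timestamp seen) are all assumptions made for this sketch, not the design discussed in KAFKA-4325 or KAFKA-3514.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Hypothetical sketch: tumbling-window counts with a retention period and a
// handler for records that arrive after their window has expired.
final class LateRecordSketch {
    private final long windowSizeMs;
    private final long retentionMs;
    private final BiConsumer<Long, String> lateHandler; // (timestamp, value)
    private final TreeMap<Long, Long> counts = new TreeMap<>(); // windowStart -> count
    private long streamTime = Long.MIN_VALUE; // highest timestamp seen so far

    LateRecordSketch(long windowSizeMs, long retentionMs,
                     BiConsumer<Long, String> lateHandler) {
        this.windowSizeMs = windowSizeMs;
        this.retentionMs = retentionMs;
        this.lateHandler = lateHandler;
    }

    void process(long timestampMs, String value) {
        streamTime = Math.max(streamTime, timestampMs);
        // Assumed expiry rule: windows starting before this point are gone.
        long oldestKept = streamTime - retentionMs;
        counts.headMap(oldestKept).clear();

        long windowStart = timestampMs - Math.floorMod(timestampMs, windowSizeMs);
        if (windowStart < oldestKept) {
            // Do not recreate the expired window; hand the record to the user.
            lateHandler.accept(timestampMs, value);
            return;
        }
        counts.merge(windowStart, 1L, Long::sum);
    }

    // Returns the current count for a window, or null if no such window is kept.
    Long count(long windowStart) {
        return counts.get(windowStart);
    }

    public static void main(String[] args) {
        List<String> late = new ArrayList<>();
        LateRecordSketch sketch =
            new LateRecordSketch(10L, 30L, (ts, v) -> late.add(v));
        sketch.process(5L, "a");     // creates window [0, 10)
        sketch.process(50L, "b");    // advances time; window [0, 10) expires
        sketch.process(7L, "late");  // too old: goes to the handler instead
        System.out.println("late records: " + late);
        System.out.println("window 0 count: " + sketch.count(0L));
        System.out.println("window 50 count: " + sketch.count(50L));
    }
}
```

In this sketch the late record never produces a fresh single-record aggregate, so a downstream consumer would not see the earlier result for that window overwritten; what the handler does with the record (log it, forward it to another topic, and so on) is left to the user.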
[jira] [Commented] (KAFKA-4325) Improve processing of late records for window operations
[ https://issues.apache.org/jira/browse/KAFKA-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036135#comment-16036135 ]

Matthias J. Sax commented on KAFKA-4325:
----------------------------------------

I assume yes. \cc [~guozhang]
[jira] [Commented] (KAFKA-4325) Improve processing of late records for window operations
[ https://issues.apache.org/jira/browse/KAFKA-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035688#comment-16035688 ]

Jeyhun Karimov commented on KAFKA-4325:
---------------------------------------

[~mjsax], would this JIRA require a KIP?