[ 
https://issues.apache.org/jira/browse/KAFKA-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax reassigned KAFKA-10755:
---------------------------------------

    Assignee: Matthias J. Sax

> Should consider commit latency when computing next commit timestamp
> -------------------------------------------------------------------
>
>                 Key: KAFKA-10755
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10755
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.6.0
>            Reporter: Matthias J. Sax
>            Assignee: Matthias J. Sax
>            Priority: Major
>
> In 2.6, we reworked the main processing/commit loop in `StreamThread` and 
> introduced a regression by _not_ updating the current time after committing. 
> As a result, the next commit timestamp is computed too low (i.e., too 
> early).
> For small commit intervals and high commit latency (as with EOS), this bug 
> may lead to an increased commit frequency and fewer processed records between 
> two commits, and thus to reduced throughput.
> For example, assume that the commit interval is 100ms and the commit latency 
> is 50ms, and we start the commit at timestamp 10000. The commit finishes at 
> 10050, and the next commit should happen at 10150. However, if we don't 
> update the current timestamp, we incorrectly compute the next commit time as 
> 10100, i.e., 50ms too early, and we have only 50ms to process data instead of 
> the intended 100ms.
> In the worst case, if the commit latency is larger than the commit interval, 
> it would imply that we commit after processing a single record per task.
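
The arithmetic in the example above can be sketched in Java as follows. This is only an illustration of the timestamp computation, not the actual `StreamThread` code: the class `CommitTimingSketch` and all of its method names are hypothetical, and the clock is simulated by adding the commit latency to the commit start time.

```java
public class CommitTimingSketch {

    // Regressed behavior (as described in the ticket): the timestamp taken
    // before the commit is reused, so commit latency is ignored.
    public static long nextCommitWithStaleTime(long commitStartMs, long commitIntervalMs) {
        return commitStartMs + commitIntervalMs;
    }

    // Intended behavior: the current time is refreshed after the commit.
    // Here "refreshing" is simulated as commitStartMs + commitLatencyMs.
    public static long nextCommitWithFreshTime(long commitStartMs,
                                               long commitLatencyMs,
                                               long commitIntervalMs) {
        long nowMs = commitStartMs + commitLatencyMs;
        return nowMs + commitIntervalMs;
    }

    // Time left for processing records between the end of one commit
    // and the start of the next (never negative).
    public static long processingWindowMs(long commitEndMs, long nextCommitMs) {
        return Math.max(0L, nextCommitMs - commitEndMs);
    }

    public static void main(String[] args) {
        long start = 10_000L;
        long interval = 100L;
        long latency = 50L;
        long commitEnd = start + latency; // 10050

        long stale = nextCommitWithStaleTime(start, interval);
        long fresh = nextCommitWithFreshTime(start, latency, interval);

        // prints "stale: next=10100 window=50"
        System.out.println("stale: next=" + stale
                + " window=" + processingWindowMs(commitEnd, stale));
        // prints "fresh: next=10150 window=100"
        System.out.println("fresh: next=" + fresh
                + " window=" + processingWindowMs(commitEnd, fresh));

        // Worst case: latency (150ms) exceeds the interval (100ms), so the
        // stale deadline has already passed when the commit returns.
        long slowCommitEnd = start + 150L;
        // prints "worst case window=0"
        System.out.println("worst case window="
                + processingWindowMs(slowCommitEnd, nextCommitWithStaleTime(start, interval)));
    }
}
```

With a zero-length window, the next commit is due as soon as the previous one returns, which matches the worst case described in the ticket.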



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
