[
https://issues.apache.org/jira/browse/KAFKA-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias J. Sax updated KAFKA-10755:
------------------------------------
Priority: Critical (was: Major)
> Should consider commit latency when computing next commit timestamp
> -------------------------------------------------------------------
>
> Key: KAFKA-10755
> URL: https://issues.apache.org/jira/browse/KAFKA-10755
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 2.6.0
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Priority: Critical
>
> In 2.6, we reworked the main processing/commit loop in `StreamThread` and
> introduced a regression, by _not_ updating the current time after committing.
> This implies that we compute the next commit timestamp too low (ie, too
> early).
> For small commit intervals and high commit latency (like in EOS), this bug
> may lead to an increased commit frequency and fewer processed records between
> two commits, and thus to reduced throughput.
> For example, assume that the commit interval is 100ms and the commit latency
> is 50ms, and we start the commit at timestamp 10000. The commit finishes at
> 10050, and the next commit should happen at 10150. However, if we don't
> update the current timestamp, we incorrectly compute the next commit time as
> 10100, ie, 50ms too early, and we have only 50ms to process data instead of
> the intended 100ms.
> In the worst case, if the commit latency is larger than the commit interval,
> it would imply that we commit after processing a single record per task.
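
The arithmetic in the description can be sketched in a few lines of Java. This is only an illustration of the timing problem, not the actual `StreamThread` code: the class `CommitTimingSketch` and all of its method names are hypothetical, and the model assumes a commit simply blocks for a fixed latency.

```java
public class CommitTimingSketch {

    // Regressed behavior (hypothetical model): the timestamp taken before the
    // commit is reused, so the commit latency is ignored.
    public static long nextCommitStale(long commitStartMs, long commitIntervalMs) {
        return commitStartMs + commitIntervalMs;
    }

    // Intended behavior (hypothetical model): the current time is refreshed
    // after the commit returns, then the interval is added.
    public static long nextCommitRefreshed(long commitStartMs, long commitLatencyMs,
                                           long commitIntervalMs) {
        long nowMs = commitStartMs + commitLatencyMs; // time when the commit finished
        return nowMs + commitIntervalMs;
    }

    // Time left for processing between the end of one commit and the next one.
    public static long processingWindowMs(long commitStartMs, long commitLatencyMs,
                                          long nextCommitMs) {
        long commitEndMs = commitStartMs + commitLatencyMs;
        return Math.max(0L, nextCommitMs - commitEndMs);
    }

    public static void main(String[] args) {
        long start = 10000L;
        long interval = 100L;
        long latency = 50L;

        long stale = nextCommitStale(start, interval);
        long refreshed = nextCommitRefreshed(start, latency, interval);

        // Numbers from the issue description: 10100 vs 10150, 50ms vs 100ms.
        System.out.println("stale next commit:     " + stale);
        System.out.println("refreshed next commit: " + refreshed);
        System.out.println("stale window (ms):     " + processingWindowMs(start, latency, stale));
        System.out.println("refreshed window (ms): " + processingWindowMs(start, latency, refreshed));

        // Worst case: latency (150ms) exceeds the interval (100ms), so the
        // stale deadline has already passed when the commit returns.
        long slowLatency = 150L;
        long staleSlow = nextCommitStale(start, interval);
        System.out.println("worst-case window (ms): "
            + processingWindowMs(start, slowLatency, staleSlow));
    }
}
```

In the worst-case run the processing window is 0ms, which corresponds to the situation described above where the next commit is already due as soon as the previous one returns.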