[jira] [Resolved] (KAFKA-10091) Improve task idling

John Roesler (Jira) Mon, 12 Jul 2021 14:20:04 -0700


     [ 
https://issues.apache.org/jira/browse/KAFKA-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


John Roesler resolved KAFKA-10091.
----------------------------------
    Resolution: Fixed

> Improve task idling
> -------------------
>
>                 Key: KAFKA-10091
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10091
>             Project: Kafka
>          Issue Type: Task
>          Components: streams
>            Reporter: John Roesler
>            Assignee: John Roesler
>            Priority: Blocker
>              Labels: needs-kip
>             Fix For: 3.0.0
>
>
> When Streams is processing a task with multiple inputs, each time it is ready
> to process a record, it has to choose which input to process next. It always
> takes from the input whose next record has the smallest timestamp. As a
> result, Streams processes data in timestamp order. However, if the buffer for
> one of the inputs is empty, Streams doesn't know what timestamp the next
> record for that input will have, so it cannot make that comparison.
> Streams introduced a configuration "max.task.idle.ms" in KIP-353 to address 
> this issue.
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-353%3A+Improve+Kafka+Streams+Timestamp+Synchronization]
> The config allows Streams to wait some amount of time for data to arrive on 
> the empty input, so that it can make a timestamp-ordered decision about which 
> input to pull from next.
> However, this config can be hard to use reliably and efficiently. What we're
> really waiting for is the next poll that _would_ return data from the empty
> input's partition, and when that poll happens depends on the poll interval,
> the max poll interval, and the internal logic that governs when Streams
> will poll again.
> Ideally, _any_ amount of idling would at least guarantee that Streams polls
> data from the empty partition if there's data to fetch.
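
The selection rule and the idling behavior described above can be sketched as
a toy model. This is not Kafka Streams code: the class name IdleAwareChooser,
its methods, and the use of caller-supplied wall-clock milliseconds are all
assumptions made for illustration. The maxTaskIdleMs parameter only loosely
mirrors "max.task.idle.ms". The sketch picks the input whose head record has
the smallest timestamp, and refuses to pick while some input is empty and the
idle budget has not yet elapsed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.OptionalInt;

/** Toy model of timestamp-ordered input selection with bounded idling (hypothetical, not Streams internals). */
final class IdleAwareChooser {
    // One buffer of record timestamps per input, oldest record first.
    private final List<Deque<Long>> buffers = new ArrayList<>();
    private final long maxTaskIdleMs;
    private long idleStartMs = -1L; // -1 means "not currently idling"

    IdleAwareChooser(int inputs, long maxTaskIdleMs) {
        for (int i = 0; i < inputs; i++) {
            buffers.add(new ArrayDeque<>());
        }
        this.maxTaskIdleMs = maxTaskIdleMs;
    }

    /** Simulates a poll delivering a record with the given timestamp to one input. */
    void add(int input, long timestamp) {
        buffers.get(input).addLast(timestamp);
    }

    /**
     * Returns the index of the input to process next and removes its head record,
     * or empty if the task should idle (or has nothing buffered at all).
     */
    OptionalInt chooseNext(long nowMs) {
        boolean anyEmpty = false;
        int best = -1;
        for (int i = 0; i < buffers.size(); i++) {
            Long head = buffers.get(i).peekFirst();
            if (head == null) {
                anyEmpty = true;
            } else if (best < 0 || head < buffers.get(best).peekFirst()) {
                best = i;
            }
        }
        if (best < 0) {
            return OptionalInt.empty(); // nothing buffered on any input
        }
        if (anyEmpty) {
            // An empty input might still receive an earlier record, so wait up to the idle budget.
            if (idleStartMs < 0) {
                idleStartMs = nowMs;
            }
            if (nowMs - idleStartMs < maxTaskIdleMs) {
                return OptionalInt.empty();
            }
        } else {
            idleStartMs = -1L; // every input has data, so the choice is fully informed
        }
        buffers.get(best).pollFirst();
        return OptionalInt.of(best);
    }
}
```

In this model, idling only helps if a poll that could fill the empty buffer
actually happens before the budget runs out, which is the weakness the issue
describes: the wait is measured in time, not in completed fetches from the
empty partition.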



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
