[ 
https://issues.apache.org/jira/browse/KAFKA-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-4237.
--------------------------------
    Resolution: Duplicate

Duplicate of KAFKA-7050.

> Avoid long request timeout for the consumer
> -------------------------------------------
>
>                 Key: KAFKA-4237
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4237
>             Project: Kafka
>          Issue Type: Improvement
>          Components: consumer
>            Reporter: Jason Gustafson
>            Priority: Major
>
> In the consumer rebalance protocol, the JoinGroup can stay in purgatory on 
> the server for as long as the rebalance timeout. For the Java client, that 
> means that the request timeout must be at least as large as the rebalance 
> timeout (which is governed by {{max.poll.interval.ms}} since KIP-62 and 
> {{session.timeout.ms}} before then). By default, since 0.10.1, this is 5 
> minutes plus some change, which makes the clients slow to detect some hard 
> failures.
> To fix this, two options come to mind:
> 1. Right now, all request APIs are limited by the same request timeout in 
> {{NetworkClient}}, but there's not really any reason why this must be so. We 
> could use a separate timeout for the JoinGroup request (the implementations 
> of this is straightforward: 
> https://github.com/confluentinc/kafka/pull/108/files).
> 2. Alternatively, we could prevent the server from holding the JoinGroup in 
> purgatory for such a long time. Instead, it could return early from the 
> JoinGroup (say before the session timeout has expired) with an error code 
> (e.g. REBALANCE_IN_PROGRESS), which tells the client that it should just 
> resend the JoinGroup.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to