Jiangjie,

Great start. I have a couple of comments.

Under the motivation section, is it really true that the request will never
be completed? Presumably if the broker goes down the connection will be
severed, at worst by a TCP timeout, which should clean up the connection
and any outstanding requests, right? I think the real reason we need a
different timeout is that the default TCP timeouts are ridiculously long in
this context.

My second question is whether this is the right level at which to tackle the
issue, and what user-facing changes need to be made. A related problem came up
in https://issues.apache.org/jira/browse/KAFKA-1788 where producer records
get stuck indefinitely because there's no client-side timeout. This KIP
wouldn't fix that problem or any problems caused by lack of connectivity
since this would only apply to in-flight requests, which by definition must
have been sent on an active connection.
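
To make that distinction concrete, here is a rough sketch of what a
request-level timeout can and cannot see. It is written in Python for
brevity, and the class and method names are hypothetical; this is not the
actual NetworkClient code. The 15s value is just the example figure
discussed in this thread.

```python
from collections import deque


class InFlightRequests:
    """Hypothetical tracker for requests already written to a connection."""

    def __init__(self, request_timeout_s):
        self.request_timeout_s = request_timeout_s
        # (request_id, send_time) pairs, oldest first.
        self._sent = deque()

    def record_send(self, request_id, now):
        # Only requests that were actually sent are ever tracked here.
        self._sent.append((request_id, now))

    def expire(self, now):
        """Return ids of sent requests that have waited at least the timeout."""
        expired = []
        while self._sent and now - self._sent[0][1] >= self.request_timeout_s:
            expired.append(self._sent.popleft()[0])
        return expired


tracker = InFlightRequests(request_timeout_s=15.0)
tracker.record_send("r1", now=0.0)
tracker.record_send("r2", now=10.0)

# A record still waiting for a connection is never passed to record_send,
# so expire() can never report it, no matter how long it has waited.
print(tracker.expire(now=16.0))
```

A record that never makes it onto a connection never enters this
structure, which is why the KAFKA-1788 case would need its own timeout.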

I suspect both types of problems probably need to be addressed separately
by introducing explicit timeouts. However, because the settings introduced
here are very much about the internal implementations of the clients, I'm
wondering if this even needs to be a user-facing setting, especially if we
have to add other timeouts anyway. For example, would a fixed, generous
value that's still much shorter than a TCP timeout, say 15s, be good
enough? If other timeouts would allow, for example, the clients to properly
exit even if requests have not hit their timeout, then what's the benefit
of being able to configure the request-level timeout?

I know we have a similar setting, max.in.flight.requests.per.connection,
exposed publicly (which I just discovered is missing from the new producer
configs documentation). But it looks like the new consumer is not exposing
that option, using a fixed value instead. I think we should default to
hiding these implementation values unless there's a strong case for a
scenario that requires customization.

In other words, since the only user-facing change was the addition of the
setting, I'm wondering if we can avoid the KIP altogether by just choosing
a good default value for the timeout.

-Ewen

On Mon, Apr 13, 2015 at 2:35 PM, Jiangjie Qin <j...@linkedin.com.invalid>
wrote:

> Hi,
>
> I just created a KIP to add a request timeout to NetworkClient for new
> Kafka clients.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-19+-+Add+a+request+timeout+to+NetworkClient
>
> Comments and suggestions are welcome!
>
> Thanks.
>
> Jiangjie (Becket) Qin
>
>


-- 
Thanks,
Ewen
