Hi David,

I may be a bit confused before, just clarifying a few things:

1. As you mentioned, a client will always try to first establish the
connection with a broker node before it tries to send any request to it.
And after connection is established, it will either continuously send many
requests (e.g. produce) for just a single request (e.g. metadata) to the
broker, so these two phases are indeed different.

2. In the connected phase, connections.max.idle.ms is used to
auto-disconnect the socket if no requests has been sent / received during
that period of time; in the connecting phase, we always try to create the
socket via "socketChannel.connect" in a non-blocking call, and then checks
if the connection has been established, but all the callers of this
function (in either producer or consumer) has a timeout parameter as in
`selector.poll()`, and the timeout parameter is set either by calculations
based on metadata.expiration.time and backoff for producer#sender, or by
directly passed values from consumer#poll(timeout), so although there is no
directly config controlling that, users can still control how much time in
maximum to wait for inside code.

I originally thought your scenarios is more on the connected phase, but now
I feel you are talking about the connecting phase. For that case, I still
feel currently the timeout value passed in `selector.poll()` which is
controllable from user code should be sufficient?


Guozhang




On Sun, May 14, 2017 at 2:37 AM, 东方甲乙 <254479...@qq.com> wrote:

> Hi Guozhang,
>
>
> Sorry for the delay, thanks for the question.  It seems two different
> parameters to me:
> connect.timeout.ms: only work for the connecting phrase, after connected
> phrase this parameter is not used.
> connections.max.idle.ms: currently not work in the connecting phrase
> (only select return readyKeys >0) will add to the expired manager, after
> connected will check if the connection is still alive in some time.
>
>
> Even if we change the connections.max.idle.ms to work including the
> connecting phrase, we can not set this parameter to a small value, such as
> 5 seconds. Because the client is maybe busy sending message to other node,
> it will be disconnected in 5 seconds, so the default value of
> connections.max.idle.ms is setting to a larger time. We should have two
> parameters to control the connecting phrase behavior and the connected
> phrase behavior, do you think so?
>
>
> Thanks,
>
>
> David
>
>
>
>
> ------------------ 原始邮件 ------------------
> 发件人: "Guozhang Wang";<wangg...@gmail.com>;
> 发送时间: 2017年5月6日(星期六) 上午7:52
> 收件人: "dev@kafka.apache.org"<dev@kafka.apache.org>;
>
> 主题: Re: [DISCUSS] KIP-148: Add a connect timeout for client
>
>
>
> Hello David,
>
> Thanks for the KIP. For the described issue, I'm wondering if it can be
> resolved by tuning the CONNECTIONS_MAX_IDLE_MS_CONFIG (
> connections.max.idle.ms) on the client side? Default is 9 minutes.
>
>
> Guozhang
>
> On Tue, May 2, 2017 at 8:22 AM, 东方甲乙 <254479...@qq.com> wrote:
>
> > Hi all,
> >
> > Currently in our test environment, we found that after one of the broker
> > node crash (reboot or os crash), the client may still be connecting to
> the
> > crash node to send metadata request or other request, and it needs
> several
> > minutes to be aware that the connection is timeout then try another node
> to
> > connect to send the request. Then the client may still not be aware of
> the
> > metadata change after several minutes.
> >
> >
> > So I want to add a connect timeout on the  client,  please take a look
> at:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 148%3A+Add+a+connect+timeout+for+client
> >
> > Regards,
> >
> > David
>
>
>
>
> --
> -- Guozhang
>



-- 
-- Guozhang

Reply via email to