Re: IEP-83 Thin Client Keepalive (heartbeat)

2022-02-15 Thread Ivan Daschinsky
I personally prefer heartbeatInterval

вт, 15 февр. 2022 г., 18:25 Pavel Tupitsyn :

> > What about "keepAlive", "keepAliveInterval" then? It looks more common
> and matches the IEP title :)
> According to Google, HeartbeatInterval has ~169K results, and
> KeepAliveInterval has ~110K :)
>
> In my experience, both are well understood. I am equally willing to use any
> of them.
> Any other opinions?
>
> On Tue, Feb 15, 2022 at 6:11 PM Maksim Timonin 
> wrote:
>
> > What about "keepAlive", "keepAliveInterval" then? It looks more common
> and
> > matches the IEP title :)
> >
> > On Tue, Feb 15, 2022 at 5:54 PM Pavel Tupitsyn 
> > wrote:
> >
> > > To summarize, we add two properties to the ClientConfiguration:
> > > bool heartbeatsEnabled = true;
> > > long defaultHeartbeatInterval = 60_000; // Default 1 minute, used
> > >
> > > Logic:
> > > if (heartbeatsEnabled) {
> > >   heartbeatInterval = serverIdleTimeout > 0 ? serverIdleTimeout / 3 :
> > > defaultHeartbeatInterval;
> > > }
> > >
> > >
> > > Thoughts, objections?
> > >
> > > On Tue, Feb 15, 2022 at 4:32 PM Ivan Daschinsky 
> > > wrote:
> > >
> > > > Pavel, sorry, i've made mistake. But current behaviour is ok for me.
> > This
> > > > timeout cannot be change on server side runtime. But we can simplify
> > > > protocol just use one opcode and message
> > > >
> > > > вт, 15 февр. 2022 г., 14:54 Ivan Daschinsky :
> > > >
> > > > > > Idle timeout can't change, why send it back with every heartbeat
> > > > > response?
> > > > > May be I am wrong, but from code I see this behaviour. But if I am
> > > wrong,
> > > > > this is ok behaviour for me.
> > > > >
> > > > >
> > > > >
> > > > > вт, 15 февр. 2022 г. в 14:00, Pavel Tupitsyn  >:
> > > > >
> > > > >> Ivan, I mostly agree with your proposal, except this point:
> > > > >>
> > > > >> > Response to heartbeat request -- is idle timeout
> > > > >> Idle timeout can't change, why send it back with every heartbeat
> > > > response?
> > > > >>
> > > > >> > possible cases with cluster restart, upgrade
> > > > >> In those cases, a new connection will be established, and we'll
> > > retrieve
> > > > >> the new timeout after the handshake.
> > > > >>
> > > > >>
> > > > >> On Tue, Feb 15, 2022 at 12:04 PM Maksim Timonin <
> > > > timoninma...@apache.org>
> > > > >> wrote:
> > > > >>
> > > > >> > Hi Ivan,
> > > > >> >
> > > > >> > Cases you described sound reasonable to me. Then the client
> should
> > > > just
> > > > >> set
> > > > >> > up the `keepAlive` flag, and it just works.
> > > > >> >
> > > > >> > So, there are 3 branches:
> > > > >> > 1. Users don't configure keepAlive at all.
> > > > >> > 2. Users configure keepAliveHeartbeatInterval (long, ms).
> > > > >> > 3. Users configure keepAlive (boolean).
> > > > >> >
> > > > >> > AFAIU, Pavel's proposal is about covering the second case only.
> > But
> > > > >> > actually the 2nd and 3rd aren't conflicted with each other.I
> think
> > > for
> > > > >> both
> > > > >> > branches, a cluster should respond with idleTimeout value on
> every
> > > > keep
> > > > >> > alive client request. Because there are possible cases with
> > cluster
> > > > >> > restart, upgrade, etc. Clients should check every response and
> in
> > > case
> > > > >> of
> > > > >> > changed idleTimeout. For 2nd case write a WARN message, and for
> > 3rd
> > > -
> > > > >> > reconfigure themself in case of changed idleTimeout.
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky <
> > > ivanda...@gmail.com>
> > > > >> > wrote:
> > > > >> >
> > > > >> > > Regarding discussion here [1]
> > > > >> > >
> > > > >> > > I suppose that this feature, despite the fact that initial
> > > intention
> > > > >> of
> > > > >> > > Pavel was different, can drastically
> > > > >> > > improve the usage pattern of thin clients and give a lot of
> > > > >> opportunities
> > > > >> > > if the following is done:
> > > > >> > >
> > > > >> > > 1. GridNioServer has a great feature -- idle timeout. If  a
> > server
> > > > did
> > > > >> > not
> > > > >> > > receive any from a client -- it will be kicked off.
> > > > >> > > But there are some scenarios that make the use of this
> > feature
> > > > >> > > impossible:
> > > > >> > > a. Multiple workers waiting for batch tasks and relatively low
> > > > >> requests
> > > > >> > > rate -- this services will be often kicked off and must
> > reconnect.
> > > > >> > > In order to prevent this behaviour, the user must implement a
> > kind
> > > > of
> > > > >> > > heartbeating by himself.
> > > > >> > > b. Quite often user may want to implement leader-follower
> > pattern
> > > > for
> > > > >> > > services for HA, so followers also will be considered as idle.
> > > > Kicking
> > > > >> > off
> > > > >> > > these followers
> > > > >> > > is not acceptable, so user  should also implement heartbeating
> > by
> > > > >> > himself.
> > > > >> > >
> > > > >> > > My proposition is:
> > > > >> > > 1. Add two flags -- 

Re: IEP-83 Thin Client Keepalive (heartbeat)

2022-02-15 Thread Pavel Tupitsyn
> What about "keepAlive", "keepAliveInterval" then? It looks more common
and matches the IEP title :)
According to Google, HeartbeatInterval has ~169K results, and
KeepAliveInterval has ~110K :)

In my experience, both are well understood. I am equally willing to use any
of them.
Any other opinions?

On Tue, Feb 15, 2022 at 6:11 PM Maksim Timonin 
wrote:

> What about "keepAlive", "keepAliveInterval" then? It looks more common and
> matches the IEP title :)
>
> On Tue, Feb 15, 2022 at 5:54 PM Pavel Tupitsyn 
> wrote:
>
> > To summarize, we add two properties to the ClientConfiguration:
> > bool heartbeatsEnabled = true;
> > long defaultHeartbeatInterval = 60_000; // Default 1 minute, used
> >
> > Logic:
> > if (heartbeatsEnabled) {
> >   heartbeatInterval = serverIdleTimeout > 0 ? serverIdleTimeout / 3 :
> > defaultHeartbeatInterval;
> > }
> >
> >
> > Thoughts, objections?
> >
> > On Tue, Feb 15, 2022 at 4:32 PM Ivan Daschinsky 
> > wrote:
> >
> > > Pavel, sorry, i've made mistake. But current behaviour is ok for me.
> This
> > > timeout cannot be change on server side runtime. But we can simplify
> > > protocol just use one opcode and message
> > >
> > > вт, 15 февр. 2022 г., 14:54 Ivan Daschinsky :
> > >
> > > > > Idle timeout can't change, why send it back with every heartbeat
> > > > response?
> > > > May be I am wrong, but from code I see this behaviour. But if I am
> > wrong,
> > > > this is ok behaviour for me.
> > > >
> > > >
> > > >
> > > > вт, 15 февр. 2022 г. в 14:00, Pavel Tupitsyn :
> > > >
> > > >> Ivan, I mostly agree with your proposal, except this point:
> > > >>
> > > >> > Response to heartbeat request -- is idle timeout
> > > >> Idle timeout can't change, why send it back with every heartbeat
> > > response?
> > > >>
> > > >> > possible cases with cluster restart, upgrade
> > > >> In those cases, a new connection will be established, and we'll
> > retrieve
> > > >> the new timeout after the handshake.
> > > >>
> > > >>
> > > >> On Tue, Feb 15, 2022 at 12:04 PM Maksim Timonin <
> > > timoninma...@apache.org>
> > > >> wrote:
> > > >>
> > > >> > Hi Ivan,
> > > >> >
> > > >> > Cases you described sound reasonable to me. Then the client should
> > > just
> > > >> set
> > > >> > up the `keepAlive` flag, and it just works.
> > > >> >
> > > >> > So, there are 3 branches:
> > > >> > 1. Users don't configure keepAlive at all.
> > > >> > 2. Users configure keepAliveHeartbeatInterval (long, ms).
> > > >> > 3. Users configure keepAlive (boolean).
> > > >> >
> > > >> > AFAIU, Pavel's proposal is about covering the second case only.
> But
> > > >> > actually the 2nd and 3rd aren't conflicted with each other.I think
> > for
> > > >> both
> > > >> > branches, a cluster should respond with idleTimeout value on every
> > > keep
> > > >> > alive client request. Because there are possible cases with
> cluster
> > > >> > restart, upgrade, etc. Clients should check every response and in
> > case
> > > >> of
> > > >> > changed idleTimeout. For 2nd case write a WARN message, and for
> 3rd
> > -
> > > >> > reconfigure themself in case of changed idleTimeout.
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky <
> > ivanda...@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> > > Regarding discussion here [1]
> > > >> > >
> > > >> > > I suppose that this feature, despite the fact that initial
> > intention
> > > >> of
> > > >> > > Pavel was different, can drastically
> > > >> > > improve the usage pattern of thin clients and give a lot of
> > > >> opportunities
> > > >> > > if the following is done:
> > > >> > >
> > > >> > > 1. GridNioServer has a great feature -- idle timeout. If  a
> server
> > > did
> > > >> > not
> > > >> > > receive any from a client -- it will be kicked off.
> > > >> > > But there are some scenarios that make the use of this
> feature
> > > >> > > impossible:
> > > >> > > a. Multiple workers waiting for batch tasks and relatively low
> > > >> requests
> > > >> > > rate -- this services will be often kicked off and must
> reconnect.
> > > >> > > In order to prevent this behaviour, the user must implement a
> kind
> > > of
> > > >> > > heartbeating by himself.
> > > >> > > b. Quite often user may want to implement leader-follower
> pattern
> > > for
> > > >> > > services for HA, so followers also will be considered as idle.
> > > Kicking
> > > >> > off
> > > >> > > these followers
> > > >> > > is not acceptable, so user  should also implement heartbeating
> by
> > > >> > himself.
> > > >> > >
> > > >> > > My proposition is:
> > > >> > > 1. Add two flags -- enable/disable heartbeats, and very optional
> > > >> > heartbeat
> > > >> > > timeout. Set enable to true by default, timeout to default
> > heartbeat
> > > >> > > timeout.
> > > >> > > 2. If server and client both support this feature, and
> heartbeats
> > > are
> > > >> not
> > > >> > > explicitly disabled on client side:
> > > >> > > 3. Response to heartbeat request -- is idle 

Re: IEP-83 Thin Client Keepalive (heartbeat)

2022-02-15 Thread Maksim Timonin
What about "keepAlive", "keepAliveInterval" then? It looks more common and
matches the IEP title :)

On Tue, Feb 15, 2022 at 5:54 PM Pavel Tupitsyn  wrote:

> To summarize, we add two properties to the ClientConfiguration:
> bool heartbeatsEnabled = true;
> long defaultHeartbeatInterval = 60_000; // Default 1 minute, used
>
> Logic:
> if (heartbeatsEnabled) {
>   heartbeatInterval = serverIdleTimeout > 0 ? serverIdleTimeout / 3 :
> defaultHeartbeatInterval;
> }
>
>
> Thoughts, objections?
>
> On Tue, Feb 15, 2022 at 4:32 PM Ivan Daschinsky 
> wrote:
>
> > Pavel, sorry, i've made mistake. But current behaviour is ok for me. This
> > timeout cannot be change on server side runtime. But we can simplify
> > protocol just use one opcode and message
> >
> > вт, 15 февр. 2022 г., 14:54 Ivan Daschinsky :
> >
> > > > Idle timeout can't change, why send it back with every heartbeat
> > > response?
> > > May be I am wrong, but from code I see this behaviour. But if I am
> wrong,
> > > this is ok behaviour for me.
> > >
> > >
> > >
> > > вт, 15 февр. 2022 г. в 14:00, Pavel Tupitsyn :
> > >
> > >> Ivan, I mostly agree with your proposal, except this point:
> > >>
> > >> > Response to heartbeat request -- is idle timeout
> > >> Idle timeout can't change, why send it back with every heartbeat
> > response?
> > >>
> > >> > possible cases with cluster restart, upgrade
> > >> In those cases, a new connection will be established, and we'll
> retrieve
> > >> the new timeout after the handshake.
> > >>
> > >>
> > >> On Tue, Feb 15, 2022 at 12:04 PM Maksim Timonin <
> > timoninma...@apache.org>
> > >> wrote:
> > >>
> > >> > Hi Ivan,
> > >> >
> > >> > Cases you described sound reasonable to me. Then the client should
> > just
> > >> set
> > >> > up the `keepAlive` flag, and it just works.
> > >> >
> > >> > So, there are 3 branches:
> > >> > 1. Users don't configure keepAlive at all.
> > >> > 2. Users configure keepAliveHeartbeatInterval (long, ms).
> > >> > 3. Users configure keepAlive (boolean).
> > >> >
> > >> > AFAIU, Pavel's proposal is about covering the second case only. But
> > >> > actually the 2nd and 3rd aren't conflicted with each other.I think
> for
> > >> both
> > >> > branches, a cluster should respond with idleTimeout value on every
> > keep
> > >> > alive client request. Because there are possible cases with cluster
> > >> > restart, upgrade, etc. Clients should check every response and in
> case
> > >> of
> > >> > changed idleTimeout. For 2nd case write a WARN message, and for 3rd
> -
> > >> > reconfigure themself in case of changed idleTimeout.
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky <
> ivanda...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Regarding discussion here [1]
> > >> > >
> > >> > > I suppose that this feature, despite the fact that initial
> intention
> > >> of
> > >> > > Pavel was different, can drastically
> > >> > > improve the usage pattern of thin clients and give a lot of
> > >> opportunities
> > >> > > if the following is done:
> > >> > >
> > >> > > 1. GridNioServer has a great feature -- idle timeout. If  a server
> > did
> > >> > not
> > >> > > receive any from a client -- it will be kicked off.
> > >> > > But there are some scenarios that make the use of this feature
> > >> > > impossible:
> > >> > > a. Multiple workers waiting for batch tasks and relatively low
> > >> requests
> > >> > > rate -- this services will be often kicked off and must reconnect.
> > >> > > In order to prevent this behaviour, the user must implement a kind
> > of
> > >> > > heartbeating by himself.
> > >> > > b. Quite often user may want to implement leader-follower pattern
> > for
> > >> > > services for HA, so followers also will be considered as idle.
> > Kicking
> > >> > off
> > >> > > these followers
> > >> > > is not acceptable, so user  should also implement heartbeating by
> > >> > himself.
> > >> > >
> > >> > > My proposition is:
> > >> > > 1. Add two flags -- enable/disable heartbeats, and very optional
> > >> > heartbeat
> > >> > > timeout. Set enable to true by default, timeout to default
> heartbeat
> > >> > > timeout.
> > >> > > 2. If server and client both support this feature, and heartbeats
> > are
> > >> not
> > >> > > explicitly disabled on client side:
> > >> > > 3. Response to heartbeat request -- is idle timeout. If idle
> timeout
> > >> is
> > >> > set
> > >> > > on the server side , set heartbeat timeout to one-third of it,
> > instead
> > >> > set
> > >> > > to default or specified value.
> > >> > >
> > >> > > Pros:
> > >> > > 1. Easy to set up -- just flag on client side and just set timeout
> > on
> > >> > > server side.
> > >> > > 2. Hard to configure improperly, i.e set heartbeat timeout not
> short
> > >> > enough
> > >> > > in order to prevent kicking out by server.
> > >> > > 3. If the user just wants heartbeats without setting idle timeout
> --
> > >> > > heartbeats are by default on and with reasonable timeout.
> > >> > >
> 

Re: IEP-83 Thin Client Keepalive (heartbeat)

2022-02-15 Thread Pavel Tupitsyn
To summarize, we add two properties to the ClientConfiguration:
bool heartbeatsEnabled = true;
long defaultHeartbeatInterval = 60_000; // Default 1 minute, used

Logic:
if (heartbeatsEnabled) {
  heartbeatInterval = serverIdleTimeout > 0 ? serverIdleTimeout / 3 :
defaultHeartbeatInterval;
}


Thoughts, objections?

On Tue, Feb 15, 2022 at 4:32 PM Ivan Daschinsky  wrote:

> Pavel, sorry, i've made mistake. But current behaviour is ok for me. This
> timeout cannot be change on server side runtime. But we can simplify
> protocol just use one opcode and message
>
> вт, 15 февр. 2022 г., 14:54 Ivan Daschinsky :
>
> > > Idle timeout can't change, why send it back with every heartbeat
> > response?
> > May be I am wrong, but from code I see this behaviour. But if I am wrong,
> > this is ok behaviour for me.
> >
> >
> >
> > вт, 15 февр. 2022 г. в 14:00, Pavel Tupitsyn :
> >
> >> Ivan, I mostly agree with your proposal, except this point:
> >>
> >> > Response to heartbeat request -- is idle timeout
> >> Idle timeout can't change, why send it back with every heartbeat
> response?
> >>
> >> > possible cases with cluster restart, upgrade
> >> In those cases, a new connection will be established, and we'll retrieve
> >> the new timeout after the handshake.
> >>
> >>
> >> On Tue, Feb 15, 2022 at 12:04 PM Maksim Timonin <
> timoninma...@apache.org>
> >> wrote:
> >>
> >> > Hi Ivan,
> >> >
> >> > Cases you described sound reasonable to me. Then the client should
> just
> >> set
> >> > up the `keepAlive` flag, and it just works.
> >> >
> >> > So, there are 3 branches:
> >> > 1. Users don't configure keepAlive at all.
> >> > 2. Users configure keepAliveHeartbeatInterval (long, ms).
> >> > 3. Users configure keepAlive (boolean).
> >> >
> >> > AFAIU, Pavel's proposal is about covering the second case only. But
> >> > actually the 2nd and 3rd aren't conflicted with each other.I think for
> >> both
> >> > branches, a cluster should respond with idleTimeout value on every
> keep
> >> > alive client request. Because there are possible cases with cluster
> >> > restart, upgrade, etc. Clients should check every response and in case
> >> of
> >> > changed idleTimeout. For 2nd case write a WARN message, and for 3rd -
> >> > reconfigure themself in case of changed idleTimeout.
> >> >
> >> >
> >> >
> >> >
> >> > On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky 
> >> > wrote:
> >> >
> >> > > Regarding discussion here [1]
> >> > >
> >> > > I suppose that this feature, despite the fact that initial intention
> >> of
> >> > > Pavel was different, can drastically
> >> > > improve the usage pattern of thin clients and give a lot of
> >> opportunities
> >> > > if the following is done:
> >> > >
> >> > > 1. GridNioServer has a great feature -- idle timeout. If  a server
> did
> >> > not
> >> > > receive any from a client -- it will be kicked off.
> >> > > But there are some scenarios that make the use of this feature
> >> > > impossible:
> >> > > a. Multiple workers waiting for batch tasks and relatively low
> >> requests
> >> > > rate -- this services will be often kicked off and must reconnect.
> >> > > In order to prevent this behaviour, the user must implement a kind
> of
> >> > > heartbeating by himself.
> >> > > b. Quite often user may want to implement leader-follower pattern
> for
> >> > > services for HA, so followers also will be considered as idle.
> Kicking
> >> > off
> >> > > these followers
> >> > > is not acceptable, so user  should also implement heartbeating by
> >> > himself.
> >> > >
> >> > > My proposition is:
> >> > > 1. Add two flags -- enable/disable heartbeats, and very optional
> >> > heartbeat
> >> > > timeout. Set enable to true by default, timeout to default heartbeat
> >> > > timeout.
> >> > > 2. If server and client both support this feature, and heartbeats
> are
> >> not
> >> > > explicitly disabled on client side:
> >> > > 3. Response to heartbeat request -- is idle timeout. If idle timeout
> >> is
> >> > set
> >> > > on the server side , set heartbeat timeout to one-third of it,
> instead
> >> > set
> >> > > to default or specified value.
> >> > >
> >> > > Pros:
> >> > > 1. Easy to set up -- just flag on client side and just set timeout
> on
> >> > > server side.
> >> > > 2. Hard to configure improperly, i.e set heartbeat timeout not short
> >> > enough
> >> > > in order to prevent kicking out by server.
> >> > > 3. If the user just wants heartbeats without setting idle timeout --
> >> > > heartbeats are by default on and with reasonable timeout.
> >> > >
> >> > > Cons:
> >> > > 1. If someone will rely on old behavior and just wants to drop his
> >> > clients
> >> > > on timeout -- this will not work without reconfiguring, he should
> >> disable
> >> > > heartbeats.
> >> > > But I cannot even imagine that someone will find this behaviour
> >> > desirable.
> >> > > I strongly believe that this behaviour prevents users from using
> >> > > idleTimeout on server side.
> >> > >
> >> > > [1] --
> >> 

Re: IEP-83 Thin Client Keepalive (heartbeat)

2022-02-15 Thread Ivan Daschinsky
Let's think about unify two operations

Pros.
1. Just one operation.
2. Possibility to change idle timeout in runtime on cluster (using
distributed property)

Cons.
1. Extra 8 bytes (as for me, it is negligible)

As for me, less op_codes and format messages is better.

вт, 15 февр. 2022 г. в 16:32, Ivan Daschinsky :

> Pavel, sorry, i've made mistake. But current behaviour is ok for me. This
> timeout cannot be change on server side runtime. But we can simplify
> protocol just use one opcode and message
>
> вт, 15 февр. 2022 г., 14:54 Ivan Daschinsky :
>
>> > Idle timeout can't change, why send it back with every heartbeat
>> response?
>> May be I am wrong, but from code I see this behaviour. But if I am wrong,
>> this is ok behaviour for me.
>>
>>
>>
>> вт, 15 февр. 2022 г. в 14:00, Pavel Tupitsyn :
>>
>>> Ivan, I mostly agree with your proposal, except this point:
>>>
>>> > Response to heartbeat request -- is idle timeout
>>> Idle timeout can't change, why send it back with every heartbeat
>>> response?
>>>
>>> > possible cases with cluster restart, upgrade
>>> In those cases, a new connection will be established, and we'll retrieve
>>> the new timeout after the handshake.
>>>
>>>
>>> On Tue, Feb 15, 2022 at 12:04 PM Maksim Timonin >> >
>>> wrote:
>>>
>>> > Hi Ivan,
>>> >
>>> > Cases you described sound reasonable to me. Then the client should
>>> just set
>>> > up the `keepAlive` flag, and it just works.
>>> >
>>> > So, there are 3 branches:
>>> > 1. Users don't configure keepAlive at all.
>>> > 2. Users configure keepAliveHeartbeatInterval (long, ms).
>>> > 3. Users configure keepAlive (boolean).
>>> >
>>> > AFAIU, Pavel's proposal is about covering the second case only. But
>>> > actually the 2nd and 3rd aren't conflicted with each other.I think for
>>> both
>>> > branches, a cluster should respond with idleTimeout value on every keep
>>> > alive client request. Because there are possible cases with cluster
>>> > restart, upgrade, etc. Clients should check every response and in case
>>> of
>>> > changed idleTimeout. For 2nd case write a WARN message, and for 3rd -
>>> > reconfigure themself in case of changed idleTimeout.
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky 
>>> > wrote:
>>> >
>>> > > Regarding discussion here [1]
>>> > >
>>> > > I suppose that this feature, despite the fact that initial intention
>>> of
>>> > > Pavel was different, can drastically
>>> > > improve the usage pattern of thin clients and give a lot of
>>> opportunities
>>> > > if the following is done:
>>> > >
>>> > > 1. GridNioServer has a great feature -- idle timeout. If  a server
>>> did
>>> > not
>>> > > receive any from a client -- it will be kicked off.
>>> > > But there are some scenarios that make the use of this feature
>>> > > impossible:
>>> > > a. Multiple workers waiting for batch tasks and relatively low
>>> requests
>>> > > rate -- this services will be often kicked off and must reconnect.
>>> > > In order to prevent this behaviour, the user must implement a kind of
>>> > > heartbeating by himself.
>>> > > b. Quite often user may want to implement leader-follower pattern for
>>> > > services for HA, so followers also will be considered as idle.
>>> Kicking
>>> > off
>>> > > these followers
>>> > > is not acceptable, so user  should also implement heartbeating by
>>> > himself.
>>> > >
>>> > > My proposition is:
>>> > > 1. Add two flags -- enable/disable heartbeats, and very optional
>>> > heartbeat
>>> > > timeout. Set enable to true by default, timeout to default heartbeat
>>> > > timeout.
>>> > > 2. If server and client both support this feature, and heartbeats
>>> are not
>>> > > explicitly disabled on client side:
>>> > > 3. Response to heartbeat request -- is idle timeout. If idle timeout
>>> is
>>> > set
>>> > > on the server side , set heartbeat timeout to one-third of it,
>>> instead
>>> > set
>>> > > to default or specified value.
>>> > >
>>> > > Pros:
>>> > > 1. Easy to set up -- just flag on client side and just set timeout on
>>> > > server side.
>>> > > 2. Hard to configure improperly, i.e set heartbeat timeout not short
>>> > enough
>>> > > in order to prevent kicking out by server.
>>> > > 3. If the user just wants heartbeats without setting idle timeout --
>>> > > heartbeats are by default on and with reasonable timeout.
>>> > >
>>> > > Cons:
>>> > > 1. If someone will rely on old behavior and just wants to drop his
>>> > clients
>>> > > on timeout -- this will not work without reconfiguring, he should
>>> disable
>>> > > heartbeats.
>>> > > But I cannot even imagine that someone will find this behaviour
>>> > desirable.
>>> > > I strongly believe that this behaviour prevents users from using
>>> > > idleTimeout on server side.
>>> > >
>>> > > [1] --
>>> https://github.com/apache/ignite/pull/9817#discussion_r805628955
>>> > >
>>> > > пт, 11 февр. 2022 г. в 10:58, Pavel Tupitsyn :
>>> > >
>>> > > > I've prepared a PR, please have a look:
>>> > > > 

Re: IEP-83 Thin Client Keepalive (heartbeat)

2022-02-15 Thread Ivan Daschinsky
Pavel, sorry, i've made mistake. But current behaviour is ok for me. This
timeout cannot be change on server side runtime. But we can simplify
protocol just use one opcode and message

вт, 15 февр. 2022 г., 14:54 Ivan Daschinsky :

> > Idle timeout can't change, why send it back with every heartbeat
> response?
> May be I am wrong, but from code I see this behaviour. But if I am wrong,
> this is ok behaviour for me.
>
>
>
> вт, 15 февр. 2022 г. в 14:00, Pavel Tupitsyn :
>
>> Ivan, I mostly agree with your proposal, except this point:
>>
>> > Response to heartbeat request -- is idle timeout
>> Idle timeout can't change, why send it back with every heartbeat response?
>>
>> > possible cases with cluster restart, upgrade
>> In those cases, a new connection will be established, and we'll retrieve
>> the new timeout after the handshake.
>>
>>
>> On Tue, Feb 15, 2022 at 12:04 PM Maksim Timonin 
>> wrote:
>>
>> > Hi Ivan,
>> >
>> > Cases you described sound reasonable to me. Then the client should just
>> set
>> > up the `keepAlive` flag, and it just works.
>> >
>> > So, there are 3 branches:
>> > 1. Users don't configure keepAlive at all.
>> > 2. Users configure keepAliveHeartbeatInterval (long, ms).
>> > 3. Users configure keepAlive (boolean).
>> >
>> > AFAIU, Pavel's proposal is about covering the second case only. But
>> > actually the 2nd and 3rd aren't conflicted with each other.I think for
>> both
>> > branches, a cluster should respond with idleTimeout value on every keep
>> > alive client request. Because there are possible cases with cluster
>> > restart, upgrade, etc. Clients should check every response and in case
>> of
>> > changed idleTimeout. For 2nd case write a WARN message, and for 3rd -
>> > reconfigure themself in case of changed idleTimeout.
>> >
>> >
>> >
>> >
>> > On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky 
>> > wrote:
>> >
>> > > Regarding discussion here [1]
>> > >
>> > > I suppose that this feature, despite the fact that initial intention
>> of
>> > > Pavel was different, can drastically
>> > > improve the usage pattern of thin clients and give a lot of
>> opportunities
>> > > if the following is done:
>> > >
>> > > 1. GridNioServer has a great feature -- idle timeout. If  a server did
>> > not
>> > > receive any from a client -- it will be kicked off.
>> > > But there are some scenarios that make the use of this feature
>> > > impossible:
>> > > a. Multiple workers waiting for batch tasks and relatively low
>> requests
>> > > rate -- this services will be often kicked off and must reconnect.
>> > > In order to prevent this behaviour, the user must implement a kind of
>> > > heartbeating by himself.
>> > > b. Quite often user may want to implement leader-follower pattern for
>> > > services for HA, so followers also will be considered as idle. Kicking
>> > off
>> > > these followers
>> > > is not acceptable, so user  should also implement heartbeating by
>> > himself.
>> > >
>> > > My proposition is:
>> > > 1. Add two flags -- enable/disable heartbeats, and very optional
>> > heartbeat
>> > > timeout. Set enable to true by default, timeout to default heartbeat
>> > > timeout.
>> > > 2. If server and client both support this feature, and heartbeats are
>> not
>> > > explicitly disabled on client side:
>> > > 3. Response to heartbeat request -- is idle timeout. If idle timeout
>> is
>> > set
>> > > on the server side , set heartbeat timeout to one-third of it, instead
>> > set
>> > > to default or specified value.
>> > >
>> > > Pros:
>> > > 1. Easy to set up -- just flag on client side and just set timeout on
>> > > server side.
>> > > 2. Hard to configure improperly, i.e set heartbeat timeout not short
>> > enough
>> > > in order to prevent kicking out by server.
>> > > 3. If the user just wants heartbeats without setting idle timeout --
>> > > heartbeats are by default on and with reasonable timeout.
>> > >
>> > > Cons:
>> > > 1. If someone will rely on old behavior and just wants to drop his
>> > clients
>> > > on timeout -- this will not work without reconfiguring, he should
>> disable
>> > > heartbeats.
>> > > But I cannot even imagine that someone will find this behaviour
>> > desirable.
>> > > I strongly believe that this behaviour prevents users from using
>> > > idleTimeout on server side.
>> > >
>> > > [1] --
>> https://github.com/apache/ignite/pull/9817#discussion_r805628955
>> > >
>> > > пт, 11 февр. 2022 г. в 10:58, Pavel Tupitsyn :
>> > >
>> > > > I've prepared a PR, please have a look:
>> > > > https://github.com/apache/ignite/pull/9817
>> > > >
>> > > > On Mon, Feb 7, 2022 at 6:37 PM Ivan Daschinsky > >
>> > > > wrote:
>> > > >
>> > > > > I see potential in this feature, especially if we use something
>> like
>> > > > > continuous query. Stale clients can consume a lot of resources
>> and it
>> > > is
>> > > > > worth kick these clients out.
>> > > > >
>> > > > > пн, 7 февр. 2022 г. в 18:25, Pavel Tupitsyn > >:
>> > > > >
>> > > > > > > If we use new approach, we 

Ignite 3 Join Protocol

2022-02-15 Thread Alexander Polovtcev
Hello, dear Igniters!

We've prepared an IEP

about how to assemble a cluster and add new nodes to it in Ignite 3. Please
note that some aspects are out of scope of this particular IEP (like the
procedure of a node leaving a cluster). Any feedback is appreciated.

-- 
With regards,
Aleksandr Polovtcev


Re: IEP-83 Thin Client Keepalive (heartbeat)

2022-02-15 Thread Ivan Daschinsky
> Idle timeout can't change, why send it back with every heartbeat response?
May be I am wrong, but from code I see this behaviour. But if I am wrong,
this is ok behaviour for me.



вт, 15 февр. 2022 г. в 14:00, Pavel Tupitsyn :

> Ivan, I mostly agree with your proposal, except this point:
>
> > Response to heartbeat request -- is idle timeout
> Idle timeout can't change, why send it back with every heartbeat response?
>
> > possible cases with cluster restart, upgrade
> In those cases, a new connection will be established, and we'll retrieve
> the new timeout after the handshake.
>
>
> On Tue, Feb 15, 2022 at 12:04 PM Maksim Timonin 
> wrote:
>
> > Hi Ivan,
> >
> > Cases you described sound reasonable to me. Then the client should just
> set
> > up the `keepAlive` flag, and it just works.
> >
> > So, there are 3 branches:
> > 1. Users don't configure keepAlive at all.
> > 2. Users configure keepAliveHeartbeatInterval (long, ms).
> > 3. Users configure keepAlive (boolean).
> >
> > AFAIU, Pavel's proposal is about covering the second case only. But
> > actually the 2nd and 3rd aren't conflicted with each other.I think for
> both
> > branches, a cluster should respond with idleTimeout value on every keep
> > alive client request. Because there are possible cases with cluster
> > restart, upgrade, etc. Clients should check every response and in case of
> > changed idleTimeout. For 2nd case write a WARN message, and for 3rd -
> > reconfigure themself in case of changed idleTimeout.
> >
> >
> >
> >
> > On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky 
> > wrote:
> >
> > > Regarding discussion here [1]
> > >
> > > I suppose that this feature, despite the fact that initial intention of
> > > Pavel was different, can drastically
> > > improve the usage pattern of thin clients and give a lot of
> opportunities
> > > if the following is done:
> > >
> > > 1. GridNioServer has a great feature -- idle timeout. If  a server did
> > not
> > > receive any from a client -- it will be kicked off.
> > > But there are some scenarios that make the use of this feature
> > > impossible:
> > > a. Multiple workers waiting for batch tasks and relatively low requests
> > > rate -- this services will be often kicked off and must reconnect.
> > > In order to prevent this behaviour, the user must implement a kind of
> > > heartbeating by himself.
> > > b. Quite often user may want to implement leader-follower pattern for
> > > services for HA, so followers also will be considered as idle. Kicking
> > off
> > > these followers
> > > is not acceptable, so user  should also implement heartbeating by
> > himself.
> > >
> > > My proposition is:
> > > 1. Add two flags -- enable/disable heartbeats, and very optional
> > heartbeat
> > > timeout. Set enable to true by default, timeout to default heartbeat
> > > timeout.
> > > 2. If server and client both support this feature, and heartbeats are
> not
> > > explicitly disabled on client side:
> > > 3. Response to heartbeat request -- is idle timeout. If idle timeout is
> > set
> > > on the server side , set heartbeat timeout to one-third of it, instead
> > set
> > > to default or specified value.
> > >
> > > Pros:
> > > 1. Easy to set up -- just flag on client side and just set timeout on
> > > server side.
> > > 2. Hard to configure improperly, i.e set heartbeat timeout not short
> > enough
> > > in order to prevent kicking out by server.
> > > 3. If the user just wants heartbeats without setting idle timeout --
> > > heartbeats are by default on and with reasonable timeout.
> > >
> > > Cons:
> > > 1. If someone will rely on old behavior and just wants to drop his
> > clients
> > > on timeout -- this will not work without reconfiguring, he should
> disable
> > > heartbeats.
> > > But I cannot even imagine that someone will find this behaviour
> > desirable.
> > > I strongly believe that this behaviour prevents users from using
> > > idleTimeout on server side.
> > >
> > > [1] --
> https://github.com/apache/ignite/pull/9817#discussion_r805628955
> > >
> > > пт, 11 февр. 2022 г. в 10:58, Pavel Tupitsyn :
> > >
> > > > I've prepared a PR, please have a look:
> > > > https://github.com/apache/ignite/pull/9817
> > > >
> > > > On Mon, Feb 7, 2022 at 6:37 PM Ivan Daschinsky 
> > > > wrote:
> > > >
> > > > > I see potential in this feature, especially if we use something
> like
> > > > > continuous query. Stale clients can consume a lot of resources and
> it
> > > is
> > > > > worth kick these clients out.
> > > > >
> > > > > пн, 7 февр. 2022 г. в 18:25, Pavel Tupitsyn  >:
> > > > >
> > > > > > > If we use new approach, we can reduce this timeout. But this
> can
> > > > affect
> > > > > > old clients.
> > > > > >
> > > > > > idleTimeout is disabled by default, we are not going to change
> > this.
> > > > > >
> > > > > > > Also, let's think about that sending heartbeats and interval of
> > > > sending
> > > > > > > heartbeats could be calculated on the server side (i.e. one
> third
> > > of
> > > 

Re: IEP-83 Thin Client Keepalive (heartbeat)

2022-02-15 Thread Pavel Tupitsyn
Ivan, I mostly agree with your proposal, except this point:

> Response to heartbeat request -- is idle timeout
Idle timeout can't change, why send it back with every heartbeat response?

> possible cases with cluster restart, upgrade
In those cases, a new connection will be established, and we'll retrieve
the new timeout after the handshake.


On Tue, Feb 15, 2022 at 12:04 PM Maksim Timonin 
wrote:

> Hi Ivan,
>
> Cases you described sound reasonable to me. Then the client should just set
> up the `keepAlive` flag, and it just works.
>
> So, there are 3 branches:
> 1. Users don't configure keepAlive at all.
> 2. Users configure keepAliveHeartbeatInterval (long, ms).
> 3. Users configure keepAlive (boolean).
>
> AFAIU, Pavel's proposal is about covering the second case only. But
> actually the 2nd and 3rd aren't conflicted with each other.I think for both
> branches, a cluster should respond with idleTimeout value on every keep
> alive client request. Because there are possible cases with cluster
> restart, upgrade, etc. Clients should check every response and in case of
> changed idleTimeout. For 2nd case write a WARN message, and for 3rd -
> reconfigure themself in case of changed idleTimeout.
>
>
>
>
> On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky 
> wrote:
>
> > Regarding discussion here [1]
> >
> > I suppose that this feature, despite the fact that initial intention of
> > Pavel was different, can drastically
> > improve the usage pattern of thin clients and give a lot of opportunities
> > if the following is done:
> >
> > 1. GridNioServer has a great feature -- idle timeout. If  a server did
> not
> > receive any from a client -- it will be kicked off.
> > But there are some scenarios that make the use of this feature
> > impossible:
> > a. Multiple workers waiting for batch tasks and relatively low requests
> > rate -- this services will be often kicked off and must reconnect.
> > In order to prevent this behaviour, the user must implement a kind of
> > heartbeating by himself.
> > b. Quite often user may want to implement leader-follower pattern for
> > services for HA, so followers also will be considered as idle. Kicking
> off
> > these followers
> > is not acceptable, so user  should also implement heartbeating by
> himself.
> >
> > My proposition is:
> > 1. Add two flags -- enable/disable heartbeats, and very optional
> heartbeat
> > timeout. Set enable to true by default, timeout to default heartbeat
> > timeout.
> > 2. If server and client both support this feature, and heartbeats are not
> > explicitly disabled on client side:
> > 3. Response to heartbeat request -- is idle timeout. If idle timeout is
> set
> > on the server side , set heartbeat timeout to one-third of it, instead
> set
> > to default or specified value.
> >
> > Pros:
> > 1. Easy to set up -- just flag on client side and just set timeout on
> > server side.
> > 2. Hard to configure improperly, i.e set heartbeat timeout not short
> enough
> > in order to prevent kicking out by server.
> > 3. If the user just wants heartbeats without setting idle timeout --
> > heartbeats are by default on and with reasonable timeout.
> >
> > Cons:
> > 1. If someone will rely on old behavior and just wants to drop his
> clients
> > on timeout -- this will not work without reconfiguring, he should disable
> > heartbeats.
> > But I cannot even imagine that someone will find this behaviour
> desirable.
> > I strongly believe that this behaviour prevents users from using
> > idleTimeout on server side.
> >
> > [1] -- https://github.com/apache/ignite/pull/9817#discussion_r805628955
> >
> > пт, 11 февр. 2022 г. в 10:58, Pavel Tupitsyn :
> >
> > > I've prepared a PR, please have a look:
> > > https://github.com/apache/ignite/pull/9817
> > >
> > > On Mon, Feb 7, 2022 at 6:37 PM Ivan Daschinsky 
> > > wrote:
> > >
> > > > I see potential in this feature, especially if we use something like
> > > > continuous query. Stale clients can consume a lot of resources and it
> > is
> > > > worth kick these clients out.
> > > >
> > > > пн, 7 февр. 2022 г. в 18:25, Pavel Tupitsyn :
> > > >
> > > > > > If we use new approach, we can reduce this timeout. But this can
> > > affect
> > > > > old clients.
> > > > >
> > > > > idleTimeout is disabled by default, we are not going to change
> this.
> > > > >
> > > > > > Also, let's think about that sending heartbeats and interval of
> > > sending
> > > > > > heartbeats could be calculated on the server side (i.e. one third
> > of
> > > > idle
> > > > > > timeout) and sent to the client during handshake.
> > > > > > Also we can introduce something like a negotiation mechanism as
> in
> > > > > > zookeeper.
> > > > >
> > > > > I tend to agree with Maksim here, let's keep it simple and
> explicit.
> > > > > Log a warning, but don't do anything clever.
> > > > >
> > > > > On Mon, Feb 7, 2022 at 6:15 PM Ivan Daschinsky <
> ivanda...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > >> idleTimeout already exists, I don't 

Re: IEP-83 Thin Client Keepalive (heartbeat)

2022-02-15 Thread Maksim Timonin
Hi Ivan,

Cases you described sound reasonable to me. Then the client should just set
up the `keepAlive` flag, and it just works.

So, there are 3 branches:
1. Users don't configure keepAlive at all.
2. Users configure keepAliveHeartbeatInterval (long, ms).
3. Users configure keepAlive (boolean).

AFAIU, Pavel's proposal is about covering the second case only. But
actually the 2nd and 3rd aren't conflicted with each other.I think for both
branches, a cluster should respond with idleTimeout value on every keep
alive client request. Because there are possible cases with cluster
restart, upgrade, etc. Clients should check every response and in case of
changed idleTimeout. For 2nd case write a WARN message, and for 3rd -
reconfigure themself in case of changed idleTimeout.




On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky  wrote:

> Regarding discussion here [1]
>
> I suppose that this feature, despite the fact that initial intention of
> Pavel was different, can drastically
> improve the usage pattern of thin clients and give a lot of opportunities
> if the following is done:
>
> 1. GridNioServer has a great feature -- idle timeout. If  a server did not
> receive any from a client -- it will be kicked off.
> But there are some scenarios that make the use of this feature
> impossible:
> a. Multiple workers waiting for batch tasks and relatively low requests
> rate -- this services will be often kicked off and must reconnect.
> In order to prevent this behaviour, the user must implement a kind of
> heartbeating by himself.
> b. Quite often user may want to implement leader-follower pattern for
> services for HA, so followers also will be considered as idle. Kicking off
> these followers
> is not acceptable, so user  should also implement heartbeating by himself.
>
> My proposition is:
> 1. Add two flags -- enable/disable heartbeats, and very optional heartbeat
> timeout. Set enable to true by default, timeout to default heartbeat
> timeout.
> 2. If server and client both support this feature, and heartbeats are not
> explicitly disabled on client side:
> 3. Response to heartbeat request -- is idle timeout. If idle timeout is set
> on the server side , set heartbeat timeout to one-third of it, instead set
> to default or specified value.
>
> Pros:
> 1. Easy to set up -- just flag on client side and just set timeout on
> server side.
> 2. Hard to configure improperly, i.e set heartbeat timeout not short enough
> in order to prevent kicking out by server.
> 3. If the user just wants heartbeats without setting idle timeout --
> heartbeats are by default on and with reasonable timeout.
>
> Cons:
> 1. If someone will rely on old behavior and just wants to drop his clients
> on timeout -- this will not work without reconfiguring, he should disable
> heartbeats.
> But I cannot even imagine that someone will find this behaviour desirable.
> I strongly believe that this behaviour prevents users from using
> idleTimeout on server side.
>
> [1] -- https://github.com/apache/ignite/pull/9817#discussion_r805628955
>
> пт, 11 февр. 2022 г. в 10:58, Pavel Tupitsyn :
>
> > I've prepared a PR, please have a look:
> > https://github.com/apache/ignite/pull/9817
> >
> > On Mon, Feb 7, 2022 at 6:37 PM Ivan Daschinsky 
> > wrote:
> >
> > > I see potential in this feature, especially if we use something like
> > > continuous query. Stale clients can consume a lot of resources and it
> is
> > > worth kick these clients out.
> > >
> > > пн, 7 февр. 2022 г. в 18:25, Pavel Tupitsyn :
> > >
> > > > > If we use new approach, we can reduce this timeout. But this can
> > affect
> > > > old clients.
> > > >
> > > > idleTimeout is disabled by default, we are not going to change this.
> > > >
> > > > > Also, let's think about that sending heartbeats and interval of
> > sending
> > > > > heartbeats could be calculated on the server side (i.e. one third
> of
> > > idle
> > > > > timeout) and sent to the client during handshake.
> > > > > Also we can introduce something like a negotiation mechanism as in
> > > > > zookeeper.
> > > >
> > > > I tend to agree with Maksim here, let's keep it simple and explicit.
> > > > Log a warning, but don't do anything clever.
> > > >
> > > > On Mon, Feb 7, 2022 at 6:15 PM Ivan Daschinsky 
> > > > wrote:
> > > >
> > > > > >> idleTimeout already exists, I don't think we should change the
> way
> > > it
> > > > > works (or did I misunderstand you?)
> > > > > If we use new approach, we can reduce this timeout. But this can
> > affect
> > > > old
> > > > > clients.
> > > > >
> > > > >
> > > > > Also, let's think about that sending heartbeats and interval of
> > sending
> > > > > heartbeats could be calculated on the server side (i.e. one third
> of
> > > idle
> > > > > timeout) and sent to the client
> > > > > during handshake.
> > > > > Also we can introduce something like a negotiation mechanism as in
> > > > > zookeeper.
> > > > >
> > > > >
> > > > > пн, 7 февр. 2022 г. в 18:05, Pavel Tupitsyn  >:
> > > > >
>