Re: API and terms: idle-time-out and heartbeat intervals.

Rob Godfrey Wed, 28 Sep 2016 03:19:54 -0700

I agree that specifying that the communicated figure should be "half"
the "actual" timeout was a mistake.


What the spec should have tried to communicate is that the sender
should communicate a value somewhat less than the period it uses to
determine that the connection has actually timed-out to allow for the
receiver to process and emit a heartbeat frame.  Similarly the sender
should ensure that a frame has been emitted well within the timeout
period to allow for any communication / processing delay.  In practice
these "wiggle room" factors should not be determined by the
application level timeout setting but by sensible calculations on
transport delay variance / processing time, etc...  these calculation
may differ between different use-cases / environments (for example in
a low latency / real-time environment you may be able to make hard
guarantees about the number of milliseconds that communication /
processing delay will take... on the other hand if you are using an
interpreted language with stop-the-world garbage collection you may
not be able to say much better than the delay should be less than 30s
or whatever).

I think application level APIs should be in terms of the timeouts that
will affect the application.  The AMQP library should be massaging
those numbers in such a way that they can fulfil the application
requirements.

-- Rob

On 28 September 2016 at 10:42, Robbie Gemmell <robbie.gemm...@gmail.com> wrote:
> On 27 September 2016 at 22:24, Alan Conway <acon...@redhat.com> wrote:
>> On Tue, 2016-09-27 at 15:37 -0400, Alan Conway wrote:
>>> I want to clarify and document the meaning of these terms for our
>>> APIs,
>>> presently I can't find anywhere where they are documented clearly.
>>>
>>> The AMQP spec says: "Each peer has its own (independent) idle
>>> timeout.
>>> At connection open each peer communicates the maximum
>>> period between activity (frames) on the connection that it desires
>>> from
>>> its partner.The open frame carries the idletime-out
>>> field for this purpose. To avoid spurious timeouts, the value in
>>> idle-
>>> time-out SHOULD be half the peer’s
>>> actual timeout threshold."
>>>
>>> In other words: if I send you an "open" frame with idle-time-out=N
>>> that
>>> means *you* should not wait for longer than N milliseconds to send a
>>> frame to me. It does not mean *I* will close the connection after N
>>> milliseconds, I SHOULD be more patient and wait for N*2 ms to avoid
>>> closing prematurely due to minor timing wobbles.
>>>
>>> I think the choice of name is slightly ambiguous but the spec is
>>> clear
>>> on the semantics, so it's important to document it to remove the
>>> ambiguity.
>>>
>>> Anybody disagree?
>>>
>>
>> Sigh. Sadly proton-C interprets "idle-timeout" differently depending on
>> which end of the connection you are on:
>>
>>       // as per the recommendation in the spec, advertise half our
>>       // actual timeout to the remote
>>       const pn_millis_t idle_timeout = transport->local_idle_timeout
>>           ? (transport->local_idle_timeout/2)
>>           : 0;
>>
>> So in proton, pn_set_idle_timeout does NOT mean set the AMQP idle-
>> timeout value, it means set the local "receive timeout" value and send
>> half that as the AMQP "send timeout" for the peer.
>>
>> I'm tempted to use a new term in the Go API: "heartbeat". To me that
>> clearly means the "send timeout" (hearts beat, they don't listen for
>> beats) so it coincides with the meaning of the AMQP "idle-timeout", but
>> without the ambiguity that is exacerbated by proton interpreting it
>> both ways.
>>
>>
>
> Proton may seem to behave differently on each end, but I don't think
> its necessarily a bad thing that it does, and it is also I think
> largely just reflecting an annoying bit in the spec around this where
> different behaviours are allowed for, whereas it would be easier if it
> had less wiggle room.
>
> The transport setter/getter for the local timeout takes the 'actual
> timeout' and then sends half of it as the advertised value in the Open
> sent. This makes a certain amount of sense since it ensures that
> appropriate behaviour is actually satisfied, rather than expecting the
> user to ensure they only give half the value they really want for
> their actual timeout. The getter for the remote timeout value on the
> other hand returns the advertised value from the Open that is
> received. I expect it does that since it cant actually ever return the
> remotes 'actual timeout' without making an assumption, i.e that they
> did in fact advertise half (or less) of their actual timeout, which
> the spec only says that they SHOULD do.
>
> Yes the local setter taking the advertised value may have been better
> for method consistency with the remote getter. On the other hand,
> sending of necessary heartbeats is handled directly by the transport
> during the tick process, so users may not necessarily even use the
> getter themselves, and proton uses that remote value internally by
> pessimistically halfing it to account for the case that folks on the
> other end did not advertise half their actual timeout (since the spec
> doesnt require that they do). Side note: proton could arguably be less
> pessimistic here and go for say a percentage much nearer the full
> advertised value, but then you'd probably need to start guaging how
> close is too close.
>
> I think ensuring the doccumentation on the methods is clear what they
> do is sufficient enough here. I actually prefer idle-timeout as an
> name rather than heartbeat due to the way this all works. Since you
> only tell the other side [half] your timeout, you dont actually have
> direct control over when they send any needed empty frames to satisfy
> it (as the above shows, we might send them more often than they
> require) and 'heartbeat' might seem to imply that you do, and possibly
> even that they need be sent at that period all the time even despite
> regular traffic, which is not the case.
>
> Robbie
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
> For additional commands, e-mail: dev-h...@qpid.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org

Re: API and terms: idle-time-out and heartbeat intervals.

Reply via email to