From: Josh Poimboeuf
> Sent: 02 March 2021 00:11
>
> We had a report of a regression in the TCP keepalive timer.  The user
> had a 3600s keepalive timer for preventing firewall disconnects (on a
> 3650s interval).  They observed keepalive timers coming in up to four
> minutes late, causing unexpected disconnects.
>
> The regression was observed to have come from the timer wheel rewrite
> from almost five years ago:
>
>   500462a9de65 ("timers: Switch to a non-cascading wheel")
>
> As you mentioned, with a HZ of 1000, the granularity for a one-hour
> timer is four minutes, which matches the seen behavior.
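The granularity figure quoted above can be reproduced with a little arithmetic. What follows is only a rough sketch in Python, not the kernel code: it assumes the wheel layout described in the big comment in kernel/time/timer.c (9 levels, 64 buckets per level, each level 8 times coarser than the one below), and the function names are made up for illustration.

```python
# Simplified model of the non-cascading timer wheel's level selection.
# The constants mirror the kernel's LVL_* names, but this is an
# illustrative approximation, not a copy of the kernel implementation.

HZ = 1000            # jiffies per second, as in the report
LVL_CLK_SHIFT = 3    # each level is 2**3 = 8 times coarser than the previous
LVL_BITS = 6         # 2**6 = 64 buckets per level
LVL_SIZE = 1 << LVL_BITS
LVL_DEPTH = 9        # number of levels assumed for HZ > 100


def level_start(level):
    """First timeout (in jiffies) that is placed in the given level."""
    return (LVL_SIZE - 1) << ((level - 1) * LVL_CLK_SHIFT)


def wheel_level(delta_jiffies):
    """Level a timer expiring delta_jiffies from now would be queued in."""
    for level in range(LVL_DEPTH - 1):
        if delta_jiffies < level_start(level + 1):
            return level
    return LVL_DEPTH - 1


def granularity_seconds(level):
    """Bucket width of a level, in seconds."""
    return (1 << (level * LVL_CLK_SHIFT)) / HZ


if __name__ == "__main__":
    keepalive = 3600 * HZ  # the one-hour keepalive timer from the report
    level = wheel_level(keepalive)
    print("one-hour timer lands in level", level)
    # Show the chosen level and the two finer ones for comparison.
    for lvl in (level, level - 1, level - 2):
        print("level", lvl, "granularity:", granularity_seconds(lvl), "s")
```

Under these assumptions the one-hour timer falls into the level whose buckets are 2^18 jiffies wide, roughly 262 seconds, which is well over the 50 seconds of slack between the 3600s keepalive and the 3650s firewall interval. The two finer levels come out at roughly 32 and 4 seconds.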
That seems horribly broken - if technically valid.

Reading the big comment, even the 32sec of the next finer 'wheel'
seems a little coarse for a 1h timer.
The second finer wheel has 4sec resolution - which is probably reasonable.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)