Hello,

On Tue, 9 Jan 2024 08:59:14 +0300
Maxim Dounin <mdou...@mdounin.ru> wrote:
> Hello!
>
> On Mon, Jan 08, 2024 at 01:31:11PM +0000, J Carter wrote:
>
> > On Mon, 8 Jan 2024 11:25:55 +0000
> > J Carter <jordanc.car...@outlook.com> wrote:
> >
> > > Hello,
> > >
> > > On Mon, 27 Nov 2023 05:50:27 +0300
> > > Maxim Dounin <mdou...@mdounin.ru> wrote:
> > >
> > > > # HG changeset patch
> > > > # User Maxim Dounin <mdou...@mdounin.ru>
> > > > # Date 1701050170 -10800
> > > > #      Mon Nov 27 04:56:10 2023 +0300
> > > > # Node ID 00c3e7333145ddb5ea0eeaaa66b3d9c26973c9c2
> > > > # Parent  61d08e4cf97cc073200ec32fc6ada9a2d48ffe51
> > > > AIO operations now add timers (ticket #2162).
> > > >
> > > > Each AIO (thread IO) operation being run is now accompanied with a
> > > > 1-minute timer.  This timer prevents unexpected shutdown of the
> > > > worker process while an AIO operation is running, and logs an alert
> > > > if the operation is running for too long.
> > >
> > > Shouldn't this timer's duration be set to match
> > > worker_shutdown_timeout's duration rather than being hardcoded
> > > to 60s?
> >
> > Ah, never mind, I understand.
> >
> > These timers will either expire after the 60s set duration, or will
> > expire as worker_shutdown_timeout itself expires, kills the
> > connection, and times out the associated timers (including the AIO
> > timers).
> >
> > Setting it to worker_shutdown_timeout's duration would be pointless
> > (an 'infinite' timer would give the same result).
> >
> > So the only situation in which a different value for these AIO
> > timers would make sense is if these AIO operations are expected to
> > take longer than 60s, but less than worker_shutdown_timeout (in
> > cases where it has been increased from its own default of 60s).
> >
> > In that case the AIO operation's timeout would have to be one
> > (or more) of its own directives, with a value less than
> > worker_shutdown_timeout.
>
> Not really.
> When worker_shutdown_timeout expires, it tries to terminate the
> request, but it can't as long as an AIO operation is running.
> When the AIO operation completes, the request will be actually
> terminated and the worker process will be allowed to exit.  So far
> so good.
>
> But if the AIO operation never completes, the timer will expire
> after 1 minute, an alert will be logged, and the worker process
> will anyway be allowed to exit (with the request still present).
> This might not actually be possible, though - for example,
> ngx_thread_pool_exit_worker() will just block waiting for all
> pending operations to complete.
>
> In theory, the situation when an AIO operation never completes
> should never happen, and just a counter of pending AIO
> operations could be used instead to delay shutdown (which is
> essentially equivalent to an infinite timer).
>
> In practice, though, using a large enough guard timer might be
> better: it provides an additional level of protection against bugs
> or system misbehaviour, and at least logs an alert if something
> really weird happens.  It also looks more universal and in line
> with the current approach of using existing timers as an indicator
> that something is going on and shutdown should be delayed.
>
> The timer is expected to be "large enough": since we cannot do
> anything meaningful with an AIO operation which never completes,
> we can only complain loudly, so the timer should never expire
> unless something is really wrong.  This is not the case with
> worker_shutdown_timeout: it can be set to an arbitrarily low value,
> and its expiry is not expected to mean that something is really
> wrong, but rather means that nginx shouldn't try to process the
> remaining requests, but should instead close them as soon as it
> can do so.  That is, worker_shutdown_timeout does not fit
> semantically.  Further, worker_shutdown_timeout is not set by
> default, so it simply cannot be used.
Good point - for whatever reason I had it in my mind that it was set by
default (and you're right, it doesn't fit in any case).

> The 1 minute was chosen as it matches the default send_timeout, which
> typically accompanies AIO operations when sending responses (and
> also delays shutdown, so no "open socket left" alerts are normally
> seen).  Still, send_timeout is also quite different semantically,
> and therefore a hardcoded value is used instead.
>
> I don't think there are valid cases when AIO operations can take
> longer than 1 minute, as these are considered to be small and fast
> operations, which are normally done synchronously within the nginx
> event loop when not using AIO, and an operation which takes 1m
> would mean nginx is completely unresponsive.  Still, if we find
> out that 1 minute is not large enough in some cases, we can just
> bump it to a larger value.
>
> [...]

Thanks for the detailed explanation - makes sense to me.

_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
https://mailman.nginx.org/mailman/listinfo/nginx-devel