On Fri, Jul 21, 2017 at 3:07 PM, Luca Toscano <toscano.l...@gmail.com> wrote: >> >> To prevent starvation of deferred lingering closes, the listener may >> create a worker at the of its loop, when/if the chain is (fully) >> filled. > > IIUC the trick is to run "(have_idle_worker && push2worker(NULL) == > APR_SUCCESS)" that reserves a worker that in turn eventually checks the > defer_linger_chain as part of its new code.
That will be a dedicated worker, it won't handle a connection before flushing/closing the ones in defer_linger_chain, straight to that (thanks to the NULL arg). > This also seems sto leverage > workers_were_busy that will prioritize lingering closes over ka connections > right? (or at least try to) workers_were_busy is per loop (zeroed), so the new (late) get_worker which updates it won't affect the next loop (and since there will be a poll() in between this get_worker and the next time we'll have to decide whether to kill KA connections or not, it looks reasonable to not depend on the previous loop anyway). Regarding this part, though : if (defer_linger_chain) { get_worker(&have_idle_worker, 0, &workers_were_busy); if (have_idle_worker && push2worker(NULL) == APR_SUCCESS) { have_idle_worker = 0; } I think we could be smarter and possibly not create a worker if no lingering close was chained in *this* loop. With the current patch we would create one if, for example, a new connection is accepted before the worker in charge of the lingering closes (from the previous loop) did not finish its work. Here defer_linger_chain was not filled by the current loop, but not completely emptied either because the worker is blocking on its first connections to shutdown, hence != NULL. So, should we favor the draining of defer_linger_chain as much workers as necessary like the current patch, or should we have as few workers as possible and not start new workers in loops with no effect on defer_linger_chain? Also, it seems that in the deferred lingering case we should probaly shorten the socket timeout before calling (and possibly blocking on) ap_start_lingering_close()'s hooks/flush, since we likely come from a time-up already... Thoughts? Regards, Yann.