Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Luca Toscano Tue, 18 Jul 2017 10:13:53 -0700

2017-07-17 9:33 GMT+02:00 Stefan Eissing <[email protected]>:


>
> > Am 14.07.2017 um 21:52 schrieb Yann Ylavic <[email protected]>:
> >
> > On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <[email protected]>
> wrote:
> >> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <[email protected]>
> wrote:
> >>>
> >>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
> >>>>
> >>>> IMHO mod_ssl shoudn't (BIO_)flush unconditionally in
> >>>> modssl_smart_shutdown(), only in the "abortive" mode of
> >>>> ssl_filter_io_shutdown().
> >>>
> >>> I think the issue starts before that.
> >>> ap_prep_lingering_close calls the pre_close_connection hook and
> modules that are registered
> >>> to this hook can perform all sort of long lasting blocking operations
> there.
> >>> While it can be argued that this would be a bug in the module I think
> the only safe way
> >>> is to have the whole start_lingering_close_nonblocking being executed
> by a worker thread.
> >>
> >> Correct, that'd be much simpler/safer indeed.
> >> We need a new SHUTDOWN state then, right?
> >
> > Actually it was less simple than expected, and it has some caveats
> obviously...
> >
> > The attached patch does not introduce a new state but reuses the
> > existing CONN_STATE_LINGER since it was not really considered by the
> > listener thread (which uses CONN_STATE_LINGER_NORMAL and
> > CONN_STATE_LINGER_SHORT instead), but that's a detail.
> >
> > Mainly, start_lingering_close_nonblocking() now simply schedules a
> > shutdown (i.e. pre_close_connection() followed by immediate close)
> > that will we be run by a worker thread.
> > A new shutdown_linger_q is created/handled (with the same timeout as
> > the short_linger_q, namely 2 seconds) to hold connections to be
> > shutdown.
> >
> > So now when a connection times out in the write_completion or
> > keepalive queues, it needs (i.e. the listener may wait for) an
> > available worker to process its shutdown/close.
> > This means we can *not* close kept alive connections immediatly like
> > before when becoming short of workers, which will favor active KA
> > connections over new ones in this case (I don't think it's that
> > serious but the previous was taking care of that. For me it's up to
> > the admin to size the workers appropriately...).
> >
> > Same when a connection in the shutdown_linger_q itself times out, the
> > patch will require a worker immediatly to do the job (see
> > shutdown_lingering_close() callback).
> >
> > So overall, this patch may introduce the need for more workers than
> > before, what was (wrongly) done by the listener thread has to be done
> > somewhere anyway...
> >
> > Finally, I think there is room for improvements like batching
> > shutdowns in the same worker if there is no objection on the approach
> > so far.
> >
> > WDYT?
>
> I will test the patch, most likely today. I lot of +1s for the initiative!
>
>
Thanks a lot for this work Yann, really nice! I will try to test it as well
during the next days, I was not convinced in the beginning that triggering
a (potential) increase in workers usage was super great for our users but
it is definitely the most correct and safe thing to do. Even if we find a
way to fix the ssl-lingering-close-block issue we might encounter a similar
one in the future, so it is better imho to fix it at the source.

Luca

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Reply via email to