On 19.07.2017 at 16:59, Stefan Priebe - Profihost AG wrote:
> Hello Yann,
>
> I'm observing some deadlocks again.
>
> I'm using:
>   httpd 2.4.27
>   + mod_h2
>   + httpd-2.4.x-mpm_event-wakeup-v7.1.patch
>   + your SSL linger fix patch from this thread
>
> What kind of information do you need? If you need a full stack backtrace
> - from which pid? Or from all httpd pids?
Something I forgot to mention: it seems httpd is running at max threads:

  awk '{print $10 $11}' lsof.txt | sort | uniq -c | grep LISTEN
    25050 *:http (LISTEN)
    25050 *:https (LISTEN)

Stefan

>
> Thanks!
>
> Greets,
> Stefan
>
> On 14.07.2017 at 21:52, Yann Ylavic wrote:
>> On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <ylavic....@gmail.com> wrote:
>>> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rpl...@apache.org> wrote:
>>>>
>>>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>>>>
>>>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>>>> ssl_filter_io_shutdown().
>>>>
>>>> I think the issue starts before that.
>>>> ap_prep_lingering_close calls the pre_close_connection hook, and modules
>>>> registered for this hook can perform all sorts of long-lasting blocking
>>>> operations there.
>>>> While it can be argued that this would be a bug in the module, I think
>>>> the only safe way is to have the whole of
>>>> start_lingering_close_nonblocking executed by a worker thread.
>>>
>>> Correct, that'd be much simpler/safer indeed.
>>> We need a new SHUTDOWN state then, right?
>>
>> Actually it was less simple than expected, and it has some obvious
>> caveats...
>>
>> The attached patch does not introduce a new state but reuses the
>> existing CONN_STATE_LINGER, since it was not really considered by the
>> listener thread (which uses CONN_STATE_LINGER_NORMAL and
>> CONN_STATE_LINGER_SHORT instead), but that's a detail.
>>
>> Mainly, start_lingering_close_nonblocking() now simply schedules a
>> shutdown (i.e. pre_close_connection() followed by an immediate close)
>> that will be run by a worker thread.
>> A new shutdown_linger_q is created/handled (with the same timeout as
>> the short_linger_q, namely 2 seconds) to hold the connections to be
>> shut down.
>>
>> So now when a connection times out in the write_completion or
>> keepalive queues, it needs (i.e. the listener may wait for) an
>> available worker to process its shutdown/close.
>> This means we can *not* close kept-alive connections immediately, as
>> before, when running short of workers, which favors active KA
>> connections over new ones in this case (I don't think that's too
>> serious, but the previous code took care of it. For me it's up to
>> the admin to size the workers appropriately...).
>>
>> Likewise, when a connection in the shutdown_linger_q itself times out,
>> the patch requires a worker immediately to do the job (see the
>> shutdown_lingering_close() callback).
>>
>> So overall, this patch may require more workers than before; what was
>> (wrongly) done by the listener thread has to be done somewhere anyway...
>>
>> Finally, I think there is room for improvement, like batching
>> shutdowns in the same worker, if there is no objection to the approach
>> so far.
>>
>> WDYT?
>>
>> Regards,
>> Yann.
>>