On 19.07.2017 at 16:59, Stefan Priebe - Profihost AG wrote:
> Hello Yann,
>
> I'm observing some deadlocks again.
>
> I'm using:
>   httpd 2.4.27
>   + mod_h2
>   + httpd-2.4.x-mpm_event-wakeup-v7.1.patch
>   + your SSL linger fix patch from this thread
>
> What kind of information do you need? If you need a full stack backtrace
> - from which pid? Or from all httpd pids?
Something I forgot to mention: it seems httpd is running at max threads:

  awk '{print $10 $11}' lsof.txt | sort | uniq -c | grep LISTEN
    25050 *:http (LISTEN)
    25050 *:https (LISTEN)

Stefan

>
> Thanks!
>
> Greets,
> Stefan
>
> On 14.07.2017 at 21:52, Yann Ylavic wrote:
>> On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <ylavic....@gmail.com> wrote:
>>> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rpl...@apache.org> wrote:
>>>>
>>>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>>>>
>>>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>>>> ssl_filter_io_shutdown().
>>>>
>>>> I think the issue starts before that.
>>>> ap_prep_lingering_close calls the pre_close_connection hook, and modules
>>>> registered for this hook can perform all sorts of long-lasting blocking
>>>> operations there.
>>>> While it can be argued that this would be a bug in the module, I think
>>>> the only safe way is to have the whole of
>>>> start_lingering_close_nonblocking executed by a worker thread.
>>>
>>> Correct, that'd be much simpler/safer indeed.
>>> We need a new SHUTDOWN state then, right?
>>
>> Actually it was less simple than expected, and it has some obvious
>> caveats...
>>
>> The attached patch does not introduce a new state but reuses the
>> existing CONN_STATE_LINGER, since it was not really considered by the
>> listener thread (which uses CONN_STATE_LINGER_NORMAL and
>> CONN_STATE_LINGER_SHORT instead), but that's a detail.
>>
>> Mainly, start_lingering_close_nonblocking() now simply schedules a
>> shutdown (i.e. pre_close_connection() followed by an immediate close)
>> that will be run by a worker thread.
>> A new shutdown_linger_q is created/handled (with the same timeout as
>> the short_linger_q, namely 2 seconds) to hold the connections to be
>> shut down.
>>
>> So now when a connection times out in the write_completion or
>> keepalive queues, it needs (i.e. the listener may wait for) an
>> available worker to process its shutdown/close.
>> This means we can *not* close kept-alive connections immediately, as
>> before, when running short of workers, which favors active KA
>> connections over new ones in this case (I don't think that's too
>> serious, but the previous code took care of it. For me it's up to
>> the admin to size the workers appropriately...).
>>
>> Likewise, when a connection in the shutdown_linger_q itself times out,
>> the patch requires a worker immediately to do the job (see the
>> shutdown_lingering_close() callback).
>>
>> So overall, this patch may require more workers than before; what was
>> (wrongly) done by the listener thread has to be done somewhere anyway...
>>
>> Finally, I think there is room for improvement, like batching
>> shutdowns in the same worker, if there is no objection to the approach
>> so far.
>>
>> WDYT?
>>
>> Regards,
>> Yann.
>>