On Sun, Jul 22, 2012 at 8:49 AM, Rainer Jung <rainer.j...@kippdata.de> wrote: > On 19.07.2012 23:31, traw...@apache.org wrote: >> >> Author: trawick >> Date: Thu Jul 19 21:31:52 2012 >> New Revision: 1363557 >> >> URL: http://svn.apache.org/viewvc?rev=1363557&view=rev >> Log: >> mpm_event, mpm_worker: Remain active amidst prevalent child process >> resource shortages. >> >> This is a somewhat different direction than r168182 ("transient thread >> creation errors shouldn't take down the whole server"). >> >> r168182: If APEXIT_CHILDSICK is received and there aren't any >> active children at the time, exit. >> >> Now: If APEXIT_CHILDSICK is received and we never successfully >> initialized a child, exit. >> >> The issue seen with the r168182 handling is that it is rather easy >> to be left with no active child processes (which causes the server >> to exit completely) during a resource shortage that lasts for some >> measurable period of time, as contrasted with a resource shortage >> that results in only a handful of allocation failures. >> >> Now the server will remain active, though as long as the resource >> shortage exists children may continually fail and the parent will >> try once per second to create a replacement. The existing logic >> to reduce the spawn rate after such errors will prevent the >> parent from trying to create children more rapidly. >> >> Modified: >> httpd/httpd/trunk/CHANGES >> httpd/httpd/trunk/docs/log-message-tags/next-number >> httpd/httpd/trunk/server/mpm/event/event.c >> httpd/httpd/trunk/server/mpm/worker/worker.c >> >> Modified: httpd/httpd/trunk/CHANGES >> URL: >> http://svn.apache.org/viewvc/httpd/httpd/trunk/CHANGES?rev=1363557&r1=1363556&r2=1363557&view=diff >> >> ============================================================================== >> --- httpd/httpd/trunk/CHANGES [utf-8] (original) >> +++ httpd/httpd/trunk/CHANGES [utf-8] Thu Jul 19 21:31:52 2012 >> @@ -1,6 +1,9 @@ >> -*- coding: >> utf-8 -*- >> Changes with Apache 2.5.0 >> >> + *) mpm_event, mpm_worker: Remain active amidst prevalent child process >> + resource shortages. [Jeff Trawick] >> + >> *) mpm_event, mpm_worker: Fix cases where the spawn rate wasn't >> reduced >> after child process resource shortages. [Jeff Trawick] >> >> >> Modified: httpd/httpd/trunk/docs/log-message-tags/next-number >> URL: >> http://svn.apache.org/viewvc/httpd/httpd/trunk/docs/log-message-tags/next-number?rev=1363557&r1=1363556&r2=1363557&view=diff >> >> ============================================================================== >> --- httpd/httpd/trunk/docs/log-message-tags/next-number (original) >> +++ httpd/httpd/trunk/docs/log-message-tags/next-number Thu Jul 19 >> 21:31:52 2012 >> @@ -1 +1 @@ >> -2324 >> +2326 >> >> Modified: httpd/httpd/trunk/server/mpm/event/event.c >> URL: >> http://svn.apache.org/viewvc/httpd/httpd/trunk/server/mpm/event/event.c?rev=1363557&r1=1363556&r2=1363557&view=diff >> >> ============================================================================== >> --- httpd/httpd/trunk/server/mpm/event/event.c (original) >> +++ httpd/httpd/trunk/server/mpm/event/event.c Thu Jul 19 21:31:52 2012 >> @@ -183,6 +183,7 @@ static apr_uint32_t lingering_count = 0; >> static apr_uint32_t suspended_count = 0; /* Number of suspended >> connections */ >> static apr_uint32_t clogged_count = 0; /* Number of threads >> processing ssl conns */ >> static int resource_shortage = 0; >> +static int had_healthy_child = 0; >> static fd_queue_t *worker_queue; >> static fd_queue_info_t *worker_queue_info; >> static int mpm_state = AP_MPMQ_STARTING; >> @@ -2403,6 +2404,7 @@ static void perform_idle_server_maintena >> int any_dying_threads = 0; >> int any_dead_threads = 0; >> int all_dead_threads = 1; >> + int child_threads_active = 0; >> >> if (i >= retained->max_daemons_limit >> && totally_free_length == retained->idle_spawn_rate) >> @@ -2438,6 +2440,7 @@ static void perform_idle_server_maintena >> } >> if (status >= SERVER_READY && status < SERVER_GRACEFUL) >> { >> ++active_thread_count; >> + ++child_threads_active; > > > Couldn't we now simplify by adding child_threads_active to > active_thread_count outside the loop instead of incrementing > active_thread_count for each thread?
yes/thanks! > > >> } >> } >> } >> @@ -2464,6 +2467,9 @@ static void perform_idle_server_maintena >> } >> ++free_length; >> } >> + else if (child_threads_active == threads_per_child) { >> + had_healthy_child = 1; >> + } > > > As I understand it had_healthy_child is never reset. So when we do an > "apachectl restart" (or graceful) the children started before that event > still count. I'd say after a restart the condition should be reset, because > often the config will have changed. Ahh, I had loadable-MPM in my mind, and had_healthy_child would be reset when the MPM was reloaded. But with a statically linked MPM had_healthy_child doesn't work right. I'll fix that too. (Generally I see some potential confusion between the flags/counters used in children, for which a forked child always gets a clean, initialized copy, and flags/counters used in the parent but aren't intended to be retained across unload/load, which can have initialization issues. Maybe some rearrangement will help.) > > ... > > The same comments apply to the worker MPM patch. Yep. Thanks again for looking... > > Regards, > > Rainer -- Born in Roswell... married an alien... http://emptyhammock.com/