Re: [PATCH] #39275 MaxClients on startup [Was: Bug in 2.0.56-dev]

Jeff Trawick Wed, 12 Apr 2006 07:05:13 -0700

On 4/11/06, Chris Darroch <[EMAIL PROTECTED]> wrote:
> Hi --
>
> Alexander Lazic wrote:
>
> >> After 'make install' I started Apache, then some seconds later I got the
> >> message '...MaxClients reached...' but there was no entry in the access
> >> log, and nobody had made a request to this server.
>
> Jeff Trawick wrote:
>
> > There are problems accounting for child processes which are trying to
> > initialize that result in the parent thinking it needs to create more
> > children.  The less harmful flavor is when it thinks (incorrectly) it
> > is already at MaxClients and issues the "reached MaxClients" message.
> > More disturbing is when MaxClients is very high and the parent keeps
> > creating new children using exponential ramp-up.  That can be very
> > painful.
>
>    I have been seeing something similar with 2.2.0 using the worker
> MPM, where with the following settings, I get over 10 child processes
> initializing immediately (e.g., up to 15), and then they drop back to
> 10.  I see the "server reached MaxClients" message as well right
> after httpd startup, although nothing is connecting yet.
>
> <IfModule mpm_worker_module>
>     StartServers         10
>     MaxClients          150
>     MinSpareThreads      25
>     MaxSpareThreads     100
>     ThreadsPerChild      10
> </IfModule>
>
>    In my case, the problem relates to how long the child_init phase
> takes to execute.  I can "tune" this by raising DBDMin (and DBDKeep)
> so that mod_dbd attempts to open increasingly large numbers of
> DB connections during child_init.  With DBDMin set to 0 or 1,
> all is well; no funny behaviour.  Once DBDMin and DBDKeep reach 3,
> things (for me) go pear-shaped.
>
>    In server/mpm/worker/worker.c, after make_child() creates a
> child process it immediately sets the scoreboard parent slot's pid
> value.  The main process goes into server_main_loop() and begins
> executing perform_idle_server_maintenance() every second; this
> looks at any process with a non-zero pid in the scoreboard and
> assumes that any of its worker threads marked SERVER_DEAD are,
> in fact, dead.
>
>    However, if the child processes are starting "slowly" because
> ap_run_child_init() in child_main() is taking its time, then
> start_threads() hasn't even been run yet, so the threads aren't
> marked SERVER_STARTING -- they're just set to 0 as the default
> value.  But 0 == SERVER_DEAD, so the main process sees a lot
> of dead worker threads and begins spawning new child processes,
> up to MaxClients/ThreadsPerChild in the worst case.  In this case,
> when no worker threads have started yet, but all possible child
> processes have been spawned (and are working through their
> child_init phases), then the following is true and the
> "server reached MaxClients" message is printed, even though
> the server hasn't started accepting connections yet:
>
>     else if (idle_thread_count < min_spare_threads) {
>         /* terminate the free list */
>         if (free_length == 0) {
>
>    I considered wedging another thread status into the
> scoreboard, between SERVER_DEAD (the initial value) and
> SERVER_STARTING.  The make_child() would set all the thread
> slots to this value and start_threads() would later flip them
> to SERVER_STARTING after actually creating the worker threads.
>
>    That would have various ripple effects on other bits of
> httpd, though, like mod_status and other MPMs, etc.


In other words, breaks binary compatibility...

Other modules should see the threads in SERVER_STARTING state anyway. 
IOW, I think we should set state to SERVER_STARTING before we do any
potentially-lengthy work like running child-init hooks so that the
state as seen from the outside makes sense.  That also means resetting
the state if something fails (e.g., pthread_create()).

That isn't needed for proper operation of the MPM, which is what
we're after at the moment, but it would be great to be able to see
from mod_status that a child is taking way too long in the
SERVER_STARTING state.
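
A rough sketch of the ordering described here (mark the slots
SERVER_STARTING before the child-init hooks run, and reset them if
something fails), written as a standalone toy rather than worker.c
code.  The slot array, child_startup(), and the callbacks are
hypothetical stand-ins for the scoreboard, child_main(), the
child-init hooks, and the pthread_create() loop.  SERVER_DEAD is 0 as
noted in the thread; the other values are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* SERVER_DEAD == 0 per the thread; other values are illustrative. */
#define SERVER_DEAD     0
#define SERVER_STARTING 1
#define SERVER_READY    2

/* Hypothetical callback type: returns 0 on success, non-zero on failure. */
typedef int (*step_fn)(const unsigned char *slots, size_t n);

/* Returns 1 if every slot holds the given state, 0 otherwise. */
static int all_in_state(const unsigned char *slots, size_t n, int state)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (slots[i] != state)
            return 0;
    }
    return 1;
}

/*
 * Mark every thread slot SERVER_STARTING *before* the slow init work.
 * On success the slots become SERVER_READY (in a real MPM each thread
 * would do that itself); on failure they are reset to SERVER_DEAD so
 * the parent sees the slots as free again.
 */
static int child_startup(unsigned char *slots, size_t nthreads,
                         step_fn run_child_init, step_fn create_threads)
{
    size_t i;
    int rv;

    assert(slots != NULL && nthreads > 0);

    for (i = 0; i < nthreads; i++)
        slots[i] = SERVER_STARTING;

    rv = run_child_init(slots, nthreads);      /* stand-in for child-init hooks */
    if (rv == 0)
        rv = create_threads(slots, nthreads);  /* stand-in for pthread_create() */

    for (i = 0; i < nthreads; i++)
        slots[i] = (rv == 0) ? SERVER_READY : SERVER_DEAD;

    return rv;
}

/* Init stand-in that succeeds only if it observes SERVER_STARTING slots. */
static int init_sees_starting(const unsigned char *slots, size_t n)
{
    return all_in_state(slots, n, SERVER_STARTING) ? 0 : 1;
}

static int threads_ok(const unsigned char *slots, size_t n)
{
    (void)slots; (void)n;
    return 0;
}

static int threads_fail(const unsigned char *slots, size_t n)
{
    (void)slots; (void)n;
    return 1;
}
```

Calling child_startup(slots, 4, init_sees_starting, threads_ok)
leaves all four slots at SERVER_READY; swapping in threads_fail
leaves them at SERVER_DEAD.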

>                                                         So instead
> I tried adding a status field to the process_score scoreboard
> structure, and making the following changes to worker.c such that
> this field is set by make_child to SERVER_STARTING and then
> changed to SERVER_READY once the start thread that runs
> start_threads() has done its initial work.

I was considering adding something to process_score for this issue, but
I decided against it, hopefully for a bogus reason: binary
compatibility breakage.

This isn't binary compatibility breakage: we provide
ap_get_scoreboard_process() for modules to retrieve a process_score
structure, and we don't support modules creating their own
process_score structures and stuffing them in the scoreboard.  So if
fields get added to the end for the use of the MPM, no worries.

(confirmation from the crowd?)

Instead of "unsigned char status" I'd prefer something like

apr_int32_t mpm_state;   /* internal state for MPM; meaning may change
                          * in the future, so not for use by other
                          * modules
                          */

If a particular MPM wants to store SERVER_STARTING/SERVER_DEAD/etc. then fine.
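
To illustrate why appending the field should be harmless to code that
only reads through a pointer it was handed, here is a small standalone
check.  The two structs are simplified, hypothetical layouts (not the
real process_score), int32_t stands in for apr_int32_t, and
old_module_reads_pid() is a made-up stand-in for a module built
against the old layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical "before" layout; not the real process_score. */
struct process_score_old {
    int pid;
    int generation;
    int quiescing;
};

/* Same fields in the same order, with the MPM-private field appended. */
struct process_score_new {
    int pid;
    int generation;
    int quiescing;
    int32_t mpm_state;   /* internal to the MPM; stand-in for apr_int32_t */
};

/*
 * Stand-in for a module compiled against the old layout: it reads pid
 * at the old offset from a pointer it was given and never allocates or
 * indexes the structure itself.
 */
static int old_module_reads_pid(const void *ps)
{
    int pid;

    assert(ps != NULL);
    memcpy(&pid,
           (const char *)ps + offsetof(struct process_score_old, pid),
           sizeof(pid));
    return pid;
}

/* Returns 1 if every pre-existing field kept its offset. */
static int layout_is_prefix_compatible(void)
{
    return offsetof(struct process_score_old, pid)
               == offsetof(struct process_score_new, pid)
        && offsetof(struct process_score_old, generation)
               == offsetof(struct process_score_new, generation)
        && offsetof(struct process_score_old, quiescing)
               == offsetof(struct process_score_new, quiescing);
}
```

The structure does get bigger, so this only holds as long as modules
go through the accessor and don't compute sizes or array strides
themselves.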

>    During this period, while the new child process is running
> ap_run_child_init() and friends, perform_idle_server_maintenance()
> just counts that child process's worker threads as all being
> effectively in SERVER_STARTING mode.  Once the process_score.status
> field changes to SERVER_READY, perform_idle_server_maintenance()
> begins to look at the individual thread status values.
>
>    Any thoughts?  The patch in Bugzilla doesn't address other
> MPMs that might see the same behaviour (event, and maybe prefork?)
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=39275
>
> It also doesn't necessarily play ideally well with the fact that
> new child processes can gradually take over thread slots in
> the scoreboard from a gracefully exiting old process -- the
> count of idle threads for that process will be pegged (only
> by perform_idle_server_maintenance()) at ap_threads_per_child
> until the new process creates its first new worker thread.
> But, that may be just fine....  I'll keep poking around and
> testing and maybe a better idea will present itself.

A gracefully exiting process has lost its process_score field and
gradually loses its worker_score fields as well.  Gracefully exiting
threads aren't counted as active or idle.

I think this means we can end up creating a new process to make up for
gracefully exiting threads, and that new process won't necessarily be
needed once they finish and new threads in that process's scoreboard
slot take over.  That's unavoidable, since gracefully exiting threads
can take forever.
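
For the sake of discussion, here is a toy version of the accounting
as I understand the proposal: a process still marked SERVER_STARTING
has all of its thread slots counted as idle capacity, and gracefully
exiting threads are counted as neither idle nor dead.  This is not
the patch from Bugzilla; struct proc_slot, struct counts,
count_threads(), THREADS_PER_CHILD, and the SERVER_BUSY and
SERVER_GRACEFUL values are all made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* SERVER_DEAD == 0 per the thread; other values are illustrative. */
#define SERVER_DEAD     0
#define SERVER_STARTING 1
#define SERVER_READY    2
#define SERVER_BUSY     3   /* hypothetical: stands for any busy state */
#define SERVER_GRACEFUL 4   /* hypothetical value for this sketch */

#define THREADS_PER_CHILD 4

/* Hypothetical, flattened view of one child's scoreboard data. */
struct proc_slot {
    int pid;                                  /* 0 means unused slot */
    int mpm_state;                            /* SERVER_STARTING until threads exist */
    unsigned char thread[THREADS_PER_CHILD];  /* per-thread status */
};

struct counts {
    int idle;   /* starting or ready threads */
    int dead;   /* slots a new thread could take */
};

static struct counts count_threads(const struct proc_slot *procs,
                                   size_t nprocs)
{
    struct counts c = {0, 0};
    size_t i, j;

    assert(procs != NULL);

    for (i = 0; i < nprocs; i++) {
        if (procs[i].pid == 0)
            continue;                         /* no child in this slot */

        if (procs[i].mpm_state == SERVER_STARTING) {
            /* Still in child_init: zeroed thread slots are not dead. */
            c.idle += THREADS_PER_CHILD;
            continue;
        }

        for (j = 0; j < THREADS_PER_CHILD; j++) {
            switch (procs[i].thread[j]) {
            case SERVER_DEAD:
                c.dead++;
                break;
            case SERVER_STARTING:
            case SERVER_READY:
                c.idle++;
                break;
            default:
                /* busy or gracefully exiting: neither idle nor dead */
                break;
            }
        }
    }
    return c;
}
```

With one child still in child_init and one running child whose
threads are ready, graceful, dead, and busy, this counts five idle
threads and one dead slot, so the parent would not mistake the
starting child's zeroed slots for dead threads.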
