I like this idea better then just checking for pid == -1, though sending
TERM signal to 0 shouldn't be that damaging (if damaging at all).
Mostly because apachectl will run in different process group, so it will not
be killed, and will finish restarting apache.
And unless apache itself is embeded or started by some other software --
signal shouldn't kill anything but apache.


On Sun, Apr 17, 2011 at 9:58 PM, pqf <p...@mailtech.cn> wrote:

>  Hi, all
> Another question, does proc_wait_process() should update procnode->proc_id
> to 0 too? or else mod_fcgid may send a signal to another irrelevant process
> while apache is shutting down? I don't follow up mod_fcgid for a while, I
> just took a glance, maybe it's updated somewhere else?
> By the way, procnode->proc_id is set to 0 while apache startup, so why not
> update procnode->proc_id to 0 while fcgid_create_privileged_process() is
> fail? And then check magic number 0 rather than both -1 and 0,  in both
> proc_kill_gracefully() and proc_kill_force().
>
> Cheers.
> Ryan
>
>
>
>  Hello,
>
> There is a very interesting, and quite a rare bug in mod_fcgid. It is easy
> to reproduce if you can cause fork to fail (which can be done with
> CloudLinux -- if anyone wants to replicate it).
>
> *Here is how it works: *
> mod_fcgid tries to spawn a new process (proc_spawn_process in
> fcgid_proc_unix.c), but fork returns -1.
> More exactly fcgid_create_privileged_process function call returns error,
> and fills in tmpproc.pid with -1 & tmpproc is assiged to procnode->proc_id).
>
> Now, if at the same time service httpd restart is executed, function
> kill_all_subprocess in fcgid_pm_main.c will execute, and it will try to go
> through all procnodes, sending SIGTERM via proc_kill_gracefully, (and then
> SIGKILL via proc_kill_force) to procnode->proc_id.pid
> Yet, one procnode will be pointing to procnode->proc_id.pid, causing kill
> -15 -1 (kill all).
> The end results all services on the server failing, including SSH, apache,
> syslogd, etc..
>
> I guess the problem is really rare for most people. Also it is quite hard
> to diagnose, as it is completely not clear where the signal came from, and
> it took us some time to correlate them with apache restarts.. Yet due to our
> OS being used by shared hosts (where httpd restart is common thing), and our
> ability to limit memory available to processes on per virtual host bases
> (which causes fork to fail once that virtual host reaches memory limit), we
> see the issue quite often.
>
> The solution is quite simple (not sure if it is the best / right solution),
> in file: fcgid_proc_unix.c, in methods proc_kill_gracefully, line:
>
>     rv = apr_proc_kill(&(procnode->proc_id), SIGTERM);
>
> should be changed to:
>    if (procnode->proc_id.pid != -1) {
>           rv = apr_proc_kill(&(procnode->proc_id), SIGTERM);
>    } else {
>           rv = APR_SUCCESS;
>    }
>
> Similarly in proc_kill_force
>     rv = apr_proc_kill(&(procnode->proc_id), SIGKILL);
> should be changed to:
>    if (procnode->proc_id.pid != -1) {
>           rv = apr_proc_kill(&(procnode->proc_id), SIGKILL);
>    } else {
>           rv = APR_SUCCESS;
>    }
>
> Regards,
> Igor Seletskiy
> CEO @ Cloud Linux Inc
>
>
>

Reply via email to