At 01:30 PM 1/30/2003, Bill Stoddard wrote: >William A. Rowe, Jr. wrote: > >>I belive I've deciphered the "RotateLogs doesn't work for access logs >>on Windows" Apache 2.0.44 bug. It's actually many bugs in conformance. >> >>Finally, it looks like apr_proc_other_child_read is the function we *really* >>wanted >>to use within the health check. But it seems all of these >>apr_proc_other_child >>functions are really misdocumented within APR. Would someone step up and >>spell out exactly what they are *supposed* to be doing within unix, and then >>we >>can discuss how to make them portable to Win32? It seems we have too much >>bubblegum and bailing wire holding them together, and the fixes that I made to >>do *exactly* what Unix was doing has killed the WinNT mpm.
>What we want to do in the winnt MPM maintenance loop is peridically check >to see of the process is still alive. It it is dead, it needs to be restarted >(ie, reliable >piped logs). apr_proc_other_child_read was used in the Unix side of the house >to do a periodic read of the pipe to an OC. On Unix, you can tell if the >process >on the other end is still alive by doing a read on the pipe. Cannot do that >with pipes >on Windows, so we use apr_proc_other_child_check instead. We never check the either end of OC pipes on Unix today in Apache 2.0. Because we 'hang on' to the pipe and send that same pipe to the next generation of child, we catch the death of the child instead. However, the dying child is caught by apr_proc_wait_all_procs(). This is bubbled to ap_wait_or_timeout which bubbles the offending PID to the run_mpm main loop. The apr_proc_wait_all_procs() is simply a apr_proc_wait(-1) flavor of the same. Win32 could do the very same thing. However, I'm considering whether or not to drive this; will WaitForMultipleObjects (suffering the usual issues of 64 wait events) or if it might make more sense to setup RegisterWaitForSingleObject on each child process handle. This may give us most similar results to Unix. Bug number one, in my mind, was the abuse of the function that was named apr_proc_other_child_read(). That fn must be called apr_proc_other_child_died(). Only the name of that function changes. The second abused function was... >I think we need one function (call it what you will but >apr_proc_other_child_check >seems okay to me) that checks the status of an OC and performs an action >(specified on the call to apr_proc_other_child_check) based on the status. The *existing* apr_proc_other_child_check doesn't do at all what you describe on Unix. We only call it from the mpm and it's only as we are shutting down or restarting. *That* function is misnamed, I'm thinking apr_proc_other_child_restart(). This function would continue to be called as the MPM recycles. The *new* function you describe (it didn't exist today) would have been nicely named _check() but that would be a namespace clash. So I'm thinking _refresh() or some such that would check the health of the other children and invoke their callbacks only if something bad has happened. This might even be on a periodic basis on Unix, if we discover that some platforms aren't reliably reporting apr_proc_other_child_died() due to apr_proc_wait_all_procs() missing some signals. >I am guessing that the windows MPM is whacking the piped logger because >ocr->proc->hproc is somehow hosed. No, that is working fine. It is whacking it because I modified the code in _check() to do exactly the same thing on Win32 as it does on Unix. (Yes, on unix we whack the children from mod_log_config each restart or shutdown. Good? Maybe not but until this behaves correctly it remains a useful test case for me.) So now Win32 would whack the child once per second based on WinNT MPM's calls to _check(). And the code in _check() - al la _restart(), the code in _read() al la _died(), and the new code in a _refresh() must behave the same between Unix and Win32, or nobody will ever successfully code portable modules. So, renaming _check() to restart() and adding the _refresh() we will safely be able to wander between the two platforms and not run into this again. Sure, _check() sounded like a good place to put the logic you added, but it was entirely different in purpose from the Unix code. That's unacceptable. I fault the docs, again, for not providing sufficient details. Sure, the MPM details can be different, but they should be *converging* around common functions, not divirging wildly. If my theory on RegisterWaitForSingleObject to signal apr_proc_wait_all_procs() actually works, we might just see that convergence. I started this thread not to ask what Win32 was doing, but what the essential design from Unix was. Once I understand that, I stand a real chance to get Win32 identical or so similar you don't notice the discrepancies. This may involve an extra function, then again it might not. But Unix sure didn't agree with the documented apr_thread_proc.h documentation, and that's what really drove me nuts these few days as I continue to try untangling the problem. Bill