At 01:30 PM 1/30/2003, Bill Stoddard wrote:
>William A. Rowe, Jr. wrote:
>
>>I believe I've deciphered the "RotateLogs doesn't work for access logs
>>on Windows" Apache 2.0.44 bug.  It's actually many bugs in conformance.
>>
>>Finally, it looks like apr_proc_other_child_read is the function we *really*
>>wanted to use within the health check.  But it seems all of these
>>apr_proc_other_child functions are really misdocumented within APR.  Would
>>someone step up and spell out exactly what they are *supposed* to be doing
>>within unix, and then we can discuss how to make them portable to Win32?  It
>>seems we have too much bubblegum and baling wire holding them together, and
>>the fixes that I made to do *exactly* what Unix was doing have killed the
>>WinNT mpm.

>What we want to do in the winnt MPM maintenance loop is periodically check
>to see if the process is still alive. If it is dead, it needs to be restarted
>(ie, reliable piped logs). apr_proc_other_child_read was used in the Unix side
>of the house to do a periodic read of the pipe to an OC.  On Unix, you can
>tell if the process on the other end is still alive by doing a read on the
>pipe. Cannot do that with pipes on Windows, so we use
>apr_proc_other_child_check instead.

We never check either end of OC pipes on Unix today in Apache 2.0.
Because we 'hang on' to the pipe and send that same pipe to the next 
generation of child, we catch the death of the child instead.

However, the dying child is caught by apr_proc_wait_all_procs().  This is
bubbled to ap_wait_or_timeout which bubbles the offending PID to the
run_mpm main loop.

The apr_proc_wait_all_procs() is simply an apr_proc_wait(-1) flavor of the same.
Win32 could do the very same thing.  However, I'm considering whether to drive
this with WaitForMultipleObjects (suffering the usual limit of 64 wait events),
or whether it might make more sense to set up RegisterWaitForSingleObject
on each child process handle.  This may give us the most similar results to Unix.
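
To make the Unix-side flow concrete (wait for any child, then hand the reaped
PID to whoever registered it), here is a minimal sketch in plain POSIX C.  It
is not APR's code: oc_register(), oc_died(), oc_wait_any(), the oc_record
table and the note_death() callback are all hypothetical names invented for
illustration, and the fixed-size table is an assumption to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>   /* fork(), _exit() for callers that spawn the children */

#define MAX_WATCHED 16

/* Hypothetical callback type: told which child died and with what status. */
typedef void (*oc_death_fn)(pid_t pid, int status, void *data);

/* Hypothetical registry record; pid == 0 marks a free slot. */
struct oc_record {
    pid_t pid;
    oc_death_fn on_death;
    void *data;
};

static struct oc_record oc_table[MAX_WATCHED];

/* Watch a child process. Returns 0 on success, -1 if the table is full. */
static int oc_register(pid_t pid, oc_death_fn on_death, void *data)
{
    assert(pid > 0 && on_death != NULL);
    for (size_t i = 0; i < MAX_WATCHED; i++) {
        if (oc_table[i].pid == 0) {
            oc_table[i].pid = pid;
            oc_table[i].on_death = on_death;
            oc_table[i].data = data;
            return 0;
        }
    }
    return -1;
}

/* The "_died" step: hand an already-reaped PID to whoever registered it.
 * Returns 1 if the PID was a watched child, 0 if nobody owned it. */
static int oc_died(pid_t pid, int status)
{
    if (pid <= 0)
        return 0;
    for (size_t i = 0; i < MAX_WATCHED; i++) {
        if (oc_table[i].pid == pid) {
            struct oc_record rec = oc_table[i];
            oc_table[i].pid = 0;   /* free the slot before calling out */
            rec.on_death(pid, status, rec.data);
            return 1;
        }
    }
    return 0;
}

/* The wait-for-any step: reap at most one exited child without blocking.
 * Returns the reaped PID, 0 if no child has exited yet, -1 on error. */
static pid_t oc_wait_any(void)
{
    int status = 0;
    pid_t pid = waitpid(-1, &status, WNOHANG);
    if (pid > 0)
        (void)oc_died(pid, status);
    return pid;
}

/* Example callback: record what happened in a caller-supplied struct. */
struct death_note {
    pid_t pid;
    int status;
};

static void note_death(pid_t pid, int status, void *data)
{
    struct death_note *note = data;
    note->pid = pid;
    note->status = status;
}
```

A main loop would fork the child, call oc_register(child, note_death, &note),
and then call oc_wait_any() on each pass.  The point of the sketch is that
nothing here touches the pipe: the death is noticed purely by reaping the
process and matching its PID.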

Bug number one, in my mind, was the abuse of the function that was named
apr_proc_other_child_read().  That function should be called
apr_proc_other_child_died().  Only the name of that function changes.

The second abused function was...

>I think we need one function (call it what you will but
>apr_proc_other_child_check seems okay to me) that checks the status of an OC
>and performs an action (specified on the call to apr_proc_other_child_check)
>based on the status.

The *existing* apr_proc_other_child_check doesn't do at all what you describe on
Unix.  We only call it from the mpm, and only as we are shutting down or
restarting.  *That* function is misnamed; I'm thinking
apr_proc_other_child_restart().  This function would continue to be called as
the MPM recycles.

The *new* function you describe (it doesn't exist today) would have been nicely
named _check(), but that would be a namespace clash.  So I'm thinking _refresh()
or some such, which would check the health of the other children and invoke
their callbacks only if something bad has happened.  This might even run on a
periodic basis on Unix, if we discover that some platforms aren't reliably
reporting apr_proc_other_child_died() due to apr_proc_wait_all_procs() missing
some signals.
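
As a rough illustration of what such a _refresh() could look like on Unix,
here is a sketch in plain POSIX C that polls each watched child and calls back
only when the child is no longer alive.  This is an assumption about the
shape, not an existing APR function: oc_poll(), oc_refresh(), the oc_health
values, struct oc_watch and the log_event() callback are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>   /* fork(), pipe(), _exit() for callers */

/* Hypothetical health states for a watched child. */
enum oc_health { OC_ALIVE, OC_DEAD, OC_LOST };

typedef void (*oc_event_fn)(pid_t pid, enum oc_health health, int status,
                            void *data);

struct oc_watch {
    pid_t pid;            /* set to 0 once the child is gone */
    oc_event_fn notify;
    void *data;
};

/* Poll one child without blocking.  OC_DEAD means we just reaped it;
 * OC_LOST means waitpid() failed, e.g. someone else already reaped it. */
static enum oc_health oc_poll(pid_t pid, int *status)
{
    pid_t r;
    *status = 0;
    do {
        r = waitpid(pid, status, WNOHANG);
    } while (r < 0 && errno == EINTR);
    if (r == 0)
        return OC_ALIVE;
    if (r == pid)
        return OC_DEAD;
    return OC_LOST;
}

/* Check every watched child; invoke the callback only for children that
 * are dead or lost.  Returns the number of callbacks invoked. */
static size_t oc_refresh(struct oc_watch *list, size_t n)
{
    size_t fired = 0;
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int status;
        enum oc_health health;
        pid_t pid = list[i].pid;
        if (pid <= 0)
            continue;                 /* slot already retired */
        health = oc_poll(pid, &status);
        if (health == OC_ALIVE)
            continue;                 /* healthy: stay silent */
        list[i].pid = 0;              /* retire before calling out */
        list[i].notify(pid, health, status, list[i].data);
        fired++;
    }
    return fired;
}

/* Example callback: count events and remember the most recent one. */
struct oc_event_log {
    int count;
    pid_t last_pid;
    enum oc_health last_health;
};

static void log_event(pid_t pid, enum oc_health health, int status,
                      void *data)
{
    struct oc_event_log *events = data;
    (void)status;
    events->count++;
    events->last_pid = pid;
    events->last_health = health;
}
```

Called from a maintenance loop, oc_refresh() stays quiet while every child is
healthy, which is the behavior that distinguishes it from the restart-time
function discussed above.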

>I am guessing that the windows MPM is whacking the piped logger because 
>ocr->proc->hproc is somehow hosed.

No, that is working fine.  It is whacking it because I modified the code in
_check() to do exactly the same thing on Win32 as it does on Unix.  (Yes, on
unix we whack the children from mod_log_config on each restart or shutdown.
Good?  Maybe not, but until this behaves correctly it remains a useful test
case for me.)  So now Win32 would whack the child once per second based on
WinNT MPM's calls to _check().

And the code in _check() (a la _restart()), the code in _read() (a la _died()),
and the new code in a _refresh() must behave the same between Unix and Win32,
or nobody will ever successfully code portable modules.

So, by renaming _check() to _restart() and adding the _refresh(), we will safely
be able to wander between the two platforms and not run into this again.  Sure,
_check() sounded like a good place to put the logic you added, but it was
entirely different in purpose from the Unix code.  That's unacceptable.  I
fault the docs, again, for not providing sufficient details.

Sure, the MPM details can be different, but they should be *converging* around
common functions, not diverging wildly.  If my theory on using
RegisterWaitForSingleObject to signal apr_proc_wait_all_procs() actually works,
we might just see that convergence.

I started this thread not to ask what Win32 was doing, but what the essential
design from Unix was.  Once I understand that, I stand a real chance to get
Win32 identical, or so similar you don't notice the discrepancies.  This may
involve an extra function, then again it might not.

But Unix sure didn't agree with the apr_thread_proc.h documentation, and that's
what really drove me nuts these few days as I continue to try untangling the
problem.

Bill

