Tom Jackson wrote:
Jeff,

I don't have a problem with it being in, but it actually needs to do something we can measure. There are significant overall differences between 4.0.10 and 4.5, although if you look at small parts it seems the same. Just as an example, the patch in driver.c was hundreds of lines off from 4.0.10 to 4.5.

I was suggesting adding extra code so that the patch to register the AtReady proc would actually run; you're right that without it the patch appears to do nothing.

Looking at it slightly more deeply, the AtReady callback shouldn't be needed, because TriggerDriver is called from NsFreeConn, which is called every time through the ConnThread loop after the connections have been put back on the free list.
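
A rough sketch of that control flow, to make the point concrete. Every name below is a hypothetical stand-in written for illustration, not the actual AOLserver source; the only thing it assumes is the call order described above (the conn loop frees the connection, and freeing it triggers the driver).

```c
#include <assert.h>

/* All names are hypothetical stand-ins, not AOLserver code. */

static int trigger_count = 0;  /* number of times the driver was woken */
static int free_conns = 0;     /* size of the simulated free list */

/* Stand-in for TriggerDriver: wake the driver thread. */
static void trigger_driver(void)
{
    trigger_count++;
}

/* Stand-in for NsFreeConn: put the conn back on the free list,
 * then wake the driver. */
static void free_conn(void)
{
    free_conns++;
    trigger_driver();
}

/* Stand-in for the ConnThread loop: service n connections in turn. */
static void conn_thread_loop(int n)
{
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        /* ... the request would be serviced here ... */
        free_conn();  /* the driver is triggered on every pass */
    }
}
```

If the real code follows this shape, calling conn_thread_loop(3) wakes the driver three times with no AtReady callback involved, which is why the extra callback looks redundant.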

Somehow Gustaf has been able to reproduce the fix with just the driver.c change, and the new fix he is trying out also works. I'm not sure we understand yet why either of them works, or whether they will continue to work...and that is the real problem. Magical fixes are nice, but in the long run we might need to know more about it.

Magical fixes are fine so long as the magic keeps working. But something like this, where there appears to be no effect, may mean that it is super timing sensitive, maybe pushing an allocated structure into another block which causes a cache miss at just the right time to allow a task switch... ugh, my brain hurts just thinking about it. 5 guys and 5 chopsticks are so much simpler :)

I have run ab without either fix and never had a problem. Actually, the call was in driver.c, but the function body was a no-op: it never even registered TriggerDriver, and the AtReadyProcs function was completely removed, not even defined. To me this only means that I don't know how to reproduce the bad behavior, but at least it didn't make any difference for me.

Gustaf's description of his reproduction scenario sounds much different from mine from last year, which suggests that the underlying cause is different, and this trigger/atready is a red herring.

-J


tom jackson

On Friday 19 October 2007 11:11, Jeff Rogers wrote:
I'm thinking the call to NsRunAtReadyProcs should be re-added in.  It
was apparently removed a long time ago, just after 4.0.10 was released:

http://aolserver.cvs.sourceforge.net/aolserver/aolserver/nsd/queue.c?r1=1.24&r2=1.25

As the api was not being used at all it was fine to remove it, until it
became useful for this race condition :)   I haven't successfully
reproduced the problem in 4.5 yet (or read the code deeply, but the main
connection service loop - queue.c:NsConnThread - is not significantly
different now) so it may be a different underlying problem than the one
I described, but it should be an easy test to see if adding the AtReady
callbacks back into the end of the loop helps things.
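
A minimal sketch of what running the callbacks at the end of the loop could look like. This is not the removed AOLserver code: the registry, the function names, and the fixed-size table are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical at-ready registry; not the AOLserver API. */

#define MAX_READY_PROCS 8

typedef void (*ready_proc)(void *arg);

struct ready_entry {
    ready_proc proc;
    void *arg;
};

static struct ready_entry ready_procs[MAX_READY_PROCS];
static size_t n_ready_procs = 0;

/* Register a callback; returns 0 on success, -1 if the table is full. */
static int register_at_ready(ready_proc proc, void *arg)
{
    assert(proc != NULL);
    if (n_ready_procs >= MAX_READY_PROCS) {
        return -1;
    }
    ready_procs[n_ready_procs].proc = proc;
    ready_procs[n_ready_procs].arg = arg;
    n_ready_procs++;
    return 0;
}

/* Run every registered callback in registration order. */
static void run_at_ready_procs(void)
{
    size_t i;

    for (i = 0; i < n_ready_procs; i++) {
        ready_procs[i].proc(ready_procs[i].arg);
    }
}

/* Example callback: counts wakeups, standing in for a driver trigger. */
static void count_wakeup(void *arg)
{
    int *counter = arg;

    (*counter)++;
}

/* Stand-in for the connection service loop. */
static void service_loop(int n)
{
    int i;

    for (i = 0; i < n; i++) {
        /* ... the request would be serviced here ... */
        run_at_ready_procs();  /* callbacks run at the end of each pass */
    }
}
```

Registering count_wakeup with a pointer to an int and then running service_loop(4) invokes the callback once per pass, which makes the effect of the re-added call easy to measure in a test.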


--
AOLserver - http://www.aolserver.com/
