On 09/01/2011 05:47 AM, Amos Jeffries wrote: > This patch seeks to avoids race conditions between the AsyncCalls for > the pconn monitoring handlers and lookups pulling pconn up for use.
... > It adds checks for the timeout handler, as discussed in bug 3281, > preventing the related type of race from the timeout handler. > + // our connection timeout handler is scheduled to run already. unsafe > + if (fd_table[conn->fd].timeoutHandler == NULL) > + return false; I am not strongly against this, but it would be more elegant and may be more efficient to let the about-to-timeout connections to be reused instead. This can be safely done, for example, by canceling the pending timeout notification callback in the connection giving code. Thank you, Alex. P.S. Higher level rant: We have an increasing amount of code devoted to checking whether some Comm information is already "in flight" and trying to predict what it will be or trying to react to it before it comes to us. Most of that code accesses internal Comm structures. I understand that this code is being created to address specific problems, but the direction of this worries me. We need to think of a better (uniform and encapsulated) way to handle the asynchronous nature of Comm-Job relationship or we will end up with too many similar-but-different and half-working checks in all Comm-using Jobs. We have been there before with the old spaghetti code (for slightly different reasons); let's try not to fall into the same trap. I do not have a well-formulated suggestion for the above high-level problem, but I may be able to spend some time on it after Rock Store is in. Meanwhile, every time you write fd_table[] outside of Comm, please pause and ask yourself whether there is a better way of doing it :-).
