Can't sleep, so finally writing this email I've been meaning to write for about 7 months now :D
One of the challenges in the Simple MPM, and to a smaller degree in the Event MPM, is how to manage memory allocation, destruction, and thread safety.

A 'simple' example:
 - 1) Thread A: Client Connection Created
 - 2) Thread A: Timer Event Added for 10 seconds in the future to detect IO timeout
 - 3) Thread B: Client Socket closes in 9.99 seconds.
 - 4) Thread C: Timer Event for IO timeout is triggered after 10 seconds

The simple answer is placing a Mutex around the connection object. Any operation in which two threads might work on the connection locks this Mutex. This has many problems, the first of which is destruction. In this case, Thread B would start destructing the connection, since the socket was closed, but Thread C would already be waiting for this mutex.... and then the object underneath it was just free'd.

To solve this, Thread B would unregister all existing (and unfired) triggers/timeouts first. Events would increment a reference count on the connection object, and Thread B would schedule a future event to check this reference count. If the reference count is zero, this timer would free the connection object; if there was still an outstanding reference in a running event, it would schedule itself for a future cleanup attempt.

All of this is insanely error prone, difficult to debug, and painful to explain.

Pools don't help, but don't really make it worse, and are good enough for the actual cleanup part -- the difficulty lies in knowing *when* you can clean up an object.

A related problem of using Mutex Guards on a connection object is that if a single connection 'locks up' a thread, it's feasible for other worker threads to get stuck waiting for this connection, and we would have no way to 'recover' these lost threads.
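
To make the reference-count-plus-deferred-cleanup scheme concrete, here is a minimal sketch in Python. It is not httpd or APR code: the Connection class and its acquire/release/close/try_free methods are hypothetical names made up for illustration, and both the unregistering of unfired timers and the actual rescheduling of the cleanup timer are left out.

```python
import threading


class Connection:
    """Hypothetical connection object guarded by a reference count."""

    def __init__(self, name):
        self.name = name
        self.refs = 0          # number of events currently using the connection
        self.closed = False    # set once the socket has closed
        self.freed = False     # set once the object has been released
        self.lock = threading.Lock()  # protects the three fields above

    def acquire(self):
        """An event calls this before touching the connection.

        Returns False if the connection is already closed, in which case
        the event must not use it.
        """
        with self.lock:
            if self.closed:
                return False
            self.refs += 1
            return True

    def release(self):
        """An event calls this when it is done with the connection."""
        with self.lock:
            self.refs -= 1

    def close(self):
        """Mark the connection closed; no new references are handed out."""
        with self.lock:
            self.closed = True

    def try_free(self):
        """The cleanup timer: free only when closed and unreferenced.

        Returns True once freed; False means 'reschedule and try later'.
        """
        with self.lock:
            if self.closed and self.refs == 0:
                self.freed = True
            return self.freed


# Replay of the example above, run sequentially for clarity.
conn = Connection("client-1")
event_started = conn.acquire()    # Thread C: timeout event starts running
conn.close()                      # Thread B: socket closed
first_attempt = conn.try_free()   # cleanup timer fires while the event still runs
conn.release()                    # Thread C: timeout event finishes
second_attempt = conn.try_free()  # rescheduled cleanup timer fires again
late_event = conn.acquire()       # any later event is refused
```

Even in this toy form, every event has to pair acquire with release correctly, and the cleanup timer has to keep rescheduling itself until try_free succeeds. That bookkeeping is exactly what gets hard to enforce once third-party code is involved.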
I think it is possible to write a complete server that deals with all these intricacies and gets everything just 'right', but as soon as you introduce 3rd party module writers, no matter how 'smart' we are, our castle of event goodness will crumble.

I am looking for an alternative that doesn't expose all this craziness of when to free, destruct, or lock things. The best idea I can come up with is for each connection to become 'semi-sticky' to a single thread. Meaning each worker thread would have its own queue of upcoming events to process, and all events for connection X would sit on the same 'queue'.

This would prevent two threads waiting for destruction, and other cases of a single connection's mutex locking up all your workers, essentially providing basic fault isolation.

These queues could be mutable, and you could 'move' a connection between queues, but you would always take all of its events and triggers, and move them together to a different queue.

Does the 'connection event queue' idea make sense? I'm not sure I'm expressing the idea fully over email.... but I'll be at OSCON in a few weeks if anyone wants beer :)

-Paul
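
P.S. In case the 'connection event queue' idea is easier to follow as code, here is a rough sketch of just the data structure in Python. Nothing here is httpd or APR API: Worker, Scheduler, add_event and move are hypothetical names, the least-loaded placement rule is an assumption, and the sketch is single-threaded, so the locking a real server would need around each queue is omitted.

```python
from collections import deque


class Worker:
    """Hypothetical worker thread, reduced to its private event queue."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()  # pending (connection_id, event) pairs


class Scheduler:
    """Keeps every event for a given connection on a single worker's queue."""

    def __init__(self, workers):
        self.workers = workers
        self.owner = {}  # connection_id -> Worker that currently owns it

    def add_event(self, conn_id, event):
        """Queue an event on the worker that owns the connection."""
        worker = self.owner.get(conn_id)
        if worker is None:
            # Assumed placement policy: first worker with the shortest queue.
            worker = min(self.workers, key=lambda w: len(w.queue))
            self.owner[conn_id] = worker
        worker.queue.append((conn_id, event))
        return worker

    def move(self, conn_id, target):
        """Move a connection and ALL of its pending events to another worker."""
        source = self.owner[conn_id]
        if source is target:
            return
        moved = [item for item in source.queue if item[0] == conn_id]
        source.queue = deque(item for item in source.queue if item[0] != conn_id)
        target.queue.extend(moved)
        self.owner[conn_id] = target


a, b = Worker("A"), Worker("B")
sched = Scheduler([a, b])
sched.add_event("X", "io-timeout")     # X lands on worker A
sched.add_event("Y", "read")           # Y lands on worker B
sched.add_event("X", "socket-closed")  # X is sticky, so this also goes to A
sched.move("X", b)                     # X and both of its events move to B
```

Because the timeout and the socket-closed events for X always sit on the same queue, one thread processes them in order, so no second thread can be waiting on the connection while it is being destroyed.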
