Events, Destruction and Locking

Paul Querna Mon, 06 Jul 2009 22:21:25 -0700

Can't sleep, so finally writing this email I've been meaning to write
for about 7 months now :D


One of the challenges in the Simple MPM, and to a smaller degree in
the Event MPM, is how to manage memory allocation, destruction, and
thread safety.

A 'simple' example:
 - 1) Thread A: Client Connection Created
 - 2) Thread A: Timer Event Added for 10 seconds in the future to
      detect IO timeout
 - 3) Thread B: Client Socket closes in 9.99 seconds
 - 4) Thread C: Timer Event for IO timeout is triggered after 10 seconds

The simple answer is placing a Mutex around the connection object.
Any operation in which two threads might touch the connection locks
this Mutex.

This has many problems, the first of which is destruction.  In this
case, Thread B would start destructing the connection, since the
socket was closed, but Thread C would already be waiting for this
mutex... and then the object underneath it was just freed.
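
To make the hazard concrete, here is a minimal Python sketch of steps 3
and 4. Everything in it is hypothetical (the Connection class, the
'freed' flag standing in for released memory); it is not httpd code.
The main thread plays Thread B, and a second thread plays Thread C.

```python
import threading

class Connection:
    """Hypothetical connection; 'freed' stands in for released memory."""
    def __init__(self):
        self.lock = threading.Lock()
        self.freed = False

conn = Connection()
observed = []                      # what Thread C saw once it got the lock
c_started = threading.Event()

def thread_c_timeout():
    c_started.set()
    with conn.lock:                # waits while Thread B holds the lock
        observed.append(conn.freed)

conn.lock.acquire()                # Thread B owns the connection
thread_c = threading.Thread(target=thread_c_timeout)
thread_c.start()
c_started.wait()                   # Thread C is now heading for the lock
conn.freed = True                  # Thread B destroys the connection
conn.lock.release()
thread_c.join()
```

Thread C ends up holding the lock of an object that has already been
destroyed. In Python that is only a flag; in C the mutex itself would
live in the freed memory.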

To solve this, Thread B would first unregister all existing (and
unfired) triggers/timeouts.  Events would increment a reference count
on the connection object, and Thread B would schedule a future event
to check this reference count.  If the reference count is zero, this
timer would free the connection object; if there is still an
outstanding reference in a running event, it would reschedule itself
for a future cleanup attempt.
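
A rough sketch of that scheme, to show how many moving parts it has.
This is a single-threaded simulation with made-up names (Scheduler,
close_connection, the 500 ms retry interval are all assumptions), with
times in milliseconds. A running event on "another thread" is modelled
by bumping the reference count by hand.

```python
import heapq
import itertools

class Connection:
    """Hypothetical connection; refs counts running events that hold it."""
    def __init__(self):
        self.refs = 0
        self.freed = False

class Scheduler:
    """Tiny timer queue standing in for the event core (times in ms)."""
    def __init__(self):
        self.now = 0
        self._heap = []
        self._seq = itertools.count()   # tie-breaker, keeps FIFO order

    def add(self, delay, callback):
        entry = [self.now + delay, next(self._seq), callback]
        heapq.heappush(self._heap, entry)
        return entry

    def cancel(self, entry):
        entry[2] = None                 # lazy cancel: skipped when popped

    def run(self):
        while self._heap:
            when, _, callback = heapq.heappop(self._heap)
            self.now = when
            if callback is not None:
                callback()

attempts = []                           # times at which cleanup was tried

def close_connection(sched, conn, pending, retry=500):
    for entry in pending:               # unregister unfired timers first
        sched.cancel(entry)

    def try_free():
        attempts.append(sched.now)
        if conn.refs == 0:
            conn.freed = True           # safe: nobody holds a reference
        else:
            sched.add(retry, try_free)  # still in use, try again later

    sched.add(0, try_free)

def release(conn):
    conn.refs -= 1                      # the running event has finished

sched = Scheduler()
conn = Connection()
fired = []
timeout = sched.add(10000, lambda: fired.append("io_timeout"))
conn.refs += 1                          # an event is running elsewhere
sched.add(11190, lambda: release(conn))
sched.add(9990, lambda: close_connection(sched, conn, [timeout]))
sched.run()
```

Here the timeout never fires, and the connection is only freed on the
fourth cleanup attempt, after the running event drops its reference.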

All of this is insanely error prone, difficult to debug, and painful to explain.

Pools don't help, but don't really make it worse, and are good enough
for the actual cleanup part -- the difficulty lies in knowing *when*
you can clean up an object.

A related problem with using Mutex Guards on a connection object is
that if a single connection 'locks up' a thread, it's feasible for
other worker threads to get stuck waiting for this connection, and we
would have no way to 'recover' these lost threads.

I think it is possible to write a complete server that deals with all
these intricacies and gets everything just 'right', but as soon as you
introduce 3rd party module writers, no matter how 'smart' we are, our
castle of event goodness will crumble.

I am looking for an alternative that doesn't expose all this craziness
of when to free, destruct, or lock things.  The best idea I can come
up with is for each Connection to become 'semi-sticky' to a single
thread.  Meaning each worker thread would have its own queue of
upcoming events to process, and all events for connection X would sit
on the same 'queue'.  This would prevent two threads waiting for
destruction, and other cases of a single connection's mutex locking up
all your workers, essentially providing basic fault isolation.
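
A minimal sketch of the per-worker queue idea, in Python rather than
httpd's C. The Worker and Dispatcher classes, the round-robin
assignment, and the dict-based connection are all assumptions made up
for illustration. The point is that the close and the late IO timeout
land on the same queue, so they are handled in order by one thread and
the connection needs no mutex.

```python
import queue
import threading

class Worker(threading.Thread):
    """One worker thread with its own private event queue."""
    def __init__(self, index):
        super().__init__(daemon=True)
        self.index = index
        self.events = queue.Queue()
        self.handled = []               # written by this thread only

    def run(self):
        while True:
            item = self.events.get()
            if item is None:            # shutdown marker
                return
            conn, name = item
            if conn["closed"]:
                continue                # stale event, connection is gone
            if name == "close":
                conn["closed"] = True   # no lock: only this thread owns conn
            self.handled.append((conn["id"], name))

class Dispatcher:
    """Routes every event for a connection to the worker that owns it."""
    def __init__(self, worker_count):
        self.workers = [Worker(i) for i in range(worker_count)]
        self.owner = {}                 # connection id -> worker index
        for worker in self.workers:
            worker.start()

    def post(self, conn, name):
        # Called from a single dispatching thread in this sketch.
        if conn["id"] not in self.owner:
            self.owner[conn["id"]] = len(self.owner) % len(self.workers)
        self.workers[self.owner[conn["id"]]].events.put((conn, name))

    def shutdown(self):
        for worker in self.workers:
            worker.events.put(None)
        for worker in self.workers:
            worker.join()

dispatcher = Dispatcher(worker_count=2)
conn_a = {"id": "A", "closed": False}
conn_b = {"id": "B", "closed": False}
for conn, name in [(conn_a, "read"), (conn_b, "read"), (conn_a, "close"),
                   (conn_a, "io_timeout"), (conn_b, "write")]:
    dispatcher.post(conn, name)
dispatcher.shutdown()
```

The io_timeout for connection A arrives after its close on the same
queue and is simply dropped, while connection B carries on untouched on
the other worker.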

These queues could be mutable, and you could 'move' a connection
between queues, but you would always take all of its events and
triggers, and move them together to a different queue.
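
The 'move' could look roughly like this. It is a single-threaded sketch
with a hypothetical move_connection helper and events as
(connection id, name) tuples; in a real server both queues would have
to be quiesced or otherwise protected while the move happens, which
this does not show.

```python
from collections import deque

def move_connection(conn_id, src, dst):
    """Move every pending event for conn_id from src to dst, in order."""
    kept = deque()
    moved = 0
    while src:
        event = src.popleft()
        if event[0] == conn_id:
            dst.append(event)           # travels with its connection
            moved += 1
        else:
            kept.append(event)          # belongs to another connection
    src.extend(kept)                    # restore the rest, order preserved
    return moved

queue_0 = deque([("A", "read"), ("B", "read"), ("A", "io_timeout")])
queue_1 = deque([("C", "write")])
moved = move_connection("A", queue_0, queue_1)
```

After the call, queue_0 holds only B's event and queue_1 holds C's
event followed by both of A's, still in their original order.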

Does the 'connection event queue' idea make sense?

I'm not sure I'm expressing the idea fully over email.... but I'll be
at OSCON in a few weeks if anyone wants beer :)

-Paul
