I'd like to (slowly; I'm expecting about 4 hours a week on this) clean up the select loop in Squid to allow things like using poll/libevent/completion ports. I've been using Twisted a bit over the last couple of years, and its event loop - its reactor - has a lot in common with the ACE reactor/proactor pair of patterns:
http://www.cs.wustl.edu/~schmidt/PDF/reactor-rules.pdf
http://www.cs.wustl.edu/~schmidt/PDF/proactor.pdf
Our current structure is partway between a reactor and a proactor: we have multiple reactor-style objects - the async I/O thread code for disk I/O, and the single-instance select loop. Using the language of the proactor pattern, right now we have one thread for each reactor-like thing: one thread for the async disk engine when it's in use, and one thread (the main thread) for the comms queue. This is a little ugly because it's asymmetrical: there's nothing intrinsically special about sockets that should make them the reactor in the main thread. I think it's ugly because it presupposes that our efficiency on sockets will be better than that on disk.

So I'd like to propose that we tweak our code so that we no longer assume comms requests happen in the main thread. One way of doing this is to have a dispatcher instance for each type of event we can dispatch, and loop over them in the main loop (this is a trivial tweak to what we have today). If we also have objects to represent each async activity that occurs (i.e. a select loop, a poll loop, a completion-port loop), I'd like to give all async-operating code paths an object to represent them.

Now, as we would like not to busy-wait, we need to pass a non-zero timeout to any select/poll-style calls, but this will cause latency if other async activity occurs concurrently. So each async engine will have a method on it which can be called to 'cheaply' notify it of activity that is occurring (i.e. the threaded async disk engine can inform the current primary engine that a disk I/O has completed). And because we don't know a priori whether an async engine is OS-backed (i.e. completion ports with overlapped I/O) or polled, there needs to be a poll() or checkEvents() or similar method called on each engine once per loop.
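To make the shape concrete, here's a minimal sketch of the per-engine interface described above. All names (AsyncEngine, checkEvents, notify) are illustrative only, not existing Squid code:

```cpp
#include <cassert>

// Hypothetical interface for one async activity (a select loop, a poll
// loop, a completion-port loop, the threaded disk engine, ...).
class AsyncEngine
{
public:
    virtual ~AsyncEngine() {}

    // Called once per main-loop iteration. A polled engine (select/poll)
    // waits at most timeoutMs, so the loop never busy-waits; an OS-backed
    // engine (completion ports) just drains whatever has already completed.
    virtual void checkEvents(int timeoutMs) = 0;

    // Cheap wakeup: e.g. the threaded disk engine calls notify() on the
    // current primary engine when a disk I/O completes, so the primary
    // engine returns from its select/poll without waiting out the timeout.
    virtual void notify() = 0;
};
```

The point of splitting checkEvents() from notify() is that notify() must be safe and cheap to call from another engine's thread, while checkEvents() is only ever called from the main loop.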
Our main loop can then become something like:

    while (!finished) {
        for (dispatchers::iterator i = dispatchers.begin(); i != dispatchers.end(); ++i)
            i->dispatch();
        for (engines::iterator i = engines.begin(); i != engines.end(); ++i)
            i->checkEvents();
    }

This will have the following benefits:
- We can properly support overlapped I/O on Windows.
- We can change the engines in use at runtime: just keep an engine in the engines list until all the pending events on it have completed, and then remove it.
- Management of the main loop and reconfiguration becomes conceptually clearer: we move the special casing out of the main loop and into specific event handlers (processing signals like 'please reconfigure' becomes just another event that can be dispatched).

I think we want a single completion dispatcher class for each group of events that occurs asynchronously. I.e. a dispatcher that knows how to dispatch socket events, one that knows how to dispatch disk events, one that knows how to dispatch timer events, one that knows how to dispatch informational signals, etc.

How does this sound in principle? If it sounds ok, I'll start doing a series of small (a few hours each) patches heading in this direction. One of the reasons I want to do this is to make it possible to write a test harness that can exercise callback-requiring code, by having a trivial controllable event loop that can be invoked in a test.

Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
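As a follow-up illustration of the test-harness point: the loop above can be wrapped in a small class so a test can drive it one iteration at a time instead of calling a blocking run(). This is a sketch only; the names (EventLoop, runOnce, registerEngine, and the simplified no-timeout checkEvents) are hypothetical, not existing Squid code:

```cpp
#include <vector>

// Hypothetical minimal interfaces, simplified for the sketch.
struct CompletionDispatcher {
    virtual ~CompletionDispatcher() {}
    virtual void dispatch() = 0;      // deliver queued completions to callbacks
};

struct AsyncEngine {
    virtual ~AsyncEngine() {}
    virtual void checkEvents() = 0;   // poll for newly completed events
};

class EventLoop
{
public:
    EventLoop() : finished(false) {}

    void registerDispatcher(CompletionDispatcher *d) { dispatchers.push_back(d); }
    void registerEngine(AsyncEngine *e) { engines.push_back(e); }

    // One pass of the main loop: dispatch pending completions, then let
    // every engine check for new events. A test calls this directly.
    void runOnce() {
        for (std::vector<CompletionDispatcher *>::iterator i = dispatchers.begin();
                i != dispatchers.end(); ++i)
            (*i)->dispatch();
        for (std::vector<AsyncEngine *>::iterator i = engines.begin();
                i != engines.end(); ++i)
            (*i)->checkEvents();
    }

    void run() { while (!finished) runOnce(); }   // production entry point
    void stop() { finished = true; }

private:
    std::vector<CompletionDispatcher *> dispatchers;
    std::vector<AsyncEngine *> engines;
    bool finished;
};
```

In production code you'd call run(); in a unit test you register stub dispatchers/engines and call runOnce() exactly as many times as the scenario needs, which makes callback-requiring code deterministic to exercise.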