Brian Pane wrote:
On Oct 10, 2005, at 12:01 AM, Paul Querna wrote:

If the content has already been generated, why add the overhead of a context switch to send it to another thread? Can't the same event thread do a non-blocking write?

Once it finishes writing, then yes, we do require a context-switch to another thread to do logging/cleanup.

I am mostly thinking about downloading a 1 gig file with the current pattern against a slow client. A non-blocking write might only do ~64k at a time, causing on the order of 1 gig / 64k = ~16,000 context switches, which seems less than optimal.
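
Roughly, the per-connection loop in that pattern looks like the sketch below (hypothetical, not the actual output filter code; try_write and its return convention are made up):

/* hypothetical sketch of the partial-write loop: with a non-blocking
 * socket and a slow client, each write() moves only a small chunk, so
 * a 1 gig response turns into thousands of short writes, each followed
 * by a trip back to the pollset (and, today, possibly another thread). */
#include <errno.h>
#include <unistd.h>

/* returns 1 when the whole buffer is written, 0 if the socket would
 * block (re-add the fd to the pollset for writability), -1 on error */
static int try_write(int fd, const char *buf, size_t len, size_t *offset)
{
    while (*offset < len) {
        ssize_t n = write(fd, buf + *offset, len - *offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return 0;       /* socket buffer full: wait for POLLOUT */
            return -1;          /* connection reset, etc. */
        }
        *offset += (size_t)n;   /* often only ~64k per call to a slow client */
    }
    return 1;                   /* done: hand off for logging/cleanup */
}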


If I had to choose, I'd rather do the context switches than devote a
thread (and the associated stack space) to the connection until
the writes are finished--especially if the server is delivering a
thousand 1GB files to slow clients concurrently.

However, it's probably possible to have _both_ a high ratio
of connections to threads (for scalability) and a low ratio of
context switches to megabytes delivered (for efficiency).
The Event MPM currently has to do a lot of context switching
because it detects events in one thread and processes them
in another.  If we add async write completion to the
Leader/Followers MPM (or incorporate a leader/follower
thread model into Event), it should reduce the context
switches considerably.
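
For illustration, the detect-here/process-there pattern amounts to something like the following sketch (hypothetical; the queue and thread functions are made up, and this is not the actual Event MPM code):

/* the listener pushes each ready fd onto a queue and signals a worker,
 * so every event costs at least one thread wakeup.  queue bounds and
 * error checks omitted; sketch only. */
#include <pthread.h>
#include <sys/epoll.h>

#define QSIZE 1024

static int queue[QSIZE];
static unsigned qhead, qtail;
static pthread_mutex_t qlock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  qnotempty = PTHREAD_COND_INITIALIZER;

extern void process_connection(int fd);   /* made-up request handler */

static void *listener_thread(void *arg)   /* detects events */
{
    int epfd = *(int *)arg;
    struct epoll_event evs[64];
    for (;;) {
        int n = epoll_wait(epfd, evs, 64, -1);
        for (int i = 0; i < n; i++) {
            pthread_mutex_lock(&qlock);
            queue[qtail++ % QSIZE] = evs[i].data.fd;
            pthread_cond_signal(&qnotempty);  /* wakes a worker: a context switch */
            pthread_mutex_unlock(&qlock);
        }
    }
    return NULL;
}

static void *worker_thread(void *arg)     /* processes them */
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&qlock);
        while (qhead == qtail)
            pthread_cond_wait(&qnotempty, &qlock);
        int fd = queue[qhead++ % QSIZE];
        pthread_mutex_unlock(&qlock);
        process_connection(fd);
    }
    return NULL;
}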

this is interesting to me because Brian Atkins recently reported that the event MPM was much slower. http://mail-archives.apache.org/mod_mbox/httpd-dev/200509.mbox/[EMAIL PROTECTED]

it would be nice to hear more details, but I assume this means event is burning more CPU for a given workload rather than suffering from some kind of extra latency bug. we know that event does more context switching than worker when keepalives are in use but pipelining is not, and async write completion will add to it. I suppose we should profile event and worker and compare the profiles in case there's some other unexpected CPU burner out there.

if context switch overhead is really the culprit, how do we reduce it? if I recall correctly, leader/follower sort of plays tag, and the next thread that's "it" gets to be the listener. I can see that running the request processing on the same thread that does the accept would be more cache friendly, and it might save some of the current queuing logic. but doesn't it take about the same amount of pthread library/scheduler overhead to "tag" the new listener and dispatch it as we have now waking up worker threads?
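
for comparison, the leader/follower "tag" boils down to something like this sketch (made-up names, nothing like the real Leader/Followers MPM internals) -- the promotion is still a condvar signal plus a wakeup, but the event gets processed on the thread that saw it:

#include <pthread.h>
#include <sys/epoll.h>

static pthread_mutex_t role_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  leader_cv = PTHREAD_COND_INITIALIZER;
static int leader_present;          /* is some thread currently the leader? */
static int epfd;                    /* shared pollset, set up elsewhere */

extern void process_connection(int fd);   /* made-up request handler */

static void *lf_thread(void *arg)
{
    (void)arg;
    for (;;) {
        /* wait until the leader role is free, then claim it */
        pthread_mutex_lock(&role_lock);
        while (leader_present)
            pthread_cond_wait(&leader_cv, &role_lock);
        leader_present = 1;
        pthread_mutex_unlock(&role_lock);

        /* as leader, block waiting for one ready connection */
        struct epoll_event ev;
        int n = epoll_wait(epfd, &ev, 1, -1);

        /* "tag" a follower to become the next leader ... */
        pthread_mutex_lock(&role_lock);
        leader_present = 0;
        pthread_cond_signal(&leader_cv);
        pthread_mutex_unlock(&role_lock);

        /* ... and process the event on this same thread, keeping the
         * connection state warm in this CPU's cache */
        if (n == 1)
            process_connection(ev.data.fd);
    }
    return NULL;
}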

another brainstorm is to use a short keepalive timeout, like 200ms*, on the worker thread. if it pops, turn the connection over to the event pollset using the remaining KeepAliveTimeout and give up the worker thread.
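
in rough pseudo-C, that might look like the sketch below (the two helpers are made up, and real code would go through APR rather than raw poll):

#include <poll.h>

#define SHORT_KEEPALIVE_MS 200   /* just big enough to cover most RTTs */

extern void process_next_request(int fd);                                /* made up */
extern void hand_off_to_event_pollset(int fd, int remaining_timeout_ms); /* made up */

static void keepalive_on_worker(int fd, int keepalive_timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    int rc = poll(&pfd, 1, SHORT_KEEPALIVE_MS);
    if (rc > 0 && (pfd.revents & POLLIN)) {
        /* the next request arrived right away: stay on this thread,
         * no trip through the event pollset */
        process_next_request(fd);
    }
    else if (rc == 0) {
        /* the short timer popped: give up the worker thread and let
         * the event pollset watch the connection for the rest of
         * KeepAliveTimeout */
        hand_off_to_event_pollset(fd, keepalive_timeout_ms - SHORT_KEEPALIVE_MS);
    }
    /* rc < 0: error handling (close the connection) omitted */
}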

Greg

*200ms - the idea is to use something just big enough to cover most network round trip times, so we catch the case where the browser sends the next request immediately after getting our response.
