Justin Erenkrantz wrote:
> On Fri, Apr 15, 2005 at 11:56:38AM -0400, Greg Ames wrote:

>> the reason that this and the corresponding 1.3 BUFF logic exists is to minimize "tinygrams" - IP packets that are less than a full MTU in size. tinygrams definitely degrade network performance and trigger things like Nagle's algorithm. in this particular case we are trying to combine response data from two or more different pipelined requests into a single packet.
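To make the tinygram point concrete, here is a rough sketch (buffers and names made up) of the difference between writing two tiny responses separately and coalescing them into one gather-write so they can share a packet; Nagle itself is switched off elsewhere with something like apr_socket_opt_set(sock, APR_TCP_NODELAY, 1):

#include <sys/uio.h>
#include "apr_network_io.h"

/* Sketch only: coalesce two pipelined responses into one gather-write so
 * they can share an IP packet instead of going out as two tinygrams. */
static apr_status_t send_coalesced(apr_socket_t *sock,
                                   const char *resp1, apr_size_t len1,
                                   const char *resp2, apr_size_t len2)
{
    struct iovec vec[2];
    apr_size_t nbytes;

    vec[0].iov_base = (void *)resp1;
    vec[0].iov_len  = len1;
    vec[1].iov_base = (void *)resp2;
    vec[1].iov_len  = len2;

    /* versus two separate apr_socket_send() calls, which would put two
     * sub-MTU packets on the wire once Nagle is disabled */
    return apr_socket_sendv(sock, vec, 2, &nbytes);
}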


> I'm not sure this optimization is really necessary any more.

neither am I.

> It seems overly complicated

amen

> given that most responses span multiple IP packets.  And, we
> specifically disabled Nagle anyway.

maybe we should try an experiment. get rid of the combine-pipelined-responses logic and see what happens. tweak the core output filter to always flush, comment out the check_pipeline_flush call and the logic that explicitly sends a FLUSH bucket, then benchmark it with a worst-case scenario - a series of pipelined GETs for a one-byte file across a network, or just HEAD requests.
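A rough sketch of what such a worst-case client could look like - it blasts N pipelined HEAD requests at the server on one connection and just drains the responses. The host, URI and request count are placeholders; the interesting number is the packet count from tcpdump rather than the wall-clock time:

/* Rough pipelined-request benchmark client (sketch).  "testserver" and
 * "/onebyte.html" are placeholders; run tcpdump alongside to compare
 * packet counts with and without the combine-pipelined-responses logic. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

#define NREQ 1000

int main(void)
{
    struct addrinfo hints, *res;
    char buf[8192];
    ssize_t n;
    int fd, i;

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo("testserver", "80", &hints, &res) != 0) {
        fprintf(stderr, "getaddrinfo failed\n");
        return 1;
    }
    fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) < 0) {
        perror("connect");
        return 1;
    }

    /* pipeline: send every request before reading any response */
    for (i = 0; i < NREQ; i++) {
        const char *req = (i < NREQ - 1)
            ? "HEAD /onebyte.html HTTP/1.1\r\nHost: testserver\r\n\r\n"
            : "HEAD /onebyte.html HTTP/1.1\r\nHost: testserver\r\n"
              "Connection: close\r\n\r\n";
        if (write(fd, req, strlen(req)) < 0) {
            perror("write");
            return 1;
        }
    }

    /* drain all responses; the last one closes the connection */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        ;

    close(fd);
    freeaddrinfo(res);
    return 0;
}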


>> however our post-filters implementation is very costly in terms of CPU. we make an extra trip down both filter chains. on the first trip down the output filter chain we are almost ready to call APR to transmit the data when the core output filter stashes the output data, temporarily abandons it, and unwinds. then we go down the input filter chain with

> This will happen for anything less than AP_MIN_BYTES_TO_WRITE anyway.

right

>> AP_MODE_EATCRLF to see if there is more input data stashed (and do an extra socket read() syscall that doesn't happen in 1.3). assuming the answer is no (typical) we send a flush bucket down the output filter chain, get back to the core output filter, encounter numerous cache misses reading all the instructions and data back into the CPU caches to pick up where we left off, then finally hand off the response data to the APR socket functions.

>> what I'd like to see instead is for the input filter chain to keep track of whether there is any stashed input any time it is called. then the core


> There's no real way to do that across all filters.

we have a way to do that today across all filters - AP_MODE_EATCRLF. we could invent a better way that doesn't involve traversing the input filters twice. but if we don't need to worry about combining pipelined responses there's no point.
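Roughly, that check has this shape today (a sketch of the shape only, not the actual httpd source): a non-blocking AP_MODE_EATCRLF read against the input filter chain, and a FLUSH bucket when it comes back empty-handed.

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

/* Sketch of the pipeline check: ask the input filters whether another
 * request is already waiting; if not, flush the deferred response data. */
static void check_pipeline_flush_sketch(conn_rec *c)
{
    apr_bucket_brigade *bb = apr_brigade_create(c->pool, c->bucket_alloc);

    /* a non-blocking AP_MODE_EATCRLF read succeeds only if something
     * beyond trailing CRLFs is already readable on the connection */
    if (ap_get_brigade(c->input_filters, bb, AP_MODE_EATCRLF,
                       APR_NONBLOCK_READ, 0) != APR_SUCCESS) {
        /* nothing pipelined behind us, so stop deferring and flush */
        APR_BRIGADE_INSERT_TAIL(bb,
                                apr_bucket_flush_create(c->bucket_alloc));
        ap_pass_brigade(c->output_filters, bb);
    }

    apr_brigade_destroy(bb);
}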


> Dorky idea: we could move the pipeline ready check to the if EOS clause in
> ap_pass_brigade where we mark eos_sent.  If there is no data ready (via a
> speculative read: I see no way to not do that), then we add the flush bucket.
> We'd also take out that EOS check for the deferred writes in
> core_output_filter as we're doing the check earlier on.  So, by the time we
> hit core_output_filter, we already know if we should hang on to the
> connection.

> This would save us from the extra round trip.  I'm not sure where else we
> could even place such a check besides ap_pass_brigade.  -- justin
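Roughly, that proposal might look like the sketch below, with AP_MODE_SPECULATIVE for the speculative read. This is only the shape of the idea, not working httpd code:

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

AP_DECLARE(apr_status_t) ap_pass_brigade(ap_filter_t *next,
                                         apr_bucket_brigade *bb)
{
    if (next) {
        apr_bucket *e = APR_BRIGADE_LAST(bb);

        if (e != APR_BRIGADE_SENTINEL(bb) && APR_BUCKET_IS_EOS(e) && next->r) {
            conn_rec *c = next->r->connection;
            apr_bucket_brigade *peek =
                apr_brigade_create(c->pool, c->bucket_alloc);

            next->r->eos_sent = 1;

            /* speculative, non-blocking read: any pipelined data stays
             * queued in the input filters, we only learn whether it exists */
            if (ap_get_brigade(c->input_filters, peek, AP_MODE_SPECULATIVE,
                               APR_NONBLOCK_READ, 1) != APR_SUCCESS
                || APR_BRIGADE_EMPTY(peek)) {
                /* no pipelined request behind this one: flush now, so the
                 * core output filter doesn't need its own EOS check */
                APR_BRIGADE_INSERT_TAIL(bb,
                    apr_bucket_flush_create(c->bucket_alloc));
            }
            apr_brigade_destroy(peek);
        }
        return next->frec->filter_func.out_func(next, bb);
    }
    return AP_NOBODY_WROTE;
}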

I'm all in favor of simplifying this code and eliminating instructions in the main path, especially if they aren't really buying us anything any more. but I don't see how changing from AP_MODE_EATCRLF to speculative reads helps much. I would prefer a solution that uses neither for the common cases.
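A rough sketch of that "keep track of stashed input" idea mentioned above: a flag on the connection, set by whichever input filter is holding buffered data and read by the core output filter, so the common case never walks the input chain at all. The field name here is invented for the sketch, not an existing conn_rec member:

#include "httpd.h"
#include "util_filter.h"

/* in an input filter, whenever it stashes (or drains) leftover bytes */
static void note_stashed_input(ap_filter_t *f, int have_leftover)
{
    f->c->data_in_input_filters = have_leftover;  /* hypothetical field */
}

/* in the core output filter, at the end of a keepalive response */
static int should_defer_write(conn_rec *c)
{
    /* defer (try to coalesce with the next response) only when the next
     * request's bytes are already sitting in an input filter */
    return c->data_in_input_filters;
}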


Greg


