OLD_WRITE stuff (was: Re: 2.0.34 - erratic behavior with autoindexes)

2002-04-02 Thread Greg Stein

On Tue, Apr 02, 2002 at 01:50:59PM -0800, Ryan Bloom wrote:
...
  chain.  Bill S. stuck his head in here and said something about a rule
  that if old_write is ever used, it has to always be used.
 
 If you read through mod_include, you will see that whenever we create a
 sub-request, we send the output from the current filter to the next
 filter.  This is required to make sure that things are output in the
 correct order.  If you are a handler and are generating data through
 ap_r* functions, you must flush the filter stack so that you are sure
 that OLD_WRITE isn't buffering for you.

That's not true. OLD_WRITE was designed to allow a mix/match of writing to
the stack *and* using the ap_r* functions.

The design is that ap_r* implies a write into the filter stack. That write
is optimized to use a brigade for buffering (so every ap_rputc() doesn't
traverse the filter stack). If OLD_WRITE isn't at the top, then it doesn't
do the buffering (because there are filters between the top and the
OLD_WRITE buffers).

If somebody writes directly into the filter stack, then it prepends any
buffered content to whatever is written, and then passes it all down the
stack.

[ just reviewed the code; it seems to still match the original design ]

It all looks pretty good, but there might be a problem where:

*) ap_r* is used, so OLD_WRITE is inserted, and the content is stored into
   its buffer.

*) another filter is inserted above OLD_WRITE, so it is not the top. ap_r*
   content should now go into this new filter.

*) ap_r* is called again, but it thinks it cannot buffer. it passes the new
   data directly into the filter stack.
   
   BUG: the previously-buffered content is not prepended.


So... the bug might appear because of the increased dynamicity of the filter
chain nowadays. Back when OLD_WRITE was written, it was perceived that the
chain would be quite static by the time the first write occurred.

Note that OLD_WRITE tries quite hard to keep its filter on top (its filter
type is RESOURCE-10). If something broke that, then it is possible to hit
the above bug.

The fix to the (potential) bug is to change line 1370 in server/protocol.c
to concat the brigades, similar to ap_old_write_filter() (defined just
above, at line 1316).

Another question is whether anybody defines their filter type as less than
RESOURCE-10 and ends up ahead of OLD_WRITE.

And lastly: I just realized that if a filter gets in front of OLD_WRITE,
then the branch at line 1366 will get called for each ap_r* call. If that is
a series of 1000's of ap_rputc() calls, then you'll end up creating
thousands of brigades for those transient buckets(!).

[ a similar brigade over-creation can occur if you alternate ap_r* and
  writing to the filter stack; ap_old_write_filter() will forget about the
  brigade it created, so the next ap_r* needs to recreate one ]

The answer might be to test for brigade-empty rather than ctx-bb == NULL.
Then the branch at 1366 can store and reuse a brigade in ctx-bb.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



RE: OLD_WRITE stuff (was: Re: 2.0.34 - erratic behavior with autoindexes)

2002-04-02 Thread Ryan Bloom


Everything below is true, assuming you are not talking about
sub-requests.  According to the stack that Greg posted, the OLD_WRITE
filter was not found in the sub-request filter stack.  That is to be
expected, because the OLD_WRITE filter is a RESOURCE filter.  Since
RESOURCE filters are not kept for sub-requests, the data must be
flushed.

Hm  Actually it seems like Resource filters should be kept for
sub-requests that are created with a filter list.  If that is the case,
then it looks like the OLD_WRITE filter should be in the filter list.

Unless, the OLD_WRITE filter is being added after the sub-request is
created..

Lots to look at..

Ryan

 On Tue, Apr 02, 2002 at 01:50:59PM -0800, Ryan Bloom wrote:
 ...
   chain.  Bill S. stuck his head in here and said something about a
rule
   that if old_write is ever used, it has to always be used.
 
  If you read through mod_include, you will see that whenever we
create a
  sub-request, we send the output from the current filter to the next
  filter.  This is required to make sure that things are output in the
  correct order.  If you are a handler and are generating data through
  ap_r* functions, you must flush the filter stack so that you are
sure
  that OLD_WRITE isn't buffering for you.
 
 That's not true. OLD_WRITE was designed to allow a mix/match of
writing to
 the stack *and* using the ap_r* functions.
 
 The design is that ap_r* implies a write into the filter stack. That
write
 is optimized to use a brigade for buffering (so every ap_rputc()
doesn't
 traverse the filter stack). If OLD_WRITE isn't at the top, then it
doesn't
 do the buffering (because there are filters between the top and the
 OLD_WRITE buffers).
 
 If somebody writes directly into the filter stack, then it prepends
any
 buffered content to whatever is written, and then passes it all down
the
 stack.
 
 [ just reviewed the code; it seems to still match the original design
]
 
 It all looks pretty good, but there might be a problem where:
 
 *) ap_r* is used, so OLD_WRITE is inserted, and the content is stored
into
its buffer.
 
 *) another filter is inserted above OLD_WRITE, so it is not the top.
 ap_r*
content should now go into this new filter.
 
 *) ap_r* is called again, but it thinks it cannot buffer. it passes
the
 new
data directly into the filter stack.
 
BUG: the previously-buffered content is not prepended.
 
 
 So... the bug might appear because of the increased dynamicity of the
 filter
 chain nowadays. Back when OLD_WRITE was written, it was perceived that
the
 chain would be quite static by the time the first write occurred.
 
 Note that OLD_WRITE tries quite hard to keep its filter on top (its
 filter
 type is RESOURCE-10). If something broke that, then it is possible to
hit
 the above bug.
 
 The fix to the (potential) bug is to change line 1370 in
server/protocol.c
 to concat the brigades, similar to ap_old_write_filter() (defined just
 above, at line 1316).
 
 Another question is whether anybody defines their filter type as less
than
 RESOURCE-10 and ends up ahead of OLD_WRITE.
 
 And lastly: I just realized that if a filter gets in front of
OLD_WRITE,
 then the branch at line 1366 will get called for each ap_r* call. If
that
 is
 a series of 1000's of ap_rputc() calls, then you'll end up creating
 thousands of brigades for those transient buckets(!).
 
 [ a similar brigade over-creation can occur if you alternate ap_r* and
   writing to the filter stack; ap_old_write_filter() will forget about
the
   brigade it created, so the next ap_r* needs to recreate one ]
 
 The answer might be to test for brigade-empty rather than ctx-bb ==
NULL.
 Then the branch at 1366 can store and reuse a brigade in ctx-bb.
 
 Cheers,
 -g





Re: OLD_WRITE stuff (was: Re: 2.0.34 - erratic behavior with autoindexes)

2002-04-02 Thread Greg Stein

On Tue, Apr 02, 2002 at 02:56:33PM -0800, Ryan Bloom wrote:
 
 Everything below is true, assuming you are not talking about
 sub-requests.  According to the stack that Greg posted, the OLD_WRITE
 filter was not found in the sub-request filter stack.  That is to be
 expected, because the OLD_WRITE filter is a RESOURCE filter.  Since
 RESOURCE filters are not kept for sub-requests, the data must be
 flushed.

Right.

 Hm  Actually it seems like Resource filters should be kept for
 sub-requests that are created with a filter list.  If that is the case,
 then it looks like the OLD_WRITE filter should be in the filter list.

If the sub-request goes, then its resource filters should go. Those filters
are associated with that particular sub-request. That implies that the
resource filters should be flushed before they die.

[ but we should probably have a SUBREQ_ENDING_FLUSH bucket so the SUBREQ
  filter can chew it up; we don't want to flush all the way to the network
  just cuz the subreq is disappearing. ]

 Unless, the OLD_WRITE filter is being added after the sub-request is
 created..

The filter is added at the first call to ap_r*(). We decided on the lazy
approach so that pure-brigade systems would not suffer the overhead.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/