OLD_WRITE stuff (was: Re: 2.0.34 - erratic behavior with autoindexes)
On Tue, Apr 02, 2002 at 01:50:59PM -0800, Ryan Bloom wrote: ... chain. Bill S. stuck his head in here and said something about a rule that if old_write is ever used, it has to always be used. If you read through mod_include, you will see that whenever we create a sub-request, we send the output from the current filter to the next filter. This is required to make sure that things are output in the correct order. If you are a handler and are generating data through ap_r* functions, you must flush the filter stack so that you are sure that OLD_WRITE isn't buffering for you. That's not true. OLD_WRITE was designed to allow a mix/match of writing to the stack *and* using the ap_r* functions. The design is that ap_r* implies a write into the filter stack. That write is optimized to use a brigade for buffering (so every ap_rputc() doesn't traverse the filter stack). If OLD_WRITE isn't at the top, then it doesn't do the buffering (because there are filters between the top and the OLD_WRITE buffers). If somebody writes directly into the filter stack, then it prepends any buffered content to whatever is written, and then passes it all down the stack. [ just reviewed the code; it seems to still match the original design ] It all looks pretty good, but there might be a problem where: *) ap_r* is used, so OLD_WRITE is inserted, and the content is stored into its buffer. *) another filter is inserted above OLD_WRITE, so it is not the top. ap_r* content should now go into this new filter. *) ap_r* is called again, but it thinks it cannot buffer. it passes the new data directly into the filter stack. BUG: the previously-buffered content is not prepended. So... the bug might appear because of the increased dynamicity of the filter chain nowadays. Back when OLD_WRITE was written, it was perceived that the chain would be quite static by the time the first write occurred. Note that OLD_WRITE tries quite hard to keep its filter on top (its filter type is RESOURCE-10). If something broke that, then it is possible to hit the above bug. The fix to the (potential) bug is to change line 1370 in server/protocol.c to concat the brigades, similar to ap_old_write_filter() (defined just above, at line 1316). Another question is whether anybody defines their filter type as less than RESOURCE-10 and ends up ahead of OLD_WRITE. And lastly: I just realized that if a filter gets in front of OLD_WRITE, then the branch at line 1366 will get called for each ap_r* call. If that is a series of 1000's of ap_rputc() calls, then you'll end up creating thousands of brigades for those transient buckets(!). [ a similar brigade over-creation can occur if you alternate ap_r* and writing to the filter stack; ap_old_write_filter() will forget about the brigade it created, so the next ap_r* needs to recreate one ] The answer might be to test for brigade-empty rather than ctx-bb == NULL. Then the branch at 1366 can store and reuse a brigade in ctx-bb. Cheers, -g -- Greg Stein, http://www.lyra.org/
RE: OLD_WRITE stuff (was: Re: 2.0.34 - erratic behavior with autoindexes)
Everything below is true, assuming you are not talking about sub-requests. According to the stack that Greg posted, the OLD_WRITE filter was not found in the sub-request filter stack. That is to be expected, because the OLD_WRITE filter is a RESOURCE filter. Since RESOURCE filters are not kept for sub-requests, the data must be flushed. Hm Actually it seems like Resource filters should be kept for sub-requests that are created with a filter list. If that is the case, then it looks like the OLD_WRITE filter should be in the filter list. Unless, the OLD_WRITE filter is being added after the sub-request is created.. Lots to look at.. Ryan On Tue, Apr 02, 2002 at 01:50:59PM -0800, Ryan Bloom wrote: ... chain. Bill S. stuck his head in here and said something about a rule that if old_write is ever used, it has to always be used. If you read through mod_include, you will see that whenever we create a sub-request, we send the output from the current filter to the next filter. This is required to make sure that things are output in the correct order. If you are a handler and are generating data through ap_r* functions, you must flush the filter stack so that you are sure that OLD_WRITE isn't buffering for you. That's not true. OLD_WRITE was designed to allow a mix/match of writing to the stack *and* using the ap_r* functions. The design is that ap_r* implies a write into the filter stack. That write is optimized to use a brigade for buffering (so every ap_rputc() doesn't traverse the filter stack). If OLD_WRITE isn't at the top, then it doesn't do the buffering (because there are filters between the top and the OLD_WRITE buffers). If somebody writes directly into the filter stack, then it prepends any buffered content to whatever is written, and then passes it all down the stack. [ just reviewed the code; it seems to still match the original design ] It all looks pretty good, but there might be a problem where: *) ap_r* is used, so OLD_WRITE is inserted, and the content is stored into its buffer. *) another filter is inserted above OLD_WRITE, so it is not the top. ap_r* content should now go into this new filter. *) ap_r* is called again, but it thinks it cannot buffer. it passes the new data directly into the filter stack. BUG: the previously-buffered content is not prepended. So... the bug might appear because of the increased dynamicity of the filter chain nowadays. Back when OLD_WRITE was written, it was perceived that the chain would be quite static by the time the first write occurred. Note that OLD_WRITE tries quite hard to keep its filter on top (its filter type is RESOURCE-10). If something broke that, then it is possible to hit the above bug. The fix to the (potential) bug is to change line 1370 in server/protocol.c to concat the brigades, similar to ap_old_write_filter() (defined just above, at line 1316). Another question is whether anybody defines their filter type as less than RESOURCE-10 and ends up ahead of OLD_WRITE. And lastly: I just realized that if a filter gets in front of OLD_WRITE, then the branch at line 1366 will get called for each ap_r* call. If that is a series of 1000's of ap_rputc() calls, then you'll end up creating thousands of brigades for those transient buckets(!). [ a similar brigade over-creation can occur if you alternate ap_r* and writing to the filter stack; ap_old_write_filter() will forget about the brigade it created, so the next ap_r* needs to recreate one ] The answer might be to test for brigade-empty rather than ctx-bb == NULL. Then the branch at 1366 can store and reuse a brigade in ctx-bb. Cheers, -g
Re: OLD_WRITE stuff (was: Re: 2.0.34 - erratic behavior with autoindexes)
On Tue, Apr 02, 2002 at 02:56:33PM -0800, Ryan Bloom wrote: Everything below is true, assuming you are not talking about sub-requests. According to the stack that Greg posted, the OLD_WRITE filter was not found in the sub-request filter stack. That is to be expected, because the OLD_WRITE filter is a RESOURCE filter. Since RESOURCE filters are not kept for sub-requests, the data must be flushed. Right. Hm Actually it seems like Resource filters should be kept for sub-requests that are created with a filter list. If that is the case, then it looks like the OLD_WRITE filter should be in the filter list. If the sub-request goes, then its resource filters should go. Those filters are associated with that particular sub-request. That implies that the resource filters should be flushed before they die. [ but we should probably have a SUBREQ_ENDING_FLUSH bucket so the SUBREQ filter can chew it up; we don't want to flush all the way to the network just cuz the subreq is disappearing. ] Unless, the OLD_WRITE filter is being added after the sub-request is created.. The filter is added at the first call to ap_r*(). We decided on the lazy approach so that pure-brigade systems would not suffer the overhead. Cheers, -g -- Greg Stein, http://www.lyra.org/