Re: should input filter return the exact amount of bytes asked for?
--On Thursday, November 13, 2003 11:01 AM -0800 Stas Bekman [EMAIL PROTECTED] wrote:

> Should we add an explicit explanation to AP_MODE_READBYTES: return at most readbytes data. Can't return 0 with APR_BLOCK_READ. Can't return more than readbytes data.

I'd say the first and last one are equivalent statements. And that APR_BLOCK_READ description belongs with the definition of APR_BLOCK_READ, not AP_MODE_READBYTES.

> Also while we are at it I have a few more questions:
>
>     /** The filter should return at most one line of CRLF data.
>      * (If a potential line is too long or no CRLF is found, the
>      * filter may return partial data).
>      */
>     AP_MODE_GETLINE,
>
> does it mean that the filter should ignore the readbytes argument in this mode?

I think so, yes.

>     /** The filter should implicitly eat any CRLF pairs that it sees. */
>     AP_MODE_EATCRLF,
>
> does it mean that it should do the same as AP_MODE_GETLINE but kill CRLF? If not how much data is it supposed to read? Or is it a mode that never goes on its own and should be OR'ed with some definitive mode, e.g.: AP_MODE_GETLINE|AP_MODE_EATCRLF and AP_MODE_READBYTES|AP_MODE_EATCRLF?

It's meant to be called right before we read the next pipelined request on the connection. Old (really old) Netscape clients added spurious CRLFs between requests. I don't see a clear rationale why it'd have to be 'combined' with other ap_get_brigade() modes. The only one that'd make sense (to me) is AP_MODE_GETLINE. Note that AP_MODE_EATCRLF doesn't necessarily return anything. It's wildly HTTP specific...

> Though it'd be nice to add a note re: APR_BLOCK_READ in the AP_MODE_READBYTES doc above. Or I guess maybe it belongs to some filters tutorial...

I'll note that I wrote an article describing httpd-2.x's filters for some Linux magazine recently. I bet you can find back issues. As an aside, I never actually saw the final copy or the printed copy. So, don't blame me if it doesn't help. ;-) -- justin
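To make the AP_MODE_EATCRLF description concrete, here is a minimal sketch in plain C of "eat the spurious CRLFs before the next pipelined request". It is not the httpd implementation and uses no APR types: the function name sketch_eat_leading_crlf and the flat character buffer are assumptions made for illustration, and only exact CR LF pairs are handled, as in the header comment quoted above.

```c
#include <stddef.h>

/* Illustration only: count how many leading bytes of buf are CRLF pairs
 * that a caller could discard before parsing the next request line.
 * sketch_eat_leading_crlf is a hypothetical name, not an httpd function. */
static size_t sketch_eat_leading_crlf(const char *buf, size_t len)
{
    size_t i = 0;

    /* Consume complete "\r\n" pairs only; a lone '\r' is left alone. */
    while (i + 1 < len && buf[i] == '\r' && buf[i + 1] == '\n') {
        i += 2;
    }
    return i;
}
```

The return value may be zero, which matches the remark that this mode doesn't necessarily return (or consume) anything.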
Re: should input filter return the exact amount of bytes asked for?
Justin Erenkrantz wrote:

Thanks for the explanations, Justin. Once I get some free time I'll need to revamp the filters chapter [1] to address the read mode issue. So far I was completely ignoring it :(

[1] http://perl.apache.org/docs/2.0/user/handlers/filters.html

>> Though it'd be nice to add a note re: APR_BLOCK_READ in the AP_MODE_READBYTES doc above. Or I guess maybe it belongs to some filters tutorial...
>
> I'll note that I wrote an article describing httpd-2.x's filters for some Linux magazine recently. I bet you can find back issues. As an aside, I never actually saw the final copy or the printed copy. So, don't blame me if it doesn't help. ;-) -- justin

Is that the one you are talking about? http://www.linux-mag.com/2003-08/apache_01.html

rbb wrote a bunch of filtering articles some 2 years ago or so too. It'd probably be nice to ask those magazines if we can dump them somewhere under the docs-2.0 project, versus linking to them, as ezines tend to move things a lot and even kill them.

__
Stas Bekman            JAm_pH -- Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide --- http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com
Re: should input filter return the exact amount of bytes asked for?
Justin Erenkrantz wrote:
> On Tue, Nov 04, 2003 at 01:41:46AM -0800, Stas Bekman wrote:
>> filter. What happens if the filter returns less bytes (while there is still more data coming?) What happens if the filter returns more bytes than requested (e.g. because it uncompressed some data). After all the incoming
>
> Less bytes = OK.
> Same bytes = OK.
> More bytes = Not OK. (Theoretically possible though with bad filters.)

Great. Where should this be documented? In the ap_get_brigade .h?

Also, 0 bytes = Not OK, right? Or how otherwise would you explain the assertion:

    AP_DEBUG_ASSERT(!APR_BRIGADE_EMPTY(bb));

in consumers like ap_get_client_block. Or do you say that a filter can return a non-empty brigade with a single empty bucket?

Thanks Justin.

__
Stas Bekman
Re: should input filter return the exact amount of bytes asked for?
--On Thursday, November 13, 2003 12:38 AM -0800 Stas Bekman [EMAIL PROTECTED] wrote:

> Great. Where this should be documented? In the ap_get_brigade .h?

It's already in util_filter.h. Read the documentation for ap_input_mode_t:

    /** The filter should return at most readbytes data. */
    AP_MODE_READBYTES,
    ...

> right? Or how otherwise would you explain the assertion:
> AP_DEBUG_ASSERT(!APR_BRIGADE_EMPTY(bb));

If using APR_BLOCK_READ, it's illegal to return 0 bytes with AP_MODE_READBYTES - that is what this assert is checking for in maintainer mode (this was a troublesome assert at one point). It's the same expectation as doing a blocking socket read() - blocking reads shouldn't return until something is returned. -- justin
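The rules stated so far in the thread (at most readbytes, never more, and never zero bytes on a blocking read) can be written down as a small predicate. This is a sketch in plain C under stated assumptions: the enum and function names are hypothetical stand-ins rather than APR or httpd definitions, and end-of-stream handling is deliberately left out because the thread does not cover it.

```c
#include <stddef.h>

/* Hypothetical stand-in for apr_read_type_e; not the real APR enum. */
typedef enum {
    SKETCH_BLOCK_READ,
    SKETCH_NONBLOCK_READ
} sketch_read_type;

/* Returns 1 if handing back `returned` bytes is an acceptable answer to a
 * READBYTES-style request for at most `readbytes` bytes, 0 otherwise.
 * Encodes only the rules quoted in this thread. */
static int sketch_readbytes_result_ok(sketch_read_type block,
                                      size_t readbytes,
                                      size_t returned)
{
    if (returned > readbytes) {
        return 0;   /* "More bytes = Not OK" */
    }
    if (returned == 0 && block == SKETCH_BLOCK_READ) {
        return 0;   /* a blocking read must wait until it has something */
    }
    return 1;       /* fewer bytes or exactly readbytes is fine */
}
```

A non-blocking read that returns nothing passes the check, which is the difference between the two read types that the assertion discussion turns on.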
Re: should input filter return the exact amount of bytes asked for?
Justin Erenkrantz wrote:
> --On Thursday, November 13, 2003 12:38 AM -0800 Stas Bekman [EMAIL PROTECTED] wrote:
>
>> Great. Where this should be documented? In the ap_get_brigade .h?
>
> It's already in util_filter.h. Read the documentation for ap_input_mode_t:
>
>     /** The filter should return at most readbytes data. */
>     AP_MODE_READBYTES,
>     ...

Aha! I was looking in the wrong place then. Thanks Justin.

Should we add an explicit explanation to AP_MODE_READBYTES: return at most readbytes data. Can't return 0 with APR_BLOCK_READ. Can't return more than readbytes data.

Also while we are at it I have a few more questions:

    /** The filter should return at most one line of CRLF data.
     * (If a potential line is too long or no CRLF is found, the
     * filter may return partial data).
     */
    AP_MODE_GETLINE,

does it mean that the filter should ignore the readbytes argument in this mode?

    /** The filter should implicitly eat any CRLF pairs that it sees. */
    AP_MODE_EATCRLF,

does it mean that it should do the same as AP_MODE_GETLINE but kill CRLF? If not, how much data is it supposed to read? Or is it a mode that never goes on its own and should be OR'ed with some definitive mode, e.g.: AP_MODE_GETLINE|AP_MODE_EATCRLF and AP_MODE_READBYTES|AP_MODE_EATCRLF?

>> right? Or how otherwise would you explain the assertion:
>> AP_DEBUG_ASSERT(!APR_BRIGADE_EMPTY(bb));
>
> If using APR_BLOCK_READ, it's illegal to return 0 bytes with AP_MODE_READBYTES - that is what this assert is checking for in maintainer mode (this was a troublesome assert at one point). It's the same expectation as doing a blocking socket read() - blocking reads shouldn't return until something is returned. -- justin

Cool:

    /** Determines how a bucket or brigade should be read */
    typedef enum {
        APR_BLOCK_READ,   /** block until data becomes available */
        APR_NONBLOCK_READ /** return immediately if no data is available */
    } apr_read_type_e;

Though it'd be nice to add a note re: APR_BLOCK_READ in the AP_MODE_READBYTES doc above. Or I guess maybe it belongs to some filters tutorial...

__
Stas Bekman
Re: should input filter return the exact amount of bytes asked for?
On Tue, Nov 04, 2003 at 01:41:46AM -0800, Stas Bekman wrote:
> filter. What happens if the filter returns less bytes (while there is still more data coming?) What happens if the filter returns more bytes than requested (e.g. because it uncompressed some data). After all the incoming

Less bytes = OK.
Same bytes = OK.
More bytes = Not OK. (Theoretically possible though with bad filters.)

HTH. -- justin
Re: should input filter return the exact amount of bytes asked for?
At 03:31 AM 11/11/2003, Justin Erenkrantz wrote:
> On Tue, Nov 04, 2003 at 01:41:46AM -0800, Stas Bekman wrote:
>> filter. What happens if the filter returns less bytes (while there is still more data coming?) What happens if the filter returns more bytes than requested (e.g. because it uncompressed some data). After all the incoming
>
> Less bytes = OK.

But not great if there is more incoming data available (consider that one can call with NONBLOCK and dig up some more). There is a balance to be found here: one doesn't want to slurp 15MB of a file at once, but one doesn't want bytes to trickle up one at a time.

> Same bytes = OK.

Of course.

> More bytes = Not OK. (Theoretically possible though with bad filters.)

Wrong. This is OK across the board, please consider;

  module requests 1000 arbitrary bytes;
  codepage module requests 1000
  http reads one 'chunk' available, 8000 bytes and will return that page
  codepage can translate 7998 bytes and comes to a screeching halt for a 3 byte sequence.
  returns our Now Translated 4000 bytes
  module sees a 4000 byte heap bucket.

What can you do? Instead of treating that bucket as a singleton when you want 1000 bytes, consume the first 1000 bytes from that bucket (or the brigade).

Please review the archives for this discussion (the brigades on the apr list, the filter api on httpd). This was a very long thread, but the net result of filters is that you get what is available/handy, not any specific number of bytes.

Bill
Re: should input filter return the exact amount of bytes asked for?
--On Tuesday, November 11, 2003 11:24 AM -0600 William A. Rowe, Jr. [EMAIL PROTECTED] wrote:

>> More bytes = Not OK. (Theoretically possible though with bad filters.)
>
> Wrong. This is OK across the board, please consider;

Uh, no. We changed the filter semantics some time ago to stop this insanity. It was inefficient to call AP_MODE_READBYTES and have it return more than asked for. Check out the CVS log for util_filter.h, specifically around revision 1.62.

> module requests 1000 arbitrary bytes;
> codepage module requests 1000
> http reads one 'chunk' available, 8000 bytes and will return that page
> codepage can translate 7998 bytes and comes to a screeching halt for a 3 byte sequence.
> returns our Now Translated 4000 bytes
> module sees a 4000 byte heap bucket.
>
> What can you do? Instead of treating that bucket as a singleton when you want 1000 bytes, consume the first 1000 bytes from that bucket (or the brigade.)

No. That means you have 3k more bytes you have to consume that you didn't ask for. The filter wouldn't return it again. Writing code that used input filters and having to deal with the fact that it could get more than asked for was just confusing and led to lots of error-prone code. If it asks for 1k in AP_MODE_READBYTES, it gets at most 1k. Anything else is broken. (util_filter.h AP_MODE_READBYTES says as much, but that's not fair, because I wrote that comment.)

> Please review the archives for this discussion (the brigades on the apr list, the filter api on httpd.) This was a very long thread, but the net result of filters is that you get what is available/handy, not any specific number of bytes.

That *was* indeed the position at one time, but when I redid the input filters (which was about rewrite #14 of input filters), we corrected this because it was causing lots of problems to return more than asked for - this is when we added the mode argument to ap_get_brigade. mod_ssl's input filtering code was just broken under that old API.

And, the big boys even reviewed the code and semantic changes before it went in. So, it was definitely RTC. ;-) -- justin
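One way to read the "at most 1k" rule from the filter's side: a filter that ends up holding more data than was asked for (the 4000 translated bytes in the codepage example) keeps the surplus itself and hands it out on later calls. The following is a minimal sketch of that idea in plain C with a flat buffer. The struct, the function names, and the fixed capacity are all assumptions made for illustration; real httpd filters work on bucket brigades, not on a buffer like this.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_CAP 8192   /* arbitrary capacity for the illustration */

/* Hypothetical per-filter context holding data not yet handed out. */
typedef struct {
    char   pending[SKETCH_CAP];
    size_t len;   /* number of valid bytes in pending */
    size_t off;   /* index of the next byte to hand out */
} sketch_filter_ctx;

/* Keep freshly produced data inside the filter.
 * Returns 0 on success, -1 if it would not fit. */
static int sketch_stash(sketch_filter_ctx *ctx, const char *data, size_t n)
{
    size_t held = ctx->len - ctx->off;

    if (n > SKETCH_CAP - held) {
        return -1;
    }
    /* Compact what is still unread, then append the new data. */
    memmove(ctx->pending, ctx->pending + ctx->off, held);
    memcpy(ctx->pending + held, data, n);
    ctx->off = 0;
    ctx->len = held + n;
    return 0;
}

/* Hand out at most readbytes; whatever is left stays in the context
 * for the next call instead of being pushed onto the caller. */
static size_t sketch_read_at_most(sketch_filter_ctx *ctx, char *out,
                                  size_t readbytes)
{
    size_t held = ctx->len - ctx->off;
    size_t n = held < readbytes ? held : readbytes;

    memcpy(out, ctx->pending + ctx->off, n);
    ctx->off += n;
    return n;
}
```

With 4000 bytes stashed and a caller asking for 1000 at a time, four calls each return 1000 bytes and nothing is dropped, so the caller never has to cope with 3k it did not request.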
Re: should input filter return the exact amount of bytes asked for?
Stas Bekman wrote:
> I'm trying to get rid of ap_get_client_block(), but I don't understand a few things. ap_get_client_block() asks for readbytes from the upstream filter. What happens if the filter returns less bytes (while there is still more data coming?) What happens if the filter returns more bytes than requested (e.g. because it uncompressed some data). After all the incoming filters all propagate a request for N bytes read to the core_in filter, which returns that exact number if it can. Now as the data flows up the filter chain its length may change. Does it mean that if the filter didn't return the exact amount asked for it's broken? Is that the case when it returns less data than requested? Or when it returns more data?
>
> I'm trying to deal with the case where a user call wants N bytes and I have to give that exact number in a single call. I'm not sure whether I should buffer things if I've got too much data or, on the contrary, ask for more bbs if I don't have enough data. Are there any modules I can look at to learn from?
>
> The doc for ap_get_brigade doesn't say anything about ap_get_brigade satisfying the 'readbytes' argument.
>
> /**
>  * Get the current bucket brigade from the next filter on the filter
>  * stack. The filter returns an apr_status_t value. If the bottom-most
>  * filter doesn't read from the network, then ::AP_NOBODY_READ is returned.
>  * The bucket brigade will be empty when there is nothing left to get.
>  * @param filter The next filter in the chain
>  * @param bucket The current bucket brigade. The original brigade passed
>  *               to ap_get_brigade() must be empty.
>  * @param mode The way in which the data should be read
>  * @param block How the operations should be performed
>  *              ::APR_BLOCK_READ, ::APR_NONBLOCK_READ
>  * @param readbytes How many bytes to read from the next filter.
>  */
> AP_DECLARE(apr_status_t) ap_get_brigade(ap_filter_t *filter,
>                                         apr_bucket_brigade *bucket,
>                                         ap_input_mode_t mode,
>                                         apr_read_type_e block,
>                                         apr_off_t readbytes);

What bothers me most is the case where a filter may return more data than it has been asked for in the AP_MODE_READBYTES mode. ap_get_client_block() doesn't deal with buffering such data and drops it on the floor. So either it has to be fixed to do the buffering, or the filter spec (ap_get_brigade) needs to clearly state that no more than the requested amount of data should be returned in AP_MODE_READBYTES mode. And ap_get_client_block needs to assert if it gets more.

__
Stas Bekman
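For the "user call wants N bytes in a single call" case, a common approach when the upstream may return fewer bytes than requested (but never more) is to loop, asking each time only for what is still missing. Below is a minimal sketch in plain C. The sketch_read_fn callback type, the sketch_trickle demo source, and the convention that a return of 0 means "no more data" are assumptions for illustration; none of this is the ap_get_brigade() API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical upstream: copies up to `want` bytes into out and returns
 * the count; returning 0 means no more data in this sketch. */
typedef size_t (*sketch_read_fn)(void *state, char *out, size_t want);

/* Keep asking for the remainder until n bytes are collected or the
 * upstream runs dry. Returns the number of bytes actually collected. */
static size_t sketch_read_exactly(sketch_read_fn read_fn, void *state,
                                  char *out, size_t n)
{
    size_t total = 0;

    while (total < n) {
        size_t got = read_fn(state, out + total, n - total);
        if (got == 0) {
            break;          /* upstream has nothing more to give */
        }
        total += got;       /* short read: loop and ask for the rest */
    }
    return total;
}

/* Demo upstream that trickles at most `chunk` bytes per call. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;
    size_t chunk;
} sketch_trickle;

static size_t sketch_trickle_read(void *state, char *out, size_t want)
{
    sketch_trickle *s = state;
    size_t left = s->len - s->pos;
    size_t n = want;

    if (n > s->chunk) n = s->chunk;   /* never more than one chunk */
    if (n > left)     n = left;       /* never past the end */
    memcpy(out, s->data + s->pos, n);
    s->pos += n;
    return n;
}
```

Because each request is for n - total bytes and the upstream honours "at most", the caller never needs its own overflow buffer; only the short-read case has to be handled.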