I've been thinking some more about how we should handle "pushback" in the input filtering (say a caller does a speculative read, e.g. the MIME-continuation check in ap_[r]getline).
I think the best strategy would be to introduce a new input filter mode (call it PEEK, and rename the current PEEK mode to EATCRLF). The new mode would tell input filters that the data may not be consumed, and that they should hold on to it if possible so that it can be read again (on a non-PEEK call). Used wisely, this strategy should remove all instances of a req_cfg->bb (such as the one ap_[r]getline uses). Data may only be stored inside the filter chain (no external brigades with unread input are allowed!).

What follows is my high-level algorithm for how ap_[r]getline would work with this strategy. Please provide any feedback or thoughts you may have on this.

Thanks. -- justin

ap_getline with buckets and brigades:

1. Call ap_get_brigade with 0 to indicate that we want an LF-terminated line.
   - ap_get_brigade may return a line that does not have an LF ending (i.e. len == HUGE_STRING_LEN).
2. Check whether we got an error value back (!= APR_SUCCESS).
3. Call apr_brigade_length to determine the length of the brigade.
4. Do we have enough space in our caller's buffer to handle this? (If we're not doing the allocation for them, that is.)
5. Run through all buckets in the brigade and read/copy into the caller's buffer. (What happened to zero-copy?)
6. Does the last bucket's data (APR_BRIGADE_LAST) have an LF char?
   - If no LF, go to step 1.
7. We now have an LF-terminated line (otherwise, we'd error out).
8. Attempt to compress the EOL by removing any CR or whitespace.
   - This would maintain backwards compatibility, but may no longer be worthwhile.
9. Are we asked to handle folding?
   - If not, translate the buffer from ASCII and return.
10. Call ap_get_brigade with PEEK.
   - Should this PEEK call block? Is that an option that the caller of ap_get_brigade specifies? I think ap_getline would indeed want it to be a blocking PEEK.
   - This peek is not the same as the current AP_MODE_PEEK (which just eats CRLFs). It would require some modifications to the current input filters to ensure that each filter does *NOT* delete the returned data from its input stack, but instead copies the returned buckets. This seems possible with a bucket copy (which should always succeed, since we are dealing with already-read data).
   - Let's consider how core_input_filter would handle this: have it treat the call as a *readbytes != 0 case, with the following exception: it will copy *all* buckets in b back to ctx->b before returning.
   - mod_ssl: It would convert the call from an AP_MODE_PEEK read to a *readbytes (AP_IOBUFSIZE) read and read the data normally. In ssl_io_filter_Input, it will do as core_input_filter does and leave the data in ctx->b to be read again (and return it too). Note that core_input_filter will *not* see this read as a PEEK, but as a normal ap_get_brigade call (*readbytes != 0).
   - HTTP_IN: Gets out of the way and lets the call be passed down without any verification on its part. (Should it be concerned about reading past the end-of-request??)
   - xlate_in_filter: Can save the buffer in its context for later reading.
11. Was any data read?
   - If not, translate the buffer from ASCII and return.
12. Is the first character a tab or space?
   - If not, translate the buffer from ASCII and return.
13. Go to step 1.
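
To make the control flow of the steps above a bit more concrete, here is a rough, self-contained C sketch. It does not use buckets, brigades, or any httpd/APR call: toy_filter, toy_mode, toy_filter_from, toy_get_line, and toy_getline are all made-up names for illustration. A flat string stands in for the data held inside the filter chain, a one-byte peek stands in for the PEEK brigade, the ASCII translation is left out, and the rule that an empty line is never folded is my own assumption (the steps do not say).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names exist in httpd or APR. */
typedef enum { MODE_LINE, MODE_PEEK } toy_mode;

typedef struct {
    const char *data;   /* unread input, held "inside the filter" */
    size_t len;
    size_t pos;
} toy_filter;

toy_filter toy_filter_from(const char *s)
{
    toy_filter f = { s, strlen(s), 0 };
    return f;
}

/* Copy out at most cap bytes, stopping after an LF.  In MODE_PEEK the
 * data stays in the filter so a later MODE_LINE read sees it again. */
size_t toy_get_line(toy_filter *f, toy_mode mode, char *out, size_t cap)
{
    size_t n = 0;
    while (f->pos + n < f->len && n < cap) {
        char c = f->data[f->pos + n];
        out[n++] = c;
        if (c == '\n')
            break;
    }
    if (mode != MODE_PEEK)
        f->pos += n;    /* only a non-PEEK read consumes the data */
    return n;
}

/* Returns the length of the NUL-terminated line placed in buf, or -1
 * if the line does not fit.  Step numbers refer to the list above. */
int toy_getline(toy_filter *f, char *buf, size_t cap, int fold)
{
    size_t total = 0;
    char next;

    assert(cap > 0);
    for (;;) {
        /* Steps 1-6: read up to an LF into the caller's buffer. */
        size_t n = toy_get_line(f, MODE_LINE, buf + total, cap - 1 - total);
        total += n;
        if (!(n > 0 && buf[total - 1] == '\n') && f->pos < f->len)
            return -1;  /* no LF, yet input is pending: out of space */

        /* Step 8: compress the EOL (drop trailing CR, LF, whitespace). */
        while (total > 0 && (buf[total - 1] == '\n' || buf[total - 1] == '\r'
                             || buf[total - 1] == ' ' || buf[total - 1] == '\t'))
            total--;

        /* Step 9; skipping empty lines here is an assumption. */
        if (!fold || total == 0)
            break;
        /* Steps 10-11: peek without consuming; stop if nothing is there. */
        if (toy_get_line(f, MODE_PEEK, &next, 1) == 0)
            break;
        /* Step 12: only a leading space or tab continues the line. */
        if (next != ' ' && next != '\t')
            break;
        /* Step 13: loop and append the continuation line. */
    }
    buf[total] = '\0';
    return (int)total;
}
```

The point of the sketch is that the peeked byte is never stored by the caller: when the next line turns out not to be a continuation, it is simply still there for the next normal read, which is what would let req_cfg->bb go away.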