--On Thursday, August 12, 2004 2:51 AM -0400 Glenn Strauss <[EMAIL PROTECTED]> wrote:

    /** The filter should pass any special buckets (not in-memory) as long
as it      *  does not need to perform any processing on them (translation
or protocol      *  delimiting) (augments AP_MODE_BYTES; mutually exclusive
w/ AP_MODE_PEEK)      */
    AP_MODE_PASS_SPECIALS = 4,

This is a violation of the bucket brigade API (i.e. trying to force a type on the returned bucket when it is supposed to be abstract). I definitely would like to see a *lot* more justification as to why this is mandatory.


    /** The filter read should be treated as speculative and any returned
     *  data should be stored for later retrieval in another mode.
     *  (augments AP_MODE_BYTES or AP_MODE_LINE)
     */
    AP_MODE_PEEK = 8,

PEEK (aka SPECULATIVE: PEEK was a poor name for a lot of reasons which is why we tossed it). It's not possible to know whether it is going to be a line or a byte, so it doesn't make sense for that to be its own independent mode. Doing a speculative read on a line isn't possible and adds a lot of overhead.


You should only be calling SPECULATIVE when your protocol doesn't state what to do next. If you are expecting a line, then just call AP_MODE_GETLINE. So, I don't believe this option isn't combinable with anything else.

    /** When reading lines, look for MIME-folded continuations as well
     *  (augments AP_MODE_LINE)
     */
    AP_MODE_MIME_FOLDING = 16

Actually, this can't be handled as efficiently as you might think it would due to the presence of SSL or other connection-level filters. Therefore, folding might only be possible to do in ap_http_filter, but it can't go down further as into core_input_filter (which is where we now call apr_brigade_split_line). A new getline_folding filter right in front of ap_http_filter would do the trick, I suppose.


Once again, I don't believe it makes sense to combine this with any other option. For example, this doesn't combine with AP_MODE_READBYTES or AP_MODE_SPECULATIVE. It only makes sense in combination with AP_MODE_GETLINE.

So, one option is to add a new mode here: AP_MODE_GETLINE_CONTINUATION (or some such) and place it as a protocol-level filter. However, there are some corner cases that have to be dealt with: up-to and including extending over the line limits (see below). Just altering AP_MODE_GETLINE to handle continuations isn't allowable as there are some HTTP protocol behaviors that can not be line continued: i.e. request line.

Right now, we deal with these corner cases outside of the filter chain. Now that ap_rgetline() has been refactored several times: the code was too hairy at that time I last re-did input filtering that it wasn't worth it to mess with ap_rgetline() then. (After I re-did the input filters, Brian Pane re-did ap_rgetline() if my recollection of the timeline is correct.) So, I believe it's easier now to just add a new filter mode and try to abstract away the continuation logic now. The code wasn't very clean back then...

of code duplication between modules.  For example, the behavior of
line-mode is vauge and requires that callers re-parse the brigade
to check for complete lines.  I've got a whole litany of things,

It's for the reason listed below...

including bad code examples of brigade parsing in the Apache core,
but that's another post.

Well, pointers as to where things aren't right are certainly appreciated.

As to ap_input_mode_t as bitflags: the case doesn't yet seem to be made as to why a combination of options is mandatory. I believe adding one additional AP_MODE_FOLDED_LINE or something would be sufficient.

One advantage of the new API:
MIME continuation lines are used so often (HTTP headers/trailers,
email message headers, MIME encapsulation, ...) that they should be
part of the core API.  Instead of code duplication (witness
ap_rgetline_core() and ap_get_mime_headers_core() both implementing
continuation line unfolding), AP_MODE_LINE | AP_MODE_MIME_FOLDING
does not return a partial line.  It only returns a complete line
including continuations.  Simplifying things a bit for now, this
allows caller to know that if its brigade is returned non-empty,
that the line is complete, including continuations, without caller
having to re-parse the line to double-check.

It is simply a denial of service condition to allow unbounded lines to be read: therefore, it must be possible to return partial lines (or possibly error out). This is why there is an internal upper limit on how much can be read into a single line in the core_input_filter (via ap_brigade_splitline). We *must* respect the LimitRequestLine and LimitFieldSize (?). Dealing with this config logic in a filter is a tad more complicated - this is why a separate mime-continuation-only protocol filter makes a lot of sense: the code isn't as trivial as you might expect.


One thing that does concern me though is that the LimitFieldSize's only apply to HTTP MIME request headers. There are other allowable cases where the limits shouldn't apply; so I'm mildly concerned as to how we'd allow a filter to intentionally bypass those limits: perhaps those use cases have to continue calling GETLINE and doing the continuation themselves. -- justin

Reply via email to