Input Filtering and Pushback.

Justin Erenkrantz Mon, 31 Dec 2001 02:38:19 -0800

I've been thinking some more about how we should handle 
"pushback" in the input filtering (say a caller does a 
speculative read, e.g. for MIME continuation in ap_[r]getline).


I think the best strategy would be to introduce a new input filter
mode (call it PEEK and rename the current PEEK mode to EATCRLF) that
will indicate to input filters that the returned data may not be
consumed and that each filter should hold on to the data if
possible so that it can be read again (on a non-PEEK call).  Used
wisely, this strategy should remove every use of a req_cfg->bb
(such as the one ap_[r]getline uses).  Data may only be stored
inside the filter chain (no external brigades with unread input
are allowed!).
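
To make the proposed semantics concrete, here is a minimal sketch.
It is a toy Python model, not httpd code: the class HoldingFilter,
the READ/PEEK mode names, and the plain byte strings standing in
for buckets are all assumptions made for illustration.  The only
point it shows is that a PEEK returns data the filter keeps, so a
later non-PEEK call sees the same data again.

```python
from collections import deque

# Hypothetical mode names for this sketch; not the httpd constants.
READ, PEEK = "read", "peek"


class HoldingFilter:
    """Toy input filter that keeps peeked data for the next real read."""

    def __init__(self, chunks):
        self.upstream = deque(chunks)  # stands in for the next filter/socket
        self.held = deque()            # unread data lives inside the filter

    def get(self, mode):
        # Pull from upstream only when nothing is already held.
        if not self.held and self.upstream:
            self.held.append(self.upstream.popleft())
        if not self.held:
            return b""                 # nothing available
        if mode == PEEK:
            # bytes are immutable, so returning the held object acts
            # like handing back a copy while keeping the original.
            return self.held[0]
        return self.held.popleft()     # a normal read consumes the data
```

With this model, two PEEK calls in a row return the same chunk, and
the following READ returns it once more before moving on.  No
brigade outside the filter ever holds unread input.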

What follows is my high-level algorithm on how ap_[r]getline would 
work with this strategy.  Please provide any feedback or thoughts
you may have on this.  Thanks.  -- justin

ap_getline with buckets and brigades:
1. Call ap_get_brigade with *readbytes set to 0 to indicate that
   we want an LF-line.
   - ap_get_brigade may return with a line that does not have
     an LF-ending (i.e. len == HUGE_STRING_LEN).
2. Check to see if we got an error value back (!= APR_SUCCESS)
3. Call apr_brigade_length to determine length of brigade.
4. Do we have enough space in our caller's buffer to handle 
   this?  (If we're not doing the allocation for them, that is.)
5. Run through all buckets in the brigade and read/copy into
   the caller's buffer.  (What happened to zero-copy?)
6. Does the last bucket's data (APR_BRIGADE_LAST) have an LF char?
   - If no LF, go to step 1.
7. We now have an LF-line (otherwise, we'd error out).
8. Attempt to compress the EOL by removing any CR or whitespace.
   - This would maintain backwards compatibility, but may no 
     longer be worthwhile.
9. Are we asked to handle folding?
   - If not, then translate the buffer from ascii and return.
10. Call ap_get_brigade with PEEK.
   - Should this PEEK call block?  Is that an option that the
     caller to ap_get_brigade specifies?  I think ap_getline
     would indeed want it to be a blocking PEEK.
   - This peek is not the same as current AP_MODE_PEEK (which 
     just eats CRLFs).  It would require some modifications to 
     the current input filters to ensure that each filter does
     *NOT* delete the returned data from its input stack, but
     instead copies the returned buckets.  It seems that this 
     would be possible by doing a bucket copy (which should 
     always succeed since we are dealing with already-read 
     data).
        - Let's consider how core_input_filter would handle this:
          Have it treat it as a *readbytes != 0 case with the 
          following exception: it will copy *all* buckets in b
          back to ctx->b before returning.
        - mod_ssl: It would convert the AP_MODE_PEEK read into a
          read with *readbytes set to AP_IOBUFSIZE and read the
          data normally.  In ssl_io_filter_Input, it will do as 
          core_input_filter does and leave the data in ctx->b 
          to be read again (and return it too).  Note that 
          core_input_filter will *not* see this read as a PEEK, 
          but as a normal ap_get_brigade call (*readbytes != 0).
        - HTTP_IN: Gets out of its way and lets it be passed 
          down without any verification on its part.  (Should
          it be concerned about reading past the end-of-request??)
        - xlate_in_filter: Can save the buffer in its context
          for later reading.
11. Was any data read?
   - If not, translate the buffer from ascii and return.
12. Is the first character a tab or space?
   - If not, translate the buffer from ascii and return.
13. Go to step 1.
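
The control flow of the steps above can be sketched as follows.
This is again a toy Python model under stated assumptions, not the
real API: LineSource, its get_line and peek methods, and max_chunk
(playing the role of HUGE_STRING_LEN) are hypothetical names, and
one flat byte string stands in for the whole filter chain.  The
caller-buffer check (step 4) and the ASCII translation are left out.

```python
class LineSource:
    """Toy stand-in for the filter chain (hypothetical, not httpd)."""

    def __init__(self, data, max_chunk=8192):
        self.data = data
        self.max_chunk = max_chunk  # plays the role of HUGE_STRING_LEN

    def get_line(self):
        # Return data up to and including an LF, or max_chunk bytes
        # if no LF is found within that range.
        end = self.data.find(b"\n", 0, self.max_chunk)
        cut = end + 1 if end != -1 else min(len(self.data), self.max_chunk)
        chunk, self.data = self.data[:cut], self.data[cut:]
        return chunk

    def peek(self):
        # Show the next byte without consuming it.
        return self.data[:1]


def getline(src, fold=False):
    line = b""
    while True:
        chunk = src.get_line()            # step 1
        if not chunk:                     # steps 2 and 7: error out
            raise EOFError("input ended before an LF was seen")
        line += chunk                     # step 5
        if not line.endswith(b"\n"):      # step 6: no LF yet, read more
            continue
        line = line.rstrip(b" \t\r\n")    # step 8: compress the EOL
        if not fold:                      # step 9
            return line
        nxt = src.peek()                  # step 10
        if nxt not in (b" ", b"\t"):      # steps 11 and 12
            return line
        # step 13: a continuation line follows, so loop and append it
```

For the input b"Subject: a\r\n b\r\nHost: x\r\n\r\n", calling
getline with fold=True returns b"Subject: a b" and then b"Host: x".
The peeked byte is never consumed, so the next call starts cleanly
at the following line.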
