Re: [PATCH] Rework httpd buckets/filtering

Justin Erenkrantz Sun, 23 Sep 2001 14:28:05 -0700
On Sun, Sep 23, 2001 at 02:53:14PM -0700, Greg Stein wrote:
> Euh... not sure if you typoed. Input filters should *NOT* return more than
> they were asked for. Never.

Yup.  That's currently what can happen.

> >...
> > > --- include/apr_buckets.h 2001/09/22 22:36:07     1.118
> > > +++ include/apr_buckets.h 2001/09/23 10:36:12
> > > @@ -689,6 +689,17 @@
> > >                                               apr_off_t *length);
> > >
> > >  /**
> > > + * create a flat buffer of the elements in a bucket_brigade.
> > > + * @param b The bucket brigade to create the buffer from
> > > + * @param c The pointer to the returned character string
> > > + * @param len The length of the returned character string
> > > + * @param p The pool to allocate the character string from.
> > > + * Note: You *must* have enough space allocated to store all of the data.
> > > + * See apr_brigade_length().
> > > + */
> > > +APU_DECLARE(apr_status_t) apr_brigade_to_buffer(apr_bucket_brigade *b,
> > > char *c); +
> 
> Ryan said this should take a reading mode. I don't understand why that
> should be necessary. You have a brigade of a defined size, and you're
> flattening that much data. You're already past the point of non-blocking
> reads being used.
> 
> If you had a partial read, then how would you communiate that? And don't say
> strlen(c).
> 
> No... the function does not need a mode, nor an output length. You give it a
> brigade and it flattens it in its entirety. If there is a problem getting
> the whole thing flattened, then it returns an error.
> 
> Personally, I would pass in a length of the output buffer so the function
> can verify that it is flattening the proper amount -- not too little, and
> not overwriting.
> 
> >...
> > > +            case WANT_ENDCHUNK:
> > > +                /* We don't care.  Next state please. */
> > > +                ctx->state = WANT_TRL;
> > > +                break;
> > >              case WANT_TRL:
> > > -                /* XXX sanity check end chunk here */
> > > -                if (strlen(line)) {
> > > -                    /* bad trailer */
> > > -                }
> > > -                if (ctx->chunk_size == 0) { /* we just finished the last chunk? 
>*/
> > > -                    /* ### woah... ap_http_filter() is doing this, too */
> > > -                    /* append eos bucket and get out */
> > > -                    b = apr_bucket_eos_create();
> > > -                    APR_BRIGADE_INSERT_TAIL(bb, b);
> 
> DECHUNK is the guy to insert the EOS. The HTTP filter cannot know how much
> data is in the request (if you're chunking, the C-L header is ignored). The
> above code should probably stay, and the HTTP filter should stop inserting
> EOS.
> 
> But then again, if you aren't chunking, then the HTTP filter *does* define
> the end of the request. Fun, huh? That is why we discussed combining the two
> filters into one. The state stuff is completely wiped out, and we can easily
> determine EOS, simplifying everything greatly.
> 
> Consider that when somebody asks the (new) HTTP filter for input. It can
> zoom through the headers in direct procedural code. Then it gets to the
> body, start dechunking if necessary, and places the proper amount of read
> data into the passed-brigade.

Damn, you're good (we knew that already).  I just figured that out over 
here.  =-)

> 
> >...
> > > -        /* ### the code below, which moves bytes from one brigade to the
> > > -           ### other is probably bogus. presuming the next filter down was
> > > -           ### working properly, it should not have returned more than
> > > -           ### READBYTES bytes, and we wouldn't have to do any work.
> > > -        */
> 
> You have to fix the lower code before tossing this stuff.
> 
> >...
> > > -        /* ### this is a hack. it is saying, "if we have read everything
> > > -           ### that was requested, then we are at the end of the request."
> > > -           ### it presumes that the next filter up will *only* call us
> > > -           ### with readbytes set to the Content-Length of the request.
> > > -           ### that may not always be true, and this code is *definitely*
> > > -           ### too presumptive of the caller's intent. the point is: this
> > > -           ### filter is not the guy that is parsing the headers or the
> > > -           ### chunks to determine where the end of the request is, so we
> > > -           ### shouldn't be monkeying with EOS buckets.
> > > -        */
> > > -        if (*readbytes == 0) {
> > > -            apr_bucket *eos = apr_bucket_eos_create();
> > > -
> > > -            APR_BRIGADE_INSERT_TAIL(b, eos);
> > > -        }
> 
> Somebody has to return an EOS somewhere.
> 
> Consider the highest-level input filter. How does it know when it has read
> all of the data for the current request? It needs to see an EOS bucket.
> 
> 
> I'll wait for the new patch and review again. All the formatting changes
> were getting in the way.

I've got it right now that it passes the apache tests in httpd-test,
but not the filter/input tests.  This is where I hit the EOS that
you described - we need to join the two filters again.

> But I really think the proper tack is to fix the amount-returned, and to
> combine the HTTP_IN and DECHUNK filters.

Since I know I'm going in the right direction, I'll head towards doing
this.

Expect a patch later tonight.  -- justin
Re: [PATCH] Rework httpd buckets/filtering

Reply via email to