On 20 Oct 2021, at 10:43, Yann Ylavic <ylavic....@gmail.com> wrote:

> apr_strnstr() maybe, strnstr() is non-standard AFAICT and possibly not
> available on all platforms (Windows)?

This needs to be memmem really, as the boundary isn’t always a string.

Memmem is non portable and old versions are buggy. Is there license 
compatibility to use this code?

https://cvsweb.openbsd.org/src/lib/libc/string/memmem.c?rev=1.5

> 
>> +
>> +            /* definitely found it, we leave */
>> +            if (pos != NULL) {
>> +
>> +                off = pos - str;
>> +
>> +                /* everything up to the boundary */
>> +                if (off) {
>> +
>> +                    apr_bucket_split(e, off);
>> +                    APR_BUCKET_REMOVE(e);
>> +                    APR_BRIGADE_INSERT_TAIL(bbOut, e);
>> +
>> +                    e = APR_BRIGADE_FIRST(bbIn);
>> +                }
>> +
>> +                /* cut out the boundary */
>> +                apr_bucket_split(e, boundary_len);
>> +                apr_bucket_delete(e);
>> +
>> +                return APR_SUCCESS;
>> +            }
>> +
>> +            /* any partial matches at the end? */
>> +            leftover = boundary_len - 1;
>> +            off = (len - leftover);
> 
> Can't we fall through the below "else" slow path here? The code looks
> quite similar.

Let me look, as I recall the paths are separate to be clear.

>> +            APR_BUCKET_REMOVE(e);
>> +            APR_BRIGADE_INSERT_TAIL(bbOut, e);
>> +            continue;
> 
> "outbytes" not bumped?

Good catch.

> 
>> +
>> +        }
>> +
>> +        /*
>> +         * If we reach skip, it means the bucket in e is:
>> +         *
>> +         * - shorter than the boundary
>> +         * - matches the boundary up to the bucket length
>> +         * - might match more buckets
>> +         *
>> +         * Read further buckets and check whether the boundary matches all
>> +         * the way to the end. If so, we have a match. If no match, shave 
>> off
>> +         * one byte and continue round to try again.
>> +         */
>> +skip:
> 
> Below you read ahead to try to match the end of the boundary, and if
> not split one byte into bbOut before restarting the loop, which could
> result in a quite fragmented bbOut and is also what you almost did
> above already.
> 
> Couldn't we instead save the partially matching bucket(s) in a
> temporary brigade (provided by the caller) and restart the loop with
> the more effective [apr_]strnstr() match (at boundary+off)?

The input brigade is our temporary brigade. Instead of shaving off one on the 
bucket, we could keep a counter to just skip one (extra) byte on the next go 
round.

Regards,
Graham
—

Reply via email to