Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Ruediger Pluem


On 08/25/2011 07:38 PM, j...@apache.org wrote:
 Author: jim
 Date: Thu Aug 25 17:38:19 2011
 New Revision: 1161661
 
 URL: http://svn.apache.org/viewvc?rev=1161661view=rev
 Log:
 Merge in byteranges
 
 Modified:
 httpd/httpd/trunk/modules/http/byterange_filter.c
 
 Modified: httpd/httpd/trunk/modules/http/byterange_filter.c
 URL: 
 http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/http/byterange_filter.c?rev=1161661r1=1161660r2=1161661view=diff
 ==
 --- httpd/httpd/trunk/modules/http/byterange_filter.c (original)
 +++ httpd/httpd/trunk/modules/http/byterange_filter.c Thu Aug 25 17:38:19 2011
 @@ -469,11 +469,14 @@ AP_CORE_DECLARE_NONSTD(apr_status_t) ap_
  
  static int ap_set_byterange(request_rec *r)
  {
 -const char *range;
 +const char *range, *or;
  const char *if_range;
  const char *match;
  const char *ct;
 -int num_ranges;
 +char *merged, *cur;
 +int num_ranges = 0;
 +apr_off_t ostart, oend;
 +int in_merge = 0;
  
  if (r-assbackwards) {
  return 0;
 @@ -526,17 +529,81 @@ static int ap_set_byterange(request_rec 
  }
  }
  
 -if (!ap_strchr_c(range, ',')) {
 -/* a single range */
 -num_ranges = 1;
 -}
 -else {
 -/* a multiple range */
 -num_ranges = 2;
 +range += 6;
 +or = apr_pstrdup(r-pool, range);
 +merged = apr_pstrdup(r-pool, );
 +while ((cur = ap_getword(r-pool, range, ','))) {
 +char *dash;
 +char *errp;
 +apr_off_t number, start, end;
 +
 +if (!(dash = strchr(cur, '-'))) {
 +break;
 +}
 +
 +if (dash == cur) {
 +/* In the form -5... leave as is */
 +merged = apr_pstrcat(r-pool, merged, (num_ranges++ ? , : ), 
 cur, NULL);
 +continue;
 +}
 +*dash++ = '\0';
 +if (!*dash) {
 +/* form 5- */
 +merged = apr_pstrcat(r-pool, merged, (num_ranges++ ? , : ), 
 cur, -, NULL);
 +continue;
 +}
 +/*
 + * we have #-#, so we need to grab them... we don't bother
 + * doing this for the #- or -# cases for speed reasons.
 + * After all, those will be fixed when the filter parses
 + * the merged range
 + */
 +if (apr_strtoff(number, cur, errp, 10) || *errp || number  0) {
 +break;
 +}
 +start = number;
 +if (apr_strtoff(number, dash, errp, 10) || *errp || number = 0) {
 +break;
 +}
 +end = number;
 +if (!in_merge) {
 +ostart = start;
 +oend = end;
 +}
 +in_merge = 0;
 +if (start = ostart) {
 +ostart = start;
 +in_merge = 1;
 +}
 +if (start  ostart  start  oend) {
 +in_merge = 1;
 +}
 +if ((end-1) = oend) {
 +oend = end;
 +in_merge = 1;
 +}
 +if (end  ostart  end  oend) {
 +in_merge = 1;
 +}
 +if (in_merge) {
 +continue;

I followed through the above algorithm with a range of 0-1,1000-1001 and the
result was a range of 0-1001 which feels wrong. Culprit seems to be the 
((end-1) = oend) check.
Shouldn't that be oend = (start-1)?

 +} else {
 +char *nr = apr_psprintf(r-pool, % APR_OFF_T_FMT -% 
 APR_OFF_T_FMT,
 +ostart, oend);
 +merged = apr_pstrcat(r-pool, merged, (num_ranges++ ? , : ), 
 nr, NULL);

Doing this in a loop feels bad as it creates new buffers over and over.
Shouldn't we create a buffer of the size of strlen(range) and add to this 
subsequently
(we know that the length of the merged range = length of range)?

 +}
  }
  
 +if (in_merge) {
 +char *nr = apr_psprintf(r-pool, % APR_OFF_T_FMT -% 
 APR_OFF_T_FMT,
 +ostart, oend);
 +merged = apr_pstrcat(r-pool, merged, (num_ranges++ ? , : ), nr, 
 NULL);
 +}
 +
  r-status = HTTP_PARTIAL_CONTENT;
 -r-range = range + 6;
 +r-range = merged;
 +ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r,
 +  Range: %s | %s, or, merged);
  
  return num_ranges;
  }
 


Regards

Rüdiger


Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski
I still think that your version is wrong wrong wrong and am
tempted to veto it.

It completely invalidates what ap_set_byterange() is designed to
do (set r-range) as well as removes the ability to count
overlaps, non-ascends, etc…

On Aug 25, 2011, at 7:11 PM, Stefan Fritsch wrote:

 On Thursday 25 August 2011, Jim Jagielski wrote:
 Be my guest. commit and fix rp's concerns.
 
 OK, done that. Trunk r1161791 passes all tests, including the new 
 byterange3.t and byterange4.t.
 



Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski

On Aug 25, 2011, at 5:34 PM, Joe Schaefer wrote:

 Look Jim, as someone who has written
 a HTTP parsing library with a good track
 record of no published security holes,
 take my criticism of what you wrote for
 what it's worth.

Believe me… I always do.



Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski

On Aug 26, 2011, at 2:50 AM, Ruediger Pluem wrote:
 
 I followed through the above algorithm with a range of 0-1,1000-1001 and the
 result was a range of 0-1001 which feels wrong. Culprit seems to be the 
 ((end-1) = oend) check.
 Shouldn't that be oend = (start-1)?

Yeah, that is wrong. Investigating. That part was to catch
things like 10-20,21-100

 
 +} else {
 +char *nr = apr_psprintf(r-pool, % APR_OFF_T_FMT -% 
 APR_OFF_T_FMT,
 +ostart, oend);
 +merged = apr_pstrcat(r-pool, merged, (num_ranges++ ? , : 
 ), nr, NULL);
 
 Doing this in a loop feels bad as it creates new buffers over and over.
 Shouldn't we create a buffer of the size of strlen(range) and add to this 
 subsequently
 (we know that the length of the merged range = length of range)?

My plan is to put each range into an array and at the
end flatten it via apr_array_pstrcat()

Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Stefan Fritsch
On Fri, August 26, 2011 13:37, Jim Jagielski wrote:
 I still think that your version is wrong wrong wrong and am
 tempted to veto it.

 It completely invalidates what ap_set_byterange() is designed to
 do (set r-range) as well as removes the ability to count
 overlaps, non-ascends, etc…

r-range is documented as The Range: header, and ap_set_byterange() sets
it to the value sent by the client. I don't think there is any specific
need to set r-range to a textual representation of what
ap_byterange_filter() chooses to do with the header. Or can you explain in
more detail why r-range needs to be updated?

And the counting logic is still possible, it just has to go inside the if
block where the merging is done, too:

if (i  0) {
if (!(ranges[i].start  ranges[i-1].end   + 1 
  ranges[i].endranges[i-1].start - 1))
{
if (ranges[i].start  ranges[i-1].start)
ranges[i-1].start = ranges[i].start;
if (ranges[i].end  ranges[i-1].end)
ranges[i-1].end = ranges[i].end;
continue;
}
}





RE: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Plüm, Rüdiger, VF-Group
 

 -Original Message-
 From: Jim Jagielski 
 Sent: Freitag, 26. August 2011 13:38
 To: dev@httpd.apache.org
 Subject: Re: svn commit: r1161661 - 
 /httpd/httpd/trunk/modules/http/byterange_filter.c
 
 I still think that your version is wrong wrong wrong and am
 tempted to veto it.
 
 It completely invalidates what ap_set_byterange() is designed to
 do (set r-range) as well as removes the ability to count
 overlaps, non-ascends, etc...

The question is: Does r-range need to reflect the range definition
that is actually delivered by the byte range filter?
The comment in httpd.h only says:

/** The Range: header */

So you could put it vice versa and say r-range is not allowed to contain
an adjusted / optimized version of the Range header, but must contain
the original Range header.
To be honest I would not follow that, because the original Range header is still
in r-headers_out and it makes sense to store an optimized version of the
Range header in r-range.

Nevertheless it is my opinion that it doesn't make sense to parse the Range 
header twice.
If we need a proper r-range and the other statistics, we should have a function
that parses the Range header and outputs

1. The number of ranges in the original header
2. The number of ranges after the merges.
3. Whatever else statistics we need.
4. The merged ranges as an array (like the one Stefans patch creates)
5. A proper r-range created from 4.

Although ap_set_byterange is in the ap_ namespace it is a static function
in byterange_filter.c. So we are free to adjust it as we like without the
need for any bumps and possibly creating code that cannot be backported.

Regards

Rüdiger

 


RE: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Plüm, Rüdiger, VF-Group
 

 -Original Message-
 From: Jim Jagielski [mailto:j...@jagunet.com] 
 Sent: Freitag, 26. August 2011 13:49
 To: dev@httpd.apache.org
 Subject: Re: svn commit: r1161661 - 
 /httpd/httpd/trunk/modules/http/byterange_filter.c
 
 
 On Aug 26, 2011, at 2:50 AM, Ruediger Pluem wrote:
  
  I followed through the above algorithm with a range of 
 0-1,1000-1001 and the
  result was a range of 0-1001 which feels wrong. Culprit 
 seems to be the ((end-1) = oend) check.
  Shouldn't that be oend = (start-1)?
 
 Yeah, that is wrong. Investigating. That part was to catch
 things like 10-20,21-100
 
  
  +} else {
  +char *nr = apr_psprintf(r-pool, % 
 APR_OFF_T_FMT -% APR_OFF_T_FMT,
  +ostart, oend);
  +merged = apr_pstrcat(r-pool, merged, 
 (num_ranges++ ? , : ), nr, NULL);
  
  Doing this in a loop feels bad as it creates new buffers 
 over and over.
  Shouldn't we create a buffer of the size of strlen(range) 
 and add to this subsequently
  (we know that the length of the merged range = length of range)?
 
 My plan is to put each range into an array and at the
 end flatten it via apr_array_pstrcat()

I thought about that as well, but I think a combination of a preallocated
buffer and apr_snprintf using a moving pointer in this buffer could
save even more memory in the typical use case.
Of course this changes if you remove a lot of ranges by merging.

Regards

Rüdiger


Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski

On Aug 26, 2011, at 7:54 AM, Plüm, Rüdiger, VF-Group wrote:

 
 
 -Original Message-
 From: Jim Jagielski 
 Sent: Freitag, 26. August 2011 13:38
 To: dev@httpd.apache.org
 Subject: Re: svn commit: r1161661 - 
 /httpd/httpd/trunk/modules/http/byterange_filter.c
 
 I still think that your version is wrong wrong wrong and am
 tempted to veto it.
 
 It completely invalidates what ap_set_byterange() is designed to
 do (set r-range) as well as removes the ability to count
 overlaps, non-ascends, etc...
 
 The question is: Does r-range need to reflect the range definition
 that is actually delivered by the byte range filter?
 The comment in httpd.h only says:
 
 /** The Range: header */
 
 So you could put it vice versa and say r-range is not allowed to contain
 an adjusted / optimized version of the Range header, but must contain
 the original Range header.
 To be honest I would not follow that, because the original Range header is 
 still
 in r-headers_out and it makes sense to store an optimized version of the
 Range header in r-range.
 

My thoughts as well…

 Nevertheless it is my opinion that it doesn't make sense to parse the Range 
 header twice.

Agreed… The issue is that to merge, you need to parse… We could have
ap_set_byterange() also return a set of start/stop points (in an
array), so that the filter can work with that and not bother with rereading
r-range. In fact, I think that makes the most sense… That is, adjust
ap_set_byterange() to create the modified, parsed r-range, a return status
and the start/stop array. The filter then goes thru that array.

Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski

On Aug 26, 2011, at 8:03 AM, Plüm, Rüdiger, VF-Group wrote:
 
 I thought about that as well, but I think a combination of a preallocated
 buffer and apr_snprintf using a moving pointer in this buffer could
 save even more memory in the typical use case.
 Of course this changes if you remove a lot of ranges by merging.

That's the tradeoff, yeah...

Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski

On Aug 26, 2011, at 8:31 AM, Jim Jagielski wrote:
 
 Agreed… The issue is that to merge, you need to parse… We could have
 ap_set_byterange() also return a set of start/stop points (in an
 array), so that the filter can work with that and not bother with rereading
 r-range. In fact, I think that makes the most sense… That is, adjust
 ap_set_byterange() to create the modified, parsed r-range, a return status
 and the start/stop array. The filter then goes thru that array.

So ap_set_byterange(request_rec *r, apr_off_t clen, apr_array_header_t *indexes)
will exist and parse_byterange() will go away…

Sound OK?

RE: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Plüm, Rüdiger, VF-Group
 

 -Original Message-
 From: Jim Jagielski [mailto:j...@jagunet.com] 
 Sent: Freitag, 26. August 2011 14:38
 To: dev@httpd.apache.org
 Subject: Re: svn commit: r1161661 - 
 /httpd/httpd/trunk/modules/http/byterange_filter.c
 
 
 On Aug 26, 2011, at 8:31 AM, Jim Jagielski wrote:
  
  Agreed... The issue is that to merge, you need to parse... We could have
  ap_set_byterange() also return a set of start/stop points (in an
  array), so that the filter can work with that and not 
 bother with rereading
  r-range. In fact, I think that makes the most sense... That 
 is, adjust
  ap_set_byterange() to create the modified, parsed r-range, 
 a return status
  and the start/stop array. The filter then goes thru that array.
 
 So ap_set_byterange(request_rec *r, apr_off_t clen, 
 apr_array_header_t *indexes)

What is clen?

Regards

Rüdiger

 


Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski

On Aug 26, 2011, at 8:46 AM, Plüm, Rüdiger, VF-Group wrote:

 
 
 -Original Message-
 From: Jim Jagielski [mailto:j...@jagunet.com] 
 Sent: Freitag, 26. August 2011 14:38
 To: dev@httpd.apache.org
 Subject: Re: svn commit: r1161661 - 
 /httpd/httpd/trunk/modules/http/byterange_filter.c
 
 
 On Aug 26, 2011, at 8:31 AM, Jim Jagielski wrote:
 
 Agreed... The issue is that to merge, you need to parse... We could have
 ap_set_byterange() also return a set of start/stop points (in an
 array), so that the filter can work with that and not 
 bother with rereading
 r-range. In fact, I think that makes the most sense... That 
 is, adjust
 ap_set_byterange() to create the modified, parsed r-range, 
 a return status
 and the start/stop array. The filter then goes thru that array.
 
 So ap_set_byterange(request_rec *r, apr_off_t clen, 
 apr_array_header_t *indexes)
 
 What is clen?
 

The content length… we need to know for ranges like 100- and -99




RE: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Plüm, Rüdiger, VF-Group
 

 -Original Message-
 From: Jim Jagielski [mailto:j...@jagunet.com] 
 Sent: Freitag, 26. August 2011 14:56
 To: dev@httpd.apache.org
 Subject: Re: svn commit: r1161661 - 
 /httpd/httpd/trunk/modules/http/byterange_filter.c
 
 
 On Aug 26, 2011, at 8:46 AM, Plüm, Rüdiger, VF-Group wrote:
 
  
  
  -Original Message-
  From: Jim Jagielski [mailto:j...@jagunet.com] 
  Sent: Freitag, 26. August 2011 14:38
  To: dev@httpd.apache.org
  Subject: Re: svn commit: r1161661 - 
  /httpd/httpd/trunk/modules/http/byterange_filter.c
  
  
  On Aug 26, 2011, at 8:31 AM, Jim Jagielski wrote:
  
  Agreed... The issue is that to merge, you need to 
 parse... We could have
  ap_set_byterange() also return a set of start/stop points (in an
  array), so that the filter can work with that and not 
  bother with rereading
  r-range. In fact, I think that makes the most sense... That 
  is, adjust
  ap_set_byterange() to create the modified, parsed r-range, 
  a return status
  and the start/stop array. The filter then goes thru that array.
  
  So ap_set_byterange(request_rec *r, apr_off_t clen, 
  apr_array_header_t *indexes)
  
  What is clen?
  
 
 The content length... we need to know for ranges like 100- and -99
 

Thanks. Where in these parameters / return values do we find the statistics (if 
needed),
e.g. number of ranges, number of overlaps, number of merges, etc.

Regards

Rüdiger


Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski

On Aug 26, 2011, at 9:02 AM, Plüm, Rüdiger, VF-Group wrote:

 
 Thanks. Where in these parameters / return values do we find the statistics 
 (if needed),
 e.g. number of ranges, number of overlaps, number of merges, etc.
 

Not sure yet how to best expose them, or wether to simply have
ap_set_byterange do that internally…

Working PoC coming soon...



Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Stefan Fritsch
On Fri, August 26, 2011 14:38, Jim Jagielski wrote:

 On Aug 26, 2011, at 8:31 AM, Jim Jagielski wrote:

 Agreed… The issue is that to merge, you need to parse… We could have
 ap_set_byterange() also return a set of start/stop points (in an
 array), so that the filter can work with that and not bother with
 rereading
 r-range. In fact, I think that makes the most sense… That is, adjust
 ap_set_byterange() to create the modified, parsed r-range, a return
 status
 and the start/stop array. The filter then goes thru that array.

 So ap_set_byterange(request_rec *r, apr_off_t clen, apr_array_header_t
 *indexes)
 will exist and parse_byterange() will go away…

 Sound OK?

Sounds OK to me. But I would prefer that the array is allocated with the
right size right from the start instead of growing an apr_array which
would again involve lots of copying.



RE: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Plüm, Rüdiger, VF-Group
 

 -Original Message-
 From: Stefan Fritsch [mailto:s...@sfritsch.de] 
 Sent: Freitag, 26. August 2011 15:41
 To: dev@httpd.apache.org
 Subject: Re: svn commit: r1161661 - 
 /httpd/httpd/trunk/modules/http/byterange_filter.c
 
 On Fri, August 26, 2011 14:38, Jim Jagielski wrote:
 
  On Aug 26, 2011, at 8:31 AM, Jim Jagielski wrote:
 
  Agreed... The issue is that to merge, you need to parse... We 
 could have
  ap_set_byterange() also return a set of start/stop points (in an
  array), so that the filter can work with that and not bother with
  rereading
  r-range. In fact, I think that makes the most sense... That 
 is, adjust
  ap_set_byterange() to create the modified, parsed 
 r-range, a return
  status
  and the start/stop array. The filter then goes thru that array.
 
  So ap_set_byterange(request_rec *r, apr_off_t clen, 
 apr_array_header_t
  *indexes)
  will exist and parse_byterange() will go away...
 
  Sound OK?
 
 Sounds OK to me. But I would prefer that the array is 
 allocated with the
 right size right from the start instead of growing an apr_array which
 would again involve lots of copying.

I guess we can do both: Count the ',' and give the number to apr_array_make

Regards

Rüdiger


Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski
Which array?

On Aug 26, 2011, at 9:41 AM, Stefan Fritsch wrote:

 On Fri, August 26, 2011 14:38, Jim Jagielski wrote:
 
 On Aug 26, 2011, at 8:31 AM, Jim Jagielski wrote:
 
 Agreed… The issue is that to merge, you need to parse… We could have
 ap_set_byterange() also return a set of start/stop points (in an
 array), so that the filter can work with that and not bother with
 rereading
 r-range. In fact, I think that makes the most sense… That is, adjust
 ap_set_byterange() to create the modified, parsed r-range, a return
 status
 and the start/stop array. The filter then goes thru that array.
 
 So ap_set_byterange(request_rec *r, apr_off_t clen, apr_array_header_t
 *indexes)
 will exist and parse_byterange() will go away…
 
 Sound OK?
 
 Sounds OK to me. But I would prefer that the array is allocated with the
 right size right from the start instead of growing an apr_array which
 would again involve lots of copying.
 



Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski
 
 I guess we can do both: Count the ',' and give the number to apr_array_make
 

Doesn't that mean that someone can craft a nasty Range (e.g: 0-0,1-1,2-2,
3-3,….- and cause us to preallocate a bunch
of memory when at the end we'll get 0- ???




RE: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Plüm, Rüdiger, VF-Group
 

 -Original Message-
 From: Jim Jagielski [mailto:j...@apache.org] 
 Sent: Freitag, 26. August 2011 16:27
 To: dev@httpd.apache.org
 Subject: Re: svn commit: r1161661 - 
 /httpd/httpd/trunk/modules/http/byterange_filter.c
 
  
  I guess we can do both: Count the ',' and give the number 
 to apr_array_make
  
 
 Doesn't that mean that someone can craft a nasty Range (e.g: 
 0-0,1-1,2-2,
 3-3,- and cause us to preallocate a bunch
 of memory when at the end we'll get 0- ???

In principal yes. Two things can happen:

1. The ranges are valid and do not overlap or are not mergable. In this
   case we need to allocate that memory anyway.

2. The ranges are mergable. In this case we allocate too much memory
   for the array. But this effect is limited by the maximum length a header 
field can
   have. And if this is not enough do a sane cut for the preallocation:

   MIN(number of ranges, MAX_PREALLOCATED_ARRAY_MEMBERS)

This should work fine for the typical use case where we can't merge anything
and avoid running in a DoS trap if we have a large number of mergable ranges.


Regards

Rüdiger 


Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Jim Jagielski

On Aug 26, 2011, at 10:34 AM, Plüm, Rüdiger, VF-Group wrote:

 
 
 -Original Message-
 From: Jim Jagielski [mailto:j...@apache.org] 
 Sent: Freitag, 26. August 2011 16:27
 To: dev@httpd.apache.org
 Subject: Re: svn commit: r1161661 - 
 /httpd/httpd/trunk/modules/http/byterange_filter.c
 
 
 I guess we can do both: Count the ',' and give the number 
 to apr_array_make
 
 
 Doesn't that mean that someone can craft a nasty Range (e.g: 
 0-0,1-1,2-2,
 3-3,- and cause us to preallocate a bunch
 of memory when at the end we'll get 0- ???
 
 In principal yes. Two things can happen:
 
 1. The ranges are valid and do not overlap or are not mergable. In this
   case we need to allocate that memory anyway.
 
 2. The ranges are mergable. In this case we allocate too much memory
   for the array. But this effect is limited by the maximum length a header 
 field can
   have. And if this is not enough do a sane cut for the preallocation:
 
   MIN(number of ranges, MAX_PREALLOCATED_ARRAY_MEMBERS)
 
 This should work fine for the typical use case where we can't merge anything
 and avoid running in a DoS trap if we have a large number of mergable ranges.
 

The current rev just allocates memory when needed….

Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread William A. Rowe Jr.
On 8/26/2011 7:03 AM, Plüm, Rüdiger, VF-Group wrote:
  
 From: Jim Jagielski [mailto:j...@jagunet.com] 
 Sent: Freitag, 26. August 2011 13:49

 My plan is to put each range into an array and at the
 end flatten it via apr_array_pstrcat()
 
 I thought about that as well, but I think a combination of a preallocated
 buffer and apr_snprintf using a moving pointer in this buffer could
 save even more memory in the typical use case.
 Of course this changes if you remove a lot of ranges by merging.

But the wasted memory is insignificant compared to the CPU saved by
not resizing the buffers, IMHO

Yet another example where we need apr_prealloc :)


Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-26 Thread Greg Ames
On Fri, Aug 26, 2011 at 10:27 AM, Jim Jagielski j...@apache.org wrote:

 
  I guess we can do both: Count the ',' and give the number to
 apr_array_make
 

 Doesn't that mean that someone can craft a nasty Range (e.g: 0-0,1-1,2-2,
 3-3,….- and cause us to preallocate a bunch
 of memory when at the end we'll get 0- ???


it won't fit in a header field of (default) legal length.  the attack vector
that killed us before the copy_brigade_range() patch keeps a single digit
start specifier and nearly fills up a legal header when the the end
specifier gets up to 1300.

fwiw, I calculated that with the original code and a tiny bit more malicious
attack vector, we would allocate 848,250 buckets per thread per request.

Greg


Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-25 Thread Stefan Fritsch

On Thu, 25 Aug 2011, j...@apache.org wrote:


Author: jim
Date: Thu Aug 25 17:38:19 2011
New Revision: 1161661

URL: http://svn.apache.org/viewvc?rev=1161661view=rev
Log:
Merge in byteranges

Modified:
   httpd/httpd/trunk/modules/http/byterange_filter.c

Modified: httpd/httpd/trunk/modules/http/byterange_filter.c
URL: 
http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/http/byterange_filter.c?rev=1161661r1=1161660r2=1161661view=diff
==
--- httpd/httpd/trunk/modules/http/byterange_filter.c (original)
+++ httpd/httpd/trunk/modules/http/byterange_filter.c Thu Aug 25 17:38:19 2011




+if (in_merge) {
+continue;
+} else {
+char *nr = apr_psprintf(r-pool, % APR_OFF_T_FMT -% 
APR_OFF_T_FMT,
+ostart, oend);
+merged = apr_pstrcat(r-pool, merged, (num_ranges++ ? , : ), 
nr, NULL);
+}
}

+if (in_merge) {
+char *nr = apr_psprintf(r-pool, % APR_OFF_T_FMT -% APR_OFF_T_FMT,
+ostart, oend);
+merged = apr_pstrcat(r-pool, merged, (num_ranges++ ? , : ), nr, 
NULL);
+}


Is it really a good idea to first parse the range string (involving lots 
of copying with ap_getword), then format it back into a string just to 
have it parsed by parse_byterange again? I would much prefer to have it 
parsed only once into an array of values and then do the merging in that 
array. This is more efficient and I think it would also lead to better 
readability. My original patch for merging/sorting had some code for that 
which we could reuse:


http://mail-archives.apache.org/mod_mbox/httpd-dev/201108.mbox/%3c201108240028.03308...@sfritsch.de%3E



Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-25 Thread Joe Schaefer
+1, also has the advantage of not being a quadratic
allocator the way Jim's usage of apr_pstrcat is.






From: Stefan Fritsch s...@sfritsch.de
To: dev@httpd.apache.org
Sent: Thursday, August 25, 2011 4:56 PM
Subject: Re: svn commit: r1161661 - 
/httpd/httpd/trunk/modules/http/byterange_filter.c

On Thu, 25 Aug 2011, j...@apache.org wrote:

 Author: jim
 Date: Thu Aug 25 17:38:19 2011
 New Revision: 1161661
 
 URL: http://svn.apache.org/viewvc?rev=1161661view=rev
 Log:
 Merge in byteranges
 
 Modified:
    httpd/httpd/trunk/modules/http/byterange_filter.c
 
 Modified: httpd/httpd/trunk/modules/http/byterange_filter.c
 URL: 
 http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/http/byterange_filter.c?rev=1161661r1=1161660r2=1161661view=diff
 ==
 --- httpd/httpd/trunk/modules/http/byterange_filter.c (original)
 +++ httpd/httpd/trunk/modules/http/byterange_filter.c Thu Aug 25 17:38:19 
 2011


 +        if (in_merge) {
 +            continue;
 +        } else {
 +            char *nr = apr_psprintf(r-pool, % APR_OFF_T_FMT -% 
 APR_OFF_T_FMT,
 +                                    ostart, oend);
 +            merged = apr_pstrcat(r-pool, merged, (num_ranges++ ? , : 
 ), nr, NULL);
 +        }
     }
 
 +    if (in_merge) {
 +        char *nr = apr_psprintf(r-pool, % APR_OFF_T_FMT -% 
 APR_OFF_T_FMT,
 +                                ostart, oend);
 +        merged = apr_pstrcat(r-pool, merged, (num_ranges++ ? , : ), 
 nr, NULL);
 +    }

Is it really a good idea to first parse the range string (involving lots of 
copying with ap_getword), then format it back into a string just to have it 
parsed by parse_byterange again? I would much prefer to have it parsed only 
once into an array of values and then do the merging in that array. This is 
more efficient and I think it would also lead to better readability. My 
original patch for merging/sorting had some code for that which we could reuse:

http://mail-archives.apache.org/mod_mbox/httpd-dev/201108.mbox/%3c201108240028.03308...@sfritsch.de%3E





Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-25 Thread Jim Jagielski

On Aug 25, 2011, at 4:56 PM, Stefan Fritsch wrote:

 
 Is it really a good idea to first parse the range string (involving lots of 
 copying with ap_getword), then format it back into a string just to have it 
 parsed by parse_byterange again?

It is if we want to be able to create a correct r-range element
with ap_set_byterange()

Certainly that is where we should be creating the canonical
Range: we are using (or figuring out whether to punt it).
 
 I would much prefer to have it parsed only once into an array of values and 
 then do the merging in that array. This is more efficient and I think it 
 would also lead to better readability. My original patch for merging/sorting 
 had some code for that which we could reuse:
 
 http://mail-archives.apache.org/mod_mbox/httpd-dev/201108.mbox/%3c201108240028.03308...@sfritsch.de%3E
 



Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-25 Thread Jim Jagielski

On Aug 25, 2011, at 5:02 PM, Joe Schaefer wrote:

 +1, also has the advantage of not being a quadratic
 allocator the way Jim's usage of apr_pstrcat is.
 

So what, exactly, will ap_set_byterange() do…? It was
my impression that it created our r-range entry...

Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-25 Thread Jim Jagielski

On Aug 25, 2011, at 4:56 PM, Stefan Fritsch wrote:

 
 Is it really a good idea to first parse the range string (involving lots of 
 copying with ap_getword), then format it back into a string just to have it 
 parsed by parse_byterange again? I would much prefer to have it parsed only 
 once into an array of values and then do the merging in that array. This is 
 more efficient and I think it would also lead to better readability. My 
 original patch for merging/sorting had some code for that which we could 
 reuse:
 
 http://mail-archives.apache.org/mod_mbox/httpd-dev/201108.mbox/%3c201108240028.03308...@sfritsch.de%3E
 

Be my guest. commit and fix rp's concerns.

Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-25 Thread Joe Schaefer
My criticism has to do with your implementation.
There's no point in fixing exploitable code with
a differently exploitable implementation.  Just
buffer things in an internal array and merge the
string once at the end of the loop, and *not* as
you iterate over the elements of the range header.





From: Jim Jagielski j...@jagunet.com
To: dev@httpd.apache.org
Sent: Thursday, August 25, 2011 5:10 PM
Subject: Re: svn commit: r1161661 - 
/httpd/httpd/trunk/modules/http/byterange_filter.c


On Aug 25, 2011, at 5:02 PM, Joe Schaefer wrote:

 +1, also has the advantage of not being a quadratic
 allocator the way Jim's usage of apr_pstrcat is.
 

So what, exactly, will ap_set_byterange() do…? It was
my impression that it created our r-range entry...



Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-25 Thread Jim Jagielski
And who said this was the final version? Not me...

My plan was to simply add elements to an array and then
apr_array_pstrcat()...

On Thu, Aug 25, 2011 at 02:14:05PM -0700, Joe Schaefer wrote:
My criticism has to do with your implementation.
There's no point in fixing exploitable code with
a differently exploitable implementation.  Just
buffer things in an internal array and merge the
string once at the end of the loop, and *not* as
you iterate over the elements of the range header.
 
--
 
  From: Jim Jagielski j...@jagunet.com
  To: dev@httpd.apache.org
  Sent: Thursday, August 25, 2011 5:10 PM
  Subject: Re: svn commit: r1161661 -
  /httpd/httpd/trunk/modules/http/byterange_filter.c
 
  On Aug 25, 2011, at 5:02 PM, Joe Schaefer wrote:
 
   +1, also has the advantage of not being a quadratic
   allocator the way Jim's usage of apr_pstrcat is.
  
 
  So what, exactly, will ap_set_byterange() do***? It was
  my impression that it created our r-range entry...

-- 
===
   Jim Jagielski   [|]   j...@jagunet.com   [|]   http://www.jaguNET.com/
Great is the guilt of an unnecessary war  ~ John Adams


Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-25 Thread Joe Schaefer
Look Jim, as someone who has written
a HTTP parsing library with a good track
record of no published security holes,
take my criticism of what you wrote for
what it's worth.  I have no idea what
your plans are, was simply agreeing with
Stefan's critique of your code.





From: Jim Jagielski j...@jagunet.com
To: dev@httpd.apache.org; Joe Schaefer joe_schae...@yahoo.com
Sent: Thursday, August 25, 2011 5:29 PM
Subject: Re: svn commit: r1161661 - 
/httpd/httpd/trunk/modules/http/byterange_filter.c

And who said this was the final version? Not me...

My plan was to simply add elements to an array and then
apr_array_pstrcat()...

On Thu, Aug 25, 2011 at 02:14:05PM -0700, Joe Schaefer wrote:
    My criticism has to do with your implementation.
    There's no point in fixing exploitable code with
    a differently exploitable implementation.  Just
    buffer things in an internal array and merge the
    string once at the end of the loop, and *not* as
    you iterate over the elements of the range header.
 
    --
 
      From: Jim Jagielski j...@jagunet.com
      To: dev@httpd.apache.org
      Sent: Thursday, August 25, 2011 5:10 PM
      Subject: Re: svn commit: r1161661 -
      /httpd/httpd/trunk/modules/http/byterange_filter.c
 
      On Aug 25, 2011, at 5:02 PM, Joe Schaefer wrote:
 
       +1, also has the advantage of not being a quadratic
       allocator the way Jim's usage of apr_pstrcat is.
      
 
      So what, exactly, will ap_set_byterange() do***? It was
      my impression that it created our r-range entry...

-- 
===
   Jim Jagielski   [|]  j...@jagunet.com   [|]  http://www.jaguNET.com/
        Great is the guilt of an unnecessary war  ~ John Adams




Re: svn commit: r1161661 - /httpd/httpd/trunk/modules/http/byterange_filter.c

2011-08-25 Thread Stefan Fritsch
On Thursday 25 August 2011, Jim Jagielski wrote:
 Be my guest. commit and fix rp's concerns.

OK, done that. Trunk r1161791 passes all tests, including the new 
byterange3.t and byterange4.t.