async reads - request for comments Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-20 Thread Brian Pane
I'll probably have time to work on the changes to ap_core_input_filter 
and/or

the async read code this weekend (starting on the async-read-dev branch).
If anyone has a strong argument for or against putting the buffering of
partial lines inside the ap_core_input_filter context, please speak up.

Thanks,
Brian

On Jan 7, 2006, at 11:57 PM, Brian Pane wrote:


On Jan 3, 2006, at 8:07 AM, Justin Erenkrantz wrote:


AFAICT, ap_read_async_request() on the branch can't handle a partial
line correctly.

Noting of course that ap_core_input_filter is 'cute' and masks EAGAIN.
 So, you'll never see EAGAIN from this code path!  As I said earlier,
the EAGAIN logic in httpd is completely suspect.

Furthermore, as I read it, ap_read_async_request is assuming that it
gets a complete line from getline_nonblocking - which almost certainly
won't be the case.  -- justin



I'm currently working on changing ap_core_input_filter so that it
doesn't mask the EAGAIN in AP_MODE_GETLINE mode.  There's
a catch, though:

if (mode == AP_MODE_GETLINE) {
/* we are reading a single LF line, e.g. the HTTP headers */
rv = apr_brigade_split_line(b, ctx-b, block, HUGE_STRING_LEN);

if apr_brigade_split_line returns APR_EAGAIN, it will have
consumed the partial line and put it in b.  So if core_input_filter
returns rv at this point, the caller will receive the partial line and
EAGAIN.  We'll need to do one of two things:

- Buffer the partial line in ap_core_input_filter, by removing the
  buckets from b and putting them back at the start of ctx-b.

- Or buffer the partial line in getline_nonblocking (or  
read_partial_request).


Anybody have a preference among these options?

Brian





Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-07 Thread Brian Pane

On Jan 3, 2006, at 8:07 AM, Justin Erenkrantz wrote:


AFAICT, ap_read_async_request() on the branch can't handle a partial
line correctly.

Noting of course that ap_core_input_filter is 'cute' and masks EAGAIN.
 So, you'll never see EAGAIN from this code path!  As I said earlier,
the EAGAIN logic in httpd is completely suspect.

Furthermore, as I read it, ap_read_async_request is assuming that it
gets a complete line from getline_nonblocking - which almost certainly
won't be the case.  -- justin


I'm currently working on changing ap_core_input_filter so that it
doesn't mask the EAGAIN in AP_MODE_GETLINE mode.  There's
a catch, though:

if (mode == AP_MODE_GETLINE) {
/* we are reading a single LF line, e.g. the HTTP headers */
rv = apr_brigade_split_line(b, ctx-b, block, HUGE_STRING_LEN);

if apr_brigade_split_line returns APR_EAGAIN, it will have
consumed the partial line and put it in b.  So if core_input_filter
returns rv at this point, the caller will receive the partial line and
EAGAIN.  We'll need to do one of two things:

- Buffer the partial line in ap_core_input_filter, by removing the
  buckets from b and putting them back at the start of ctx-b.

- Or buffer the partial line in getline_nonblocking (or  
read_partial_request).


Anybody have a preference among these options?

Brian



Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-03 Thread William A. Rowe, Jr.

Roy T. Fielding wrote:

On Jan 2, 2006, at 2:14 PM, Roy T. Fielding wrote:


Now, if you want to tell me that those changes produced a net
performance benefit on prefork (and thus are applicable to other  MPMs),
then I am all ears.  I am easily convinced by comparative performance
figures when the comparison is meaningful.


lol, of course you choose the non-threaded MPM as a reference, which therefore
should receive no meaningful performance change.  The difference between an
async wakeup and a poll result should be null for one socket pool, one process,
one thread (of course select is a differently ugly beast, and if there were
a platform that supported async with no poll, I'd laugh.)


BTW, part of the reason I say that is because I have considered
replacing the same code with something that doesn't parse the
header fields until the request header/body separator line is
seen, since that would allow the entire request header to be parsed
in-place for the common case.


Well ... you are using protocol knowledge that will render us http-bound
when it comes time to efficently bind waka (no crlf delims in a binary format
protocol, no?) or ftp (pushes a 'greeting' before going back to sleep.)

But I'd agree these changes are radical enough that maintaining the async
branch a while longer is probably a good thing.

Bill


Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-03 Thread Brian Pane


On Jan 2, 2006, at 3:41 PM, Justin Erenkrantz wrote:


+static apr_status_t process_request_line(request_rec *r, char *line,
+ int skip_first)
+{
+if (!skip_first  (r-the_request == NULL)) {
+/* This is the first line of the request */
+if ((line == NULL) || (*line == '\0')) {
+/* We skip empty lines because browsers have to tack  
a CRLF
on to the end + * of POSTs to support old CERN  
webservers.

+ */
+int max_blank_lines = r-server-limit_req_fields;
+if (max_blank_lines = 0) {
+max_blank_lines = DEFAULT_LIMIT_REQUEST_FIELDS;
+}
+r-num_blank_lines++;
+if (r-num_blank_lines  max_blank_lines) {
+return APR_SUCCESS;
+}
+}
+set_the_request(r, line);
+}


This will cause a segfault if line is null and we are at or above  
max_blank_lines.  Perhaps you meant for an else clause here?


Yes, thanks for catching that.




+else {
+/* We've already read the first line of the request.   
This is
either + * a header field or the blank line terminating  
the header

+ */
+if ((line == NULL) || (*line == '\0')) {
+if (r-pending_header_line != NULL) {
+apr_status_t rv = set_mime_header(r,
r-pending_header_line); +if (rv != APR_SUCCESS) {
+return rv;
+}
+r-pending_header_line = NULL;
+}
+if (r-status == HTTP_REQUEST_TIME_OUT) {
+apr_table_compress(r-headers_in,
APR_OVERLAP_TABLES_MERGE);
+r-status = HTTP_OK;


Say what?  ...looks at rest of file...

Is this because r-status is unset and we're saying that's it's  
'okay' to proceed with the request.  If so, this *really* needs a  
comment to that effect.  It makes no sense whatsoever otherwise.   
(We should probably remove the hack to set it to  
HTTP_REQUEST_TIME_OUT initially as part of these changes.)


Yeah, setting r-status to HTTP_OK is done here solely to make it work
with the existing logic about HTTP_REQUEST_TIME_OUT meaning still
reading the request header.

+1 for of removing the HTTP_REQUEST_TIME_OUT hack.  I was trying
to be conservative by preserving that part of the original logic, but  
now
that you mention it, we might as well replace it with something less  
subtle

in 2.3.  This will break any 3rd party modules that depend upon the
HTTP_REQUEST_TIME_OUT convention, but for a major release
like 2.4 or 3.0 that's a defensible choice.


+}
+}
+else {
+if ((*line == ' ') || (*line == '\t')) {
+/* This line is a continuation of the previous  
one */

+if (r-pending_header_line == NULL) {
+r-pending_header_line = line;
+r-pending_header_size = 0;
+}
+else {
+apr_size_t pending_len =
strlen(r-pending_header_line);
+apr_size_t fold_len = strlen(line);


This seems really expensive.  You shouldn't need to recount  
pending_len each time.


Good point; I'll add something to keep track of the length.


+}
+break;


If I understand your direction, it'd bail out here if it ever got  
EAGAIN?


Yes indeed.  That's what the version on the async-read-dev branch does.


+request_rec *ap_read_request(conn_rec *conn)
+{
+request_rec *r;
+apr_status_t rv;
+
+r = init_request(conn);
+
+rv = read_partial_request(r);
+/* TODO poll if EAGAIN */
+if (r-status != HTTP_OK) {
+ap_send_error_response(r, 0);
+ap_update_child_status(conn-sbh, SERVER_BUSY_LOG, r);
+ap_run_log_transaction(r);
+return NULL;
+}


Obviously, this can't be the case if you want to do real polling.   
This would be the wrong place to poll.  You have to exit out of  
ap_read_request and go back up to an 'event thread' that waits  
until there's data to read on any incoming sockets.  Then, you'd  
have to call ap_read_request again to 'restart' the parser whenever  
there is more data to read.


In my opinion, this code isn't even close to being async.


I'm relieved to hear that it's not async, since you're looking at the  
blocking
version.  :-)  See ap_read_async_request() (still on the async-read- 
dev branch).


  So, I wonder why it was even merged to trunk right now.  You'd  
have to deal with partial lines and the state the header parser is  
in when it gets the next chunk of data - which this code doesn't  
even try to do.  The current code is just going to bail when it  
doesn't have a 'complete' line.


My hunch is that you plan to build up pending_header_line and delay  
parsing until you have the line terminators; but that's not going  
to work really well with non-blocking sockets as you have to store  
the data you just read 

Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-03 Thread Justin Erenkrantz
On 1/3/06, Brian Pane [EMAIL PROTECTED] wrote:

 Yeah, setting r-status to HTTP_OK is done here solely to make it work
 with the existing logic about HTTP_REQUEST_TIME_OUT meaning still
 reading the request header.

 +1 for of removing the HTTP_REQUEST_TIME_OUT hack.  I was trying
 to be conservative by preserving that part of the original logic, but
 now
 that you mention it, we might as well replace it with something less
 subtle
 in 2.3.  This will break any 3rd party modules that depend upon the
 HTTP_REQUEST_TIME_OUT convention, but for a major release
 like 2.4 or 3.0 that's a defensible choice.

+1.

 I'm relieved to hear that it's not async, since you're looking at the
 blocking
 version.  :-)  See ap_read_async_request() (still on the async-read-
 dev branch).

AFAICT, ap_read_async_request() on the branch can't handle a partial
line correctly.

Noting of course that ap_core_input_filter is 'cute' and masks EAGAIN.
 So, you'll never see EAGAIN from this code path!  As I said earlier,
the EAGAIN logic in httpd is completely suspect.

Furthermore, as I read it, ap_read_async_request is assuming that it
gets a complete line from getline_nonblocking - which almost certainly
won't be the case.  -- justin


Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-03 Thread Roy T. Fielding

On Jan 3, 2006, at 12:02 AM, William A. Rowe, Jr. wrote:


Roy T. Fielding wrote:

On Jan 2, 2006, at 2:14 PM, Roy T. Fielding wrote:

Now, if you want to tell me that those changes produced a net
performance benefit on prefork (and thus are applicable to other   
MPMs),
then I am all ears.  I am easily convinced by comparative  
performance

figures when the comparison is meaningful.


lol, of course you choose the non-threaded MPM as a reference,  
which therefore
should receive no meaningful performance change.  The difference  
between an
async wakeup and a poll result should be null for one socket pool,  
one process,
one thread (of course select is a differently ugly beast, and if  
there were

a platform that supported async with no poll, I'd laugh.)


You seem to misunderstand me -- if I compare two prefork servers, one
with the change and one without the change, and the one with the change
performs better (by whatever various measures of performance you can  
test),

then that is an argument for making the change regardless of the other
concerns.

If, instead, you say that the change improves the event MPM by 10% and
degrades performance on prefork by 1%, then I am -1 on that change.
Prefork is our workhorse MPM.  The task then is to isolate MPM-specific
changes so that they have no significant impact on the critical path
of our primary MPM, even if that means using #ifdefs.

Alternatively, rewrite the server to remove all MPMs other than
event and then show that the new server is better than our existing
server, and we can adopt that for 3.0.


BTW, part of the reason I say that is because I have considered
replacing the same code with something that doesn't parse the
header fields until the request header/body separator line is
seen, since that would allow the entire request header to be parsed
in-place for the common case.


Well ... you are using protocol knowledge that will render us http- 
bound
when it comes time to efficently bind waka (no crlf delims in a  
binary format
protocol, no?) or ftp (pushes a 'greeting' before going back to  
sleep.)


Well, I am assuming that the MIME header parsing code is in the
protocol-specific part of the server, yes.

Roy



Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-03 Thread William A. Rowe, Jr.

Roy T. Fielding wrote:


Alternatively, rewrite the server to remove all MPMs other than
event and then show that the new server is better than our existing
server, and we can adopt that for 3.0.


Well that's a bit silly, leave the others for those who must have an entirely
non-threaded server or other options; but prove that event is more performant
and quite stable for all platforms, if they would only take the time to use
threaded-compatible 3rd party modules, then make event the default 3.0 mpm...

Now that's a metric I'll buy.

Bill


Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-02 Thread Roy T. Fielding

On Dec 31, 2005, at 9:55 PM, Brian Pane wrote:

On Dec 31, 2005, at 6:50 PM, Roy T. Fielding wrote:


The nonblocking yield should happen inside ap_rgetline (or its
replacement), not in the calling routine.  The thread has nothing
to do until that call is finished or it times out.


On the contrary, the thread has some very important work to do before
that call finishes or times out: it has other connections to  
process.  If
the thread waits until the ap_rgetline completes, a server  
configuration
sized for multiple threads per connection will be vulnerable to a  
trivially

implementable DoS.


When I say thread, I mean a single stream of control with execution
stack, not OS process/thread.  An event MPM is going to have a single
stream of control per event, right?  What I am saying is that the
control should block in rgetline and yield (return to the event pool)
inside that function.  That way, the complications due to yielding are
limited to the I/O routines that might block a thread rather than
being spread all over the server code.

Am I missing something?  This is not a new topic -- Dean Gaudet had
quite a few rants on the subject in the archives.


 In any case,
this code should be independent of the MPM and no MPM is going
to do something useful with a partial HTTP request.

I say -1 to adding the input buffer variables to the request_rec.
Those variables can be local to the input loop.


Are you proposing that the buffers literally become local variables?
That generally won't work; the input loop isn't necessarily contained
within a single function call, and the reading of a single request's
input can cross threads.


I am saying it doesn't belong in the request_rec.  It belongs on the
execution stack that the yield routine has to save in order to return
to this execution path on the next event.  The request does not care
about partial lines.


It would be feasible to build up the pending request in a structure
other than the request_rec, so that ap_read_async_request() can
operate on, say, an ap_partial_request_t instead of a request_rec.
My preference so far, though, has been to leave the responsibility
for knowing how to parse request headers encapsulated within
the request_rec and its associated methods.


Maybe you should just keep those changes on the async branch for now.
The rest of the server cannot be allowed to degrade just because you
want to introduce a new MPM.  After the async branch is proven to be
significantly faster than prefork, then we can evaluate whether or
not the additional complications are worth it.

Roy


Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-02 Thread Brian Pane

On Jan 2, 2006, at 11:52 AM, Roy T. Fielding wrote:


It would be feasible to build up the pending request in a structure
other than the request_rec, so that ap_read_async_request() can
operate on, say, an ap_partial_request_t instead of a request_rec.
My preference so far, though, has been to leave the responsibility
for knowing how to parse request headers encapsulated within
the request_rec and its associated methods.


Maybe you should just keep those changes on the async branch for now.
The rest of the server cannot be allowed to degrade just because you
want to introduce a new MPM.  After the async branch is proven to be
significantly faster than prefork, then we can evaluate whether or
not the additional complications are worth it.


Significantly faster than prefork has never been a litmus test for
assessing new features, and I'm -1 for adding it now.  A reasonable
technical metric for validating the async changes would significantly
more scalable than the 2.2 Event MPM or memory footprint
competitive with IIS/Zeus/phttpd/one's-competitive-benchmark-of-choice.

The bit about degrading the rest of the server is a wonderful sound
bite, but we need to engineer the httpd based on data, not FUD.

Brian



Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-02 Thread Roy T. Fielding

On Jan 2, 2006, at 1:37 PM, Brian Pane wrote:


Significantly faster than prefork has never been a litmus test for
assessing new features, and I'm -1 for adding it now.  A reasonable
technical metric for validating the async changes would significantly
more scalable than the 2.2 Event MPM or memory footprint
competitive with IIS/Zeus/phttpd/one's-competitive-benchmark-of- 
choice.


Those aren't features.  They are properties of the resulting system
assuming all goes well.


The bit about degrading the rest of the server is a wonderful sound
bite, but we need to engineer the httpd based on data, not FUD.


I said leave it on the async branch until you have data.  You moved
it to trunk before you've even implemented the async part, which I
think is wrong because the way you implemented it damages the
performance of prefork and needlessly creates an incompatible MMN.
Maybe it would be easier for me to understand why the event loop is
being controlled at such a high level if I could see it work first.

Now, if you want to tell me that those changes produced a net
performance benefit on prefork (and thus are applicable to other MPMs),
then I am all ears.  I am easily convinced by comparative performance
figures when the comparison is meaningful.

Roy


Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-02 Thread Brian Pane


On Jan 2, 2006, at 2:14 PM, Roy T. Fielding wrote:


On Jan 2, 2006, at 1:37 PM, Brian Pane wrote:


Significantly faster than prefork has never been a litmus test for
assessing new features, and I'm -1 for adding it now.  A reasonable
technical metric for validating the async changes would  
significantly

more scalable than the 2.2 Event MPM or memory footprint
competitive with IIS/Zeus/phttpd/one's-competitive-benchmark-of- 
choice.


Those aren't features.  They are properties of the resulting system
assuming all goes well.


The bit about degrading the rest of the server is a wonderful sound
bite, but we need to engineer the httpd based on data, not FUD.


I said leave it on the async branch until you have data.  You moved
it to trunk before you've even implemented the async part, which I
think is wrong because the way you implemented it damages the
performance of prefork and needlessly creates an incompatible MMN


You have yet to present any data backing up the assertion that
it damages the performance of prefork.  (Repeating the claim
doesn't count as a proof.)  Having looked at the compiler's
inlining of the code that's been factored into separate functions,
I'm rather skeptical of the claim.

The incompatible MMN point is puzzling, to say the least.  As the
4th MMN change in the past quarter, this was by no means an
unusual event.  And on an unreleased development codeline,
the impact to production sites and 3rd party module developers
is near zero.

Brian



Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-02 Thread Roy T. Fielding

On Jan 2, 2006, at 2:14 PM, Roy T. Fielding wrote:


Now, if you want to tell me that those changes produced a net
performance benefit on prefork (and thus are applicable to other  
MPMs),

then I am all ears.  I am easily convinced by comparative performance
figures when the comparison is meaningful.


BTW, part of the reason I say that is because I have considered
replacing the same code with something that doesn't parse the
header fields until the request header/body separator line is
seen, since that would allow the entire request header to be parsed
in-place for the common case.

Roy


Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-02 Thread Jim Jagielski
It seems to me (catching up on my vacation Email) that the issues can
be resolved by simply reverting the patches on trunk while maintaining
development on the async branch. Certainly one reason for such
branches is to allow for sideline development without affecting
the main development branch with changes that are unproven
yet (or not so self-contained as to allow inline development).
-- 
===
   Jim Jagielski   [|]   [EMAIL PROTECTED]   [|]   http://www.jaguNET.com/
If you can dodge a wrench, you can dodge a ball.


Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2006-01-02 Thread Justin Erenkrantz

Comments inline.

--On December 31, 2005 11:45:14 PM + [EMAIL PROTECTED] wrote:


= --- httpd/httpd/trunk/include/httpd.h (original)
+++ httpd/httpd/trunk/include/httpd.h Sat Dec 31 15:45:11 2005
@@ -967,6 +967,16 @@
 /** A flag to determine if the eos bucket has been sent yet */
 int eos_sent;

+/** Number of leading blank lines encountered before request header
*/ +int num_blank_lines;
+
+/** A buffered header line, used to support header folding while
+ *  reading the request */
+char *pending_header_line;
+
+/** If nonzero, the number of bytes allocated to hold
pending_header_line */ +apr_size_t pending_header_size;
+


As Roy pointed out, these fields should not be in request_rec.  A filter or 
connection context, perhaps.  But, not here.  (More as to why below.)


...snip...


+static apr_status_t process_request_line(request_rec *r, char *line,
+ int skip_first)
+{
+if (!skip_first  (r-the_request == NULL)) {
+/* This is the first line of the request */
+if ((line == NULL) || (*line == '\0')) {
+/* We skip empty lines because browsers have to tack a CRLF
on to the end + * of POSTs to support old CERN webservers.
+ */
+int max_blank_lines = r-server-limit_req_fields;
+if (max_blank_lines = 0) {
+max_blank_lines = DEFAULT_LIMIT_REQUEST_FIELDS;
+}
+r-num_blank_lines++;
+if (r-num_blank_lines  max_blank_lines) {
+return APR_SUCCESS;
+}
+}
+set_the_request(r, line);
+}


This will cause a segfault if line is null and we are at or above 
max_blank_lines.  Perhaps you meant for an else clause here?



+else {
+/* We've already read the first line of the request.  This is
either + * a header field or the blank line terminating the header
+ */
+if ((line == NULL) || (*line == '\0')) {
+if (r-pending_header_line != NULL) {
+apr_status_t rv = set_mime_header(r,
r-pending_header_line); +if (rv != APR_SUCCESS) {
+return rv;
+}
+r-pending_header_line = NULL;
+}
+if (r-status == HTTP_REQUEST_TIME_OUT) {
+apr_table_compress(r-headers_in,
APR_OVERLAP_TABLES_MERGE);
+r-status = HTTP_OK;


Say what?  ...looks at rest of file...

Is this because r-status is unset and we're saying that's it's 'okay' to 
proceed with the request.  If so, this *really* needs a comment to that 
effect.  It makes no sense whatsoever otherwise.  (We should probably 
remove the hack to set it to HTTP_REQUEST_TIME_OUT initially as part of 
these changes.)



+}
+}
+else {
+if ((*line == ' ') || (*line == '\t')) {
+/* This line is a continuation of the previous one */
+if (r-pending_header_line == NULL) {
+r-pending_header_line = line;
+r-pending_header_size = 0;
+}
+else {
+apr_size_t pending_len =
strlen(r-pending_header_line);
+apr_size_t fold_len = strlen(line);


This seems really expensive.  You shouldn't need to recount pending_len 
each time.



+if (pending_len + fold_len 
+r-server-limit_req_fieldsize) {
+/* CVE-2004-0942 */
+r-status = HTTP_BAD_REQUEST;
+return APR_ENOSPC;
+}
+if (pending_len + fold_len + 1 
r-pending_header_size) { +/* Allocate a new
buffer big enough to hold the + * concatenated
lines
+ */
+apr_size_t new_size = r-pending_header_size;
+char *new_buf;
+if (new_size == 0) {
+new_size = pending_len + fold_len + 1;
+}
+else {
+do {
+new_size *= 2;
+} while (new_size  pending_len + fold_len +
1); +}
+new_buf = (char *)apr_palloc(r-pool, new_size);
+memcpy(new_buf, r-pending_header_line,
pending_len); +r-pending_header_line = new_buf;
+r-pending_header_size = new_size;
+}
+memcpy(r-pending_header_line + pending_len, line,
+   fold_len + 1);


See, we know how much we've written each time...


+/* Read and process lines of the request until we
+ * encounter a complete request header, an error, or EAGAIN
+ */
+tmp_bb = apr_brigade_create(r-pool, 

Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2005-12-31 Thread Roy T. Fielding

On Dec 31, 2005, at 3:45 PM, [EMAIL PROTECTED] wrote:


Author: brianp
Date: Sat Dec 31 15:45:11 2005
New Revision: 360461

URL: http://svn.apache.org/viewcvs?rev=360461view=rev
Log:
Refactoring of ap_read_request() to store partial request state
in the request rec.  The point of this is to allow asynchronous
MPMs do do nonblocking reads of requests.  (Backported from the
async-read-dev branch)


Umm, this needs more eyes.

It doesn't seem to me to be doing anything useful.  It just adds
a set of unused input buffer fields to the wrong memory structure,
resulting in what should be a minor (not major) MMN bump, and then
rearranges a critical-path function into several subroutines.

The nonblocking yield should happen inside ap_rgetline (or its
replacement), not in the calling routine.  The thread has nothing
to do until that call is finished or it times out.  In any case,
this code should be independent of the MPM and no MPM is going
to do something useful with a partial HTTP request.

I say -1 to adding the input buffer variables to the request_rec.
Those variables can be local to the input loop.  I don't see any
point in placing this on trunk until it can do something useful.

Roy


Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

2005-12-31 Thread Brian Pane

On Dec 31, 2005, at 6:50 PM, Roy T. Fielding wrote:


The nonblocking yield should happen inside ap_rgetline (or its
replacement), not in the calling routine.  The thread has nothing
to do until that call is finished or it times out.


On the contrary, the thread has some very important work to do before
that call finishes or times out: it has other connections to  
process.  If

the thread waits until the ap_rgetline completes, a server configuration
sized for multiple threads per connection will be vulnerable to a  
trivially

implementable DoS.


 In any case,
this code should be independent of the MPM and no MPM is going
to do something useful with a partial HTTP request.

I say -1 to adding the input buffer variables to the request_rec.
Those variables can be local to the input loop.


Are you proposing that the buffers literally become local variables?
That generally won't work; the input loop isn't necessarily contained
within a single function call, and the reading of a single request's
input can cross threads.

It would be feasible to build up the pending request in a structure
other than the request_rec, so that ap_read_async_request() can
operate on, say, an ap_partial_request_t instead of a request_rec.
My preference so far, though, has been to leave the responsibility
for knowing how to parse request headers encapsulated within
the request_rec and its associated methods.

Brian