Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Joe Schaefer
Joe Schaefer <[EMAIL PROTECTED]> writes:

> Glenn Strauss <[EMAIL PROTECTED]> writes:

[...]

> > You suggested always using READBYTES.  If a POST used chunked T-E to
> > send input, then when HTTP_IN is reading chunk trailers, too much data
> > might be read, i.e. the next request line might be read.  What should
> > ap_get_mime_headers() do with the excess data at that point?  
> > Should it arbitrarily push an ephemeral filter onto the bottom of the
> > connection stack?
> 
> "Bottom" being r->input_filters, why not?
> 
> > 
> > > but if it did, it could add an
> > > ephemeral connection filter and put the excess brigade
> > > in there.
> > 
> > Messy.  That makes the assumption that the connection-level
> > filters are right above HTTP_IN
> 
> Currently HTTP_IN isn't even on the filter stack when 
> ap_get_mime_headers is invoked; only connection filters 
> are in play at that point, no?

Sorry Glenn, I didn't read your post carefully enough.  You are 
focused on the chunked trailer headers here, which are parsed 
*after* the final POST content is read.  One thing ap_get_mime_headers 
shouldn't be doing at *this* point is reentering the whole
r->input_filters stack from within ap_input_filter.  

Instead, it probably should be using the *same filter* it used when 
ap_read_request had it parse the leading headers (in particular,
skipping all the AP_FTYPE_RESOURCE filters).  Otherwise the filters 
between r->input_filters and *that one* may wind up either accidentally 
modifying the trailers, or having their context data corrupted by the 
ap_get_brigade reentry (since they're still awaiting a first return 
from ap_input_filter).

Or they could all be required to punt on AP_MODE_GETLINE (like
mod_deflate does),  which would mean some filters like mod_ext_filter
need a patch.


-- 
Joe Schaefer



[patch] ap_sub_req_* args naming fix

2004-08-13 Thread Stas Bekman
If there are no objections I'd like to fix the naming of the arguments and 
 doxygen comments, as they seem to be a result of copy-n-paste.

ap_sub_req_*_uri  to use new_uri
ap_sub_req_*_file to use new_file
Index: include/http_request.h
===
RCS file: /home/cvs/httpd-2.0/include/http_request.h,v
retrieving revision 1.42.2.4
diff -u -r1.42.2.4 http_request.h
--- include/http_request.h	9 Feb 2004 20:54:34 -	1.42.2.4
+++ include/http_request.h	14 Aug 2004 04:21:23 -
@@ -62,21 +62,21 @@
 /**
  * Create a sub request from the given URI.  This sub request can be
  * inspected to find information about the requested URI
- * @param new_file The URI to lookup
+ * @param new_uri The URI to lookup
  * @param r The current request
  * @param next_filter The first filter the sub_request should use.  If 
this is
  *NULL, it defaults to the first filter for the main 
request
  * @return The new request record
- * @deffunc request_rec * ap_sub_req_lookup_uri(const char *new_file, 
const request_rec *r)
+ * @deffunc request_rec * ap_sub_req_lookup_uri(const char *new_uri, 
const request_rec *r)
  */
-AP_DECLARE(request_rec *) ap_sub_req_lookup_uri(const char *new_file,
+AP_DECLARE(request_rec *) ap_sub_req_lookup_uri(const char *new_uri,
 const request_rec *r,
 ap_filter_t *next_filter);

 /**
  * Create a sub request for the given file.  This sub request can be
  * inspected to find information about the requested file
- * @param new_file The URI to lookup
+ * @param new_file The file to lookup
  * @param r The current request
  * @param next_filter The first filter the sub_request should use.  If 
this is
  *NULL, it defaults to the first filter for the main 
request
@@ -113,15 +113,15 @@
  * Create a sub request for the given URI using a specific method.  This
  * sub request can be inspected to find information about the requested URI
  * @param method The method to use in the new sub request
- * @param new_file The URI to lookup
+ * @param new_uri The URI to lookup
  * @param r The current request
  * @param next_filter The first filter the sub_request should use.  If 
this is
  *NULL, it defaults to the first filter for the main 
request
  * @return The new request record
- * @deffunc request_rec * ap_sub_req_method_uri(const char *method, const 
char *new_file, const request_rec *r)
+ * @deffunc request_rec * ap_sub_req_method_uri(const char *method, const 
char *new_uri, const request_rec *r)
  */
 AP_DECLARE(request_rec *) ap_sub_req_method_uri(const char *method,
-const char *new_file,
+const char *new_uri,
 const request_rec *r,
 ap_filter_t *next_filter);
 /**
@@ -289,8 +289,8 @@
  * @param r The current request
  * @return DONE (or HTTP_) if this contextless request was just fulfilled
  * (such as TRACE), OK if this is not a file, and DECLINED if this is a 
file.
- * The core map_to_storage (HOOK_RUN_LAST) will directory_walk and file_walk
- * the r->filename.
+ * The core map_to_storage (HOOK_RUN_REALLY_LAST) will directory_walk
+ * and file_walk the r->filename.
  *
  * @ingroup hooks
  */
Index: server/request.c
===
RCS file: /home/cvs/httpd-2.0/server/request.c,v
retrieving revision 1.121.2.13
diff -u -r1.121.2.13 request.c
--- server/request.c	9 Feb 2004 20:59:46 -	1.121.2.13
+++ server/request.c	14 Aug 2004 04:21:24 -
@@ -1567,7 +1567,7 @@

 AP_DECLARE(request_rec *) ap_sub_req_method_uri(const char *method,
-const char *new_file,
+const char *new_uri,
 const request_rec *r,
 ap_filter_t *next_filter)
 {
@@ -1581,13 +1581,13 @@
 rnew->method = method;
 rnew->method_number = ap_method_number_of(method);
-if (new_file[0] == '/') {
-ap_parse_uri(rnew, new_file);
+if (new_uri[0] == '/') {
+ap_parse_uri(rnew, new_uri);
 }
 else {
 udir = ap_make_dirstr_parent(rnew->pool, r->uri);
 udir = ap_escape_uri(rnew->pool, udir);/* re-escape it */
-ap_parse_uri(rnew, ap_make_full_path(rnew->pool, udir, new_file));
+ap_parse_uri(rnew, ap_make_full_path(rnew->pool, udir, new_uri));
 }
 /* We cannot return NULL without violating the API. So just turn this
@@ -1612,11 +1612,11 @@
 return rnew;
 }
-AP_DECLARE(request_rec *) ap_sub_req_lookup_uri(const char *new_file,
+AP_DECLARE(request_rec *) ap_sub_req_lookup_uri(const char *new_uri,
   

Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Joe Schaefer
Glenn Strauss <[EMAIL PROTECTED]> writes:

> On Fri, Aug 13, 2004 at 03:08:17PM -0400, Joe Schaefer wrote:
> > Glenn Strauss <[EMAIL PROTECTED]> writes:
> > 
> > [...]
> > 
> > > I'm not sure the answer to this one:
> > > Are protocol filters attached to the request (I think so)
> > > or to the connection?  If attached to the request, then 
> > > wouldn't they need to push-back excess data before the request
> > > finishes if the data is to be used by later requests on the
> > > same connection?
> > > 
> > > The HTTP_IN filter allocates its ctx from r->pool, so it won't
> > > survive multiple requests on the same connection.
> > 
> > I don't think HTTP_IN would normally read more than one 
> > request at a time,
> 
> You suggested always using READBYTES.  If a POST used chunked T-E to
> send input, then when HTTP_IN is reading chunk trailers, too much data
> might be read, i.e. the next request line might be read.  What should
> ap_get_mime_headers() do with the excess data at that point?  
> Should it arbitrarily push an ephemeral filter onto the bottom of the
> connection stack?

"Bottom" being r->input_filters, why not?

> 
> > but if it did, it could add an
> > ephemeral connection filter and put the excess brigade
> > in there.
> 
> Messy.  That makes the assumption that the connection-level
> filters are right above HTTP_IN

Currently HTTP_IN isn't even on the filter stack when 
ap_get_mime_headers is invoked; only connection filters 
are in play at that point, no?

If you want to see an example of how this is be implemented
one level up, at the request-filter/protocol filter boundary, 
take a look at mod_apreq in httpd-apreq-2 current cvs.

-- 
Joe Schaefer



Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Glenn Strauss
On Fri, Aug 13, 2004 at 03:08:17PM -0400, Joe Schaefer wrote:
> Glenn Strauss <[EMAIL PROTECTED]> writes:
> 
> [...]
> 
> > I'm not sure the answer to this one:
> > Are protocol filters attached to the request (I think so)
> > or to the connection?  If attached to the request, then 
> > wouldn't they need to push-back excess data before the request
> > finishes if the data is to be used by later requests on the
> > same connection?
> > 
> > The HTTP_IN filter allocates its ctx from r->pool, so it won't
> > survive multiple requests on the same connection.
> 
> I don't think HTTP_IN would normally read more than one 
> request at a time,

You suggested always using READBYTES.  If a POST used chunked T-E to
send input, then when HTTP_IN is reading chunk trailers, too much data
might be read, i.e. the next request line might be read.  What should
ap_get_mime_headers() do with the excess data at that point?  Should
it arbitrarily push an ephemeral filter onto the bottom of the
connection stack?

> but if it did, it could add an
> ephemeral connection filter and put the excess brigade
> in there.

Messy.  That makes the assumption that the connection-level
filters are right above HTTP_IN; that f->next is a connection-level
filter.  Otherwise, what about other protocol filters which might
be doing translation?  You're bypassing them and throwing data
above them at the connection level.  I don't think this occurs in
current practice, but it is still a violation of what is intended
for filtering, unless you're going to require that there only be
a single protocol filter, or that HTTP_IN always be at the top of
the protocol filter stack.


This just occurred to me: why not promote HTTP_IN to be a
connection level filter, and add a buffer brigade to http_ctx_t?
Not only would that handle the case above, but it would also
obviate the need for the ugly AP_MODE_EATCRLF because the
HTTP_IN filter would handle what is now done in check_pipeline_flush().

And if we remove folding ability from ap_rgetline_core(), (and
replace the one usage of folding in proxy_util.c), then we can
also get rid of AP_MODE_SPECULATIVE.

Getting rid of AP_MODE_SPECULATIVE and AP_MODE_EATCRLF sounds to me
like moves in the right direction.  What do you think?

Cheers,
Glenn



ap_os_escape_path with partial=0?

2004-08-13 Thread Stas Bekman
Can someone please explain what this function does when partial=0? why 
does it prepend "./" only if the path includes ":/" in it? what's 
happening here? the API doesn't document this nuance. What kind of input 
does it take care of? I can see it used by mod_autoindex, but may be it 
never hits this condition?

Shouldn't it go the other way around? i.e. not append ./ if it finds ':/' 
in the string?

if (!(colon && (!slash || colon < slash))) {
*d++ = '.';
*d++ = '/';
}
Thanks.
Here is the whole function for your convenience:
AP_DECLARE(char *) ap_os_escape_path(apr_pool_t *p, const char *path, int 
partial)
{
char *copy = apr_palloc(p, 3 * strlen(path) + 3);
const unsigned char *s = (const unsigned char *)path;
unsigned char *d = (unsigned char *)copy;
unsigned c;

if (!partial) {
const char *colon = ap_strchr_c(path, ':');
const char *slash = ap_strchr_c(path, '/');
if (colon && (!slash || colon < slash)) {
*d++ = '.';
*d++ = '/';
}
}
while ((c = *s)) {
if (TEST_CHAR(c, T_OS_ESCAPE_PATH)) {
d = c2x(c, d);
}
else {
*d++ = c;
}
++s;
}
*d = '\0';
return copy;
}
--
__
Stas BekmanJAm_pH --> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: New Mod_Proxy - some testing/looking

2004-08-13 Thread NormW
Good morning Brad,
Thanks for knocking on the httpd.apache.org door the latest updates now
work all the way through to apache2.nlm without issue... can now see what I
can do with the nlm's. Give a regards to the commiters at
devAThttpd.apache.org.
Norm

>From what I can see, there seems to be a parameter mismatch for the
> scheme_handler hook.  For example, the scheme_handler
> ap_proxy_connect_handler() is defined as:
>
> int ap_proxy_connect_handler(request_rec *r, proxy_server_conf *conf,
>  char *url, const char *proxyname,
>  apr_port_t proxyport)
>
> but the scheme_handler hook is defined as:
>
> APR_DECLARE_EXTERNAL_HOOK(proxy, PROXY, int, scheme_handler,
> (request_rec *r,
>   proxy_worker *worker, proxy_server_conf
> *conf, char *url,
>   const char *proxyhost, apr_port_t
> proxyport))
>
> notice the difference in the parameter lists.  The declaration has  6
> parameters and the actual handler only has 5.  The handler is missing
> proxy_worker * as the second parameter.  How does this compile or even
> work on any platform?
>
> Brad
>
>
> Brad Nicholes
> Senior Software Engineer
> Novell, Inc., the leading provider of Net business solutions
> http://www.novell.com
>
> >>> [EMAIL PROTECTED] Friday, August 13, 2004 1:37:16 PM >>>
> At 04:58 AM 8/13/2004, NormW wrote:
> >Good evening Bill, All...
> >
> >> Please direct these comments to [EMAIL PROTECTED] - b.t.w., you
> >> can check out the latest httpd-2.0 HEAD and pick up the entire
> proxy
> >> solution (you must explicitly --enable-proxy-ajp and have the
> ajplib
> >> sources there too.)
> >>
> >> Someone want to take a wack at NormW's observations?
> >Bill, thanks for the interest...
> >I can say that the 'problem' noted in proxy_ftp.c has been attended
> with a
> >recent update from Mladen.
> >A build on the current 2.1 CVS shows (for NetWare):
> >- mod_proxy.c having trouble finding mod_status.h   (in
> >../modules/generators),
> >- the same problem mentioned previously with proxy_connect.c, (Errors
> at 375
> >see 26, 80)
> >Bypassing these I now get:
> >Calling NWGNUproxyftp
> >Generating Release\proxyftp_cc.opt
> >Compiling proxy_ftp.c
> >Compiling proxy_util.c
> >Compiling ../arch/netware/libprews.c
> >Generating Release\proxyftp_link.opt
> >Linking Release/proxyftp.nlm
> >### mwldnlm Linker Error:
> >#   Undefined symbol: 'ap_proxy_ssl_enable'
> >#   referenced from 'ap_proxy_connection_create' in proxy_util.o
> >### mwldnlm Linker Error:
> >#   Undefined symbol: 'ap_proxy_ssl_disable'
> >#   referenced from 'ap_proxy_connection_create' in proxy_util.o
> >
> >As noted above, this is on a NetWare build...
>
> It's expected because the netware gurus haven't had a chance to
> catch up.  Please quit using tomcat-dev already - final warning;
> ignoring future posts to there in order to enforce posting future
> proxy patches/comments to [EMAIL PROTECTED] ;-)
>
> I suspect Netware didn't grok that those symbols were local/global
> as required?
>
> Bill
>
>
>



Re: [PATCH] fix child reclaim timing

2004-08-13 Thread Jeff Trawick
On Fri, 13 Aug 2004 16:48:42 -0400, Arliss, Noah <[EMAIL PROTECTED]> wrote:
> I'd like to comment further... Not only is a disturbing message sent to the
> error log, but a SIGTERM is also sent to the child process. If I understand
> correctly the SIGTERM will likely interrupt any properly implemented child
> process shutdown and the child process will exit ungracefully.

for worker MPM, at least:

child processes have a SIGTERM handler that simply sets a flag and
returns to whatever was happening before; it will be the main thread
of a child that receives a message via another mechanism which tells
it to wake up and decide to exit

the SIGTERM isn't expected to interrupt any important processing going
on in the child (be it worker threads or child exit hook)

SIGTERM is sent multiple times to work around any signal loss or other
glitch (not sure when this is effective in reality); I don't see how
it is harmful to any code that must run

the SIGKILL is what yanks the rug out from under the child and any
child exit hooks; the web server simply must exit in a reasonable
timeframe if the administrator tells it too, stuck code or not

>   If it's
> acceptable to wait longer then the kill call should also be postponed to
> give modules a chance to cleanup gracefully. If any module has complex IPC
> or Mutexes in use, graceful shutdown is important especially if
> MaxRequestsPerChild is in use on a server with heavy load.

yes, the SIGKILL is the measure of last resort; shouldn't be sent for
a while after we start shutting down

here is a current example:

(I don't actually know when shutdown started; I should add a debug msg
there; but it is very short time before this uninteresting mess
starts)
[Mon Jun 14 09:15:11 2004] [warn] child process 3906 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:12 2004] [warn] child process 3907 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:12 2004] [warn] child process 3924 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:12 2004] [warn] child process 3925 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:12 2004] [warn] child process 3926 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:12 2004] [warn] child process 3906 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:12 2004] [warn] child process 3907 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:12 2004] [warn] child process 3924 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:12 2004] [warn] child process 3925 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:12 2004] [warn] child process 3926 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:13 2004] [warn] child process 3906 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:13 2004] [warn] child process 3907 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:13 2004] [warn] child process 3924 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:13 2004] [warn] child process 3925 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:13 2004] [warn] child process 3926 still did not
exit, sending a SIGTERM
[Mon Jun 14 09:15:14 2004] [error] [client 127.0.0.1] request failed:
error reading the headers
[Mon Jun 14 09:15:15 2004] [info] (9)Bad file number:
core_output_filter: writing data to the network
[Mon Jun 14 09:15:17 2004] [error] child process 3906 still did not
exit, sending a SIGKILL
[Mon Jun 14 09:15:34 2004] [info] removed PID file
/export/home/trawick/inst/20/logs/httpd.pid (pid=3903)
[Mon Jun 14 09:15:34 2004] [notice] caught SIGTERM, shutting down

if SIGTERM simply sets a flag and returns, what is use of repeating
the SIGTERM over and over?  for worker MPM it doesn't help or hurt
AFAICT; worker does something else to wake up its children prior to
calling the code Joe has a patch for

this sounds a bit more sane to me for timing, as long as we can exit
as soon as all children have exited:

shutdown + 0:
  send SIGTERM
shutdown + 4:
  for each child still remaining, bitch to error log and send SIGTERM again
shutdown + 8:
  for each child still remaining, bitch to error log and send SIGTERM again
shutdown + 12:
  for each child still remaining, bitch to error log and send SIGKILL
shutdown + 16:
  for each child still remaining, bitch to error log, send SIGKILL again, 
and exit anyway

if somebody suspects that sending SIGTERM every second is going to
help some MPM+platform, that would be great to know


Re: New Mod_Proxy - some testing/looking

2004-08-13 Thread Brad Nicholes
   From what I can see, there seems to be a parameter mismatch for the
scheme_handler hook.  For example, the scheme_handler
ap_proxy_connect_handler() is defined as:

int ap_proxy_connect_handler(request_rec *r, proxy_server_conf *conf, 
 char *url, const char *proxyname, 
 apr_port_t proxyport)

but the scheme_handler hook is defined as:

APR_DECLARE_EXTERNAL_HOOK(proxy, PROXY, int, scheme_handler,
(request_rec *r, 
  proxy_worker *worker, proxy_server_conf
*conf, char *url, 
  const char *proxyhost, apr_port_t
proxyport))

notice the difference in the parameter lists.  The declaration has  6
parameters and the actual handler only has 5.  The handler is missing
proxy_worker * as the second parameter.  How does this compile or even
work on any platform? 

Brad


Brad Nicholes
Senior Software Engineer
Novell, Inc., the leading provider of Net business solutions
http://www.novell.com 

>>> [EMAIL PROTECTED] Friday, August 13, 2004 1:37:16 PM >>>
At 04:58 AM 8/13/2004, NormW wrote:
>Good evening Bill, All...
>
>> Please direct these comments to [EMAIL PROTECTED] - b.t.w., you
>> can check out the latest httpd-2.0 HEAD and pick up the entire
proxy
>> solution (you must explicitly --enable-proxy-ajp and have the
ajplib
>> sources there too.)
>>
>> Someone want to take a wack at NormW's observations?
>Bill, thanks for the interest...
>I can say that the 'problem' noted in proxy_ftp.c has been attended
with a
>recent update from Mladen.
>A build on the current 2.1 CVS shows (for NetWare):
>- mod_proxy.c having trouble finding mod_status.h   (in
>../modules/generators),
>- the same problem mentioned previously with proxy_connect.c, (Errors
at 375
>see 26, 80)
>Bypassing these I now get:
>Calling NWGNUproxyftp
>Generating Release\proxyftp_cc.opt
>Compiling proxy_ftp.c 
>Compiling proxy_util.c
>Compiling ../arch/netware/libprews.c
>Generating Release\proxyftp_link.opt
>Linking Release/proxyftp.nlm
>### mwldnlm Linker Error:
>#   Undefined symbol: 'ap_proxy_ssl_enable'
>#   referenced from 'ap_proxy_connection_create' in proxy_util.o
>### mwldnlm Linker Error:
>#   Undefined symbol: 'ap_proxy_ssl_disable'
>#   referenced from 'ap_proxy_connection_create' in proxy_util.o
>
>As noted above, this is on a NetWare build...

It's expected because the netware gurus haven't had a chance to
catch up.  Please quit using tomcat-dev already - final warning;
ignoring future posts to there in order to enforce posting future
proxy patches/comments to [EMAIL PROTECTED] ;-)

I suspect Netware didn't grok that those symbols were local/global
as required?

Bill




RE: [PATCH] fix child reclaim timing

2004-08-13 Thread Arliss, Noah
I'd like to comment further... Not only is a disturbing message sent to the
error log, but a SIGTERM is also sent to the child process. If I understand
correctly the SIGTERM will likely interrupt any properly implemented child
process shutdown and the child process will exit ungracefully. If it's
acceptable to wait longer then the kill call should also be postponed to
give modules a chance to cleanup gracefully. If any module has complex IPC
or Mutexes in use, graceful shutdown is important especially if
MaxRequestsPerChild is in use on a server with heavy load.

-Noah

-Original Message-
From: Jeff Trawick [mailto:[EMAIL PROTECTED] 
Sent: Friday, August 13, 2004 10:27 AM
To: [EMAIL PROTECTED]
Subject: Re: [PATCH] fix child reclaim timing

On Fri, 13 Aug 2004 14:51:23 +0100, Joe Orton <[EMAIL PROTECTED]> wrote:
> The 2.0 ap_reclaim_child_processes logic seems to be broken - it never
> resets the waittime variable as it did in 1.3; so the parent will wait
> for up to 23 minutes (sic) in total for a stuck child process.  (SIGSTOP
> a child and strace the parent to see for yourself)
> 
> This updates the logic to be a little more sane:
> 
> - at t + 16, 82, 344 ms, just waitpid()
> - at t + 425, 688, 1736 ms, waitpid() else SIGTERM the child
> - at t + 1.74 secs, waitpid() else SIGKILL the child
> - at t + 1.75, 1.82 secs, just waitpid()
> - at t + 2.08 secs, waitpid() else log "this child won't die"
> 
> Any comments?

Here is my take on what is wrong with current code:

1) It starts complaining a bit too soon.  Some third-party modules
have rather complicated child exit strategies.  Whether or not that is
good or bad (bad ;) ), it results in disturbing messages that wouldn't
have appeared if we were a little more patient (2-3 seconds).  Also, I
suspect that the use of threaded MPM affects how quickly the children
are exiting now on Unix.

2) It should never stop checking for exited processes less often than
1-2 seconds, even if it doesn't complain to error log that often. 
Like you say, current code can wait a VERY long time for child
processes to exit.  In practice, I see that it can wait a VERY long
time even after the last child has exited.

I'll agree that it should never wait so long, though I think around 15
or so seconds total is reasonable.  Exiting before children are gone
doesn't let Apache start up any more quickly; it just prevents
potentially-useful information about timing from getting logged to the
error log.

--/--

I wouldn't complain to error log at all until it has been 2 seconds,
and then I'd still wait around for 10-15 more.  But it has to check
every second so it finds out soon after all children have exited and
doesn't sleep needlessly.



Re: New Mod_Proxy - some testing/looking

2004-08-13 Thread William A. Rowe, Jr.
At 04:58 AM 8/13/2004, NormW wrote:
>Good evening Bill, All...
>
>> Please direct these comments to [EMAIL PROTECTED] - b.t.w., you
>> can check out the latest httpd-2.0 HEAD and pick up the entire proxy
>> solution (you must explicitly --enable-proxy-ajp and have the ajplib
>> sources there too.)
>>
>> Someone want to take a wack at NormW's observations?
>Bill, thanks for the interest...
>I can say that the 'problem' noted in proxy_ftp.c has been attended with a
>recent update from Mladen.
>A build on the current 2.1 CVS shows (for NetWare):
>- mod_proxy.c having trouble finding mod_status.h   (in
>../modules/generators),
>- the same problem mentioned previously with proxy_connect.c, (Errors at 375
>see 26, 80)
>Bypassing these I now get:
>Calling NWGNUproxyftp
>Generating Release\proxyftp_cc.opt
>Compiling proxy_ftp.c
>Compiling proxy_util.c
>Compiling ../arch/netware/libprews.c
>Generating Release\proxyftp_link.opt
>Linking Release/proxyftp.nlm
>### mwldnlm Linker Error:
>#   Undefined symbol: 'ap_proxy_ssl_enable'
>#   referenced from 'ap_proxy_connection_create' in proxy_util.o
>### mwldnlm Linker Error:
>#   Undefined symbol: 'ap_proxy_ssl_disable'
>#   referenced from 'ap_proxy_connection_create' in proxy_util.o
>
>As noted above, this is on a NetWare build...

It's expected because the netware gurus haven't had a chance to
catch up.  Please quit using tomcat-dev already - final warning;
ignoring future posts to there in order to enforce posting future
proxy patches/comments to [EMAIL PROTECTED] ;-)

I suspect Netware didn't grok that those symbols were local/global
as required?

Bill




Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Joe Schaefer
Glenn Strauss <[EMAIL PROTECTED]> writes:

[...]

> I'm not sure the answer to this one:
> Are protocol filters attached to the request (I think so)
> or to the connection?  If attached to the request, then 
> wouldn't they need to push-back excess data before the request
> finishes if the data is to be used by later requests on the
> same connection?
> 
> The HTTP_IN filter allocates its ctx from r->pool, so it won't
> survive multiple requests on the same connection.

I don't think HTTP_IN would normally read more than one 
request at a time, but if it did, it could add an
ephemeral connection filter and put the excess brigade
in there.

-- 
Joe Schaefer



Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Glenn Strauss
On Fri, Aug 13, 2004 at 02:36:38PM -0400, Joe Schaefer wrote:
> Glenn Strauss <[EMAIL PROTECTED]> writes:
> > If you are suggesting that there be no line-mode to read from filters,
> 
> I am.
> 
> > then we might also need some sort of way to push excess data back up
> > the filter chain if we pulled it, parsed out lines, and decided that
> > we did not need all of it, e.g. pulling the next HTTP request line
> > into the HTTP_IN filter for the current request.  
> 
> If a protocol filter reads more data than it needs, then that filter
> needs to set-aside the excess data for downstream filters to collect.
> I see no reason why an imput filter needs to push the excess data 
> back upstream.

I'm not sure the answer to this one:
Are protocol filters attached to the request (I think so)
or to the connection?  If attached to the request, then 
wouldn't they need to push-back excess data before the request
finishes if the data is to be used by later requests on the
same connection?

The HTTP_IN filter allocates its ctx from r->pool, so it won't
survive multiple requests on the same connection.

Cheers,
Glenn


Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Glenn Strauss
On Fri, Aug 13, 2004 at 01:21:15PM -0400, Greg Ames wrote:
> Glenn Strauss wrote:
> >On Wed, Aug 11, 2004 at 03:51:13PM -0700, Justin Erenkrantz wrote:
> 
> >>Please back up a bit.
> >>
> >>Why do you think the modes should be combined?  -- justin
> >
> 
> >More details:
> >-
> >Why bitflags, you ask?
> 
> >AP_MODE_MIME_FOLDING = 16
> 
> yuck.
> 
> getline would be more efficient and easier to understand if we removed the 
> folding code and created a separate function whose only job is to do 
> folding, i.e., get single lines one at a time and combine them if 
> appropriate. ap_get_mime_headers_core() works pretty much like that now, 
> but it has optimizations and additional functions that won't work for a 
> more general ap_get_folded_line() or whatever.

ap_rgetline and ap_get_mime_headers are too specific to be useful
except in a few places.  They both pull from r->input_filters, which
makes them unusable by any routine that wants to pull a line from a
brigade and is located in the filter chain anywhere other than at the
end.

I'm aiming more at a generic API to aid routines that have a brigade
(that they can stick in an ap_filter_ra_t) and have these routines
do the dirty work, whether it be to pull a certain number of bytes or
mime folding or speculative reads.

> If the seldom used folding code is removed from the basic getline, there 
> will be no need to test a folding flag/parameter in the usual non-folding 
> case, and the instruction cache hit rate will improve.

If you're concerned with icache performance, then there are plenty
of other places where Apache could benefit from a little code reorg.
(We all know that increasing code reuse and reducing code duplication
often improves icache performance.)

> As far as I know, only mod_proxy calls ap_getline today with the folding 
> flag set.  Third party modules could do the same, but if we changed this on 
> a release boundary it shouldn't be too disruptive.

That's because ap_get_mime_headers() implements its own folding.
The hoop-jumping behavior of ap_rgetline_core() to not remove all
whitespace from a line containing only whitespace is not needed
if the API that detects MIME folding is at a lower lever and can
return consistent values for completed lines.

(An alternative is not having ap_rgetline_core() remove whitespace
 and having each routine that calls it do it itself.)



Any comments on the ap_rgetline() and ap_get_mime_headers() examples
using ap_brigade_ra* that I posted to the list this morning?

One of the reasons I wrote this API is to be able to make the core
routines ap_rgetline_core(), ap_get_mime_headers(), and others
fully non-blocking if APR_NONBLOCKING_READ is requested.  (Instead
of using ap_regetline_core() and ap_get_mime_headers(), I'd make
other routines that took an (ap_filter_ra_t *) as an arg so that
state could be kept when nonblocking was desired)

Do you have other suggestions how to accomplish this without having 
something keeping state (such as the readahead API I proposed)?

Thanks.
Glenn


Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Joe Schaefer
Glenn Strauss <[EMAIL PROTECTED]> writes:

[...]

> > I really don't like the very confusing AP_MODE_* semantics
> 
> Would they be less confusing if the behavior was more consistent?

No, because of the burden these modes place other filters (eg
mod_deflate).  Downstream input-filters / content-handlers cannot 
rely on upsteam filters doing anything sane with modes other than 
AP_MODE_READBYTES.

> That's what I'm trying to accomplish, too.

Agreed.

> > as a fundamental component of the input filter API.  Why not just 
> > replace ap_get_mime_headers_core with a "request_parser" filter 
> > (or add this functionality to ap_http_filter) that parses both 
> > the request line and the incoming headers?  If that were done, 
> > there would be no need for an AP_MODE_GETLINE invocation (hence 
> > no apr_brigade_split_line in core_input_filter).
> 
> The problem is that many protocols are based around line-by-line
> or MIME headers delimited by a blank CRLF line.  And even when
> not in filters, I find myself having to parse lines out of brigades.

Then IMO those protocols need a protocol filter which implements
without extending their protocol's specific needs into the
upstream filters.

> If you are suggesting that there be no line-mode to read from filters,

I am.

> then we might also need some sort of way to push excess data back up
> the filter chain if we pulled it, parsed out lines, and decided that
> we did not need all of it, e.g. pulling the next HTTP request line
> into the HTTP_IN filter for the current request.  

If a protocol filter reads more data than it needs, then that filter
needs to set-aside the excess data for downstream filters to collect.
I see no reason why an imput filter needs to push the excess data 
back upstream.


-- 
Joe Schaefer



Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Greg Ames
Glenn Strauss wrote:
On Wed, Aug 11, 2004 at 03:51:13PM -0700, Justin Erenkrantz wrote:

Please back up a bit.
Why do you think the modes should be combined?  -- justin


More details:
-
Why bitflags, you ask?

AP_MODE_MIME_FOLDING = 16
yuck.
getline would be more efficient and easier to understand if we removed the 
folding code and created a separate function whose only job is to do folding, 
i.e., get single lines one at a time and combine them if appropriate. 
ap_get_mime_headers_core() works pretty much like that now, but it has 
optimizations and additional functions that won't work for a more general 
ap_get_folded_line() or whatever.

If the seldom used folding code is removed from the basic getline, there will be 
no need to test a folding flag/parameter in the usual non-folding case, and the 
instruction cache hit rate will improve.

As far as I know, only mod_proxy calls ap_getline today with the folding flag 
set.  Third party modules could do the same, but if we changed this on a release 
boundary it shouldn't be too disruptive.

Greg



Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Glenn Strauss
On Fri, Aug 13, 2004 at 12:37:30PM -0400, Joe Schaefer wrote:
> Justin Erenkrantz <[EMAIL PROTECTED]> writes:
> 
> [...]
> 
> > Therefore, folding might only be possible to do in ap_http_filter, but
> > it can't go down further as into core_input_filter (which is where we
> > now call apr_brigade_split_line). A new getline_folding filter right
> > in front of ap_http_filter would do the trick,
> 
> I really don't like the very confusing AP_MODE_* semantics

Would they be less confusing if the behavior was more consistent?
That's what I'm trying to accomplish, too.

> as a fundamental component of the input filter API.  Why not just 
> replace ap_get_mime_headers_core with a "request_parser" filter 
> (or add this functionality to ap_http_filter) that parses both 
> the request line and the incoming headers?  If that were done, 
> there would be no need for an AP_MODE_GETLINE invocation (hence 
> no apr_brigade_split_line in core_input_filter).

The problem is that many protocols are based around line-by-line
or MIME headers delimited by a blank CRLF line.  And even when
not in filters, I find myself having to parse lines out of brigades.

If you are suggesting that there be no line-mode to read from filters,
then we might also need some sort of way to push excess data back up
the filter chain if we pulled it, parsed out lines, and decided that
we did not need all of it, e.g. pulling the next HTTP request line
into the HTTP_IN filter for the current request.  IFF the HTTP_IN
filter was right before the connection level filters, it could insert
a filter at the bottom of the connection filters and push back the
excess data.  However, I think this is more complicated and more fragile
than having an API -- such as the readahead API I proposed -- that
handles the AP_MODE_* semantics for filters that use it.

Cheers,
Glenn


Increasing LimitRequestFieldsize

2004-08-13 Thread Mathihalli, Madhusudan
Hello,
I was wondering if there's any potential harm in increasing the
LimitRequestFieldsize from it's current value of 8k to something more
(like 32k).

Thanks
-Madhu


Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Joe Schaefer
Justin Erenkrantz <[EMAIL PROTECTED]> writes:

[...]

> Therefore, folding might only be possible to do in ap_http_filter, but
> it can't go down further as into core_input_filter (which is where we
> now call apr_brigade_split_line). A new getline_folding filter right
> in front of ap_http_filter would do the trick,

I really don't like the very confusing AP_MODE_* semantics as a
fundamental component of the input filter API.  Why not just 
replace ap_get_mime_headers_core with a "request_parser" filter 
(or add this functionality to ap_http_filter) that parses both 
the request line and the incoming headers?  If that were done, 
there would be no need for an AP_MODE_GETLINE invocation (hence 
no apr_brigade_split_line in core_input_filter).

-- 
Joe Schaefer



[PATCH] proxy_ajp.c

2004-08-13 Thread jean-frederic clere
Hi,
I have arranged the logic to send the request and the body to Tomcat, now it 
works for both chunked and not-chunked.

What is not yet working is when the Tomcat starts to send data before having all 
the body and then reads a little more body data.

Cheers
Jean-Frederic
Index: proxy_ajp.c
===
RCS file: /home/cvspublic/httpd-2.0/modules/proxy/proxy_ajp.c,v
retrieving revision 1.13
diff -u -r1.13 proxy_ajp.c
--- proxy_ajp.c 11 Aug 2004 23:08:04 -  1.13
+++ proxy_ajp.c 13 Aug 2004 16:11:00 -
@@ -107,6 +107,10 @@
 apr_status_t status;
 int result;
 apr_bucket_brigade *input_brigade;
+ajp_msg_t *msg;
+apr_size_t bufsiz;
+char *buff;
+const char *tenc;
 
 /*
  * Send the AJP request to the remote server
@@ -123,51 +127,59 @@
 return HTTP_SERVICE_UNAVAILABLE;
 }
 
-/* read the first bloc of data */
-input_brigade = apr_brigade_create(p, r->connection->bucket_alloc);
-status = ap_get_brigade(r->input_filters, input_brigade,
-AP_MODE_READBYTES, APR_BLOCK_READ,
-AJP13_MAX_SEND_BODY_SZ);
- 
+/* allocate an AJP message to store the data of the buckets */
+status = ajp_alloc_data_msg(r, &buff, &bufsiz, &msg);
 if (status != APR_SUCCESS) {
 ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
- "proxy: ap_get_brigade failed");
-apr_brigade_destroy(input_brigade);
-return HTTP_INTERNAL_SERVER_ERROR;
+ "proxy: ajp_alloc_data_msg failed");
+return status;
 }
-
-/* have something */
-if (APR_BUCKET_IS_EOS(APR_BRIGADE_LAST(input_brigade))) {
+/* read the first bloc of data */
+input_brigade = apr_brigade_create(p, r->connection->bucket_alloc);
+tenc = apr_table_get(r->headers_in, "Transfer-Encoding");
+if (tenc && strcasecmp(tenc, "chunked")==0) {
+ /* The AJP protocol does not want body data yet */
 ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
- "proxy: APR_BUCKET_IS_EOS");
-}
-
-if (1) { /*  only when something to send ? */
-ajp_msg_t *msg;
-apr_size_t bufsiz;
-char *buff;
-status = ajp_alloc_data_msg(r, &buff, &bufsiz, &msg);
+ "proxy: request is chunked");
+} else {
+status = ap_get_brigade(r->input_filters, input_brigade,
+AP_MODE_READBYTES, APR_BLOCK_READ,
+AJP13_MAX_SEND_BODY_SZ);
+ 
 if (status != APR_SUCCESS) {
-return status;
+ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
+ "proxy: ap_get_brigade failed");
+apr_brigade_destroy(input_brigade);
+return HTTP_INTERNAL_SERVER_ERROR;
 }
+
+/* have something */
+if (APR_BUCKET_IS_EOS(APR_BRIGADE_LAST(input_brigade))) {
+ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
+ "proxy: APR_BUCKET_IS_EOS");
+}
+
+/* Try to send something */
 ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
  "proxy: data to read (max %d at %08x)", bufsiz, buff);
 
-/*  calls apr_brigade_flatten... */
 status = apr_brigade_flatten(input_brigade, buff, &bufsiz);
 if (status != APR_SUCCESS) {
- ap_log_error(APLOG_MARK, APLOG_ERR, status, r->server,
- "proxy: apr_brigade_flatten");
+apr_brigade_destroy(input_brigade);
+ap_log_error(APLOG_MARK, APLOG_ERR, status, r->server,
+ "proxy: apr_brigade_flatten");
 return HTTP_INTERNAL_SERVER_ERROR;
 }
+apr_brigade_cleanup(input_brigade);
 
 ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, r->server,
  "proxy: got %d byte of data", bufsiz);
 if (bufsiz > 0) {
 status = ajp_send_data_msg(conn->sock, r, msg, bufsiz);
 if (status != APR_SUCCESS) {
+apr_brigade_destroy(input_brigade);
 ap_log_error(APLOG_MARK, APLOG_ERR, status, r->server,
- "proxy: request failed to %pI (%s)",
+ "proxy: send failed to %pI (%s)",
  conn->worker->cp->addr,
  conn->worker->hostname);
 return HTTP_SERVICE_UNAVAILABLE;
@@ -179,8 +191,9 @@
 status = ajp_read_header(conn->sock, r,
  (ajp_msg_t **)&(conn->data));
 if (status != APR_SUCCESS) {
+apr_brigade_destroy(input_brigade);
 ap_log_error(APLOG_MARK, APLOG_ERR, status, r->server,
- "proxy: request failed to %pI (%s)",
+ "proxy: read response failed from %pI (%s)",
  conn->worker->cp->addr,
   

Re: cvs commit: httpd-2.0/modules/proxy proxy_util.c

2004-08-13 Thread Jeff Trawick
On Fri, 13 Aug 2004 10:07:27 -0500, William A. Rowe, Jr.
<[EMAIL PROTECTED]> wrote:
> At 07:20 AM 8/13/2004, [EMAIL PROTECTED] wrote:
> >trawick 2004/08/13 05:20:53
> >
> >  Modified:modules/proxy proxy_util.c
> >  Log:
> >  axe some unused variables and don't log an error code that
> >  hasn't been initialized
> 
> >  -ap_log_error(APLOG_MARK, APLOG_ERR, rv, s,
> >  +ap_log_error(APLOG_MARK, APLOG_ERR, 0, s,
> >"proxy: %s: disabled connection for (%s)",
> >proxy_function, worker->hostname);
> 
> Shouldn't that be APLOG_ERR|APLOG_NOERRNO if rv isn't passed?

no.  1.3 required APLOG_NOERRNO so that it didn't look at errno;

the natural logic of formatting the error information when the
apr_status_t parameter is non-APR_SUCCESS is what is used

APLOG_NOERRNO is still defined but has no effect


Re: cvs commit: httpd-2.0/modules/proxy proxy_util.c

2004-08-13 Thread William A. Rowe, Jr.
At 07:20 AM 8/13/2004, [EMAIL PROTECTED] wrote:
>trawick 2004/08/13 05:20:53
>
>  Modified:modules/proxy proxy_util.c
>  Log:
>  axe some unused variables and don't log an error code that
>  hasn't been initialized


>  -ap_log_error(APLOG_MARK, APLOG_ERR, rv, s,
>  +ap_log_error(APLOG_MARK, APLOG_ERR, 0, s,
>"proxy: %s: disabled connection for (%s)",
>proxy_function, worker->hostname);

Shouldn't that be APLOG_ERR|APLOG_NOERRNO if rv isn't passed?

Bill




Re: [PATCH] fix child reclaim timing

2004-08-13 Thread Jeff Trawick
On Fri, 13 Aug 2004 14:51:23 +0100, Joe Orton <[EMAIL PROTECTED]> wrote:
> The 2.0 ap_reclaim_child_processes logic seems to be broken - it never
> resets the waittime variable as it did in 1.3; so the parent will wait
> for up to 23 minutes (sic) in total for a stuck child process.  (SIGSTOP
> a child and strace the parent to see for yourself)
> 
> This updates the logic to be a little more sane:
> 
> - at t + 16, 82, 344 ms, just waitpid()
> - at t + 425, 688, 1736 ms, waitpid() else SIGTERM the child
> - at t + 1.74 secs, waitpid() else SIGKILL the child
> - at t + 1.75, 1.82 secs, just waitpid()
> - at t + 2.08 secs, waitpid() else log "this child won't die"
> 
> Any comments?

Here is my take on what is wrong with current code:

1) It starts complaining a bit too soon.  Some third-party modules
have rather complicated child exit strategies.  Whether or not that is
good or bad (bad ;) ), it results in disturbing messages that wouldn't
have appeared if we were a little more patient (2-3 seconds).  Also, I
suspect that the use of threaded MPM affects how quickly the children
are exiting now on Unix.

2) It should never stop checking for exited processes less often than
1-2 seconds, even if it doesn't complain to error log that often. 
Like you say, current code can wait a VERY long time for child
processes to exit.  In practice, I see that it can wait a VERY long
time even after the last child has exited.

I'll agree that it should never wait so long, though I think around 15
or so seconds total is reasonable.  Exiting before children are gone
doesn't let Apache start up any more quickly; it just prevents
potentially-useful information about timing from getting logged to the
error log.

--/--

I wouldn't complain to error log at all until it has been 2 seconds,
and then I'd still wait around for 10-15 more.  But it has to check
every second so it finds out soon after all children have exited and
doesn't sleep needlessly.


[PATCH] fix child reclaim timing

2004-08-13 Thread Joe Orton
The 2.0 ap_reclaim_child_processes logic seems to be broken - it never
resets the waittime variable as it did in 1.3; so the parent will wait
for up to 23 minutes (sic) in total for a stuck child process.  (SIGSTOP
a child and strace the parent to see for yourself)

This updates the logic to be a little more sane:

- at t + 16, 82, 344 ms, just waitpid()
- at t + 425, 688, 1736 ms, waitpid() else SIGTERM the child
- at t + 1.74 secs, waitpid() else SIGKILL the child
- at t + 1.75, 1.82 secs, just waitpid()
- at t + 2.08 secs, waitpid() else log "this child won't die" 

Any comments?

Index: mpm_common.c
===
RCS file: /home/cvs/httpd-2.0/server/mpm_common.c,v
retrieving revision 1.120
diff -u -r1.120 mpm_common.c
--- mpm_common.c15 Mar 2004 23:08:41 -  1.120
+++ mpm_common.c13 Aug 2004 13:42:47 -
@@ -70,7 +70,7 @@
 
 ap_mpm_query(AP_MPMQ_MAX_DAEMON_USED, &max_daemons);
 
-for (tries = terminate ? 4 : 1; tries <= 9; ++tries) {
+for (tries = terminate ? 4 : 1; tries <= 10; ++tries) {
 /* don't want to hold up progress any more than
  * necessary, but we need to allow children a few moments to exit.
  * Set delay with an exponential backoff.
@@ -98,13 +98,15 @@
 switch (tries) {
 case 1: /*  16ms */
 case 2: /*  82ms */
+break;
+
 case 3: /* 344ms */
-case 4: /*  16ms */
+waittime = 16 * 1024;
 break;
-
-case 5: /*  82ms */
-case 6: /* 344ms */
-case 7: /* 1.4sec */
+
+case 4: /* 360ms */
+case 5: /* 425ms */
+case 6: /* 688ms */
 /* ok, now it's being annoying */
 ap_log_error(APLOG_MARK, APLOG_WARNING,
  0, ap_server_conf,
@@ -114,7 +116,7 @@
 kill(pid, SIGTERM);
 break;
 
-case 8: /*  6 sec */
+case 7: /*  1.74 sec */
 /* die child scum */
 ap_log_error(APLOG_MARK, APLOG_ERR,
  0, ap_server_conf,
@@ -132,9 +134,14 @@
  */
 kill_thread(pid);
 #endif
+waittime = 16 * 1024;
+break;
+
+case 8: /* 1.75 sec */
+case 9: /* 1.82 sec */
 break;
 
-case 9: /* 14 sec */
+case 10:/* 2.08 secs */
 /* gave it our best shot, but alas...  If this really
  * is a child we are trying to kill and it really hasn't
  * exited, we will likely fail to bind to the port


Re: [AJP] proxy status

2004-08-13 Thread Jeff Trawick
On Thu, 12 Aug 2004 09:19:59 -0700, Justin Erenkrantz
<[EMAIL PROTECTED]> wrote:
> --On Thursday, August 12, 2004 12:57 AM -0500 "William A. Rowe, Jr."
> <[EMAIL PROTECTED]> wrote:
> 
> > Although he's subscribed to all three lists, I'd ask that they go either
> > to [EMAIL PROTECTED] or [EMAIL PROTECTED]  The history of the discussions is just
> > as important as the actual code commits.
> 
> Can we please not use [EMAIL PROTECTED]  As long as the code resides in the
> main httpd repository, development discussion belongs on [EMAIL PROTECTED], IMHO.  --
> justin

+1


Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Glenn Strauss
On Thu, Aug 12, 2004 at 10:20:14AM -0700, Justin Erenkrantz wrote:
> --On Thursday, August 12, 2004 2:51 AM -0400 Glenn Strauss 
> <[EMAIL PROTECTED]> wrote:

> >of code duplication between modules.  For example, the behavior of
> >line-mode is vauge and requires that callers re-parse the brigade
> >to check for complete lines.  I've got a whole litany of things,
> 
> It's for the reason listed below...
> 
> >including bad code examples of brigade parsing in the Apache core,
> >but that's another post.
> 
> Well, pointers as to where things aren't right are certainly appreciated.

Certain design decisions result in situations that are annoying
and difficult to work with.

(Some moreso than others, and I am in no way suggesting that all
 of these can easily be fixed or even that they should be fixed)

- The way that all line-mode reads work, the caller must re-parse
  the returned brigade, repeating exactly what upstream filters did.
  The caller must read every bucket, check the length, and check that
  it is LF terminated, when the upstream filters already knew those
  answers.  And because partial lines are returned, the caller must
  do this parsing even when upstream knows that the line is not
  complete.  Also, because partial lines are returned, upstream
  filters do not know when a line length limit has been reached
  if portions of the line had been returned previously.

- ap_http_filter() blocks on read chunk lines and chunk trailers,
  even when APR_NONBLOCK_READ is requested.  Actually, anything that
  uses ap_rgetline() or ap_get_mime_headers() will block.

- apr_brigade_split_line() might return more than readbytes
  The entire bucket is returned when readbytes is exceeded, instead
  of bucket splitting at the limit.

- apr_brigade_split_line() in core_input_filter() uses HUGE_STRING_LEN
  as maximum line length; regardless of what caller specified in
  readbytes.

- Every filter that buffers buckets must reinvent the wheel to protect
  against the buildup of many, many buckets each containing 1-byte,
  or other bucket fragmentation fun.

- AP_MODE_SPECULATIVE only seems to work for reading bytes from connection
  filters.  If it were used during chunking (it is not in practice),
  then it might return portions of chunk lines or chunk trailers
  along with the data, and downstream caller would not know the
  difference.  In other words, AP_MODE_SPECULATIVE is not useful
  to too many filters unless the filter knows certain things about
  where it is in the filter stack.  That means that ap_rgetline()
  can not be used except by those same filters that are hyper-aware
  of their location in the filter stack and ramifications of pulling
  data speculatively.  As it is, AP_MODE_SPECULATIVE is a noticeable
  amount of extra work for the SSL and HTTP_IN filters.

- apr_bucket_read() does not bother to look at the value in &len
  that was passed in.  It always reads APR_BUCKET_BUFF_SIZE
  (8000 bytes) in the case of sockets and pipes.  That's good for
  efficiency, but there is no way for a caller to specify
  "only read X bytes" when it really, really means "only read X bytes"

- the close() routines for pipes and sockets burn a system call if
  the fd is already closed and marked as such.  This could easily
  be avoided by checking if the fd < 0 in the pipe or socket bucket
  before attempting the close() on the pipe or socket bucket fd


That's all I can think of at the moment.

Cheers,
Glenn


Re: RFE: ap_input_mode_t as bitflags

2004-08-13 Thread Glenn Strauss
On Thu, Aug 12, 2004 at 02:06:39PM -0700, Justin Erenkrantz wrote:
> --On Thursday, August 12, 2004 3:52 PM -0400 Glenn Strauss 
> <[EMAIL PROTECTED]> wrote:
> 
> >I saw so much repeated code for parsing brigades, that I created a
> >"readahead" API: ap_brigade_ra().  It is passed similar arguments as
> >those to input filters, and additionally is passed a readahead struct
> >and a readahead limit.  This abstraction acts as a buffer and parses
> >out bytes and lines, reading from upstream filters when it needs to.
> 
> Okay, I have no idea what this ap_brigade_ra() API is or what its intention 
> is based upon your description.

Code reuse and consistent input filter behavior. :)

> You keep referring to the 'readahead API' 
> as the magic solution for all of the issues I raised.  So, I respectfully 
> ask that you please provide this API with examples to this list if you want 
> to continue the discussion.

I'll try.  Apologies for the length.
The following is an attempt at some documentation.

The code below has _not_ been tested, and in fact compilation has not
even been attempted.  It's been cut-n-pasted from various places in
my code.  Still I would be happy to fix any mistakes that are pointed out.

Thanks!
Glenn

(If there are any questions of license, the code that is mine below
 is released under the revised BSD license.  Use it in good health.)



typedef struct filter_readahead_ctx_t {

apr_bucket_brigade *readahead;
apr_bucket_brigade *partial;
apr_bucket_brigade *lookahead;

/* (internal use) bucket following last in-memory data bucket */
apr_bucket *b;

/* length of data read ahead */
/* (up to end of brigade or to first bucket that is not in-memory) */
/* (includes length of data in both readahead and 'partial' brigades) */
/* caller MUST update this if it modifies readahead brigade */
apr_off_t len_readahead;

/* length of data in 'partial' brigade (all buckets are in-memory) */
/* caller MUST update this if it modifies 'partial' brigade */
apr_off_t len_partial;

/* (private) running count of data set aside from ra->readahead
 * when performing speculative reads (AP_MODE_PEEK) */
apr_off_t len_lookahead;

/* (private) running count of data from upstream (appended to readahead)
 * in excess of caller-specified max length to readahead when reading
 * from a non-in-memory bucket.  (On subsequent calls to read routines,
 * it is added to len_upstream to simulate reads from upstream) */
apr_off_t len_pending;

/* running count of data read from upstream (appended to readahead) */
/* (includes data that was subsequently accounted in len_downstream)*/
/* caller may reset this at any time */
apr_off_t len_upstream;

/* running count of data sent downstream (appended to caller bb) */
/* caller may reset this at any time */
apr_off_t len_downstream;

} ap_filter_ra_t;


AP_DECLARE(void)
ap_filter_ra_init(ap_filter_ra_t * const restrict ra,
  apr_pool_t * const restrict pool,
  apr_bucket_alloc_t * const restrict bucket_alloc)
{
memset(ra, '\0', sizeof(ap_filter_ra_t));
ra->readahead = apr_brigade_create(pool, bucket_alloc);

/* (created when needed)
ra->partial   = apr_brigade_create(pool, bucket_alloc);
ra->lookahead = apr_brigade_create(pool, bucket_alloc);
 */

ra->b = APR_BRIGADE_SENTINEL(ra->readahead);
}

/* (not strictly necessary, but frees brigades sooner than connection cleanup)*/
AP_DECLARE(void)
ap_filter_ra_destroy(ap_filter_ra_t * const restrict ra)
{
if (ra->readahead != NULL) {
apr_brigade_destroy(ra->readahead);
}
if (ra->lookahead != NULL) {
apr_brigade_destroy(ra->lookahead);
}
if (ra->partial   != NULL) {
apr_brigade_destroy(ra->partial);
}
}



/**
 * Read bytes from upstream.
 * @param ra readahead structure to keep readahead state
 * @param f current filter (not f->next)
 *(e.g. filter with which ra is associated)
 * @param bb brigade in which to return results
 * @param r_mode AP_MODE_BYTES (and not AP_MODE_LINE), plus optional flags
 *(neither or one of AP_MODE_PASS_SPECIALS or AP_MODE_PEEK, not both)
 *(@see ap_input_mode_t)
 * @param r_type APR_BLOCK_READ or APR_NONBLOCK_READ
 *(@see apr_read_type_e)
 * @param r_limit max number bytes to return in bb
 * @param ra_limit max number of bytes to readahead
 *(includes data buffered in ra and data read from upstream)
 *(if ra_limit <= 0, r_limit is used as ra_limit)
 * @return APR_SUCCESS and buckets appended to bb if successful
 * APR_EINVAL if AP_MODE_PASS_SPECIALS combined with AP_MODE_PEEK
 * or other return values from
 *   ap_get_brigade(), apr_bucket_read(), or apr_bucket_split()
 * @remark metadata buckets are return

Re: [AJP] proxy status

2004-08-13 Thread jean-frederic clere
Guenter Knauf wrote:
Hi,
That seems rational to me.  The reason for proposing [EMAIL PROTECTED] is so
that tomcat-dev'ers wouldn't have to swallow the full bandwidth of 
[EMAIL PROTECTED] (converse of the problem where they asked anyone in [EMAIL PROTECTED] to
follow [EMAIL PROTECTED] for the duration of that proxy_ajp development).
hahahahha! I get 5x++ more traffic from tomcat-dev than from [EMAIL PROTECTED] and
[EMAIL PROTECTED] together because: - the commit mails go to the same list as the
dicussion stuff. - there are so many folks subscribed unable to control their
mail server that nearly every day a couple of autoresponder mails come
through the list. - there are so many folks subscribed unable to protect
their machine from viruses so that every day some viruses come through the
list.
in addition its anyway a pain with the tomcat-dev list since it drops any
attachments which makes it harder for everyone to attach a patch. Reason: the
stupid and totally senseless footer. Even with the footer every few days
another idiot asks how to unsubscribe - I think that proofes enough that it
is useless!

So I ask our tomcat-dev'ers who are interested in proxy_ajp, proxy_balancer
 and so on - are you already subscribed/following [EMAIL PROTECTED]  Or do you 
feel a -strong- need for a lower-traffic list?  If no one complains loudly,
we will keep all proxy traffic on [EMAIL PROTECTED] (cc's to tomcat-dev if you feel
a point needs feedback from the tomcat connector folks.)
I'm fine with that. Since the code now moved into httpd HEAD anyway no jtc
commiter can commit any more; I f.e. have only karma for the connectors,
which means now I can only commit to the old dead code; so for me I see no
reason to keep subscribed to tomcat-dev list; the few posts which are really
about connectors I can read online, or wait a bit and then crawl though
bugzilla when the problems appear there
The idea when starting the ajp-proxy was to get it in httpd HEAD.
The first developpement were done in j-t-c (jakarta-tomcat-connectors) only 
because Henri, Mladen and I don't have commit in httpd and we wanted something 
running for testing very quicky.

Now the code is at its right place and we (as Jakarta committers) have to submit 
patches. (That is what we had done to get the commit rights in Jakarta).

my 2ct.
Guenter.
BTW: also it was asked more than once for a separate section only for
connectors in BugZilla - currently you have to do complex searches to get all
the connector issues together since they are bound to Tomcat releases, but
the connectors are developed independent from Tomcat.