Re: [community] 2.3.0 alpha on October 1?
>> (Speaking of pet peeves -- why does Apache handle so many things
>> besides HTTP, and yet I have to get other servers to handle certain
>> kinds of HTTP requests because Apache doesn't handle it well?)

100K concurrent requests, kept open.
Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
Patch is attached. I have tested the patch. It now prints \10 and \11 for
character values 0x8 and 0x9.

Regards,
Basant.

On Wed, Sep 10, 2008 at 04:00:42PM +0200, "Plüm, Rüdiger, VF-Group" wrote:
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > Sent: Wednesday, 10 September 2008 15:54
> > To: dev@httpd.apache.org
> > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> >
> > > > Can you elaborate why you chose "<" and ">"? I could not think of any
> > > > reasons behind it.
> > >
> > > Because these characters are currently displayed by sed and I saw no
> > > reason to change it.
> > >
> > > > sed's man page says:
> > > >   (2)l  List the pattern space on the standard output
> > > >         in an unambiguous form. Non-printing
> > > >         characters are spelled in two digit ASCII
> > > >         and long lines are folded.
> > > >
> > > > So 0x8 and 0x9 char values, which I believe are non-printable
> > > > characters, should be printed as *two* digit ASCII. So \10 and \11
> > > > look to me as conforming to the man page.
> > >
> > > As said, displaying two digit ASCII also makes sense and conforms to
> > > the man page. So go for it.
> > > BTW: Shouldn't it be \08 and \09 for 0x8 and 0x9 instead of \10 and \11?
> >
> > If we look at how the other characters are printed, the sed code is
> > using octal numbers. After \17 it changes to \20, \21. Similarly, after
> > \27 it changes to \30, \31 etc. So based on the above pattern, \10 and
> > \11 seem more consistent to me.
>
> Thanks for pointing that out. I missed that these were actual octal numbers.
> So yes, you are correct.
>
> Regards
>
> Rüdiger

Index: modules/filters/sed1.c
===================================================================
--- modules/filters/sed1.c (revision 692275)
+++ modules/filters/sed1.c (working copy)
@@ -33,8 +33,8 @@
 "\\05",
 "\\06",
 "\\07",
-"-<",
-"->",
+"\\10",
+"\\11",
 "\n",
 "\\13",
 "\\14",
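For readers following the octal discussion: the table entries in the patch are the two-digit octal spellings of the byte values, which is why 0x8 becomes \10 and 0x9 becomes \11 rather than \08 and \09. A minimal sketch of computing the same spelling follows. The helper name octal_escape is hypothetical and is not part of sed1.c, which uses a static lookup table instead.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from sed1.c): write the backslash-octal
 * spelling of one byte into out. Values below 0100 (64) get two octal
 * digits, matching the \01 ... \37 entries discussed in the thread;
 * larger values need three digits, so out must hold 5 bytes.
 */
static void octal_escape(unsigned char c, char out[5])
{
    /* "%02o" pads to at least two octal digits: 0x8 -> "10", 0x9 -> "11". */
    snprintf(out, 5, "\\%02o", (unsigned int)c);
}
```

Calling octal_escape(0x8, buf) leaves the four bytes '\\', '1', '0', '\0' in buf, which is the same text as the "\\10" string literal added by the patch.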
Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c
On 09/10/2008 05:54 AM, [EMAIL PROTECTED] wrote:
> Author: ianh
> Date: Tue Sep 9 20:53:59 2008
> New Revision: 693697
>
> URL: http://svn.apache.org/viewvc?rev=693697&view=rev
> Log:
> initial check in.
> this filter validates that the incoming request contains valid UTF8 characters.
>
> Added:
>     httpd/sandbox/mod_valid_utf8/
>     httpd/sandbox/mod_valid_utf8/README (with props)
>     httpd/sandbox/mod_valid_utf8/mod_valid_utf8.c (with props)
>
> Added: httpd/sandbox/mod_valid_utf8/mod_valid_utf8.c
> URL: http://svn.apache.org/viewvc/httpd/sandbox/mod_valid_utf8/mod_valid_utf8.c?rev=693697&view=auto
> ==============================================================================
> +
> +static apr_status_t utf8_in_filter(ap_filter_t *f,
> +                                   apr_bucket_brigade *bb,
> +                                   ap_input_mode_t mode,
> +                                   apr_read_type_e block,
> +                                   apr_off_t readbytes) {
> +    apr_status_t rv;
> +    utf8_ctx *ctx = f->ctx;
> +
> +    if (!ctx) {
> +        ctx = apr_pcalloc( f->c->pool, sizeof(*ctx));
> +        ctx->bb = apr_brigade_create( f->c->pool, f->c->bucket_alloc );
> +    }
> +    rv = ap_get_brigade(f->next, ctx->bb, mode, block, readbytes);
> +    if ( rv != APR_SUCCESS ) {
> +        return rv;
> +    }
> +
> +    while ( !APR_BRIGADE_EMPTY(ctx->bb) ) {
> +        apr_size_t length;
> +        char *buffer;
> +        const char *data;
> +        apr_bucket *cpy;
> +        int i ;
> +        apr_bucket *b;
> +
> +        b = APR_BRIGADE_FIRST(ctx->bb );
> +        APR_BUCKET_REMOVE(b);
> +        if ( APR_BUCKET_IS_EOS(b) ) {
> +            APR_BRIGADE_INSERT_TAIL( bb, b);
> +            break;
> +        }
> +        length = b->length;
> +
> +        rv = apr_bucket_read( b, &data, &length, APR_BLOCK_READ);
> +        if ( rv != APR_SUCCESS ) {
> +            return rv;
> +        }
> +
> +        buffer = validate_buffer(f->c->pool, data, &length);

What do you do when a multibyte character is split over multiple buckets?

Regards

Rüdiger
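One common way to deal with the split-character case raised here is to hold back the incomplete tail of each buffer and prepend it to the data from the next bucket before validating. The sketch below only shows the detection step, in plain C with no APR calls. The function name utf8_incomplete_tail is hypothetical and does not come from mod_valid_utf8.c; how the held-back bytes would be stored in the filter context is left open.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from mod_valid_utf8.c): return how many bytes
 * at the end of buf belong to a UTF-8 sequence that is cut off by the end
 * of the buffer, or 0 if the buffer ends on a character boundary.
 * Invalid input is not diagnosed here; that is left to the validator.
 */
static size_t utf8_incomplete_tail(const unsigned char *buf, size_t len)
{
    size_t back = 0;   /* continuation bytes found at the end */
    size_t need;       /* total length announced by the lead byte */
    size_t have;       /* bytes of that sequence present in buf */
    unsigned char lead;

    /* Step back over at most three continuation bytes (10xxxxxx). */
    while (back < len && back < 3 && (buf[len - 1 - back] & 0xC0) == 0x80) {
        back++;
    }
    if (back == len) {
        return 0;      /* empty buffer, or no lead byte in sight */
    }

    lead = buf[len - 1 - back];
    if ((lead & 0x80) == 0x00) {
        need = 1;      /* ASCII */
    } else if ((lead & 0xE0) == 0xC0) {
        need = 2;
    } else if ((lead & 0xF0) == 0xE0) {
        need = 3;
    } else if ((lead & 0xF8) == 0xF0) {
        need = 4;
    } else {
        return 0;      /* not a lead byte; let the validator reject it */
    }

    have = back + 1;
    return (need > have) ? have : 0;
}
```

A filter using this idea would validate only the first len minus the returned count bytes of a bucket and carry the remainder over to the next read.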
Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> -----Original Message-----
> From: [EMAIL PROTECTED]
> Sent: Wednesday, 10 September 2008 15:54
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
>
> > > Can you elaborate why you chose "<" and ">"? I could not think of any
> > > reasons behind it.
> >
> > Because these characters are currently displayed by sed and I saw no
> > reason to change it.
> >
> > > sed's man page says:
> > >   (2)l  List the pattern space on the standard output
> > >         in an unambiguous form. Non-printing
> > >         characters are spelled in two digit ASCII
> > >         and long lines are folded.
> > >
> > > So 0x8 and 0x9 char values, which I believe are non-printable
> > > characters, should be printed as *two* digit ASCII. So \10 and \11
> > > look to me as conforming to the man page.
> >
> > As said, displaying two digit ASCII also makes sense and conforms to
> > the man page. So go for it.
> > BTW: Shouldn't it be \08 and \09 for 0x8 and 0x9 instead of \10 and \11?
>
> If we look at how the other characters are printed, the sed code is
> using octal numbers. After \17 it changes to \20, \21. Similarly, after
> \27 it changes to \30, \31 etc. So based on the above pattern, \10 and
> \11 seem more consistent to me.

Thanks for pointing that out. I missed that these were actual octal numbers.
So yes, you are correct.

Regards

Rüdiger
Re: Query on ap_lingering_close
On Wed, Sep 10, 2008 at 9:46 AM, Arnab Ganguly <[EMAIL PROTECTED]> wrote:
> Hi All,
> I am getting a lot of CLOSE_WAIT, SYN_RECV and TIME_WAIT states when I run
> netstat, and as a result Apache pauses.
> What would be the ideal approach to solve the issue? I am planning an
> explicit call to ap_lingering_close after reading the client request.

Wouldn't that prevent Apache from writing a response?

Maybe [EMAIL PROTECTED] could help you identify why your server is hanging.

--
Eric Covener
[EMAIL PROTECTED]
Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
On Wed, Sep 10, 2008 at 03:41:21PM +0200, "Plüm, Rüdiger, VF-Group" wrote:
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > Sent: Wednesday, 10 September 2008 15:32
> > To: dev@httpd.apache.org
> > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> >
> > On Wed, Sep 10, 2008 at 10:10:20AM +0200, "Plüm, Rüdiger, VF-Group" wrote:
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED]
> > > > Sent: Wednesday, 10 September 2008 08:56
> > > > To: dev@httpd.apache.org
> > > > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> > > >
> > > > I investigated further. I wrote a test file having binary
> > > > characters from 0 to 31:
> > > > $ od -c out.txt
> > > > 000 \0 \n 001 \n 002 \n 003 \n 004 \n 005 \n 006 \n 007 \n
> > > > 020 \b \n \t \n \n \n 013 \n \f \n \r \n 016 \n 017 \n
> > > > ...
> > > >
> > > > And a small sed script:
> > > > $ cat one.sed
> > > > l
> > > > d
> > > >
> > > > The sed script just runs the "l" command for each line.
> > > > $ /usr/ucb/sed -f one.sed out.txt > out1.txt
> > > >
> > > > Here is the output of out1.txt:
> > > > $ od -c out1.txt
> > > > 000 \n \ 0 1 \n \ 0 2 \n \ 0 3 \n \ 0 4
> > > > 020 \n \ 0 5 \n \ 0 6 \n \ 0 7 \n - \b <
> > > > 040 \n - \b > \n \n \n \ 1 3 \n \ 1 4 \n \
> > > > 060 1 5 \n \ 1 6 \n \ 1 7 \n \ 2 0 \n \
> > > > ...
> > > >
> > > > $ cat out1.txt
> > > > \01
> > > > \02
> > > > \03
> > > > \04
> > > > \05
> > > > \06
> > > > \07
> > > > <
> > > > >
> > > >
> > > >
> > > > \13
> > > > \14
> > > >
> > > > ---
> > > >
> > > > So for some strange reason:
> > > > 0x8 is converted to "-\b<" and
> > > > 0x9 is converted to "-\b>"
> > > >
> > > > That's what we see in the "trans" variable.
> > > >
> > > > Do you think it could be a bug in the original sed, and should we
> > > > correct it?
> > >
> > > I guess it is a bug in the original sed and it should be corrected.
> > > IMHO it should be sufficient to replace
> > >
> > > -\b<
> > >
> > > and
> > >
> > > -\b>
> > >
> > > with
> > >
> > > <
> > >
> > > and
> > >
> > > >
> > Can you elaborate why you chose "<" and ">"? I could not think of any
> > reasons behind it.
>
> Because these characters are currently displayed by sed and I saw no reason
> to change it.
>
> > sed's man page says:
> >   (2)l  List the pattern space on the standard output
> >         in an unambiguous form. Non-printing
> >         characters are spelled in two digit ASCII
> >         and long lines are folded.
> >
> > So 0x8 and 0x9 char values, which I believe are non-printable
> > characters, should be printed as *two* digit ASCII. So \10 and \11
> > look to me as conforming to the man page.
>
> As said, displaying two digit ASCII also makes sense and conforms to the
> man page. So go for it.
> BTW: Shouldn't it be \08 and \09 for 0x8 and 0x9 instead of \10 and \11?

If we look at how the other characters are printed, the sed code is using
octal numbers. After \17 it changes to \20, \21. Similarly, after \27 it
changes to \30, \31 etc. So based on the above pattern, \10 and \11 seem
more consistent to me.

For reference, the output of the converted characters (0-31) by the "l"
command is given below.

$ cat out1.txt
\01
\02
\03
\04
\05
\06
\07
<
>


\13
\14
\15
\16
\17
\20
\21
\22
\23
\24
\25
\26
\27
\30
\31
\32
\33
\34
\35
\36
\37

I will submit the patch today. Thanks for noticing this.

Regards,
Basant.
Query on ap_lingering_close
Hi All,
I am getting a lot of CLOSE_WAIT, SYN_RECV and TIME_WAIT states when I run
netstat, and as a result Apache pauses.
What would be the ideal approach to solve the issue? I am planning an
explicit call to ap_lingering_close after reading the client request.
Also, I see an #ifdef CORE_PRIVATE in http_connection.h where
ap_lingering_close is declared, and the macro #define CORE_PRIVATE is defined
in mod_proxy.h. But I don't use mod_proxy, so is it going to impact anything?

For TIME_WAIT, I plan to reduce the tcp_max_tw_buckets value. Is that going
to have any impact?

For SYN_RECV, I plan to increase net.ipv4.tcp_max_syn_backlog, decrease
tcp_synack_retries and enable tcp_syncookies.

If you have any suggestions, please let me know.
Thanks in advance.
-A
Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> -----Original Message-----
> From: [EMAIL PROTECTED]
> Sent: Wednesday, 10 September 2008 15:32
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
>
> On Wed, Sep 10, 2008 at 10:10:20AM +0200, "Plüm, Rüdiger, VF-Group" wrote:
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > Sent: Wednesday, 10 September 2008 08:56
> > > To: dev@httpd.apache.org
> > > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> > >
> > > I investigated further. I wrote a test file having binary
> > > characters from 0 to 31:
> > > $ od -c out.txt
> > > 000 \0 \n 001 \n 002 \n 003 \n 004 \n 005 \n 006 \n 007 \n
> > > 020 \b \n \t \n \n \n 013 \n \f \n \r \n 016 \n 017 \n
> > > ...
> > >
> > > And a small sed script:
> > > $ cat one.sed
> > > l
> > > d
> > >
> > > The sed script just runs the "l" command for each line.
> > > $ /usr/ucb/sed -f one.sed out.txt > out1.txt
> > >
> > > Here is the output of out1.txt:
> > > $ od -c out1.txt
> > > 000 \n \ 0 1 \n \ 0 2 \n \ 0 3 \n \ 0 4
> > > 020 \n \ 0 5 \n \ 0 6 \n \ 0 7 \n - \b <
> > > 040 \n - \b > \n \n \n \ 1 3 \n \ 1 4 \n \
> > > 060 1 5 \n \ 1 6 \n \ 1 7 \n \ 2 0 \n \
> > > ...
> > >
> > > $ cat out1.txt
> > > \01
> > > \02
> > > \03
> > > \04
> > > \05
> > > \06
> > > \07
> > > <
> > > >
> > >
> > >
> > > \13
> > > \14
> > >
> > > ---
> > >
> > > So for some strange reason:
> > > 0x8 is converted to "-\b<" and
> > > 0x9 is converted to "-\b>"
> > >
> > > That's what we see in the "trans" variable.
> > >
> > > Do you think it could be a bug in the original sed, and should we
> > > correct it?
> >
> > I guess it is a bug in the original sed and it should be corrected.
> > IMHO it should be sufficient to replace
> >
> > -\b<
> >
> > and
> >
> > -\b>
> >
> > with
> >
> > <
> >
> > and
> >
> > >
> Can you elaborate why you chose "<" and ">"? I could not think of any
> reasons behind it.

Because these characters are currently displayed by sed and I saw no reason
to change it.

> sed's man page says:
>   (2)l  List the pattern space on the standard output
>         in an unambiguous form. Non-printing
>         characters are spelled in two digit ASCII
>         and long lines are folded.
>
> So 0x8 and 0x9 char values, which I believe are non-printable
> characters, should be printed as *two* digit ASCII. So \10 and \11
> look to me as conforming to the man page.

As said, displaying two digit ASCII also makes sense and conforms to the man
page. So go for it.
BTW: Shouldn't it be \08 and \09 for 0x8 and 0x9 instead of \10 and \11?

Regards

Rüdiger
Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
On Wed, Sep 10, 2008 at 10:10:20AM +0200, "Plüm, Rüdiger, VF-Group" wrote:
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > Sent: Wednesday, 10 September 2008 08:56
> > To: dev@httpd.apache.org
> > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> >
> > I investigated further. I wrote a test file having binary
> > characters from 0 to 31:
> > $ od -c out.txt
> > 000 \0 \n 001 \n 002 \n 003 \n 004 \n 005 \n 006 \n 007 \n
> > 020 \b \n \t \n \n \n 013 \n \f \n \r \n 016 \n 017 \n
> > ...
> >
> > And a small sed script:
> > $ cat one.sed
> > l
> > d
> >
> > The sed script just runs the "l" command for each line.
> > $ /usr/ucb/sed -f one.sed out.txt > out1.txt
> >
> > Here is the output of out1.txt:
> > $ od -c out1.txt
> > 000 \n \ 0 1 \n \ 0 2 \n \ 0 3 \n \ 0 4
> > 020 \n \ 0 5 \n \ 0 6 \n \ 0 7 \n - \b <
> > 040 \n - \b > \n \n \n \ 1 3 \n \ 1 4 \n \
> > 060 1 5 \n \ 1 6 \n \ 1 7 \n \ 2 0 \n \
> > ...
> >
> > $ cat out1.txt
> > \01
> > \02
> > \03
> > \04
> > \05
> > \06
> > \07
> > <
> > >
> >
> >
> > \13
> > \14
> >
> > ---
> >
> > So for some strange reason:
> > 0x8 is converted to "-\b<" and
> > 0x9 is converted to "-\b>"
> >
> > That's what we see in the "trans" variable.
> >
> > Do you think it could be a bug in the original sed, and should we
> > correct it?
>
> I guess it is a bug in the original sed and it should be corrected.
> IMHO it should be sufficient to replace
>
> -\b<
>
> and
>
> -\b>
>
> with
>
> <
>
> and
>
> >

Can you elaborate why you chose "<" and ">"? I could not think of any
reasons behind it.

sed's man page says:
  (2)l  List the pattern space on the standard output
        in an unambiguous form. Non-printing
        characters are spelled in two digit ASCII
        and long lines are folded.

So 0x8 and 0x9 char values, which I believe are non-printable characters,
should be printed as *two* digit ASCII. So \10 and \11 look to me as
conforming to the man page.

Regards,
Basant.
Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c
Thanks for the feedback. I'll fix the code soon.

Nick Kew wrote:
> On Wed, 10 Sep 2008 03:54:00 - [EMAIL PROTECTED] wrote:
>
> > Author: ianh
> > Date: Tue Sep 9 20:53:59 2008
> > New Revision: 693697
> >
> > URL: http://svn.apache.org/viewvc?rev=693697&view=rev
> > Log:
> > initial check in.
> > this filter validates that the incoming request contains valid UTF8
> > characters.
>
> Why? Last time I looked, incoming charsets were indeed a problem
> area, but a browser submitting an HTML form would de-facto use the
> same charset as the form. Not necessarily utf-8.
>
> Not to mention the many other use cases for sending non-utf8 data.
>
> > +static char* validate_buffer( apr_pool_t *p, const char* inbuf, apr_size_t* length)
> > +{
>
> Looks like a potential util_ function. Cousin to both apr_uri and
> apr_xlate.
>
> > +        if (inbuf[i] == '%' ) {
> > +            if ((i < len -2 ) && ishexnumber( inbuf[i+1]) && ishexnumber( inbuf[i+2])) {
>
> ... is all very well, but
>
> > +        else {
> > +            buffer[j++]=inbuf[i++];
> > +        }
>
> Shouldn't that at least be marked /* FIXME */ where you potentially
> let through the chars you're supposed to block?
>
> > +    ap_hook_pre_connection(utf8_pre_conn, NULL, NULL, APR_HOOK_MIDDLE);
> > +
> > +    ap_register_input_filter(utf8_filter_name, utf8_in_filter, NULL,
> > +                             AP_FTYPE_NETWORK - 1);
>
> Huh? Isn't that before mod_ssl, let alone mod_deflate, mod_charset?
>
> And no escape path for binary data either. It'll cripple any server
> with it loaded!
Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c
Nick Kew wrote:
> Looks like a potential util_ function. Cousin to both apr_uri and
> apr_xlate.

xlate() rejects such nonsense, so apr shouldn't need it. It's only an
issue in other ecosystems behind httpd, such as dodging the recent Tomcat
vulnerability.
Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c
Nick Kew wrote:
> Last time I looked, incoming charsets were indeed a problem
> area, but a browser submitting an HTML form would de-facto use the
> same charset as the form. Not necessarily utf-8.

It's worthwhile to add some logic to 'step out of the way' when the
charset is inferred as other-than utf-8.
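One way the 'step out of the way' logic might look, sketched in plain C: inspect the charset parameter of the request's Content-Type value and skip validation when it names something other than UTF-8. Everything here is an assumption for illustration. The function names are hypothetical, the module as committed has no such check, the prefix match on "utf-8" is a simplification, and treating a missing header or missing charset parameter as "validate" is an arbitrary default that a real module would need to decide on.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive test: does s begin with prefix? */
static int starts_with_ci(const char *s, const char *prefix)
{
    while (*prefix) {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix)) {
            return 0;   /* also covers s ending before prefix does */
        }
        s++;
        prefix++;
    }
    return 1;
}

/*
 * Hypothetical policy check (not from mod_valid_utf8.c): return 1 if the
 * body should be validated as UTF-8, 0 if the filter should step aside.
 * Assumption: no header, or no charset parameter, means "validate".
 */
static int should_validate_utf8(const char *content_type)
{
    const char *p;

    if (content_type == NULL) {
        return 1;
    }
    for (p = content_type; *p; p++) {
        if (starts_with_ci(p, "charset=")) {
            p += strlen("charset=");
            if (*p == '"') {
                p++;    /* tolerate a quoted parameter value */
            }
            return starts_with_ci(p, "utf-8") || starts_with_ci(p, "utf8");
        }
    }
    return 1;
}
```

With this default, "text/html; charset=ISO-8859-1" makes the filter step aside, while a form post without a charset parameter is still validated.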
Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c
On Wed, 10 Sep 2008 03:54:00 - [EMAIL PROTECTED] wrote:

> Author: ianh
> Date: Tue Sep 9 20:53:59 2008
> New Revision: 693697
>
> URL: http://svn.apache.org/viewvc?rev=693697&view=rev
> Log:
> initial check in.
> this filter validates that the incoming request contains valid UTF8
> characters.

Why? Last time I looked, incoming charsets were indeed a problem
area, but a browser submitting an HTML form would de-facto use the
same charset as the form. Not necessarily utf-8.

Not to mention the many other use cases for sending non-utf8 data.

> +static char* validate_buffer( apr_pool_t *p, const char* inbuf, apr_size_t* length)
> +{

Looks like a potential util_ function. Cousin to both apr_uri and
apr_xlate.

> +        if (inbuf[i] == '%' ) {
> +            if ((i < len -2 ) && ishexnumber( inbuf[i+1]) && ishexnumber( inbuf[i+2])) {

... is all very well, but

> +        else {
> +            buffer[j++]=inbuf[i++];
> +        }

Shouldn't that at least be marked /* FIXME */ where you potentially
let through the chars you're supposed to block?

> +    ap_hook_pre_connection(utf8_pre_conn, NULL, NULL, APR_HOOK_MIDDLE);
> +
> +    ap_register_input_filter(utf8_filter_name, utf8_in_filter, NULL,
> +                             AP_FTYPE_NETWORK - 1);

Huh? Isn't that before mod_ssl, let alone mod_deflate, mod_charset?

And no escape path for binary data either. It'll cripple any server
with it loaded!

--
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/
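For illustration, the percent-escape check quoted in this message can be sketched in portable C. This is not the code from mod_valid_utf8.c: the names percent_decode and hex_value are hypothetical, isxdigit from <ctype.h> stands in for the BSD-specific ishexnumber, and the bound is written as i + 2 < len so that it cannot wrap around if len is an unsigned type smaller than 2 (an assumption about the original's declaration, which is not shown in the quoted excerpt). Like the quoted else branch, the sketch copies anything that is not a well-formed escape through unchanged, so it does not address the FIXME concern above.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller guarantees isxdigit(c). */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    return tolower(c) - 'a' + 10;
}

/*
 * Hypothetical sketch: decode %XX escapes from in[0..len) into out,
 * which must have room for at least len bytes. Returns the number of
 * bytes written. Malformed escapes are copied through literally.
 */
static size_t percent_decode(const char *in, size_t len, char *out)
{
    size_t i = 0;
    size_t j = 0;

    while (i < len) {
        if (in[i] == '%' && i + 2 < len
            && isxdigit((unsigned char)in[i + 1])
            && isxdigit((unsigned char)in[i + 2])) {
            out[j++] = (char)(hex_value((unsigned char)in[i + 1]) * 16
                              + hex_value((unsigned char)in[i + 2]));
            i += 3;
        } else {
            out[j++] = in[i++];
        }
    }
    return j;
}
```

A validator would then run its UTF-8 checks on the decoded bytes rather than on the raw request text.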
Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> -----Original Message-----
> From: [EMAIL PROTECTED]
> Sent: Wednesday, 10 September 2008 08:56
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
>
> I investigated further. I wrote a test file having binary
> characters from 0 to 31:
> $ od -c out.txt
> 000 \0 \n 001 \n 002 \n 003 \n 004 \n 005 \n 006 \n 007 \n
> 020 \b \n \t \n \n \n 013 \n \f \n \r \n 016 \n 017 \n
> ...
>
> And a small sed script:
> $ cat one.sed
> l
> d
>
> The sed script just runs the "l" command for each line.
> $ /usr/ucb/sed -f one.sed out.txt > out1.txt
>
> Here is the output of out1.txt:
> $ od -c out1.txt
> 000 \n \ 0 1 \n \ 0 2 \n \ 0 3 \n \ 0 4
> 020 \n \ 0 5 \n \ 0 6 \n \ 0 7 \n - \b <
> 040 \n - \b > \n \n \n \ 1 3 \n \ 1 4 \n \
> 060 1 5 \n \ 1 6 \n \ 1 7 \n \ 2 0 \n \
> ...
>
> $ cat out1.txt
> \01
> \02
> \03
> \04
> \05
> \06
> \07
> <
> >
>
>
> \13
> \14
>
> ---
>
> So for some strange reason:
> 0x8 is converted to "-\b<" and
> 0x9 is converted to "-\b>"
>
> That's what we see in the "trans" variable.
>
> Do you think it could be a bug in the original sed, and should we
> correct it?

I guess it is a bug in the original sed and it should be corrected.
IMHO it should be sufficient to replace

-\b<

and

-\b>

with

<

and

>

> It should probably print "\10" and "\11".

This would be an option, but I wouldn't go this far. It is still fine with me
to print < and >.

> BTW /usr/bin/sed has exactly the same behavior.
>
> It sounds strange, though, that this was never caught in the sed code.

Some issues last longer than other ones :-)

Regards

Rüdiger