Re: [community] 2.3.0 alpha on October 1?

2008-09-10 Thread steve
>> (Speaking of pet peeves -- why does Apache handle so many things
>> besides HTTP, and yet I have to get other servers to handle certain
>> kinds of HTTP requests because Apache doesn't handle it well?)

100K concurrent requests, kept open.


Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/

2008-09-10 Thread Basant Kumar kukreja
Patch is attached. I have tested the patch. It now prints \10 and \11 for
character values 0x8 and 0x9.

Regards,
Basant.

On Wed, Sep 10, 2008 at 04:00:42PM +0200, "Plüm, Rüdiger, VF-Group" wrote:
>  
> 
> > -----Original Message-----
> > From: [EMAIL PROTECTED] 
> > Sent: Wednesday, September 10, 2008 15:54
> > To: dev@httpd.apache.org
> > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> > 
> 
> > > > Can you elaborate why you chose "<" and ">"? I could not think of any
> > > > reason behind it.
> > > 
> > > Because these characters are currently displayed by sed and I saw no
> > > reason to change it.
> > > 
> > > > sed's man page says :
> > > > (2)l  List the pattern space on the standard output
> > > >       in an unambiguous form. Non-printing
> > > >       characters are spelled in two digit ASCII
> > > >       and long lines are folded.
> > > > 
> > > > So the char values 0x8 and 0x9, which I believe are non-printable
> > > > characters, should be printed as *two* digit ASCII. So \10 and \11 look
> > > > to me to conform to the man page.
> > > 
> > > As I said, displaying two-digit ASCII also makes sense and conforms to the
> > > man page. So go for it.
> > > BTW: Shouldn't it be \08 and \09 for 0x8 and 0x9 instead of \10 and \11?
> > If we look at how the other characters are printed, the sed code is using
> > octal numbers. After \17 it changes to \20, \21. Similarly, after \27 it
> > changes to \30, \31 etc. So based on the above pattern, \10 and \11 seem
> > more consistent to me.
> 
> Thanks for pointing out. I missed that these were actual octal numbers.
> So yes, you are correct.
> 
> Regards
> 
> Rüdiger
> 
Index: modules/filters/sed1.c
===================================================================
--- modules/filters/sed1.c  (revision 692275)
+++ modules/filters/sed1.c  (working copy)
@@ -33,8 +33,8 @@
 "\\05",
 "\\06",
 "\\07",
-"-<",
-"->",
+"\\10",
+"\\11",
 "\n",
 "\\13",
 "\\14",


Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c

2008-09-10 Thread Ruediger Pluem



On 09/10/2008 05:54 AM, [EMAIL PROTECTED] wrote:
> Author: ianh
> Date: Tue Sep  9 20:53:59 2008
> New Revision: 693697
>
> URL: http://svn.apache.org/viewvc?rev=693697&view=rev
> Log:
> initial check in.
> this filter validates that the incoming request contains valid UTF8 characters.
>
> Added:
>     httpd/sandbox/mod_valid_utf8/
>     httpd/sandbox/mod_valid_utf8/README   (with props)
>     httpd/sandbox/mod_valid_utf8/mod_valid_utf8.c   (with props)
>
> Added: httpd/sandbox/mod_valid_utf8/mod_valid_utf8.c
> URL: http://svn.apache.org/viewvc/httpd/sandbox/mod_valid_utf8/mod_valid_utf8.c?rev=693697&view=auto
> ==============================================================================
>
> +
> +static apr_status_t utf8_in_filter(ap_filter_t *f,
> +                                   apr_bucket_brigade *bb,
> +                                   ap_input_mode_t mode,
> +                                   apr_read_type_e block,
> +                                   apr_off_t readbytes) {
> +    apr_status_t rv;
> +    utf8_ctx *ctx = f->ctx;
> +
> +    if (!ctx) {
> +        ctx = apr_pcalloc( f->c->pool, sizeof(*ctx));
> +        ctx->bb = apr_brigade_create( f->c->pool, f->c->bucket_alloc );
> +    }
> +    rv = ap_get_brigade(f->next, ctx->bb, mode, block, readbytes);
> +    if ( rv != APR_SUCCESS ) {
> +        return rv;
> +    }
> +
> +    while ( !APR_BRIGADE_EMPTY(ctx->bb) ) {
> +        apr_size_t length;
> +        char *buffer;
> +        const char *data;
> +        apr_bucket *cpy;
> +        int i ;
> +        apr_bucket *b;
> +
> +        b = APR_BRIGADE_FIRST(ctx->bb );
> +        APR_BUCKET_REMOVE(b);
> +        if ( APR_BUCKET_IS_EOS(b) ) {
> +            APR_BRIGADE_INSERT_TAIL( bb, b);
> +            break;
> +        }
> +        length = b->length;
> +
> +        rv = apr_bucket_read( b, &data, &length, APR_BLOCK_READ);
> +        if ( rv != APR_SUCCESS ) {
> +            return rv;
> +        }
> +
> +        buffer = validate_buffer(f->c->pool, data, &length);


What do you do when a multibyte character is split over multiple buckets?

Regards

Rüdiger
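
One possible way to cope with a character that straddles two buckets is to
keep a small decoder state between reads. The following is only a minimal
sketch in plain C, not the module's code: utf8_state, utf8_state_init,
utf8_feed and utf8_finished are hypothetical names, and the byte ranges follow
the well-formed UTF-8 table of the Unicode standard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical carry-over state; not part of mod_valid_utf8.c. */
typedef struct {
    unsigned int need;  /* continuation bytes still expected */
    unsigned char lo;   /* lowest value allowed for the next continuation byte */
    unsigned char hi;   /* highest value allowed for the next continuation byte */
} utf8_state;

static void utf8_state_init(utf8_state *st)
{
    assert(st != NULL);
    st->need = 0;
    st->lo = 0x80;
    st->hi = 0xBF;
}

/* Feed one chunk (for example, the data of one bucket).
 * Returns 1 if everything seen so far is valid, 0 on the first invalid byte.
 * A sequence cut off at the end of the chunk is not an error; it is
 * remembered in *st and completed by the next call. */
static int utf8_feed(utf8_state *st, const unsigned char *buf, size_t len)
{
    size_t i;

    assert(st != NULL && (buf != NULL || len == 0));
    for (i = 0; i < len; i++) {
        unsigned char c = buf[i];

        if (st->need > 0) {
            /* Inside a multibyte sequence: expect a continuation byte. */
            if (c < st->lo || c > st->hi) {
                return 0;
            }
            st->need--;
            st->lo = 0x80;  /* only the first continuation byte is restricted */
            st->hi = 0xBF;
        }
        else if (c < 0x80) {
            /* Plain ASCII, nothing to remember. */
        }
        else if (c >= 0xC2 && c <= 0xDF) { st->need = 1; }
        else if (c == 0xE0)              { st->need = 2; st->lo = 0xA0; }
        else if (c >= 0xE1 && c <= 0xEC) { st->need = 2; }
        else if (c == 0xED)              { st->need = 2; st->hi = 0x9F; }
        else if (c >= 0xEE && c <= 0xEF) { st->need = 2; }
        else if (c == 0xF0)              { st->need = 3; st->lo = 0x90; }
        else if (c >= 0xF1 && c <= 0xF3) { st->need = 3; }
        else if (c == 0xF4)              { st->need = 3; st->hi = 0x8F; }
        else {
            return 0;  /* 0x80-0xC1 and 0xF5-0xFF can never start a sequence */
        }
    }
    return 1;
}

/* True when no sequence is left half-finished (to be checked at end of stream). */
static int utf8_finished(const utf8_state *st)
{
    assert(st != NULL);
    return st->need == 0;
}
```

In a filter, such a state would presumably live in the filter context next to
ctx->bb, with utf8_finished checked once the EOS bucket is seen.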


Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/

2008-09-10 Thread Plüm, Rüdiger, VF-Group
 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> Sent: Wednesday, September 10, 2008 15:54
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> 

> > > Can you elaborate why you chose "<" and ">"? I could not think of any
> > > reason behind it.
> > 
> > Because these characters are currently displayed by sed and I saw no
> > reason to change it.
> > 
> > > sed's man page says :
> > > (2)l  List the pattern space on the standard output
> > >       in an unambiguous form. Non-printing
> > >       characters are spelled in two digit ASCII
> > >       and long lines are folded.
> > > 
> > > So the char values 0x8 and 0x9, which I believe are non-printable
> > > characters, should be printed as *two* digit ASCII. So \10 and \11 look
> > > to me to conform to the man page.
> > 
> > As I said, displaying two-digit ASCII also makes sense and conforms to the
> > man page. So go for it.
> > BTW: Shouldn't it be \08 and \09 for 0x8 and 0x9 instead of \10 and \11?
> If we look at how the other characters are printed, the sed code is using
> octal numbers. After \17 it changes to \20, \21. Similarly, after \27 it
> changes to \30, \31 etc. So based on the above pattern, \10 and \11 seem
> more consistent to me.

Thanks for pointing out. I missed that these were actual octal numbers.
So yes, you are correct.

Regards

Rüdiger



Re: Query on ap_lingering_close

2008-09-10 Thread Eric Covener
On Wed, Sep 10, 2008 at 9:46 AM, Arnab Ganguly <[EMAIL PROTECTED]> wrote:
> Hi All,
> I am seeing a lot of connections in the CLOSE_WAIT, SYN_RECV and TIME_WAIT
> states when I run netstat, and as a result Apache pauses.
> What would be the ideal approach to solve the issue? I am planning an
> explicit call to ap_lingering_close after reading the client request.

Wouldn't that prevent Apache from writing a response?

Maybe [EMAIL PROTECTED] could help you identify why your server is hanging.

-- 
Eric Covener
[EMAIL PROTECTED]


Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/

2008-09-10 Thread Basant Kukreja
On Wed, Sep 10, 2008 at 03:41:21PM +0200, "Plüm, Rüdiger, VF-Group" wrote:
>  
> 
> > -----Original Message-----
> > From: [EMAIL PROTECTED] 
> > Sent: Wednesday, September 10, 2008 15:32
> > To: dev@httpd.apache.org
> > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> > 
> > On Wed, Sep 10, 2008 at 10:10:20AM +0200, "Plüm, Rüdiger, 
> > VF-Group" wrote:
> > >  
> > > 
> > > > -----Original Message-----
> > > > From: [EMAIL PROTECTED] 
> > > > Sent: Wednesday, September 10, 2008 08:56
> > > > To: dev@httpd.apache.org
> > > > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> > > > 
> > > 
> > > > I investigated further. I wrote a test file having binary 
> > > > character from 0 to 31:
> > > > $ od -c out.txt
> > > > 000  \0  \n 001  \n 002  \n 003  \n 004  \n 005  \n 006  \n 007  \n
> > > > 020  \b  \n  \t  \n  \n  \n 013  \n  \f  \n  \r  \n 016  \n 017  \n
> > > > ...
> > > > 
> > > > And a small sed script :
> > > >  $ cat one.sed
> > > > l
> > > > d
> > > > 
> > > > Sed script just runs the "l" command for each line.
> > > > $ /usr/ucb/sed -f one.sed out.txt  > out1.txt
> > > > 
> > > > Here is the output of out1.txt
> > > > $ od -c out1.txt
> > > > 000  \n   \   0   1  \n   \   0   2  \n   \   0   3  \n   \   0   4
> > > > 020  \n   \   0   5  \n   \   0   6  \n   \   0   7  \n   -  \b   <
> > > > 040  \n   -  \b   >  \n  \n  \n   \   1   3  \n   \   1   4  \n   \
> > > > 060   1   5  \n   \   1   6  \n   \   1   7  \n   \   2   0  \n   \
> > > > ...
> > > > 
> > > > $ cat out1.txt
> > > > \01
> > > > \02
> > > > \03
> > > > \04
> > > > \05
> > > > \06
> > > > \07
> > > > <
> > > > >
> > > > 
> > > > 
> > > > \13
> > > > \14
> > > > 
> > > > ---
> > > > 
> > > > So for some strange reason :
> > > > 0x8 is converted to "-\b<" and
> > > > 0x9 is converted to "-\b>"
> > > > 
> > > > That's what we see in "trans" variable.
> > > > 
> > > > Do you think it could be a bug in original sed and should we 
> > > > correct it? 
> > > 
> > > I guess it is a bug in original sed and it should be corrected.
> > > IMHO it should be sufficient to replace
> > > 
> > > -\b<
> > > 
> > > and
> > > 
> > > -\b>
> > > 
> > > with
> > > 
> > > <
> > > 
> > > and
> > > 
> > > >
> > Can you elaborate why you chose "<" and ">"? I could not think of any
> > reason behind it.
> 
> Because these characters are currently displayed by sed and I saw no reason
> to change it.
> 
> > sed's man page says :
> > (2)l  List the pattern space on the standard output
> >       in an unambiguous form. Non-printing
> >       characters are spelled in two digit ASCII
> >       and long lines are folded.
> > 
> > So the char values 0x8 and 0x9, which I believe are non-printable
> > characters, should be printed as *two* digit ASCII. So \10 and \11 look
> > to me to conform to the man page.
> 
> As I said, displaying two-digit ASCII also makes sense and conforms to the
> man page. So go for it.
> BTW: Shouldn't it be \08 and \09 for 0x8 and 0x9 instead of \10 and \11?
If we look at how the other characters are printed, the sed code is using
octal numbers. After \17 it changes to \20, \21. Similarly, after \27 it
changes to \30, \31 etc. So based on the above pattern, \10 and \11 seem
more consistent to me.

For reference, the output of the "l" command for the converted characters
(0-31) is given below.

$ cat out1.txt

\01
\02
\03
\04
\05
\06
\07
<
>


\13
\14
\15
\16
\17
\20
\21
\22
\23
\24
\25
\26
\27
\30
\31
\32
\33
\34
\35
\36
\37

I will submit the patch today. Thanks for noticing this.

Regards,
Basant.
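
The octal spelling discussed in this message can be reproduced with
printf-style formatting. This is only a minimal sketch: octal_escape is a
hypothetical helper and not part of sed1.c, which, judging by the patch, uses
a static table of strings instead.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not taken from sed1.c.
 * Writes a backslash followed by the octal value of c (at least two digits)
 * into out, and returns the number of characters snprintf would have written. */
static int octal_escape(unsigned char c, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    return snprintf(out, outlen, "\\%02o", (unsigned int)c);
}
```

With this, 0x8 and 0x9 come out as \10 and \11, and 0x1f as \37, which matches
the pattern in the listing above. Values of 0x40 and above need three octal
digits.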


Query on ap_lingering_close

2008-09-10 Thread Arnab Ganguly
Hi All,
I am seeing a lot of connections in the CLOSE_WAIT, SYN_RECV and TIME_WAIT
states when I run netstat, and as a result Apache pauses.
What would be the ideal approach to solve the issue? I am planning an
explicit call to ap_lingering_close after reading the client request.

Also, I see an #ifdef CORE_PRIVATE in http_connection.h where
ap_lingering_close is declared, and the macro CORE_PRIVATE is defined
in mod_proxy.h. But I don't use mod_proxy, so is it going to impact anything?

For TIME_WAIT, I plan to reduce the tcp_max_tw_buckets value. Is that going
to have an impact?

For SYN_RECV, I plan on increasing net.ipv4.tcp_max_syn_backlog, decreasing
tcp_synack_retries and enabling tcp_syncookies.

If you have any suggestion then please let me know.
Thanks in advance.

-A


Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/

2008-09-10 Thread Plüm, Rüdiger, VF-Group
 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> Sent: Wednesday, September 10, 2008 15:32
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> 
> On Wed, Sep 10, 2008 at 10:10:20AM +0200, "Plüm, Rüdiger, 
> VF-Group" wrote:
> >  
> > 
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED] 
> > > Sent: Wednesday, September 10, 2008 08:56
> > > To: dev@httpd.apache.org
> > > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> > > 
> > 
> > > I investigated further. I wrote a test file having binary 
> > > character from 0 to 31:
> > > $ od -c out.txt
> > > 000  \0  \n 001  \n 002  \n 003  \n 004  \n 005  \n 006  \n 007  \n
> > > 020  \b  \n  \t  \n  \n  \n 013  \n  \f  \n  \r  \n 016  \n 017  \n
> > > ...
> > > 
> > > And a small sed script :
> > >  $ cat one.sed
> > > l
> > > d
> > > 
> > > Sed script just runs the "l" command for each line.
> > > $ /usr/ucb/sed -f one.sed out.txt  > out1.txt
> > > 
> > > Here is the output of out1.txt
> > > $ od -c out1.txt
> > > 000  \n   \   0   1  \n   \   0   2  \n   \   0   3  \n   \   0   4
> > > 020  \n   \   0   5  \n   \   0   6  \n   \   0   7  \n   -  \b   <
> > > 040  \n   -  \b   >  \n  \n  \n   \   1   3  \n   \   1   4  \n   \
> > > 060   1   5  \n   \   1   6  \n   \   1   7  \n   \   2   0  \n   \
> > > ...
> > > 
> > > $ cat out1.txt
> > > \01
> > > \02
> > > \03
> > > \04
> > > \05
> > > \06
> > > \07
> > > <
> > > >
> > > 
> > > 
> > > \13
> > > \14
> > > 
> > > ---
> > > 
> > > So for some strange reason :
> > > 0x8 is converted to "-\b<" and
> > > 0x9 is converted to "-\b>"
> > > 
> > > That's what we see in "trans" variable.
> > > 
> > > Do you think it could be a bug in original sed and should we 
> > > correct it? 
> > 
> > I guess it is a bug in original sed and it should be corrected.
> > IMHO it should be sufficient to replace
> > 
> > -\b<
> > 
> > and
> > 
> > -\b>
> > 
> > with
> > 
> > <
> > 
> > and
> > 
> > >
> Can you elaborate why you chose "<" and ">"? I could not think of any
> reason behind it.

Because these characters are currently displayed by sed and I saw no reason
to change it.

> sed's man page says :
> (2)l  List the pattern space on the standard output
>       in an unambiguous form. Non-printing
>       characters are spelled in two digit ASCII
>       and long lines are folded.
> 
> So the char values 0x8 and 0x9, which I believe are non-printable
> characters, should be printed as *two* digit ASCII. So \10 and \11 look
> to me to conform to the man page.

As I said, displaying two-digit ASCII also makes sense and conforms to the
man page. So go for it.
BTW: Shouldn't it be \08 and \09 for 0x8 and 0x9 instead of \10 and \11?

Regards

Rüdiger



Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/

2008-09-10 Thread Basant Kukreja
On Wed, Sep 10, 2008 at 10:10:20AM +0200, "Plüm, Rüdiger, VF-Group" wrote:
>  
> 
> > -----Original Message-----
> > From: [EMAIL PROTECTED] 
> > Sent: Wednesday, September 10, 2008 08:56
> > To: dev@httpd.apache.org
> > Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> > 
> 
> > I investigated further. I wrote a test file having binary 
> > character from 0 to 31:
> > $ od -c out.txt
> > 000  \0  \n 001  \n 002  \n 003  \n 004  \n 005  \n 006  \n 007  \n
> > 020  \b  \n  \t  \n  \n  \n 013  \n  \f  \n  \r  \n 016  \n 017  \n
> > ...
> > 
> > And a small sed script :
> >  $ cat one.sed
> > l
> > d
> > 
> > Sed script just runs the "l" command for each line.
> > $ /usr/ucb/sed -f one.sed out.txt  > out1.txt
> > 
> > Here is the output of out1.txt
> > $ od -c out1.txt
> > 000  \n   \   0   1  \n   \   0   2  \n   \   0   3  \n   \   0   4
> > 020  \n   \   0   5  \n   \   0   6  \n   \   0   7  \n   -  \b   <
> > 040  \n   -  \b   >  \n  \n  \n   \   1   3  \n   \   1   4  \n   \
> > 060   1   5  \n   \   1   6  \n   \   1   7  \n   \   2   0  \n   \
> > ...
> > 
> > $ cat out1.txt
> > \01
> > \02
> > \03
> > \04
> > \05
> > \06
> > \07
> > <
> > >
> > 
> > 
> > \13
> > \14
> > 
> > ---
> > 
> > So for some strange reason :
> > 0x8 is converted to "-\b<" and
> > 0x9 is converted to "-\b>"
> > 
> > That's what we see in "trans" variable.
> > 
> > Do you think it could be a bug in original sed and should we 
> > correct it? 
> 
> I guess it is a bug in original sed and it should be corrected.
> IMHO it should be sufficient to replace
> 
> -\b<
> 
> and
> 
> -\b>
> 
> with
> 
> <
> 
> and
> 
> >
Can you elaborate why you chose "<" and ">"? I could not think of any
reason behind it.
sed's man page says :
(2)l  List the pattern space on the standard output
      in an unambiguous form. Non-printing
      characters are spelled in two digit ASCII
      and long lines are folded.

So the char values 0x8 and 0x9, which I believe are non-printable characters,
should be printed as *two* digit ASCII. So \10 and \11 look to me to
conform to the man page.

Regards,
Basant.


Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c

2008-09-10 Thread Ian Holsman

Thanks for the feedback.

I'll fix the code soon.

Nick Kew wrote:
> On Wed, 10 Sep 2008 03:54:00 -
> [EMAIL PROTECTED] wrote:
>
> > Author: ianh
> > Date: Tue Sep  9 20:53:59 2008
> > New Revision: 693697
> >
> > URL: http://svn.apache.org/viewvc?rev=693697&view=rev
> > Log:
> > initial check in.
> > this filter validates that the incoming request contains valid UTF8
> > characters.
>
> Why?
>
> Last time I looked, incoming charsets were indeed a problem area,
> but a browser submitting an HTML form would de-facto use the
> same charset as the form.  Not necessarily utf-8.
>
> Not to mention the many other use cases for sending non-utf8 data.
>
> > +static char* validate_buffer( apr_pool_t *p,  const char* inbuf,
> > apr_size_t* length) +{
>
> Looks like a potential util_ function.  Cousin to both apr_uri
> and apr_xlate.
>
> > +if (inbuf[i] == '%' ) {
> > +if ((i < len -2 ) && ishexnumber( inbuf[i+1]) &&
> > ishexnumber( inbuf[i+2])) {
>
> ... is all very well, but
>
> > +else {
> > +buffer[j++]=inbuf[i++];
> > +}
>
> Shouldn't that at least be marked /* FIXME */
> where you potentially let through the chars you're
> supposed to block ?
>
> > +ap_hook_pre_connection(utf8_pre_conn, NULL, NULL,
> > APR_HOOK_MIDDLE); +
> > +ap_register_input_filter(utf8_filter_name, utf8_in_filter, NULL,
> > + AP_FTYPE_NETWORK - 1);
>
> Huh?  Isn't that before mod_ssl, let alone mod_deflate, mod_charset?
> And no escape path for binary data either.  It'll cripple any server
> with it loaded!



Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c

2008-09-10 Thread William A. Rowe, Jr.

Nick Kew wrote:
> Looks like a potential util_ function.  Cousin to both apr_uri
> and apr_xlate.


xlate() rejects such nonsense, so apr shouldn't need it.  It's only
an issue in other ecosystems behind httpd, such as dodging the recent
Tomcat vulnerability.


Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c

2008-09-10 Thread William A. Rowe, Jr.

Nick Kew wrote:
> Last time I looked, incoming charsets were indeed a problem area,
> but a browser submitting an HTML form would de-facto use the
> same charset as the form.  Not necessarily utf-8.


It's worthwhile to add some logic to 'step out of the way' when the
charset is inferred as other-than utf-8.


Re: svn commit: r693697 - in /httpd/sandbox/mod_valid_utf8: ./ README mod_valid_utf8.c

2008-09-10 Thread Nick Kew
On Wed, 10 Sep 2008 03:54:00 -
[EMAIL PROTECTED] wrote:

> Author: ianh
> Date: Tue Sep  9 20:53:59 2008
> New Revision: 693697
> 
> URL: http://svn.apache.org/viewvc?rev=693697&view=rev
> Log:
> initial check in.
> this filter validates that the incoming request contains valid UTF8
> characters.

Why?

Last time I looked, incoming charsets were indeed a problem area,
but a browser submitting an HTML form would de-facto use the
same charset as the form.  Not necessarily utf-8.

Not to mention the many other use cases for sending non-utf8 data.

> +static char* validate_buffer( apr_pool_t *p,  const char* inbuf,
> apr_size_t* length) +{

Looks like a potential util_ function.  Cousin to both apr_uri
and apr_xlate.

> +if (inbuf[i] == '%' ) {
> +if ((i < len -2 ) && ishexnumber( inbuf[i+1]) &&
> ishexnumber( inbuf[i+2])) {

... is all very well, but

> +else {
> +buffer[j++]=inbuf[i++];
> +}

Shouldn't that at least be marked /* FIXME */
where you potentially let through the chars you're
supposed to block ?


> +ap_hook_pre_connection(utf8_pre_conn, NULL, NULL,
> APR_HOOK_MIDDLE); +
> +ap_register_input_filter(utf8_filter_name, utf8_in_filter, NULL,
> + AP_FTYPE_NETWORK - 1);

Huh?  Isn't that before mod_ssl, let alone mod_deflate, mod_charset?
And no escape path for binary data either.  It'll cripple any server
with it loaded!


-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/

2008-09-10 Thread Plüm, Rüdiger, VF-Group
 

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> Sent: Wednesday, September 10, 2008 08:56
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r691418 [2/2] - in /httpd/httpd/trunk: ./docs/manual/mod/ modules/filters/
> 

> I investigated further. I wrote a test file having binary 
> character from 0 to 31:
> $ od -c out.txt
> 000  \0  \n 001  \n 002  \n 003  \n 004  \n 005  \n 006  \n 007  \n
> 020  \b  \n  \t  \n  \n  \n 013  \n  \f  \n  \r  \n 016  \n 017  \n
> ...
> 
> And a small sed script :
>  $ cat one.sed
> l
> d
> 
> Sed script just runs the "l" command for each line.
> $ /usr/ucb/sed -f one.sed out.txt  > out1.txt
> 
> Here is the output of out1.txt
> $ od -c out1.txt
> 000  \n   \   0   1  \n   \   0   2  \n   \   0   3  \n   \   0   4
> 020  \n   \   0   5  \n   \   0   6  \n   \   0   7  \n   -  \b   <
> 040  \n   -  \b   >  \n  \n  \n   \   1   3  \n   \   1   4  \n   \
> 060   1   5  \n   \   1   6  \n   \   1   7  \n   \   2   0  \n   \
> ...
> 
> $ cat out1.txt
> \01
> \02
> \03
> \04
> \05
> \06
> \07
> <
> >
> 
> 
> \13
> \14
> 
> ---
> 
> So for some strange reason :
> 0x8 is converted to "-\b<" and
> 0x9 is converted to "-\b>"
> 
> That's what we see in "trans" variable.
> 
> Do you think it could be a bug in original sed and should we 
> correct it? 

I guess it is a bug in original sed and it should be corrected.
IMHO it should be sufficient to replace

-\b<

and

-\b>

with

<

and

>

> 
> It should probably print "\10" and "\11".

This would be an option, but I wouldn't go that far. It is still fine with me
to print < and >.

> 
> BTW /usr/bin/sed has exactly the same behavior. 
> 
> It sounds strange, though, that this was never caught in the sed code.

Some issues last longer than other ones :-)

Regards

Rüdiger