Reading data from Request Body - Twice!!

2008-05-02 Thread Subra A Narayanan
Hello folks,

I really hope someone can point me in the correct direction here.

I have an Apache module that receives data from clients in the request body
and saves the received data in a database. The client also sends a
Content-MD5 header with the request, which I want to verify by computing the
MD5 of the received data and comparing it with the header value. If the MD5
check fails, I want to discard the data (not store it in the database) and
return an error.

The client can send up to 4 GB of data in a request, which means I have to
compute the MD5 incrementally in small chunks using the MD5_INIT, MD5_UPDATE
and MD5_FINAL routines. But by the time the digest is complete, I will
already have read the request body once from start to end using the
following functions:

ap_setup_client_block(request_rec *r, int read_policy);
ap_should_client_block(request_rec *r);
ap_get_client_block(request_rec *r, char *buffer, int bufsiz);

(request_rec: http://httpd.apache.org/dev/apidoc/apidoc_request_rec.html)
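
The init/update/final pattern described above can be sketched in isolation. This is only an illustration under stated assumptions: the digest here is 32-bit FNV-1a, used as a small stand-in so the example stays self-contained. It is NOT MD5, and the names digest_ctx, digest_init, digest_update, digest_final and digest_in_chunks are hypothetical. In a real module each chunk would come from one ap_get_client_block() call and be fed to an MD5 update routine instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in digest state: 32-bit FNV-1a, NOT MD5 (assumption for brevity). */
typedef struct {
    uint32_t state;
} digest_ctx;

/* Start a new digest (plays the role of the "init" step). */
void digest_init(digest_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->state = 2166136261u;            /* FNV-1a offset basis */
}

/* Fold one chunk into the running digest (the "update" step). */
void digest_update(digest_ctx *ctx, const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    size_t i;

    assert(ctx != NULL);
    for (i = 0; i < len; i++) {
        ctx->state ^= p[i];
        ctx->state *= 16777619u;         /* FNV-1a prime */
    }
}

/* Produce the result (the "final" step). */
uint32_t digest_final(const digest_ctx *ctx)
{
    assert(ctx != NULL);
    return ctx->state;
}

/* Hypothetical driver: feeds `body` in pieces of at most `chunk` bytes,
 * the way a handler would feed each block it reads from the client. */
uint32_t digest_in_chunks(const unsigned char *body, size_t len, size_t chunk)
{
    digest_ctx ctx;
    size_t off = 0;

    assert(chunk > 0);
    digest_init(&ctx);
    while (off < len) {
        size_t n = (len - off < chunk) ? (len - off) : chunk;
        digest_update(&ctx, body + off, n);
        off += n;
    }
    return digest_final(&ctx);
}
```

The point of the sketch is that the result does not depend on the chunk size, so a body of any length can be digested with a small fixed buffer. It does not address the second read of the body, which is the open question in this message.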


Now, if the MD5 check passes and I want to read the data again, how would I
do it? I cannot use the functions above a second time, right?
'ap_should_client_block' has already told the client to send the entire
body once.

Reading the data once, computing the MD5, caching the data, and then reusing
the cached copy if the checksum passes is one option, but I don't want to do
that for a variety of reasons.

Is there any other way?

Thanks for any help.

Subra


How do I know the character encoding?

2008-05-02 Thread John Zhang
In my output filter, I need to parse the document to search for certain
patterns.

Where can I get information about the character encoding so that I can
parse the document correctly? For example, the document may contain Unicode
characters stored in a particular encoding.

Thanks,
John


Re: svn commit: r646285 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_auth_form.xml modules/aaa/config.m4 modules/aaa/mod_auth_form.c

2008-05-02 Thread Plüm , Rüdiger , VF-Group
 

> -----Original Message-----
> From: Graham Leggett
> Sent: Friday, 2 May 2008 00:01
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r646285 - in /httpd/httpd/trunk:
> CHANGES docs/manual/mod/mod_auth_form.xml
> modules/aaa/config.m4 modules/aaa/mod_auth_form.c
>
> Ruediger Pluem wrote:
>
> >> +apr_table_set(r->headers_out, "Location", sent_loc);
> >> +return HTTP_MOVED_PERMANENTLY;
> >
> > Shouldn't this be HTTP_TEMPORARY_REDIRECT?
>
> Not sure, can you explain why it would be temporary?

Because the login page is not what should normally be displayed. It should
only be displayed if the user is not logged in. IMHO HTTP_MOVED_PERMANENTLY
is cacheable whereas HTTP_TEMPORARY_REDIRECT is not.

 
> >> +/* return the underlying error, or OK on success */
> >> +return r->status == HTTP_OK || r->status == OK ? OK : r->status;
> >
> > Why not returning r->status here directly?
>
> Because when it did as I recall, the HTTP_OK value triggered the error
> document handler, which broke the request.

Ah, ok. Looking at the code this makes sense.
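
For readers following along, the status normalisation discussed above can be sketched on its own. The constants carry the values httpd.h defines (OK is 0, HTTP_OK is 200, HTTP_MOVED_PERMANENTLY is 301, HTTP_TEMPORARY_REDIRECT is 307). The helper names normalize_status and normalize_status_selfcheck are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Values as defined in httpd.h, repeated here so the sketch stands alone. */
#define OK                      0
#define HTTP_OK                 200
#define HTTP_MOVED_PERMANENTLY  301
#define HTTP_TEMPORARY_REDIRECT 307

/* Hypothetical helper mirroring the quoted expression: both "success"
 * spellings collapse to OK, anything else is passed through unchanged. */
int normalize_status(int status)
{
    return (status == HTTP_OK || status == OK) ? OK : status;
}

/* Small self-check showing the three interesting cases. */
void normalize_status_selfcheck(void)
{
    assert(normalize_status(HTTP_OK) == OK);
    assert(normalize_status(OK) == OK);
    assert(normalize_status(HTTP_TEMPORARY_REDIRECT) == HTTP_TEMPORARY_REDIRECT);
}
```

Since OK and HTTP_OK are different integers, returning r->status unchanged would hand 200 to the caller where 0 is expected, which matches the behaviour Graham describes.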

Regards

Rüdiger


Re: apache mod_dbd/htaccess

2008-05-02 Thread Graham Leggett

Res wrote:


> So what would be the impact to having a very large httpd.conf file,
> having to read in, perhaps many thousands of extra directory blocks?
> If only 100 or 1000 want it that's fine, but if I end up with 10+K
> wanting it, I would like to know what impact I may expect, what would be
> the memory use for each additional directory block?


I think the best way to check is to create a very large httpd.conf file
like this, pointing at some test directories, and see. How fast it is will
depend largely on your environment, I would think.
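
A test file of that kind could be generated with something like the sketch below. Everything specific in it is an assumption made for illustration: the helper names, the "site%05d" directory naming, and the "Options None" line, which is only a placeholder for whatever directives the real blocks would carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats one <Directory> block for test directory number `index` under
 * `base`. Returns what snprintf returns: the length the block needs. */
int format_directory_block(char *buf, size_t size, const char *base, int index)
{
    assert(buf != NULL && base != NULL);
    return snprintf(buf, size,
                    "<Directory \"%s/site%05d\">\n"
                    "    Options None\n"      /* placeholder directive */
                    "</Directory>\n",
                    base, index);
}

/* Writes `count` blocks to `out`. Returns the number of blocks written,
 * or -1 if a block did not fit the buffer or the write failed. */
int write_directory_blocks(FILE *out, const char *base, int count)
{
    char block[512];
    int i;

    assert(out != NULL);
    for (i = 0; i < count; i++) {
        int n = format_directory_block(block, sizeof(block), base, i);
        if (n < 0 || (size_t)n >= sizeof(block) || fputs(block, out) == EOF) {
            return -1;
        }
    }
    return count;
}
```

Calling write_directory_blocks() with a file opened for writing and counts such as 1000 and 10000 would give configurations whose startup time and memory use can then be compared on the target system.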


Regards,
Graham


Re: svn commit: r647263 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/aaa/ modules/filters/ modules/http/ server/

2008-05-02 Thread Graham Leggett

Ruediger Pluem wrote:


>> +    /* all is well, set aside the buckets */
>> +    for (bucket = APR_BRIGADE_FIRST(b);
>> +         bucket != APR_BRIGADE_SENTINEL(b);
>> +         bucket = APR_BUCKET_NEXT(bucket))
>> +    {
>> +        apr_bucket_copy(bucket, &e);
>
> What about transient buckets? Don't we need to set them aside?


I don't follow - does the apr_bucket_copy not do that for us already?


>> +    ctx->remaining -= readbytes;
>> +    ctx->offset += readbytes;
>> +    return APR_SUCCESS;
>
> Why using ctx->offset at all and not just taking all data from the
> kept_body brigade until readbytes, copy it over to b and remove it
> afterwards from the kept_body brigade. This would save one call
> to apr_brigade_partition.


In theory, that would mean you could only read the kept_body once. The 
kept body could be delivered to multiple requests embedded within 
mod_include for example, and would be needed to be read more than once.
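
The difference between reading through an offset and consuming the kept data can be sketched without APR. This is a simplified model under stated assumptions: a plain byte buffer stands in for the kept_body brigade, and kept_reader, kept_reader_init and kept_reader_read are hypothetical names, not httpd code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One reader's private view of a shared, read-only kept body. */
typedef struct {
    const char *body;     /* shared data, never modified by a reader */
    size_t offset;        /* how far this reader has got */
    size_t remaining;     /* bytes this reader has not yet returned */
} kept_reader;

/* Point a fresh reader at the start of the kept body. */
void kept_reader_init(kept_reader *ctx, const char *body, size_t len)
{
    assert(ctx != NULL && body != NULL);
    ctx->body = body;
    ctx->offset = 0;
    ctx->remaining = len;
}

/* Copy up to `readbytes` bytes into `out` and advance only this reader's
 * offset. The kept body itself is left untouched. Returns bytes copied. */
size_t kept_reader_read(kept_reader *ctx, char *out, size_t readbytes)
{
    size_t n;

    assert(ctx != NULL && out != NULL);
    n = (readbytes < ctx->remaining) ? readbytes : ctx->remaining;
    memcpy(out, ctx->body + ctx->offset, n);
    ctx->remaining -= n;
    ctx->offset += n;
    return n;
}
```

Because each reader keeps its own offset and the shared data is never removed, any number of readers can each see the whole body. Removing data from the shared store as it is read would leave nothing for the next reader.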



>> +    c = low ^ hi;
>
> Shouldn't this be c = low + hi ?


In theory either should work, which is faster?


>> +    /* If we have been asked to, keep the data up until the
>> +     * configured limit. If the limit is exceeded, we return an
>> +     * HTTP_REQUEST_ENTITY_TOO_LARGE response so the caller is
>> +     * clear the server couldn't handle their request.
>> +     */
>> +    if (kept_body) {
>> +        if (len <= left) {
>> +            apr_bucket_copy(bucket, &e);
>> +            APR_BRIGADE_INSERT_TAIL(kept_body, e);
>> +            left -= len;
>> +        }
>> +        else {
>> +            apr_brigade_destroy(bb);
>> +            apr_brigade_destroy(kept_body);
>> +            return HTTP_REQUEST_ENTITY_TOO_LARGE;
>> +        }
>
> Why is this needed? Should this job be performed by the
> ap_keep_body_filter that should be in our input filter chain if we
> want to keep the body?
> Of course this depends when we call ap_parse_request_form. If we call it
> during the authn/z phase the filter chain hasn't been setup. So maybe we
> should ensure that this is the case.


I think the reason it is there was from when the kept body was being 
captured by ap_discard_request_body, which wouldn't be run if this code 
kicked in.


However we do call it in the authn/z phase, so if the keep body filter 
isn't set up yet then it does still need to be here.
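
The limit check in the quoted hunk can be modelled in a few lines. This is a sketch under stated assumptions, not the committed code: a fixed byte array stands in for the kept_body brigade, and kept_store and keep_chunk are hypothetical names. HTTP_REQUEST_ENTITY_TOO_LARGE has the value 413, as in httpd.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OK                            0
#define HTTP_REQUEST_ENTITY_TOO_LARGE 413

/* Stand-in for the kept body: a caller-supplied buffer plus counters. */
typedef struct {
    char *data;     /* storage for the kept bytes */
    size_t used;    /* bytes kept so far */
    size_t left;    /* bytes still allowed before the limit is hit */
} kept_store;

/* Keep one chunk if it still fits under the limit, otherwise refuse it.
 * Unlike the quoted hunk, this sketch leaves cleanup to the caller. */
int keep_chunk(kept_store *ks, const char *chunk, size_t len)
{
    assert(ks != NULL && chunk != NULL);
    if (len <= ks->left) {
        memcpy(ks->data + ks->used, chunk, len);
        ks->used += len;
        ks->left -= len;
        return OK;
    }
    return HTTP_REQUEST_ENTITY_TOO_LARGE;
}
```

The comparison has to be "less than or equal": a chunk that exactly fills the remaining space is still within the configured limit.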



> Why not using the insert_filter hook?


Good question, let me look.


>> @@ -1648,8 +1649,8 @@
>>       * Add the KEPT_BODY filter, which will insert any body marked to be
>>       * kept for the use of a subrequest, into the subrequest.
>>       */
>> -    ap_add_input_filter_handle(ap_kept_body_input_filter_handle,
>> -                               NULL, rnew, rnew->connection);
>> +    ap_add_input_filter(KEPT_BODY_FILTER,
>> +                        NULL, rnew, rnew->connection);
>
> This creates an error message on each subrequest if mod_request is not
> loaded, because in this case the KEPT_BODY_FILTER is not registered.


You're right, let me look at this.

Regards,
Graham


Re: svn commit: r647263 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/aaa/ modules/filters/ modules/http/ server/

2008-05-02 Thread Plüm , Rüdiger , VF-Group
 

> -----Original Message-----
> From: Graham Leggett
> Sent: Friday, 2 May 2008 12:40
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r647263 - in /httpd/httpd/trunk: ./
> docs/manual/mod/ include/ modules/aaa/ modules/filters/
> modules/http/ server/
>
> Ruediger Pluem wrote:
>
> >> +    /* all is well, set aside the buckets */
> >> +    for (bucket = APR_BRIGADE_FIRST(b);
> >> +         bucket != APR_BRIGADE_SENTINEL(b);
> >> +         bucket = APR_BUCKET_NEXT(bucket))
> >> +    {
> >> +        apr_bucket_copy(bucket, &e);
> >
> > What about transient buckets? Don't we need to set them aside?
>
> I don't follow - does the apr_bucket_copy not do that for us already?

No, it does not.
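
The distinction can be illustrated with a small model. It is only an analogy under stated assumptions: chunk_ref, chunk_copy, chunk_setaside and chunk_free are hypothetical names, not APR functions. The copy is shallow and keeps pointing at memory the caller may reuse, while the set-aside step gives the chunk storage of its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A reference to some bytes. `owned` is non-zero once the bytes live in
 * heap memory that belongs to this reference. */
typedef struct {
    const char *data;
    size_t len;
    int owned;
} chunk_ref;

/* Shallow copy: the new reference points at the SAME bytes as `src`. */
chunk_ref chunk_copy(const chunk_ref *src)
{
    chunk_ref c;

    assert(src != NULL);
    c = *src;
    c.owned = 0;
    return c;
}

/* Set aside: duplicate the bytes so they outlive the original buffer.
 * Returns 0 on success, -1 if memory could not be allocated. */
int chunk_setaside(chunk_ref *c)
{
    char *heap;

    assert(c != NULL);
    if (c->owned) {
        return 0;                       /* already has private storage */
    }
    heap = (char *)malloc(c->len ? c->len : 1);
    if (heap == NULL) {
        return -1;
    }
    memcpy(heap, c->data, c->len);
    c->data = heap;
    c->owned = 1;
    return 0;
}

/* Release private storage, if any. */
void chunk_free(chunk_ref *c)
{
    assert(c != NULL);
    if (c->owned) {
        free((void *)c->data);
        c->data = NULL;
        c->len = 0;
        c->owned = 0;
    }
}
```

If the original buffer is overwritten after the copy, the shallow reference sees the new bytes, whereas the set-aside reference still holds the old ones. That is the hazard the question about transient buckets is pointing at.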

 
> >> +    ctx->remaining -= readbytes;
> >> +    ctx->offset += readbytes;
> >> +    return APR_SUCCESS;
> >
> > Why using ctx->offset at all and not just taking all data from the
> > kept_body brigade until readbytes, copy it over to b and remove it
> > afterwards from the kept_body brigade. This would save one call
> > to apr_brigade_partition.
>
> In theory, that would mean you could only read the kept_body once. The
> kept body could be delivered to multiple requests embedded within
> mod_include for example, and would be needed to be read more than once.

But to deliver it more than once you would need to reset the filter context,
right?

 
> >> +    c = low ^ hi;
> >
> > Shouldn't this be c = low + hi ?
>
> In theory either should work, which is faster?

I think there is not much difference with respect to speed but using
'+' seems to be easier to read.

 
> >> +    /* If we have been asked to, keep the data up until the
> >> +     * configured limit. If the limit is exceeded, we return an
> >> +     * HTTP_REQUEST_ENTITY_TOO_LARGE response so the caller is
> >> +     * clear the server couldn't handle their request.
> >> +     */
> >> +    if (kept_body) {
> >> +        if (len <= left) {
> >> +            apr_bucket_copy(bucket, &e);
> >> +            APR_BRIGADE_INSERT_TAIL(kept_body, e);
> >> +            left -= len;
> >> +        }
> >> +        else {
> >> +            apr_brigade_destroy(bb);
> >> +            apr_brigade_destroy(kept_body);
> >> +            return HTTP_REQUEST_ENTITY_TOO_LARGE;
> >> +        }
> >
> > Why is this needed? Should this job be performed by the
> > ap_keep_body_filter that should be in our input filter chain if we
> > want to keep the body?
> > Of course this depends when we call ap_parse_request_form. If we call it
> > during the authn/z phase the filter chain hasn't been setup. So maybe we
> > should ensure that this is the case.
>
> I think the reason it is there was from when the kept body was being
> captured by ap_discard_request_body, which wouldn't be run if this code
> kicked in.
>
> However we do call it in the authn/z phase, so if the keep body filter
> isn't set up yet then it does still need to be here.

Yes, but what worries me is that other input filters that might be needed
aren't set up either. Couldn't there be a case where we need to have the
inflate input filter in place?
Maybe we need to ensure that the input filter stack is already set up
before we read from it.

Regards

Rüdiger



Re: svn commit: r647263 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/aaa/ modules/filters/ modules/http/ server/

2008-05-02 Thread Graham Leggett

Plüm wrote:


>>> What about transient buckets? Don't we need to set them aside?
>>
>> I don't follow - does the apr_bucket_copy not do that for us already?
>
> No, it does not.


Let me look further.

>> In theory, that would mean you could only read the kept_body once. The
>> kept body could be delivered to multiple requests embedded within
>> mod_include for example, and would be needed to be read more than once.
>
> But to deliver it more than once you would need to reset the filter
> context, right?


You wouldn't, no - the filter is added to each subrequest when relevant 
as many times as is necessary, and each instance of the filter is only 
used once.


>> I think the reason it is there was from when the kept body was being
>> captured by ap_discard_request_body, which wouldn't be run if this code
>> kicked in.
>>
>> However we do call it in the authn/z phase, so if the keep body filter
>> isn't set up yet then it does still need to be here.
>
> Yes, but what worries me is that other input filters aren't setup as well
> that might be needed. Couldn't there be a case where we need to have the
> inflate input filter in place?
> Maybe it is needed to ensure that the input filter stack is already setup
> before we read from it.


At what point are the input filters inserted?

Regards,
Graham


Re: svn commit: r647263 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/aaa/ modules/filters/ modules/http/ server/

2008-05-02 Thread Plüm , Rüdiger , VF-Group
 

> -----Original Message-----
> From: Graham Leggett
> Sent: Friday, 2 May 2008 13:28
> To: dev@httpd.apache.org
> Subject: Re: svn commit: r647263 - in /httpd/httpd/trunk: ./
> docs/manual/mod/ include/ modules/aaa/ modules/filters/
> modules/http/ server/
>
> Plüm wrote:
>
> >> I think the reason it is there was from when the kept body was being
> >> captured by ap_discard_request_body, which wouldn't be run if this
> >> code kicked in.
> >>
> >> However we do call it in the authn/z phase, so if the keep body
> >> filter isn't set up yet then it does still need to be here.
> >
> > Yes, but what worries me is that other input filters aren't setup as
> > well that might be needed. Couldn't there be a case where we need to
> > have the inflate input filter in place?
> > Maybe it is needed to ensure that the input filter stack is already
> > setup before we read from it.
>
> At what point are the input filters inserted?

IMHO in the insert_filter hook that is called shortly before the handler is 
invoked.

Regards

Rüdiger



Revamped apachemonitor.exe trunk/ needs review

2008-05-02 Thread William A. Rowe, Jr.

Looking for backport votes in STATUS to move the current trunk/ code into
our 2.2.x and 2.0.x branches for apachemonitor.c.  The module has been
taught ucs-16 service names for our non-western-latin language friends
(it's a matter of replacing the /D _MBCS with /D _UNICODE and
/D UNICODE in the compilation).

It's been taught how to invoke the service control manager and reinvoke
itself for the administrator or vanilla user as a real administrator
with a password prompt, if it doesn't have permissions to start/stop
the server, such as when UAC is in effect.  This change is *required*
for Windows Vista and 2008 server by default, although UAC can be
disabled by and for the admin.  On 2003 and earlier, it just serves as
a nice-to-have (and will make you pick the admin user + password, not
simply enter the admin p/w).

For the few of you who want to test this directly, I've compiled trunk
and dropped the binary to http://people.apache.org/~wrowe/ if you just
want to avoid compilation.  There are VC6 builds for regular as well as
the unicode flavor.  The VS2005 builds require the VS2005 C runtime, and
are X64 flavors for your experimentation.

Bill


Building trunk on Windows

2008-05-02 Thread César Leonardo Blum Silveira
Hi all,

Is there any documentation regarding how to build httpd-trunk on Windows?

Thanks,

-- 
César L. B. Silveira


Re: Revamped apachemonitor.exe trunk/ needs review

2008-05-02 Thread William A. Rowe, Jr.

William A. Rowe, Jr. wrote:


> For the few of you who want to test this directly, I've compiled trunk
> and dropped the binary to http://people.apache.org/~wrowe/ if you just
> want to avoid compilation.  There are VC6 builds for regular as well as
> the unicode flavor.  The VS2005 builds require the VS2005 C runtime, and
> are X64 flavors for your experimentation.


FYI - to install/use the X64 flavor, see

http://www.microsoft.com/downloads/details.aspx?familyid=90548130-4468-4BBC-9673-D6ACABD5D13B&displaylang=en

if you do not have any Studio 2005-based products (causing the program to
fail when you attempt to run it).

ApacheMonitor[*].exe --kill should remove any running instance of the
ApacheMonitor.exe program (provided those running were renamed with
exactly that name).


Re: AuthzMergeRules directive

2008-05-02 Thread Chris Darroch

Brad Nicholes wrote:


> So what I am really trying to say is that intra-block logic and
> inter-block logic as far as merging goes, are tied together.  If we
> want to change the way that the logic of two block is merged, we
> would also have to change the base state of each independent block.
> It's all or nothing.  This would affect how the following block is
> evaluated:
>
> <Directory /foo>
> require user joe
> require user sue
> </Directory>
>
> As it currently stands, the logic when combining these two rules would
> be OR.  If we make the change, this would also change the same
> configuration to use AND instead.  I think we determined that this
> logic would be more secure anyway even if it is different than 2.2.x.
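
To make the consequence for the quoted example concrete, here is a minimal sketch of the two ways of combining the rules. It is an illustration only, not the authorization code in httpd: merge_mode, user_authorized and the rule array are hypothetical, and each "require user" line is reduced to a plain string comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { MERGE_OR, MERGE_AND } merge_mode;

/* Returns 1 if `user` passes the list of required user names under the
 * given combining logic, 0 otherwise. With an empty list, OR yields 0
 * and AND yields 1 (the usual identities of the two operators). */
int user_authorized(const char *user, const char *const *required,
                    size_t count, merge_mode mode)
{
    size_t i;
    int result = (mode == MERGE_AND);

    assert(user != NULL);
    for (i = 0; i < count; i++) {
        int match = (strcmp(user, required[i]) == 0);
        if (mode == MERGE_OR) {
            result = result || match;
        }
        else {
            result = result && match;
        }
    }
    return result;
}
```

With the rules "joe" and "sue", OR admits either user, while AND admits nobody, because no single user name equals both. That is why changing the logic inside a block would alter the meaning of existing configurations like the one quoted.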


  Well, I suppose the absolutely key thing is to set the default
AuthzMergeRules state to Off.  Next up, I guess try changing the
On merge state to AND and we'll do some thinking and testing at
that point.

  It does look to me like the pre-2.4 intra-block logic was OR,
but I don't know how widely people would have depended on that
(as in your example above).  My gut instinct is that we'll wind up
having to replicate OR within blocks, but implement AND between
blocks, to achieve both backwards-compatibility and good security.

  As a first step, though, I'd suggest just making the two changes
to the existing defaults (AuthzMergeRules Off as default, and On = AND)
and we'll review again at that point.

  For my part I hope to deal with a parallel, unrelated, and hopefully
uncontroversial set of authn/z edits once SVN returns, namely changing all
the hard-coded 0 provider versions to use AUTHN/Z_PROVIDER_VERSION macros.
That's a simple first step toward the work outlined in this thread:

http://mail-archives.apache.org/mod_mbox/httpd-dev/200804.mbox/[EMAIL PROTECTED]

  Thanks again,

Chris.

--
GPG Key ID: 366A375B
GPG Key Fingerprint: 485E 5041 17E1 E2BB C263  E4DE C8E3 FA36 366A 375B



Re: svn commit: r647263 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/aaa/ modules/filters/ modules/http/ server/

2008-05-02 Thread Roy T. Fielding

On May 2, 2008, at 4:07 AM, Plüm, Rüdiger, VF-Group wrote:


>>>> +c = low ^ hi;
>>>
>>> Shouldn't this be c = low + hi ?
>>
>> In theory either should work, which is faster?

The AND.

> I think there is not much difference with respect to speed but using
> '+' seems to be easier to read.


Not to me (assuming these are two separate bit fields being merged, as
I've lost the context at this point).
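
Under that same assumption (two separate bit fields, here taken to be a high and a low 4-bit value, which is a guess at the context), the operators under discussion can be compared directly. The helper names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helpers: combine a high and a low 4-bit value into one
 * byte using each of the operators mentioned in the thread. */
unsigned merge_xor(unsigned hi, unsigned low) { return (hi << 4) ^ low; }
unsigned merge_add(unsigned hi, unsigned low) { return (hi << 4) + low; }
unsigned merge_or(unsigned hi, unsigned low)  { return (hi << 4) | low; }

void merge_selfcheck(void)
{
    /* Disjoint fields: every operator gives the same byte. */
    assert(merge_xor(0x4, 0x1) == 0x41);
    assert(merge_add(0x4, 0x1) == 0x41);
    assert(merge_or(0x4, 0x1) == 0x41);

    /* Overlapping fields (low spills into bit 4): the results diverge. */
    assert(merge_xor(0x1, 0x1F) == 0x0F);
    assert(merge_add(0x1, 0x1F) == 0x2F);
    assert(merge_or(0x1, 0x1F) == 0x1F);
}
```

As long as the two fields share no bits, XOR, addition and OR are interchangeable, so the choice is one of readability rather than correctness. They only differ when the inputs overlap.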

Roy

apr_cvt() in apr_snprintf() - not (L)GPL

2008-05-02 Thread Jim Jagielski

This closes a long standing confusion on some code in apr_snprintf()
(and ap_snprintf() in apache-1.3). The comments indicate that the
apr_cvt() implementation was pulled from GNU libc. However, the
actual origin of the code is from UNIX V7 (at least... it is also
possible that this was derived/obtained from older BSD code, but
this requires more digging). In any case, the code itself is licensed
by Caldera under a BSD-ish license:

   http://www.tuhs.org/Archive/Caldera-license.pdf

which makes it fine for us to use.

I have updated the source, LICENSE and NOTICE files to reflect
this fact.

PS: Even though I updated the files and am writing this Email,
the actual investigation was done by myself and Joe Orton.
Thanks Joe!

PPS: This is also applicable to the snprintf() impl in mod_jk,
 hence adding the [EMAIL PROTECTED] list. This is because that
 was pulled from APR :)


Re: svn commit: r647263 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/aaa/ modules/filters/ modules/http/ server/

2008-05-02 Thread Ruediger Pluem



On 05/02/2008 07:54 PM, Roy T. Fielding wrote:

> On May 2, 2008, at 4:07 AM, Plüm, Rüdiger, VF-Group wrote:
>
>>>>> +c = low ^ hi;
>>>>
>>>> Shouldn't this be c = low + hi ?
>>>
>>> In theory either should work, which is faster?
>
> The AND.

I agree that an AND (&) or an OR (|) would be also readable, but
above we have an XOR, which makes one think about its deeper meaning.

Regards

Rüdiger



Re: svn commit: r647263 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/aaa/ modules/filters/ modules/http/ server/

2008-05-02 Thread Roy T. Fielding

On May 2, 2008, at 11:19 AM, Ruediger Pluem wrote:

> On 05/02/2008 07:54 PM, Roy T. Fielding wrote:
>
>> On May 2, 2008, at 4:07 AM, Plüm, Rüdiger, VF-Group wrote:
>>
>>>>>> +c = low ^ hi;
>>>>>
>>>>> Shouldn't this be c = low + hi ?
>>>>
>>>> In theory either should work, which is faster?
>>
>> The AND.
>
> I agree that an AND (&) or an OR (|) would be also readable, but
> above we have an XOR, which makes one think about its deeper meaning.

Hah, I was reading it like an equation from my discrete math days.
I guess ^ really is less readable. ;-)  low & hi would be my preference.

Roy