Re: Extra bucket brigade with just an EOS on an input filter at the end.

2021-08-08 Thread Joe Orton
On Sat, Aug 07, 2021 at 06:33:49PM +0200, Dirk-Willem van Gulik wrote: 
...
> What I am seeing is a first brigade with the POST content; with a 
> terminating EOF. The bbout data makes it to the CGI script or 
> (reverse) proxy.
> 
> But I am then getting a second _input_filter call with a second 
> brigade of just an EOS packet.
> 
> What causes that ? Or am I not running through the brigade properly ?

You're not doing anything wrong, it seems to be a "feature" of the way 
input filtering is implemented and should probably be better documented.  
I wrote this in mod_ssl:

/* Surprisingly (and perhaps, wrongly), the request body can be
 * pulled from the input filter stack more than once; a
 * handler may read it, and ap_discard_request_body() will
 * attempt to do so again after *every* request.  So input
 * filters must be prepared to give up an EOS if invoked after
 * initially reading the request. The HTTP_IN filter does this
 * with its ->eos_sent flag. */

https://svn.apache.org/viewvc/httpd/httpd/trunk/modules/ssl/ssl_engine_io.c?view=markup#l2205
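
As a rough illustration of the pattern that comment describes, here is a standalone toy model in plain C. It is not httpd code: `toy_filter`, `toy_read` and the result codes are made up for this sketch. The filter remembers that it has already handed out the body and, if pulled again, answers with nothing but an end-of-stream indication, much as HTTP_IN does with its `eos_sent` flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for "what the brigade contained"; not APR types. */
enum toy_result { TOY_DATA_THEN_EOS, TOY_EOS_ONLY };

struct toy_filter {
    const char *body;   /* request body still to be delivered */
    int eos_sent;       /* set once the body and its EOS went out */
};

/* First call: copy the body into out (at most cap - 1 bytes) and report
 * data followed by EOS. Any later call: deliver no bytes, only EOS. */
enum toy_result toy_read(struct toy_filter *f, char *out, size_t cap,
                         size_t *len)
{
    assert(f != NULL && out != NULL && len != NULL && cap > 0);

    if (f->eos_sent) {
        *len = 0;
        out[0] = '\0';
        return TOY_EOS_ONLY;
    }

    *len = strlen(f->body);
    if (*len >= cap) {
        *len = cap - 1;
    }
    memcpy(out, f->body, *len);
    out[*len] = '\0';
    f->eos_sent = 1;
    return TOY_DATA_THEN_EOS;
}
```

Calling `toy_read` twice on the same `toy_filter` mirrors what was observed: the first call yields the POST data, the second yields zero bytes and just the end-of-stream result.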

Regards, Joe



Extra bucket brigade with just an EOS on an input filter at the end.

2021-08-07 Thread Dirk-Willem van Gulik
In some code
(https://source.redwax.eu/svn/redwax/rs/mod_cms_verify/trunk/mod_cms_verify.c)
I have an input filter (that checks a PKCS#7 signature before passing the
payload on to a proxy/cgi-script, etc).

I am testing this with:

echo "field1=foo&field2=bar" |\
 openssl cms -sign -signer /tmp/sign.cert -outform DER -stream |\
 curl --data-binary @- http://127.0.0.1:8080/show.cgi

Works well.

But after all this goes well, I am seeing an extra bucket brigade being passed,
with 0 bytes. And I’d like to understand why.

Code is roughly
(https://source.redwax.eu/svn/redwax/rs/mod_cms_verify/trunk/mod_cms_verify.c
for the real McCoy):

  static apr_status_t _input_filter(ap_filter_t * f,  apr_bucket_brigade * bbout, ….
  {
      request_rec *r = f->r;
      verify_config_rec *conf = ap_get_module_config(r->per_dir_config,
                                                     &cms_verify_module);

      bb = apr_brigade_create(r->pool, r->connection->bucket_alloc);

      if (state == NULL)  {
          setup some state..
          state->pbb_tmp = apr_brigade_create(r->pool, c->bucket_alloc);
          ….
      }

      if (APR_BRIGADE_EMPTY(state->pbb_tmp)) {
          rv = ap_get_brigade(f->next, state->pbb_tmp, eMode, eBlock, nBytes);
          if (eMode == AP_MODE_EATCRLF || rv != APR_SUCCESS)
              return rv;
      }

      while (!APR_BRIGADE_EMPTY(state->pbb_tmp)) {
          apr_bucket *pbkt_in = APR_BRIGADE_FIRST(state->pbb_tmp);
          const char *data;
          apr_size_t len;

          if (APR_BUCKET_IS_EOS(pbkt_in)) {
              apr_bucket *pbkt_out = validate()..

              if (pbkt_out is valid)
                  APR_BRIGADE_INSERT_TAIL(bbout, pbkt_out);

              APR_BRIGADE_INSERT_TAIL(bbout,
                  apr_bucket_eos_create(r->connection->bucket_alloc));
              APR_BUCKET_REMOVE(pbkt_in);
              break;
          }

          rv = apr_bucket_read(pbkt_in, &data, &len, eBlock);
          if (rv != APR_SUCCESS)
              return rv;

          … add len bytes to a buffer

          apr_bucket_delete(pbkt_in);
      }
      return APR_SUCCESS;
  }

And mostly taken from mod_example.

What I am seeing is a first brigade with the POST content; with a terminating 
EOF. The bbout data makes it to the CGI script or (reverse) proxy. 

But I am then getting a second _input_filter call with a second brigade of just 
an EOS packet.

What causes that ? Or am I not running through the brigade properly ?

Dw



Re: input filter problem

2013-05-01 Thread Eric Covener
> I understand that, however I don't understand how to make my module run the
> input_filters without consuming the request.
>
> What I mean is that if I add the following, it will consume the actual
> request.
>
> rv = ap_get_brigade(r->input_filters, bb, AP_MODE_READBYTES,
> APR_BLOCK_READ, HUGE_STRING_LEN);
>
> if (rv != APR_SUCCESS) {
> ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r, "Error reading
> request entity data");
> return HTTP_INTERNAL_SERVER_ERROR;
> }
>
> for (bucket = APR_BRIGADE_FIRST(bb); bucket !=
> APR_BRIGADE_SENTINEL(bb); bucket = APR_BUCKET_NEXT(bucket)) {
> const char *data;
> apr_size_t len;
>
> Do you have any suggestions?


I didn't realize your handler returned DECLINED.  Either it will run
before the post body is read, or it will not run at all.  You can't
really write anything that runs before the real handler that also
notices something your input filter will change.  There is a module in
2.4 that allows the request body to be buffered, but I don't know much
about it. http://httpd.apache.org/docs/trunk/mod/mod_buffer.html

I guess it mostly depends on what you expect to do with the
information outside of the filter.

--
Eric Covener
cove...@gmail.com


Re: input filter problem

2013-05-01 Thread Marian Marinov

On 05/01/2013 04:23 PM, Eric Covener wrote:

>> However I'm unable to make my input filter function to be called before my
>> handler.
>
> Input filters are normally called during the handler, when the handler
> tries to read the request body through the filter chain.
>
> In your case, the body is probably not read until the core of Apache
> sees it needs to discard the body the handler didn't read towards the
> end of the request. You'll need to read the body (see e.g.
> mod_cgi.c near r->input_filters)



I understand that, however I don't understand how to make my module run the 
input_filters without consuming the request.

What I mean is that if I add the following, it will consume the actual request.

    rv = ap_get_brigade(r->input_filters, bb, AP_MODE_READBYTES,
                        APR_BLOCK_READ, HUGE_STRING_LEN);

    if (rv != APR_SUCCESS) {
        ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r,
                      "Error reading request entity data");
        return HTTP_INTERNAL_SERVER_ERROR;
    }

    for (bucket = APR_BRIGADE_FIRST(bb);
         bucket != APR_BRIGADE_SENTINEL(bb);
         bucket = APR_BUCKET_NEXT(bucket)) {
        const char *data;
        apr_size_t len;

Do you have any suggestions?


Re: input filter problem

2013-05-01 Thread Eric Covener
> However I'm unable to make my input filter function to be called before my
> handler.

Input filters are normally called during the handler, when the handler
tries to read the request body through the filter chain.

In your case, the body is probably not read until the core of Apache
sees it needs to discard the body the handler didn't read towards the
end of the request. You'll need to read the body (see e.g.
mod_cgi.c near r->input_filters)
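
To make the ordering concrete, here is a small standalone model in plain C of a pull-driven input chain. None of these names are httpd APIs (`toy_request`, `toy_pull_body` and friends are invented for the sketch), and the end-of-request step is simplified to "discard only if nobody read the body". The point it tries to show is that the filter function runs only when something pulls the body through it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy pull-model chain; names are illustrative, not httpd APIs. */
struct toy_request {
    const char *body;      /* unread request body */
    int filter_calls;      /* how often the input filter ran */
    int body_read;         /* nonzero once someone pulled the body */
};

/* The "input filter": it only runs when somebody pulls from the chain. */
const char *toy_pull_body(struct toy_request *r)
{
    assert(r != NULL);
    r->filter_calls++;
    r->body_read = 1;
    return r->body;
}

/* A handler that declines without touching the body. */
int toy_declining_handler(struct toy_request *r)
{
    (void)r;
    return -1; /* stand-in for DECLINED */
}

/* A handler that reads the body, which is what drives the filter. */
int toy_reading_handler(struct toy_request *r)
{
    return toy_pull_body(r) != NULL ? 0 : -1; /* 0 stands in for OK */
}

/* End-of-request cleanup (simplified): the core discards an unread body,
 * and that late pull is the first time the filter runs if no handler
 * read it earlier. */
void toy_finish_request(struct toy_request *r)
{
    assert(r != NULL);
    if (!r->body_read) {
        (void)toy_pull_body(r);
    }
}
```

With the declining handler, `filter_calls` stays at 0 until `toy_finish_request` runs, which matches a log where the handler message appears before any filter message.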


input filter problem

2013-05-01 Thread Marian Marinov

Hi guys,
I'm developing a module that needs to do the following things:
1. Read POST arguments
2. Read input headers
3. Read Query string arguments

I have decided to implement it with an input filter for the POST parsing
(since this is the only way to look into the POST without consuming it),
and to implement the headers and query string parsing in ap_hook_handler().

However I'm unable to make my input filter function to be called before my 
handler.

Can you please help?

All I see in the logs is this:
[Wed May 01 05:59:47 2013] [error] [client 10.2.0.4] mod_imp handler called
[Wed May 01 05:59:47 2013] [error] [client 10.2.0.4] mod_imp insert filter(0)
[Wed May 01 05:59:47 2013] [error] [client 10.2.0.4] mod_imp POST: user=123
[Wed May 01 05:59:47 2013] [error] [client 10.2.0.4] mod_imp found 'user'
[Wed May 01 05:59:47 2013] [error] [client 10.2.0.4] mod_imp insert filter(1)
[Wed May 01 05:59:47 2013] [error] [client 10.2.0.4] mod_imp insert filter(2)

I know that using connection->base_server->module_config will not work.
Unfortunately I'm out of ideas and have started trying everything I know.


All I want to do is to tell my handler whether the input filter function has
matched or not.

This is my code:


static int imp_handler(request_rec *r) {
    imp_config *cfg =
        ap_get_module_config(r->connection->base_server->module_config, &imp_module);

    // Only handle initial requests
    if ( !ap_is_initial_req(r) )
        return DECLINED;

    if ( cfg->match == 1 || apr_table_get(r->headers_in, "matched") ||
         apr_table_get(r->notes, "matched") ) {
        ap_log_rerror(APLOG_MARK, APLOG_ERR | APLOG_NOERRNO, 0, r,
                      MODULE_NAME " MATCH found");
        cfg->match = 0;
    }
    ap_log_rerror(APLOG_MARK, APLOG_ERR | APLOG_NOERRNO, 0, r,
                  MODULE_NAME " handler called");

    return DECLINED;
}

static apr_status_t imp_filter(ap_filter_t *f, apr_bucket_brigade *bb,
                               ap_input_mode_t mode, apr_read_type_e block,
                               apr_off_t readbytes) {

    apr_bucket *bktIn, *bktOut;
    apr_status_t ret;
    apr_size_t len;
    const char *data;
    imp_ctx *ctx = f->ctx;
    imp_config *cfg = ap_get_module_config(f->c->base_server->module_config,
                                           &imp_module);

    if ( ctx == NULL ) {
        ctx = f->ctx = apr_pcalloc(f->r->pool, sizeof(imp_ctx));
        ctx->count = 0;
    }

    ap_log_rerror(APLOG_MARK, APLOG_ERR | APLOG_NOERRNO, 0, f->r,
                  MODULE_NAME " insert filter(%d)", ctx->count);
    ctx->count++;

    ret = ap_get_brigade(f->next, bb, mode, block, readbytes);
    if (ret != APR_SUCCESS)
        return ret;

    while(!APR_BRIGADE_EMPTY(bb)) {
        bktIn = APR_BRIGADE_FIRST(bb);

        if (APR_BUCKET_IS_EOS(bktIn)) {
            APR_BUCKET_REMOVE(bktIn);
            APR_BRIGADE_INSERT_TAIL(bb, bktIn);
            break;
        }

        ret=apr_bucket_read(bktIn, &data, &len, block);
        if ( ret != APR_SUCCESS )
            return ret;

        ap_log_rerror(APLOG_MARK, APLOG_ERR | APLOG_NOERRNO, 0, f->r,
                      MODULE_NAME " POST: %s", data);
        if (strstr(data, "user") != NULL) {
            ap_log_rerror(APLOG_MARK, APLOG_ERR | APLOG_NOERRNO, 0, f->r,
                          MODULE_NAME " found 'user'");
            cfg->match = 1;
            apr_table_setn(f->r->headers_in, "matched", "1");
            apr_table_setn(f->r->notes, "matched", "1");
        }

        // The following two lines reintroduce the parsed data to the brigade
        bktOut = apr_bucket_heap_create(data, len, 0,
                                        f->r->connection->bucket_alloc);
        APR_BRIGADE_INSERT_TAIL(bb, bktOut);

        apr_bucket_delete(bktIn);
    }
    return APR_SUCCESS;
}

static void imp_add_filter(request_rec *r) {
    imp_config *cfg = (imp_config *)
        ap_get_module_config(r->base_server->module_config, &imp_module);
    cfg->count = 0;
    cfg->match = 0;
    ap_add_input_filter("imp-post-filter", cfg, r, r->connection);
}


static int imp_pre_conn(conn_rec *c, void *csd) {
    imp_config *cfg = ap_get_module_config(c->base_server->module_config,
                                           &imp_module);
    if ( cfg == NULL )
        cfg = apr_pcalloc(c->pool, sizeof(imp_config));
    cfg->count = 0;
    cfg->match = 0;
    return OK;
}

static void imp_register_hooks(apr_pool_t *p) {
    ap_hook_pre_connection(imp_pre_conn, NULL, NULL, APR_HOOK_MIDDLE);
    ap_hook_insert_filter(imp_add_filter, NULL, NULL, APR_HOOK_MIDDLE);
    ap_register_input_filter("imp-post-filter", imp_filter, NULL,
                             AP_FTYPE_RESOURCE);
    ap_hook_header_parser(imp_handler, NULL, NULL, APR_HOOK_MIDDLE);
}


Thank you very much!
Marian


Re: What is a "clogging" input filter?

2013-04-22 Thread Jim Jagielski
In process_socket() when we create cs, should we explicitly
set cs->pub.sense to CONN_SENSE_DEFAULT?

Otherwise, +1 (with a mmn bump).

On Apr 19, 2013, at 6:09 PM, Graham Leggett  wrote:

> On 19 Apr 2013, at 6:38 PM, Eric Covener  wrote:
> 
>>> What would it take to make mod_ssl a non-clogging filter?
>> 
>> Basically needs two things:
>> 1) Never buffer data inside ssl (feasible I believe)
>> 2) Create a new way for a filter deep inside the chain to indicate we
>> need to wait for a read/write ability on a socket -- this
>> fundamentally is why a Serf bucket model is better than our chained
>> filters.  An alternative  approach would be changing the flow so
>> filters don't directly call the next filter, putting the core back in
>> control of filter flow, but this is still a non-trivial project.
> 
> Would something like the attached patch work?
> 
> Looking closer at mod_ssl, it seems that when it writes, it wires in both the 
> input and output filter stacks, which is good, but when it reads, only the 
> input filter stack is wired in, which means it can never write. This might 
> need to be looked at too.
> 
> Regards,
> Graham
> --
> 



Re: What is a "clogging" input filter?

2013-04-19 Thread Graham Leggett
On 19 Apr 2013, at 6:38 PM, Eric Covener  wrote:

>> What would it take to make mod_ssl a non-clogging filter?
> 
> Basically needs two things:
> 1) Never buffer data inside ssl (feasible I believe)
> 2) Create a new way for a filter deep inside the chain to indicate we
> need to wait for a read/write ability on a socket -- this
> fundamentally is why a Serf bucket model is better than our chained
> filters.  An alternative  approach would be changing the flow so
> filters don't directly call the next filter, putting the core back in
> control of filter flow, but this is still a non-trivial project.

Would something like the attached patch work?

Looking closer at mod_ssl, it seems that when it writes, it wires in both the 
input and output filter stacks, which is good, but when it reads, only the 
input filter stack is wired in, which means it can never write. This might need 
to be looked at too.

Regards,
Graham
--


httpd-unclog.patch
Description: Binary data




Re: What is a "clogging" input filter?

2013-04-19 Thread Eric Covener
On Fri, Apr 19, 2013 at 11:14 AM, Graham Leggett  wrote:
> Hi all,
>
> Currently mod_ssl doesn't work with the event MPM, as per the following code:
>
> if (c->clogging_input_filters && !c->aborted) {
> /* Since we have an input filter which 'cloggs' the input stream,
>  * like mod_ssl, lets just do the normal read from input filters,
>  * like the Worker MPM does.
>  */
> apr_atomic_inc32(&clogged_count);
> ap_run_process_connection(c);
> if (cs->pub.state != CONN_STATE_SUSPENDED) {
> cs->pub.state = CONN_STATE_LINGER;
> }
> apr_atomic_dec32(&clogged_count);
> }
>
> Can anyone explain what it means exactly to clog the input stream?
>
> Am I right in understanding that openssl reads ahead, and therefore there may 
> not be data for reading when we get to apr_pollset_poll()?
>
> Or is this instead to do with openssl returning SSL_ERROR_WANT_READ during 
> SSL_write? (and vice versa)?
>

I have this starred response from Paul Querna in my gmail:

-
It means the filter buffers in such a way that it might have data
inside it, but a poll() on the socket will never return active when it
has data ready.  It also has to do with filters that change the
read/write flow -- we currently don't have a way for an output filter
for example to say "come back to me when there is data available to
READ".  This is specifically the mod_ssl problem, because of the SSL
protocol, you often need to do a write before you can read, or vice
versa.

> What would it take to make mod_ssl a non-clogging filter?

Basically needs two things:
1) Never buffer data inside ssl (feasible I believe)
2) Create a new way for a filter deep inside the chain to indicate we
need to wait for a read/write ability on a socket -- this
fundamentally is why a Serf bucket model is better than our chained
filters.  An alternative  approach would be changing the flow so
filters don't directly call the next filter, putting the core back in
control of filter flow, but this is still a non-trivial project.
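
A toy model may help picture the first half of that explanation. The sketch below is plain C with invented names (`toy_conn`, `toy_filter_read`, and so on); it does not use OpenSSL or APR. It shows a filter that reads ahead from the socket and keeps the surplus in a private buffer, so that a poll on the socket reports nothing readable even though data is still pending for the application.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a buffering ("clogging") filter in front of a socket.
 * All names are made up for illustration. */
struct toy_conn {
    size_t socket_bytes;    /* bytes the kernel still holds for us */
    size_t buffered_bytes;  /* bytes the filter already pulled and kept */
};

/* What a poll() on the socket can see: only kernel-side data. */
int toy_socket_readable(const struct toy_conn *c)
{
    assert(c != NULL);
    return c->socket_bytes > 0;
}

/* What the application actually cares about: any data at all. */
int toy_data_pending(const struct toy_conn *c)
{
    assert(c != NULL);
    return c->socket_bytes > 0 || c->buffered_bytes > 0;
}

/* The filter reads ahead: it drains the socket but hands out at most
 * `wanted` bytes, keeping the rest in its private buffer. */
size_t toy_filter_read(struct toy_conn *c, size_t wanted)
{
    size_t given;

    assert(c != NULL);
    c->buffered_bytes += c->socket_bytes;
    c->socket_bytes = 0;

    given = wanted < c->buffered_bytes ? wanted : c->buffered_bytes;
    c->buffered_bytes -= given;
    return given;
}
```

After one `toy_filter_read` of fewer bytes than the socket held, `toy_socket_readable` is false while `toy_data_pending` is still true; an event loop that only polls the socket would wait forever in that state.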





--
Eric Covener
cove...@gmail.com


What is a "clogging" input filter?

2013-04-19 Thread Graham Leggett
Hi all,

Currently mod_ssl doesn't work with the event MPM, as per the following code:

    if (c->clogging_input_filters && !c->aborted) {
        /* Since we have an input filter which 'cloggs' the input stream,
         * like mod_ssl, lets just do the normal read from input filters,
         * like the Worker MPM does.
         */
        apr_atomic_inc32(&clogged_count);
        ap_run_process_connection(c);
        if (cs->pub.state != CONN_STATE_SUSPENDED) {
            cs->pub.state = CONN_STATE_LINGER;
        }
        apr_atomic_dec32(&clogged_count);
    }

Can anyone explain what it means exactly to clog the input stream?

Am I right in understanding that openssl reads ahead, and therefore there may 
not be data for reading when we get to apr_pollset_poll()?

Or is this instead to do with openssl returning SSL_ERROR_WANT_READ during 
SSL_write? (and vice versa)?

Regards,
Graham
--





Re: chunked trailers input filter processing by ap_http_filter should be documented

2013-02-12 Thread Joe Schaefer
Thx Bill!  I will let a few days of lazy consensus
pass before committing this to trunk.





>
> From: William A. Rowe Jr. 
>To: dev@httpd.apache.org 
>Cc: joe_schae...@yahoo.com 
>Sent: Tuesday, February 12, 2013 11:52 AM
>Subject: Re: chunked trailers input filter processing by ap_http_filter should 
>be documented
> 
>On Sun, 10 Feb 2013 08:25:35 -0800 (PST)
>Joe Schaefer  wrote:
>
>> Here's a sledgehammer patch to ap_rgetline_core()
>> to replace r->input_filters with r->proto_input_filters.
>> This would still mean protocol filters behind ap_http_filter
>> would need to punt on these calls, but that's a lot
>> more reasonable than imposing it on AP_FTYPE_RESOURCE
>> and similar filters as well.
>
>+1, seems much saner.  Request filters have no business reading
>beyond end of request body.
>
>
>

Re: chunked trailers input filter processing by ap_http_filter should be documented

2013-02-12 Thread William A. Rowe Jr.
On Sun, 10 Feb 2013 08:25:35 -0800 (PST)
Joe Schaefer  wrote:

> Here's a sledgehammer patch to ap_rgetline_core()
> to replace r->input_filters with r->proto_input_filters.
> This would still mean protocol filters behind ap_http_filter
> would need to punt on these calls, but that's a lot
> more reasonable than imposing it on AP_FTYPE_RESOURCE
> and similar filters as well.

+1, seems much saner.  Request filters have no business reading
beyond end of request body.


Re: chunked trailers input filter processing by ap_http_filter should be documented

2013-02-10 Thread Joe Schaefer
Here's a sledgehammer patch to ap_rgetline_core()
to replace r->input_filters with r->proto_input_filters.
This would still mean protocol filters behind ap_http_filter
would need to punt on these calls, but that's a lot
more reasonable than imposing it on AP_FTYPE_RESOURCE
and similar filters as well.



Index: protocol.c
===================================================================
--- protocol.c    (revision 1331861)
+++ protocol.c    (working copy)
@@ -229,7 +229,7 @@
 
     for (;;) {
         apr_brigade_cleanup(bb);
-        rv = ap_get_brigade(r->input_filters, bb, AP_MODE_GETLINE,
+        rv = ap_get_brigade(r->proto_input_filters, bb, AP_MODE_GETLINE,
                             APR_BLOCK_READ, 0);
         if (rv != APR_SUCCESS) {
             return rv;
@@ -346,7 +346,7 @@
             apr_brigade_cleanup(bb);
 
             /* We only care about the first byte. */
-            rv = ap_get_brigade(r->input_filters, bb, AP_MODE_SPECULATIVE,
+            rv = ap_get_brigade(r->proto_input_filters, bb, AP_MODE_SPECULATIVE,
                                 APR_BLOCK_READ, 1);
             if (rv != APR_SUCCESS) {
                 return rv;







>
> From: Joe Schaefer 
>To: "dev@httpd.apache.org"  
>Sent: Sunday, February 10, 2013 11:05 AM
>Subject: chunked trailers input filter processing by ap_http_filter should be 
>documented
> 
>
>So ap_http_filter winds up calling ap_get_mime_headers
>once it recognizes that the request body has finished,
>to process the trailing headers on chunked requests.
>This is actually a strange thing to do, because it means
>ap_http_filter winds up calling ap_get_brigade on
>r->input_filters with AP_MODE_GETLINE set, right in the middle of an
>existing ap_get_brigade sequence on the filter chain.
>In other words, this recursion only works if all
>post-protocol filters are written to punt on processing
>AP_MODE_GETLINE invocations- this is what we need to
>document somewhere if we don't want to fix the code.
>
>
>It would be logically better if there were a way to pass a
>ap_filter_t argument to ap_get_mime_headers so ap_http_filter
>didn't need to reach backwards in the input filter chain just
>to finish its HTTP protocol handling.
>
>
>
>
>
>

chunked trailers input filter processing by ap_http_filter should be documented

2013-02-10 Thread Joe Schaefer
So ap_http_filter winds up calling ap_get_mime_headers
once it recognizes that the request body has finished,
to process the trailing headers on chunked requests.
This is actually a strange thing to do, because it means
ap_http_filter winds up calling ap_get_brigade on
r->input_filters with AP_MODE_GETLINE set, right in the middle of an
existing ap_get_brigade sequence on the filter chain.
In other words, this recursion only works if all
post-protocol filters are written to punt on processing
AP_MODE_GETLINE invocations- this is what we need to
document somewhere if we don't want to fix the code.

It would be logically better if there were a way to pass a
ap_filter_t argument to ap_get_mime_headers so ap_http_filter
didn't need to reach backwards in the input filter chain just
to finish its HTTP protocol handling.
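
What "punting" means for a post-protocol filter can be sketched in a few lines of standalone C. This is a toy model, not the httpd filter API: the `toy_mode` values only stand in for the real read modes, and `toy_next_filter` is a hypothetical upstream source. The body-transforming filter acts only on plain byte reads and passes every other kind of read through untouched, so a nested line read for trailers is not mangled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the read modes; not the real ap_input_mode_t. */
enum toy_mode { TOY_READBYTES, TOY_GETLINE, TOY_SPECULATIVE };

/* Hypothetical upstream source: copies src into out (truncating to
 * cap - 1 bytes) and returns the number of bytes delivered. */
size_t toy_next_filter(const char *src, char *out, size_t cap)
{
    size_t n;

    assert(src != NULL && out != NULL && cap > 0);
    n = strlen(src);
    if (n >= cap) {
        n = cap - 1;
    }
    memcpy(out, src, n);
    out[n] = '\0';
    return n;
}

/* A body-transforming filter (here: upper-casing). It transforms only
 * TOY_READBYTES reads; for any other mode it punts and returns the
 * upstream data unchanged. */
size_t toy_upcase_filter(const char *src, char *out, size_t cap,
                         enum toy_mode mode)
{
    size_t n = toy_next_filter(src, out, cap);
    size_t i;

    if (mode != TOY_READBYTES) {
        return n; /* punt: not a body read */
    }
    for (i = 0; i < n; i++) {
        out[i] = (char)toupper((unsigned char)out[i]);
    }
    return n;
}
```

A filter written without the early return would upper-case trailer lines as well, which is the kind of breakage the recursion described above invites.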

Re: rethinking connection input filter API

2011-11-11 Thread Jim Jagielski

On Nov 11, 2011, at 4:25 AM, Graham Leggett wrote:

> 
> My moon-on-a-stick would be for this to be ap_mpm_poll_connections() instead, 
> and allow the option to add extra connections to the poll (for something like 
> mod_proxy and friends, with the option to have each of these extra 
> connections removed by registering a pool cleanup appropriately). In theory, 
> ap_mpm_poll_connections() would return the connection that triggered the 
> event somehow.
> 

+1

> Some thoughts about async SSL handling, specifically SSL_read or SSL_write 
> returning WANTSREAD or WANTSWRITE, in theory EAGAIN just means "call me 
> again", but if there was a way to express "call me again when ready to read" 
> (EREADAGAIN), or "call me again when ready to write" (EWRITEAGAIN) we can do 
> proper async SSL. Not sure how that would fit in with the stuff above?



Re: rethinking connection input filter API

2011-11-11 Thread Jim Jagielski

On Nov 10, 2011, at 11:42 AM, Paul Querna wrote:
> 
> This means every input filter, must implement all of these different
> modes.  This significantly complicates modules like mod_ssl and
> reqtimeout.  Every module to support things like speculative must
> support buffering internally.
> 
> I would like to change the function signature for connection level
> input filters to:
> 
> apr_status_t  func(
> ap_filter_t *f,
> apr_bucket_brigade *b);
> 
> The function would always be allowed to return EAGAIN.  It would not be
> responsible for polling or blocking.
> 
> This would massively simplify how most input filters work, and would
> enable more creative methods of network IO inside the MPMs.
> 

Assuming that we want to keep input filters (and filters in
general), certainly the current design needs this kind of
adjustment.

++1 on centralization and simplification.



Re: rethinking connection input filter API

2011-11-11 Thread Graham Leggett

On 10 Nov 2011, at 9:42 PM, Paul Querna wrote:


> The input filter API function signature is the following:
>
>    apr_status_t func(
>        ap_filter_t *f,
>        apr_bucket_brigade *b,
>        ap_input_mode_t mode,
>        apr_read_type_e block,
>        apr_off_t readbytes);
>
> Problems:
>
> 1) This gives the caller of the API control over whether they can
> block via the apr_read_type_e block.
>
> 2) The caller also suggests how much data 'should' be read by passing
> readbytes, however many modules just pull in whatever they can, or
> they have to put in extra logic to keep a backlog buffer.
>
> 3) Currently we have these modes:
>
>    AP_MODE_READBYTES,
>    AP_MODE_GETLINE,
>    AP_MODE_EATCRLF,
>    AP_MODE_SPECULATIVE,
>    AP_MODE_EXHAUSTIVE,
>    AP_MODE_INIT
>
> This means every input filter must implement all of these different
> modes.  This significantly complicates modules like mod_ssl and
> reqtimeout.  To support things like speculative reads, every module
> must support buffering internally.
>
> I would like to change the function signature for connection level
> input filters to:
>
>    apr_status_t  func(
>        ap_filter_t *f,
>        apr_bucket_brigade *b);
>
> The function would always be allowed to return EAGAIN.  It would not be
> responsible for polling or blocking.
>
> This would massively simplify how most input filters work, and would
> enable more creative methods of network IO inside the MPMs.
>
> Most of the work would be in ap_rgetline_core, and figuring out how to
> make it work without pushing complexity into the protocol parsers.
>
> It would however reduce the total lines of code I believe, because we
> would be centralizing all input buffering and different mode needs
> into one place.


Huge +1.


> I believe we would need to add a new method to the MPMs,
> ap_mpm_poll_connection() as a transitional API.  This could be called
> by ap_rgetline_core or ap_get_brigade, and in the traditional IO
> MPMs like Worker, and Prefork, select/poll on the socket until it is
> available. I'm unsure about how to best fix ap_get_brigade, it is also
> used within mod_proxy for client IO.


My moon-on-a-stick would be for this to be ap_mpm_poll_connections()  
instead, and allow the option to add extra connections to the poll  
(for something like mod_proxy and friends, with the option to have  
each of these extra connections removed by registering a pool cleanup  
appropriately). In theory, ap_mpm_poll_connections() would return the  
connection that triggered the event somehow.


Some thoughts about async SSL handling, specifically SSL_read or  
SSL_write returning WANTSREAD or WANTSWRITE, in theory EAGAIN just  
means "call me again", but if there was a way to express "call me  
again when ready to read" (EREADAGAIN), or "call me again when ready  
to write" (EWRITEAGAIN) we can do proper async SSL. Not sure how that  
would fit in with the stuff above?


Regards,
Graham
--



Re: svn commit: r1200612 - in /httpd/httpd/branches/input-filter-dev: include/util_filter.h modules/debugging/mod_dumpio.c modules/filters/mod_reqtimeout.c modules/ssl/ssl_engine_io.c server/core_filt

2011-11-10 Thread William A. Rowe Jr.

On 11/10/2011 5:53 PM, Paul Querna wrote:


> I think this got fixed when mod_ssl was changed to support RFC 2817 -
> Upgrading to TLS Within HTTP/1.1.


That would make sense.



Re: svn commit: r1200612 - in /httpd/httpd/branches/input-filter-dev: include/util_filter.h modules/debugging/mod_dumpio.c modules/filters/mod_reqtimeout.c modules/ssl/ssl_engine_io.c server/core_filt

2011-11-10 Thread Paul Querna
On Thu, Nov 10, 2011 at 3:35 PM, William A. Rowe Jr.
 wrote:
> On 11/10/2011 4:55 PM, pque...@apache.org wrote:
>>
>> URL: http://svn.apache.org/viewvc?rev=1200612&view=rev
>> Log:
>> Remove AP_MODE_INIT, it is a no-op, everywhere
>
> This was added in order to init ssl connections on protocols which
> follow a write-then-read logic like snmp or ftp.
>
> I should be able to create a proof of concept quickly just adding a
> mod_echo welcome message and test if it's still needed for mod_ssl.
>

It's not needed in mod_ssl though -- it just ignored it.

I think this got fixed when mod_ssl was changed to support RFC 2817 -
Upgrading to TLS Within HTTP/1.1.

If every module just ignores it and treats it as a no-op, why does it
matter what the protocol module wants to do?


Re: svn commit: r1200612 - in /httpd/httpd/branches/input-filter-dev: include/util_filter.h modules/debugging/mod_dumpio.c modules/filters/mod_reqtimeout.c modules/ssl/ssl_engine_io.c server/core_filt

2011-11-10 Thread William A. Rowe Jr.

On 11/10/2011 4:55 PM, pque...@apache.org wrote:


> URL: http://svn.apache.org/viewvc?rev=1200612&view=rev
> Log:
> Remove AP_MODE_INIT, it is a no-op, everywhere


This was added in order to init ssl connections on protocols which
follow a write-then-read logic like snmp or ftp.

I should be able to create a proof of concept quickly just adding a
mod_echo welcome message and test if it's still needed for mod_ssl.


rethinking connection input filter API

2011-11-10 Thread Paul Querna
The input filter API function signature is the following:

apr_status_t func(
 ap_filter_t *f,
 apr_bucket_brigade *b,
 ap_input_mode_t mode,
 apr_read_type_e block,
 apr_off_t readbytes);

Problems:

1) This gives the caller of the API control over whether they can
block via the apr_read_type_e block.

2) The caller also suggests how much data 'should' be read by passing
readbytes, however many modules just pull in whatever they can, or
they have to put in extra logic to keep a backlog buffer.

3) Currently we have these modes:

AP_MODE_READBYTES,
AP_MODE_GETLINE,
AP_MODE_EATCRLF,
AP_MODE_SPECULATIVE,
AP_MODE_EXHAUSTIVE,
AP_MODE_INIT

This means every input filter must implement all of these different
modes.  This significantly complicates modules like mod_ssl and
reqtimeout.  To support things like speculative reads, every module
must support buffering internally.

I would like to change the function signature for connection level
input filters to:

apr_status_t  func(
 ap_filter_t *f,
 apr_bucket_brigade *b);

The function would always be allowed to return EAGAIN.  It would not be
responsible for polling or blocking.

This would massively simplify how most input filters work, and would
enable more creative methods of network IO inside the MPMs.

Most of the work would be in ap_rgetline_core, and figuring out how to
make it work without pushing complexity into the protocol parsers.

It would however reduce the total lines of code I believe, because we
would be centralizing all input buffering and different mode needs
into one place.
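
To give a feel for what "centralizing all input buffering" could look like, here is a standalone toy in plain C. It is only a sketch under assumptions: `toy_source_read` stands in for a simplified filter that hands over whatever it has or reports `TOY_EAGAIN`, and `toy_getline` stands in for the one central place that turns those chunks into lines. None of these names exist in httpd.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_EAGAIN (-1)   /* stand-in for "no data right now" */
#define TOY_BUFSZ  256

/* Simplified filter: hands over whatever chunk is next, or TOY_EAGAIN.
 * It knows nothing about lines, modes, or blocking. */
struct toy_source {
    const char **chunks;  /* a NULL entry means "would block" this round */
    size_t count;
    size_t next;
};

int toy_source_read(struct toy_source *s, const char **chunk)
{
    assert(s != NULL && chunk != NULL);
    if (s->next >= s->count || s->chunks[s->next] == NULL) {
        if (s->next < s->count) {
            s->next++;          /* consume the "would block" slot */
        }
        return TOY_EAGAIN;
    }
    *chunk = s->chunks[s->next++];
    return 0;
}

/* The single central buffer that replaces per-filter mode handling. */
struct toy_reader {
    char buf[TOY_BUFSZ];
    size_t len;
};

/* Try to produce one '\n'-terminated line in `line`. Returns 0 on success,
 * TOY_EAGAIN if the caller should come back after polling, -2 on overflow. */
int toy_getline(struct toy_reader *rd, struct toy_source *s,
                char *line, size_t cap)
{
    assert(rd != NULL && s != NULL && line != NULL && cap > 0);
    for (;;) {
        char *nl = memchr(rd->buf, '\n', rd->len);
        const char *chunk;
        size_t n;

        if (nl != NULL) {
            n = (size_t)(nl - rd->buf) + 1;
            if (n >= cap) {
                return -2;      /* line too long for the caller */
            }
            memcpy(line, rd->buf, n);
            line[n] = '\0';
            memmove(rd->buf, rd->buf + n, rd->len - n);
            rd->len -= n;
            return 0;
        }
        if (toy_source_read(s, &chunk) == TOY_EAGAIN) {
            return TOY_EAGAIN;
        }
        n = strlen(chunk);
        if (rd->len + n > sizeof rd->buf) {
            return -2;          /* toy buffer exhausted */
        }
        memcpy(rd->buf + rd->len, chunk, n);
        rd->len += n;
    }
}
```

The source never needs a GETLINE or SPECULATIVE mode here; partial data simply stays in `toy_reader` until the next call, and the caller decides when to poll again after `TOY_EAGAIN`.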

I believe we would need to add a new method to the MPMs,
ap_mpm_poll_connection() as a transitional API.  This could be called
by ap_rgetline_core or ap_get_brigade, and in the traditional IO
MPMs like Worker, and Prefork, select/poll on the socket until it is
available. I'm unsure about how to best fix ap_get_brigade, it is also
used within mod_proxy for client IO.

Thoughts?

Thanks,

Paul


Re: How to do a internal redirect (was: sub-request) from input filter?

2011-04-28 Thread Micha Lenk
Hi Nick,

On Wed, Apr 27, 2011 at 06:39:49PM +0100, Nick Kew wrote:
> You may be a little confused.  Subrequests and internal redirects
> are not the same thing, and serve different purposes. [...]

You're right, what I want is an internal redirect from within my input filter,
not a subrequest.

> A subrequest should be OK in a filter, but if it generates any output
> you'd need to figure out where to send it - which is not obvious in
> an input filter.  As for an internal redirect, I don't see how it would
> make sense, but I haven't tried.

I've just tried it with ap_internal_redirect(), and the redirected request
indeed gets processed almost as intended. I still have the following issues
with this approach:

1.) In addition to the redirected request, the original request ends up at the
backend server too. Only if I return from the input filter with something
non-zero does the request not reach the backend server. But in this case a
"proxy: prefetch request body failed "... error is logged by
ap_proxy_http_request(). From reading the whole code of ap_proxy_http_request()
I wonder whether internal redirects, calling whatever redirect function from
input filters, are supported at all.

2.) The redirected request seems to inherit the request method from the
original request. This is not what I want. The input filter operates on the
request body of a POST request, but the internal redirection should do a GET
request.

What else do you suggest?
Should I parse the request body in a content handler instead of inside an input
filter?

Thanks in advance for your feedback.

Regards,
Micha


Re: How to do a sub-request from input filter?

2011-04-27 Thread Micha Lenk
Hi Graham,

On Wed, Apr 27, 2011 at 07:44:05PM +0200, Graham Leggett wrote:
> On 27 Apr 2011, at 7:07 PM, Micha Lenk wrote:
> >I am using Apache as a reverse proxy to forward requests to a backend web
> >server protected by some self-written Apache modules. Now I would like to
> >do a sub-request to a different location from within an input filter.
> >
> >What is the easiest way to achieve this? Reading Nick Kew's excellent
> >Apache Modules Book I've found the function ap_internal_redirect(), but
> >does this function also work from an input filter? Currently I doubt it
> >because the first line of the HTTP request for the backend web server is
> >built very early in ap_proxy_http_request(), apparently before running
> >the input filters...
> >Do you have any better suggestion?
> >
> >What else should I keep in mind in this situation?
> 
> It depends, do you want the subrequest to replace the main request?
> Or do you want the subrequest to happen, and then for the main
> request to continue as it was before?

I want the subrequest to replace the main request.

Regards,
Micha


Re: How to do a sub-request from input filter?

2011-04-27 Thread Graham Leggett

On 27 Apr 2011, at 7:07 PM, Micha Lenk wrote:

> I am using Apache as a reverse proxy to forward requests to a backend web
> server protected by some self-written Apache modules. Now I would like to
> do a sub-request to a different location from within an input filter.
>
> What is the easiest way to achieve this? Reading Nick Kew's excellent
> Apache Modules Book I've found the function ap_internal_redirect(), but
> does this function also work from an input filter? Currently I doubt it
> because the first line of the HTTP request for the backend web server is
> built very early in ap_proxy_http_request(), apparently before running
> the input filters...
>
> Do you have any better suggestion?
>
> What else should I keep in mind in this situation?


It depends, do you want the subrequest to replace the main request? Or  
do you want the subrequest to happen, and then for the main request to  
continue as it was before?


Regards,
Graham
--



Re: How to do a sub-request from input filter?

2011-04-27 Thread Nick Kew

On 27 Apr 2011, at 18:07, Micha Lenk wrote:

> What else should I keep in mind in this situation?

You may be a little confused.  Subrequests and internal redirects
are not the same thing, and serve different purposes.  Maybe the
book, which discusses them in consecutive sections in the chapter
on the request processing cycle, should be clearer.

A subrequest should be OK in a filter, but if it generates any output
you'd need to figure out where to send it - which is not obvious in
an input filter.  As for an internal redirect, I don't see how it would
make sense, but I haven't tried.

BTW, this would be better on the modules-dev list.

-- 
Nick Kew

Available for work, contract or permanent
http://www.webthing.com/~nick/cv.html



How to do a sub-request from input filter?

2011-04-27 Thread Micha Lenk
Hi,

I am using Apache as a reverse proxy to forward requests to a backend web
server protected by some self-written Apache modules. Now I would like to do a
sub-request to a different location from within an input filter.

What is the easiest way to achieve this? Reading Nick Kew's excellent Apache
Modules Book I've found the function ap_internal_redirect(), but does this
function also work from an input filter? Currently I doubt it because the first
line of the HTTP request for the backend web server is built very early in
ap_proxy_http_request(), apparently before running the input filters...
Do you have any better suggestion?

What else should I keep in mind in this situation?

Regards,
Micha


Apache Input Filter

2010-08-31 Thread Varanasi, Sasikala
Hi,
 
I have a need to capture the incoming request body each time a request
comes into Apache Proxy server. I have mod_proxy and mod_filters setup
and I have an input filter that I wrote that captures the payload from
the request and logs it out. The input filter is able to do this,
however, the filter is processing the same request twice. The input
filter gets invoked 7-8 times for each request, most of the time with
empty buf on apr_bucket_read and twice with the same request contents.
What do I need to do to change the filter not to process the request
contents twice? How do I know the request has been already processed
once? Here is the part of the code that I have in the filter. Anything
that I missed? thank you in advance for your help.
 
if (ap_is_initial_req(r)) {
    if ( ret = ap_get_brigade(f->next, bb, mode, block, nbytes), ret == APR_SUCCESS ) {
        const char* buf;
        for ( b = APR_BRIGADE_FIRST(bb); b != APR_BRIGADE_SENTINEL(bb); b = APR_BUCKET_NEXT(b) )
        {
            int len;
            if (apr_bucket_read(b,&buf,&bytes,APR_NONBLOCK_READ) == APR_SUCCESS) {
                read = 1;
            }
        }

        //if anything is read, send out the payload
        if (read == 1) {
            //check if buf is all empty
            char *newbuf = stripspaces(buf);
            int l = strlen(newbuf);

            if (l == 0 || l == 1) {
                fprintf(stderr, "the buf length is 0 or 1 - all empty! not sending data \n");
            } else {
                fprintf(stderr, "unparsed uri is: %s \n", r->unparsed_uri);
                fprintf(stderr, "The request buf is --- %s", buf);
            }
        }
    }
}
return ret ;


input filter module, chunked encoding, end-of-request indication?

2010-07-01 Thread David Youatt
Newbie input filter module writer, so apologies if this is the wrong list.

From inside an input filter module, when handling filter routine calls and
iterating over buckets, with chunked encoding, what's the (or a) termination
condition for end of request body?  Is it an EOS metadata bucket?

Thanks.  And apologies if this is the wrong list, but please point me to the 
right
one if it is?


consulting help with input filter for decompression

2010-06-23 Thread David Youatt
 We're looking for an experienced consultant who can guide us through
writing a new filter module for a custom compression/encryption
technique.  We have a prototype output filter that's "working", and need
help and guidance with the input filter.  Our prototype was based on
mod_deflate, but the underlying compression/encryption library is
significantly different from libz.

Development environment is linux and apache 2.2.

Please respond directly.

If you're in the greater Seattle, WA, USA area that's preferable but not
required.




Re: Recv'ing data directly from a socket and running it through SSL input filter.

2009-11-03 Thread Shane Pope
This has been solved by creating a new input filter, replacing core
with it, and calling ap_get_brigade on my new input_filter chain.
Apologies for so many messages,
Shane


Re: Recv'ing data directly from a socket and running it through SSL input filter.

2009-11-03 Thread Shane Pope
Essentially what I'm asking is,
I have data in a char *
I have the filters (r->input_filters) I need to run this through.
Is there a way to send this data through that filter?

I've found some code that will let me place that data in a bucket brigade.

apr_bucket_brigade *bb = apr_brigade_create(r->pool, c->bucket_alloc);
apr_bucket *b = apr_bucket_transient_create(str, len, c->bucket_alloc);
APR_BRIGADE_INSERT_TAIL(bb, b);

But then what?

Thank you,
Shane

On Tue, Nov 3, 2009 at 12:29 AM, Shane Pope  wrote:
>
> Hello modules-dev mailing list,
> I'm attempting to write a module that proxies bits from an outside host to a 
> local port under SSL. I have everything working except reading data from the 
> client socket, and sending it through the SSL filter to decrypt it.
>
> For outputting bits to the client connection I can get the data from the 
> local socket like this
>
> rv = apr_socket_recv(local_socket, buffer, &nbytes);
>
> and call these two functions, which would pass the bits through the output 
> filters.
>
> ap_rwrite(buffer, nbytes, r);
> ap_rflush(r);
>
> But I cannot read bits directly from the client socket, because it's 
> encrypted. I've attempted to pass the bits through input_filters but failed.
>
> rv = apr_socket_recv(client_socket, buffer, &nbytes);
> //tried passing bits through filters, but don't know what I'm doing wrong. 
> what next?
>
> Thanks,
> Shane


Re: Input filter to process POST variables (e.g. mod_form)

2009-03-29 Thread Erik Westland

- Original Message 

> From: Nick Kew 
> To: modules-...@httpd.apache.org
> Sent: Wednesday, March 25, 2009 4:10:39 PM
> Subject: Re: Input filter to process POST variables (e.g. mod_form)
> 
> 
> On 25 Mar 2009, at 14:34, Erik Westland wrote
> 
> > 
> >> From: Erik Westland 
> >> To: modules-...@httpd.apache.org
> >> Sent: Saturday, March 21, 2009 11:22:52 PM
> >> Subject: Input filter to process POST variables (e.g. mod_form)
> 
> 
> I'm sure I recollect replying to that question a few days ago:
> in fact my mailer shows your question as replied to.  Guess
> maybe it got lost in the ether.
> 
> >> It seems that "mod_form" (see: http://apache.webthing.com/mod_form/) does 
> >> the
> >> parameter processing, but comes with the following ominous caveat:
> >> "mod_form works with GET but is largely untested under POST, and appears 
> >> to 
> have
> >> problems with POST and HTTP Keepalive. That will be fixed, but it's not
> >> currently a priority."
> 
> There was some test case where it hung.  I've also had reports of it working
> successfully with POST.
> 
> >> Q: Does anyone know if this is safe to use? Their support page lists 
> >> mod_form
> >> as: "unlikely to represent value to you."
> 
> That means I don't want anyone making the voluntary donation my pages invite
> on the false premise that it's a tried-and tested production-grade module.
> 
> >> Q: Is there another alternative to writing my own? If not, any idea what 
> >> the
> >> keepalive issue was?
> 
> mod_request (didn't exist when I wrote mod_form).  Or apreq (did exist,
> but a more heavyweight solution).
> 
> >> Q:  Apache processes variables in the URL (parses into r->args) during POST
> >> requests, but the mod_form implementation doesn't look for r->args when 
> handling
> >> POST requests. Is this disallowed/discouraged in the spec or an oversight 
> >> in 
> the
> >> implementation?
> 
> Both.  It's an edge case, and a mod_form incompleteness.  But it won't bite 
> you
> unless your application uses a POST form with a query_string in the action.
> Or perhaps does similar for non-browser data.
> 

Nick,

Thanks for the tip on libapreq, it should work for me (until mod_request is 
generally available). I am having some issues using it though, it seg faults 
when I POST data AND ask for the body in my input filter. I am assuming I am 
doing something out of sequence (hopefully not violating some operational 
principle), but need to dig more (and post to the apreq-dev group).

BTW - I am open to giving to donations to you (or anyone else of similar 
caliber) to answer about 20 mins of questions. I will be in the SF Bay Area 
this afternoon. If you are interested, drop me a message (erikwestland  yahoo 
com).

Cheers,
Erik


  


How to share context between input filter and output filter

2008-12-17 Thread Ashish Khare
Hi,

I am facing a problem: how to share a context between an input filter and
an output filter which are registered through different hooks.

Here is a description of the problem:


static void
mod_realsecure_register_hooks (apr_pool_t * p)
{
ap_hook_child_init (realsecure_child_init, NULL, NULL, APR_HOOK_MIDDLE);
ap_hook_create_request(realsecure_hook_insertfilter , NULL, NULL,
APR_HOOK_MIDDLE);
ap_hook_insert_filter(realsecure_hook_insertoutputfilter , NULL, NULL,
APR_HOOK_MIDDLE);
};

static int realsecure_hook_insertfilter(request_rec *r)
{
   PamReqRspCtx *ctxPam1;
   ctxPam1 = (PamReqRspCtx *)apr_palloc(r->pool,sizeof(PamReqRspCtx));
   ap_add_input_filter("REALSECURE_READ_INPUT_1", (void *) ctxPam1, r,
r->connection);
   ap_add_output_filter("REALSECURE_READ_OUTPUT_1",(void *) ctxPam1, r,
r->connection);
   return OK;
}

static void realsecure_hook_insertoutputfilter(request_rec *r)
{
   PamReqRspCtx *ctxPam2;
   ctxPam2 = (PamReqRspCtx *)apr_palloc(r->pool,sizeof(PamReqRspCtx));
   ap_add_input_filter("REALSECURE_READ_INPUT_2", (void *) ctxPam2, r,
r->connection);
   ap_add_output_filter("REALSECURE_READ_OUTPUT_2",(void *) ctxPam2, r,
r->connection);
   return OK;
}

Now my problem is this: I have created the context (ctxPam1) in the
realsecure_hook_insertfilter(request_rec *r) function and stored some data
in it. How can I access that context and its data (ctxPam1) from the
realsecure_hook_insertoutputfilter(request_rec *r) function?


Please let me know about the solution.

-Ashish
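
One common way to share state between two filters on the same request is to
keep a single context object in a per-request slot and have both filters look
it up, instead of allocating a fresh context in each hook. The sketch below is
a stand-alone toy model, not httpd code: request_t, module_slot, shared_ctx_t
and the *_step functions are hypothetical names invented for illustration. In
a real module the slot would more likely be the request's module config vector
(ap_set_module_config() / ap_get_module_config() on r->request_config), with
the context allocated from r->pool rather than with calloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical shared state that both filters want to see. */
typedef struct {
    int bytes_in;
    int bytes_out;
} shared_ctx_t;

/* Toy stand-in for a request record with one per-module slot. */
typedef struct {
    void *module_slot;
} request_t;

/* Return the request's shared context, creating it on first use. */
static shared_ctx_t *get_shared_ctx(request_t *r)
{
    assert(r != NULL);
    if (r->module_slot == NULL)
        r->module_slot = calloc(1, sizeof(shared_ctx_t));
    return r->module_slot;
}

/* Input side: record how many body bytes were seen. */
static void input_filter_step(request_t *r, int nbytes)
{
    shared_ctx_t *ctx = get_shared_ctx(r);
    if (ctx != NULL)
        ctx->bytes_in += nbytes;
}

/* Output side: update its own counter and report what the input side
 * recorded, which works because both sides share one object. */
static int output_filter_step(request_t *r, int nbytes)
{
    shared_ctx_t *ctx = get_shared_ctx(r);
    if (ctx == NULL)
        return -1;
    ctx->bytes_out += nbytes;
    return ctx->bytes_in;
}
```

The point of the design is that neither filter owns the context; whichever
runs first creates it, and the other one finds it through the request.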

On Tue, Dec 2, 2008 at 6:19 PM, Sorin Manolache  wrote:

> On Tue, Dec 2, 2008 at 10:58, Ashish Khare  wrote:
> > Hi,
> >
> > I have tried with this option AP_FTYPE_PROTOCOL.
> > But my input filter is not called.
> > But when I change the type to AP_FTYPE_RESOURCE, my filter did get called
> > but no header.
> >
> > I have checked the Apache code where following modules filters are
> > registered with this option.
> >
> > ap_register_input_filter("HTTP_IN", ap_http_filter,NULL,
> AP_FTYPE_PROTOCOL);
> > ap_register_input_filter  (ssl_io_buffer, ssl_io_filter_buffer, NULL,
> > AP_FTYPE_PROTOCOL - 1);
> > ap_register_input_filter("NET_TIME", net_time_filter, NULL,
> > AP_FTYPE_PROTOCOL);
> >
> >
> > Please find my code snippet below to check if I am doing correct or not.
> > Let me know if I am doing correct or not.
>
> Hello,
>
> The apache2 call sequence is:
>
> ap_process_http_connection->ap_read_request->create_request callbacks
>
> ->ap_get_mime_headers_core
>
> ->ap_process_request->insert_filter callbacks
>
> As you see, ap_get_mime_headers_core is called before the
> insert_filter callbacks. Thus, the headers are consumed by
> ap_http_filter and they do not reach your filter.
>
> Instead of calling ap_add_input_filter_handle from an insert_filter
> callback, call it from a create_request callback. Thus, it will be
> called before ap_http_filter.
>
> I've attached the code.
>
> S
>
> >
> > code snippet
> > 
> >
> > module AP_MODULE_DECLARE_DATA input_module = {
> >  STANDARD20_MODULE_STUFF,
> >  NULL, /* module initializer  */
> >  NULL, /* create per-dirconfig structures */
> >  NULL, /* merge  per-dirconfig structures */
> >  NULL, /* create per-server config structures */
> >  my_table,/* merge  per-server config structures */
> >  mod_my_register_hooks,
> > };
> >
> > static ap_filter_rec_t * globalInputFilter;
> >
> > static void
> > mod_my_register_hooks (apr_pool_t * p)
> > {
> >  ap_hook_child_init (my_child_init, NULL, NULL, APR_HOOK_MIDDLE);
> >  ap_hook_insert_filter(my_insertfilter , NULL, NULL, APR_HOOK_MIDDLE);
> >  globalInputFilter= ap_register_input_filter("MY_READ_INPUT",
> > my_input_filter ,NULL, AP_FTYPE_PROTOCOL );
> > };
> >
> > static void my_insertfilter (request_rec * r)
> > {
> >   ReqRspCtx *ctx;
> >   ctx = (ReqRspCtx *)apr_palloc(r->pool,sizeof(ReqRspCtx));
> >   ap_add_input_filter_handle ( globalInputFilter, (void *)ctx, r,
> > r->connection);
> > }
> >
> > static apr_status_t my_input_filter (ap_filter_t * f, apr_bucket_brigade
> *
> > bb,
> > ap_input_mode_t mode,apr_read_type_e
> > block,
> >     apr_off_t readbytes)
> > {
> > //Function Body.
> > }
> >
> >
> >
> > Regards,
> > Ashish
> >
> 

Re: Problem setting new HTTP headers from an input filter

2007-10-10 Thread Julian Williams
On 10/10/07, Joe Lewis <[EMAIL PROTECTED]> wrote:
> Julian Williams wrote:

[snip]

> > My code uses calls to apr_table_set to create the headers e.g.
> >
> >   apr_table_set(r->subprocess_env, "HTTP_SAMS_USER", uid);
> >
>
> Try :
>
> apr_table_set(r->headers_in, "HTTP_SAMS_USER", uid);
>
>
> in addition to the subprocess_env table set?  (Just in case you aren't
> getting to a subprocess?)
>

Unfortunately that doesn't seem to make any difference. My filter also
sets the REMOTE_USER header (using r->user) which is being logged
correctly in the apache access log. However the web service
application doesn't pick it up, which suggests that its environment is
set up prior to the filter processing. (It is able to see other
standard server variables such as SERVER_NAME.)

Thanks,
Julian Williams


Re: Problem setting new HTTP headers from an input filter

2007-10-10 Thread Joe Lewis
Julian Williams wrote:
> Hi,
> I have written an Apache module that controls access depending on
> whether a token is present in the request - normally as an HTTP
> cookie. I have recently had to modify the module to include an input
> filter which will spot the token if included in the body of a POST
> request - the SOAP/xml call to a web service. This works OK.
>
> I also need to create some new headers that include user data
> extracted from the token. The objective is that the new headers are
> easily picked up as server variables in the application environment.
> My code uses calls to apr_table_set to create the headers e.g.
>
>   apr_table_set(r->subprocess_env, "HTTP_SAMS_USER", uid);
>   

Try :

apr_table_set(r->headers_in, "HTTP_SAMS_USER", uid);


in addition to the subprocess_env table set?  (Just in case you aren't
getting to a subprocess?)

Joe
-- 
Joseph Lewis <http://sharktooth.org/>
"Divide the fire, and you will sooner put it out." - Publius Syrus


Re: Problem setting new HTTP headers from an input filter

2007-10-10 Thread Arturo 'Buanzo' Busleiman
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

Julian Williams wrote:
> Is it just not possible to modify the headers at this point in the
> processing 'pipeline' or is there a different way that I can do it?

What TYPE of input filter is it?

- --
Arturo "Buanzo" Busleiman - Consultor Independiente en Seguridad Informatica
Servicios Ofrecidos: http://www.buanzo.com.ar/pro/
Unase a los Foros GNU/Buanzo - La palabra Comunidad en su maxima expresion.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHDMU2AlpOsGhXcE0RCrCnAJ9xWReEtX/Rcp8q0bK0wlMiV30PjQCfU0nt
fiYTasO5bdPYrfbrdsBdTh0=
=eNW1
-END PGP SIGNATURE-


RE: Input filter cannot change size of multipart/form-data

2007-02-07 Thread Kratzer, James (Xetron)
The problem I'm having is a bug in mod_proxy which has been fixed in
2.0.55 and later as described in the link below. I was using 2.0.52
which is what comes with RHEL 4.  Updating to 2.0.55 fixed my problem.

http://mail-archives.apache.org/mod_mbox/httpd-dev/200212.mbox/%3C214748
[EMAIL PROTECTED]

The problem is that mod_proxy generates the Content-Length header before
calling the input filters.  If an input filter modifies the body of the
request, as my filter does, mod_proxy does not regenerate the
Content-Length header.  A fix was made in mod_proxy to delay generating
the Content-Length header until after the input filters are called.  
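
The ordering issue can be sketched in isolation. The following is a toy model
under stated assumptions, not mod_proxy code: filter_body() and
header_after_filter() are hypothetical helpers, and the "filter" simply
expands each LF to CRLF so that the body grows. It only illustrates why the
length has to be taken after the body has passed through the filter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical body filter: rewrites every '\n' as "\r\n", so the output
 * can be longer than the input. Stops early if out is full. Returns the
 * number of bytes written (no terminator is added). */
static size_t filter_body(const char *in, char *out, size_t cap)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (; *in != '\0'; ++in) {
        if (*in == '\n') {
            if (n + 2 > cap)
                break;
            out[n++] = '\r';
            out[n++] = '\n';
        } else {
            if (n + 1 > cap)
                break;
            out[n++] = *in;
        }
    }
    return n;
}

/* Run the filter first, then derive the header from the filtered size.
 * Bodies larger than the 256-byte scratch buffer are truncated here. */
static int header_after_filter(const char *body, char *hdr, size_t hdr_cap)
{
    char filtered[256];
    size_t len = filter_body(body, filtered, sizeof filtered);
    return snprintf(hdr, hdr_cap, "Content-Length: %lu", (unsigned long)len);
}
```

Had the header been formatted from strlen(body) before filtering, it would
announce the unfiltered size and the receiver would see a mismatch.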

Just wanted to give others an update

JK
 

-Original Message-
From: Nick Kew [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 06, 2007 12:55 PM
To: modules-dev@httpd.apache.org
Subject: Re: Input filter cannot change size of multipart/form-data

On Tue, 6 Feb 2007 12:38:26 -0500
"Kratzer, James \(Xetron\)" <[EMAIL PROTECTED]> wrote:

> Please help.
> 
> I have an input filter that looks for the Content-Type 
> "multipart/form-data" and saves the whole POST request to disk for 
> processing by scripts invoked by the filter.  The processed
> multipart/form-data is then read back in and sent up the chain.   

Is mod_upload not any use to you?

> Everything works as long as I don't change the size of the 
> multipart/form-data.  It appears that if I change the size of the 
> multipart/form-data post(such as replacing an attachment with either a

> smaller or larger attachment than the original), then the post dies 
> even though I'm updating the Content-Length in headers_in.  Why can't 
> I change the size of the multipart/form-data post.

*When* are you changing the header in headers_in?

I expect you might want to look at the note on order of processing in my
article at http://www.apachetutor.org/dev/request

--
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


RE: Input filter cannot change size of multipart/form-data

2007-02-06 Thread Kratzer, James (Xetron)
Hi,

I've looked at mod_upload and if I understand it correctly, mod_upload
is removing attachments from multipart/form-data posts and sending them
up the chain and putting the other fields of the mime encoded
multipart/form-data into a table which is part of the request record.  I
don't see the Content-Length in headers_in being changed anywhere in the
module.  Why does mod_upload not need to change the Content-Length?

In my filter, I'm changing the Content-Length right after the scripts
are done processing the multipart/form-data which was saved to /tmp as
so:

1. Save multipart/form-data to /tmp
2. Invoke scripts to process multipart/form-data
3. Read processed multipart/form-data from /tmp and put into brigade.
4. >>>>>> Update Content-Length in headers_in  <<<<<
5. Insert EOS into brigade
6. return APR_SUCCESS.

Any hints?

Thanks


-Original Message-
From: Nick Kew [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 06, 2007 12:55 PM
To: modules-dev@httpd.apache.org
Subject: Re: Input filter cannot change size of multipart/form-data

On Tue, 6 Feb 2007 12:38:26 -0500
"Kratzer, James \(Xetron\)" <[EMAIL PROTECTED]> wrote:

> Please help.
> 
> I have an input filter that looks for the Content-Type 
> "multipart/form-data" and saves the whole POST request to disk for 
> processing by scripts invoked by the filter.  The processed
> multipart/form-data is then read back in and sent up the chain.   

Is mod_upload not any use to you?

> Everything works as long as I don't change the size of the 
> multipart/form-data.  It appears that if I change the size of the 
> multipart/form-data post(such as replacing an attachment with either a

> smaller or larger attachment than the original), then the post dies 
> even though I'm updating the Content-Length in headers_in.  Why can't 
> I change the size of the multipart/form-data post.

*When* are you changing the header in headers_in?

I expect you might want to look at the note on order of processing in my
article at http://www.apachetutor.org/dev/request

--
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: Input filter cannot change size of multipart/form-data

2007-02-06 Thread Nick Kew
On Tue, 6 Feb 2007 12:38:26 -0500
"Kratzer, James \(Xetron\)" <[EMAIL PROTECTED]> wrote:

> Please help.
> 
> I have an input filter that looks for the Content-Type
> "multipart/form-data" and saves the whole POST request to disk for
> processing by scripts invoked by the filter.  The processed
> multipart/form-data is then read back in and sent up the chain.   

Is mod_upload not any use to you?

> Everything works as long as I don't change the size of the
> multipart/form-data.  It appears that if I change the size of the
> multipart/form-data post(such as replacing an attachment with either a
> smaller or larger attachment than the original), then the post dies
> even though I'm updating the Content-Length in headers_in.  Why can't
> I change the size of the multipart/form-data post.

*When* are you changing the header in headers_in?

I expect you might want to look at the note on order
of processing in my article at
http://www.apachetutor.org/dev/request

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Input filter cannot change size of multipart/form-data

2007-02-06 Thread Kratzer, James (Xetron)
Please help.

I have an input filter that looks for the Content-Type
"multipart/form-data" and saves the whole POST request to disk for
processing by scripts invoked by the filter.  The processed
multipart/form-data is then read back in and sent up the chain.   

Here are the steps for my input filter:
1. If a multipart/form-data post is found then save it to disk.
2. Execute a script to process the multipart/form-data.
3. Read the processed multipart/form data in, put it into the brigade,
and send it on up the chain.
4. If the size of the new multipart/form-data has changed then my filter
changes the Content-Length header in the headers_in table.

Everything works as long as I don't change the size of the
multipart/form-data.  It appears that if I change the size of the
multipart/form-data post(such as replacing an attachment with either a
smaller or larger attachment than the original), then the post dies even
though I'm updating the Content-Length in headers_in.  Why can't I
change the size of the multipart/form-data post.

Thanks,

JK





[jira] Created: (MODPYTHON-212) req.read() with no arguments doesn't return all data where input filter inserts extra data in input stream.

2007-01-13 Thread Graham Dumpleton (JIRA)
req.read() with no arguments doesn't return all data where input filter inserts 
extra data in input stream.
---

 Key: MODPYTHON-212
 URL: https://issues.apache.org/jira/browse/MODPYTHON-212
 Project: mod_python
  Issue Type: Bug
  Components: core
Affects Versions: 3.2.10, 3.3
Reporter: Graham Dumpleton


The req.read() function when supplied with no arguments is supposed to return 
all available data from the request body input stream. To do this it uses 
req.remaining as the amount of data still to be read and will only read up to 
that amount of data.

The problem with this is that req.remaining is originally set to be the value 
of the Content-Length header of the request. This however may not actually be 
how much data there is to read as input filters may actually remove or add data 
to the input stream. This will not cause a problem when an input filter removes 
data as the amount it thinks there is to read will still be more than what is 
actually available after the input filter has removed data.

In the case of an input filter that inserts additional data, the amount of data 
to be read will actually exceed the value of req.remaining and as a consequence 
req.read() will not return all the available data. Even subsequent calls to 
req.read() with no arguments will not return all the data as req.remaining by 
then is zero and it will think it has read all the available data.

The only way to get all the data is to make successive calls to req.read() but 
supply some size as argument. Calls should be made to req.read() with the 
argument until req.read() returns an empty string. Only when it returns an 
empty string are you sure all data has been returned.

This problem will also affect req.readline() as it uses req.remaining in a 
similar way. The req.readlines() function is also possibly indirectly affected 
as it calls req.readline() to read individual lines and so it may not be able 
to access any additional lines beyond the original content length specified in 
the request.
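
The difference between the two reading strategies can be shown with a small
stand-alone model. This is not mod_python code: stream_t, stream_read(),
read_declared() and read_until_empty() are hypothetical names, and the stream
simply represents whatever the input filters finally deliver, which may be
longer than the declared Content-Length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy input stream: the bytes that are really available after filtering. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;
} stream_t;

/* Read up to want bytes; returns 0 once the stream is exhausted. */
static size_t stream_read(stream_t *s, char *buf, size_t want)
{
    size_t avail = s->len - s->pos;
    size_t n = want < avail ? want : avail;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Strategy 1: trust the declared length and read only that much. */
static size_t read_declared(stream_t *s, char *buf, size_t declared)
{
    return stream_read(s, buf, declared);
}

/* Strategy 2: keep issuing sized reads until one comes back empty. */
static size_t read_until_empty(stream_t *s, char *buf, size_t cap, size_t chunk)
{
    size_t total = 0, n;
    assert(chunk > 0);
    while (total < cap) {
        size_t want = cap - total < chunk ? cap - total : chunk;
        n = stream_read(s, buf + total, want);
        if (n == 0)
            break;          /* empty read: nothing more to fetch */
        total += n;
    }
    return total;
}
```

With a declared length of 5 and 12 bytes actually available, the first
strategy stops short while the second collects everything.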

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: My input filter is being called endlessly

2007-01-09 Thread Nick Kew
On Wed, 10 Jan 2007 02:16:09 +0200
"Kfir Dadosh" <[EMAIL PROTECTED]> wrote:

> Hi All,
>  
> I'm trying to write an input filter that buffers the whole post
> content,

You should avoid buffering where possible.  Failing that,
you'll want to tell server admins to limit request sizes.

>  replaces it with some other content, and passes it on the
> chain. I managed to read the data from the brigade and buffer it
> correctly, until I get the EOS bucket.
> At that point, I created a new brigade, added the modified content to
> a new heap bucket and added the bucket to my brigade.

Huh?  That sounds like an output filter.  Input is a Pull API.
So you have to populate the brigade you were called with,
and add an EOS to that when finished.

Also don't forget the protocol.  Your filter may invalidate
things like Content-Length headers, or any checksums.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


My input filter is being called endlessly

2007-01-09 Thread Kfir Dadosh
Hi All,
 
I'm trying to write an input filter that buffers the whole post content,
replaces it with some other content, and passes it on the chain.
I managed to read the data from the brigade and buffer it correctly, until I
get the EOS bucket.
At that point, I created a new brigade, added the modified content to a new
heap bucket and added the bucket to my brigade.
I also added a new EOS bucket, and concatenated my brigade to the given bb.
For some reason, my filter is being called endlessly with the same content.
 
Can anyone tell me what have I done wrong?
Thanks for any help.

Regards,

Kfir



Input Filter Infinite Loop

2006-11-27 Thread Tess Flynn
I'm new to writing modules, but this is just baffling to me. I'm trying to 
write an input filter to access POST data from an incoming request. I don't 
need to modify the data in any way, just trap it and pass the contents to the 
next filter. The problem is, I only seem to trap one character from the 
brigade, and then I drop into an infinite loop. My guess is that this happens 
at the server level, since it reenters my function over and over.

Here's the code I have thus far:

static int infilter(ap_filter_t* f, apr_bucket_brigade* bb,
                    ap_input_mode_t mode, apr_read_type_e block,
                    apr_off_t readbytes) {

    apr_size_t bucket_len = 102400; /* some size, I'll config it later. */
    char* bucket_text = apr_palloc(f->r->pool, bucket_len);

    while(ap_get_brigade(f->next, bb, AP_MODE_READBYTES, APR_BLOCK_READ,
                         readbytes) == APR_SUCCESS){

        apr_brigade_flatten(bb, bucket_text, &bucket_len);
        apr_brigade_cleanup(bb);

        /* Now do something with the text. */
    }

    return APR_SUCCESS;
}

What am I doing wrong?

-Tess
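
A general point about loops of this shape, offered as a sketch and not as a
diagnosis of the code above: if the upstream source keeps answering "success"
with nothing but an end-of-stream marker once the body is consumed (as the
mod_ssl comment quoted at the top of this archive describes for HTTP_IN), a
loop whose only exit is a non-success status never terminates. The toy model
below is stand-alone C; upstream_t, pull_result_t, upstream_pull() and
drain_body() are hypothetical names, not httpd API. A real input filter would
usually call ap_get_brigade() once per invocation with the mode it was given
and return, looking for the EOS bucket in what it received.

```c
#include <assert.h>
#include <stddef.h>

/* Toy upstream: yields body chunks, then end-of-stream on every later call. */
typedef struct {
    const size_t *chunk_sizes;
    size_t nchunks;
    size_t next;
} upstream_t;

typedef struct {
    size_t bytes;   /* payload bytes delivered by this pull */
    int eos;        /* non-zero when the pull carried only end-of-stream */
} pull_result_t;

/* Always reports success (0), even after the body is exhausted. */
static int upstream_pull(upstream_t *up, pull_result_t *out)
{
    assert(up != NULL && out != NULL);
    if (up->next < up->nchunks) {
        out->bytes = up->chunk_sizes[up->next++];
        out->eos = 0;
    } else {
        out->bytes = 0;
        out->eos = 1;
    }
    return 0;
}

/* Drain the body, leaving the loop on end-of-stream and not on an error.
 * max_calls is only a safety net. Returns the total number of payload
 * bytes and stores the number of pulls made in *calls. */
static size_t drain_body(upstream_t *up, size_t max_calls, size_t *calls)
{
    size_t total = 0;
    pull_result_t r;

    *calls = 0;
    while (*calls < max_calls && upstream_pull(up, &r) == 0) {
        ++*calls;
        total += r.bytes;
        if (r.eos)
            break;
    }
    return total;
}
```

Without the eos check, the loop in this model would run until max_calls,
because upstream_pull() never fails.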


[Fwd: Re: Input Filter reading Body]

2006-10-28 Thread Ruediger Pluem

On 10/20/2006 05:03 PM, Christian Verdelli wrote:
> 
> static int redirect_wmlInFilter (ap_filter_t* f, apr_bucket_brigade* 
> brigade,
> ap_input_mode_t mode, apr_read_type_e block, apr_off_t readbytes) {
> 
> apr_bucket* bucket ;
> apr_status_t ret ;
> 
> if ( ret = ap_get_brigade(f->next, brigade, mode, block, readbytes) ,
> ret == APR_SUCCESS )
> for ( bucket = APR_BRIGADE_FIRST(brigade) ;
> bucket != APR_BRIGADE_SENTINEL(brigade) ;
> bucket = APR_BUCKET_NEXT(bucket) )
> 
> char *data[bucket->length];
> int len;
> apr_bucket_read(bucket,&data,&len,1);
> ap_log_perror(APLOG_MARK, APLOG_NOTICE, NULL, f->c->pool,
> "(IN) %s %s: %d bytes - %s ", f->frec->name, t, bucket->length , data );

data is an array of char * so you are supplying a char ** to ap_log_perror
here and not a char * as you intend to do, right?
Use

char *data instead of char *data[bucket->length]


Regards

Rüdiger
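
The suggested declaration can be tried out without httpd. The sketch below is
stand-alone C under stated assumptions: mock_bucket_read() is a hypothetical
stand-in for a bucket read that hands back a pointer and a length, and
format_payload() is an illustrative helper. It also shows one way to print
such data, since bucket contents are not guaranteed to be NUL-terminated.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Storage owned by the "bucket". Deliberately not NUL-terminated, and
 * followed by bytes that are not part of the payload. */
static const char body_storage[] = {
    'n', 'a', 'm', 'e', '=', 'v', 'a', 'l', 'u', 'e', 'X', 'X'
};

/* Hypothetical read: points *str at the storage and reports 10 bytes. */
static int mock_bucket_read(const char **str, size_t *len)
{
    assert(str != NULL && len != NULL);
    *str = body_storage;
    *len = 10;
    return 0;
}

/* Copy the payload into out (capacity cap) as a printable C string. */
static int format_payload(char *out, size_t cap)
{
    const char *data = NULL;   /* a plain pointer, not an array of pointers */
    size_t len = 0;

    if (mock_bucket_read(&data, &len) != 0)
        return -1;

    /* %.*s prints exactly len bytes, so the missing terminator is harmless. */
    return snprintf(out, cap, "%.*s", (int)len, data);
}
```

Passing &data only makes sense when data is itself a pointer; with an array
of pointers the read cannot redirect it, and "%s" then prints garbage.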





Re: Input Filter reading Body

2006-10-23 Thread Christian Verdelli
I simply changed this piece of code:

	char *data[4096];
	memset (data,0,4096);

with this:

	char *data;
	data = "" 4096) ;

and now it works fine, what the hell is this !!?? :-D

Thanks anyway..

On Oct 20, 2006, at 6:35 PM, Christian Verdelli wrote:

> Well,
> actually i took my piece of code right from that module.
> I just took out the piece i needed but happen exaclty using mod_diagnostic.c
> I just added the function to read data from the bucket :
>
> apr_bucket_read(b,&data,&len,1);
> ap_log_perror(APLOG_MARK, APLOG_NOTICE, NULL, f->c->pool,
>     "    %s %s:    %d bytes - data : %s", f->frec->name, t, b->length, data) ;
>
> and that's the output in the log file:
>
> [Fri Oct 20 18:32:51 2006] [notice] i-network-2: mode READBYTES; blocking; 17 bytes
> [Fri Oct 20 18:32:51 2006] [notice] \ti-network-2 HEAP:\t17 bytes - data : \xa7\x0fe
> [Fri Oct 20 18:32:51 2006] [error] [client 127.0.0.1] File does not exist: C:/Apache2/htdocs/pino
> [Fri Oct 20 18:32:51 2006] [notice] i-network-2: mode EATCRLF; non-blocking; 0 bytes
> [Fri Oct 20 18:32:51 2006] [notice] i-network-2: ap_get_brigade returned 730035
> [Fri Oct 20 18:32:51 2006] [notice] i-network-2: mode GETLINE; blocking; 0 bytes
> [Fri Oct 20 18:32:51 2006] [notice] \ti-network-2 IMMORTAL:\t0 bytes - data : \xa9~\xe9nt
> [Fri Oct 20 18:32:51 2006] [notice] i-network-2: mode GETLINE; blocking; 0 bytes
> [Fri Oct 20 18:32:51 2006] [notice] i-network-2: ap_get_brigade returned 70014
>
> why am i getting this \xa7\x0fe instead of  name=value of the post??
>
> thanks.
>
> Nick Kew wrote:
> > On Friday 20 October 2006 16:03, Christian Verdelli wrote:
> > > Hi,
> > > im writing my first simple input filters ; i need it to debug the
> > > request body (content of POST) coming to my Web Server before is handled
> > > by a third party contect handler module (which corrupt data).
> > > I simply want to print out the content of the body request sent by
> > > the client . I wrote this peace of code:
> >
> > This would be more on-topic on the modules list.
> > Your code is a little hard to read, with no indentation and some
> > obvious typos.
> >
> > > char *data[bucket->length];
> >
> > huh?
> > Are you trying to do something similar to mod_diagnostics?
> > If so, how does it differ?

- - - - - - - - - - - - - - - - - - -
Christian Verdelli
Senior Software Specialist
ZEROPIU S.p.A.
via Fra Luca Pacioli, 3
20144 Milano
Phone:	+39 02.83102.300
Mobile:	+39 348.0907972
fax:	+39 02.83102.310
e-mail: [EMAIL PROTECTED]
homepage: http://www.zeropiu.it

Re: Input Filter reading Body

2006-10-20 Thread Christian Verdelli

Well,
actually I took my piece of code right from that module.
I just took out the piece I needed, but exactly the same thing happens
using mod_diagnostics.c.
I just added the function to read data from the bucket:

apr_bucket_read(b,&data,&len,1);
ap_log_perror(APLOG_MARK, APLOG_NOTICE, NULL, f->c->pool,
"%s %s:%d bytes - data : %s", f->frec->name, t, b->length, data) ;


and that's the output in the log file:


[Fri Oct 20 18:32:51 2006] [notice] i-network-2: mode READBYTES; 
blocking; 17 bytes
[Fri Oct 20 18:32:51 2006] [notice] \ti-network-2 HEAP:\t17 bytes - data 
: \xa7\x0fe
[Fri Oct 20 18:32:51 2006] [error] [client 127.0.0.1] File does not 
exist: C:/Apache2/htdocs/pino
[Fri Oct 20 18:32:51 2006] [notice] i-network-2: mode EATCRLF; 
non-blocking; 0 bytes
[Fri Oct 20 18:32:51 2006] [notice] i-network-2: ap_get_brigade returned 
730035
[Fri Oct 20 18:32:51 2006] [notice] i-network-2: mode GETLINE; blocking; 
0 bytes
[Fri Oct 20 18:32:51 2006] [notice] \ti-network-2 IMMORTAL:\t0 bytes - 
data : \xa9~\xe9nt
[Fri Oct 20 18:32:51 2006] [notice] i-network-2: mode GETLINE; blocking; 
0 bytes
[Fri Oct 20 18:32:51 2006] [notice] i-network-2: ap_get_brigade returned 
70014



why am i getting this \xa7\x0fe instead of  name=value of the post??

thanks.




Nick Kew wrote:

> On Friday 20 October 2006 16:03, Christian Verdelli wrote:
>
> > Hi,
> >
> > I'm writing my first simple input filter; I need it to debug the
> > request body (content of POST) coming to my Web Server before it is handled
> > by a third-party content handler module (which corrupts data).
> > I simply want to print out the content of the request body sent by
> > the client. I wrote this piece of code:
>
> This would be more on-topic on the modules list.
>
> Your code is a little hard to read, with no indentation and some
> obvious typos.
>
> > char *data[bucket->length];
>
> huh?
>
> Are you trying to do something similar to mod_diagnostics?
> If so, how does it differ?


  




Re: Input Filter reading Body

2006-10-20 Thread Nick Kew
On Friday 20 October 2006 16:03, Christian Verdelli wrote:
> Hi,
>
> I'm writing my first simple input filter; I need it to debug the
> request body (content of POST) coming to my Web Server before it is handled
> by a third-party content handler module (which corrupts data).
> I simply want to print out the content of the request body sent by
> the client. I wrote this piece of code:

This would be more on-topic on the modules list.

Your code is a little hard to read, with no indentation and some
obvious typos.

> char *data[bucket->length];

huh?

Are you trying to do something similar to mod_diagnostics?
If so, how does it differ?


-- 
Nick Kew


Input Filter reading Body

2006-10-20 Thread Christian Verdelli
Hi,

I'm writing my first simple input filter; I need it to debug the
request body (content of POST) coming to my Web Server before it is handled
by a third-party content handler module (which corrupts data).
I simply want to print out the content of the request body sent by
the client. I wrote this piece of code:

static int redirect_wmlInFilter (ap_filter_t* f, apr_bucket_brigade* brigade,
ap_input_mode_t mode, apr_read_type_e block, apr_off_t readbytes) {
apr_bucket* bucket ;
apr_status_t ret ;
if ( ret = ap_get_brigade(f->next, brigade, mode, block, readbytes) ,
ret == APR_SUCCESS )
for ( bucket = APR_BRIGADE_FIRST(brigade) ;
bucket != APR_BRIGADE_SENTINEL(brigade) ;
bucket = APR_BUCKET_NEXT(bucket) )
char *data[bucket->length];
int len;
apr_bucket_read(bucket,&data,&len,1);
ap_log_perror(APLOG_MARK, APLOG_NOTICE, NULL, f->c->pool,
"(IN) %s %s: %d bytes - %s ", f->frec->name, t, bucket->length , data );
else
ap_log_perror(APLOG_MARK, APLOG_NOTICE, NULL, f->c->pool,
"(IN) %s: ap_get_brigade returned %d", f->frec->name, ret) ;
return ret ;
}

The filter is then registered as an input filter in the AP_FTYPE_TRANSCODE phase.

static void redirect_wmlRegisterHooks(apr_pool_t *p) {
ap_register_input_filter(webifilter,redirect_wmlInFilter,ifilter_init, AP_FTYPE_TRANSCODE );
}

To test it I simply post fake data using curl:

curl -v -d nameb=valueb http://myserver.myhost.com/postread.cgi

Looking at the logs, instead of seeing the content body nameb=valueb, here is what I get:

[Fri Oct 20 00:42:40 2006] [notice] (IN)\twebicookiet20 bytes - \x01\x8b\xa7\x1b

Maybe I should decode it before printing it??

Any suggestion will be appreciated.
-- 
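
A note for readers of this thread: apr_bucket_read() takes a `const char **` and an `apr_size_t *`, and it only points the caller at bytes the bucket already holds; those bytes are not NUL-terminated. In the posted code, `char *data[bucket->length]` declares an array of pointers, so handing `data` to a `%s` format prints the raw bytes of the stored pointer rather than the body, which would explain output such as \x01\x8b\xa7\x1b. What follows is a minimal sketch of the usual pattern (one pointer, one length, bounded print) in plain C so that it runs without httpd or APR headers. The names fake_storage, fake_bucket_read, format_body and read_and_format are hypothetical stand-ins, not APR functions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for bucket storage: raw bytes with no NUL
 * terminator directly after the 12-byte payload "nameb=valueb". */
static const char fake_storage[] = "nameb=valuebXXXX-not-part-of-the-body";
#define FAKE_LEN 12

/* Hypothetical stand-in for apr_bucket_read(): nothing is copied, the
 * caller's pointer is simply aimed at the existing bytes. */
static int fake_bucket_read(const char **str, size_t *len)
{
    *str = fake_storage;
    *len = FAKE_LEN;
    return 0;
}

/* Format exactly len bytes of data into out. The "%.*s" precision bounds
 * the read, so the missing terminator does not matter.
 * Returns the number of bytes written, excluding the NUL. */
static int format_body(char *out, size_t outsz, const char *data, size_t len)
{
    return snprintf(out, outsz, "%.*s", (int)len, data);
}

/* Read the fake bucket and render its payload as a C string. */
static int read_and_format(char *out, size_t outsz)
{
    const char *data = NULL;   /* a single pointer, not an array of pointers */
    size_t len = 0;

    if (fake_bucket_read(&data, &len) != 0) {
        return -1;
    }
    return format_body(out, outsz, data, len);
}
```

In a real filter the same idea would presumably look like declaring `const char *data; apr_size_t len;`, calling apr_bucket_read(bucket, &data, &len, APR_BLOCK_READ), and logging with a "%.*s" format and `(int)len`. Treat that as an untested assumption about how the posted filter would be adapted; note also that "%.*s" still stops at the first NUL byte, so it is only suitable for text bodies.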

Help with input filter ... crashing apache ...

2006-08-09 Thread José Muanis

Hi, hope that someone has any time to help me.

I'm designing an input filter; my goal is to parse the URI and cookies,
set a couple of new headers in the request and change the URI.

I have coded some stuff, based on Ryan Bloom's article 'Writing Input
Filters for Apache 2.0'
(http://www.onlamp.com/pub/a/apache/2001/09/20/apache_2.html).

Right now, I managed to deal with the buckets to change the uri. At
the end of the code, I get the first bucket in the brigade, and it
looks fine (from my  point of view of course).

What I've done is the following:

First of all, I register the following hooks:

ap_hook_pre_connection(headers_pre, NULL, NULL, APR_HOOK_MIDDLE);
ap_register_input_filter("HEADERS_IN", headers_filter_in,
NULL,AP_FTYPE_CONNECTION);

On headers_pre I do the following:

ap_add_input_filter("HEADERS_IN", NULL, NULL, c);

I'm adding this stuff because maybe you will tell me "You should not
use this type of filter to do this".

In headers_filter_in, I get the brigade, get the first bucket, and cut
everything, just like in the article.

At the end of the code (again, just like in the article) I do the following:

d = apr_bucket_transient_create(data, 23, bb->bucket_alloc);
apr_bucket_setaside(d, f->c->pool);

APR_BUCKET_INSERT_AFTER(e, d);
APR_BUCKET_REMOVE(e);
apr_bucket_destroy(e);

return APR_SUCCESS;

If I remove the line where I insert the new bucket (d) after (e), the
code executes and Apache doesn't crash (but I get an error, as bucket
(e) does not contain a URI). When I put the insert-after line back in,
Apache crashes.

For information, I'm coding and testing on Windows XP with Visual Studio.
(Please don't blame me; I know I should be using Unix for this.)

Thanks in advance,

Jose Muanis


Re: input filter which modifies body, Content-Length

2006-02-24 Thread Jeff Trawick
On 2/24/06, Greg Ames <[EMAIL PROTECTED]> wrote:
> Jeff Trawick wrote:
>
> > It isn't clear to me what an input filter should do about
> > Content-Length when it modifies the length of the body  (assuming that
> > this isn't chunked encoding).
>
> > mod_cgi uses brigades to read the body but needs to look at
> > Content-Length before spawning the CGI script, so that's problematic.
> > And there is an unexpected ordering requirement so that the input
> > filter can signal to this handler that the content-length can't be
> > trusted, before mod_cgi spawns the child.
>
> so is this the current ordering?
>
> 1. mod_cgi[d] handler is dispatched
> 2. C-L environment variable is set for the script from the initial C-L header
> 3. CGI child is spawned.
> 4. mod_cgi[d] reads the body from input filters
> 5. foo_input_filter changes the body, invalidating the C-L env var + whatever 
> local
> variables the script is using to track the length
> 6. CGI reads from stdin into buffer of length ?

looks right to me

>
> > A filter which spools up to a configured amount of request body in
> > order to calculate content-length could be of some practical benefit,
> > since many request bodies are relatively small and this could
> > potentially allow mod_cgi[d] to properly handle chunked request
> > bodies, regardless of input filtering.  With such a filter installed,
> > and no need to spool beyond a configured limit, getting a brigade
> > would return bucket(s) with known length and an EOS at the end.
> > Unknown lengths or EOS?  Better punt if you're mod_cgi[d].
>
> per offline discussions,
>
> * the CGI spec (fwiw) is oblivious to chunking
> * 1.3 and pre-filtering 2.0 used to fail CGI requests with chunked bodies.  
> getting them
> to work properly in common cases (i.e. < 8K bodies ) would be a step forward.
> * this is analogous to proxy trying to avoid chunking to the origin server, 
> except the
> spec is weaker for CGIs
>
> a decent solution for CGI request body chunking would also solve some cases 
> of input
> filters modifying the length.

y

I suspect a number of third-party modules would need to change (not to
mention our own).

The handler would need to get a brigade then, if there is EOS bucket
at the end and the length of each bucket is known, then use that for
content-length.  (Presumably a magic filter with configurable timeouts
and spooling capability would allow this to happen.)  Otherwise,
either fail the request (e.g., CGI) or don't handle the body in a way
that needs to know the length in advance.
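
The check described above (EOS at the end, every bucket length known, otherwise punt) can be sketched in plain C. This is an illustration under stated assumptions, not httpd code: struct fake_bucket and brigade_content_length are hypothetical names, and an array stands in for a brigade. The one detail borrowed from APR is the convention that a bucket of unknown length reports (apr_size_t)(-1), mirrored here as LEN_UNKNOWN.

```c
#include <stddef.h>

/* Mirrors the APR convention for "length not known yet". */
#define LEN_UNKNOWN ((size_t)-1)

/* Hypothetical, simplified stand-in for a bucket. */
struct fake_bucket {
    size_t length;  /* payload length, or LEN_UNKNOWN */
    int is_eos;     /* non-zero for the end-of-stream marker */
};

/* Returns 1 and stores the total in *total when the array ends with EOS
 * and every earlier bucket has a known length. Returns 0 when the caller
 * should punt (fail the request, or avoid needing the length up front). */
static int brigade_content_length(const struct fake_bucket *b, size_t n,
                                  size_t *total)
{
    size_t sum = 0;
    size_t i;

    if (n == 0 || !b[n - 1].is_eos) {
        return 0;               /* no EOS at the end: more data may follow */
    }
    for (i = 0; i + 1 < n; i++) {
        if (b[i].is_eos || b[i].length == LEN_UNKNOWN) {
            return 0;           /* early EOS or unknown length */
        }
        sum += b[i].length;
    }
    *total = sum;
    return 1;
}
```

A handler in the mod_cgi position would only trust the computed total when the function returns 1; everything else falls into the "better punt" case from the original proposal.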


Re: input filter which modifies body, Content-Length

2006-02-24 Thread Greg Ames

Jeff Trawick wrote:

> It isn't clear to me what an input filter should do about
> Content-Length when it modifies the length of the body  (assuming that
> this isn't chunked encoding).

> mod_cgi uses brigades to read the body but needs to look at
> Content-Length before spawning the CGI script, so that's problematic.
> And there is an unexpected ordering requirement so that the input
> filter can signal to this handler that the content-length can't be
> trusted, before mod_cgi spawns the child.

so is this the current ordering?

1. mod_cgi[d] handler is dispatched
2. C-L environment variable is set for the script from the initial C-L header
3. CGI child is spawned.
4. mod_cgi[d] reads the body from input filters
5. foo_input_filter changes the body, invalidating the C-L env var + whatever local
   variables the script is using to track the length
6. CGI reads from stdin into buffer of length ?

> A filter which spools up to a configured amount of request body in
> order to calculate content-length could be of some practical benefit,
> since many request bodies are relatively small and this could
> potentially allow mod_cgi[d] to properly handle chunked request
> bodies, regardless of input filtering.  With such a filter installed,
> and no need to spool beyond a configured limit, getting a brigade
> would return bucket(s) with known length and an EOS at the end.
> Unknown lengths or EOS?  Better punt if you're mod_cgi[d].

per offline discussions,

* the CGI spec (fwiw) is oblivious to chunking
* 1.3 and pre-filtering 2.0 used to fail CGI requests with chunked bodies.  getting them
  to work properly in common cases (i.e. < 8K bodies ) would be a step forward.
* this is analogous to proxy trying to avoid chunking to the origin server, except the
  spec is weaker for CGIs

a decent solution for CGI request body chunking would also solve some cases of input
filters modifying the length.


Greg


input filter which modifies body, Content-Length

2006-02-23 Thread Jeff Trawick
(Apologies if there is juicy past discussion that I didn't find with
my search keywords.)

It isn't clear to me what an input filter should do about
Content-Length when it modifies the length of the body  (assuming that
this isn't chunked encoding).

It causes problems for some handlers to modify the length of the
request body but leave Content-Length alone.  Handlers which care
about Content-Length have probably already looked at it before the
input filter has been called anyway.

A handler that uses brigades to read the body and has no need to look
at Content-Length should be okay either way.

mod_cgi uses brigades to read the body but needs to look at
Content-Length before spawning the CGI script, so that's problematic. 
And there is an unexpected ordering requirement so that the input
filter can signal to this handler that the content-length can't be
trusted, before mod_cgi spawns the child.

ap_setup_client_block() looks at the initial Content-Length to know
how much to read, so that's problematic but perhaps fixable for these
types of handlers.

A filter which spools up to a configured amount of request body in
order to calculate content-length could be of some practical benefit,
since many request bodies are relatively small and this could
potentially allow mod_cgi[d] to properly handle chunked request
bodies, regardless of input filtering.  With such a filter installed,
and no need to spool beyond a configured limit, getting a brigade
would return bucket(s) with known length and an EOS at the end. 
Unknown lengths or EOS?  Better punt if you're mod_cgi[d].

Thoughts?


input filter problems

2005-08-18 Thread Martijn Faassen

Hi there,

I've spent the last few days trying to get an input filter to work in 
mod_python, in the context of requests that are being reverse proxied to 
an app-server by mod_proxy. I've also tried mod_rewrite.


The input filter works fine if mod_proxy or mod_rewrite are *not* in 
play and things are just posted into the void. When posting to an 
appserver, the symptoms for mod_proxy and mod_rewrite appear to be very 
similar.


My apologies in advance for the information dump... I hope someone can 
at least give me a clue as to where this problem may lie or what 
further actions to pursue. My 'fix' to mod_python might also be worth 
consideration, though I have no idea why it works.


I ran into a host of problems that I think have something to do with the 
underlying apache, but may also have something to do with mod_python itself.


Symptoms on Apache 2.0.54, using mod_python 3.1.3
-------------------------------------------------

* input filters appear to be unreliable in the face of changing data 
(writing out something other than what goes in). Apache hangs or the 
filter gets called endlessly in a single request (subrequests appear not 
to be in play though, at least I can't detect them using req.main).


* even when data is not changed, filtering sometimes hangs.

* the infinite calling thing can be worked around by disabling the 
filter after one is done.


* I can suppress the infinite call behavior using .disable(), but then 
the system hangs instead.


Apache 2.0.54, latest svn mod_python
------------------------------------

* same story as before.

Latest apache 2.0.x svn as well as svn mod_python
-------------------------------------------------

* Infinite calls do occur, but..

* Hangs still occur, but..

* Hangs and infinite calls disappear and everything works as expected 
(except for message in error log) if an exception is raised inside the 
filter code!


Tracking this down to lib/python/mod_python/apache.py, in 
FilterDispatch() there's the following section:


    if object:

        # call the object
        if config.has_key("PythonEnablePdb"):
            pdb.runcall(object, filter)
        else:
            object(filter)

        # always flush the filter. without a FLUSH or EOS bucket,
        # the content is never written to the network.
        # XXX an alternative is to tell the user to flush() always
        filter.flush()

The hang/calling behavior seems to be triggered when filter.flush() is 
called. If instead I put in a line:


return OK

before 'filter.flush()' is ever reached (as is the case when the 
exception is raised), everything appears to work. Unfortunately the same 
trick doesn't work on Apache 2.0.54... (even if I use the disable() trick).


Does this mean that filter.flush() is buggy when mod_proxy or 
mod_rewrite are in effect? I don't know, but I thought I'd report it 
here. Perhaps it's also a problem to do with input filters in 
particular? The comment talks about needing to flush to make sure things 
are sent to the network, but that comment makes more sense for output 
filters than input filters (even though mod_proxy in turn sends stuff to 
the network again).


Apache change hunt
------------------

Trying to figure out what in Apache itself might've changed, I found
that in the bleeding-edge Apache 2.0 trunk there is a patch that seems 
to have to do something with this. More on this apache patch is here:


http://svn.apache.org/viewcvs.cgi/httpd/httpd/branches/2.0.x/CHANGES?rev=233302&view=markup

in particular:

  *) proxy HTTP: Rework the handling of request bodies to handle
 chunked input and input filters which modify content length, and
 avoid spooling arbitrary-sized request bodies in memory.
 PR 15859.  [Jeff Trawick]

Though it also seems to work when mod_rewrite is used instead of 
mod_proxy, so perhaps this fix isn't it and it's something deeper inside 
Apache that changed...


...

Feel free to ask me more questions; I can do more testing if you like. I 
can also post the test code if people are interested. Of course ideally 
I'd make all of this work on a released version of Apache with a 
released version of mod_python, but I'll take any hint I can get. :)


Regards,

Martijn


Re: Accessing to per Directory configuration from an input filter: HOW?

2005-06-23 Thread luca regini
Sorry to disturb you, but I still haven't managed to solve my problem. Is it possible to alter a cookie in the header_parser hook? I don't mean altering the value in the Apache server's apr_table that holds the cookie values; I really mean changing the cookies in the request, so that some underlying (mostly Java) web applications can see the right value for these special cookies. We are trying to reproduce the functioning of a security infrastructure originally developed for MS. It would really be a pity if we can't manage to do with Apache what we already have in production with IIS.

 
Thanks in advance for your attention.
Luca
 
On 6/22/05, Nick Kew <[EMAIL PROTECTED]> wrote:
> luca regini wrote:
> > I need to write an input filter that is able to change the value of some
> > cookies. However the name of the cookie to be changed is a per-directory
> > value. So i have an input filter whose behavior depends programmatically
> > from per-directory configuration. How should i gain information about
> > per-dir configuration from within a filter??
> >   Thanks in advance
> > Luca
>
> Same as from anywhere else.
>
> But you don't want to do that in an input filter.  Use the header_parser
> hook.  Or, if it needs to happen before/after some other module which
> views cookies in a different hook, move as appropriate.
>
> --
> Nick Kew


My Input Filter doesn't get called

2005-06-23 Thread luca regini
I add an input filter with "ap_add_input_filter" from an ap_hook_header_parser hook. My filter doesn't get called, probably because we are already too far into the request processing cycle. Sadly, this filter needs per-directory config information that is not available in earlier hooks. So should I write custom configuration code to solve this problem? Is there a way to do it?

 
Thanks,
Luca


Re: Accessing to per Directory configuration from an input filter: HOW?

2005-06-22 Thread luca regini
To add to my previous mail: the ctx parameter of the ap_add_output_filter function can be used to pass an arbitrary structure to the filter. So this should be the right place to pass information to the filter.

 
Luca
 
 
On 6/22/05, Nick Kew <[EMAIL PROTECTED]> wrote:
> luca regini wrote:
> > I need to write an input filter that is able to change the value of some
> > cookies. However the name of the cookie to be changed is a per-directory
> > value. So i have an input filter whose behavior depends programmatically
> > from per-directory configuration. How should i gain information about
> > per-dir configuration from within a filter??
> >   Thanks in advance
> > Luca
>
> Same as from anywhere else.
>
> But you don't want to do that in an input filter.  Use the header_parser
> hook.  Or, if it needs to happen before/after some other module which
> views cookies in a different hook, move as appropriate.
>
> --
> Nick Kew


Re: Accessing to per Directory configuration from an input filter: HOW?

2005-06-22 Thread luca regini
I am not able to find any example of this hook being used to alter the value of a cookie. To my (limited and possibly erroneous) understanding, a hook cannot alter requests or produce content. Those are tasks for filters.

 
Luca 
On 6/22/05, Nick Kew <[EMAIL PROTECTED]> wrote:
> luca regini wrote:
> > I need to write an input filter that is able to change the value of some
> > cookies. However the name of the cookie to be changed is a per-directory
> > value. So i have an input filter whose behavior depends programmatically
> > from per-directory configuration. How should i gain information about
> > per-dir configuration from within a filter??
> >   Thanks in advance
> > Luca
>
> Same as from anywhere else.
>
> But you don't want to do that in an input filter.  Use the header_parser
> hook.  Or, if it needs to happen before/after some other module which
> views cookies in a different hook, move as appropriate.
>
> --
> Nick Kew


Re: Accessing to per Directory configuration from an input filter: HOW?

2005-06-22 Thread Nick Kew
luca regini wrote:
> I need to write an input filter that is able to change the value of some 
> cookies. However the name of the cookie to be changed is a per- directory 
> value. So i have an input filter whose behavior depends programmatically 
> from per-directory configuration. How should i gain information about per 
> -dir configuration from within a filter??
>   Thanks in advance 
> Luca
> 

Same as from anywhere else.

But you don't want to do that in an input filter.  Use the header_parser
hook.  Or, if it needs to happen before/after some other module which
views cookies in a different hook, move as appropriate.

-- 
Nick Kew


Accessing to per Directory configuration from an input filter: HOW?

2005-06-22 Thread luca regini
I need to write an input filter that is able to change the value of some cookies. However, the name of the cookie to be changed is a per-directory value, so the filter's behavior depends on per-directory configuration. How should I get at the per-dir configuration from within a filter?

 
 
Thanks in advance 
Luca


Re: [apache-2.1.3] buffering content in input filter

2005-01-13 Thread Pawel Niewiadomski

On Wed, 12 Jan 2005 18:43:53 + (GMT), Nick Kew <[EMAIL PROTECTED]> wrote :

> I meant to add: there are C code examples in my stuff at
> apache.webthing.com.  Can't recollect OTTOMH what's published, but
> mod_upload is definitely opensource and its tmpfile_filter does
> what you're asking (though it's not an ideal example).

Thanks! I didn't think about calling ap_get_brigade in a loop. 
I was laughing at myself when I saw your response.
Now the problem is solved.

-- 
**Paweł Niewiadomski**, new()foo-baz.com, http://new.foo-baz.com/
Virtual Qmail (http://v-q.foo-baz.com), qmail-patches (http://q-p.foo-baz.com)



Re: [apache-2.1.3] buffering content in input filter

2005-01-12 Thread Nick Kew
On Wed, 12 Jan 2005, Nick Kew wrote:

> [a fairly useless message]

I meant to add: there are C code examples in my stuff at
apache.webthing.com.  Can't recollect OTTOMH what's published, but
mod_upload is definitely opensource and its tmpfile_filter does
what you're asking (though it's not an ideal example).

-- 
Nick Kew


Re: [apache-2.1.3] buffering content in input filter

2005-01-12 Thread Nick Kew
On Wed, 12 Jan 2005, Pawel Niewiadomski wrote:

> Does anyone know how can I solve this problem?

As Stas says, simply don't return to the caller until you've got EOS.

You lose the efficiency of pipelining, and beware of how big a request
you accept and buffer!
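
The advice above (keep pulling until EOS, and cap how much is buffered) can be sketched as a loop in plain C. It is only an illustration under stated assumptions: struct fake_upstream and fake_get_chunk are hypothetical stand-ins for repeated ap_get_brigade() calls on f->next, and slurp_until_eos is an illustrative name, not an httpd function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical upstream: a list of chunks handed out one per call. */
struct fake_upstream {
    const char **chunks;
    size_t count;
    size_t next;
};

/* Hypothetical stand-in for one ap_get_brigade() round trip: yields the
 * next chunk and sets *eos once the last chunk has been delivered. */
static int fake_get_chunk(struct fake_upstream *up, const char **data,
                          size_t *len, int *eos)
{
    if (up->next >= up->count) {
        *data = "";
        *len = 0;
        *eos = 1;
        return 0;
    }
    *data = up->chunks[up->next];
    *len = strlen(*data);
    up->next++;
    *eos = (up->next == up->count);
    return 0;
}

/* Keep pulling from upstream until EOS, appending into buf. Returns the
 * number of bytes collected, or -1 on an upstream error or when the body
 * would not fit in cap bytes. */
static long slurp_until_eos(struct fake_upstream *up, char *buf, size_t cap)
{
    size_t used = 0;
    int eos = 0;

    while (!eos) {
        const char *data;
        size_t len;

        if (fake_get_chunk(up, &data, &len, &eos) != 0) {
            return -1;
        }
        if (len > cap - used) {
            return -1;          /* refuse to buffer an oversized request */
        }
        memcpy(buf + used, data, len);
        used += len;
    }
    return (long)used;
}
```

The cap check is the part that answers the warning about request size: the loop gives up as soon as the next chunk would not fit, instead of growing the buffer without limit.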



-- 
Nick Kew


Re: [apache-2.1.3] buffering content in input filter

2005-01-12 Thread Stas Bekman
Pawel Niewiadomski wrote:
> hello,
> I'm writing an input filter that needs to operate on the full content
> of an incoming request. It should not pass incoming data on to other
> input filters as it arrives: it gathers the full body of the request,
> changes it and then passes it on.
>
> I have one problem: if I empty the brigade that is passed to the input
> filter, the filter is not called again.
> I read that empty brigades mean that input is ended. If I leave a byte in
> the brigade everything works ok: the input filter is called until
> EOS and I can dump the previously gathered content. But I don't want to
> add these bytes.
>
> Does anyone know how can I solve this problem?

Instead of returning empty brigades up the stream, you can gobble up all 
incoming data by issuing your own ap_get_brigade calls. I don't have an 
example in C, but you can see how it's done in perl:
t/filter/TestFilter/in_bbs_consume.pm
http://svn.apache.org/viewcvs.cgi/perl/modperl/trunk/t/filter/TestFilter/in_bbs_consume.pm?rev=106958&view=markup

--
__________________________________________________________________
Stas Bekman            JAm_pH --> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


[apache-2.1.3] buffering content in input filter

2005-01-12 Thread Pawel Niewiadomski
hello,
I'm writing an input filter that needs to operate on the full content
of an incoming request. It should not pass incoming data on to other
input filters as it arrives: it gathers the full body of the request,
changes it and then passes it on.

I have one problem: if I empty the brigade that is passed to the input
filter, the filter is not called again.
I read that empty brigades mean that input is ended. If I leave a byte in
the brigade everything works ok: the input filter is called until
EOS and I can dump the previously gathered content. But I don't want to
add these bytes. 

Does anyone know how can I solve this problem?

-- 
**Paweł Niewiadomski**, new()foo-baz.com, http://new.foo-baz.com/
Virtual Qmail (http://v-q.foo-baz.com), qmail-patches (http://q-p.foo-baz.com)



POST data input filter questions.

2004-10-21 Thread Nathanael Noblet
Hello,
	I apologize in advance if I'm in the wrong place. I have searched 
through the documentation provided through httpd.apache.org and the 2.0 
(doxygen) API. Plus many of the examples. I'm interested in 
implementing a module. I realize that this isn't the best place to ask, 
but the modules list bounced? I couldn't subscribe following the 
instructions at http://httpd.apache.org/lists.html for module 
development, as well as the fact that the modules archive's last post 
was in early 2005?

	I have a module as a handler that works as expected. The handler needs 
to deal with POST data, which from my reading, means I need an input 
filter. Which is fine, I can see how to set that up as well. Where I 
get lost is, how does my handler get the POST data that I'd read using 
my input filter? I assume it has to do with the context structure that 
each module defines for itself. What I don't understand is where/how 
this structure gets passed around. How would I access that structure 
from my handler? If this is the wrong place to ask, and there is a 
better place other than the defunct modules list, I am more than willing 
to go there. Or if there is a document that explains this part of the 
Apache input path, that would be great. I even looked through the PHP 
source to see if I could figure it out. And though I can see that they 
have a function for reading post data, I don't see exactly how that 
gets back to the handler or back to the output stream handling...


--
Nathanael D. Noblet
Gnat Solutions
204 - 131 Gorge Road E
Victoria, BC V9A 1L1
T/F 250.385.4613
http://www.gnat.ca/


Re: mod_ssl/input filter review needed

2004-06-03 Thread Justin Erenkrantz
--On Wednesday, June 2, 2004 12:12 PM +0100 Joe Orton <[EMAIL PROTECTED]> wrote:

> The approach I'm using is a new input filter which runs above (before)
> the HTTP input filter, and waits for an EOS, then does the SSL
> handshake.  All the data must be read from the socket before starting
> the handshake, so it reads from all the non-metadata buckets to ensure
> they're morphed if necessary.

I think a better analogy is: 'is it closer to the network?'  My impression 
from reading your code is that it is further away from the network than 
HTTP_IN.  Is that correct?  FWIW, that's how I'd do it: HTTP_IN runs first, 
then this new one - looking for the EOS generated from HTTP_IN.

Above and before are all screwy ways to phrase it.  ;-)

> Does this approach seem sane, am I missing any input filtering issues
> here?  I'd appreciate some review; patch below is newer than that
> attached to the bug report.

If my above assertion is correct, looks fine.  -- justin


mod_ssl/input filter review needed

2004-06-02 Thread Joe Orton
I'm working on a fix for #12355, the (infamous?) "per-directory SSL
renegotiation vs request with bodies" bug.  The issue is explained at
length in ssl_engine_kernel.c; simply put: if an SSLRequire is specified
in a directory/location context, it's necessary to perform an SSL
handshake *after* the request body has been read.

mod_ssl for 1.3 solved this using an inelegant but effective hack which
read the entire request body into memory; it seems like a better
solution is possible for 2.0.

The approach I'm using is a new input filter which runs above (before)
the HTTP input filter, and waits for an EOS, then does the SSL
handshake.  All the data must be read from the socket before starting
the handshake, so it reads from all the non-metadata buckets to ensure
they're morphed if necessary.

Does this approach seem sane, am I missing any input filtering issues
here?  I'd appreciate some review; patch below is newer than that
attached to the bug report.

Index: modules/ssl/mod_ssl.c
===
RCS file: /home/cvs/httpd-2.0/modules/ssl/mod_ssl.c,v
retrieving revision 1.97
diff -u -w -d -r1.97 mod_ssl.c
--- modules/ssl/mod_ssl.c   5 Mar 2004 02:41:39 -   1.97
+++ modules/ssl/mod_ssl.c   2 Jun 2004 11:06:00 -
@@ -469,6 +469,7 @@
 static void ssl_register_hooks(apr_pool_t *p)
 {
     ssl_io_filter_register(p);
+    ssl_reneg_filter_register(p);
 
     ap_hook_pre_connection(ssl_hook_pre_connection,NULL,NULL, APR_HOOK_MIDDLE);
     ap_hook_post_config   (ssl_init_Module,NULL,NULL, APR_HOOK_MIDDLE);
Index: modules/ssl/ssl_engine_kernel.c
===
RCS file: /home/cvs/httpd-2.0/modules/ssl/ssl_engine_kernel.c,v
retrieving revision 1.106
diff -u -w -d -r1.106 ssl_engine_kernel.c
--- modules/ssl/ssl_engine_kernel.c 25 May 2004 12:09:01 -  1.106
+++ modules/ssl/ssl_engine_kernel.c 2 Jun 2004 11:06:00 -
@@ -29,6 +29,8 @@
 -- Unknown*/
 #include "ssl_private.h"
 
+static int reneg_and_check(request_rec *r, int quick);
+
 /*
  *  Post Read Request Handler
  */
@@ -159,6 +161,99 @@
 return DECLINED;
 }
 
+static ap_filter_rec_t *reneg_filter_rec;
+
+/* The renegotiation input filter is inserted into the input filter
+ * stack to perform an SSL renegotatiation after the request body has
+ * been read.  It runs before the HTTP input filter and waits for it to return
+ * an EOS; at which point it is safe to perform the SSL handshake. */
+static apr_status_t reneg_in_filter(ap_filter_t *f, 
+                                    apr_bucket_brigade *bb,
+                                    ap_input_mode_t mode,
+                                    apr_read_type_e block,
+                                    apr_off_t bytes)
+{
+    apr_bucket *bkt;
+    apr_status_t rv;
+
+    /* This filter needs to buffer each brigade into memory to ensure
+     * that when an EOS is found, all data really has been read from
+     * the socket.  So, ensure that not too much is buffered: */
+    if (bytes > HUGE_STRING_LEN) {
+        bytes = HUGE_STRING_LEN;
+    }
+
+    rv = ap_get_brigade(f->next, bb, mode, block, bytes);
+    if (rv != APR_SUCCESS) {
+        return rv;
+    }
+
+    for (bkt = APR_BRIGADE_FIRST(bb);
+         bkt != APR_BRIGADE_SENTINEL(bb);
+         bkt = APR_BUCKET_NEXT(bkt))
+    {
+        if (APR_BUCKET_IS_EOS(bkt)) {
+            /* No more work for this filter. */
+            ap_remove_input_filter(f);
+
+            /* Now really do the negotiation and access control checks. */
+            if (reneg_and_check(f->r, 0)) {
+
+                /* Access control checks failed: send a 403. */
+                ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, f->r,
+                              "renegotiation failed; sending 403 error");
+                bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
+                bkt = ap_bucket_error_create(HTTP_FORBIDDEN, NULL,
+                                             f->r->pool, f->c->bucket_alloc);
+                APR_BRIGADE_INSERT_TAIL(bb, bkt);
+                bkt = apr_bucket_eos_create(f->c->bucket_alloc);
+                APR_BRIGADE_INSERT_TAIL(bb, bkt);
+
+                rv = ap_pass_brigade(f->r->output_filters, bb);
+                if (rv)
+                    ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, f->r,
+                                  "could not send 403 after renegotiation"
+                                  " failure");
+
+                /* Don't give anything back to the caller, just return
+                 * an error. */
+                apr_brigade_cleanup(bb);
+                return APR_EACCES;
+            }
+
+            /* i

Re: deflate input filter and jk

2004-03-31 Thread Henri Gomez
[EMAIL PROTECTED] wrote:
>> [EMAIL PROTECTED] wrote...
>>
>> Hi to all,
>>
>> A new question to HTTP / RFC gurus.
>>
>> A customer has developed a custom PHP HTTP client,
>> using HTTP 1.0 and compression.
>
> That's like mixing Vodka and Beer... something could
> easily puke... but OK... I hear ya...

That's a detailed reply; it will take some time to study.

BTW, my customer couldn't write a full HTTP 1.1 client
so we'll have to live with a HTTP 1.0 ;(


Re: deflate input filter and jk

2004-03-31 Thread TOKILEY

> [EMAIL PROTECTED] wrote...
>
> Hi to all,
>
> A new question to HTTP / RFC gurus.
>
> A customer has developed a custom PHP HTTP client,
> using HTTP 1.0 and compression.

That's like mixing Vodka and Beer... something could
easily puke... but OK... I hear ya...

> This HTTP client compresses both requests and replies.

Sure, why not.

> For replies it works great but for request we have
> a doubt.

I imagine so, yes.

> Since the HTTP client compresses a request, the HTTP
> headers contain:
>
> Content-Encoding: gzip
>
> Also the Content-Length is set to the size of the
> plain request (not the size of the compressed request).
>
> Is it correct or should it send the Content-Length with
> the size of the compressed request ?
>
> In such case, it seems that mod_deflate INPUT filter should
> modify the Content-Length accordingly ?
>
> Thanks for your help

You've got some messed up code on your hands, Henri.

In your particular case... Content-length should ALWAYS be 
ACTUAL length of the number of bytes on the wire. Anything else 
is going to screw something up somewhere.

You have to remember the difference between 'Content-Encoding:'
and 'Transfer-Encoding:'. 'Transfer-Encoding:' is a TRANSPORT
layer thing but 'Content-Encoding:' is a PRESENTATION
layer thing.

When any HTTP request or response says that its BODY DATA
has 'Content-Type: ' and/or 'Content-Length: ' what that
really meant ( in early HTTP terms ) is...

Content-Type: = Original MIME type of original data (file).
Content-Length: = Actual length of original data (file).

The original assumption in early HTTP was that this would always
represent some file on some disk and the 'Content-type:' was
usually just the file extension (mapped) and the 'Content-length:' was 
whatever a 'stat()' call says the file length was.

When Content started to get produced dynamically ( does not
exist until asked for ) things got a little sticky but the CONCEPT
is still the same. Content-type: is supposed to be the MIME type
'as-if' the 'file' already existed and 'Content-length' would be the
exact number of ( PRESENTATION LAYER ) bytes 'as-if' the
'data file' was sitting on a disk somewhere.

If ANYTHING steps in to alter or filter or convert the 'content'
at the PRESENTATION layer then it MUST change the 'Content-Length'
as well because from the 'Content-x' perspective... the
content has, in fact, changed at the PRESENTATION layer.

There is no HTTP header field that looks like this...

Original-Content-Length:  <- Length of data before P layer content changed

All you have to work with is this...

Content-length:  <- Length of P layer data NOW after something changes it.

RFC 2616 says...

4.4 Message Length
3. If a Content-Length header field ( section 14.13 ) is present, its
decimal value in OCTETs represents BOTH the entity-length and
the transfer-length. The Content-Length header field must NOT be sent
if these two lengths are different [snip]

What this really means is...

3. If a ( PRESENTATION layer ) Content-Length header field
( section 14.13 ) is present, its decimal value in OCTETs represents
BOTH the entity-length ( Actual PRESENTATION layer length ) and
the transfer-length. ( TRANSPORT layer length - actual number of 
bytes on the wire ). The Content-Length header field must NOT be sent
if these two lengths are different [snip]

The last part is kind of moot since it's not uncommon at all for
presentation layer content-length to be 'different' from the actual
transport layer length. You will see it all the time 'out there'. The
only thing that gets you into real trouble is when the actual length
of the data is MORE than whatever the 'Content-length:' field says
it's supposed to be.

Example: Even with all the above being said... it is actually OK to
leave 'Content-Length:' set to the original size of the file IF you are
using GZIP or DEFLATE ( or any LZ77 ) to compress the content.
As long as the specified 'Content-length:' ( original size ) is MORE
than the number of compressed LZ77 bytes on the wire you will
usually still be OK.

Why?... because a GZIP decompressor can already find out what the
original content length was and doesn't need HTTP to tell it. The size
of the original data ( modulo 2^32 ) is stored in the trailer of a GZIP
stream. A bare ZLIB / DEFLATE stream carries no such length field.

Even 'streamed compression' ( sic: ZLIB ) will KNOW when the 
decompression has ended. There's an EOD signal built into
the stream itself... but that doesn't mean the Server will know
what the decompressor 'knows'.

Which brings us to your 'action items', methinks.

If you are using 'streamed co

Re: deflate input filter and jk

2004-03-31 Thread Henri Gomez
Joshua Slive wrote:

> On Wed, 31 Mar 2004, Henri Gomez wrote:
>
>> Also the Content-Length is set to the size of the
>> plain request (not the size of the compressed request).
>>
>> Is it correct or should it send the Content-Length with
>> the size of the compressed request ?
>>
>> In such case, it seems that mod_deflate INPUT filter should
>> modify the Content-Length accordingly ?
>
> The note at the bottom of this section:
> http://httpd.apache.org/docs-2.0/mod/mod_deflate.html#enable
> says that it should be the compressed length and that the server is not
> responsible for changing the length during decompression.

Ok.

Since jk / jk2 use the Content-Length to forward data to Tomcat,
how do jk / jk2 know the correct size to forward?

Regards


Re: deflate input filter and jk

2004-03-31 Thread Joshua Slive

On Wed, 31 Mar 2004, Henri Gomez wrote:
> Also the Content-Length is set to the size of the
> plain request (not the size of the compressed request).
>
> Is it correct or should it send the Content-Length with
> the size of the compressed request ?
>
> In such case, it seems that mod_deflate INPUT filter should
> modify the Content-Length accordingly ?

The note at the bottom of this section:
http://httpd.apache.org/docs-2.0/mod/mod_deflate.html#enable
says that it should be the compressed length and that the server is not
responsible for changing the length during decompression.

Joshua.
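
To make the framing point above concrete, here is a minimal sketch in plain C. It is not httpd or jk code, and `classify_content_length` and `missing_bytes` are hypothetical names. It only models what a receiver that trusts Content-Length would see when the declared value differs from the number of body bytes actually sent.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    CL_MATCH,       /* declared length equals the bytes on the wire */
    CL_OVERSTATED,  /* receiver expects more body bytes than were sent */
    CL_UNDERSTATED  /* bytes are left over after the declared body ends */
} cl_result;

/* Compare the declared Content-Length with the real body size. */
cl_result classify_content_length(size_t declared, size_t wire_bytes)
{
    if (declared == wire_bytes)
        return CL_MATCH;
    return declared > wire_bytes ? CL_OVERSTATED : CL_UNDERSTATED;
}

/* How many body bytes the receiver is still waiting for. */
size_t missing_bytes(size_t declared, size_t wire_bytes)
{
    assert(declared >= wire_bytes);  /* only meaningful when overstated */
    return declared - wire_bytes;
}
```

With the client described in this thread (say 1000 plain bytes compressed to 300, and Content-Length: 1000), the request would be classified as CL_OVERSTATED with 700 bytes missing, so anything that forwards exactly Content-Length bytes would stall or hit end of input early.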


deflate input filter and jk

2004-03-31 Thread Henri Gomez
Hi to all,

A new question to HTTP / RFC gurus.

A customer has developed a custom PHP HTTP client,
using HTTP 1.0 and compression.

This HTTP client compresses both requests and replies.

For replies it works great but for requests we have
a doubt.

Since the HTTP client compresses a request, the HTTP
headers contain:

Content-Encoding: gzip

Also the Content-Length is set to the size of the
plain request (not the size of the compressed request).

Is it correct or should it send the Content-Length with
the size of the compressed request ?

In such case, it seems that mod_deflate INPUT filter should
modify the Content-Length accordingly ?

Thanks for your help


inserting env var to cgi from input filter

2004-02-25 Thread Aryeh Katz
It seems from the code, that any headers an input filter would like to 
insert in the environment won't be inserted.
This is because by the time the input filter has been called 
(ap_get_brigade on line 696) the environment has been finalized (line 
with run_cgi_child:ap_create_privileged_process, line 487).
What is the recommended way for an input filter to insert data into 
headers_in so that it is available for all handlers?
--
Aryeh Katz
SecureD Services
http://www.secured-services.com/
410 653 0700 x 2




Re: should input filter return the exact amount of bytes asked for?

2003-11-14 Thread Stas Bekman
Justin Erenkrantz wrote:

Thanks for the explanations, Justin. Once I get some free time I'll need to
revamp the filters chapter [1] to address the read mode issue. So far I was
completely ignoring it :(

[1] http://perl.apache.org/docs/2.0/user/handlers/filters.html

>> Though it'd be nice to add a note re: APR_BLOCK_READ in the
>> AP_MODE_READBYTES doc above. Or I guess may be it belongs to some filters
>> tutorial...
>
> I'll note that I wrote an article on describing httpd-2.x's filters for
> some Linux magazine recently.  I bet you can find back issues.  As an
> aside, I never actually saw the final copy or the printed copy.  So,
> don't blame me if it doesn't help.  ;-)  -- justin

Is that the one you are talking about?
http://www.linux-mag.com/2003-08/apache_01.html

rbb wrote a bunch of filtering articles some 2 years ago or so too. It'd
probably be nice to ask those magazines if we can dump them somewhere under
the docs-2.0 project, versus linking to them, as ezines tend to move things a
lot and even kill them.

__
Stas BekmanJAm_pH --> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: should input filter return the exact amount of bytes asked for?

2003-11-14 Thread Justin Erenkrantz
--On Thursday, November 13, 2003 11:01 AM -0800 Stas Bekman <[EMAIL PROTECTED]> 
wrote:

> Should we add an explicit explanation to AP_MODE_READBYTES: return at most
> readbytes data. Can't return 0 with APR_BLOCK_READ. Can't return more than
> readbytes data.

I'd say the first and last one are equivalent statements.  And, that
APR_BLOCK_READ description belongs with the definition of APR_BLOCK_READ not
AP_MODE_READBYTES.

> Also while we are at it I have a few more questions:
>
>  /** The filter should return at most one line of CRLF data.
>   *  (If a potential line is too long or no CRLF is found, the
>   *   filter may return partial data).
>   */
>  AP_MODE_GETLINE,
>
> does it mean that the filter should ignore the readbytes argument in this
> mode?

I think so, yes.

>  /** The filter should implicitly eat any CRLF pairs that it sees. */
>  AP_MODE_EATCRLF,
>
> does it mean that it should do the same as AP_MODE_GETLINE but kill CRLF? If
> not how much data is it supposed to read? Or is it a mode that never goes on
> its own and should be OR'ed with some definitive mode, e.g.:
> AP_MODE_GETLINE|AP_MODE_EATCRLF and AP_MODE_READBYTES|AP_MODE_EATCRLF?

It's meant to be called right before we read the next pipelined request on the
connection.  Old (really old) Netscape clients added spurious CRLFs between
requests.  I don't see a clear rationale why it'd have to be 'combined' with
other ap_get_brigade() modes.  The only one that'd make sense (to me) is
AP_MODE_GETLINE.  Note that AP_MODE_EATCRLF doesn't necessarily return
anything.  It's wildly HTTP specific...
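
As an illustration of the CRLF-eating idea described above, a minimal sketch in plain C follows. It is not the core input filter's code; `skip_leading_crlf` is a hypothetical helper that works on a flat buffer rather than on buckets, and it assumes the stray bytes are complete CRLF pairs.

```c
#include <assert.h>
#include <stddef.h>

/* Return how many leading bytes of buf consist of complete CRLF pairs.
 * A caller would drop that many bytes before parsing the next
 * pipelined request line. */
size_t skip_leading_crlf(const char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i + 1 < len && buf[i] == '\r' && buf[i + 1] == '\n') {
        i += 2;
    }
    return i;
}
```

Like the mode it imitates, the helper may legitimately find nothing to eat and return 0.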

> Though it'd be nice to add a note re: APR_BLOCK_READ in the
> AP_MODE_READBYTES doc above. Or I guess may be it belongs to some filters
> tutorial...

I'll note that I wrote an article on describing httpd-2.x's filters for some
Linux magazine recently.  I bet you can find back issues.  As an aside, I
never actually saw the final copy or the printed copy.  So, don't blame me if
it doesn't help.  ;-)  -- justin


Re: should input filter return the exact amount of bytes asked for?

2003-11-13 Thread Stas Bekman
Justin Erenkrantz wrote:
> --On Thursday, November 13, 2003 12:38 AM -0800 Stas Bekman
> <[EMAIL PROTECTED]> wrote:
>
>> Great. Where this should be documented? In the ap_get_brigade .h?
>
> It's already in util_filter.h.  Read the documentation for
> ap_input_mode_t:
>
>    /** The filter should return at most readbytes data. */
>    AP_MODE_READBYTES,
>    ...

Aha! I was looking in the wrong place then. Thanks Justin.

Should we add an explicit explanation to AP_MODE_READBYTES: return at most
readbytes data. Can't return 0 with APR_BLOCK_READ. Can't return more than
readbytes data.

Also while we are at it I have a few more questions:

/** The filter should return at most one line of CRLF data.
 *  (If a potential line is too long or no CRLF is found, the
 *   filter may return partial data).
 */
AP_MODE_GETLINE,

does it mean that the filter should ignore the readbytes argument in this mode?

/** The filter should implicitly eat any CRLF pairs that it sees. */
AP_MODE_EATCRLF,

does it mean that it should do the same as AP_MODE_GETLINE but kill CRLF? If
not how much data is it supposed to read? Or is it a mode that never goes on
its own and should be OR'ed with some definitive mode, e.g.:
AP_MODE_GETLINE|AP_MODE_EATCRLF and AP_MODE_READBYTES|AP_MODE_EATCRLF?

>> right? Or how otherwise would you explain the assertion:
>>
>>    AP_DEBUG_ASSERT(!APR_BRIGADE_EMPTY(bb));
>
> If using APR_BLOCK_READ, it's illegal to return 0 bytes with
> AP_MODE_READBYTES - that is what this assert is checking for in
> maintainer mode (this was a troublesome assert at one point).  It's the
> same expectation as doing a blocking socket read() - blocking reads
> shouldn't return until something is returned.  -- justin

Cool:

/** Determines how a bucket or brigade should be read */
typedef enum {
    APR_BLOCK_READ,   /**< block until data becomes available */
    APR_NONBLOCK_READ /**< return immediately if no data is available */
} apr_read_type_e;

Though it'd be nice to add a note re: APR_BLOCK_READ in the AP_MODE_READBYTES
doc above. Or I guess may be it belongs to some filters tutorial...

__
Stas BekmanJAm_pH --> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: should input filter return the exact amount of bytes asked for?

2003-11-13 Thread Justin Erenkrantz
--On Thursday, November 13, 2003 12:38 AM -0800 Stas Bekman <[EMAIL PROTECTED]> 
wrote:

> Great. Where this should be documented? In the ap_get_brigade .h?

It's already in util_filter.h.  Read the documentation for ap_input_mode_t:

   /** The filter should return at most readbytes data. */
   AP_MODE_READBYTES,
   ...

> right? Or how otherwise would you explain the assertion:
>
>    AP_DEBUG_ASSERT(!APR_BRIGADE_EMPTY(bb));

If using APR_BLOCK_READ, it's illegal to return 0 bytes with AP_MODE_READBYTES
- that is what this assert is checking for in maintainer mode (this was a
troublesome assert at one point).  It's the same expectation as doing a
blocking socket read() - blocking reads shouldn't return until something is
returned.  -- justin


Re: should input filter return the exact amount of bytes asked for?

2003-11-13 Thread Stas Bekman
Justin Erenkrantz wrote:
> On Tue, Nov 04, 2003 at 01:41:46AM -0800, Stas Bekman wrote:
>
>> filter. What happens if the filter returns less bytes (while there is still
>> more data coming?) What happens if the filter returns more bytes than
>> requested (e.g. because it uncompressed some data). After all the incoming
>
> Less bytes = OK.
> Same bytes = OK.
> More bytes = Not OK.  (Theoretically possible though with bad filters.)

Great. Where this should be documented? In the ap_get_brigade .h?

Also,

 0 bytes = Not OK

right? Or how otherwise would you explain the assertion:

  AP_DEBUG_ASSERT(!APR_BRIGADE_EMPTY(bb));

in consumers like ap_get_client_block. Or do you say that a filter can return 
a non-empty brigade with an empty single bucket?

Thanks Justin.

__
Stas BekmanJAm_pH --> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: should input filter return the exact amount of bytes asked for?

2003-11-11 Thread Justin Erenkrantz
--On Tuesday, November 11, 2003 11:24 AM -0600 "William A. Rowe, Jr." 
<[EMAIL PROTECTED]> wrote:

>> More bytes = Not OK.  (Theoretically possible though with bad filters.)
>
> Wrong.  This is OK across the board, please consider;

Uh, no.  We changed the filter semantics some time ago to stop this insanity.
It was inefficient to call AP_MODE_READBYTES and have it return more than
asked for.  Check out the CVS log for util_filter.h, specifically around
revision 1.62.

> module requests 1000 arbitrary bytes;
>
>   codepage module requests 1000
>
> http reads one 'chunk' available, 8000 bytes
> and will return that page
>
>   codepage can translate 7998 bytes and comes to
>   a screeching halt for a 3 byte sequence.  returns
>   our Now Translated 4000 bytes
>
> module sees a 4000 byte heap bucket.
>
> What can you do?  Instead of treating that bucket as a singleton
> when you want 1000 bytes, consume the first 1000 bytes from that
> bucket (or the brigade.)

No.  That means you have 3k more bytes you have to consume that you didn't ask
for.  The filter wouldn't return it again.  Writing code that used input
filters and having to deal with that it could get more than asked for was just
confusing and led to lots of error-prone code.

If it asks for 1k in AP_MODE_READBYTES, it gets at most 1k.  Anything else is
broken.  (util_filter.h AP_MODE_READBYTES says as much, but that's not fair,
because I wrote that comment.)

> Please review the archives for this discussion (the brigades on the
> apr list, the filter api on httpd.)  This was a very long thread, but the
> net result of filters is that you get what is available/handy, not any
> specific number of bytes.

That *was* indeed the position at one time, but when I redid the input filters
(which was about rewrite #14 of input filters), we corrected this because it
was causing lots of problems to return more than asked for - this is when we
added the mode argument to ap_get_brigade.  mod_ssl's input filtering code was
just broken under that old API.

And, the big boys even reviewed the code and semantic changes before it went
in.  So, it was definitely RTC.  ;-)  -- justin
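
One way a transforming filter could honour the "at most readbytes" rule in the codepage scenario above is to keep its surplus output in a per-filter context and hand it out on later calls. The following is a minimal sketch in plain C under that assumption. `leftover_ctx`, `ctx_stash` and `ctx_take` are hypothetical names, and a real filter would keep buckets in a brigade rather than bytes in a fixed array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PENDING_MAX 8192

/* Output the filter has produced but has not yet been asked for. */
typedef struct {
    char   pending[PENDING_MAX];
    size_t start;  /* first byte not yet returned */
    size_t end;    /* one past the last stored byte */
} leftover_ctx;

/* Store translated bytes.  Returns 0 on success, -1 if there is no room. */
int ctx_stash(leftover_ctx *c, const char *data, size_t len)
{
    if (c->start == c->end) {
        c->start = c->end = 0;   /* buffer drained: reuse it from the top */
    }
    if (len > PENDING_MAX - c->end) {
        return -1;
    }
    memcpy(c->pending + c->end, data, len);
    c->end += len;
    return 0;
}

/* Hand the caller at most readbytes bytes and keep the rest for later. */
size_t ctx_take(leftover_ctx *c, char *out, size_t readbytes)
{
    size_t avail = c->end - c->start;
    size_t n = readbytes < avail ? readbytes : avail;

    memcpy(out, c->pending + c->start, n);
    c->start += n;
    assert(n <= readbytes);      /* never more than was asked for */
    return n;
}
```

In the example from the thread, the filter would stash its 4000 translated bytes once and satisfy four successive 1000-byte requests from the context before reading upstream again.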


Re: should input filter return the exact amount of bytes asked for?

2003-11-11 Thread William A. Rowe, Jr.
At 03:31 AM 11/11/2003, Justin Erenkrantz wrote:
>On Tue, Nov 04, 2003 at 01:41:46AM -0800, Stas Bekman wrote:
>> filter. What happens if the filter returns less bytes (while there is still 
>> more data coming?) What happens if the filter returns more bytes than 
>> requested (e.g. because it uncompressed some data). After all the incoming 
>
>Less bytes = OK.

But not great if there is more incoming data available (consider that one
can call with NONBLOCK and dig up some more).  There is a balance to be
found here: one doesn't want to slurp 15mb of a file at once, but one doesn't
want bytes to trickle up one at a time.

>Same bytes = OK.

Of course

>More bytes = Not OK.  (Theoretically possible though with bad filters.)

Wrong.  This is OK across the board, please consider;

module requests 1000 arbitrary bytes;

  codepage module requests 1000

http reads one 'chunk' available, 8000 bytes
and will return that page

  codepage can translate 7998 bytes and comes to
  a screeching halt for a 3 byte sequence.  returns
  our Now Translated 4000 bytes

module sees a 4000 byte heap bucket.

What can you do?  Instead of treating that bucket as a singleton
when you want 1000 bytes, consume the first 1000 bytes from that
bucket (or the brigade.)

Please review the archives for this discussion (the brigades on the
apr list, the filter api on httpd.)  This was a very long thread, but the
net result of filters is that you get what is available/handy, not any
specific number of bytes.

Bill



Re: should input filter return the exact amount of bytes asked for?

2003-11-11 Thread Justin Erenkrantz
On Tue, Nov 04, 2003 at 01:41:46AM -0800, Stas Bekman wrote:
> filter. What happens if the filter returns less bytes (while there is still 
> more data coming?) What happens if the filter returns more bytes than 
> requested (e.g. because it uncompressed some data). After all the incoming 

Less bytes = OK.
Same bytes = OK.
More bytes = Not OK.  (Theoretically possible though with bad filters.)

HTH.  -- justin
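
Since a short read is legal but an over-long one is not, a consumer that needs exactly N bytes has to loop. Here is a minimal sketch of that loop in plain C. `mock_source` and `mock_read` are hypothetical stand-ins for the upstream filter (they are not httpd APIs), and `read_fully` is the part a consumer would write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for an upstream filter that returns data in small pieces. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;
    size_t chunk;  /* largest piece handed out per call */
} mock_source;

/* Return at most `want` bytes; 0 means end of stream. */
size_t mock_read(mock_source *s, char *out, size_t want)
{
    size_t avail = s->len - s->pos;
    size_t n = want < avail ? want : avail;

    if (n > s->chunk) {
        n = s->chunk;
    }
    memcpy(out, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Keep asking until `want` bytes have arrived or the stream ends. */
size_t read_fully(mock_source *s, char *out, size_t want)
{
    size_t total = 0;

    while (total < want) {
        size_t n = mock_read(s, out + total, want - total);

        if (n == 0) {
            break;                    /* end of stream: short result */
        }
        assert(n <= want - total);    /* the "at most readbytes" contract */
        total += n;
    }
    return total;
}
```

Because each call asks only for what is still missing, the loop never has to buffer surplus data, which is what makes the "never more than asked" rule convenient for callers.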


Re: should input filter return the exact amount of bytes asked for?

2003-11-06 Thread Stas Bekman
Stas Bekman wrote:
> I'm trying to get rid of ap_get_client_block(), but I don't understand a
> few things. ap_get_client_block() asks for readbytes from the upstream
> filter. What happens if the filter returns less bytes (while there is
> still more data coming?) What happens if the filter returns more bytes
> than requested (e.g. because it uncompressed some data). After all the
> incoming filters all propagate a request for N bytes read to the core_in
> filter, which returns that exact number if it can. Now as the data flows
> up the filter chain its length may change. Does it mean that if the
> filter didn't return the exact amount asked for it's broken? Is that the
> case when it returns less data than requested? Or when it returns more
> data?
>
> I'm trying to deal with the case where a user call wants N bytes and
> I've to give that exact number in a single call. I'm not sure whether I
> should buffer things if I've got too much data or on the opposite ask
> for more bbs if I don't have enough data. Are there any modules I can
> look at to learn from?
>
> The doc for ap_get_brigade doesn't say anything about ap_get_brigade
> satisfying 'readbytes' argument.
>
> /**
>  * Get the current bucket brigade from the next filter on the filter
>  * stack.  The filter returns an apr_status_t value.  If the bottom-most
>  * filter doesn't read from the network, then ::AP_NOBODY_READ is returned.
>  * The bucket brigade will be empty when there is nothing left to get.
>  * @param filter The next filter in the chain
>  * @param bucket The current bucket brigade.  The original brigade passed
>  *   to ap_get_brigade() must be empty.
>  * @param mode   The way in which the data should be read
>  * @param block  How the operations should be performed
>  *   ::APR_BLOCK_READ, ::APR_NONBLOCK_READ
>  * @param readbytes How many bytes to read from the next filter.
>  */
> AP_DECLARE(apr_status_t) ap_get_brigade(ap_filter_t *filter,
>                                         apr_bucket_brigade *bucket,
>                                         ap_input_mode_t mode,
>                                         apr_read_type_e block,
>                                         apr_off_t readbytes);


What bothers me most is the case where a filter may return more data than it 
has been asked for in the AP_MODE_READBYTES mode. ap_get_client_block() 
doesn't deal with buffering such data and drops it on the floor. So it either 
has to be fixed to do the buffering or the filter spec (ap_get_brigade) needs 
to clearly state that no more than requested amount of data should be returned 
in the AP_MODE_READBYTES. And ap_get_client_block needs to assert if it gets more.

__
Stas BekmanJAm_pH --> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


should input filter return the exact amount of bytes asked for?

2003-11-04 Thread Stas Bekman
I'm trying to get rid of ap_get_client_block(), but I don't understand a few 
things. ap_get_client_block() asks for readbytes from the upstream filter. 
What happens if the filter returns less bytes (while there is still more data 
coming?) What happens if the filter returns more bytes than requested (e.g. 
because it uncompressed some data). After all the incoming filters all 
propagate a request for N bytes read to the core_in filter, which returns that 
exact number if it can. Now as the data flows up the filter chain its length 
may change. Does it mean that if the filter didn't return the exact amount 
asked for it's broken? Is that the case when it returns less data than 
requested? Or when it returns more data?

I'm trying to deal with the case where a user call wants N bytes and I've to 
give that exact number in a single call. I'm not sure whether I should buffer 
things if I've got too much data or on the opposite ask for more bbs if I 
don't have enough data. Are there any modules I can look at to learn from?

The doc for ap_get_brigade doesn't say anything about ap_get_brigade 
satisfying 'readbytes' argument.

/**
 * Get the current bucket brigade from the next filter on the filter
 * stack.  The filter returns an apr_status_t value.  If the bottom-most
 * filter doesn't read from the network, then ::AP_NOBODY_READ is returned.
 * The bucket brigade will be empty when there is nothing left to get.
 * @param filter The next filter in the chain
 * @param bucket The current bucket brigade.  The original brigade passed
 *   to ap_get_brigade() must be empty.
 * @param mode   The way in which the data should be read
 * @param block  How the operations should be performed
 *   ::APR_BLOCK_READ, ::APR_NONBLOCK_READ
 * @param readbytes How many bytes to read from the next filter.
 */
AP_DECLARE(apr_status_t) ap_get_brigade(ap_filter_t *filter,
apr_bucket_brigade *bucket,
ap_input_mode_t mode,
apr_read_type_e block,
apr_off_t readbytes);
__
Stas BekmanJAm_pH --> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: Input filter and setting HTTP headers in Apache 2.0

2003-09-28 Thread Joe Schaefer
[EMAIL PROTECTED] writes:

> [...]
> > 
> > Why don't you just write a fixup handler with libapreq-2?
> > AFAICT libapreq-2's mod_apreq (filter) handles these details
> > for you.
> > 
> 
> I did have the sense that what I was trying to do running under the
> HTTP_IN filter was an impossible task to achieve. That is until I took a
> look at libapreq-2. Since my concern was modifying the request body, I
> misled myself by trying to write a connection filter, when a resource
> one was what I needed. 

It's quite a tricky problem.  mod_apreq went through quite a few
redesigns before arriving at the current implementation.  libapreq-2
users need mod_apreq to pull double duty as both

  (1) an active (pull) parser with a spool sitting right behind HTTP_IN.
  (2) a passive (push) parser sitting at the very end of the input filters.

Handlers that run before the content handler need mod_apreq in 
mode (1), whereas the content handler and any subsequent output 
filters need mode (2).  The current implementation of mod_apreq relies
on the filter_init() call to switch from mode (1) to mode (2).
I'm not sure how smoothly this changeover happens on an internal
redirect (httpd-apreq-2 doesn't have any tests written for redirects), 
but I suppose we'll cross that bridge as we come to it :-).

-- 
Joe Schaefer



Re: Input filter and setting HTTP headers in Apache 2.0

2003-09-28 Thread albertochan
[...]
> 
> Why don't you just write a fixup handler with libapreq-2?
> AFAICT libapreq-2's mod_apreq (filter) handles these details
> for you.
> 

I did have the sense that what I was trying to do running under the HTTP_IN filter was an 
impossible task to achieve. That is until I took a look at libapreq-2. Since my 
concern was modifying the request body, I misled myself by trying to write a 
connection filter, when a resource one was what I needed.

[...]
> No, IMO it's *even simpler* than that.
> -- 
> Joe Schaefer
> 

You're right on this one :). Thanks for the insight! 


Re: Input filter and setting HTTP headers in Apache 2.0

2003-09-25 Thread Joe Schaefer
[EMAIL PROTECTED] writes:

[...]

> The question is how can I ask CORE_IN for the request line,
> http headers and request body without causing it to be 
> consumed? 

Why don't you just write a fixup handler with libapreq-2?
AFAICT libapreq-2's mod_apreq (filter) handles these details
for you.

[...]

> > The simple answer is that if you return data in the right
> > format to HTTP_IN, then there won't be a problem :)  
> 
> I wish it was easy as that :)
> 

No, IMO it's *even simpler* than that.

-- 
Joe Schaefer



Re: Input filter and setting HTTP headers in Apache 2.0

2003-09-25 Thread albertochan
Hi, 

>> The only way I can see where I will have the ability
>>  to insert those HTTP inbound request headers is if
>>  my filter runs between the CORE_IN and the HTTP_IN
>>  input filter,
>
> I'm afraid this is true...  Does anyone else have a better idea?

After a few days of research, it seems like only a connection
or protocol input filter would give me access to the headers + post data,
but the problem is that once a filter starts requesting
data from CORE_IN, unless in SPECULATIVE mode, that data
will be marked gone by CORE_IN. Even in this case, this mode
will not work for a large request body (e.g. 1 MB), as it will always
return the same beginning chunk of data (the max internal buffer size
being 8192).
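
The difference between a speculative read and a normal one, as described above, can be modelled with a flat buffer. This is a minimal sketch in plain C, not CORE_IN's implementation; `in_buf`, `buf_peek` and `buf_read` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *data;
    size_t len;
    size_t pos;  /* bytes already consumed */
} in_buf;

/* Speculative-style read: copy up to n bytes but leave pos untouched,
 * so repeated calls keep returning the same leading bytes. */
size_t buf_peek(const in_buf *b, char *out, size_t n)
{
    size_t avail = b->len - b->pos;

    if (n > avail) {
        n = avail;
    }
    memcpy(out, b->data + b->pos, n);
    return n;
}

/* Normal read: the same copy, after which the bytes count as consumed. */
size_t buf_read(in_buf *b, char *out, size_t n)
{
    size_t got = buf_peek(b, out, n);

    b->pos += got;
    assert(b->pos <= b->len);
    return got;
}
```

Two peeks in a row return identical data, which matches the observation that a speculative read cannot be used to walk through a large body.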

>
>> in which case I will need a properly
>>  populated request_rec* structure to be able to use
>>  the ap_XXX APIs (typical things would be get the
>>  mime-type, content-length of the POST data, protocol
>>  version, etc).
>
> No, just modify the protocol data...  If you want to insert a specific 
> header field, just construct the text form of it as if the brower sent 
> it and return it to the filter on top at the right point in the data 
> stream.  You're going to be keeping up with data you've read anyway so 
> you can return it at the right time to the calling filter.  Be careful 
> you have configuration to avoid buffering the entire .iso file that 
> somebody tries to copy to their DAV filesystem.  And your configuration 
> may be roll-your-own w.r.t. selecting which request objects to operate 
> on since you may have to make your decisions before Apache has read the 
> entire request header.
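
The "construct the text form of it" suggestion above can be sketched as follows in plain C. This is only an illustration on a NUL-terminated string; `insert_header` is a hypothetical helper, and a real filter below HTTP_IN would have to do the equivalent on buckets as the header lines are requested one at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `raw` into `out` with `line` (no trailing CRLF) added as the last
 * header, just before the blank line that ends the header block.
 * Returns the new length, or -1 if the header block is not terminated
 * or `out` is too small. */
int insert_header(const char *raw, const char *line, char *out, size_t outsize)
{
    const char *end;
    size_t head_len, line_len, rest_len;

    assert(raw != NULL && line != NULL && out != NULL);
    end = strstr(raw, "\r\n\r\n");
    if (end == NULL) {
        return -1;
    }
    head_len = (size_t)(end - raw) + 2;  /* keep the CRLF of the last line */
    line_len = strlen(line);
    rest_len = strlen(raw + head_len);   /* blank line plus any body text */
    if (head_len + line_len + 2 + rest_len + 1 > outsize) {
        return -1;
    }
    memcpy(out, raw, head_len);
    memcpy(out + head_len, line, line_len);
    memcpy(out + head_len + line_len, "\r\n", 2);
    memcpy(out + head_len + line_len + 2, raw + head_len, rest_len + 1);
    return (int)(head_len + line_len + 2 + rest_len);
}
```

The layer above then parses the extra line exactly as if the browser had sent it, which is the point of doing the modification at the protocol-text level.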

Being an input filter, my code doesn't
get invoked unless someone asks for some data. This will
be initiated by read_request_line(), which will
call ap_get_brigade() in GETLINE mode. It is then
that my filter gets called and it would like to read
request line + http headers + post data (a big request
body, or chunked encoding will, at this point, make things
even harder), and then perhaps add additional headers
so that the web application can access them through
an ap_table_get(r->headers_in) call. I could save those headers
in my context (f->ctx), and then modify the brigade
when ap_get_mime_headers_core() is called (assuming
that the next call that asks for data after
read_request_line() will always be ap_get_mime_headers_core()).
Is that a fair assumption?
Once done, return the brigade with the request line
that read_request_line() originally asked for.

The question is how can I ask CORE_IN for the request line,
http headers and request body without causing it to be 
consumed? 

The other worry is memory usage. Since all allocation will
be done from the connection pool, with keepalive, that connection
can be alive for a long time, and as such, I can see memory
usage keep increasing. Could that be addressed by running 
as a protocol filter that gets inserted on the create_request()
hook?

>>  Since my filter will run before the
>>  HTTP_IN filter would have had a chance to parse the
>>  request line and the request headers, it will most
>>  likely cause confusion within the Apache internal
>>  state.

> The simple answer is that if you return data in the right format to 
> HTTP_IN, then there won't be a problem :)  

I wish it was easy as that :)


Re: Input filter and setting HTTP headers in Apache 2.0

2003-09-19 Thread Jeff Trawick
[EMAIL PROTECTED] wrote:

>> Be careful 
>> you have configuration to avoid buffering the entire .iso file that 
>> somebody tries to copy to their DAV filesystem.  And your configuration 
>> may be roll-your-own w.r.t. selecting which request objects to operate 
>> on since you may have to make your decisions before Apache has read the 
>> entire request header.
>
> Could you perhaps please expand a bit on this? 

I made two assumptions, perhaps both bogus:

a) you need to read the entire request body (POST data) before deciding 
what headers to add, if any
b) you'll buffer the request body in memory 'til you get to the end of 
it and decide what to do

b would be really bad since for a large post you'll crash the server 
trying to alloc so much storage, which is why you wouldn't buffer it in 
memory.
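
A minimal sketch of the alternative to assumption (b): cap how much of the body is held in memory and record when the cap is exceeded. This is plain C rather than the httpd/APR API; body_buf, body_buf_append and BODY_CAP are hypothetical names, and the 16-byte cap is only there to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this models the idea in plain C and
 * does not use the httpd or APR API. */
#define BODY_CAP 16  /* illustrative limit; a real one would be configurable */

typedef struct {
    char   data[BODY_CAP];
    size_t len;         /* bytes currently held in data */
    int    overflowed;  /* set once the body exceeded BODY_CAP */
} body_buf;

/* Append one chunk of body data. Returns 1 while everything seen so far
 * fits in the buffer, 0 once the cap has been exceeded. After an overflow
 * the buffer stops growing, so memory use stays bounded no matter how
 * large the upload is. */
static int body_buf_append(body_buf *b, const char *chunk, size_t n)
{
    assert(b != NULL);
    if (b->overflowed)
        return 0;
    if (n > BODY_CAP - b->len) {
        b->overflowed = 1;
        return 0;
    }
    memcpy(b->data + b->len, chunk, n);
    b->len += n;
    return 1;
}
```

Once body_buf_append() returns 0, the caller would have to choose between skipping the inspection and rejecting the request; either way the module never holds more than BODY_CAP bytes per request.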

--

The point about roll-your-own configuration is that you can't make use 
of Apache <Location> and <Directory> containers to tell your module 
whether it should do this expensive processing on the POST data, since 
you have to decide whether or not to do it long before the Apache HTTP 
engine has determined the Location and Directory information.




Re: Input filter and setting HTTP headers in Apache 2.0

2003-09-19 Thread albertochan
> Be careful 
> you have configuration to avoid buffering the entire .iso file that 
> somebody tries to copy to their DAV filesystem.  And your configuration 
> may be roll-your-own w.r.t. selecting which request objects to operate 
> on since you may have to make your decisions before Apache has read the 
> entire request header.

Could you perhaps please expand a bit on this? 

> The simple answer is that if you return data in the right format to 
> HTTP_IN, then there won't be a problem :)  But keep track of Apache 2 
> development in that area so that if the interface has to change between 
> those filters you can accommodate.  I'm doubtful that the exact division 
> of labor between the different Apache-provided input filters can be 
> considered a stable programming interface.

Point taken.



Re: Input filter and setting HTTP headers in Apache 2.0

2003-09-19 Thread Jeff Trawick
[EMAIL PROTECTED] wrote:

> The only way I can see where I will have the ability
> to insert those HTTP inbound request headers is if
> my filter runs between the CORE_IN and the HTTP_IN
> input filter,

I'm afraid this is true...  Does anyone else have a better idea?

> in which case I will need a properly
> populated request_rec* structure to be able to use
> the ap_XXX APIs (typical things would be get the
> mime-type, content-length of the POST data, protocol
> version, etc).

No, just modify the protocol data...  If you want to insert a specific 
header field, just construct the text form of it as if the browser sent 
it and return it to the filter on top at the right point in the data 
stream.  You're going to be keeping up with data you've read anyway so 
you can return it at the right time to the calling filter.  Be careful 
you have configuration to avoid buffering the entire .iso file that 
somebody tries to copy to their DAV filesystem.  And your configuration 
may be roll-your-own w.r.t. selecting which request objects to operate 
on since you may have to make your decisions before Apache has read the 
entire request header.
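
A rough sketch of "construct the text form of it" on a raw header block, assuming the filter has already collected the request line and headers into one contiguous buffer. It is plain C, not the httpd bucket API; inject_header is a hypothetical helper, and the header name in the usage note is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Insert "name: value\r\n" just before the blank line that ends the
 * header block in raw, writing a NUL-terminated result to out.
 * Returns the new length, or -1 if the header block is incomplete or
 * out is too small. Plain-C model; no httpd API is used. */
static int inject_header(const char *raw, size_t raw_len,
                         const char *name, const char *value,
                         char *out, size_t out_cap)
{
    const char *end = NULL;
    size_t i, head_len, line_len, total;

    assert(raw != NULL && out != NULL);

    /* Find the CRLF CRLF that terminates the headers. */
    for (i = 0; i + 4 <= raw_len; i++) {
        if (memcmp(raw + i, "\r\n\r\n", 4) == 0) {
            end = raw + i + 2;   /* start of the empty line */
            break;
        }
    }
    if (end == NULL)
        return -1;               /* headers not fully read yet */

    head_len = (size_t)(end - raw);
    line_len = strlen(name) + 2 + strlen(value) + 2;  /* "name: value\r\n" */
    total = raw_len + line_len;
    if (total + 1 > out_cap)
        return -1;

    memcpy(out, raw, head_len);                        /* request line + headers */
    snprintf(out + head_len, line_len + 1, "%s: %s\r\n", name, value);
    memcpy(out + head_len + line_len, end, raw_len - head_len); /* blank line + rest */
    out[total] = '\0';
    return (int)total;
}
```

Called with name "X-Verified" and value "yes" (both invented here), the new field lands after the last client header and before the empty line, so whatever parses the block next sees it as if the client had sent it.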

> Since my filter will run before the
> HTTP_IN filter would have had a chance to parse the
> request line and the request headers, it will most
> likely cause confusion within the Apache internal
> state.

The simple answer is that if you return data in the right format to 
HTTP_IN, then there won't be a problem :)  But keep track of Apache 2 
development in that area so that if the interface has to change between 
those filters you can accommodate.  I'm doubtful that the exact division 
of labor between the different Apache-provided input filters can be 
considered a stable programming interface.



Re: Input filter and setting HTTP headers in Apache 2.0

2003-09-18 Thread albertochan
Hi,
> 
> Does your module need to read and examine the request body (POST data) 
> prior to inserting the HTTP request headers, 

Yes, the module needs the ability to look at the request body, and based on its 
content, add the necessary HTTP request headers if required for the web application to 
use. 

> or are these two separate 
> tasks that can be implemented independently in different Apache 
> processing phases?

The insertion of the headers is based on the request body content, so I can't really 
separate them.

> 
> Sometimes, changes to the HTTP request headers need to be made prior to 
> reaching the handler hook anyway.
> 






Re: Input filter and setting HTTP headers in Apache 2.0

2003-09-18 Thread Jeff Trawick
[EMAIL PROTECTED] wrote:

> I am writing a module that needs to be able to examine
> POST data, and insert inbound HTTP request headers
> after my code has been run by apache so that:

Does your module need to read and examine the request body (POST data) 
prior to inserting the HTTP request headers, or are these two separate 
tasks that can be implemented independently in different Apache 
processing phases?

Sometimes, changes to the HTTP request headers need to be made prior to 
reaching the handler hook anyway.




Input filter and setting HTTP headers in Apache 2.0

2003-09-17 Thread albertochan
Hi,

I am writing a module that needs to be able to examine
POST data, and insert inbound HTTP request headers
after my code has been run by apache so that:

1.  the web application can still retrieve the POST
data if required for its own consumption 
2.  the web application needs to have access to the
HTTP inbound request headers generated at runtime
(mod_header can't help me) by my code.

A web application in this case could be one of the
following (not exhaustive): 
*   a native apache 2.0 module 
*   mod_jk with Tomcat on the back end actually
running the application 
*   mod_cgi/mod_cgid 
*   mod_proxy (the back-end web server will act as the
web application)

Attempted Solution

Use an input filter. This provides the mechanism to
examine the POST data at my will without consuming
it off the socket, thus giving the web application
still the opportunity to consume it.

The input filter chain happens during the handler
phase, and is triggered if either of the following
is true:

1.  there is an explicit call by the web application
to read the POST data (either calling the
ap_XXX_client_block() series API or
ap_get_brigade()) 
2.  in the case where the web application doesn't
care about the POST data (but still requires the
inbound HTTP request headers that my code creates),
it is the ap_finalize_request_protocol() that
triggers the input filter chain. This is most likely
a special case, since if the web application doesn't
care about the POST data, a GET would be more
appropriate.

Problem

Let's say that the web application, before executing
its own logic, needs to check for a HTTP request
header that I generate to decide if it should
proceed or not. So unless the web application tries
to consume the POST data, my code (in the input
filter) will never get the chance to run to insert
those inbound HTTP request headers that the
application requires.

Question

Is there a workaround that avoids the above problem
and makes it possible:

1.  for my code to be able to examine the POST data
and still allow the web application to consume it,
and 
2.  for my code to be able to insert those inbound
HTTP request headers so that the web application can
see them at run time, without putting the constraint
on it to have to retrieve the POST data first to
have access to those inbound HTTP request headers?
An example would be a cgi application that relies on
mod_cgi. mod_cgi first reads the HTTP request
headers before reading the POST data
(add_common_vars() is called way before
ap_get_brigade() in cgi_handler()). So that means
the cgi application will never get the chance to see
the headers that my code inserted. Once
ap_get_brigade() returns, the headers will be set,
but add_common_vars() has already been executed.

Since I need to be able to peek at the POST data
without consuming it, the input filter is the only
known solution (right?). Is there a known workaround
for this?

The only way I can see where I will have the ability
to insert those HTTP inbound request headers is if
my filter runs between the CORE_IN and the HTTP_IN
input filter, in which case I will need a properly
populated request_rec* structure to be able to use
the ap_XXX APIs (typical things would be get the
mime-type, content-length of the POST data, protocol
version, etc). Since my filter will run before the
HTTP_IN filter would have had a chance to parse the
request line and the request headers, it will most
likely cause confusion within the Apache internal
state.

Any other ideas?


Thanks, -Alberto



How? input filter which parses headers

2003-07-24 Thread Maurizio Codogno
Hi all,
I am trying to create an input filter which looks at the header lines
of a request (from the Request line itself!), and modifies them if
necessary.
Unfortunately, the examples I found on the web seem to use an old
version of the API, and thus they do not compile. Moreover, what
I found in the source tree is of type AP_FTYPE_RESOURCE, and I cannot
manage to convert them to type AP_FTYPE_CONNECTION which I believe
is what I need.
Could somebody point me to a working example?

TIA, .mau.



Re: headers for an input filter in AP_FTYPE_CONNECTION mode ?

2003-06-24 Thread Jeff Trawick
Philipp Reisner wrote:
> Now since I can do input (request) filtering with mod_ext_filter,
> I tried to set ftype = AP_FTYPE_CONNECTION and expected to get
> the headers. If you set ftype to AP_FTYPE_CONNECTION on an output
> filter, it will get the complete response including headers.

On output, the headers are already known by the time we get to a 
connection filter.  On input, the connection filter sees the request 
data from the client before any parsing has been performed.

So my point of respecting the astute user's desire to make an input 
filter something other than AP_FTYPE_RESOURCE probably has some required 
checks, like forgoing worrying about content type if there is no 
r/r->headers_in.

btw, itching to get to your latest patch...

> I have to add that I deployed my input filter on a name based 
> vhost. -- Does anyone know if it works with an IP based
> vhost ?

I don't know why it would matter.



headers for an input filter in AP_FTYPE_CONNECTION mode ?

2003-06-24 Thread Philipp Reisner
Hi,

Now since I can do input (request) filtering with mod_ext_filter,
I tried to set ftype = AP_FTYPE_CONNECTION and expected to get
the headers. If you set ftype to AP_FTYPE_CONNECTION on an output
filter, it will get the complete response including headers.

I have to add that I deployed my input filter on a name based 
vhost. -- Does anyone know if it works with an IP based
vhost ?

TIA
-Philipp
-- 
: Dipl-Ing Philipp Reisner  Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH  Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria    http://www.linbit.com :



input filter

2002-11-25 Thread Estrade Matthieu
hi,

Is there a way to place an input filter before mod_proxy makes the 
connection to the backend?
I did a register_input_filter and it's called when mod_proxy is proxying 
the data from the backend server (ap_get_brigade in proxy_http.c).

My aim is to modify the data before the proxy sends it to the backend.


regards,

Estrade Matthieu







Re: SSL Input Filter bogosity

2002-10-31 Thread Justin Erenkrantz
--On Thursday, October 31, 2002 12:19 AM -0600 "William A. Rowe, Jr." 
<[EMAIL PROTECTED]> wrote:

> OpenSSL 0.9.6g does so.  Why shouldn't we?


Because OpenSSL is a library, we're not.


> However, if we don't have inl worth of bytes, and they are sitting
> ready (on the socket) shouldn't we fetch them?  Forget the GETLINE
> bogosity, it means nothing to SSL.


No.  I think you are missing the point of having this code understand 
AP_MODE_GETLINE.  When we try to read a line, we don't have a way of 
really knowing how much to read.  So, we try to read a maximum of 8k 
(whatever AP_IOBUFSIZE is).  That merely defines what our maximum 
line length is (actually that is the maximum that mod_ssl will ever 
return on a read).  The generic read code doesn't have logic for 
determining when to stop.  That code is only contained in the logic 
that understands AP_MODE_GETLINE (does the memchr call).

Yet, in all probability, the chances are that the line is going to be 
very short (consider HTTP headers).  Therefore, as a critical 
optimization, we don't necessarily want to read the full 8k, but we 
want to try to see if we have enough with what we already have.  So, 
when we read anything with AP_MODE_GETLINE, we should exit out of our 
generic read call and then check to see if we've satisfied the 
getline requirements.

Removing the short-circuits is going to make all AP_MODE_GETLINE 
calls block for 8k of data every time.  That's going to be 
unacceptable when we're reading the headers (which is a blocking 
read) as we may end up reading too much data and we will end up 
reading past the headers (which commonly take less than 8k of data). 
When reading from the SSL socket generically, AP_MODE_GETLINE calls 
can only safely handle the output of one socket read, and I believe, 
if there was any data left over from the last read, we should attempt 
to check if that was a line as well (again, chances are that it is 
when called with AP_MODE_GETLINE).  Once we have the output of that 
socket read, the getline-aware logic can then determine if it is 
enough (it saw the LF), or to read some more (given the AP_IOBUFSIZE 
constraint).  Doing blind blocking as your patch does can't work.

Therefore, I believe these AP_MODE_GETLINE optimizations must stay in.
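
The short-circuit described above amounts to checking what is already buffered for an LF before going back to the socket. A minimal plain-C sketch of that check, with buffered_line_len as a hypothetical helper name (this is not the mod_ssl code itself):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the length of the first line in buf, LF included, or 0 if no
 * complete line is buffered yet and one more read is needed. */
static size_t buffered_line_len(const char *buf, size_t len)
{
    const char *lf;

    assert(buf != NULL);
    lf = memchr(buf, '\n', len);
    return lf ? (size_t)(lf - buf) + 1 : 0;
}
```

A getline-aware reader would call this after each single read: a non-zero result means the line can be returned at once, and zero means read once more, up to the maximum line length.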

As a hunch, I would believe it would be better to return partial data 
up the filter chain sooner rather than waiting for the entirety even 
for AP_MODE_READBYTES calls as well.  There is no assumption that a 
filter *must* return all of the bytes it was requested.  On a 
blocking read, the only guarantee is that it should return at least 
one byte.  If we're really waiting around for the network, I think 
it'd be best to compute what we already have and then once we're 
ready, try to read from the network again.  Of course, I have no 
numbers to prove this thought, but I wouldn't be shocked if it is.
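
If a blocking read only promises at least one byte, a caller that needs a fixed amount has to loop. A small plain-C model of that contract follows; read_fn, read_up_to and the trickle test double are hypothetical names, not httpd API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reader: copies at most max bytes into dst and returns how
 * many it copied; 0 means end of input. Like a blocking filter read, it
 * may return fewer bytes than requested. */
typedef size_t (*read_fn)(void *src, char *dst, size_t max);

/* Keep calling the reader until want bytes have arrived or input ends.
 * Returns the number of bytes actually collected. */
static size_t read_up_to(read_fn rd, void *src, char *dst, size_t want)
{
    size_t got = 0;

    assert(rd != NULL && dst != NULL);
    while (got < want) {
        size_t n = rd(src, dst + got, want - got);
        if (n == 0)
            break;              /* end of input */
        got += n;
    }
    return got;
}

/* Test double: hands out its data at most 3 bytes per call. */
typedef struct { const char *p; size_t left; } trickle;

static size_t trickle_read(void *src, char *dst, size_t max)
{
    trickle *t = src;
    size_t n = t->left < 3 ? t->left : 3;

    if (n > max)
        n = max;
    memcpy(dst, t->p, n);
    t->p += n;
    t->left -= n;
    return n;
}
```

The reader is free to return whatever it has as soon as it has it, and the caller decides whether a partial result is already enough to work on.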

>> We should do a conditional on the APR_BRIGADE_EMPTY() check if
>> inbio->block is non-blocking.  It's considered a design error if a
>> filter returns an empty brigade on a blocking call.
>
> Who said we are blocking?  This could be a SPECULATIVE call
> with a NONBLOCKING request, no?


Huh?  We have the socket mode as we are passing that parameter to 
ap_get_brigade.  So, of course, we know if we are blocking.  Again, 
the common case is that we are blocking.

>> Should the APR_BUCKET_IS_EOS rather be APR_BUCKET_IS_METADATA?
>> Not sure here.  Perhaps.
>
> Hmmm.  I was thinking about the METADATA case.
>
> Do you suppose we should percolate METADATA buckets back out
> to the filter_read of SSL?  I suppose metadata should just go
> through unharmed.


I think so.  We really don't have a case where METADATA originates 
from the core, but it's possible.

> However, we have to react to EOS, since that's the end of the input
> available to the SSL pump.


Yes, as a special case of METADATA.
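
A toy model of that rule: metadata passes through untouched, and EOS is the one kind of metadata that also ends the read. The bucket kinds below are plain-C stand-ins with hypothetical names, not the APR bucket types, and pump() only counts what it would do with each bucket.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C stand-ins for bucket kinds; hypothetical names, not APR types. */
typedef enum { B_DATA, B_META, B_EOS } bucket_kind;

/* EOS is treated as one particular kind of metadata. */
static int is_metadata(bucket_kind k)
{
    return k == B_META || k == B_EOS;
}

/* Walk the input. Data buckets are counted as "fed to the decoder",
 * metadata buckets as "passed through untouched", and the walk stops at
 * EOS because nothing further can arrive.
 * Returns the number of buckets consumed, EOS included. */
static size_t pump(const bucket_kind *in, size_t n,
                   size_t *decoded, size_t *passed)
{
    size_t i;

    assert(in != NULL && decoded != NULL && passed != NULL);
    *decoded = *passed = 0;
    for (i = 0; i < n; i++) {
        if (is_metadata(in[i])) {
            (*passed)++;
            if (in[i] == B_EOS)
                return i + 1;
        } else {
            (*decoded)++;
        }
    }
    return n;
}
```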


> Getline means nothing in the context of fetching bytes off of
> an SSL socket.  We must shove the raw bytes into the SSL
> pump in order to return any sort of data (SPECULATIVE, READBYTES,
> GETLINE, etc) from the SSL pump.  The raw bytes just need
> to be fetched.
>
> Of course, it's generally nonblocking, so if we don't get a hit
> from the socket, the new code just returns whatever we got.


No, it's not generally non-blocking (see ap_rgetline_core for the 
initial call to ap_get_brigade - it is blocking).  So, attempting to 
read the full 8k when we don't have that available will not work. 
Again, the AP_MODE_GETLINE is important even within the generic 
reading of the code - see above.

> No, there really isn't.  It was impossible to look at inbio.rv when
> we needed to, since ctx wasn't passed, but the SSL ctx.


Then, change the function parameters.  I'm just concerned that we're 
going to be substantially adding to another function.  mod_ssl has a 
lot of places where functions go on and on and on and on and on. 
Breaking it into little pieces isn't that bad of an idea - this patch 
is making a function considerably longer and

Re: SSL Input Filter bogosity

2002-10-30 Thread Aaron Bannert
On Wed, Oct 30, 2002 at 09:04:02PM -0800, MATHIHALLI,MADHUSUDAN (HP-Cupertino,ex1) 
wrote:
> Well, I'm of the view that one should avoid declaring variables within a
> loop whenever possible. (you know - individual preferences etc.).

FWIW, I am too. :)

-aaron


