Re: Content-Type / AddOutputFilterByType DEFLATE text/html

2017-08-08 Thread William A Rowe Jr
This current behavior still seems wrong in httpd. A content (as opposed
perhaps to transfer) should not vary, in fact cannot vary if an etag is
presented.

I suspect that the deflate filter looks to see if there is a benefit to
compression, and cannot do so until it has a body. If it is going to do
that, I suspect it must use a transfer encoding. Otherwise the deflation of
content shouldn't vary, and the header should be toggled even when there is
no body.


On Aug 7, 2017 04:43, "Reindl Harald"  wrote:

> OK, the reason are the HEAD-Requests triggered by curl_setopt($curl,
> CURLOPT_NOBODY, 1); so the bug is ignoring that in case of
> "/static.htm?count=250_150209658" and sending in fact a body back while
> for the 3 other test urls the response is HEAD as requested and the curl
> code is identical
>
>  $curl = curl_init();
>  curl_setopt($curl, CURLOPT_NOBODY, 1);
>  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
>  curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5);
>  curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Fedora; Linux
> x86_64; rv:55.0) Gecko/20100101 Firefox/55.0');
>  curl_setopt($curl, CURLOPT_HEADER, 1);
>  curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
>  curl_setopt($curl, CURLOPT_URL, $url);
>  curl_setopt($curl, CURLOPT_HTTPHEADER, $curl_header);
>
> Am 07.08.2017 um 11:12 schrieb Reindl Harald:
>
>> Hi
>>
>> AddOutputFilterByType DEFLATE text/html
>>
>> is this a bug or somehow expected behavior that in case the
>> "Content-Type" header also contains a charset mod_defalte don't work as
>> expected which means in case of curl requests only static files are gzip
>> compressed while PHP responses are missing "Content-Encoding: gzip" - that
>> this don't happen in case of Firefox as client makes it even more strange
>>
>> identical result for "Content-Type: text/html; charset=UTF-8" in case
>> "default_charset" is not set in php.ini
>>
>> the last line of each block is the PHP array for curl_setopt($curl,
>> CURLOPT_HTTPHEADER, $curl_header);
>>
>> NO GZIP
>> http://corecms/index.php?count=250_1502096587
>> HTTP/1.1 200 OK
>> Date: Mon, 07 Aug 2017 09:03:08 GMT
>> X-DNS-Prefetch-Control: off
>> X-Content-Type-Options: nosniff
>> X-Response-Time: D=1744 us
>> Expires: Mon, 26 Jul 1997 05:00:00 GMT
>> Cache-Control: no-cache
>> ETag: 7d820de3763d0e6c22ccbfe846ab1c31
>> Vary: User-Agent
>> Content-Type: text/html; charset=ISO-8859-1
>> a:2:{i:0;s:58:"Cookie: LOUNGE_ID=pivked76ocg1n9750n91
>> d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: gzip, deflate";}
>>
>> NO GZIP
>> http://corecms/
>> HTTP/1.1 200 OK
>> Date: Mon, 07 Aug 2017 09:03:08 GMT
>> X-DNS-Prefetch-Control: off
>> X-Content-Type-Options: nosniff
>> X-Response-Time: D=400 us
>> Cache-Control: max-age=120
>> Expires: Mon, 07 Aug 2017 09:05:08 GMT
>> Vary: User-Agent
>> Content-Type: text/html; charset=ISO-8859-1
>> a:2:{i:0;s:58:"Cookie: LOUNGE_ID=pivked76ocg1n9750n91
>> d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: gzip, deflate";}
>>
>> GZIP OK
>> http://corecms/static.htm?count=250_1502096587
>> HTTP/1.1 200 OK
>> Date: Mon, 07 Aug 2017 09:03:08 GMT
>> X-DNS-Prefetch-Control: off
>> X-Content-Type-Options: nosniff
>> X-Response-Time: D=297 us
>> Last-Modified: Sun, 06 Aug 2017 17:49:54 GMT
>> ETag: "ec1-556195b938c8f-gzip"
>> Accept-Ranges: bytes
>> Cache-Control: max-age=120
>> Expires: Mon, 07 Aug 2017 09:05:08 GMT
>> Content-Encoding: gzip
>> Vary: Accept-Encoding
>> Content-Length: 890
>> Content-Type: text/html
>> a:2:{i:0;s:58:"Cookie: LOUNGE_ID=pivked76ocg1n9750n91
>> d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: gzip, deflate";}
>>
>> NO GZIP
>> http://corecms/static.php?count=250_1502096587
>> HTTP/1.1 200 OK
>> Date: Mon, 07 Aug 2017 09:03:08 GMT
>> X-DNS-Prefetch-Control: off
>> X-Content-Type-Options: nosniff
>> X-Response-Time: D=280 us
>> Cache-Control: max-age=120
>> Expires: Mon, 07 Aug 2017 09:05:08 GMT
>> Vary: User-Agent
>> Content-Type: text/html; charset=ISO-8859-1
>> a:2:{i:0;s:58:"Cookie: LOUNGE_ID=pivked76ocg1n9750n91
>> d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: gzip, deflate";}
>>
>


Re: Content-Type / AddOutputFilterByType DEFLATE text/html

2017-08-07 Thread Reindl Harald
OK, the reason are the HEAD-Requests triggered by curl_setopt($curl, 
CURLOPT_NOBODY, 1); so the bug is ignoring that in case of 
"/static.htm?count=250_150209658" and sending in fact a body back while 
for the 3 other test urls the response is HEAD as requested and the curl 
code is identical


 $curl = curl_init();
 curl_setopt($curl, CURLOPT_NOBODY, 1);
 curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
 curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5);
 curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Fedora; Linux 
x86_64; rv:55.0) Gecko/20100101 Firefox/55.0');

 curl_setopt($curl, CURLOPT_HEADER, 1);
 curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
 curl_setopt($curl, CURLOPT_URL, $url);
 curl_setopt($curl, CURLOPT_HTTPHEADER, $curl_header);

Am 07.08.2017 um 11:12 schrieb Reindl Harald:

Hi

AddOutputFilterByType DEFLATE text/html

is this a bug or somehow expected behavior that in case the 
"Content-Type" header also contains a charset mod_defalte don't work as 
expected which means in case of curl requests only static files are gzip 
compressed while PHP responses are missing "Content-Encoding: gzip" - 
that this don't happen in case of Firefox as client makes it even more 
strange


identical result for "Content-Type: text/html; charset=UTF-8" in case 
"default_charset" is not set in php.ini


the last line of each block is the PHP array for curl_setopt($curl, 
CURLOPT_HTTPHEADER, $curl_header);


NO GZIP
http://corecms/index.php?count=250_1502096587
HTTP/1.1 200 OK
Date: Mon, 07 Aug 2017 09:03:08 GMT
X-DNS-Prefetch-Control: off
X-Content-Type-Options: nosniff
X-Response-Time: D=1744 us
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Cache-Control: no-cache
ETag: 7d820de3763d0e6c22ccbfe846ab1c31
Vary: User-Agent
Content-Type: text/html; charset=ISO-8859-1
a:2:{i:0;s:58:"Cookie: 
LOUNGE_ID=pivked76ocg1n9750n91d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: 
gzip, deflate";}


NO GZIP
http://corecms/
HTTP/1.1 200 OK
Date: Mon, 07 Aug 2017 09:03:08 GMT
X-DNS-Prefetch-Control: off
X-Content-Type-Options: nosniff
X-Response-Time: D=400 us
Cache-Control: max-age=120
Expires: Mon, 07 Aug 2017 09:05:08 GMT
Vary: User-Agent
Content-Type: text/html; charset=ISO-8859-1
a:2:{i:0;s:58:"Cookie: 
LOUNGE_ID=pivked76ocg1n9750n91d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: 
gzip, deflate";}


GZIP OK
http://corecms/static.htm?count=250_1502096587
HTTP/1.1 200 OK
Date: Mon, 07 Aug 2017 09:03:08 GMT
X-DNS-Prefetch-Control: off
X-Content-Type-Options: nosniff
X-Response-Time: D=297 us
Last-Modified: Sun, 06 Aug 2017 17:49:54 GMT
ETag: "ec1-556195b938c8f-gzip"
Accept-Ranges: bytes
Cache-Control: max-age=120
Expires: Mon, 07 Aug 2017 09:05:08 GMT
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 890
Content-Type: text/html
a:2:{i:0;s:58:"Cookie: 
LOUNGE_ID=pivked76ocg1n9750n91d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: 
gzip, deflate";}


NO GZIP
http://corecms/static.php?count=250_1502096587
HTTP/1.1 200 OK
Date: Mon, 07 Aug 2017 09:03:08 GMT
X-DNS-Prefetch-Control: off
X-Content-Type-Options: nosniff
X-Response-Time: D=280 us
Cache-Control: max-age=120
Expires: Mon, 07 Aug 2017 09:05:08 GMT
Vary: User-Agent
Content-Type: text/html; charset=ISO-8859-1
a:2:{i:0;s:58:"Cookie: 
LOUNGE_ID=pivked76ocg1n9750n91d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: 
gzip, deflate";}


Content-Type / AddOutputFilterByType DEFLATE text/html

2017-08-07 Thread Reindl Harald

Hi

AddOutputFilterByType DEFLATE text/html

is this a bug or somehow expected behavior that in case the 
"Content-Type" header also contains a charset mod_defalte don't work as 
expected which means in case of curl requests only static files are gzip 
compressed while PHP responses are missing "Content-Encoding: gzip" - 
that this don't happen in case of Firefox as client makes it even more 
strange


identical result for "Content-Type: text/html; charset=UTF-8" in case 
"default_charset" is not set in php.ini


the last line of each block is the PHP array for curl_setopt($curl, 
CURLOPT_HTTPHEADER, $curl_header);


NO GZIP
http://corecms/index.php?count=250_1502096587
HTTP/1.1 200 OK
Date: Mon, 07 Aug 2017 09:03:08 GMT
X-DNS-Prefetch-Control: off
X-Content-Type-Options: nosniff
X-Response-Time: D=1744 us
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Cache-Control: no-cache
ETag: 7d820de3763d0e6c22ccbfe846ab1c31
Vary: User-Agent
Content-Type: text/html; charset=ISO-8859-1
a:2:{i:0;s:58:"Cookie: 
LOUNGE_ID=pivked76ocg1n9750n91d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: 
gzip, deflate";}


NO GZIP
http://corecms/
HTTP/1.1 200 OK
Date: Mon, 07 Aug 2017 09:03:08 GMT
X-DNS-Prefetch-Control: off
X-Content-Type-Options: nosniff
X-Response-Time: D=400 us
Cache-Control: max-age=120
Expires: Mon, 07 Aug 2017 09:05:08 GMT
Vary: User-Agent
Content-Type: text/html; charset=ISO-8859-1
a:2:{i:0;s:58:"Cookie: 
LOUNGE_ID=pivked76ocg1n9750n91d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: 
gzip, deflate";}


GZIP OK
http://corecms/static.htm?count=250_1502096587
HTTP/1.1 200 OK
Date: Mon, 07 Aug 2017 09:03:08 GMT
X-DNS-Prefetch-Control: off
X-Content-Type-Options: nosniff
X-Response-Time: D=297 us
Last-Modified: Sun, 06 Aug 2017 17:49:54 GMT
ETag: "ec1-556195b938c8f-gzip"
Accept-Ranges: bytes
Cache-Control: max-age=120
Expires: Mon, 07 Aug 2017 09:05:08 GMT
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 890
Content-Type: text/html
a:2:{i:0;s:58:"Cookie: 
LOUNGE_ID=pivked76ocg1n9750n91d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: 
gzip, deflate";}


NO GZIP
http://corecms/static.php?count=250_1502096587
HTTP/1.1 200 OK
Date: Mon, 07 Aug 2017 09:03:08 GMT
X-DNS-Prefetch-Control: off
X-Content-Type-Options: nosniff
X-Response-Time: D=280 us
Cache-Control: max-age=120
Expires: Mon, 07 Aug 2017 09:05:08 GMT
Vary: User-Agent
Content-Type: text/html; charset=ISO-8859-1
a:2:{i:0;s:58:"Cookie: 
LOUNGE_ID=pivked76ocg1n9750n91d3dtnqqpqhpmqci63pvo";i:1;s:30:"Accept-Encoding: 
gzip, deflate";}


Re: AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE

2016-01-20 Thread Micha Lenk

Hi Nick,

if the patch looks good, as you wrote, what is needed to get it applied 
to trunk and backported to 2.4.x?


Have you seen my follow-up questions in the other mail?

Best regards,
Micha


Am 13.01.2016 22:44, schrieb Nick Kew:

On Wed, 2016-01-13 at 17:59 +0100, Micha Lenk wrote:

Hi,

The directive AddOutputFilterByType can be used to insert filters to 
the
output filter chain depending on the content type of the HTTP 
response.

So far so good.

PROBLEM DESCRIPTION


This is probably worth a bugzilla entry.

I think I can clarify a little of what's happened.
AddOutputFilterByType was something of a hacked afterthought
to filtering back in the days of httpd 2.0.  On the one hand,
it met a need.  On the other hand, it worked only in a very
limited range of situations where the content type was known
at the time the filter was to be added.  It had no capacity
to respond to other aspects of the content, or indeed the
request/response.  And there were other issues.

Then came mod_filter and a generalised framework.
AddOutputFilterByType was now obsolete, but too widely-used
to dispense with entirely.  As a compromise it was re-implemented
within mod_filter, where it could co-exist with other dynamic
filter configuration.

Your observation tells us the semantics aren't quite compatible.
And your patch looks good - thanks.




Re: AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE

2016-01-14 Thread Micha Lenk

Hi Nick,

Am 13.01.2016 22:44, schrieb Nick Kew:

PROBLEM DESCRIPTION


This is probably worth a bugzilla entry.


Done. https://bz.apache.org/bugzilla/show_bug.cgi?id=58856

Nick, would you mind to provide some insights on these comments from my 
initial mail:

For setups with both, FilterDeclare and AddOutputFilterByType (as
described above as fix), I observed some issues with properly merging
the two filter harnesses. However, I have no clue what semantics the
original author wanted to have in this situation.


Assumed that my patch gets applied, the filter type should be correctly 
set in the filter harness. But what if a user wants to override it? I 
got a few questions in this context:


1.) Should "FilterDeclare" with filter-name "BYTYPE:DEFLATE" (i.e. 
colliding

with the implicit filter-name created by AddOutputFilterByName) be
supported at all?

1a.) If yes, the handling of AddOutputFilterByType needs to be fixed so 
that:

 - a globally configured FilterDeclare is also effective for an
   AddOutputFilterByType within a (location) sub-container.
 - a filter type set by "FilterDeclare BYTYPE: type>"

   does not get overwritten by a subsequent
   "AddOutputFilterByType ".

1b.) If no, the code should detect and reject such configurations.

2.) On a related note to 1., Should FilterDeclare allow a filter-name of
existing filter providers at all? If yes, behavior would we expect?

Nick, I would be really glad if you could share your thoughts.

Best regards,
Micha



Re: AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE

2016-01-13 Thread Nick Kew
On Wed, 2016-01-13 at 17:59 +0100, Micha Lenk wrote:
> Hi,
> 
> The directive AddOutputFilterByType can be used to insert filters to the 
> output filter chain depending on the content type of the HTTP response. 
> So far so good.
> 
> PROBLEM DESCRIPTION

This is probably worth a bugzilla entry.

I think I can clarify a little of what's happened.
AddOutputFilterByType was something of a hacked afterthought
to filtering back in the days of httpd 2.0.  On the one hand,
it met a need.  On the other hand, it worked only in a very
limited range of situations where the content type was known
at the time the filter was to be added.  It had no capacity
to respond to other aspects of the content, or indeed the
request/response.  And there were other issues.

Then came mod_filter and a generalised framework.
AddOutputFilterByType was now obsolete, but too widely-used
to dispense with entirely.  As a compromise it was re-implemented
within mod_filter, where it could co-exist with other dynamic
filter configuration.

Your observation tells us the semantics aren't quite compatible.
And your patch looks good - thanks.

-- 
Nick Kew



AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE

2016-01-13 Thread Micha Lenk

Hi,

The directive AddOutputFilterByType can be used to insert filters to the 
output filter chain depending on the content type of the HTTP response. 
So far so good.


PROBLEM DESCRIPTION

I observed that the behavior of this directive changed in Apache 2.4 for 
filters. Starting with Apache 2.4 filters are inserted at an earlier 
place in the filter chain than they were inserted with Apache 2.2. For 
example, if you use the directive


AddOutputFilterByType DEFLATE text/html

The filter is inserted with AP_FTYPE_RESOURCE, even though it was 
registered in mod_deflate.c as AP_FTYPE_SET_CONTENT.
The effect is that using "AddOutputFilterByType DEFLATE text/html" the 
resulting filter chain is ordered diferrently compared to using 
"SetOutputFilter DEFLATE".


This can be fixed in configuration by adding the following directive 
right after AddOutputFilterByType:


FilterDeclare BYTYPE:DEFLATE CONTENT_SET

Unfortunately the order and placement of FilterDeclare seems to be 
relevant, i.e. this fix only works if FilterDeclare comes *after* 
AddOutputFilterByType within the same container.


I wonder whether this is an intentional behavior change of 
AddOutputFilterByType or not.


ANALYSIS

Apparently the filter type (ap_filter_rec_t struct member ftype) of the 
filter provider isn't respected anymore.


The intended filter type is provided when registering the output filter 
by calling ap_register_output_filter(). In branch 2.2.x this was 
sufficient, because the conditional filter (based on the MIME type) was 
added directly by name (i.e. by calling ap_add_output_filter() in 
ap_add_output_filters_by_type). In branches 2.4.x and trunk the 
implementation of AddOutputFilterByType apparently moved to mod_filter 
and a layer of indirection (the filter harness) is introduced. But the 
filter harness is apparently created unconditionally with 
AP_FTYPE_RESOURCE.


FIX APPROACH

When implicitly creating a filter harness by calling the function 
add_filter(), we have access to the provider's ap_filter_rec_t anyways. 
So I recommend to just copy it from the filter provider (e.g. DEFLATE) 
to the filter harness (e.g. BYTYPE:DEFLATE).


I've tested this approach with the patch below for a setup without any 
FilterDeclare directive, and it fixed the regression; the DEFLATE filter 
was ordered correctly at AP_FTYPE_CONTENT_SET again.


- 8>< 
--

diff --git a/modules/filters/mod_filter.c b/modules/filters/mod_filter.c
index 7b69223..5b5ecf6 100644
--- a/modules/filters/mod_filter.c
+++ b/modules/filters/mod_filter.c
@@ -444,6 +444,12 @@ static const char *add_filter(cmd_parms *cmd, void 
*CFG,

 ap_expr_info_t *node;
 const char *err = NULL;

+/* if provider has been registered, we can look it up */
+provider_frec = ap_get_output_filter_handle(pname);
+if (!provider_frec) {
+return apr_psprintf(cmd->pool, "Unknown filter provider %s", 
pname);

+}
+
 /* fname has been declared with DeclareFilter, so we can look it up 
*/

 frec = apr_hash_get(cfg->live_filters, fname, APR_HASH_KEY_STRING);

@@ -454,17 +460,13 @@ static const char *add_filter(cmd_parms *cmd, void 
*CFG,

 return c;
 }
 frec = apr_hash_get(cfg->live_filters, fname, 
APR_HASH_KEY_STRING);

+frec->ftype = provider_frec->ftype;
 }

 if (!frec) {
 return apr_psprintf(cmd->pool, "Undeclared smart filter %s", 
fname);

 }

-/* if provider has been registered, we can look it up */
-provider_frec = ap_get_output_filter_handle(pname);
-if (!provider_frec) {
-return apr_psprintf(cmd->pool, "Unknown filter provider %s", 
pname);

-}
 provider = apr_palloc(cmd->pool, sizeof(ap_filter_provider_t));
 if (expr) {
 node = ap_expr_parse_cmd(cmd, expr, 0, &err, NULL);
- 8>< 
--


For setups with both, FilterDeclare and AddOutputFilterByType (as 
described above as fix), I observed some issues with properly merging 
the two filter harnesses. However, I have no clue what semantics the 
original author wanted to have in this situation.


I hope my explanations are clear enough for others to follows. If not, 
please don't hesitate to ask.


Best regards,
Micha


Re: AddOutputFilterByType vs. proxy in 2.0.x

2007-10-09 Thread William A. Rowe, Jr.
Eric Covener wrote:
> 
> I'd like to backport to 2.0.x but it seems like there was a little bit
> of caution in the 2.2.x/trunk commits.

Propose it in STATUS and let's see where that goes.


AddOutputFilterByType vs. proxy in 2.0.x

2007-10-09 Thread Eric Covener
Does anyone recall if there was any fallout from the change to allow
ap_set_output_filter_by_type() to operate on proxy requests?

http://issues.apache.org/bugzilla/show_bug.cgi?id=31226
http://svn.apache.org/viewcvs.cgi/httpd/httpd/branches/2.2.x/server/core.c?rev=327793&r1=307031&r2=327793

I'd like to backport to 2.0.x but it seems like there was a little bit
of caution in the 2.2.x/trunk commits.

-- 
Eric Covener
[EMAIL PROTECTED]


Re: Removing AddOutputFilterByType

2005-06-21 Thread Nick Kew
On Tuesday 21 June 2005 06:10, Astrid Keßler wrote:
> at Montag, 20. Juni 2005, Nick Kew wrote:
> > Hmmm.  Is it better to have a UI that's openly a little more complex but
> > works as documented, or one that appears simple but has lots of
> > gotchas lurking in ambush?  I guess that's an argument for mod_filter
> > implementing AddOutputFilterByType.
>
> First, it's an argument to improve the docs ;)

Not sure about that.  Full documentation for AddOutputFilterByType could 
become quite mindboggling.

The gotcha that generates most questions is "doesn't work at all in [various
situations, notably a proxy]".  ISTR that *is* documented somewhere, but
people don't notice because it's unintuitive.

Add the *ordering* issue (always comes *after* filters configured with other
directives), and it's already more complex to get the mind around than 
mod_filter.

Now add the complex interactions that happen in setups involving more than
module, or even within a single module (c.f.
http://issues.apache.org/bugzilla/show_bug.cgi?id=33499 )
and it's not IMO something that can be documented adequately without extensive
self-defeating reference to the code itself.

-- 
Nick Kew


Re: Removing AddOutputFilterByType

2005-06-20 Thread Astrid Keßler
at Montag, 20. Juni 2005, Nick Kew wrote:

> Hmmm.  Is it better to have a UI that's openly a little more complex but
> works as documented, or one that appears simple but has lots of
> gotchas lurking in ambush?  I guess that's an argument for mod_filter
> implementing AddOutputFilterByType.

First, it's an argument to improve the docs ;)

Kess



Re: Removing AddOutputFilterByType

2005-06-20 Thread Nick Kew
On Monday 20 June 2005 06:44, Justin Erenkrantz wrote:

> > Since the purpose of this directive is now available from mod_filter,
> > does anyone object if I simply yank AddOutputFilterByType from 2.1?
>
> Why can't AddOutputFilterByType call mod_filter under the hood?

Good point.

Of course it raises issues of filter order that I'd prefer to avoid.
But that's at least less counterintuitive than the existing ordering
with AddOutputFilterByType and [anything else].

> But, from the user's perspective, I think AddOutputFilterByType needs to
> stay and I'm not convinced that mod_filter has a good enough directive
> interface to make it intuitive.  -- justin

Hmmm.  Is it better to have a UI that's openly a little more complex but
works as documented, or one that appears simple but has lots of
gotchas lurking in ambush?  I guess that's an argument for mod_filter
implementing AddOutputFilterByType.

FWIW, it could also do SetOutputFilter, but AddOutputFilter may be
more problematic, in that dispatching on filename/extension needs
thought to avoid introducing another set of problems.  But of course
there's no pressing need to change those in the first place.

-- 
Nick Kew


Re: Removing AddOutputFilterByType

2005-06-19 Thread Justin Erenkrantz

--On June 19, 2005 5:33:14 PM +0100 Nick Kew <[EMAIL PROTECTED]> wrote:


AddOutputFilterByType has always been problematic.  I see there's another bug
report this month arising from it:

http://issues.apache.org/bugzilla/show_bug.cgi?id=33499

Since the purpose of this directive is now available from mod_filter, does
anyone object if I simply yank AddOutputFilterByType from 2.1?


Why can't AddOutputFilterByType call mod_filter under the hood?

But, from the user's perspective, I think AddOutputFilterByType needs to stay 
and I'm not convinced that mod_filter has a good enough directive interface to 
make it intuitive.  -- justin


Re: Removing AddOutputFilterByType

2005-06-19 Thread Nick Kew
On Sunday 19 June 2005 19:57, Nick Kew wrote:


> But as a more general point, if there's something AddOutputFilterByType
> does that mod_filter doesn't do - or complicates doing

Bah.  I was going to say something about if [...] then mod_filter wants 
reviewing.  Then I stopped, but didn't chop out the unfinished sentence.
Sorry folks.

-- 
Nick Kew


Re: Removing AddOutputFilterByType

2005-06-19 Thread Nick Kew
On Sunday 19 June 2005 17:42, André Malo wrote:

> Is mod_filter capable of something like this:
>
> AddOutputFilterByType INCLUDES httpd/unix-directory
>
> ? I'm personally using that heavily to get SSI into autoindex descriptions.

Yow!  You mean httpd/unix-directory - which is most emphatically *not* a
content type - is passing through ap_set_content_type?

To answer your question, mod_filter can dispatch on r->handler, so if handler
gets set to "httpd/unix-directory" then yes.  If not, then it would need a
workaround, such as dispatching on text/html and controlling the scope
of that with a FilesMatch or something of that ilk.  Or a patch to mod_filter.

-- 
Nick Kew


Re: Removing AddOutputFilterByType

2005-06-19 Thread Nick Kew
On Sunday 19 June 2005 17:47, William A. Rowe, Jr. wrote:
> Uhm, yes.  Most definately object.  Trying to map ever type
> that resolves back to, say, text/*, or text/html, would be
> an unnecessary pain.

How is a regexp match
 FilterProvider textfilter my-filter content-type /^text/
or a substring match
 FilterProvider textfilter my-filter content-type $text/
an unnecessary pain?

> I'll look at those reports though, I still have some neurons
> of info about how/why it was implemented; perhaps I can help
> on those PR tickets.  After my HTTP Request proxy test cluster
> is done so I'm happy with our solutions to the Watchfire report.

Fairy nuff:-)

But as a more general point, if there's something AddOutputFilterByType
does that mod_filter doesn't do - or complicates doing

-- 
Nick Kew


Re: Removing AddOutputFilterByType

2005-06-19 Thread William A. Rowe, Jr.
Uhm, yes.  Most definately object.  Trying to map ever type
that resolves back to, say, text/*, or text/html, would be
an unnecessary pain.

I'll look at those reports though, I still have some neurons
of info about how/why it was implemented; perhaps I can help
on those PR tickets.  After my HTTP Request proxy test cluster
is done so I'm happy with our solutions to the Watchfire report.

Bill

At 11:33 AM 6/19/2005, Nick Kew wrote:
>AddOutputFilterByType has always been problematic.  I see there's another bug
>report this month arising from it:
>
>http://issues.apache.org/bugzilla/show_bug.cgi?id=33499
>
>Since the purpose of this directive is now available from mod_filter, does 
>anyone object if I simply yank AddOutputFilterByType from 2.1?
>
>I've hacked a quick patch just to remove it.  If people are happy with it,
>I'll test it and commit, and add a note to the docs about replacing it.
>
>-- 
>Nick Kew




Re: Removing AddOutputFilterByType

2005-06-19 Thread =?iso-8859-1?q?Andr=E9_Malo?=
* Nick Kew wrote:

> AddOutputFilterByType has always been problematic.  I see there's another
> bug report this month arising from it:
>
> http://issues.apache.org/bugzilla/show_bug.cgi?id=33499
>
> Since the purpose of this directive is now available from mod_filter,
> does anyone object if I simply yank AddOutputFilterByType from 2.1?

Is mod_filter capable of something like this:

AddOutputFilterByType INCLUDES httpd/unix-directory

? I'm personally using that heavily to get SSI into autoindex descriptions.

nd
-- 
Winnetous Erbe: <http://pub.perlig.de/books.html#apache2>


Removing AddOutputFilterByType

2005-06-19 Thread Nick Kew
AddOutputFilterByType has always been problematic.  I see there's another bug
report this month arising from it:

http://issues.apache.org/bugzilla/show_bug.cgi?id=33499

Since the purpose of this directive is now available from mod_filter, does 
anyone object if I simply yank AddOutputFilterByType from 2.1?

I've hacked a quick patch just to remove it.  If people are happy with it,
I'll test it and commit, and add a note to the docs about replacing it.

-- 
Nick Kew
--- modules/http/http_protocol.c2005-05-17 01:34:36.198019832 +0100
+++ modules/http/http_protocol.c.new2005-06-19 17:16:34.072088128 +0100
@@ -817,13 +817,6 @@
 }
 else if (!r->content_type || strcmp(r->content_type, ct)) {
 r->content_type = ct;
-
-/* Insert filters requested by the AddOutputFiltersByType 
- * configuration directive. Content-type filters must be 
- * inserted after the content handlers have run because 
- * only then, do we reliably know the content-type.
- */
-ap_add_output_filters_by_type(r);
 }
 }
 
--- include/http_core.h 2005-06-19 17:18:46.674929432 +0100
+++ include/http_core.h.new 2005-06-19 17:19:03.620353336 +0100
@@ -553,9 +553,6 @@
 apr_table_t *accf_map;
 } core_server_config;
 
-/* for AddOutputFiltersByType in core.c */
-void ap_add_output_filters_by_type(request_rec *r);
-
 /* for http_config.c */
 void ap_core_reorder_directories(apr_pool_t *, server_rec *);
 
--- server/core.c   2005-06-19 17:14:18.424709664 +0100
+++ server/core.c.new   2005-06-19 17:22:00.421475480 +0100
@@ -163,60 +163,6 @@
 return (void *)conf;
 }
 
-/*
- * Overlay one hash table of ct_output_filters onto another
- */
-static void *merge_ct_filters(apr_pool_t *p,
-  const void *key,
-  apr_ssize_t klen,
-  const void *overlay_val,
-  const void *base_val,
-  const void *data)
-{
-ap_filter_rec_t *cur;
-const ap_filter_rec_t *overlay_info = (const ap_filter_rec_t *)overlay_val;
-const ap_filter_rec_t *base_info = (const ap_filter_rec_t *)base_val;
-
-cur = NULL;
-
-while (overlay_info) {
-ap_filter_rec_t *new;
-
-new = apr_pcalloc(p, sizeof(ap_filter_rec_t));
-new->name = apr_pstrdup(p, overlay_info->name);
-new->next = cur;
-cur = new;
-overlay_info = overlay_info->next;
-}
-
-while (base_info) {
-ap_filter_rec_t *f;
-int found = 0;
-
-/* We can't have dups. */
-f = cur;
-while (f) {
-if (!strcasecmp(base_info->name, f->name)) {
-found = 1;
-break;
-}
-
-f = f->next;
-}
-
-if (!found) {
-f = apr_pcalloc(p, sizeof(ap_filter_rec_t));
-f->name = apr_pstrdup(p, base_info->name);
-f->next = cur;
-cur = f;
-}
-
-base_info = base_info->next;
-}
-
-return cur;
-}
-
 static void *merge_core_dir_configs(apr_pool_t *a, void *basev, void *newv)
 {
 core_dir_config *base = (core_dir_config *)basev;
@@ -391,21 +337,6 @@
 conf->input_filters = new->input_filters;
 }
 
-if (conf->ct_output_filters && new->ct_output_filters) {
-conf->ct_output_filters = apr_hash_merge(a,
- new->ct_output_filters,
- conf->ct_output_filters,
- merge_ct_filters,
- NULL);
-}
-else if (new->ct_output_filters) {
-conf->ct_output_filters = apr_hash_copy(a, new->ct_output_filters);
-}
-else if (conf->ct_output_filters) {
-/* That memcpy above isn't enough. */
-conf->ct_output_filters = apr_hash_copy(a, base->ct_output_filters);
-}
-
 /*
  * Now merge the setting of the FileETag directive.
  */
@@ -3037,88 +2968,6 @@
 return 0;
 }
 
-static const char *add_ct_output_filters(cmd_parms *cmd, void *conf_,
- const char *arg, const char *arg2)
-{
-core_dir_config *conf = conf_;
-ap_filter_rec_t *old, *new = NULL;
-const char *filter_name;
-
-if (!conf->ct_output_filters) {
-conf->ct_output_filters = apr_hash_make(cmd->pool);
-old = NULL;
-}
-else {
-old = (ap_filter_rec_t*) apr_hash_get(conf->ct_output_filters, arg2,
-  APR_HASH_KEY_STRING);
-/* find last entry */
-if (old) {
-while (old->next) {
-old = old->next;
-}
-}
-}
-
-while (*arg &&
-   (filter_nam

Re: AddOutputFilterByType oddness

2004-09-24 Thread Justin Erenkrantz
--On Thursday, September 23, 2004 12:43 PM +0100 Nick Kew <[EMAIL PROTECTED]> 
wrote:

Basically it does the lookup/dispatch once per filter in the filterchain
per request.  It checks that filter's providers until it finds a match.
So for anything you could do with an [Add|Set]OutputFilter[ByType]
that's one lookup per request.
Okay, so if I have three rules and ten filters, we'll be doing thirty checks, 
right?  And, this will happen even if mod_filter isn't configured - as 
mod_filter still needs to check ten times that it doesn't have anything to do, 
right?  Hmm.  How expensive is this again?

mod_filter takes the content-type as it is at that point in the chain.
Isn't the real nightmare where a filter calls ap_set_content_type and
some AddOutputFilterByTypes are in effect?  I guess what *really* bothers
me is the idea of adding filters *as a side-effect*.
How wouldn't it be a side-effect?  It's intentional from the admin 
perspective, but a side-effect from the developer's perspective.

  And, then
mod_deflate needs to be conditionally added (sub-case #1: it needs to be
added for 'text/plain'; sub-case #2: it needs to be added for 'text/html').
How and where is it added?  Are you inserting dummy filters?
I'm not sure I follow.  It will dispatch to deflate based on the
content-type (or other dispatch criterion) as it is at that point
in the chain.
The question is at which point in the chain does deflate get added?
So if the handler sets application/xml but that goes through an XSLT
filter which sets it to text/html, then mod_filter sees application/xml
if it's before the XSLT filter in the chain, or text/html after it.
How can AddOutputFilterByType expect to cope with that?
I thought you suggested that mod_filter could easily handle this case.  I'm 
still not seeing how.

But FWIW I have that working locally with
FilterDeclare   filter1 Content-TypeCONTENT_SET
FilterDeclare   filter2 Content-Length  CONTENT_SET
FilterProvider  filter1 filter2 $text
FilterProvider  filter2 DEFLATE >4000
FilterChain filter1
to deflate all "text/*" documents of 4k or greater.
Can I comment that I think a clearer configuration syntax is going to be 
needed if we are going to axe all of the current filter directives?

AddOutputFilterByType, for all of its internal oddness, is a simple directive 
for an administrator to understand.  So, perhaps keep 'AddOutputFilterByType' 
and having it internally converted to a mod_filter directive.  But, I'm just 
not overly excited about moving all filter configuration directives to 
something akin to mod_rewrite.  Ouch.  -- justin


Re: Style changes (was: AddOutputFilterByType oddness)

2004-09-23 Thread Justin Erenkrantz
--On Thursday, September 23, 2004 5:38 PM +0200 Graham Leggett 
<[EMAIL PROTECTED]> wrote:

At the same time I (and others) have been criticised in the past for trying
to propose patches that do style changes or that correct whitespace.
What is the official policy on this?
I am quite happy to do as much style cleanups that are necessary, on
condition nobody objects to me doing so.
Some guidelines: Separate style changes from functional changes.  You should 
also do large style commits before doing large functional commits (there are a 
couple of cases where it makes sense to reverse that though).  Style changes 
only get applied to the HEAD - they can't be backported to 2.0.

I think if you follow that, you won't get much flack.  But, there are a number 
of modules that are getting hard to maintain because they don't comply.  So, I 
think it behooves us to try to clean them up before we start 2.2.  -- justin


Style changes (was: AddOutputFilterByType oddness)

2004-09-23 Thread Graham Leggett
Justin Erenkrantz wrote:
I'm getting annoyed by people doing massive code drops (i.e. mod_filter, 
mod_proxy, mod_auth_ldap, etc.) that don't conform to our code style and 
have no comments.  It makes it much harder to go and fix bugs in 'em.  
At the same time I (and others) have been criticised in the past for 
trying to propose patches that do style changes or that correct whitespace.

What is the official policy on this?
I am quite happy to do as much style cleanups that are necessary, on 
condition nobody objects to me doing so.

Regards,
Graham
--


smime.p7s
Description: S/MIME Cryptographic Signature


Re: AddOutputFilterByType oddness

2004-09-23 Thread Nick Kew
On Wed, 22 Sep 2004, Justin Erenkrantz wrote:

> --On Wednesday, September 22, 2004 6:17 PM +0100 Nick Kew
> <[EMAIL PROTECTED]> wrote:
>
> > It seems to me heavily counterintuitive that mixing ByType directives
> > with anything else means that the ByType filters *always* come last.
> > And that Remove won't affect them, but will affect others.
>
> I think we could get Remove*Filter to also delete the content-type filters.
>
> > Indeed.  mod_filter addresses this by configuring at the last moment,
> > so any earlier set_content_type()s are irrelevant.  I don't suppose it's
> > a panacaea for everything, but I do think it's a significant improvement
> > on what we have.
>
> I'm concerned about the overhead of mod_filter having to check all of its
> rules each time a filter is invoked.  This is why I started to look through
> the code last night to see how it worked and how invasive it is.

It's improving with time (except when I introduce bugs...).  Merging in
the structs with util_filter saves on having to do superfluous lookups.

Basically it does the lookup/dispatch once per filter in the filterchain
per request.  It checks that filter's providers until it finds a match.
So for anything you could do with an [Add|Set]OutputFilter[ByType]
that's one lookup per request.

> How would you handle the situation when filter #1 sets C-T to be
> "text/plain" and then filter #2 sets C-T to be "text/html"?

mod_filter takes the content-type as it is at that point in the chain.

Isn't the real nightmare where a filter calls ap_set_content_type and
some AddOutputFilterByTypes are in effect?  I guess what *really* bothers
me is the idea of adding filters *as a side-effect*.

> And, then
> mod_deflate needs to be conditionally added (sub-case #1: it needs to be
> added for 'text/plain'; sub-case #2: it needs to be added for 'text/html').
> How and where is it added?  Are you inserting dummy filters?

I'm not sure I follow.  It will dispatch to deflate based on the
content-type (or other dispatch criterion) as it is at that point
in the chain.

So if the handler sets application/xml but that goes through an XSLT
filter which sets it to text/html, then mod_filter sees application/xml
if it's before the XSLT filter in the chain, or text/html after it.

How can AddOutputFilterByType expect to cope with that?

>
> > From the user's perspective, it's simply more powerful and flexible.
> > Works with any request or response headers (not just content-type) or
> > environment variables.  Gets rid of constraints on ordering, like
> > AddOutputFilterbyType filter always coming after other filters
> > regardless of ordering in httpd.conf.
> >
> > Example: I have a user who wants to insert mod_deflate in a reverse
> > proxy, but only for selected content-types AND not if the content
> > length is below a threshold.  How would he do that with the old filter
> > framework?
>
> I guess I'm not clear what the syntax is (I guess I should go read the
> docs).

That particular scenario is complex, and requires mod_filter to be
used as its own provider.  The point is, we *can* now support complex
setups (or will be - that chaining is still broken in CVS).

But FWIW I have that working locally with

FilterDeclare   filter1 Content-TypeCONTENT_SET
FilterDeclare   filter2 Content-Length  CONTENT_SET

FilterProvider  filter1 filter2 $text
FilterProvider  filter2 DEFLATE >4000

FilterChain filter1

to deflate all "text/*" documents of 4k or greater.


> I definitely don't want to see the filters be configured like
> mod_rewrite.  It needs to be fairly straightforward, but still fairly
> simplistic.  I don't want to have users have to read a complicated manual
> or docs to set up filters.  KISS.

Indeed.  Do you think the examples in the manual page are too complex?

Bear in mind that the third example is no more complex than the first two,
yet suddenly enables a frequently-requested capability that simply isn't
possible with the old filtering.

> Well, the point by you committing it into our tree is that the rest of us
> are now responsible for it.  That's why I brought up the code style issue:

OK, OKOK!   I promise to look harder at the code style guidelines!
And I _did_ ask on the list a couple of weeks before introducing to CVS.

> I looked yesterday afternoon (and haven't seen any commits since then).  I

That'll be the latest version.  Which FWIW was introduced prematurely
because it introduced a new feature demanded by a user.  Only that turned
out to be broken, which is why I'm re-hacking that now.

-- 
Nick Kew


Re: AddOutputFilterByType oddness

2004-09-22 Thread Justin Erenkrantz
--On Wednesday, September 22, 2004 6:17 PM +0100 Nick Kew 
<[EMAIL PROTECTED]> wrote:

It seems to me heavily counterintuitive that mixing ByType directives
with anything else means that the ByType filters *always* come last.
And that Remove won't affect them, but will affect others.
I think we could get Remove*Filter to also delete the content-type filters.
Indeed.  mod_filter addresses this by configuring at the last moment,
so any earlier set_content_type()s are irrelevant.  I don't suppose it's
a panacaea for everything, but I do think it's a significant improvement
on what we have.
I'm concerned about the overhead of mod_filter having to check all of its 
rules each time a filter is invoked.  This is why I started to look through 
the code last night to see how it worked and how invasive it is.

How would you handle the situation when filter #1 sets C-T to be 
"text/plain" and then filter #2 sets C-T to be "text/html"?  And, then 
mod_deflate needs to be conditionally added (sub-case #1: it needs to be 
added for 'text/plain'; sub-case #2: it needs to be added for 'text/html'). 
How and where is it added?  Are you inserting dummy filters?

From the user's perspective, it's simply more powerful and flexible.
Works with any request or response headers (not just content-type) or
environment variables.  Gets rid of constraints on ordering, like
AddOutputFilterbyType filter always coming after other filters
regardless of ordering in httpd.conf.
Example: I have a user who wants to insert mod_deflate in a reverse
proxy, but only for selected content-types AND not if the content
length is below a threshold.  How would he do that with the old filter
framework?
I guess I'm not clear what the syntax is (I guess I should go read the 
docs).  I definitely don't want to see the filters be configured like 
mod_rewrite.  It needs to be fairly straightforward, but still fairly 
simplistic.  I don't want to have users have to read a complicated manual 
or docs to set up filters.  KISS.

From a developers perspective, I wrote it for myself, and have at least
two other developers using it operationally in their product.  Time will
tell what others may use it for.
Well, the point by you committing it into our tree is that the rest of us 
are now responsible for it.  That's why I brought up the code style issue: 
we already have a number of modules that were never fully integrated or 
reviewed.  And, then the person who dropped the code ran away and left the 
code in a goofy state.  (See mod_rewrite, mod_ssl, mod_cache, etc.)

When was that?  I made quite a lot of updates to the style towards
conforming (like eliminating tabs and realigning some braces) before
committing to CVS, but I'm willing to believe I need to look more
carefully.
I looked yesterday afternoon (and haven't seen any commits since then).  I 
will say the most distracting parts are the odd spacing (i.e. parenthesis 
and semi-colons) as well as line spacing.  Unfortunately, I get distracted 
by shiny things such as improper code style such that I can't focus on the 
code itself.  =)  -- justin


Re: AddOutputFilterByType oddness

2004-09-22 Thread Nick Kew
On Wed, 22 Sep 2004, Justin Erenkrantz wrote:

> --On Wednesday, September 22, 2004 5:01 PM +0100 Nick Kew <[EMAIL PROTECTED]>
> wrote:
>
> > I've said it before and I'll say it again: AddOutputFilterByType is
> > fundamentally unsatisfactory.  This confusion is an effect, not cause.
>
> Suffice to say, I disagree.
>
> > * Configuration is inconsistent with other filter directives.  The
> >   relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
> >   and, from a user POV, broken.
>
> I think it's really clear from the user's perspective.  I think the problem
> comes in on the developer's side.

It seems to me heavily counterintuitive that mixing ByType directives
with anything else means that the ByType filters *always* come last.
And that Remove won't affect them, but will affect others.

> > * Tying it to ap_set_content_type is, to say the least, hairy.
> >   IMO we shouldn't *require* modules to call this, and it's utterly
> >   unreasonable to expect that it will never be called more than once
> >   for a request, given the number of modules that might take an interest.
> >   Especially when subrequests and internal redirects may be involved.
>
> We have *always* mandated that ap_set_content_type() should be called rather
> than setting r->content_type.  (I wish we could remove content_type from
> request_rec instead.)

Indeed.  But that doesn't prevent it being called multiple times, perhaps
from different modules.  So using it to insert filters leaves lots of
potantial for trouble.

> > * It's a complexity just waiting for modules to break on it.
>
> Anything that depends upon content-type like this is going to be hairy because
> there may be several 'right' answers during the course of the request.

Indeed.  mod_filter addresses this by configuring at the last moment,
so any earlier set_content_type()s are irrelevant.  I don't suppose it's
a panacaea for everything, but I do think it's a significant improvement
on what we have.

> > I've made some more updates to mod_filter since I last posted on the
> > subject, and I'm getting some very positive feedback from real users.
> > For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
> > it with mod_filter.
>
> I've yet to see a clear and concise statement as to how mod_filter will solve
> this problem in a better and more efficient way.  (Especially from a user's
> perspective, but also from a developer's perspective.)

>From the user's perspective, it's simply more powerful and flexible.
Works with any request or response headers (not just content-type) or
environment variables.  Gets rid of constraints on ordering, like
AddOutputFilterbyType filter always coming after other filters
regardless of ordering in httpd.conf.

Example: I have a user who wants to insert mod_deflate in a reverse
proxy, but only for selected content-types AND not if the content
length is below a threshold.  How would he do that with the old filter
framework?

>From a developers perspective, I wrote it for myself, and have at least
two other developers using it operationally in their product.  Time will
tell what others may use it for.

> I will also comment that I looked in the mod_filter code the other day and was
> disappointed that it doesn't follow our coding style at all or even have
> comments that help people understand what it is trying to do inside the .c
> file.

When was that?  I made quite a lot of updates to the style towards
conforming (like eliminating tabs and realigning some braces) before
committing to CVS, but I'm willing to believe I need to look more
carefully.

-- 
Nick Kew


Re: AddOutputFilterByType oddness

2004-09-22 Thread Justin Erenkrantz
--On Wednesday, September 22, 2004 5:01 PM +0100 Nick Kew <[EMAIL PROTECTED]> 
wrote:

I've said it before and I'll say it again: AddOutputFilterByType is
fundamentally unsatisfactory.  This confusion is an effect, not cause.
Suffice to say, I disagree.
* Configuration is inconsistent with other filter directives.  The
  relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
  and, from a user POV, broken.
I think it's really clear from the user's perspective.  I think the problem 
comes in on the developer's side.

* Tying it to ap_set_content_type is, to say the least, hairy.
  IMO we shouldn't *require* modules to call this, and it's utterly
  unreasonable to expect that it will never be called more than once
  for a request, given the number of modules that might take an interest.
  Especially when subrequests and internal redirects may be involved.
We have *always* mandated that ap_set_content_type() should be called rather 
than setting r->content_type.  (I wish we could remove content_type from 
request_rec instead.)

* It's a complexity just waiting for modules to break on it.
Anything that depends upon content-type like this is going to be hairy because 
there may be several 'right' answers during the course of the request.

I've made some more updates to mod_filter since I last posted on the
subject, and I'm getting some very positive feedback from real users.
For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
it with mod_filter.
I've yet to see a clear and concise statement as to how mod_filter will solve 
this problem in a better and more efficient way.  (Especially from a user's 
perspective, but also from a developer's perspective.)

mod_filter can also obsolete [Set|Add|Remove]OutputFilter, though I'm
in no hurry to do that.  What I can also do is re-implement all the
outputfilter directives within mod_filter and its updated framework.
I will also comment that I looked in the mod_filter code the other day and was 
disappointed that it doesn't follow our coding style at all or even have 
comments that help people understand what it is trying to do inside the .c 
file.  This all makes it very difficult to understand the code.  I'd greatly 
appreciate it if mod_filter (and the code that you inserted elsewhere - i.e. 
in util_filter.c) would conform to our style guidelines and had some comments 
inside of it that say what it does (or trying to do).

For example, some of the things it does just makes no sense at all. 
filter_bucket_type() is completely bogus and needs to be tossed.  The 
type->name field in the bucket should be used instead.

I'm getting annoyed by people doing massive code drops (i.e. mod_filter, 
mod_proxy, mod_auth_ldap, etc.) that don't conform to our code style and have 
no comments.  It makes it much harder to go and fix bugs in 'em.  -- justin


Re: AddOutputFilterByType oddness

2004-09-22 Thread Nick Kew
On Sat, 18 Sep 2004, Justin Erenkrantz wrote:

> > But ap_add_output_filters_by_type() explicitly does nothing for a
> > proxied request.  Anyone know why?  "AddOutputFilterByType DEFLATE
> > text/plain text/html" seems to work as expected here for a forward proxy
> > with this applied: maybe I'm missing something fundamental...
>
> My recollection is initially it didn't have the proxy check, then FirstBill
> had a reason why proxied requests shouldn't work with AddOutputFilterByType.

I've said it before and I'll say it again: AddOutputFilterByType is
fundamentally unsatisfactory.  This confusion is an effect, not cause.

* Configuration is inconsistent with other filter directives.  The
  relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
  and, from a user POV, broken.
* Tying it to ap_set_content_type is, to say the least, hairy.
  IMO we shouldn't *require* modules to call this, and it's utterly
  unreasonable to expect that it will never be called more than once
  for a request, given the number of modules that might take an interest.
  Especially when subrequests and internal redirects may be involved.
* It's a complexity just waiting for modules to break on it.

I've made some more updates to mod_filter since I last posted on the
subject, and I'm getting some very positive feedback from real users.
For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
it with mod_filter.

mod_filter can also obsolete [Set|Add|Remove]OutputFilter, though I'm
in no hurry to do that.  What I can also do is re-implement all the
outputfilter directives within mod_filter and its updated framework.

-- 
Nick Kew


Re: AddOutputFilterByType oddness

2004-09-18 Thread Justin Erenkrantz
--On Thursday, September 16, 2004 5:11 PM +0100 Joe Orton <[EMAIL PROTECTED]> 
wrote:

But ap_add_output_filters_by_type() explicitly does nothing for a
proxied request.  Anyone know why?  "AddOutputFilterByType DEFLATE
text/plain text/html" seems to work as expected here for a forward proxy
with this applied: maybe I'm missing something fundamental...
My recollection is initially it didn't have the proxy check, then FirstBill 
had a reason why proxied requests shouldn't work with AddOutputFilterByType.

Would have to search the archives to remember why.  *sigh*  -- justin


Re: AddOutputFilterByType oddness

2004-09-16 Thread Joe Orton
On Wed, Aug 25, 2004 at 02:40:39PM +0200, Graham Leggett wrote:
> Justin Erenkrantz wrote:
> >Ultimately, all that is needed is a call to ap_set_content_type() before 
> >any bytes are written to the client to get AddOutputFilterByType to 
> >work. Perhaps with the recent momentum behind mod_proxy work, someone 
> >could investigate that and get mod_proxy fixed.
> 
> ap_set_content_type() is called on line 769 of proxy_http.c:

But ap_add_output_filters_by_type() explicitly does nothing for a
proxied request.  Anyone know why?  "AddOutputFilterByType DEFLATE
text/plain text/html" seems to work as expected here for a forward proxy
with this applied: maybe I'm missing something fundamental...

--- server/core.c~  2004-08-31 09:16:56.0 +0100
+++ server/core.c   2004-09-16 16:48:09.0 +0100
@@ -2875,11 +2875,10 @@
 conf = (core_dir_config *)ap_get_module_config(r->per_dir_config,
&core_module);
 
-/* We can't do anything with proxy requests, no content-types or if
- * we don't have a filter configured.
+/* We can't do anything with no content-type or if we don't have a
+ * filter configured.
  */
-if (r->proxyreq != PROXYREQ_NONE || !r->content_type ||
-!conf->ct_output_filters) {
+if (!r->content_type || !conf->ct_output_filters) {
 return;
 }
 



Re: AddOutputFilterByType oddness

2004-08-25 Thread Graham Leggett
Justin Erenkrantz wrote:
Putting on an end user hat I see no reason why AddOutputFilterByType
shouldn't do exactly what it says it does.

I believe it has more to do with mod_proxy than the filter design.  No 
one, at the time we added AddOutputFilterByType, wanted to rewrite 
mod_proxy to be knowledgeable about filters.
I wrote mod_proxy to be knowledgeable about filters shortly after v2.0 
came about, it was one of the first major modules to support filters.

Ultimately, all that is needed is a call to ap_set_content_type() before 
any bytes are written to the client to get AddOutputFilterByType to 
work. Perhaps with the recent momentum behind mod_proxy work, someone 
could investigate that and get mod_proxy fixed.
ap_set_content_type() is called on line 769 of proxy_http.c:
if ((buf = apr_table_get(r->headers_out, "Content-Type"))) {
ap_set_content_type(r, apr_pstrdup(p, buf));
}
Is there anything else that needs to be done to make 
AddOutputFilterByType to work?

Is apr_table_get() case sensitive?
Regards,
Graham
--


smime.p7s
Description: S/MIME Cryptographic Signature


Re: AddOutputFilterByType oddness

2004-08-24 Thread Jess Holle
If I understand this correctly this is a necessity for 
mod_proxy/mod_proxy_ajp to replace mod_jk else this would be a 
significant regression from mod_jk (wherein this issue was fixed last 
year as I recall).

--
Jess Holle
Justin Erenkrantz wrote:
--On Tuesday, August 24, 2004 12:20 PM +0200 Graham Leggett 
<[EMAIL PROTECTED]> wrote:

Putting on an end user hat I see no reason why AddOutputFilterByType
shouldn't do exactly what it says it does.
I believe it has more to do with mod_proxy than the filter design.  No 
one, at the time we added AddOutputFilterByType, wanted to rewrite 
mod_proxy to be knowledgeable about filters.

Ultimately, all that is needed is a call to ap_set_content_type() 
before any bytes are written to the client to get 
AddOutputFilterByType to work. Perhaps with the recent momentum behind 
mod_proxy work, someone could investigate that and get mod_proxy 
fixed.  -- justin



Re: AddOutputFilterByType oddness

2004-08-24 Thread Justin Erenkrantz
--On Tuesday, August 24, 2004 12:20 PM +0200 Graham Leggett 
<[EMAIL PROTECTED]> wrote:

Putting on an end user hat I see no reason why AddOutputFilterByType
shouldn't do exactly what it says it does.
I believe it has more to do with mod_proxy than the filter design.  No one, 
at the time we added AddOutputFilterByType, wanted to rewrite mod_proxy to 
be knowledgeable about filters.

Ultimately, all that is needed is a call to ap_set_content_type() before 
any bytes are written to the client to get AddOutputFilterByType to work. 
Perhaps with the recent momentum behind mod_proxy work, someone could 
investigate that and get mod_proxy fixed.  -- justin


Re: AddOutputFilterByType oddness

2004-08-24 Thread Geoffrey Young


Graham Leggett wrote:
> Nick Kew wrote:
> 
>>> I have just set up the most recent httpd v2.0.51-dev tree, and have
>>> configured a filter that strips leading whitespace from HTML:
>>>
>>> AddOutputFilterByType STRIP text/html
>>>
>>> The content is served by mod_proxy.
> 
> 
>> As it stands, that can't work.
> 
> 
> Then as it stands filter's are broken.
> 
> Putting on an end user hat I see no reason why AddOutputFilterByType
> shouldn't do exactly what it says it does.

for the record, I've know of other cases where AddOutputFilterByType just
doesn't cut it, specifically wrt filter_init.  see

  http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=107090791508163&w=2

for more details.  while I'm trying to address a separate issue there, the
example tarball shows that AddOutputFiltersByType is broken for even some
core module setups.

HTH

--Geoff


Re: AddOutputFilterByType oddness

2004-08-24 Thread Graham Leggett
Nick Kew wrote:
I have just set up the most recent httpd v2.0.51-dev tree, and have
configured a filter that strips leading whitespace from HTML:
AddOutputFilterByType STRIP text/html
The content is served by mod_proxy.

As it stands, that can't work.
Then as it stands filter's are broken.
Putting on an end user hat I see no reason why AddOutputFilterByType 
shouldn't do exactly what it says it does.

It's a manifestation of the problem I'm addressing by reviewing
the filter architecture: see http://www.apachetutor.org/dev/smart-filter
and the "Ideas for smart filtering" thread here.
Reading the above, it seems that people are alergic to having filters 
look at headers to decide whether they should be valid or not.

Having a totally generic non HTTP filter sounds like a nice idea, but in 
practice it's a real pain in the ass. The filters need the knowledge 
contained in the headers regardless otherwise they simply won't work. 
They can either access the headers directly, or they can access some 
generic interface that warps the headers into something generic for the 
filters to access. Right now it seems filters do neither.

This is really annoying for an end user. Having developed the filter we 
need for our application, we deploy it and now find we cannot use it. 
For us it's back to the drawing board. :(

Regards,
Graham
--


smime.p7s
Description: S/MIME Cryptographic Signature


Re: AddOutputFilterByType oddness

2004-08-24 Thread Nick Kew
On Tue, 24 Aug 2004, Nick Kew wrote:

> I actually have an implementation based on the discussion document and
> addressing the concerns people raised in the thread.  I hope to find
> time to finish the accompanying documentation and post it here round
> about this coming weekend.

OK, since you seem to have a real-life use for it, here goes.  As I
said before, I wasn't planning to post without a little more testing
and accompanying documents and discussion, but what the ?
I'm sure I'll regret this premature posting 

Mini-Synopsis:


# 1. Declare a smart filter that dispatches on Content-Type
FilterDeclare   myfilterContent-Type


# 2. Declare your filter as a Provider, to run whenever Content-Type
#includes the string "text/html"
FilterProvider  myfilterSTRIP   $text/html


# 3. Set the smart filter chain to this filter where you want to apply it

FilterChain =myfilter


-- 
Nick Kew/*  Copyright (C) 2004 Nick Kew

This is experimental code.  It may be copied and used only for
evaluation and testing purposes.

The copyright holder offers to the Apache Software Foundation
permission to re-license this code under the ASF license. 
This offer applies if and when the ASF accepts this code or
any derived work for inclusion in a future release of HTTPD.

Regardless of the above, the author undertakes to release the
work under a recognised open-source license in due course.
Information will be available at http://apache.webthing.com/
and/or http://dev.apache.org/~niq/
*/
#include 
#include 

/* apache */
#include 
#include 
#include 
#include 
#include 
#include 

module AP_MODULE_DECLARE_DATA filter_module ;


#ifndef NO_PROTOCOL
#define PROTO_CHANGE 0x1
#define PROTO_CHANGE_LENGTH 0x2
#define PROTO_NO_BYTERANGE 0x4
#define PROTO_NO_PROXY 0x8
#define PROTO_NO_CACHE 0x10
#define PROTO_TRANSFORM 0x20
#endif

typedef apr_status_t (*filter_func_t)(ap_filter_t*, apr_bucket_brigade*) ;

typedef struct {
  const char* name ;
  filter_func_t func ;
  void* fctx ;
} harness_ctx ;

typedef struct mod_filter_provider {
  enum {
STRING_MATCH,
STRING_CONTAINS,
REGEX_MATCH,
INT_EQ,
INT_LE,
INT_GE,
DEFINED
  } match_type ;
  union {
const char* c ;
regex_t* r ;
int i ;
  } match ;
  ap_filter_rec_t* frec ;
  struct mod_filter_provider* next ;
#ifndef NO_PROTOCOL
  unsigned int proto_flags ;
#endif
} mod_filter_provider ;

typedef struct {
  ap_filter_rec_t frec ;
  enum {
REQUEST_HEADERS,
RESPONSE_HEADERS,
SUBPROCESS_ENV,
CONTENT_TYPE
  } dispatch ;
  const char* value ;
  mod_filter_provider* providers ;
#ifndef NO_PROTOCOL
  unsigned int proto_flags ;
  const char* range ;
#endif
} mod_filter_rec ;

typedef struct mod_filter_chain {
  const char* fname ;
  struct mod_filter_chain* next ;
} mod_filter_chain ;

typedef struct {
  apr_hash_t* live_filters ;
  mod_filter_chain* chain ;
} mod_filter_cfg ;

static int filter_init(ap_filter_t* f) {
  mod_filter_provider* p ;
  int err ;
  harness_ctx* ctx = f->ctx ;
  mod_filter_cfg* cfg
= ap_get_module_config(f->r->per_dir_config, &filter_module);
  mod_filter_rec* filter
= apr_hash_get(cfg->live_filters, ctx->name, APR_HASH_KEY_STRING) ;
  for ( p = filter->providers ; p ; p = p->next ) {
if ( p->frec->filter_init_func ) {
  if ( err =  p->frec->filter_init_func(f), err != OK ) {
break ; /* if anyone errors out here, so do we */
  }
}
  }
  return err ;
}
static filter_func_t filter_lookup(request_rec* r, mod_filter_rec* filter) {
  mod_filter_provider* provider ;
  const char* str ;
  const char* cachecontrol ;
  int match ;
  unsigned int proto_flags ;

  /* Check registered providers in order */
  for ( provider = filter->providers; provider; provider = provider->next) {
match = 1 ;
switch ( filter->dispatch ) {
  case REQUEST_HEADERS:
str = apr_table_get(r->headers_in, filter->value) ;
break ;
  case RESPONSE_HEADERS:
str = apr_table_get(r->headers_out, filter->value) ;
break ;
  case SUBPROCESS_ENV:
str = apr_table_get(r->subprocess_env, filter->value) ;
break ;
  case CONTENT_TYPE:
str = r->content_type ;
break ;
}
/* treat nulls so we don't have to check every strcmp individually
 Not sure if there's anything better to do with them
*/
if ( str == NULL ) {
  if ( provider->match_type == DEFINED ) {
if ( provider->match.c != NULL ) {
  match = 0 ;
}
  }
} else if ( provider->match.c == NULL ) {
  match = 0 ;
} else {
/* Now we have no nulls, so we can do string and regexp matching */
  switch ( provider->match_type ) {
case STRING_MATCH:
  if ( strcasecmp(str, provider->match.c) ) {
match = 0 ;
  }
  break ;
case STRING_CONTA

Re: AddOutputFilterByType oddness

2004-08-24 Thread Graham Leggett
William A. Rowe, Jr. wrote:
Is your DefaultType set to text/html?
It's set like so:
DefaultType text/plain

You are proxying content?  What does the HEAD /image.gif HTTP/1.0
report for content type from the backend server?
It says this:
[EMAIL PROTECTED] root]# telnet gatekeeper.fma.co.za 80
Trying 196.30.143.210...
Connected to gatekeeper.fma.co.za.
Escape character is '^]'.
HEAD /patricia/policy/images/tabaccounting1.gif HTTP/1.1
Host: gatekeeper.fma.co.za
HTTP/1.1 200 OK
Date: Tue, 24 Aug 2004 10:05:54 GMT
Server: Apache-Coyote/1.1
ETag: W/"1636-1092965561000"
Last-Modified: Fri, 20 Aug 2004 01:32:41 GMT
Content-Type: image/gif
Connection: close
Connection closed by foreign host.
Regards,
Graham
--


smime.p7s
Description: S/MIME Cryptographic Signature


Re: AddOutputFilterByType oddness

2004-08-24 Thread Nick Kew
On Tue, 24 Aug 2004, Graham Leggett wrote:

> I have just set up the most recent httpd v2.0.51-dev tree, and have
> configured a filter that strips leading whitespace from HTML:
>
> AddOutputFilterByType STRIP text/html
>
> The content is served by mod_proxy.

As it stands, that can't work.

It's a manifestation of the problem I'm addressing by reviewing
the filter architecture: see http://www.apachetutor.org/dev/smart-filter
and the "Ideas for smart filtering" thread here.

I actually have an implementation based on the discussion document and
addressing the concerns people raised in the thread.  I hope to find
time to finish the accompanying documentation and post it here round
about this coming weekend.

> http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
>
> it says that filters are not applied by proxied requests (It does not
> give a reason why not).

The URL above makes it clear what's happening there.

-- 
Nick Kew


Re: AddOutputFilterByType oddness

2004-08-23 Thread William A. Rowe, Jr.
At 07:23 PM 8/23/2004, Graham Leggett wrote:
>Paul Querna wrote:
>
>>Is your DefaultType set to text/html?
>
>It's set like so:
>
>DefaultType text/plain

You are proxying content?  What does the HEAD /image.gif HTTP/1.0
report for content type from the backend server?

Bill




Re: AddOutputFilterByType oddness

2004-08-23 Thread Graham Leggett
Paul Querna wrote:
Is your DefaultType set to text/html?
It's set like so:
DefaultType text/plain
On Tue, 2004-08-24 at 01:54 +0200, Graham Leggett wrote:
Hi all,
I have just set up the most recent httpd v2.0.51-dev tree, and have 
configured a filter that strips leading whitespace from HTML:

AddOutputFilterByType STRIP text/html
The content is served by mod_proxy.
This seems to work fine for HTML requests, but I have noticed that this 
filter is also being applied to images as well (thus corrupting them). 
Why would the above directive apply to all content, instead of text/html 
only as is configured?

Looking at the following docs:
http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
it says that filters are not applied by proxied requests (It does not 
give a reason why not). From the test above however this statement is 
false, filters are applied to proxy requests - all proxied requests.

Am I doing something wrong, or is AddOutputFilterByType broken?
Regards,
Graham
--


smime.p7s
Description: S/MIME Cryptographic Signature


Re: AddOutputFilterByType oddness

2004-08-23 Thread Paul Querna
Is your DefaultType set to text/html?

On Tue, 2004-08-24 at 01:54 +0200, Graham Leggett wrote:
> Hi all,
> 
> I have just set up the most recent httpd v2.0.51-dev tree, and have 
> configured a filter that strips leading whitespace from HTML:
> 
> AddOutputFilterByType STRIP text/html
> 
> The content is served by mod_proxy.
> 
> This seems to work fine for HTML requests, but I have noticed that this 
> filter is also being applied to images as well (thus corrupting them). 
> Why would the above directive apply to all content, instead of text/html 
> only as is configured?
> 
> Looking at the following docs:
> 
> http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
> 
> it says that filters are not applied by proxied requests (It does not 
> give a reason why not). From the test above however this statement is 
> false, filters are applied to proxy requests - all proxied requests.
> 
> Am I doing something wrong, or is AddOutputFilterByType broken?
> 
> Regards,
> Graham
> --



AddOutputFilterByType oddness

2004-08-23 Thread Graham Leggett
Hi all,
I have just set up the most recent httpd v2.0.51-dev tree, and have 
configured a filter that strips leading whitespace from HTML:

AddOutputFilterByType STRIP text/html
The content is served by mod_proxy.
This seems to work fine for HTML requests, but I have noticed that this 
filter is also being applied to images as well (thus corrupting them). 
Why would the above directive apply to all content, instead of text/html 
only as is configured?

Looking at the following docs:
http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
it says that filters are not applied by proxied requests (It does not 
give a reason why not). From the test above however this statement is 
false, filters are applied to proxy requests - all proxied requests.

Am I doing something wrong, or is AddOutputFilterByType broken?
Regards,
Graham
--


smime.p7s
Description: S/MIME Cryptographic Signature


Re: AddOutputFilterByType and mod_jk [was Re: mod_deflate with mod_jk]

2003-01-06 Thread Henri Gomez
Aditya wrote:

On Mon, 09 Dec 2002 16:29:40 +0100, Henri Gomez <[EMAIL PROTECTED]> said:
BTW, I updated mod_jk 1.2.2-dev, 2.0.4-dev and also mod_webapp to
set the content type the correct way, previously there was a direct
set of content-type and I now use ap_set_content_type :




hgomez 2002/12/09 05:19:18

  Modified: jk/native/apache-2.0 mod_jk.c Log: Make jk works with
filters in Apache 2.0, ie mod_deflate and

  AddOutputFilterByType DEFLATE text/html.



I can confirm this now "does the right thing" with Apache 2.0.39 under
FreeBSD running mod_jk from CVS HEAD and mod-xslt (www.mod-xslt.com)
as an output filter for content of type text/xml


Thanks for the confirmation.

BTW, Happy New Year to all of you





AddOutputFilterByType and mod_jk [was Re: mod_deflate with mod_jk]

2002-12-26 Thread Aditya
> On Mon, 09 Dec 2002 16:29:40 +0100, Henri Gomez <[EMAIL PROTECTED]> said:
> BTW, I updated mod_jk 1.2.2-dev, 2.0.4-dev and also mod_webapp to
> set the content type the correct way, previously there was a direct
> set of content-type and I now use ap_set_content_type :

> hgomez 2002/12/09 05:19:18
>
>Modified: jk/native/apache-2.0 mod_jk.c Log: Make jk works with
> filters in Apache 2.0, ie mod_deflate and
>
>AddOutputFilterByType DEFLATE text/html.

I can confirm this now "does the right thing" with Apache 2.0.39 under
FreeBSD running mod_jk from CVS HEAD and mod-xslt (www.mod-xslt.com)
as an output filter for content of type text/xml

Thanks,
Adi



mod_proxy and AddOutputFilterByType

2002-10-30 Thread Juan Rivera








Question
on Apache 2.0.43:

 

I'm
trying to use AddOutputFilterByType on a reverse proxy vhost. Looking at the
code in core.c line 
2633, it checks if the request is proxied and if so, they don't
add the filters specified in the AddOutputFilterByType entries. Why?

 

 


 
  
  
  
  
  
  
 
 
  
   
  
  
  Juan C. Rivera
  
 
 
  
   
  
  
  Citrix Systems, Inc
  
 
 
  
   
  
  
  Tel: (954)229-6391
  
 


 








Re: Review: AddOutputFilterByType

2002-09-19 Thread André Malo

* André Malo wrote:

> * Justin Erenkrantz wrote:
> 
>> I don't have any easy solutions to this.  Well, I do, but it'd
>> require always doing the ap_getword call in the mainline request
>> path (which is how mod_mime does it).  That is so beyond awful...
> 
> hmm, now I had the time to think a little bit about. You'll find a
> patch attached for the current function in core.c - untested and just
> meant as idea. Feel free to throw it away, if it's not a good idea...
> ;-) 

hmm, I have the feeling, my patches are always overlooked ;-)

Don't wanna be pushy, but is there really nothing to say (e.g. what's
wrong with this or so?) 

nd
-- 
die (eval q-qq:Just Another Perl Hacker
:-)

# André Malo,  #



no doc for MaxMemFree or AddOutputFilterByType

2002-09-13 Thread Jeff Trawick

Does anybody have a short description for MaxMemFree?

What about AddOutputFilterByType?  Is that working properly now?

-- 
Jeff Trawick | [EMAIL PROTECTED]
Born in Roswell... married an alien...



RE: [PATCH] AddOutputFilterByType issues.

2002-04-01 Thread Ryan Bloom

> On Mon, Apr 01, 2002 at 02:05:04PM -0800, Ryan Bloom wrote:
> > I have two conflicting thoughts, so I'll put them both out there for
> > discussion.
> >
> > 1)  I agree (mostly) RESOURCE filters are really the only ones that
make
> > sense to add multiple times.  We should ensure that no other filters
are
> > added more than once.
> >
> > 2)  It is up to the filter to protect against this case.  That can
be
> > done by walking the filter chain to ensure that the same filter
isn't in
> > the list already.  Of course, walking the chain could be slow,
depending
> > on how many filters there are.
> 
> How could the filter itself protect against this case?  By the
> time it is called, it is already too late - the chain is created.
> Or am I missing something?
> 
> The only thing I can think of is that it it looks at f/f->next
> to make sure that there are no other copies left in the chain that
> haven't been called.  I think it would be better to just protect
> against that when we *add* filters rather than when we execute
> them.
> 
> I will commit the strcmp check now.  -- Justin

All it needs to do is leave a message for itself in the request_rec.
That can be done in either the per_request vector, or the r->notes
table.  I wouldn't want to use r->notes, because that could get really
large quickly.

The reality though is that the filter needs to be the one to check this.
There are easily some RESOURCE filters that shouldn't be added more than
once, for example the mod_header_footer filter should only be inserted
once.

There are some other methods that I can think of, but none of them are
really clean.  I kind of like the idea of just putting a note in the
per-request vector.

Ryan






Re: [PATCH] AddOutputFilterByType issues.

2002-04-01 Thread Justin Erenkrantz

On Mon, Apr 01, 2002 at 02:05:04PM -0800, Ryan Bloom wrote:
> I have two conflicting thoughts, so I'll put them both out there for
> discussion.
> 
> 1)  I agree (mostly) RESOURCE filters are really the only ones that make
> sense to add multiple times.  We should ensure that no other filters are
> added more than once.
> 
> 2)  It is up to the filter to protect against this case.  That can be
> done by walking the filter chain to ensure that the same filter isn't in
> the list already.  Of course, walking the chain could be slow, depending
> on how many filters there are.

How could the filter itself protect against this case?  By the
time it is called, it is already too late - the chain is created.
Or am I missing something?

The only thing I can think of is that it it looks at f/f->next
to make sure that there are no other copies left in the chain that
haven't been called.  I think it would be better to just protect
against that when we *add* filters rather than when we execute
them.  

I will commit the strcmp check now.  -- justin



RE: [PATCH] AddOutputFilterByType issues.

2002-04-01 Thread Ryan Bloom

> I've been partly out of it lately, but I think there is a problem
> with AddOutputFilterByType.  Since ap_set_content_type() can
> be called arbitrarily many times, it will try to add each filter
> as directed by AddOutputFilterByType on each call.  For certain
> filters, that isn't a terrifically good idea - such as mod_deflate.
> For others (say mod_include), this is okay.
> 
> We can't compress a file twice (and is causing problems with
> Subversion right now), but that's what we are doing now -
> mod_deflate is in the chain twice.  mod_mime calls
> ap_set_content_type twice in its normal execution (I dunno why,
> but that's sucky).
> 
> I'm using the following patch to minimize how often
> ap_add_output_filters_by_type is called.  This fixes the symptom,
> but perhaps we should think of a better solution?  Perhaps only
> allow AP_FTYPE_RESOURCE filters to be added multiple times, and
> AP_FTYPE_CONTENT_SET or higher can only be added once?
> 
> Thoughts?  -- justin

I have two conflicting thoughts, so I'll put them both out there for
discussion.

1)  I agree (mostly) RESOURCE filters are really the only ones that make
sense to add multiple times.  We should ensure that no other filters are
added more than once.

2)  It is up to the filter to protect against this case.  That can be
done by walking the filter chain to ensure that the same filter isn't in
the list already.  Of course, walking the chain could be slow, depending
on how many filters there are.

Regardless, this patch is all good-ness, and should be committed ASAP.

Ryan






[PATCH] AddOutputFilterByType issues.

2002-04-01 Thread Justin Erenkrantz

I've been partly out of it lately, but I think there is a problem
with AddOutputFilterByType.  Since ap_set_content_type() can
be called arbitrarily many times, it will try to add each filter
as directed by AddOutputFilterByType on each call.  For certain
filters, that isn't a terrifically good idea - such as mod_deflate.
For others (say mod_include), this is okay.

We can't compress a file twice (and is causing problems with
Subversion right now), but that's what we are doing now -
mod_deflate is in the chain twice.  mod_mime calls
ap_set_content_type twice in its normal execution (I dunno why,
but that's sucky).

I'm using the following patch to minimize how often
ap_add_output_filters_by_type is called.  This fixes the symptom,
but perhaps we should think of a better solution?  Perhaps only
allow AP_FTYPE_RESOURCE filters to be added multiple times, and
AP_FTYPE_CONTENT_SET or higher can only be added once?

Thoughts?  -- justin

Index: modules/http/http_protocol.c
===
RCS file: /home/cvs/httpd-2.0/modules/http/http_protocol.c,v
retrieving revision 1.406
diff -u -r1.406 http_protocol.c
--- modules/http/http_protocol.c29 Mar 2002 08:17:22 -  1.406
+++ modules/http/http_protocol.c1 Apr 2002 21:43:07 -
@@ -1272,14 +1272,16 @@
 
 AP_DECLARE(void) ap_set_content_type(request_rec *r, const char *ct)
 {
-r->content_type = ct;
+if (!r->content_type || strcmp(r->content_type, ct)) {
+r->content_type = ct;
 
-/* Insert filters requested by the AddOutputFiltersByType 
- * configuration directive. Content-type filters must be 
- * inserted after the content handlers have run because 
- * only then, do we reliably know the content-type.
- */
-ap_add_output_filters_by_type(r);
+/* Insert filters requested by the AddOutputFiltersByType 
+ * configuration directive. Content-type filters must be 
+ * inserted after the content handlers have run because 
+ * only then, do we reliably know the content-type.
+ */
+ap_add_output_filters_by_type(r);
+}
 }
 
 typedef struct header_filter_ctx {



RE: AddOutputFilterByType

2002-03-05 Thread William A. Rowe, Jr.


> > > What would be most cool is to set an r->replace_request member to
>the
> > > subrequest we will run.  Then in the run request phase, look at
> > > replace_request and run the insert_filters/run_handler against that
> > > replacement.
> >
> > That could be goodness.  I agree that I'm not sure if it is really
> > that trivial.  Perhaps.  Well, actually, I know it won't be since
> > some filters do different things when r->main is set.  I can see
> > rbb saying, "Those filters are broken."  Is that the case?  -- justin
>
>No that isn't the case, and the fix isn't that trivial.  It isn't
>impossible, but it adds an if to every single hook call.

Or would it... your griping about it [both of you] suddenly sparked some
insight... if the return code is != 0 then we test the various cases.

The mainline OK case remains unhindered :)

Bill




RE: AddOutputFilterByType

2002-03-05 Thread Ryan Bloom


> > Rbb and I chatted about this earlier today.  It seems like once the
> decision
> > is reached that we have a fast internal redirect, we should _stop_
> > processing
> > that main request.  Obviously some well defined return value from
the
> hook
> > fn, similar to DONE but not quite the same.
> 
> Ahem, wasn't I saying that?  ;-)  Yes, I think we need this too.
> 
> > What would be most cool is to set an r->replace_request member to
the
> > subrequest we will run.  Then in the run request phase, look at
> > replace_request and run the insert_filters/run_handler against that
> > replacement.
> 
> That could be goodness.  I agree that I'm not sure if it is really
> that trivial.  Perhaps.  Well, actually, I know it won't be since
> some filters do different things when r->main is set.  I can see
> rbb saying, "Those filters are broken."  Is that the case?  -- justin

No that isn't the case, and the fix isn't that trivial.  It isn't
impossible, but it adds an if to every single hook call.

Ryan





Re: AddOutputFilterByType

2002-03-05 Thread Justin Erenkrantz

On Tue, Mar 05, 2002 at 04:43:50PM -0600, William A. Rowe, Jr. wrote:
> Cut it out already :-)

I'll try.

> Rbb and I chatted about this earlier today.  It seems like once the decision
> is reached that we have a fast internal redirect, we should _stop_ 
> processing
> that main request.  Obviously some well defined return value from the hook
> fn, similar to DONE but not quite the same.

Ahem, wasn't I saying that?  ;-)  Yes, I think we need this too.

> What would be most cool is to set an r->replace_request member to the
> subrequest we will run.  Then in the run request phase, look at
> replace_request and run the insert_filters/run_handler against that 
> replacement.

That could be goodness.  I agree that I'm not sure if it is really
that trivial.  Perhaps.  Well, actually, I know it won't be since
some filters do different things when r->main is set.  I can see
rbb saying, "Those filters are broken."  Is that the case?  -- justin




Re: AddOutputFilterByType

2002-03-05 Thread William A. Rowe, Jr.

At 04:29 PM 3/5/2002, you wrote:

>Also, why does mod_negotiation handle fixups in type_checker and
>mod_dir does it in fixups?  I'd suggest that we should move the
>handle_multi call to be a fixups so that we are consistent where
>our redirect calls occur.  -- justin

Cut it out already :-)

We don't know if some other module is -interested- in that dir... so mod_dir
will wait till fixups to say "hey - I can do that!" and auth is run for the 
dir as
well as its index.html.

Negotiation decides way up in type_checker that yea - that's a multiview,
and this is what we will do with it.

Rbb and I chatted about this earlier today.  It seems like once the decision
is reached that we have a fast internal redirect, we should _stop_ processing
that main request.  Obviously some well defined return value from the hook
fn, similar to DONE but not quite the same.

What would be most cool is to set an r->replace_request member to the
subrequest we will run.  Then in the run request phase, look at
replace_request and run the insert_filters/run_handler against that 
replacement.

Need to spend 2 hours away from work + apache ... I'll check in later and
see if the solution is really that trivial.

Bill




Re: AddOutputFilterByType

2002-03-05 Thread Justin Erenkrantz

On Tue, Mar 05, 2002 at 07:04:43AM -0800, Ryan Bloom wrote:
> 
> Why is this thing being run in the fixups phase?  The whole point of the
> insert_filters phase is to insert filters for the given resource.  Why
> are we trying to insert resource based filters in the fixups?
> Especially given that the resource can change during the fixups phase.

True, this is where OtherBill suggested to me where it should go.
But now that I understand our filtering hooks a bit better, I now
agree that insert_filter makes more sense.  However, should we
attempt to abstract setting r->content_type so that we can intercept
whenever the content_type is modified and add the right filters as
dictated by AddOutputFilterByType?  I have some reservations about
that though, but it might solve our filter changing the content-type
on us.  

Also, why does mod_negotiation handle fixups in type_checker and
mod_dir does it in fixups?  I'd suggest that we should move the
handle_multi call to be a fixups so that we are consistent where
our redirect calls occur.  -- justin




AddOutputFilterByType

2002-03-05 Thread Ryan Bloom


Why is this thing being run in the fixups phase?  The whole point of the
insert_filters phase is to insert filters for the given resource.  Why
are we trying to insert resource based filters in the fixups?
Especially given that the resource can change during the fixups phase.

Ryan

--
Ryan Bloom  [EMAIL PROTECTED]
645 Howard St.  [EMAIL PROTECTED]
San Francisco, CA 





Re: FW: problem with AddOutputFilterByType directive

2002-02-20 Thread Ian Holsman

Justin Erenkrantz wrote:
> On Tue, Feb 19, 2002 at 03:27:04PM -0800, Ryan Bloom wrote:
> 
>>We have a function, ap_pass_brigade(), which is called by every content
>>generator, by definition.  Just put a hook into that function.
>>
> 
> And have that hook called every time data is sent through a
> filter (also output filters call ap_pass_brigade)?  We can only make
> this determination once (when we make this decision we have to be
> sure it is right).  But, there also becomes a point that when we
> positively know the answer, we may be too late to insert arbitrary
> filters.
> 
> I understand why you guys are proposing this solution, but I think
> you're missing my point.  Given our current architecture, we have
> no way of guaranteeing the content-type until most filters have
> been run.

so remove the directive completly.
if it can't be done in ALL cases your going to be surprising
a whole lot of people when suddenly their jsp's dont run a filter
while their servlets do..


> 
> You can't guarantee that the content-type is correct when
> ap_pass_brigade() is called (first time or many times).  You have
> no idea when the content-type will be set.  Any of the filters are
> free to change the content-type as they execute themselves.
> Consider the following case:
> 
> A JSP file that has:
> <% response.setContentType("text/plain") %>
> ...jsp code...
> 
> At what point do we run this hook that checks content-type?  The
> first time ap_pass_brigade() is called, it may have text/html set.
> That may be because mod_mime saw .jsp and said, "Okay, text/html
> for .jsp."  However, the JSP engine (via an Apache 2.0 filter) says,
> "nah, this should be text/plain."  If we base it on the first call to
> ap_pass_brigade(), we add "text/html"'s filters.  If we base it on
> the second call to ap_pass_brigade, you now need "text/plain"'s
> filters.  Say, we have JSP page that produces PHP code (hey, it's
> valid in our architecture), and the PHP script now changes it to
> "application/x-ogg".  So, the content-type is now something else.
> It can arbitrarily flip-flop as ap_pass_brigade() is called and the
> filter tree is walked.
> 
> However, in our current implementation, once we reach the HTTP_HEADER
> state, the content-type must be defined.  How about we do it as a
> filter that ran then?  Possibly even in ap_http_header_filter.  So,
> let's say we do it right before ap_http_header_filter - we're
> *guaranteed* to know content_type by then, right?  Add the filter as
> requested by AddOutputFilterByType.
> 
> Is there a violation of our filter ordering semantics by running this
> filter out-of-order?  We'll be running this filter during HTTP_HEADER.
> Assume you have two filters - one is explicitly at a higher-priority
> to ensure that one is always run *after* another.  Now, say that the
> lower-priority filter is added by this directive (but not the other) -
> we're violating the priorities.  I think that's bad.  So, perhaps, we
> could restrict AddOutputFilterByType to only >= HTTP_HEADER filters.
> If it is less than that, we could produce a config-time error.  That
> seems something that might work.  Thoughts?  I think it may make
> us too restrictive with this directive though.  -- justin
> 
> 






Re: FW: problem with AddOutputFilterByType directive

2002-02-19 Thread Justin Erenkrantz

On Tue, Feb 19, 2002 at 03:27:04PM -0800, Ryan Bloom wrote:
> We have a function, ap_pass_brigade(), which is called by every content
> generator, by definition.  Just put a hook into that function.

And have that hook called every time data is sent through a
filter (also output filters call ap_pass_brigade)?  We can only make
this determination once (when we make this decision we have to be
sure it is right).  But, there also becomes a point that when we
positively know the answer, we may be too late to insert arbitrary
filters.

I understand why you guys are proposing this solution, but I think
you're missing my point.  Given our current architecture, we have
no way of guaranteeing the content-type until most filters have
been run.

You can't guarantee that the content-type is correct when
ap_pass_brigade() is called (first time or many times).  You have
no idea when the content-type will be set.  Any of the filters are
free to change the content-type as they execute themselves.
Consider the following case:

A JSP file that has:
<% response.setContentType("text/plain") %>
...jsp code...

At what point do we run this hook that checks content-type?  The
first time ap_pass_brigade() is called, it may have text/html set.
That may be because mod_mime saw .jsp and said, "Okay, text/html
for .jsp."  However, the JSP engine (via an Apache 2.0 filter) says,
"nah, this should be text/plain."  If we base it on the first call to
ap_pass_brigade(), we add "text/html"'s filters.  If we base it on
the second call to ap_pass_brigade, you now need "text/plain"'s
filters.  Say, we have JSP page that produces PHP code (hey, it's
valid in our architecture), and the PHP script now changes it to
"application/x-ogg".  So, the content-type is now something else.
It can arbitrarily flip-flop as ap_pass_brigade() is called and the
filter tree is walked.

However, in our current implementation, once we reach the HTTP_HEADER
state, the content-type must be defined.  How about we do it as a
filter that ran then?  Possibly even in ap_http_header_filter.  So,
let's say we do it right before ap_http_header_filter - we're
*guaranteed* to know content_type by then, right?  Add the filter as
requested by AddOutputFilterByType.

Is there a violation of our filter ordering semantics by running this
filter out-of-order?  We'll be running this filter during HTTP_HEADER.
Assume you have two filters - one is explicitly at a higher-priority
to ensure that one is always run *after* another.  Now, say that the
lower-priority filter is added by this directive (but not the other) -
we're violating the priorities.  I think that's bad.  So, perhaps, we
could restrict AddOutputFilterByType to only >= HTTP_HEADER filters.
If it is less than that, we could produce a config-time error.  That
seems something that might work.  Thoughts?  I think it may make
us too restrictive with this directive though.  -- justin




FW: problem with AddOutputFilterByType directive

2002-02-19 Thread Ryan Bloom

> On Mon, Feb 18, 2002 at 09:17:19PM -0800, Ian Holsman wrote:
> > it doesn't do what it is supposed to do ALL the time.
> >
> > for example.. take mod-status.
> >
> > inside the handler it decides what type of content-type
> > the program will return.
> 
> "bad module, fix module."

That's bogus.  Consider mod_cgi, which can't determine the content type
until the first bucket is written.

I'm with Ian on this one.

> > *) add a filter (with a priority of -1 FTYPE_CONTENT) which just
> >checks the content-type and adds the other filter if it
> >passes.
> 
> If we have to do this, then I think our architecture is in poor
> shape.  I don't think a filter is the right place to do this - there
> is way too much overhead in a filter to do this logic.  And, we'd
> also be adding a filter after we've sent data down the filter chain.
> 
> (Don't get me wrong, I see your rationale behind this, but if the
> only way to do this is via another filter, we've got problems.)
> 
> I could be wrong, but I think the way I did it is the right way to
> do this given what we have now.  Anyone else?  Am I wrong?  -- justin

We have a function, ap_pass_brigade(), which is called by every content
generator, by definition.  Just put a hook into that function.

Ryan





Re: problem with AddOutputFilterByType directive

2002-02-18 Thread Justin Erenkrantz

On Mon, Feb 18, 2002 at 09:17:19PM -0800, Ian Holsman wrote:
> it doesn't do what it is supposed to do ALL the time.
> 
> for example.. take mod-status.
> 
> inside the handler it decides what type of content-type
> the program will return.

"bad module, fix module."

> this filterconfig won't be able to catch it.
> 
> I'm also sure that mod-cgi can not determine the content-type
> of a CGI program before the handler hook either.

Not much we can do here since the content-type is completely
unknown.

> so what I'm saying is this directive is going to confuse a WHOLE
> lot of people when it works some of the time.

No disagreement here, but I come back to "bad module, fix module."

> if this directive needs to be there (and I belive it should be)
> the filters themselves need to be added after the first bucket has been 
> passed to the brigade, as this is the only time we actually know for 
> certain what content-type we have.

I'm not terribly sure how workable this is.  I think that the
content-type needs to be set by the type_checker hook.  Then, the
fixups hook can be run knowing the content-type is valid.  If the
type_checker doesn't do what it says - mainly this:

"This routine is called to determine and/or set the various document
type information bits, like Content-type (via r->content_type),
language, et cetera."

Then we have issues.  But, I believe mod_status can be fixed
pretty trivially.  I'm not at all sure what to do about CGIs or
misbehaving modules.

>   *) add a filter (with a priority of -1 FTYPE_CONTENT) which just
>  checks the content-type and adds the other filter if it
>  passes.

If we have to do this, then I think our architecture is in poor
shape.  I don't think a filter is the right place to do this - there
is way too much overhead in a filter to do this logic.  And, we'd
also be adding a filter after we've sent data down the filter chain.

(Don't get me wrong, I see your rationale behind this, but if the
only way to do this is via another filter, we've got problems.)

I could be wrong, but I think the way I did it is the right way to
do this given what we have now.  Anyone else?  Am I wrong?  -- justin




problem with AddOutputFilterByType directive

2002-02-18 Thread Ian Holsman

sorry.. my mailer is being a royal pain..
so I just noticed this commit.

I've got a issue with this.

it doesn't do what it is supposed to do ALL the time.

for example.. take mod-status.

inside the handler it decides what type of content-type
the program will return.

this filterconfig won't be able to catch it.

I'm also sure that mod-cgi can not determine the content-type
of a CGI program before the handler hook either.

so what I'm saying is this directive is going to confuse a WHOLE
lot of people when it works some of the time.

if this directive needs to be there (and I belive it should be)
the filters themselves need to be added after the first bucket has been 
passed to the brigade, as this is the only time we actually know for 
certain what content-type we have.

so my proposistion is:
*) add a filter (with a priority of -1 FTYPE_CONTENT) which just
   checks the content-type and adds the other filter if it
   passes.

..Ian
-0.9 on the current commit.