Re: AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE

2016-01-20 Thread Micha Lenk

Hi Nick,

if the patch looks good, as you wrote, what is needed to get it applied 
to trunk and backported to 2.4.x?


Have you seen my follow-up questions in the other mail?

Best regards,
Micha


On 13.01.2016 22:44, Nick Kew wrote:

> On Wed, 2016-01-13 at 17:59 +0100, Micha Lenk wrote:
>
> > Hi,
> >
> > The directive AddOutputFilterByType can be used to insert filters to the
> > output filter chain depending on the content type of the HTTP response.
> > So far so good.
> >
> > PROBLEM DESCRIPTION
>
> This is probably worth a bugzilla entry.
>
> I think I can clarify a little of what's happened.
> AddOutputFilterByType was something of a hacked afterthought
> to filtering back in the days of httpd 2.0.  On the one hand,
> it met a need.  On the other hand, it worked only in a very
> limited range of situations where the content type was known
> at the time the filter was to be added.  It had no capacity
> to respond to other aspects of the content, or indeed the
> request/response.  And there were other issues.
>
> Then came mod_filter and a generalised framework.
> AddOutputFilterByType was now obsolete, but too widely-used
> to dispense with entirely.  As a compromise it was re-implemented
> within mod_filter, where it could co-exist with other dynamic
> filter configuration.
>
> Your observation tells us the semantics aren't quite compatible.
> And your patch looks good - thanks.




Re: AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE

2016-01-14 Thread Micha Lenk

Hi Nick,

On 13.01.2016 22:44, Nick Kew wrote:

> > PROBLEM DESCRIPTION
>
> This is probably worth a bugzilla entry.

Done. https://bz.apache.org/bugzilla/show_bug.cgi?id=58856

Nick, would you mind providing some insight on these comments from my
initial mail:

> For setups with both FilterDeclare and AddOutputFilterByType (as
> described above as fix), I observed some issues with properly merging
> the two filter harnesses. However, I have no clue what semantics the
> original author wanted to have in this situation.


Assuming my patch gets applied, the filter type should be set correctly
in the filter harness. But what if a user wants to override it? I have a
few questions in this context:


1.) Should "FilterDeclare" with filter-name "BYTYPE:DEFLATE" (i.e.
colliding with the implicit filter-name created by AddOutputFilterByType)
be supported at all?

1a.) If yes, the handling of AddOutputFilterByType needs to be fixed so
that:

 - a globally configured FilterDeclare is also effective for an
   AddOutputFilterByType within a (location) sub-container.
 - a filter type set by "FilterDeclare BYTYPE: type>"
   does not get overwritten by a subsequent
   "AddOutputFilterByType ".

1b.) If no, the code should detect and reject such configurations.

2.) On a related note to 1.), should FilterDeclare allow a filter-name of
an existing filter provider at all? If yes, what behavior would we expect?

Nick, I would be really glad if you could share your thoughts.

Best regards,
Micha



Re: AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE

2016-01-13 Thread Nick Kew
On Wed, 2016-01-13 at 17:59 +0100, Micha Lenk wrote:
> Hi,
> 
> The directive AddOutputFilterByType can be used to insert filters to the 
> output filter chain depending on the content type of the HTTP response. 
> So far so good.
> 
> PROBLEM DESCRIPTION

This is probably worth a bugzilla entry.

I think I can clarify a little of what's happened.
AddOutputFilterByType was something of a hacked afterthought
to filtering back in the days of httpd 2.0.  On the one hand,
it met a need.  On the other hand, it worked only in a very
limited range of situations where the content type was known
at the time the filter was to be added.  It had no capacity
to respond to other aspects of the content, or indeed the
request/response.  And there were other issues.

Then came mod_filter and a generalised framework.
AddOutputFilterByType was now obsolete, but too widely-used
to dispense with entirely.  As a compromise it was re-implemented
within mod_filter, where it could co-exist with other dynamic
filter configuration.

Your observation tells us the semantics aren't quite compatible.
And your patch looks good - thanks.

-- 
Nick Kew



Re: AddOutputFilterByType vs. proxy in 2.0.x

2007-10-09 Thread William A. Rowe, Jr.
Eric Covener wrote:
> 
> I'd like to backport to 2.0.x but it seems like there was a little bit
> of caution in the 2.2.x/trunk commits.

Propose it in STATUS and let's see where that goes.


Re: AddOutputFilterByType oddness

2004-09-24 Thread Justin Erenkrantz
--On Thursday, September 23, 2004 12:43 PM +0100 Nick Kew <[EMAIL PROTECTED]>
wrote:

> Basically it does the lookup/dispatch once per filter in the filterchain
> per request.  It checks that filter's providers until it finds a match.
> So for anything you could do with an [Add|Set]OutputFilter[ByType]
> that's one lookup per request.

Okay, so if I have three rules and ten filters, we'll be doing thirty checks,
right?  And, this will happen even if mod_filter isn't configured - as
mod_filter still needs to check ten times that it doesn't have anything to do,
right?  Hmm.  How expensive is this again?

> mod_filter takes the content-type as it is at that point in the chain.
>
> Isn't the real nightmare where a filter calls ap_set_content_type and
> some AddOutputFilterByTypes are in effect?  I guess what *really* bothers
> me is the idea of adding filters *as a side-effect*.

How wouldn't it be a side-effect?  It's intentional from the admin
perspective, but a side-effect from the developer's perspective.

> > And, then
> > mod_deflate needs to be conditionally added (sub-case #1: it needs to be
> > added for 'text/plain'; sub-case #2: it needs to be added for 'text/html').
> > How and where is it added?  Are you inserting dummy filters?
>
> I'm not sure I follow.  It will dispatch to deflate based on the
> content-type (or other dispatch criterion) as it is at that point
> in the chain.

The question is at which point in the chain does deflate get added?

> So if the handler sets application/xml but that goes through an XSLT
> filter which sets it to text/html, then mod_filter sees application/xml
> if it's before the XSLT filter in the chain, or text/html after it.
>
> How can AddOutputFilterByType expect to cope with that?

I thought you suggested that mod_filter could easily handle this case.  I'm
still not seeing how.

> But FWIW I have that working locally with
>
> FilterDeclare   filter1 Content-Type    CONTENT_SET
> FilterDeclare   filter2 Content-Length  CONTENT_SET
>
> FilterProvider  filter1 filter2 $text
> FilterProvider  filter2 DEFLATE >4000
>
> FilterChain filter1
>
> to deflate all "text/*" documents of 4k or greater.

Can I comment that I think a clearer configuration syntax is going to be
needed if we are going to axe all of the current filter directives?

AddOutputFilterByType, for all of its internal oddness, is a simple directive
for an administrator to understand.  So, perhaps keep 'AddOutputFilterByType'
and have it internally converted to a mod_filter directive.  But, I'm just
not overly excited about moving all filter configuration directives to
something akin to mod_rewrite.  Ouch.  -- justin


Re: AddOutputFilterByType oddness

2004-09-23 Thread Nick Kew
On Wed, 22 Sep 2004, Justin Erenkrantz wrote:

> --On Wednesday, September 22, 2004 6:17 PM +0100 Nick Kew
> <[EMAIL PROTECTED]> wrote:
>
> > It seems to me heavily counterintuitive that mixing ByType directives
> > with anything else means that the ByType filters *always* come last.
> > And that Remove won't affect them, but will affect others.
>
> I think we could get Remove*Filter to also delete the content-type filters.
>
> > Indeed.  mod_filter addresses this by configuring at the last moment,
> > so any earlier set_content_type()s are irrelevant.  I don't suppose it's
> > a panacea for everything, but I do think it's a significant improvement
> > on what we have.
>
> I'm concerned about the overhead of mod_filter having to check all of its
> rules each time a filter is invoked.  This is why I started to look through
> the code last night to see how it worked and how invasive it is.

It's improving with time (except when I introduce bugs...).  Merging in
the structs with util_filter saves on having to do superfluous lookups.

Basically it does the lookup/dispatch once per filter in the filterchain
per request.  It checks that filter's providers until it finds a match.
So for anything you could do with an [Add|Set]OutputFilter[ByType]
that's one lookup per request.
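
The once-per-harness dispatch described here can be sketched in a few lines of C. This is only an illustration under stated assumptions: the `provider` struct, its `must_contain` field and the `dispatch` function are hypothetical names, not mod_filter's actual types, and matching is reduced to a plain substring test.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified provider record: one candidate filter plus
 * the substring its dispatch value must contain to be selected. */
typedef struct provider {
    const char *filter_name;        /* e.g. "DEFLATE" */
    const char *must_contain;       /* e.g. "text/" */
    const struct provider *next;
} provider;

/* Walk the provider list once and return the first match, or NULL.
 * If checks is non-NULL it receives the number of providers examined. */
const provider *dispatch(const provider *list, const char *value, int *checks)
{
    int n = 0;
    const provider *p;

    for (p = list; p != NULL; p = p->next) {
        n++;
        if (value != NULL && strstr(value, p->must_contain) != NULL) {
            break;                  /* first matching provider wins */
        }
    }
    if (checks != NULL) {
        *checks = n;
    }
    return p;
}
```

In this toy model the cost per harness is bounded by the length of its own provider list, and the walk stops at the first hit.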

> How would you handle the situation when filter #1 sets C-T to be
> "text/plain" and then filter #2 sets C-T to be "text/html"?

mod_filter takes the content-type as it is at that point in the chain.

Isn't the real nightmare where a filter calls ap_set_content_type and
some AddOutputFilterByTypes are in effect?  I guess what *really* bothers
me is the idea of adding filters *as a side-effect*.

> And, then
> mod_deflate needs to be conditionally added (sub-case #1: it needs to be
> added for 'text/plain'; sub-case #2: it needs to be added for 'text/html').
> How and where is it added?  Are you inserting dummy filters?

I'm not sure I follow.  It will dispatch to deflate based on the
content-type (or other dispatch criterion) as it is at that point
in the chain.

So if the handler sets application/xml but that goes through an XSLT
filter which sets it to text/html, then mod_filter sees application/xml
if it's before the XSLT filter in the chain, or text/html after it.

How can AddOutputFilterByType expect to cope with that?
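
A small model may make the position dependence concrete. It is a sketch, not httpd code: the `stage` struct and `type_at` function are hypothetical, and a stage either rewrites the content type or leaves it untouched.

```c
#include <stddef.h>

/* Hypothetical model of one stage in an output chain: a stage may
 * rewrite the content type (new_type != NULL) or leave it alone. */
typedef struct {
    const char *name;
    const char *new_type;
} stage;

/* Return the content type visible to a dispatcher sitting at position
 * pos, i.e. after stages [0, pos) have run on the handler's output. */
const char *type_at(const stage *chain, size_t len, size_t pos,
                    const char *handler_type)
{
    const char *type = handler_type;
    size_t i;

    for (i = 0; i < len && i < pos; i++) {
        if (chain[i].new_type != NULL) {
            type = chain[i].new_type;
        }
    }
    return type;
}
```

With a chain whose first stage is an XSLT-like filter that sets text/html, a dispatcher at position 0 sees the handler's application/xml while one at position 1 or later sees text/html.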

>
> > From the user's perspective, it's simply more powerful and flexible.
> > Works with any request or response headers (not just content-type) or
> > environment variables.  Gets rid of constraints on ordering, like
> > AddOutputFilterbyType filter always coming after other filters
> > regardless of ordering in httpd.conf.
> >
> > Example: I have a user who wants to insert mod_deflate in a reverse
> > proxy, but only for selected content-types AND not if the content
> > length is below a threshold.  How would he do that with the old filter
> > framework?
>
> I guess I'm not clear what the syntax is (I guess I should go read the
> docs).

That particular scenario is complex, and requires mod_filter to be
used as its own provider.  The point is, we *can* now support complex
setups (or will be - that chaining is still broken in CVS).

But FWIW I have that working locally with

FilterDeclare   filter1 Content-Type    CONTENT_SET
FilterDeclare   filter2 Content-Length  CONTENT_SET

FilterProvider  filter1 filter2 $text
FilterProvider  filter2 DEFLATE >4000

FilterChain filter1

to deflate all "text/*" documents of 4k or greater.
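
The chained configuration amounts to a two-level decision, which the following C sketch spells out. The function names are hypothetical, and reading `$text` as "contains the string text" and `>4000` as "strictly greater than 4000" is an assumption about this draft syntax rather than something confirmed in the thread.

```c
#include <string.h>

/* First level, standing in for "FilterProvider filter1 filter2 $text":
 * the Content-Type must contain the substring "text". */
int type_matches(const char *content_type)
{
    return content_type != NULL && strstr(content_type, "text") != NULL;
}

/* Second level, standing in for "FilterProvider filter2 DEFLATE >4000".
 * Assumption: the comparison is strictly greater-than. */
int length_matches(long content_length)
{
    return content_length > 4000;
}

/* DEFLATE runs only when both levels select their provider. */
int should_deflate(const char *content_type, long content_length)
{
    return type_matches(content_type) && length_matches(content_length);
}
```

The point of the chaining is that filter1 never names DEFLATE directly: it hands matching responses to filter2, which applies the length test.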


> I definitely don't want to see the filters be configured like
> mod_rewrite.  It needs to be fairly straightforward, but still fairly
> simplistic.  I don't want to have users have to read a complicated manual
> or docs to set up filters.  KISS.

Indeed.  Do you think the examples in the manual page are too complex?

Bear in mind that the third example is no more complex than the first two,
yet suddenly enables a frequently-requested capability that simply isn't
possible with the old filtering.

> Well, the point by you committing it into our tree is that the rest of us
> are now responsible for it.  That's why I brought up the code style issue:

OK, OKOK!   I promise to look harder at the code style guidelines!
And I _did_ ask on the list a couple of weeks before introducing to CVS.

> I looked yesterday afternoon (and haven't seen any commits since then).  I

That'll be the latest version.  Which FWIW was introduced prematurely
because it introduced a new feature demanded by a user.  Only that turned
out to be broken, which is why I'm re-hacking that now.

-- 
Nick Kew


Re: AddOutputFilterByType oddness

2004-09-22 Thread Justin Erenkrantz
--On Wednesday, September 22, 2004 6:17 PM +0100 Nick Kew
<[EMAIL PROTECTED]> wrote:

> It seems to me heavily counterintuitive that mixing ByType directives
> with anything else means that the ByType filters *always* come last.
> And that Remove won't affect them, but will affect others.

I think we could get Remove*Filter to also delete the content-type filters.

> Indeed.  mod_filter addresses this by configuring at the last moment,
> so any earlier set_content_type()s are irrelevant.  I don't suppose it's
> a panacea for everything, but I do think it's a significant improvement
> on what we have.

I'm concerned about the overhead of mod_filter having to check all of its
rules each time a filter is invoked.  This is why I started to look through
the code last night to see how it worked and how invasive it is.

How would you handle the situation when filter #1 sets C-T to be
"text/plain" and then filter #2 sets C-T to be "text/html"?  And, then
mod_deflate needs to be conditionally added (sub-case #1: it needs to be
added for 'text/plain'; sub-case #2: it needs to be added for 'text/html').
How and where is it added?  Are you inserting dummy filters?

> From the user's perspective, it's simply more powerful and flexible.
> Works with any request or response headers (not just content-type) or
> environment variables.  Gets rid of constraints on ordering, like
> AddOutputFilterbyType filter always coming after other filters
> regardless of ordering in httpd.conf.
>
> Example: I have a user who wants to insert mod_deflate in a reverse
> proxy, but only for selected content-types AND not if the content
> length is below a threshold.  How would he do that with the old filter
> framework?

I guess I'm not clear what the syntax is (I guess I should go read the
docs).  I definitely don't want to see the filters be configured like
mod_rewrite.  It needs to be fairly straightforward, but still fairly
simplistic.  I don't want to have users have to read a complicated manual
or docs to set up filters.  KISS.

> From a developer's perspective, I wrote it for myself, and have at least
> two other developers using it operationally in their product.  Time will
> tell what others may use it for.

Well, the point by you committing it into our tree is that the rest of us
are now responsible for it.  That's why I brought up the code style issue:
we already have a number of modules that were never fully integrated or
reviewed.  And, then the person who dropped the code ran away and left the
code in a goofy state.  (See mod_rewrite, mod_ssl, mod_cache, etc.)

> When was that?  I made quite a lot of updates to the style towards
> conforming (like eliminating tabs and realigning some braces) before
> committing to CVS, but I'm willing to believe I need to look more
> carefully.

I looked yesterday afternoon (and haven't seen any commits since then).  I
will say the most distracting parts are the odd spacing (i.e. parenthesis
and semi-colons) as well as line spacing.  Unfortunately, I get distracted
by shiny things such as improper code style such that I can't focus on the
code itself.  =)  -- justin


Re: AddOutputFilterByType oddness

2004-09-22 Thread Nick Kew
On Wed, 22 Sep 2004, Justin Erenkrantz wrote:

> --On Wednesday, September 22, 2004 5:01 PM +0100 Nick Kew <[EMAIL PROTECTED]>
> wrote:
>
> > I've said it before and I'll say it again: AddOutputFilterByType is
> > fundamentally unsatisfactory.  This confusion is an effect, not cause.
>
> Suffice to say, I disagree.
>
> > * Configuration is inconsistent with other filter directives.  The
> >   relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
> >   and, from a user POV, broken.
>
> I think it's really clear from the user's perspective.  I think the problem
> comes in on the developer's side.

It seems to me heavily counterintuitive that mixing ByType directives
with anything else means that the ByType filters *always* come last.
And that Remove won't affect them, but will affect others.

> > * Tying it to ap_set_content_type is, to say the least, hairy.
> >   IMO we shouldn't *require* modules to call this, and it's utterly
> >   unreasonable to expect that it will never be called more than once
> >   for a request, given the number of modules that might take an interest.
> >   Especially when subrequests and internal redirects may be involved.
>
> We have *always* mandated that ap_set_content_type() should be called rather
> than setting r->content_type.  (I wish we could remove content_type from
> request_rec instead.)

Indeed.  But that doesn't prevent it being called multiple times, perhaps
from different modules.  So using it to insert filters leaves lots of
potential for trouble.

> > * It's a complexity just waiting for modules to break on it.
>
> Anything that depends upon content-type like this is going to be hairy because
> there may be several 'right' answers during the course of the request.

Indeed.  mod_filter addresses this by configuring at the last moment,
so any earlier set_content_type()s are irrelevant.  I don't suppose it's
a panacea for everything, but I do think it's a significant improvement
on what we have.
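
The difference between inserting filters as a side effect of setting the type and deciding once at the last moment can be shown with a toy model. Everything here is hypothetical (the `request` struct and all three functions); it does not reproduce httpd's real data structures or any duplicate checks the real code may perform.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FILTERS 8

/* Hypothetical request model: a content type plus the names of the
 * filters inserted so far. */
typedef struct {
    const char *content_type;
    const char *filters[MAX_FILTERS];
    int nfilters;
} request;

/* Eager model: every call that sets a matching type inserts the
 * filter as a side effect of the setter. */
void set_type_eager(request *r, const char *type)
{
    r->content_type = type;
    if (strncmp(type, "text/", 5) == 0 && r->nfilters < MAX_FILTERS) {
        r->filters[r->nfilters++] = "DEFLATE";
    }
}

/* Deferred model: the setter only records the type ... */
void set_type_deferred(request *r, const char *type)
{
    r->content_type = type;
}

/* ... and the decision is taken once, on the final type. */
void configure_late(request *r)
{
    if (r->content_type != NULL
        && strncmp(r->content_type, "text/", 5) == 0
        && r->nfilters < MAX_FILTERS) {
        r->filters[r->nfilters++] = "DEFLATE";
    }
}
```

In the eager model a request that is first typed text/plain and later image/png keeps the filter it picked up along the way; in the deferred model only the final type counts.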

> > I've made some more updates to mod_filter since I last posted on the
> > subject, and I'm getting some very positive feedback from real users.
> > For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
> > it with mod_filter.
>
> I've yet to see a clear and concise statement as to how mod_filter will solve
> this problem in a better and more efficient way.  (Especially from a user's
> perspective, but also from a developer's perspective.)

From the user's perspective, it's simply more powerful and flexible.
Works with any request or response headers (not just content-type) or
environment variables.  Gets rid of constraints on ordering, like
AddOutputFilterbyType filter always coming after other filters
regardless of ordering in httpd.conf.

Example: I have a user who wants to insert mod_deflate in a reverse
proxy, but only for selected content-types AND not if the content
length is below a threshold.  How would he do that with the old filter
framework?

From a developer's perspective, I wrote it for myself, and have at least
two other developers using it operationally in their product.  Time will
tell what others may use it for.

> I will also comment that I looked in the mod_filter code the other day and was
> disappointed that it doesn't follow our coding style at all or even have
> comments that help people understand what it is trying to do inside the .c
> file.

When was that?  I made quite a lot of updates to the style towards
conforming (like eliminating tabs and realigning some braces) before
committing to CVS, but I'm willing to believe I need to look more
carefully.

-- 
Nick Kew


Re: AddOutputFilterByType oddness

2004-09-22 Thread Justin Erenkrantz
--On Wednesday, September 22, 2004 5:01 PM +0100 Nick Kew <[EMAIL PROTECTED]>
wrote:

> I've said it before and I'll say it again: AddOutputFilterByType is
> fundamentally unsatisfactory.  This confusion is an effect, not cause.

Suffice to say, I disagree.

> * Configuration is inconsistent with other filter directives.  The
>   relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
>   and, from a user POV, broken.

I think it's really clear from the user's perspective.  I think the problem
comes in on the developer's side.

> * Tying it to ap_set_content_type is, to say the least, hairy.
>   IMO we shouldn't *require* modules to call this, and it's utterly
>   unreasonable to expect that it will never be called more than once
>   for a request, given the number of modules that might take an interest.
>   Especially when subrequests and internal redirects may be involved.

We have *always* mandated that ap_set_content_type() should be called rather
than setting r->content_type.  (I wish we could remove content_type from
request_rec instead.)

> * It's a complexity just waiting for modules to break on it.

Anything that depends upon content-type like this is going to be hairy because
there may be several 'right' answers during the course of the request.

> I've made some more updates to mod_filter since I last posted on the
> subject, and I'm getting some very positive feedback from real users.
> For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
> it with mod_filter.

I've yet to see a clear and concise statement as to how mod_filter will solve
this problem in a better and more efficient way.  (Especially from a user's
perspective, but also from a developer's perspective.)

> mod_filter can also obsolete [Set|Add|Remove]OutputFilter, though I'm
> in no hurry to do that.  What I can also do is re-implement all the
> outputfilter directives within mod_filter and its updated framework.

I will also comment that I looked in the mod_filter code the other day and was
disappointed that it doesn't follow our coding style at all or even have
comments that help people understand what it is trying to do inside the .c
file.  This all makes it very difficult to understand the code.  I'd greatly
appreciate it if mod_filter (and the code that you inserted elsewhere - i.e.
in util_filter.c) would conform to our style guidelines and had some comments
inside of it that say what it does (or is trying to do).

For example, some of the things it does just make no sense at all.
filter_bucket_type() is completely bogus and needs to be tossed.  The
type->name field in the bucket should be used instead.

I'm getting annoyed by people doing massive code drops (i.e. mod_filter,
mod_proxy, mod_auth_ldap, etc.) that don't conform to our code style and have
no comments.  It makes it much harder to go and fix bugs in 'em.  -- justin


Re: AddOutputFilterByType oddness

2004-09-22 Thread Nick Kew
On Sat, 18 Sep 2004, Justin Erenkrantz wrote:

> > But ap_add_output_filters_by_type() explicitly does nothing for a
> > proxied request.  Anyone know why?  "AddOutputFilterByType DEFLATE
> > text/plain text/html" seems to work as expected here for a forward proxy
> > with this applied: maybe I'm missing something fundamental...
>
> My recollection is initially it didn't have the proxy check, then FirstBill
> had a reason why proxied requests shouldn't work with AddOutputFilterByType.

I've said it before and I'll say it again: AddOutputFilterByType is
fundamentally unsatisfactory.  This confusion is an effect, not cause.

* Configuration is inconsistent with other filter directives.  The
  relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
  and, from a user POV, broken.
* Tying it to ap_set_content_type is, to say the least, hairy.
  IMO we shouldn't *require* modules to call this, and it's utterly
  unreasonable to expect that it will never be called more than once
  for a request, given the number of modules that might take an interest.
  Especially when subrequests and internal redirects may be involved.
* It's a complexity just waiting for modules to break on it.

I've made some more updates to mod_filter since I last posted on the
subject, and I'm getting some very positive feedback from real users.
For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
it with mod_filter.

mod_filter can also obsolete [Set|Add|Remove]OutputFilter, though I'm
in no hurry to do that.  What I can also do is re-implement all the
outputfilter directives within mod_filter and its updated framework.

-- 
Nick Kew


Re: AddOutputFilterByType oddness

2004-09-18 Thread Justin Erenkrantz
--On Thursday, September 16, 2004 5:11 PM +0100 Joe Orton <[EMAIL PROTECTED]>
wrote:

> But ap_add_output_filters_by_type() explicitly does nothing for a
> proxied request.  Anyone know why?  "AddOutputFilterByType DEFLATE
> text/plain text/html" seems to work as expected here for a forward proxy
> with this applied: maybe I'm missing something fundamental...

My recollection is initially it didn't have the proxy check, then FirstBill
had a reason why proxied requests shouldn't work with AddOutputFilterByType.

Would have to search the archives to remember why.  *sigh*  -- justin


Re: AddOutputFilterByType oddness

2004-09-16 Thread Joe Orton
On Wed, Aug 25, 2004 at 02:40:39PM +0200, Graham Leggett wrote:
> Justin Erenkrantz wrote:
> >Ultimately, all that is needed is a call to ap_set_content_type() before 
> >any bytes are written to the client to get AddOutputFilterByType to 
> >work. Perhaps with the recent momentum behind mod_proxy work, someone 
> >could investigate that and get mod_proxy fixed.
> 
> ap_set_content_type() is called on line 769 of proxy_http.c:

But ap_add_output_filters_by_type() explicitly does nothing for a
proxied request.  Anyone know why?  "AddOutputFilterByType DEFLATE
text/plain text/html" seems to work as expected here for a forward proxy
with this applied: maybe I'm missing something fundamental...

--- server/core.c~  2004-08-31 09:16:56.0 +0100
+++ server/core.c   2004-09-16 16:48:09.0 +0100
@@ -2875,11 +2875,10 @@
     conf = (core_dir_config *)ap_get_module_config(r->per_dir_config,
                                                    &core_module);
 
-    /* We can't do anything with proxy requests, no content-types or if
-     * we don't have a filter configured.
+    /* We can't do anything with no content-type or if we don't have a
+     * filter configured.
      */
-    if (r->proxyreq != PROXYREQ_NONE || !r->content_type ||
-        !conf->ct_output_filters) {
+    if (!r->content_type || !conf->ct_output_filters) {
         return;
     }
 
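
The effect of the patch on the early-return guard can be checked in isolation with a stand-alone sketch. The `guard_input` struct and both functions are hypothetical stand-ins for the fields the real condition reads; a `proxyreq` of 0 plays the role of PROXYREQ_NONE.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the request fields the guard reads. */
typedef struct {
    int proxyreq;                   /* 0 plays the role of PROXYREQ_NONE */
    const char *content_type;
    const void *ct_output_filters;
} guard_input;

/* Guard before the patch: returns 1 when by-type insertion is skipped. */
int skip_before(const guard_input *in)
{
    return in->proxyreq != 0 || !in->content_type || !in->ct_output_filters;
}

/* Guard after the patch: the proxy test is gone. */
int skip_after(const guard_input *in)
{
    return !in->content_type || !in->ct_output_filters;
}
```

The two guards differ only for proxied requests that already carry a content type and have by-type filters configured, which is exactly the forward-proxy case described in the message.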



Re: AddOutputFilterByType oddness

2004-08-25 Thread Graham Leggett
Justin Erenkrantz wrote:

> > Putting on an end user hat I see no reason why AddOutputFilterByType
> > shouldn't do exactly what it says it does.
>
> I believe it has more to do with mod_proxy than the filter design.  No
> one, at the time we added AddOutputFilterByType, wanted to rewrite
> mod_proxy to be knowledgeable about filters.

I wrote mod_proxy to be knowledgeable about filters shortly after v2.0
came about, it was one of the first major modules to support filters.

> Ultimately, all that is needed is a call to ap_set_content_type() before
> any bytes are written to the client to get AddOutputFilterByType to
> work. Perhaps with the recent momentum behind mod_proxy work, someone
> could investigate that and get mod_proxy fixed.

ap_set_content_type() is called on line 769 of proxy_http.c:

    if ((buf = apr_table_get(r->headers_out, "Content-Type"))) {
        ap_set_content_type(r, apr_pstrdup(p, buf));
    }

Is there anything else that needs to be done to make
AddOutputFilterByType work?

Is apr_table_get() case sensitive?
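
On the case question: APR tables compare keys without regard to case, so a lookup for "Content-Type" also finds an entry stored as "content-type". The sketch below only mimics that behaviour with a hypothetical `header` array and `header_get` function; it is not the APR implementation.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for a header table; not the APR type. */
typedef struct {
    const char *key;
    const char *val;
} header;

/* Compare two keys, ignoring ASCII case. Returns 1 when equal. */
int keys_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;                /* both must end together */
}

/* Return the value of the first entry whose key matches, or NULL. */
const char *header_get(const header *t, size_t n, const char *key)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (keys_equal(t[i].key, key)) {
            return t[i].val;
        }
    }
    return NULL;
}
```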
Regards,
Graham
--




Re: AddOutputFilterByType oddness

2004-08-24 Thread Jess Holle
If I understand this correctly, this is a necessity for
mod_proxy/mod_proxy_ajp to replace mod_jk; otherwise it would be a
significant regression from mod_jk (where this issue was fixed last
year, as I recall).

--
Jess Holle
Justin Erenkrantz wrote:

> --On Tuesday, August 24, 2004 12:20 PM +0200 Graham Leggett
> <[EMAIL PROTECTED]> wrote:
>
> > Putting on an end user hat I see no reason why AddOutputFilterByType
> > shouldn't do exactly what it says it does.
>
> I believe it has more to do with mod_proxy than the filter design.  No
> one, at the time we added AddOutputFilterByType, wanted to rewrite
> mod_proxy to be knowledgeable about filters.
>
> Ultimately, all that is needed is a call to ap_set_content_type()
> before any bytes are written to the client to get
> AddOutputFilterByType to work. Perhaps with the recent momentum behind
> mod_proxy work, someone could investigate that and get mod_proxy
> fixed.  -- justin



Re: AddOutputFilterByType oddness

2004-08-24 Thread Justin Erenkrantz
--On Tuesday, August 24, 2004 12:20 PM +0200 Graham Leggett
<[EMAIL PROTECTED]> wrote:

> Putting on an end user hat I see no reason why AddOutputFilterByType
> shouldn't do exactly what it says it does.

I believe it has more to do with mod_proxy than the filter design.  No one,
at the time we added AddOutputFilterByType, wanted to rewrite mod_proxy to
be knowledgeable about filters.

Ultimately, all that is needed is a call to ap_set_content_type() before
any bytes are written to the client to get AddOutputFilterByType to work.
Perhaps with the recent momentum behind mod_proxy work, someone could
investigate that and get mod_proxy fixed.  -- justin


Re: AddOutputFilterByType oddness

2004-08-24 Thread Geoffrey Young


Graham Leggett wrote:
> Nick Kew wrote:
> 
>>> I have just set up the most recent httpd v2.0.51-dev tree, and have
>>> configured a filter that strips leading whitespace from HTML:
>>>
>>> AddOutputFilterByType STRIP text/html
>>>
>>> The content is served by mod_proxy.
> 
> 
>> As it stands, that can't work.
> 
> 
> Then as it stands filters are broken.
> 
> Putting on an end user hat I see no reason why AddOutputFilterByType
> shouldn't do exactly what it says it does.

for the record, I know of other cases where AddOutputFilterByType just
doesn't cut it, specifically wrt filter_init.  see

  http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=107090791508163&w=2

for more details.  while I'm trying to address a separate issue there, the
example tarball shows that AddOutputFilterByType is broken for even some
core module setups.

HTH

--Geoff


Re: AddOutputFilterByType oddness

2004-08-24 Thread Graham Leggett
Nick Kew wrote:

> > I have just set up the most recent httpd v2.0.51-dev tree, and have
> > configured a filter that strips leading whitespace from HTML:
> >
> > AddOutputFilterByType STRIP text/html
> >
> > The content is served by mod_proxy.
>
> As it stands, that can't work.

Then as it stands filters are broken.

Putting on an end user hat I see no reason why AddOutputFilterByType
shouldn't do exactly what it says it does.

> It's a manifestation of the problem I'm addressing by reviewing
> the filter architecture: see http://www.apachetutor.org/dev/smart-filter
> and the "Ideas for smart filtering" thread here.

Reading the above, it seems that people are allergic to having filters
look at headers to decide whether they should be valid or not.

Having a totally generic non HTTP filter sounds like a nice idea, but in
practice it's a real pain in the ass. The filters need the knowledge
contained in the headers regardless otherwise they simply won't work.
They can either access the headers directly, or they can access some
generic interface that warps the headers into something generic for the
filters to access. Right now it seems filters do neither.

This is really annoying for an end user. Having developed the filter we
need for our application, we deploy it and now find we cannot use it.
For us it's back to the drawing board. :(

Regards,
Graham
--




Re: AddOutputFilterByType oddness

2004-08-24 Thread Nick Kew
On Tue, 24 Aug 2004, Nick Kew wrote:

> I actually have an implementation based on the discussion document and
> addressing the concerns people raised in the thread.  I hope to find
> time to finish the accompanying documentation and post it here round
> about this coming weekend.

OK, since you seem to have a real-life use for it, here goes.  As I
said before, I wasn't planning to post without a little more testing
and accompanying documents and discussion, but what the ?
I'm sure I'll regret this premature posting 

Mini-Synopsis:


# 1. Declare a smart filter that dispatches on Content-Type
FilterDeclare   myfilter        Content-Type


# 2. Declare your filter as a Provider, to run whenever Content-Type
#    includes the string "text/html"
FilterProvider  myfilter        STRIP   $text/html


# 3. Set the smart filter chain to this filter where you want to apply it

FilterChain =myfilter
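
The "includes the string" rule in step 2 can be illustrated with a short C sketch. It assumes that a leading `$` in the match expression means "contains" and, purely for illustration, treats any other expression as an exact string; `match_rule`, `parse_rule` and `rule_matches` are hypothetical names and do not come from the attached module.

```c
#include <stddef.h>
#include <string.h>

typedef enum { MATCH_EXACT, MATCH_CONTAINS } match_kind;

typedef struct {
    match_kind kind;
    const char *pattern;
} match_rule;

/* Hypothetical parser: "$foo" means "contains foo"; anything else is
 * treated here as an exact string (an assumption of this sketch). */
match_rule parse_rule(const char *expr)
{
    match_rule r;

    if (expr[0] == '$') {
        r.kind = MATCH_CONTAINS;
        r.pattern = expr + 1;       /* skip the '$' marker */
    } else {
        r.kind = MATCH_EXACT;
        r.pattern = expr;
    }
    return r;
}

/* Apply a parsed rule to a dispatch value such as a Content-Type. */
int rule_matches(const match_rule *r, const char *value)
{
    if (value == NULL) {
        return 0;
    }
    if (r->kind == MATCH_CONTAINS) {
        return strstr(value, r->pattern) != NULL;
    }
    return strcmp(value, r->pattern) == 0;
}
```

A substring rule is what lets `$text/html` also select responses whose header carries a charset parameter, such as "text/html; charset=ISO-8859-1".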


-- 
Nick Kew

/*  Copyright (C) 2004 Nick Kew

This is experimental code.  It may be copied and used only for
evaluation and testing purposes.

The copyright holder offers to the Apache Software Foundation
permission to re-license this code under the ASF license. 
This offer applies if and when the ASF accepts this code or
any derived work for inclusion in a future release of HTTPD.

Regardless of the above, the author undertakes to release the
work under a recognised open-source license in due course.
Information will be available at http://apache.webthing.com/
and/or http://dev.apache.org/~niq/
*/
#include 
#include 

/* apache */
#include 
#include 
#include 
#include 
#include 
#include 

module AP_MODULE_DECLARE_DATA filter_module ;


#ifndef NO_PROTOCOL
#define PROTO_CHANGE 0x1
#define PROTO_CHANGE_LENGTH 0x2
#define PROTO_NO_BYTERANGE 0x4
#define PROTO_NO_PROXY 0x8
#define PROTO_NO_CACHE 0x10
#define PROTO_TRANSFORM 0x20
#endif

typedef apr_status_t (*filter_func_t)(ap_filter_t*, apr_bucket_brigade*) ;

typedef struct {
  const char* name ;
  filter_func_t func ;
  void* fctx ;
} harness_ctx ;

typedef struct mod_filter_provider {
  enum {
    STRING_MATCH,
    STRING_CONTAINS,
    REGEX_MATCH,
    INT_EQ,
    INT_LE,
    INT_GE,
    DEFINED
  } match_type ;
  union {
    const char* c ;
    regex_t* r ;
    int i ;
  } match ;
  ap_filter_rec_t* frec ;
  struct mod_filter_provider* next ;
#ifndef NO_PROTOCOL
  unsigned int proto_flags ;
#endif
} mod_filter_provider ;

typedef struct {
  ap_filter_rec_t frec ;
  enum {
    REQUEST_HEADERS,
    RESPONSE_HEADERS,
    SUBPROCESS_ENV,
    CONTENT_TYPE
  } dispatch ;
  const char* value ;
  mod_filter_provider* providers ;
#ifndef NO_PROTOCOL
  unsigned int proto_flags ;
  const char* range ;
#endif
} mod_filter_rec ;

typedef struct mod_filter_chain {
  const char* fname ;
  struct mod_filter_chain* next ;
} mod_filter_chain ;

typedef struct {
  apr_hash_t* live_filters ;
  mod_filter_chain* chain ;
} mod_filter_cfg ;

static int filter_init(ap_filter_t* f) {
  mod_filter_provider* p ;
  int err = OK ;  /* stays OK if the filter has no providers */
  harness_ctx* ctx = f->ctx ;
  mod_filter_cfg* cfg
    = ap_get_module_config(f->r->per_dir_config, &filter_module);
  mod_filter_rec* filter
    = apr_hash_get(cfg->live_filters, ctx->name, APR_HASH_KEY_STRING) ;
  for ( p = filter->providers ; p ; p = p->next ) {
    if ( p->frec->filter_init_func ) {
      if ( err = p->frec->filter_init_func(f), err != OK ) {
        break ; /* if anyone errors out here, so do we */
      }
    }
  }
  return err ;
}

static filter_func_t filter_lookup(request_rec* r, mod_filter_rec* filter) {
  mod_filter_provider* provider ;
  const char* str ;
  const char* cachecontrol ;
  int match ;
  unsigned int proto_flags ;

  /* Check registered providers in order */
  for ( provider = filter->providers; provider; provider = provider->next) {
    match = 1 ;
    switch ( filter->dispatch ) {
      case REQUEST_HEADERS:
        str = apr_table_get(r->headers_in, filter->value) ;
        break ;
      case RESPONSE_HEADERS:
        str = apr_table_get(r->headers_out, filter->value) ;
        break ;
      case SUBPROCESS_ENV:
        str = apr_table_get(r->subprocess_env, filter->value) ;
        break ;
      case CONTENT_TYPE:
        str = r->content_type ;
        break ;
    }
    /* treat nulls so we don't have to check every strcmp individually
       Not sure if there's anything better to do with them
    */
    if ( str == NULL ) {
      if ( provider->match_type == DEFINED ) {
        if ( provider->match.c != NULL ) {
          match = 0 ;
        }
      }
    } else if ( provider->match.c == NULL ) {
      match = 0 ;
    } else {
      /* Now we have no nulls, so we can do string and regexp matching */
      switch ( provider->match_type ) {
        case STRING_MATCH:
          if ( strcasecmp(str, provider->match.c) ) {
            match = 0 ;
          }
          break ;
        case STRING_CONTA

Re: AddOutputFilterByType oddness

2004-08-24 Thread Graham Leggett
William A. Rowe, Jr. wrote:

>>> Is your DefaultType set to text/html?
>>
>> It's set like so:
>>
>> DefaultType text/plain
>
> You are proxying content?  What does the HEAD /image.gif HTTP/1.0
> report for content type from the backend server?

It says this:

[EMAIL PROTECTED] root]# telnet gatekeeper.fma.co.za 80
Trying 196.30.143.210...
Connected to gatekeeper.fma.co.za.
Escape character is '^]'.
HEAD /patricia/policy/images/tabaccounting1.gif HTTP/1.1
Host: gatekeeper.fma.co.za

HTTP/1.1 200 OK
Date: Tue, 24 Aug 2004 10:05:54 GMT
Server: Apache-Coyote/1.1
ETag: W/"1636-1092965561000"
Last-Modified: Fri, 20 Aug 2004 01:32:41 GMT
Content-Type: image/gif
Connection: close

Connection closed by foreign host.

Regards,
Graham
--




Re: AddOutputFilterByType oddness

2004-08-24 Thread Nick Kew
On Tue, 24 Aug 2004, Graham Leggett wrote:

> I have just set up the most recent httpd v2.0.51-dev tree, and have
> configured a filter that strips leading whitespace from HTML:
>
> AddOutputFilterByType STRIP text/html
>
> The content is served by mod_proxy.

As it stands, that can't work.

It's a manifestation of the problem I'm addressing by reviewing
the filter architecture: see http://www.apachetutor.org/dev/smart-filter
and the "Ideas for smart filtering" thread here.

I actually have an implementation based on the discussion document and
addressing the concerns people raised in the thread.  I hope to find
time to finish the accompanying documentation and post it here round
about this coming weekend.

> http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
>
> it says that filters are not applied by proxied requests (It does not
> give a reason why not).

The URL above makes it clear what's happening there.
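
A rough sketch of the underlying idea, assuming (as the smart-filter discussion suggests) that the content type of a proxied response may only become known after the filter chain has been set up: instead of deciding when the filter is inserted, a harness can postpone the decision until the first data passes through. This is a standalone model, not httpd code; `response`, `lazy_harness`, `harness_init` and `harness_on_data` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the part of a response the harness inspects. */
typedef struct {
    const char *content_type;   /* NULL until the handler or backend sets it */
} response;

/* Hypothetical harness state wrapped around one real filter. */
typedef struct {
    const char *wanted_type;    /* type prefix to match, e.g. "text/html" */
    int decided;                /* 0 until the first data has been seen */
    int active;                 /* 1 if the wrapped filter should run */
} lazy_harness;

static void harness_init(lazy_harness *h, const char *wanted_type) {
    assert(h != NULL && wanted_type != NULL);
    h->wanted_type = wanted_type;
    h->decided = 0;
    h->active = 0;
}

/* Called each time data passes through the harness.  The match is made
 * once, on the first call, and the result is kept for later calls.
 * Returns 1 if the wrapped filter should process the data. */
static int harness_on_data(lazy_harness *h, const response *resp) {
    assert(h != NULL && resp != NULL);
    if (!h->decided) {
        h->active = resp->content_type != NULL
                 && strncmp(resp->content_type, h->wanted_type,
                            strlen(h->wanted_type)) == 0;
        h->decided = 1;
    }
    return h->active;
}
```

With this shape, an image/gif response that only reveals its type once the backend has answered passes through untouched, because the check runs when data arrives and not when the chain is built.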

-- 
Nick Kew


Re: AddOutputFilterByType oddness

2004-08-23 Thread William A. Rowe, Jr.
At 07:23 PM 8/23/2004, Graham Leggett wrote:
>Paul Querna wrote:
>
>>Is your DefaultType set to text/html?
>
>It's set like so:
>
>DefaultType text/plain

You are proxying content?  What does the HEAD /image.gif HTTP/1.0
report for content type from the backend server?

Bill




Re: AddOutputFilterByType oddness

2004-08-23 Thread Graham Leggett
Paul Querna wrote:

> Is your DefaultType set to text/html?

It's set like so:

DefaultType text/plain

> On Tue, 2004-08-24 at 01:54 +0200, Graham Leggett wrote:
>> Hi all,
>>
>> I have just set up the most recent httpd v2.0.51-dev tree, and have
>> configured a filter that strips leading whitespace from HTML:
>>
>> AddOutputFilterByType STRIP text/html
>>
>> The content is served by mod_proxy.
>>
>> This seems to work fine for HTML requests, but I have noticed that this
>> filter is also being applied to images as well (thus corrupting them).
>> Why would the above directive apply to all content, instead of text/html
>> only as is configured?
>>
>> Looking at the following docs:
>>
>> http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
>>
>> it says that filters are not applied by proxied requests (It does not
>> give a reason why not). From the test above however this statement is
>> false, filters are applied to proxy requests - all proxied requests.
>>
>> Am I doing something wrong, or is AddOutputFilterByType broken?

Regards,
Graham
--




Re: AddOutputFilterByType oddness

2004-08-23 Thread Paul Querna
Is your DefaultType set to text/html?

On Tue, 2004-08-24 at 01:54 +0200, Graham Leggett wrote:
> Hi all,
> 
> I have just set up the most recent httpd v2.0.51-dev tree, and have 
> configured a filter that strips leading whitespace from HTML:
> 
> AddOutputFilterByType STRIP text/html
> 
> The content is served by mod_proxy.
> 
> This seems to work fine for HTML requests, but I have noticed that this 
> filter is also being applied to images as well (thus corrupting them). 
> Why would the above directive apply to all content, instead of text/html 
> only as is configured?
> 
> Looking at the following docs:
> 
> http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
> 
> it says that filters are not applied by proxied requests (It does not 
> give a reason why not). From the test above however this statement is 
> false, filters are applied to proxy requests - all proxied requests.
> 
> Am I doing something wrong, or is AddOutputFilterByType broken?
> 
> Regards,
> Graham
> --



Re: AddOutputFilterByType and mod_jk [was Re: mod_deflate with mod_jk]

2003-01-06 Thread Henri Gomez
Aditya wrote:

>> On Mon, 09 Dec 2002 16:29:40 +0100, Henri Gomez <[EMAIL PROTECTED]> said:
>> BTW, I updated mod_jk 1.2.2-dev, 2.0.4-dev and also mod_webapp to
>> set the content type the correct way, previously there was a direct
>> set of content-type and I now use ap_set_content_type :
>>
>> hgomez 2002/12/09 05:19:18
>>
>>   Modified: jk/native/apache-2.0 mod_jk.c
>>   Log: Make jk works with filters in Apache 2.0, ie mod_deflate and
>>        AddOutputFilterByType DEFLATE text/html.
>
> I can confirm this now "does the right thing" with Apache 2.0.39 under
> FreeBSD running mod_jk from CVS HEAD and mod-xslt (www.mod-xslt.com)
> as an output filter for content of type text/xml

Thanks for the confirmation.

BTW, Happy New Year to all of you





RE: AddOutputFilterByType

2002-03-05 Thread William A. Rowe, Jr.


> > > What would be most cool is to set an r->replace_request member to the
> > > subrequest we will run.  Then in the run request phase, look at
> > > replace_request and run the insert_filters/run_handler against that
> > > replacement.
> >
> > That could be goodness.  I agree that I'm not sure if it is really
> > that trivial.  Perhaps.  Well, actually, I know it won't be since
> > some filters do different things when r->main is set.  I can see
> > rbb saying, "Those filters are broken."  Is that the case?  -- justin
>
>No that isn't the case, and the fix isn't that trivial.  It isn't
>impossible, but it adds an if to every single hook call.

Or would it... your griping about it [both of you] suddenly sparked some
insight... if the return code is != 0 then we test the various cases.

The mainline OK case remains unhindered :)
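
To make the point concrete, here is a standalone sketch (not httpd code) of a hook runner in which the replacement check sits only on the non-zero path, so a hook returning OK costs one comparison and nothing more. `HOOK_REPLACED`, `req`, `run_hooks` and the two sample hooks are hypothetical names invented for this example; only the `replace_request` member name comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

#define HOOK_OK       0   /* mainline result, as with OK in httpd */
#define HOOK_REPLACED 1   /* hypothetical: "run the replacement instead" */
#define HOOK_ERROR    2   /* hypothetical: give up on this request */

typedef struct req {
    struct req *replace_request;  /* set by a hook that wants a redirect */
    int hooks_run;                /* bookkeeping for the example only */
} req;

typedef int (*hook_fn)(req *);

/* Runs hooks in order.  Returns the request that should go on to be
 * handled: r itself, its replacement, or NULL on error. */
static req *run_hooks(req *r, hook_fn *hooks, size_t n) {
    assert(r != NULL && hooks != NULL);
    for (size_t i = 0; i < n; ++i) {
        int rc = hooks[i](r);
        r->hooks_run++;
        if (rc == HOOK_OK) {
            continue;             /* mainline: no extra tests here */
        }
        /* Only a non-zero return code pays for the special cases. */
        if (rc == HOOK_REPLACED && r->replace_request != NULL) {
            return r->replace_request;
        }
        return NULL;
    }
    return r;
}

/* Sample hooks for the example. */
static req example_subrequest = { NULL, 0 };

static int hook_pass(req *r) {
    (void)r;
    return HOOK_OK;
}

static int hook_redirect(req *r) {
    r->replace_request = &example_subrequest;
    return HOOK_REPLACED;
}
```

Once a hook reports a replacement, the remaining hooks for the main request are skipped, which matches the earlier suggestion to stop processing the main request after a fast internal redirect.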

Bill




RE: AddOutputFilterByType

2002-03-05 Thread Ryan Bloom


> > Rbb and I chatted about this earlier today.  It seems like once the decision
> > is reached that we have a fast internal redirect, we should _stop_ processing
> > that main request.  Obviously some well defined return value from the hook
> > fn, similar to DONE but not quite the same.
> 
> Ahem, wasn't I saying that?  ;-)  Yes, I think we need this too.
> 
> > What would be most cool is to set an r->replace_request member to the
> > subrequest we will run.  Then in the run request phase, look at
> > replace_request and run the insert_filters/run_handler against that
> > replacement.
> 
> That could be goodness.  I agree that I'm not sure if it is really
> that trivial.  Perhaps.  Well, actually, I know it won't be since
> some filters do different things when r->main is set.  I can see
> rbb saying, "Those filters are broken."  Is that the case?  -- justin

No that isn't the case, and the fix isn't that trivial.  It isn't
impossible, but it adds an if to every single hook call.

Ryan





Re: AddOutputFilterByType

2002-03-05 Thread Justin Erenkrantz

On Tue, Mar 05, 2002 at 04:43:50PM -0600, William A. Rowe, Jr. wrote:
> Cut it out already :-)

I'll try.

> Rbb and I chatted about this earlier today.  It seems like once the decision
> is reached that we have a fast internal redirect, we should _stop_ 
> processing
> that main request.  Obviously some well defined return value from the hook
> fn, similar to DONE but not quite the same.

Ahem, wasn't I saying that?  ;-)  Yes, I think we need this too.

> What would be most cool is to set an r->replace_request member to the
> subrequest we will run.  Then in the run request phase, look at
> replace_request and run the insert_filters/run_handler against that 
> replacement.

That could be goodness.  I agree that I'm not sure if it is really
that trivial.  Perhaps.  Well, actually, I know it won't be since
some filters do different things when r->main is set.  I can see
rbb saying, "Those filters are broken."  Is that the case?  -- justin




Re: AddOutputFilterByType

2002-03-05 Thread William A. Rowe, Jr.

At 04:29 PM 3/5/2002, you wrote:

>Also, why does mod_negotiation handle fixups in type_checker and
>mod_dir does it in fixups?  I'd suggest that we should move the
>handle_multi call to be a fixups so that we are consistent where
>our redirect calls occur.  -- justin

Cut it out already :-)

We don't know if some other module is -interested- in that dir... so mod_dir
will wait till fixups to say "hey - I can do that!" and auth is run for the dir
as well as its index.html.

Negotiation decides way up in type_checker that yea - that's a multiview,
and this is what we will do with it.

Rbb and I chatted about this earlier today.  It seems like once the decision
is reached that we have a fast internal redirect, we should _stop_ processing
that main request.  Obviously some well defined return value from the hook
fn, similar to DONE but not quite the same.

What would be most cool is to set an r->replace_request member to the
subrequest we will run.  Then in the run request phase, look at
replace_request and run the insert_filters/run_handler against that 
replacement.

Need to spend 2 hours away from work + apache ... I'll check in later and
see if the solution is really that trivial.

Bill




Re: AddOutputFilterByType

2002-03-05 Thread Justin Erenkrantz

On Tue, Mar 05, 2002 at 07:04:43AM -0800, Ryan Bloom wrote:
> 
> Why is this thing being run in the fixups phase?  The whole point of the
> insert_filters phase is to insert filters for the given resource.  Why
> are we trying to insert resource based filters in the fixups?
> Especially given that the resource can change during the fixups phase.

True, this is where OtherBill suggested to me where it should go.
But now that I understand our filtering hooks a bit better, I now
agree that insert_filter makes more sense.  However, should we
attempt to abstract setting r->content_type so that we can intercept
whenever the content_type is modified and add the right filters as
dictated by AddOutputFilterByType?  I have some reservations about
that though, but it might solve our filter changing the content-type
on us.  
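
One way to picture the "abstract setting r->content_type" idea is a setter that everyone must go through, which re-runs filter selection whenever the type actually changes. The following is a standalone sketch under that assumption, not httpd code; `request`, `set_content_type` and `refresh_filters_by_type` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced request record. */
typedef struct {
    const char *content_type;
    int filters_refreshed;   /* how often filter selection was re-run */
} request;

/* Hypothetical hook point: a real server would add the filters
 * configured for the new type here.  The sketch only counts calls. */
static void refresh_filters_by_type(request *r) {
    r->filters_refreshed++;
}

/* The single place where the content type may be changed.  Filter
 * selection is re-run only if the value really differs. */
static void set_content_type(request *r, const char *ct) {
    assert(r != NULL);
    int changed = (r->content_type == NULL) != (ct == NULL)
               || (ct != NULL && r->content_type != NULL
                   && strcmp(r->content_type, ct) != 0);
    if (!changed) {
        return;
    }
    r->content_type = ct;
    refresh_filters_by_type(r);
}
```

The reservation raised above still applies to this shape: a filter that changes the type halfway through a response would trigger the refresh after some output has already gone past.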

Also, why does mod_negotiation handle fixups in type_checker and
mod_dir does it in fixups?  I'd suggest that we should move the
handle_multi call to be a fixups so that we are consistent where
our redirect calls occur.  -- justin