Re: AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE
Hi Nick,

if the patch looks good, as you wrote, what is needed to get it applied to trunk and backported to 2.4.x? Have you seen my follow-up questions in the other mail?

Best regards,
Micha

On 13.01.2016 at 22:44, Nick Kew wrote:
> On Wed, 2016-01-13 at 17:59 +0100, Micha Lenk wrote:
> > Hi,
> >
> > The directive AddOutputFilterByType can be used to insert filters to the
> > output filter chain depending on the content type of the HTTP response.
> > So far so good.
> >
> > PROBLEM DESCRIPTION
>
> This is probably worth a bugzilla entry.
>
> I think I can clarify a little of what's happened. AddOutputFilterByType was something of a hacked afterthought to filtering back in the days of httpd 2.0. On the one hand, it met a need. On the other hand, it worked only in a very limited range of situations where the content type was known at the time the filter was to be added. It had no capacity to respond to other aspects of the content, or indeed the request/response. And there were other issues.
>
> Then came mod_filter and a generalised framework. AddOutputFilterByType was now obsolete, but too widely-used to dispense with entirely. As a compromise it was re-implemented within mod_filter, where it could co-exist with other dynamic filter configuration.
>
> Your observation tells us the semantics aren't quite compatible. And your patch looks good - thanks.
Re: AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE
Hi Nick,

On 13.01.2016 at 22:44, Nick Kew wrote:
> > PROBLEM DESCRIPTION
>
> This is probably worth a bugzilla entry.

Done. https://bz.apache.org/bugzilla/show_bug.cgi?id=58856

Nick, would you mind providing some insight on these comments from my initial mail:

For setups with both FilterDeclare and AddOutputFilterByType (as described above as fix), I observed some issues with properly merging the two filter harnesses. However, I have no clue what semantics the original author wanted to have in this situation.

Assuming that my patch gets applied, the filter type should be correctly set in the filter harness. But what if a user wants to override it? I have a few questions in this context:

1.) Should "FilterDeclare" with filter-name "BYTYPE:DEFLATE" (i.e. colliding with the implicit filter-name created by AddOutputFilterByType) be supported at all?

1a.) If yes, the handling of AddOutputFilterByType needs to be fixed so that:
  - a globally configured FilterDeclare is also effective for an AddOutputFilterByType within a (location) sub-container.
  - a filter type set by "FilterDeclare BYTYPE:<filter-name> <type>" does not get overwritten by a subsequent "AddOutputFilterByType" directive for the same filter.

1b.) If no, the code should detect and reject such configurations.

2.) On a related note to 1.: should FilterDeclare allow a filter-name of existing filter providers at all? If yes, what behavior would we expect?

Nick, I would be really glad if you could share your thoughts.

Best regards,
Micha
Re: AddOutputFilterByType in Apache 2.4 inserts filters as AP_FTYPE_RESOURCE
On Wed, 2016-01-13 at 17:59 +0100, Micha Lenk wrote: > Hi, > > The directive AddOutputFilterByType can be used to insert filters to the > output filter chain depending on the content type of the HTTP response. > So far so good. > > PROBLEM DESCRIPTION This is probably worth a bugzilla entry. I think I can clarify a little of what's happened. AddOutputFilterByType was something of a hacked afterthought to filtering back in the days of httpd 2.0. On the one hand, it met a need. On the other hand, it worked only in a very limited range of situations where the content type was known at the time the filter was to be added. It had no capacity to respond to other aspects of the content, or indeed the request/response. And there were other issues. Then came mod_filter and a generalised framework. AddOutputFilterByType was now obsolete, but too widely-used to dispense with entirely. As a compromise it was re-implemented within mod_filter, where it could co-exist with other dynamic filter configuration. Your observation tells us the semantics aren't quite compatible. And your patch looks good - thanks. -- Nick Kew
Re: AddOutputFilterByType vs. proxy in 2.0.x
Eric Covener wrote: > > I'd like to backport to 2.0.x but it seems like there was a little bit > of caution in the 2.2.x/trunk commits. Propose it in STATUS and let's see where that goes.
Re: AddOutputFilterByType oddness
--On Thursday, September 23, 2004 12:43 PM +0100 Nick Kew <[EMAIL PROTECTED]> wrote:

> Basically it does the lookup/dispatch once per filter in the filterchain per request. It checks that filter's providers until it finds a match. So for anything you could do with an [Add|Set]OutputFilter[ByType] that's one lookup per request.

Okay, so if I have three rules and ten filters, we'll be doing thirty checks, right? And, this will happen even if mod_filter isn't configured - as mod_filter still needs to check ten times that it doesn't have anything to do, right? Hmm. How expensive is this again?

> mod_filter takes the content-type as it is at that point in the chain. Isn't the real nightmare where a filter calls ap_set_content_type and some AddOutputFilterByTypes are in effect? I guess what *really* bothers me is the idea of adding filters *as a side-effect*.

How wouldn't it be a side-effect? It's intentional from the admin perspective, but a side-effect from the developer's perspective.

> > And, then mod_deflate needs to be conditionally added (sub-case #1: it needs to be added for 'text/plain'; sub-case #2: it needs to be added for 'text/html'). How and where is it added? Are you inserting dummy filters?
>
> I'm not sure I follow. It will dispatch to deflate based on the content-type (or other dispatch criterion) as it is at that point in the chain.

The question is at which point in the chain does deflate get added?

> So if the handler sets application/xml but that goes through an XSLT filter which sets it to text/html, then mod_filter sees application/xml if it's before the XSLT filter in the chain, or text/html after it. How can AddOutputFilterByType expect to cope with that?

I thought you suggested that mod_filter could easily handle this case. I'm still not seeing how.

> But FWIW I have that working locally with
>
>   FilterDeclare filter1 Content-Type CONTENT_SET
>   FilterDeclare filter2 Content-Length CONTENT_SET
>   FilterProvider filter1 filter2 $text
>   FilterProvider filter2 DEFLATE >4000
>   FilterChain filter1
>
> to deflate all "text/*" documents of 4k or greater.

Can I comment that I think a clearer configuration syntax is going to be needed if we are going to axe all of the current filter directives? AddOutputFilterByType, for all of its internal oddness, is a simple directive for an administrator to understand. So, perhaps keep 'AddOutputFilterByType' and have it internally converted to a mod_filter directive. But, I'm just not overly excited about moving all filter configuration directives to something akin to mod_rewrite. Ouch.

-- justin
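The "three rules and ten filters" arithmetic above can be made concrete with a small sketch. It is only an illustration of first-match lookup over a provider list; the `provider` struct, `lookup()` and `checks_per_request()` are hypothetical stand-ins, not the mod_filter data structures or API, and it assumes every harness in the chain carries the same three providers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a provider entry; not the real mod_filter struct. */
typedef struct provider {
    const char *match;            /* substring looked for in the dispatch value */
    const char *filter_name;      /* filter that would run on a match */
    const struct provider *next;  /* next provider registered on this harness */
} provider;

/* Walk one harness's provider list, stop at the first match,
 * and count every comparison that was made on the way. */
static const provider *lookup(const provider *list, const char *value,
                              int *checks)
{
    assert(checks != NULL);
    for (const provider *p = list; p != NULL; p = p->next) {
        ++*checks;
        if (value != NULL && strstr(value, p->match) != NULL) {
            return p;
        }
    }
    return NULL;
}

/* Comparisons for one request that passes through n_harnesses harnesses,
 * each doing a single lookup over the same provider list. */
static int checks_per_request(const provider *list, int n_harnesses,
                              const char *content_type)
{
    int checks = 0;
    for (int i = 0; i < n_harnesses; ++i) {
        (void)lookup(list, content_type, &checks);
    }
    return checks;
}
```

Under these assumptions, thirty comparisons is the worst case (nothing matches, so every provider is tried in every harness); when the first provider matches, each harness stops after a single comparison.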
Re: AddOutputFilterByType oddness
On Wed, 22 Sep 2004, Justin Erenkrantz wrote: > --On Wednesday, September 22, 2004 6:17 PM +0100 Nick Kew > <[EMAIL PROTECTED]> wrote: > > > It seems to me heavily counterintuitive that mixing ByType directives > > with anything else means that the ByType filters *always* come last. > > And that Remove won't affect them, but will affect others. > > I think we could get Remove*Filter to also delete the content-type filters. > > > Indeed. mod_filter addresses this by configuring at the last moment, > > so any earlier set_content_type()s are irrelevant. I don't suppose it's > > a panacaea for everything, but I do think it's a significant improvement > > on what we have. > > I'm concerned about the overhead of mod_filter having to check all of its > rules each time a filter is invoked. This is why I started to look through > the code last night to see how it worked and how invasive it is. It's improving with time (except when I introduce bugs...). Merging in the structs with util_filter saves on having to do superfluous lookups. Basically it does the lookup/dispatch once per filter in the filterchain per request. It checks that filter's providers until it finds a match. So for anything you could do with an [Add|Set]OutputFilter[ByType] that's one lookup per request. > How would you handle the situation when filter #1 sets C-T to be > "text/plain" and then filter #2 sets C-T to be "text/html"? mod_filter takes the content-type as it is at that point in the chain. Isn't the real nightmare where a filter calls ap_set_content_type and some AddOutputFilterByTypes are in effect? I guess what *really* bothers me is the idea of adding filters *as a side-effect*. > And, then > mod_deflate needs to be conditionally added (sub-case #1: it needs to be > added for 'text/plain'; sub-case #2: it needs to be added for 'text/html'). > How and where is it added? Are you inserting dummy filters? I'm not sure I follow. 
It will dispatch to deflate based on the content-type (or other dispatch criterion) as it is at that point in the chain. So if the handler sets application/xml but that goes through an XSLT filter which sets it to text/html, then mod_filter sees application/xml if it's before the XSLT filter in the chain, or text/html after it. How can AddOutputFilterByType expect to cope with that?

> > From the user's perspective, it's simply more powerful and flexible.
> > Works with any request or response headers (not just content-type) or
> > environment variables. Gets rid of constraints on ordering, like
> > AddOutputFilterbyType filter always coming after other filters
> > regardless of ordering in httpd.conf.
> >
> > Example: I have a user who wants to insert mod_deflate in a reverse
> > proxy, but only for selected content-types AND not if the content
> > length is below a threshold. How would he do that with the old filter
> > framework?
>
> I guess I'm not clear what the syntax is (I guess I should go read the
> docs).

That particular scenario is complex, and requires mod_filter to be used as its own provider. The point is, we *can* now support complex setups (or will be - that chaining is still broken in CVS). But FWIW I have that working locally with

  FilterDeclare filter1 Content-Type CONTENT_SET
  FilterDeclare filter2 Content-Length CONTENT_SET
  FilterProvider filter1 filter2 $text
  FilterProvider filter2 DEFLATE >4000
  FilterChain filter1

to deflate all "text/*" documents of 4k or greater.

> I definitely don't want to see the filters be configured like
> mod_rewrite. It needs to be fairly straightforward, but still fairly
> simplistic. I don't want to have users have to read a complicated manual
> or docs to set up filters. KISS.

Indeed. Do you think the examples in the manual page are too complex?
Bear in mind that the third example is no more complex than the first two, yet suddenly enables a frequently-requested capability that simply isn't possible with the old filtering. > Well, the point by you committing it into our tree is that the rest of us > are now responsible for it. That's why I brought up the code style issue: OK, OKOK! I promise to look harder at the code style guidelines! And I _did_ ask on the list a couple of weeks before introducing to CVS. > I looked yesterday afternoon (and haven't seen any commits since then). I That'll be the latest version. Which FWIW was introduced prematurely because it introduced a new feature demanded by a user. Only that turned out to be broken, which is why I'm re-hacking that now. -- Nick Kew
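The point made in this message, that a dispatcher sees whatever content type is current at its own position in the chain, can be illustrated with a toy model. This is a sketch only: `stage`, `type_seen_at()` and `would_deflate()` are invented for the illustration and do not correspond to any httpd or mod_filter function, and the "text/" prefix rule merely mirrors the `$text` provider in the configuration quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one filter in the output chain. */
typedef struct stage {
    const char *name;       /* label only, e.g. "XSLT" */
    const char *sets_type;  /* content type this stage sets, or NULL if none */
} stage;

/* Content type visible to something inserted at position pos, i.e. after
 * the first pos stages of the chain have had a chance to change it. */
static const char *type_seen_at(const stage *chain, size_t n, size_t pos,
                                const char *handler_type)
{
    const char *type = handler_type;
    assert(pos <= n);
    for (size_t i = 0; i < pos; ++i) {
        if (chain[i].sets_type != NULL) {
            type = chain[i].sets_type;
        }
    }
    return type;
}

/* Dispatch rule assumed for the illustration: compress any "text/..." type. */
static int would_deflate(const char *type)
{
    return type != NULL && strncmp(type, "text/", 5) == 0;
}
```

With a one-stage chain whose XSLT stage sets text/html and a handler that sets application/xml, position 0 sees application/xml and would not compress, while position 1 sees text/html and would.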
Re: AddOutputFilterByType oddness
--On Wednesday, September 22, 2004 6:17 PM +0100 Nick Kew <[EMAIL PROTECTED]> wrote:

> It seems to me heavily counterintuitive that mixing ByType directives with anything else means that the ByType filters *always* come last. And that Remove won't affect them, but will affect others.

I think we could get Remove*Filter to also delete the content-type filters.

> Indeed. mod_filter addresses this by configuring at the last moment, so any earlier set_content_type()s are irrelevant. I don't suppose it's a panacaea for everything, but I do think it's a significant improvement on what we have.

I'm concerned about the overhead of mod_filter having to check all of its rules each time a filter is invoked. This is why I started to look through the code last night to see how it worked and how invasive it is.

How would you handle the situation when filter #1 sets C-T to be "text/plain" and then filter #2 sets C-T to be "text/html"? And, then mod_deflate needs to be conditionally added (sub-case #1: it needs to be added for 'text/plain'; sub-case #2: it needs to be added for 'text/html'). How and where is it added? Are you inserting dummy filters?

> From the user's perspective, it's simply more powerful and flexible. Works with any request or response headers (not just content-type) or environment variables. Gets rid of constraints on ordering, like AddOutputFilterbyType filter always coming after other filters regardless of ordering in httpd.conf.
>
> Example: I have a user who wants to insert mod_deflate in a reverse proxy, but only for selected content-types AND not if the content length is below a threshold. How would he do that with the old filter framework?

I guess I'm not clear what the syntax is (I guess I should go read the docs). I definitely don't want to see the filters be configured like mod_rewrite. It needs to be fairly straightforward, but still fairly simplistic. I don't want to have users have to read a complicated manual or docs to set up filters. KISS.

> From a developers perspective, I wrote it for myself, and have at least two other developers using it operationally in their product. Time will tell what others may use it for.

Well, the point by you committing it into our tree is that the rest of us are now responsible for it. That's why I brought up the code style issue: we already have a number of modules that were never fully integrated or reviewed. And, then the person who dropped the code ran away and left the code in a goofy state. (See mod_rewrite, mod_ssl, mod_cache, etc.)

> When was that? I made quite a lot of updates to the style towards conforming (like eliminating tabs and realigning some braces) before committing to CVS, but I'm willing to believe I need to look more carefully.

I looked yesterday afternoon (and haven't seen any commits since then). I will say the most distracting parts are the odd spacing (i.e. parenthesis and semi-colons) as well as line spacing. Unfortunately, I get distracted by shiny things such as improper code style such that I can't focus on the code itself. =)

-- justin
Re: AddOutputFilterByType oddness
On Wed, 22 Sep 2004, Justin Erenkrantz wrote:

> --On Wednesday, September 22, 2004 5:01 PM +0100 Nick Kew <[EMAIL PROTECTED]>
> wrote:
>
> > I've said it before and I'll say it again: AddOutputFilterByType is
> > fundamentally unsatisfactory. This confusion is an effect, not cause.
>
> Suffice to say, I disagree.
>
> > * Configuration is inconsistent with other filter directives. The
> > relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive
> > and, from a user POV, broken.
>
> I think it's really clear from the user's perspective. I think the problem
> comes in on the developer's side.

It seems to me heavily counterintuitive that mixing ByType directives with anything else means that the ByType filters *always* come last. And that Remove won't affect them, but will affect others.

> > * Tying it to ap_set_content_type is, to say the least, hairy.
> > IMO we shouldn't *require* modules to call this, and it's utterly
> > unreasonable to expect that it will never be called more than once
> > for a request, given the number of modules that might take an interest.
> > Especially when subrequests and internal redirects may be involved.
>
> We have *always* mandated that ap_set_content_type() should be called rather
> than setting r->content_type. (I wish we could remove content_type from
> request_rec instead.)

Indeed. But that doesn't prevent it being called multiple times, perhaps from different modules. So using it to insert filters leaves lots of potential for trouble.

> > * It's a complexity just waiting for modules to break on it.
>
> Anything that depends upon content-type like this is going to be hairy because
> there may be several 'right' answers during the course of the request.

Indeed. mod_filter addresses this by configuring at the last moment, so any earlier set_content_type()s are irrelevant. I don't suppose it's a panacea for everything, but I do think it's a significant improvement on what we have.

> > I've made some more updates to mod_filter since I last posted on the
> > subject, and I'm getting some very positive feedback from real users.
> > For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing
> > it with mod_filter.
>
> I've yet to see a clear and concise statement as to how mod_filter will solve
> this problem in a better and more efficient way. (Especially from a user's
> perspective, but also from a developer's perspective.)

From the user's perspective, it's simply more powerful and flexible. Works with any request or response headers (not just content-type) or environment variables. Gets rid of constraints on ordering, like AddOutputFilterbyType filter always coming after other filters regardless of ordering in httpd.conf.

Example: I have a user who wants to insert mod_deflate in a reverse proxy, but only for selected content-types AND not if the content length is below a threshold. How would he do that with the old filter framework?

From a developer's perspective, I wrote it for myself, and have at least two other developers using it operationally in their product. Time will tell what others may use it for.

> I will also comment that I looked in the mod_filter code the other day and was
> disappointed that it doesn't follow our coding style at all or even have
> comments that help people understand what it is trying to do inside the .c
> file.

When was that? I made quite a lot of updates to the style towards conforming (like eliminating tabs and realigning some braces) before committing to CVS, but I'm willing to believe I need to look more carefully.

-- 
Nick Kew
Re: AddOutputFilterByType oddness
--On Wednesday, September 22, 2004 5:01 PM +0100 Nick Kew <[EMAIL PROTECTED]> wrote:

> I've said it before and I'll say it again: AddOutputFilterByType is fundamentally unsatisfactory. This confusion is an effect, not cause.

Suffice to say, I disagree.

> * Configuration is inconsistent with other filter directives. The relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive and, from a user POV, broken.

I think it's really clear from the user's perspective. I think the problem comes in on the developer's side.

> * Tying it to ap_set_content_type is, to say the least, hairy. IMO we shouldn't *require* modules to call this, and it's utterly unreasonable to expect that it will never be called more than once for a request, given the number of modules that might take an interest. Especially when subrequests and internal redirects may be involved.

We have *always* mandated that ap_set_content_type() should be called rather than setting r->content_type. (I wish we could remove content_type from request_rec instead.)

> * It's a complexity just waiting for modules to break on it.

Anything that depends upon content-type like this is going to be hairy because there may be several 'right' answers during the course of the request.

> I've made some more updates to mod_filter since I last posted on the subject, and I'm getting some very positive feedback from real users. For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing it with mod_filter.

I've yet to see a clear and concise statement as to how mod_filter will solve this problem in a better and more efficient way. (Especially from a user's perspective, but also from a developer's perspective.)

> mod_filter can also obsolete [Set|Add|Remove]OutputFilter, though I'm in no hurry to do that. What I can also do is re-implement all the outputfilter directives within mod_filter and its updated framework.

I will also comment that I looked in the mod_filter code the other day and was disappointed that it doesn't follow our coding style at all or even have comments that help people understand what it is trying to do inside the .c file. This all makes it very difficult to understand the code. I'd greatly appreciate it if mod_filter (and the code that you inserted elsewhere - i.e. in util_filter.c) would conform to our style guidelines and had some comments inside of it that say what it does (or is trying to do).

For example, some of the things it does just make no sense at all. filter_bucket_type() is completely bogus and needs to be tossed. The type->name field in the bucket should be used instead.

I'm getting annoyed by people doing massive code drops (i.e. mod_filter, mod_proxy, mod_auth_ldap, etc.) that don't conform to our code style and have no comments. It makes it much harder to go and fix bugs in 'em.

-- justin
Re: AddOutputFilterByType oddness
On Sat, 18 Sep 2004, Justin Erenkrantz wrote: > > But ap_add_output_filters_by_type() explicitly does nothing for a > > proxied request. Anyone know why? "AddOutputFilterByType DEFLATE > > text/plain text/html" seems to work as expected here for a forward proxy > > with this applied: maybe I'm missing something fundamental... > > My recollection is initially it didn't have the proxy check, then FirstBill > had a reason why proxied requests shouldn't work with AddOutputFilterByType. I've said it before and I'll say it again: AddOutputFilterByType is fundamentally unsatisfactory. This confusion is an effect, not cause. * Configuration is inconsistent with other filter directives. The relationship with [Set|Add|Remove]OutputFilter is utterly unintuitive and, from a user POV, broken. * Tying it to ap_set_content_type is, to say the least, hairy. IMO we shouldn't *require* modules to call this, and it's utterly unreasonable to expect that it will never be called more than once for a request, given the number of modules that might take an interest. Especially when subrequests and internal redirects may be involved. * It's a complexity just waiting for modules to break on it. I've made some more updates to mod_filter since I last posted on the subject, and I'm getting some very positive feedback from real users. For 2.2 I'd like to remove AddOutputFilterByType entirely, replacing it with mod_filter. mod_filter can also obsolete [Set|Add|Remove]OutputFilter, though I'm in no hurry to do that. What I can also do is re-implement all the outputfilter directives within mod_filter and its updated framework. -- Nick Kew
Re: AddOutputFilterByType oddness
--On Thursday, September 16, 2004 5:11 PM +0100 Joe Orton <[EMAIL PROTECTED]> wrote: But ap_add_output_filters_by_type() explicitly does nothing for a proxied request. Anyone know why? "AddOutputFilterByType DEFLATE text/plain text/html" seems to work as expected here for a forward proxy with this applied: maybe I'm missing something fundamental... My recollection is initially it didn't have the proxy check, then FirstBill had a reason why proxied requests shouldn't work with AddOutputFilterByType. Would have to search the archives to remember why. *sigh* -- justin
Re: AddOutputFilterByType oddness
On Wed, Aug 25, 2004 at 02:40:39PM +0200, Graham Leggett wrote:
> Justin Erenkrantz wrote:
> >Ultimately, all that is needed is a call to ap_set_content_type() before
> >any bytes are written to the client to get AddOutputFilterByType to
> >work. Perhaps with the recent momentum behind mod_proxy work, someone
> >could investigate that and get mod_proxy fixed.
>
> ap_set_content_type() is called on line 769 of proxy_http.c:

But ap_add_output_filters_by_type() explicitly does nothing for a proxied request. Anyone know why? "AddOutputFilterByType DEFLATE text/plain text/html" seems to work as expected here for a forward proxy with this applied: maybe I'm missing something fundamental...

--- server/core.c~ 2004-08-31 09:16:56.0 +0100
+++ server/core.c 2004-09-16 16:48:09.0 +0100
@@ -2875,11 +2875,10 @@
     conf = (core_dir_config *)ap_get_module_config(r->per_dir_config,
                                                    &core_module);
 
-    /* We can't do anything with proxy requests, no content-types or if
-     * we don't have a filter configured.
+    /* We can't do anything with no content-type or if we don't have a
+     * filter configured.
      */
-    if (r->proxyreq != PROXYREQ_NONE || !r->content_type ||
-        !conf->ct_output_filters) {
+    if (!r->content_type || !conf->ct_output_filters) {
         return;
     }
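The effect of the one-line change in the patch can be checked with a small truth-table style sketch. The `sketch_request` struct and the `SKETCH_PROXYREQ_*` constants are hypothetical stand-ins that flatten the three values the guard reads (r->proxyreq, r->content_type, conf->ct_output_filters) into one record; this is not httpd code, only the two boolean conditions copied from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the httpd proxy request constants (assumption: only
 * "none" versus "anything else" matters to the guard). */
enum { SKETCH_PROXYREQ_NONE = 0, SKETCH_PROXYREQ_PROXY = 1 };

/* Hypothetical flattened view of the fields the guard looks at. */
typedef struct {
    int proxyreq;                  /* models r->proxyreq */
    const char *content_type;      /* models r->content_type */
    const void *ct_output_filters; /* models conf->ct_output_filters */
} sketch_request;

/* Guard as it stood before the patch: proxied requests are always skipped. */
static bool skips_before_patch(const sketch_request *r)
{
    assert(r != NULL);
    return r->proxyreq != SKETCH_PROXYREQ_NONE
        || !r->content_type
        || !r->ct_output_filters;
}

/* Guard after the patch: only a missing type or missing config skips. */
static bool skips_after_patch(const sketch_request *r)
{
    assert(r != NULL);
    return !r->content_type || !r->ct_output_filters;
}
```

The only case where the two guards disagree is a proxied request that has both a content type and configured by-type filters, which is exactly the case the thread is about.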
Re: AddOutputFilterByType oddness
Justin Erenkrantz wrote:

> > Putting on an end user hat I see no reason why AddOutputFilterByType shouldn't do exactly what it says it does.
>
> I believe it has more to do with mod_proxy than the filter design. No one, at the time we added AddOutputFilterByType, wanted to rewrite mod_proxy to be knowledgeable about filters.

I wrote mod_proxy to be knowledgeable about filters shortly after v2.0 came about; it was one of the first major modules to support filters.

> Ultimately, all that is needed is a call to ap_set_content_type() before any bytes are written to the client to get AddOutputFilterByType to work. Perhaps with the recent momentum behind mod_proxy work, someone could investigate that and get mod_proxy fixed.

ap_set_content_type() is called on line 769 of proxy_http.c:

    if ((buf = apr_table_get(r->headers_out, "Content-Type"))) {
        ap_set_content_type(r, apr_pstrdup(p, buf));
    }

Is there anything else that needs to be done to make AddOutputFilterByType work? Is apr_table_get() case sensitive?

Regards,
Graham
--
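On the closing question: APR documents table keys as case-insensitive, so a lookup for "Content-Type" should also find a header stored as "content-type". The sketch below shows what such a lookup amounts to; it is a self-contained illustration with hypothetical names (`header`, `keys_equal_nocase()`, `header_get()`), not the APR implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical key/value pair standing in for one table entry. */
typedef struct {
    const char *key;
    const char *val;
} header;

/* Compare two keys ignoring ASCII case; returns 1 when they are equal. */
static int keys_equal_nocase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        ++a;
        ++b;
    }
    return *a == *b; /* equal only if both strings ended together */
}

/* Return the value of the first entry whose key matches, or NULL. */
static const char *header_get(const header *h, size_t n, const char *key)
{
    assert(h != NULL && key != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (keys_equal_nocase(h[i].key, key)) {
            return h[i].val;
        }
    }
    return NULL;
}
```

If that holds for the real table, the spelling of the header name is unlikely to be what stops the filter from being inserted, which is consistent with the later finding in this thread that the proxy check in ap_add_output_filters_by_type() is responsible.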
Re: AddOutputFilterByType oddness
If I understand this correctly this is a necessity for mod_proxy/mod_proxy_ajp to replace mod_jk else this would be a significant regression from mod_jk (wherein this issue was fixed last year as I recall). -- Jess Holle Justin Erenkrantz wrote: --On Tuesday, August 24, 2004 12:20 PM +0200 Graham Leggett <[EMAIL PROTECTED]> wrote: Putting on an end user hat I see no reason why AddOutputFilterByType shouldn't do exactly what it says it does. I believe it has more to do with mod_proxy than the filter design. No one, at the time we added AddOutputFilterByType, wanted to rewrite mod_proxy to be knowledgeable about filters. Ultimately, all that is needed is a call to ap_set_content_type() before any bytes are written to the client to get AddOutputFilterByType to work. Perhaps with the recent momentum behind mod_proxy work, someone could investigate that and get mod_proxy fixed. -- justin
Re: AddOutputFilterByType oddness
--On Tuesday, August 24, 2004 12:20 PM +0200 Graham Leggett <[EMAIL PROTECTED]> wrote: Putting on an end user hat I see no reason why AddOutputFilterByType shouldn't do exactly what it says it does. I believe it has more to do with mod_proxy than the filter design. No one, at the time we added AddOutputFilterByType, wanted to rewrite mod_proxy to be knowledgeable about filters. Ultimately, all that is needed is a call to ap_set_content_type() before any bytes are written to the client to get AddOutputFilterByType to work. Perhaps with the recent momentum behind mod_proxy work, someone could investigate that and get mod_proxy fixed. -- justin
Re: AddOutputFilterByType oddness
Graham Leggett wrote:
> Nick Kew wrote:
>
>>> I have just set up the most recent httpd v2.0.51-dev tree, and have
>>> configured a filter that strips leading whitespace from HTML:
>>>
>>> AddOutputFilterByType STRIP text/html
>>>
>>> The content is served by mod_proxy.
>
>> As it stands, that can't work.
>
> Then as it stands filter's are broken.
>
> Putting on an end user hat I see no reason why AddOutputFilterByType
> shouldn't do exactly what it says it does.

for the record, I know of other cases where AddOutputFilterByType just doesn't cut it, specifically wrt filter_init. see

http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=107090791508163&w=2

for more details. while I'm trying to address a separate issue there, the example tarball shows that AddOutputFilterByType is broken for even some core module setups.

HTH

--Geoff
Re: AddOutputFilterByType oddness
Nick Kew wrote:

> > I have just set up the most recent httpd v2.0.51-dev tree, and have configured a filter that strips leading whitespace from HTML:
> >
> > AddOutputFilterByType STRIP text/html
> >
> > The content is served by mod_proxy.
>
> As it stands, that can't work.

Then as it stands filters are broken.

Putting on an end user hat I see no reason why AddOutputFilterByType shouldn't do exactly what it says it does.

> It's a manifestation of the problem I'm addressing by reviewing the filter architecture: see http://www.apachetutor.org/dev/smart-filter and the "Ideas for smart filtering" thread here.

Reading the above, it seems that people are allergic to having filters look at headers to decide whether they should be valid or not. Having a totally generic non-HTTP filter sounds like a nice idea, but in practice it's a real pain in the ass.

The filters need the knowledge contained in the headers regardless, otherwise they simply won't work. They can either access the headers directly, or they can access some generic interface that wraps the headers into something generic for the filters to access. Right now it seems filters do neither.

This is really annoying for an end user. Having developed the filter we need for our application, we deploy it and now find we cannot use it. For us it's back to the drawing board. :(

Regards,
Graham
--
Re: AddOutputFilterByType oddness
On Tue, 24 Aug 2004, Nick Kew wrote:

> I actually have an implementation based on the discussion document and
> addressing the concerns people raised in the thread. I hope to find
> time to finish the accompanying documentation and post it here round
> about this coming weekend.

OK, since you seem to have a real-life use for it, here goes. As I said before, I wasn't planning to post without a little more testing and accompanying documents and discussion, but what the ? I'm sure I'll regret this premature posting

Mini-Synopsis:

# 1. Declare a smart filter that dispatches on Content-Type
FilterDeclare myfilter Content-Type

# 2. Declare your filter as a Provider, to run whenever Content-Type
#    includes the string "text/html"
FilterProvider myfilter STRIP $text/html

# 3. Set the smart filter chain to this filter where you want to apply it
FilterChain =myfilter

-- 
Nick Kew

/*
        Copyright (C) 2004 Nick Kew

        This is experimental code. It may be copied and used only for
        evaluation and testing purposes.

        The copyright holder offers to the Apache Software Foundation
        permission to re-license this code under the ASF license.
        This offer applies if and when the ASF accepts this code or
        any derived work for inclusion in a future release of HTTPD.

        Regardless of the above, the author undertakes to release the
        work under a recognised open-source license in due course.
        Information will be available at http://apache.webthing.com/
        and/or http://dev.apache.org/~niq/
*/

#include
#include

/* apache */
#include
#include
#include
#include
#include
#include

module AP_MODULE_DECLARE_DATA filter_module ;

#ifndef NO_PROTOCOL
#define PROTO_CHANGE 0x1
#define PROTO_CHANGE_LENGTH 0x2
#define PROTO_NO_BYTERANGE 0x4
#define PROTO_NO_PROXY 0x8
#define PROTO_NO_CACHE 0x10
#define PROTO_TRANSFORM 0x20
#endif

typedef apr_status_t (*filter_func_t)(ap_filter_t*, apr_bucket_brigade*) ;

typedef struct {
  const char* name ;
  filter_func_t func ;
  void* fctx ;
} harness_ctx ;

typedef struct mod_filter_provider {
  enum {
    STRING_MATCH,
    STRING_CONTAINS,
    REGEX_MATCH,
    INT_EQ,
    INT_LE,
    INT_GE,
    DEFINED
  } match_type ;
  union {
    const char* c ;
    regex_t* r ;
    int i ;
  } match ;
  ap_filter_rec_t* frec ;
  struct mod_filter_provider* next ;
#ifndef NO_PROTOCOL
  unsigned int proto_flags ;
#endif
} mod_filter_provider ;

typedef struct {
  ap_filter_rec_t frec ;
  enum {
    REQUEST_HEADERS,
    RESPONSE_HEADERS,
    SUBPROCESS_ENV,
    CONTENT_TYPE
  } dispatch ;
  const char* value ;
  mod_filter_provider* providers ;
#ifndef NO_PROTOCOL
  unsigned int proto_flags ;
  const char* range ;
#endif
} mod_filter_rec ;

typedef struct mod_filter_chain {
  const char* fname ;
  struct mod_filter_chain* next ;
} mod_filter_chain ;

typedef struct {
  apr_hash_t* live_filters ;
  mod_filter_chain* chain ;
} mod_filter_cfg ;

static int filter_init(ap_filter_t* f) {
  mod_filter_provider* p ;
  int err = OK ;  /* initialised so that an empty provider list returns OK */
  harness_ctx* ctx = f->ctx ;
  mod_filter_cfg* cfg
	= ap_get_module_config(f->r->per_dir_config, &filter_module);
  mod_filter_rec* filter
	= apr_hash_get(cfg->live_filters, ctx->name, APR_HASH_KEY_STRING) ;
  for ( p = filter->providers ; p ; p = p->next ) {
    if ( p->frec->filter_init_func ) {
      if ( err = p->frec->filter_init_func(f), err != OK ) {
	break ;	/* if anyone errors out here, so do we */
      }
    }
  }
  return err ;
}

static filter_func_t filter_lookup(request_rec* r, mod_filter_rec* filter) {
  mod_filter_provider* provider ;
  const char* str ;
  const char* cachecontrol ;
  int match ;
  unsigned int proto_flags ;

  /* Check registered providers in order */
  for ( provider = filter->providers; provider; provider = provider->next) {
    match = 1 ;
    switch ( filter->dispatch ) {
      case REQUEST_HEADERS:
	str = apr_table_get(r->headers_in, filter->value) ;
	break ;
      case RESPONSE_HEADERS:
	str = apr_table_get(r->headers_out, filter->value) ;
	break ;
      case SUBPROCESS_ENV:
	str = apr_table_get(r->subprocess_env, filter->value) ;
	break ;
      case CONTENT_TYPE:
	str = r->content_type ;
	break ;
    }
    /* treat nulls so we don't have to check every strcmp individually
       Not sure if there's anything better to do with them
    */
    if ( str == NULL ) {
      if ( provider->match_type == DEFINED ) {
	if ( provider->match.c != NULL ) {
	  match = 0 ;
	}
      }
    } else if ( provider->match.c == NULL ) {
      match = 0 ;
    } else {
      /* Now we have no nulls, so we can do string and regexp matching */
      switch ( provider->match_type ) {
	case STRING_MATCH:
	  if ( strcasecmp(str, provider->match.c) ) {
	    match = 0 ;
	  }
	  break ;
	case STRING_CONTA
Re: AddOutputFilterByType oddness
William A. Rowe, Jr. wrote:

>>> Is your DefaultType set to text/html?
>>
>> It's set like so:
>>
>> DefaultType text/plain
>
> You are proxying content? What does the HEAD /image.gif HTTP/1.0 report
> for content type from the backend server?

It says this:

[EMAIL PROTECTED] root]# telnet gatekeeper.fma.co.za 80
Trying 196.30.143.210...
Connected to gatekeeper.fma.co.za.
Escape character is '^]'.
HEAD /patricia/policy/images/tabaccounting1.gif HTTP/1.1
Host: gatekeeper.fma.co.za

HTTP/1.1 200 OK
Date: Tue, 24 Aug 2004 10:05:54 GMT
Server: Apache-Coyote/1.1
ETag: W/"1636-1092965561000"
Last-Modified: Fri, 20 Aug 2004 01:32:41 GMT
Content-Type: image/gif
Connection: close

Connection closed by foreign host.

Regards,
Graham
--
Re: AddOutputFilterByType oddness
On Tue, 24 Aug 2004, Graham Leggett wrote: > I have just set up the most recent httpd v2.0.51-dev tree, and have > configured a filter that strips leading whitespace from HTML: > > AddOutputFilterByType STRIP text/html > > The content is served by mod_proxy. As it stands, that can't work. It's a manifestation of the problem I'm addressing by reviewing the filter architecture: see http://www.apachetutor.org/dev/smart-filter and the "Ideas for smart filtering" thread here. I actually have an implementation based on the discussion document and addressing the concerns people raised in the thread. I hope to find time to finish the accompanying documentation and post it here round about this coming weekend. > http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype > > it says that filters are not applied by proxied requests (It does not > give a reason why not). The URL above makes it clear what's happening there. -- Nick Kew
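One way to picture the timing problem discussed here: a filter chosen when the chain is built can only look at whatever content type is known at that moment, whereas a harness can postpone the choice until data actually arrives. The following is a minimal standalone sketch of that deferred choice, not code from httpd or from the posted module. The names `provider_t`, `contains_nocase` and `choose_filter` are hypothetical, and the matching rule (case-insensitive substring) is an assumption modelled on the "includes the string" wording of the synopsis.

```c
#include <stddef.h>
#include <ctype.h>

/* Hypothetical stand-in for a registered provider: a filter name plus the
 * substring that the dispatch value must contain for the filter to run. */
typedef struct {
    const char *filter_name;
    const char *needle;
} provider_t;

/* Case-insensitive substring test; returns 1 if needle occurs in hay. */
static int contains_nocase(const char *hay, const char *needle)
{
    size_t i, j;
    if (hay == NULL || needle == NULL)
        return 0;
    for (i = 0; hay[i] != '\0'; i++) {
        for (j = 0; needle[j] != '\0'; j++) {
            if (hay[i + j] == '\0' ||
                tolower((unsigned char)hay[i + j]) !=
                tolower((unsigned char)needle[j]))
                break;
        }
        if (needle[j] == '\0')
            return 1;
    }
    return 0;
}

/* Pick the first matching provider once the content type is really known,
 * i.e. when the first data reaches the harness rather than at insertion.
 * Returns NULL when nothing matches, meaning "pass the data through". */
static const char *choose_filter(const provider_t *providers, size_t n,
                                 const char *content_type)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (contains_nocase(content_type, providers[i].needle))
            return providers[i].filter_name;
    }
    return NULL;
}
```

With a single provider `{ "STRIP", "text/html" }`, a proxied response typed `image/gif` yields NULL and is left alone, while `text/html; charset=UTF-8` selects STRIP.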
Re: AddOutputFilterByType oddness
At 07:23 PM 8/23/2004, Graham Leggett wrote: >Paul Querna wrote: > >>Is your DefaultType set to text/html? > >It's set like so: > >DefaultType text/plain You are proxying content? What does the HEAD /image.gif HTTP/1.0 report for content type from the backend server? Bill
Re: AddOutputFilterByType oddness
Paul Querna wrote:

> Is your DefaultType set to text/html?

It's set like so:

DefaultType text/plain

> On Tue, 2004-08-24 at 01:54 +0200, Graham Leggett wrote:
>> Hi all,
>>
>> I have just set up the most recent httpd v2.0.51-dev tree, and have
>> configured a filter that strips leading whitespace from HTML:
>>
>> AddOutputFilterByType STRIP text/html
>>
>> The content is served by mod_proxy.
>>
>> This seems to work fine for HTML requests, but I have noticed that this
>> filter is also being applied to images as well (thus corrupting them).
>> Why would the above directive apply to all content, instead of text/html
>> only as is configured?
>>
>> Looking at the following docs:
>>
>> http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype
>>
>> it says that filters are not applied by proxied requests (It does not
>> give a reason why not). From the test above however this statement is
>> false, filters are applied to proxy requests - all proxied requests.
>>
>> Am I doing something wrong, or is AddOutputFilterByType broken?

Regards,
Graham
--
Re: AddOutputFilterByType oddness
Is your DefaultType set to text/html? On Tue, 2004-08-24 at 01:54 +0200, Graham Leggett wrote: > Hi all, > > I have just set up the most recent httpd v2.0.51-dev tree, and have > configured a filter that strips leading whitespace from HTML: > > AddOutputFilterByType STRIP text/html > > The content is served by mod_proxy. > > This seems to work fine for HTML requests, but I have noticed that this > filter is also being applied to images as well (thus corrupting them). > Why would the above directive apply to all content, instead of text/html > only as is configured? > > Looking at the following docs: > > http://httpd.apache.org/docs-2.0/mod/core.html#addoutputfilterbytype > > it says that filters are not applied by proxied requests (It does not > give a reason why not). From the test above however this statement is > false, filters are applied to proxy requests - all proxied requests. > > Am I doing something wrong, or is AddOutputFilterByType broken? > > Regards, > Graham > --
Re: AddOutputFilterByType and mod_jk [was Re: mod_deflate with mod_jk]
Aditya wrote:

> On Mon, 09 Dec 2002 16:29:40 +0100, Henri Gomez <[EMAIL PROTECTED]> said:
>
>> BTW, I updated mod_jk 1.2.2-dev, 2.0.4-dev and also mod_webapp to set
>> the content type the correct way, previously there was a direct set of
>> content-type and I now use ap_set_content_type :
>>
>> hgomez 2002/12/09 05:19:18
>> Modified: jk/native/apache-2.0 mod_jk.c
>> Log: Make jk works with filters in Apache 2.0, ie mod_deflate and
>> AddOutputFilterByType DEFLATE text/html.
>
> I can confirm this now "does the right thing" with Apache 2.0.39 under
> FreeBSD running mod_jk from CVS HEAD and mod-xslt (www.mod-xslt.com) as
> an output filter for content of type text/xml

Thanks for the confirmation.

BTW, Happy New Year to all of you
RE: AddOutputFilterByType
> > > What would be most cool is to set an r->replace_request member to the
> > > subrequest we will run. Then in the run request phase, look at
> > > replace_request and run the insert_filters/run_handler against that
> > > replacement.
> >
> > That could be goodness. I agree that I'm not sure if it is really
> > that trivial. Perhaps. Well, actually, I know it won't be since
> > some filters do different things when r->main is set. I can see
> > rbb saying, "Those filters are broken." Is that the case? -- justin
>
> No that isn't the case, and the fix isn't that trivial. It isn't
> impossible, but it adds an if to every single hook call.

Or would it... your griping about it [both of you] suddenly sparked some
insight... if the return code is != 0 then we test the various cases. The
mainline OK case remains unhindered :)

Bill
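The insight about testing the return code can be sketched roughly as follows. This is a standalone illustration, not httpd's hook machinery: `run_hooks`, the three sample hooks and the return code `HOOK_REPLACE_REQUEST` are all hypothetical names, and only the numeric values of OK (0) and DECLINED (-1) are taken from httpd. The point it tries to show is that the common OK path pays for one comparison, and the extra cases are examined only when the code is non-zero.

```c
#include <stddef.h>

#define HOOK_OK               0   /* same value as httpd's OK */
#define HOOK_DECLINED        (-1) /* same value as httpd's DECLINED */
#define HOOK_REPLACE_REQUEST  1   /* hypothetical "run the subrequest instead" */

typedef int (*hook_fn)(void *req);

/* Sample hooks, present only so the sketch can be exercised. */
static int hook_ok(void *req)       { (void)req; return HOOK_OK; }
static int hook_declined(void *req) { (void)req; return HOOK_DECLINED; }
static int hook_redirect(void *req) { (void)req; return HOOK_REPLACE_REQUEST; }

/* Run hooks in order. Sets *replaced to 1 if a hook asked for the request
 * to be replaced. Returns HOOK_OK, or the first code that stops the run. */
static int run_hooks(const hook_fn *hooks, size_t n, void *req, int *replaced)
{
    size_t i;
    *replaced = 0;
    for (i = 0; i < n; i++) {
        int rc = hooks[i](req);
        if (rc != HOOK_OK) {            /* the only test on the OK path */
            if (rc == HOOK_DECLINED)
                continue;               /* this hook had nothing to say */
            if (rc == HOOK_REPLACE_REQUEST)
                *replaced = 1;          /* stop processing the main request */
            return rc;
        }
    }
    return HOOK_OK;
}
```

Whether a dedicated return code or some other signal would be used is left open in the thread; the sketch only shows where the extra test would sit.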
RE: AddOutputFilterByType
> > Rbb and I chatted about this earlier today. It seems like once the
> > decision is reached that we have a fast internal redirect, we should
> > _stop_ processing that main request. Obviously some well defined
> > return value from the hook fn, similar to DONE but not quite the same.
>
> Ahem, wasn't I saying that? ;-) Yes, I think we need this too.
>
> > What would be most cool is to set an r->replace_request member to the
> > subrequest we will run. Then in the run request phase, look at
> > replace_request and run the insert_filters/run_handler against that
> > replacement.
>
> That could be goodness. I agree that I'm not sure if it is really
> that trivial. Perhaps. Well, actually, I know it won't be since
> some filters do different things when r->main is set. I can see
> rbb saying, "Those filters are broken." Is that the case? -- justin

No that isn't the case, and the fix isn't that trivial. It isn't
impossible, but it adds an if to every single hook call.

Ryan
Re: AddOutputFilterByType
On Tue, Mar 05, 2002 at 04:43:50PM -0600, William A. Rowe, Jr. wrote: > Cut it out already :-) I'll try. > Rbb and I chatted about this earlier today. It seems like once the decision > is reached that we have a fast internal redirect, we should _stop_ > processing > that main request. Obviously some well defined return value from the hook > fn, similar to DONE but not quite the same. Ahem, wasn't I saying that? ;-) Yes, I think we need this too. > What would be most cool is to set an r->replace_request member to the > subrequest we will run. Then in the run request phase, look at > replace_request and run the insert_filters/run_handler against that > replacement. That could be goodness. I agree that I'm not sure if it is really that trivial. Perhaps. Well, actually, I know it won't be since some filters do different things when r->main is set. I can see rbb saying, "Those filters are broken." Is that the case? -- justin
Re: AddOutputFilterByType
At 04:29 PM 3/5/2002, you wrote: >Also, why does mod_negotiation handle fixups in type_checker and >mod_dir does it in fixups? I'd suggest that we should move the >handle_multi call to be a fixups so that we are consistent where >our redirect calls occur. -- justin Cut it out already :-) We don't know if some other module is -interested- in that dir... so mod_dir will wait till fixups to say "hey - I can do that!" and auth is run for the dir as well as its index.html. Negotiation decides way up in type_checker that yea - that's a multiview, and this is what we will do with it. Rbb and I chatted about this earlier today. It seems like once the decision is reached that we have a fast internal redirect, we should _stop_ processing that main request. Obviously some well defined return value from the hook fn, similar to DONE but not quite the same. What would be most cool is to set an r->replace_request member to the subrequest we will run. Then in the run request phase, look at replace_request and run the insert_filters/run_handler against that replacement. Need to spend 2 hours away from work + apache ... I'll check in later and see if the solution is really that trivial. Bill
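The replace_request idea might look something like the sketch below. It is a standalone model under stated assumptions: the `request` struct is a toy, not httpd's request_rec, `replace_request` is the member proposed in this message and does not exist in httpd, and `run_request` merely reports which request the filters and handler would be run against.

```c
#include <stddef.h>

/* Toy request record; only the fields needed for the illustration. */
typedef struct request {
    const char *uri;
    struct request *replace_request; /* proposed member, hypothetical */
} request;

/* Decide which request the run phase should act on: the replacement if a
 * fast internal redirect set one, otherwise the request itself. */
static request *effective_request(request *r)
{
    return r->replace_request != NULL ? r->replace_request : r;
}

/* Stand-in for the run-request phase. insert_filters and the handler
 * would be invoked against target here; the sketch returns its URI. */
static const char *run_request(request *r)
{
    request *target = effective_request(r);
    return target->uri;
}
```

For a directory request whose index was resolved to a subrequest for /dir/index.html, `run_request` on the main request reports the subrequest's URI, so filters would be chosen for the resource that is actually served.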
Re: AddOutputFilterByType
On Tue, Mar 05, 2002 at 07:04:43AM -0800, Ryan Bloom wrote: > > Why is this thing being run in the fixups phase? The whole point of the > insert_filters phase is to insert filters for the given resource. Why > are we trying to insert resource based filters in the fixups? > Especially given that the resource can change during the fixups phase. True, this is where OtherBill suggested to me where it should go. But now that I understand our filtering hooks a bit better, I now agree that insert_filter makes more sense. However, should we attempt to abstract setting r->content_type so that we can intercept whenever the content_type is modified and add the right filters as dictated by AddOutputFilterByType? I have some reservations about that though, but it might solve our filter changing the content-type on us. Also, why does mod_negotiation handle fixups in type_checker and mod_dir does it in fixups? I'd suggest that we should move the handle_multi call to be a fixups so that we are consistent where our redirect calls occur. -- justin
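As a rough illustration of the interception idea floated here, the sketch below routes every content-type change through one setter that re-evaluates the by-type rules. It is a standalone model and makes several assumptions not stated in the thread: the `response` and `bytype_rule` types and the `set_content_type` function are hypothetical, matching is an exact string comparison (real content types may carry parameters such as a charset), and filters selected for a previous type are dropped when the type changes.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FILTERS 8

/* Toy response state: the current type and the filters chosen for it. */
typedef struct {
    const char *content_type;
    const char *filters[MAX_FILTERS];
    size_t nfilters;
} response;

/* One AddOutputFilterByType-style rule: filter F applies to type T. */
typedef struct {
    const char *type;
    const char *filter;
} bytype_rule;

/* Single entry point for changing the content type. Because every change
 * passes through here, the filter list always reflects the current type. */
static void set_content_type(response *res, const char *ct,
                             const bytype_rule *rules, size_t nrules)
{
    size_t i;
    res->content_type = ct;
    res->nfilters = 0;  /* assumption: forget filters picked for the old type */
    for (i = 0; i < nrules && res->nfilters < MAX_FILTERS; i++) {
        if (ct != NULL && strcmp(rules[i].type, ct) == 0)
            res->filters[res->nfilters++] = rules[i].filter;
    }
}
```

The reservations mentioned above still apply: a filter that has already produced output cannot simply be removed, which this model ignores.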