> Aha! Revisiting that, I see I still have an uncommitted patch to make
> content types to process configurable. I think that was an issue you
> originally raised? But compression is another issue.
Yep.
> Hmmm?
> If the backend sends compressed contents with no content-encoding,
> doesn't that imply:
> 1. INFLATE doesn't see encoding, so steps away.
> 2. xml2enc and proxy-html can't parse compressed content, so step away
>    (log an error?)
> 3. DEFLATE … aha, that's what you meant about double-compression.
> In effect the whole chain was reduced to just DEFLATE. That's a bit
> nonsensical but not incorrect, and the user-agent will reverse the
> DEFLATE and restore the original from the backend, yesno?
I think you are right. Yet when using Firefox or Chrome (both in their latest
versions) the final result is 'double compressed' nonetheless. Repeating
the steps manually (curl + gzip) works fine: the original file from the
server is restored as it should be. I'm reluctant to blame the clients,
however.
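To make sure I follow the three steps, here is a minimal standalone sketch of them in plain C. It is a toy model, not httpd code: body_state, the model_* functions, run_chain and layers_after_client_decode are hypothetical names, and the conditions are simplified assumptions (the real mod_deflate checks more than this).

```c
/* Simplified, hypothetical model of a proxied response body. */
typedef struct {
    int layers;            /* how many times the payload is gzip-compressed */
    int content_encoding;  /* 1 if a Content-Encoding header is present */
    int rewritten;         /* 1 if xml2enc/proxy-html processed the body */
} body_state;

/* Step 1: INFLATE only acts when the response is labelled as encoded. */
static void model_inflate(body_state *b)
{
    if (b->content_encoding && b->layers > 0) {
        b->layers--;
        b->content_encoding = 0;
    }
}

/* Step 2: xml2enc and proxy-html can only parse an uncompressed body. */
static void model_rewrite(body_state *b)
{
    if (b->layers == 0)
        b->rewritten = 1;
}

/* Step 3: DEFLATE compresses if the client accepts it and the response
 * is not already labelled as encoded (assumed condition). */
static void model_deflate(body_state *b, int client_accepts_gzip)
{
    if (client_accepts_gzip && !b->content_encoding) {
        b->layers++;
        b->content_encoding = 1;
    }
}

/* Runs the INFLATE;xml2enc;proxy-html;DEFLATE chain on one response. */
static body_state run_chain(body_state b, int client_accepts_gzip)
{
    model_inflate(&b);
    model_rewrite(&b);
    model_deflate(&b, client_accepts_gzip);
    return b;
}

/* Compression layers left once the client undoes the labelled encoding. */
static int layers_after_client_decode(body_state b)
{
    return b.content_encoding ? b.layers - 1 : b.layers;
}
```

In this model an unlabelled .tar.gz leaves the proxy with two layers and comes out of the client with one, i.e. the original file, which matches what I get with curl + gzip but not what the browsers show me.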
> But is the real issue anything more than an inability to use
> ProxyHTMLEnable with compressed contents? In which case, wouldn't
> mod_proxy_html be the place to patch? Have it test/insert deflate at
> the same point as it inserts xml2enc?
No, yes, and I tried but couldn't get it to work. Following your advice, I
went along the lines of
diff --git a/modules/filters/mod_proxy_html.c b/modules/filters/mod_proxy_html.c
index b964fec..9760115 100644
--- a/modules/filters/mod_proxy_html.c
+++ b/modules/filters/mod_proxy_html.c
@@ -1569,10 +1569,19 @@ static void proxy_html_insert(request_rec *r)
     proxy_html_conf *cfg;
     cfg = ap_get_module_config(r->per_dir_config, &proxy_html_module);
     if (cfg->enabled) {
-        if (xml2enc_filter)
+        int add_deflate_output_filter = 0;
+        if (apr_table_get(r->headers_in, "Content-Encoding:") != NULL) {
+            ap_add_input_filter("inflate", NULL, r, r->connection);
+            add_deflate_output_filter = 1;
+        }
+        if (xml2enc_filter) {
             xml2enc_filter(r, NULL, ENCIO_INPUT_CHECKS);
+        }
         ap_add_output_filter("proxy-html", NULL, r, r->connection);
         ap_add_output_filter("proxy-css", NULL, r, r->connection);
+        if (add_deflate_output_filter) {
+            ap_add_output_filter("deflate", NULL, r, r->connection);
+        }
     }
 }
 static void proxy_html_hooks(apr_pool_t *p)
but it appears to be way off because it does exactly nothing. When logging
the headers at this point, I found r->headers_in to contain the client's
request headers whereas r->headers_out was empty. Doesn't this tell me I'm
doing all of this too early?
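Looking at it again, I suspect two separate problems with the check in the patch, sketched below in plain standalone C (not httpd code). The sketch assumes header tables are keyed by the bare header name and matched case-insensitively, as apr_table_get does; header_entry, table_get and body_is_encoded are hypothetical stand-ins, and the sample header values are made up for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of a header table. */
typedef struct {
    const char *key;
    const char *val;
} header_entry;

/* Case-insensitive equality, the way header names are compared. */
static int name_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Simplified lookup: returns the value for key, or NULL if absent. */
static const char *table_get(const header_entry *t, size_t n, const char *key)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (name_equal(t[i].key, key))
            return t[i].val;
    }
    return NULL;
}

/* What the client sent: this is all that exists at insertion time. */
static const header_entry request_headers[] = {
    { "Host", "example.org" },
    { "Accept-Encoding", "gzip, deflate" },
};

/* What the backend would send later on (assumed sample response). */
static const header_entry response_headers[] = {
    { "Content-Type", "text/html" },
    { "Content-Encoding", "gzip" },
};

/* Returns 1 if the given table labels the body as encoded. */
static int body_is_encoded(const header_entry *t, size_t n)
{
    return table_get(t, n, "Content-Encoding") != NULL;
}
```

First, a key with a trailing colon ("Content-Encoding:") never matches, even against the response table. Second, the request table only ever carries Accept-Encoding, so the test has to run against the backend's response headers, which are not there yet when proxy_html_insert is called.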
On Tue, Dec 17, 2013 at 12:47 PM, Nick Kew <[email protected]> wrote:
>
> On 17 Dec 2013, at 10:32, Thomas Eckert wrote:
>
> > I've been over this with Nick before: mod_proxy_html uses mod_xml2enc to
> > do the detection magic but mod_xml2enc fails to detect compressed content
> > correctly. Hence a simple "ProxyHTMLEnable" fails when content compression
> > is in place.
>
> Aha! Revisiting that, I see I still have an uncommitted patch to make
> content types to process configurable. I think that was an issue you
> originally raised? But compression is another issue.
>
> > To work around this without dropping support for content compression you
> > can do
> >
> > SetOutputfilter INFLATE;xml2enc;proxy-html;DEFLATE
> >
> > or at least that was the kind-of-result of the half-finished discussion
> > last time.
>
> I didn't find that discussion. But I suspect my reaction would have
> included a certain aversion to that level of processing overhead in the
> proxy in these days of fatter pipes and hardware compression.
>
> > Suppose the client does
> >
> > GET /something.tar.gz HTTP/1.1
> > ...
> > Accept-Encoding: gzip, deflate
> >
> > to which the backend will respond with 200 but *not* send a
> > "Content-Encoding" header since the content is already encoded. Using the
> > above filter chain "corrupts" the content because it will be inflated and
> > then deflated, double compressing it in the end.
>
> Hmmm?
>
> If the backend sends compressed contents with no content-encoding, doesn't
> that imply:
> 1. INFLATE doesn't see encoding, so steps away.
> 2. xml2enc and proxy-html can't parse compressed content, so step away
> (log an error?)
> 3. DEFLATE … aha, that's what you meant about double-compression.
> In effect the whole chain was reduced to just DEFLATE. That's a bit
> nonsensical but not incorrect, and the user-agent will reverse the DEFLATE
> and restore the original from the backend, yesno?
>
> > Imho this whole issue lies with proxy_html using xml2enc to do the
> > content type detection and xml2enc failing to detect the content encoding.
> > I guess all it really takes is to have xml2enc inspect the headers_in to
> > see if there is a "Content-Encoding" header and then add the
> > inflate/deflate filters (unless there is a general reason not to rely on
> > the input headers, see below).
>
> Well in this particular case, surely it lies with the backend?
> But is the real issue anything more than an inability to use
> ProxyHTMLEnable with compressed contents? In which case, wouldn't
> mod_proxy_html be the place to patch? Have it test/insert deflate at
> the same point as it inserts xml2enc?
>
> > Of course, this whole issue would disappear if inflate/deflate would be
> > run automagically (upon seeing a Content-Encoding header) in general.
> > Anyway, what's the reasoning behind not having them run always and give
> > them the knowledge (e.g. about the input headers) to get out of the way if
> > necessary ?
>
> That's an interesting thought. mod_deflate will of course do exactly that
> if configured, so the issue seems to boil down to configuring that filter
> chain.
>
> The ultimate chain here would be:
> 1. INFLATE // unpack compressed contents
> 2. xml2enc // deal with charset for libxml2/mod_proxy_html
> 3. proxy-html // fix URLs
> 4. xml2enc // set an output encoding other than utf-8
> 5. DEFLATE // compress
>
> That's not possible with SetOutputFilter or FilterChain&family, because
> you can't configure both instances of xml2enc at once (that's what
> ProxyHTMLEnable deals with). But of those, 4 and 5 seem low-priority
> as they're not doing really essential things.
>
> Returning to:
> > SetOutputfilter INFLATE;xml2enc;proxy-html;DEFLATE
>
> AFAICS the only thing that's missing is the nonessential step 4 above.
>
> Am I missing something?
>
> --
> Nick Kew