Re: [users@httpd] Generating a gzip response from multiple pre-gzipped files on disk

2014-02-07 Thread Thomas Eckert
Can you post the headers, from sending the request(s) up to and including
the response(s)?

I think you might be hitting the same spot as I recently did in (1). In
short, most (if not all) popular clients do not unpack responses if they
think they shouldn't, even if the headers tell them to. So for example,
Content-Encoding: gzip, deflate will not make my Firefox run gunzip on a
file like data.gz. At this point I can only speculate because I did not
dig deeper into the client behaviour, but I *think* this is because they
sniff the content or at least the file extension.

(1)
http://mail-archives.apache.org/mod_mbox/httpd-dev/201401.mbox/%3CCAPV0b06Z6Yey7Wa6gACCyrxui36WnB5gvJxQwCSWiZMahgnynQ%40mail.gmail.com%3E


On Thu, Feb 6, 2014 at 6:54 PM, Tom Evans tevans...@googlemail.com wrote:

 Hi all

 At $JOB we have a web app that generates XML for another web app to
 use. Each complete XML document is a list of individual items, and
 each item is stored on disk, in gzip format to save space - the format
 is overly verbose, and compression is highly effective, and gzip is
 nicely transparent to lots of utilities (vim mainly).

 Currently, a django app assembles the document together (it also
 generates them if they are missing, but let's ignore that for now). It
 first reads each file off disk, decompresses it, assembles one large
 string (sometimes 100MB+ XML), compresses it again (sigh) and then
 hands it off to apache.

 As a naive attempt, I modified the django app to simply load the file
 from disk, pre- and append a compressed header and footer, and then
 hand that off to apache with the appropriate content type.

 This worked in some respects - downloading the file to disk using
 fetch, then gzcat+md5 confirmed that the uncompressed response was
 bit-for-bit identical, but all real web clients I gave it to (firefox,
 chrome, libcurl) would only see the first chunk - the header, whereas
 gzcat sees all the chunks.

 So, my questions are two-fold:

 1) Is there something in the gzip file header which makes this approach a
 no-go?
 2) Is there any approach in stock httpd that could assemble docs like
 this (if it is even possible), or would I be looking at a custom
 module?

 I appreciate only the second one is really on topic here :)

 Cheers

 Tom

 -
 To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org
 For additional commands, e-mail: users-h...@httpd.apache.org




[users@httpd] Generating a gzip response from multiple pre-gzipped files on disk

2014-02-06 Thread Tom Evans
Hi all

At $JOB we have a web app that generates XML for another web app to
use. Each complete XML document is a list of individual items, and
each item is stored on disk, in gzip format to save space - the format
is overly verbose, and compression is highly effective, and gzip is
nicely transparent to lots of utilities (vim mainly).

Currently, a django app assembles the document together (it also
generates them if they are missing, but let's ignore that for now). It
first reads each file off disk, decompresses it, assembles one large
string (sometimes 100MB+ XML), compresses it again (sigh) and then
hands it off to apache.
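
For illustration, the decompress-and-recompress step described above does not strictly need the whole document in memory as one string. The following is only a sketch under stated assumptions: the function name, the header/footer parameters, and the chunk size are hypothetical and not taken from the actual django app. It still pays the recompression cost, but it yields a single gzip member piece by piece.

```python
import gzip
import zlib


def recompress_as_one_member(paths, header=b"", footer=b"", chunk_size=64 * 1024):
    """Yield the pieces of ONE single-member gzip stream.

    `paths` are pre-gzipped item files; `header` and `footer` are the
    uncompressed XML wrapper. All names here are hypothetical.
    """
    # wbits=31 asks zlib to write a gzip container (not raw deflate/zlib).
    comp = zlib.compressobj(6, zlib.DEFLATED, 31)

    def feed(data):
        # compress() may return b"" while zlib buffers input; skip those.
        out = comp.compress(data)
        if out:
            yield out

    yield from feed(header)
    for path in paths:
        # gzip.open decompresses incrementally, so only one chunk of
        # uncompressed XML is held in memory at a time.
        with gzip.open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                yield from feed(chunk)
    yield from feed(footer)
    # flush() emits the remaining compressed data plus the gzip trailer.
    yield comp.flush()
```

A generator like this could be handed to a streaming response instead of a fully assembled string; whether that fits the real app is an assumption.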

As a naive attempt, I modified the django app to simply load the file
from disk, pre- and append a compressed header and footer, and then
hand that off to apache with the appropriate content type.
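
A rough sketch of that naive approach, assuming the header and footer are compressed on the fly and each item file on disk is already a complete gzip member (the function and parameter names are hypothetical, not the real app's code). RFC 1952 defines a gzip file as a series of members, so the concatenation is itself a valid gzip file, just a multi-member one.

```python
import gzip


def concat_members(paths, header=b"", footer=b""):
    """Yield a multi-member gzip stream without recompressing the items.

    Hypothetical helper: `paths` point at pre-gzipped item files,
    `header` and `footer` are the uncompressed XML wrapper.
    """
    # The header becomes its own small gzip member.
    yield gzip.compress(header)
    for path in paths:
        # Read the raw bytes; the file is passed through untouched.
        with open(path, "rb") as f:
            yield f.read()
    # The footer is the final member.
    yield gzip.compress(footer)
```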

This worked in some respects - downloading the file to disk using
fetch, then gzcat+md5 confirmed that the uncompressed response was
bit-for-bit identical, but all real web clients I gave it to (firefox,
chrome, libcurl) would only see the first chunk - the header, whereas
gzcat sees all the chunks.
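
The difference between the two behaviours can be reproduced with Python's standard library. This is only an illustration, with made-up sample data: gzip.decompress reads every member of a concatenated stream (like gzcat), while a single zlib decompress object stops at the end of the first member and leaves the rest in unused_data. That the web clients stop for the same reason is an assumption consistent with the symptom, not something verified here.

```python
import gzip
import zlib

# Three independent gzip members, concatenated (sample data is made up).
header = gzip.compress(b"<items>")
item = gzip.compress(b"<item>1</item>")
footer = gzip.compress(b"</items>")
body = header + item + footer

# Multi-member aware: decodes all three members, like gzcat.
full = gzip.decompress(body)

# Single-member decoder: wbits=31 selects the gzip container. The object
# stops at the end of the first member; the remainder is not decoded.
single = zlib.decompressobj(31)
first = single.decompress(body)
rest = single.unused_data


def decompress_all_members(data):
    """Decode every member by restarting the decoder on the leftover bytes."""
    parts = []
    while data:
        d = zlib.decompressobj(31)
        parts.append(d.decompress(data))
        data = d.unused_data
    return b"".join(parts)
```

Here `first` holds only the header, and `rest` still holds the compressed item and footer, which matches a client that shows just the first chunk.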

So, my questions are two-fold:

1) Is there something in the gzip file header which makes this approach a no-go?
2) Is there any approach in stock httpd that could assemble docs like
this (if it is even possible), or would I be looking at a custom
module?

I appreciate only the second one is really on topic here :)

Cheers

Tom
