[EMAIL PROTECTED] wrote...
Hi to all,
A new question to HTTP / RFC gurus.
A customer has developed a custom PHP HTTP client,
using HTTP 1.0 and compression.
That's like mixing Vodka and Beer... something could
easily puke... but OK... I hear ya...
This HTTP client compresses both requests and replies.
Sure, why not.
For replies it works great but for request we have
a doubt.
I imagine so, yes.
Since the HTTP client compresses the request, the
HTTP header contains:
Content-Encoding: gzip
Also the Content-Length is set to the size of the
plain request (not the size of the compressed request).
Is that correct, or should it send the Content-Length with
the size of the compressed request?
In that case, it seems that the mod_deflate INPUT filter should
modify the Content-Length accordingly?
Thanks for your help
You've got some messed up code on your hands, Henri.
In your particular case... Content-length should ALWAYS be the
ACTUAL number of bytes on the wire. Anything else
is going to screw something up somewhere.
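As a rough illustration of that rule (in Python rather than the
customer's PHP, and with a made-up helper name, path and Content-Type),
a client that compresses its request body would compute Content-Length
from the compressed bytes, not from the plain ones:

```python
import gzip


def build_gzip_request(path: str, plain_body: bytes) -> bytes:
    """Build an HTTP/1.0 POST whose body is gzip-compressed.

    Hypothetical helper for illustration; the path and Content-Type
    are assumptions, not taken from the customer's client.
    """
    compressed = gzip.compress(plain_body)
    headers = [
        f"POST {path} HTTP/1.0",
        "Content-Type: application/x-www-form-urlencoded",
        "Content-Encoding: gzip",
        # Length of the bytes actually sent, not of the plain body.
        f"Content-Length: {len(compressed)}",
    ]
    head = "\r\n".join(headers) + "\r\n\r\n"
    return head.encode("ascii") + compressed
```

The receiver can then read exactly Content-Length bytes off the wire
and hand them to its decompressor.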
You have to remember the difference between 'Content-Encoding:'
and 'Transfer-Encoding:'. 'Transfer-Encoding:' is a TRANSPORT
layer thing but 'Content-Encoding:' is a PRESENTATION
layer thing.
When any HTTP request or response says that its BODY DATA
has 'Content-type: ' and/or 'Content-Length: ' what that
really meant ( in early HTTP terms ) is...
Content-Type: = Original MIME type of original data (file).
Content-Length: = Actual length of original data (file).
The original assumption in early HTTP was that this would always
represent some file on some disk and the 'Content-type:' was
usually just the file extension (mapped) and the 'Content-length:' was
whatever a 'stat()' call says the file length was.
When Content started to get produced dynamically ( does not
exist until asked for ) things got a little sticky but the CONCEPT
is still the same. Content-type: is supposed to be the MIME type
'as-if' the 'file' already existed and 'Content-length' would be the
exact number of ( PRESENTATION LAYER ) bytes 'as-if' the
'data file' was sitting on a disk somewhere.
If ANYTHING steps in to alter or filter or convert the 'content'
at the PRESENTATION layer then it MUST change the 'Content-Length'
as well because from the 'Content-x' perspective... the
content has, in fact, changed at the PRESENTATION layer.
There is no HTTP header field that looks like this...
Original-Content-Length: - Length of data before P layer content changed
All you have to work with is this...
Content-length: - Length of P layer data NOW after something changes it.
RFC 2616 says...
4.4 Message Length
3. If a Content-Length header field ( section 14.13 ) is present, its
decimal value in OCTETs represents BOTH the entity-length and
the transfer-length. The Content-Length header field must NOT be sent
if these two lengths are different [snip]
What this really means is...
3. If a ( PRESENTATION layer ) Content-Length header field
( section 14.13 ) is present, its decimal value in OCTETs represents
BOTH the entity-length ( Actual PRESENTATION layer length ) and
the transfer-length. ( TRANSPORT layer length - actual number of
bytes on the wire ). The Content-Length header field must NOT be sent
if these two lengths are different [snip]
The last part is kind of moot since it's not uncommon at all for
presentation layer content-length to be 'different' from the actual
transport layer length. You will see it all the time 'out there'. The
only thing that gets you into real trouble is when the actual length
of the data is MORE than whatever the 'Content-length:' field says
it's supposed to be.
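A small sketch of why that case hurts, using an in-memory stream in
place of a socket (the function name is hypothetical). A receiver that
trusts Content-Length reads that many bytes and stops, so anything
beyond it is cut off the body and left sitting on the connection:

```python
import io


def read_body(stream, content_length: int) -> bytes:
    """Read the body the way a simple length-trusting receiver might.

    Hypothetical helper; a real server reads from a socket, where a
    declared length LARGER than the data would block until timeout
    instead of returning short as BytesIO does.
    """
    return stream.read(content_length)


# Ten bytes are on the wire but the header claimed only six.
wire = io.BytesIO(b"0123456789")
body = read_body(wire, 6)   # body is truncated to the declared length
leftover = wire.read()      # the remaining bytes were never part of the body
```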
Example: Even with all the above being said... it is actually OK to
leave 'Content-Length:' set to the original size of the file IF you are
using GZIP or DEFLATE ( or any LZ77 ) to compress the content.
As long as the specified 'Content-length:' ( original size ) is MORE
than the number of compressed LZ77 bytes on the wire you will
usually still be OK.
Why?... because GZIP and ZLIB and all other LZ77 decompressors
can find the end of the data on their own and don't need HTTP
to tell it to them. GZIP, for one, records the size of the original
data ( modulo 2^32 ) in the trailer at the end of its stream.
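For gzip specifically, the recorded length is the ISIZE field defined
in RFC 1952: the last four bytes of a gzip member, little-endian,
holding the uncompressed size modulo 2^32. A minimal sketch of reading
it (the function name is made up, and it assumes a single-member gzip
stream):

```python
import gzip
import struct


def gzip_original_size(member: bytes) -> int:
    """Return ISIZE from the trailer of a single gzip member.

    ISIZE is the uncompressed length modulo 2**32, so it is only
    reliable for inputs smaller than 4 GiB.
    """
    (isize,) = struct.unpack("<I", member[-4:])
    return isize


plain = b"hello world " * 1000
compressed = gzip.compress(plain)
original_size = gzip_original_size(compressed)
```

Note that a plain ZLIB stream carries a checksum but no such length
field, so this only applies to the gzip container.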
Even 'streamed compression' ( sic: ZLIB ) will KNOW when the
decompression has ended. There's an EOD signal built into
the stream itself... but that doesn't mean the Server will know
what the decompressor 'knows'.
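To see that end-of-data signal from the decompressor's side, here is a
small sketch using Python's zlib module (the function name and chunking
are assumptions for illustration). The decompressor reports the end of
the stream itself, regardless of what any Content-Length header said:

```python
import zlib


def decompress_stream(chunks):
    """Feed chunks to a zlib decompressor until it signals end of stream.

    Returns (data, reached_end, unused), where unused holds any bytes
    from the final chunk that came after the end of the zlib stream.
    """
    d = zlib.decompressobj()
    out = bytearray()
    for chunk in chunks:
        out += d.decompress(chunk)
        if d.eof:  # the stream's own end marker has been seen
            break
    return bytes(out), d.eof, d.unused_data


plain = b"payload " * 200
wire = zlib.compress(plain) + b"EXTRA"  # trailing bytes after the stream
data, reached_end, unused = decompress_stream([wire])
```

The server-side code reading the socket still has to be told, by some
other means, how many bytes to pass in.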
Which brings us to your 'action items', methinks.
If you are using 'streamed compression' ( Sic: ZLIB ) then there
will only be 2 ways that the Server knows how many bytes the
Client is actually SENDING...
1. The Content-Length in the request header is, in fact, the transfer
length and the Server will stop reading data when that length is
reached and won't 'timeout' or anything waiting for more data that
never arrives.
2. The Client is using HTTP/1.1 and "Transfer-encoding: