"Andrew M. Bishop" wrote:
>
> "Paul A. Rombouts" <[EMAIL PROTECTED]> writes:
>
> > "Andrew M. Bishop" wrote:
> > >
> > > "Paul A. Rombouts" <[EMAIL PROTECTED]> writes:
> > >
> > > > I've had a problem with the website abcnews.go.com that also turned out to be
> > > > deflate-compression related.
> > > > When I use Mozilla the site displays OK, but the Internet Explorer users on my
> > > > LAN got a blank page.
...
> > > What does Internet Explorer ask for and what does the site send if
> > > WWWOFFLE is not used?
> >
> > OK, this is the request of Internet Explorer 5.5 (without using a proxy):
> >
> > GET / HTTP/1.1
> > Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
> > application/vnd.ms-excel, application/msword, */*
> > Accept-Language: nl,en;q=0.5
> > Accept-Encoding: gzip, deflate
> > User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; T312461)
> > Host: abcnews.go.com
> > Connection: Keep-Alive
>
> > This is the header of the response:
> >
> > HTTP/1.1 200 OK
> > Server: Microsoft-IIS/5.0
> > Set-Cookie: SWID=2481AFA9-5A69-4892-9E0D-B699F717BE7A; path=/; expires=Wed,
> > 03-Jul-2022 18:50:17 GMT; domain=.go.com;
> > Cache-Expires: Wed, 03 Jul 2002 18:51:45 GMT
> > Cache-Control: max-age=300
> > Date: Wed, 03 Jul 2002 18:50:17 GMT
> > Content-Type: text/html
> > Accept-Ranges: bytes
> > Last-Modified: Wed, 03 Jul 2002 18:46:45 GMT
> > ETag: "f857bcfac122c21:953"
> > Vary: Accept-Encoding, User-Agent
> > Content-Encoding: deflate
> > Warning: 214 abcnews.go.com "Redline Networks Densitron active"
> > Transfer-Encoding: chunked
> > Via: 1.1 abcnews.go.com (Redline Networks Accelerator 2.0.7A3)
> >
> >
> > Here's a hex dump of the first 64 bytes of the content:
> >
> > 0001151 33 32 36 43 0d 0a ed 7d 7b 57 e2 c8 16 ef df ed
> > 0001171 5a fd 1d aa 33 6b 14 8f f2 56 7c 62 2f 04 54 66
> > 0001211 14 3c 80 ed f4 9c 7b 16 2b 40 80 4c 07 c2 24 41
> > 0001231 db 99 d3 1f e8 7e cb fb db 55 95 a4 02 41 d1 ee
> >
> I am very interested to see that the first six characters of the reply
> data look like a DOS format line, terminated with CR, LF.
>
> 326C^M^J
>
> I don't know if it means anything or is just coincidence.
Yes, I noticed that too. I also noticed that the reply header contains
"Transfer-Encoding: chunked".
The following quote from RFC 2616 explains it:
-------------------- RFC 2616 --------------------
The chunked encoding modifies the body of a message in order to transfer it as a
series of chunks, each with its own size indicator, followed by an OPTIONAL
trailer containing entity-header fields. This allows dynamically produced
content to be transferred along with the information necessary for the recipient
to verify that it has received the full message.

    Chunked-Body    = *chunk
                      last-chunk
                      trailer
                      CRLF

    chunk           = chunk-size [ chunk-extension ] CRLF
                      chunk-data CRLF
    chunk-size      = 1*HEX
    last-chunk      = 1*("0") [ chunk-extension ] CRLF

    chunk-extension = *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
    chunk-ext-name  = token
    chunk-ext-val   = token | quoted-string
    chunk-data      = chunk-size(OCTET)
    trailer         = *(entity-header CRLF)

The chunk-size field is a string of hex digits indicating the size of the chunk.
The chunked encoding is ended by any chunk whose size is zero, followed by the
trailer, which is terminated by an empty line.
--------------------------------------------------
So "326C" followed by CR, LF is the chunk-size line of the first chunk
(0x326C = 12908 bytes), and the first byte of the compressed data is actually 0xed.
I fetched several different pages from abcnews.go.com with
"wget --user-agent='Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; T312461)'
--header='Accept-Encoding: gzip, deflate' [URL]..."
and the first two bytes were always 0xed, 0x7d.
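
To make the chunk framing concrete, here is a minimal Python sketch of a decoder for the grammar quoted above. The function names decode_chunked and first_chunk are made up for this illustration, trailers are simply ignored, and no error handling is attempted, so treat it as a reading aid and not as what WWWOFFLE or any browser actually does.

```python
def decode_chunked(body: bytes) -> bytes:
    """Undo Transfer-Encoding: chunked for a complete body; trailers are ignored."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)                # end of the chunk-size line
        size_field = body[pos:eol].split(b";", 1)[0]  # drop any chunk-extension
        size = int(size_field.strip(), 16)            # chunk-size = 1*HEX
        pos = eol + 2
        if size == 0:                                 # last-chunk ends the body
            break
        out += body[pos:pos + size]
        pos += size + 2                               # skip chunk-data and its CRLF
    return bytes(out)


def first_chunk(buf: bytes) -> tuple[int, bytes]:
    """Return (declared size, data bytes present so far) for a partial body."""
    eol = buf.index(b"\r\n")
    size = int(buf[:eol].split(b";", 1)[0], 16)
    return size, buf[eol + 2:eol + 2 + size]


# First 16 bytes of the hex dump quoted earlier in this thread.
dump = bytes.fromhex("33 32 36 43 0d 0a ed 7d 7b 57 e2 c8 16 ef df ed")
size, data = first_chunk(dump)
print(hex(size), hex(data[0]), hex(data[1]))  # prints: 0x326c 0xed 0x7d
```

Applied to the dump, this gives a declared chunk size of 0x326c and shows that the content-encoded data starts with 0xed, 0x7d once the chunk-size line is stripped.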
>
> In RFC 2616 (HTTP/1.1) the deflate content encoding is defined as:
>
> -------------------- RFC 2616 --------------------
> deflate
> The "zlib" format defined in RFC 1950 [31] in combination with
> the "deflate" compression mechanism described in RFC 1951 [29].
> -------------------- RFC 2616 --------------------
>
> The deflate compression method is defined in RFC 1951 and the zlib
> format in RFC 1950 defines the header and checksum to use for deflate
> compressed data. The first byte is defined as:
>
> -------------------- RFC 1950 --------------------
> CMF (Compression Method and flags)
> This byte is divided into a 4-bit compression method and a 4-
> bit information field depending on the compression method.
>
> bits 0 to 3 CM Compression method
> bits 4 to 7 CINFO Compression info
>
> CM (Compression method)
> This identifies the compression method used in the file. CM = 8
> denotes the "deflate" compression method with a window size up
> to 32K. This is the method used by gzip and PNG (see
> references [1] and [2] in Chapter 3, below, for the reference
> documents). CM = 15 is reserved. It might be used in a future
> version of this specification to indicate the presence of an
> extra field before the compressed data.
>
> CINFO (Compression info)
> For CM = 8, CINFO is the base-2 logarithm of the LZ77 window
> size, minus eight (CINFO=7 indicates a 32K window size). Values
> of CINFO above 7 are not allowed in this version of the
> specification. CINFO is not defined in this specification for
> CM not equal to 8.
> -------------------- RFC 1950 --------------------
>
> The value of CM must be 8 for deflate and the value of CINFO depends
> on the LZ77 window size, with 7 indicating the maximum 32K window. With
> that window, the first byte of the zlib/deflate data stream is 0x78.
>
> The first bytes of a gzip data stream are defined in RFC 1952 and are
> always 0x1f, 0x8b.
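
The header checks described above can be written down as a short Python sketch. The helper names describe_zlib_header and looks_like_gzip are invented for this example. The FCHECK rule (CMF*256 + FLG must be a multiple of 31) comes from RFC 1950 but is not part of the excerpt quoted in this thread, and the 0x78 0x9c pair is only used as a commonly seen zlib header for comparison.

```python
def describe_zlib_header(cmf: int, flg: int) -> dict:
    """Split the first two bytes of a candidate RFC 1950 stream into fields."""
    cm = cmf & 0x0F        # bits 0 to 3: compression method
    cinfo = cmf >> 4       # bits 4 to 7: log2(window size) - 8 when CM = 8
    fcheck_ok = (cmf * 256 + flg) % 31 == 0   # RFC 1950 FCHECK rule
    return {
        "cm": cm,
        "cinfo": cinfo,
        "fcheck_ok": fcheck_ok,
        "valid": cm == 8 and cinfo <= 7 and fcheck_ok,
    }


def looks_like_gzip(data: bytes) -> bool:
    """RFC 1952 streams always start with the magic bytes 0x1f, 0x8b."""
    return data[:2] == b"\x1f\x8b"


observed = describe_zlib_header(0xED, 0x7D)   # bytes seen from abcnews.go.com
typical = describe_zlib_header(0x78, 0x9C)    # comparison value, an assumption
print(observed)
print(typical)
print(looks_like_gzip(b"\xed\x7d"))
```

For the observed bytes this yields CM = 13 and CINFO = 14 with a failing FCHECK, so the data is neither a zlib stream nor a gzip stream by these tests.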
Hmmm..., that's very odd. That's the way I read it too.
I can only conclude that Microsoft-IIS is non-compliant with the RFCs.
Perhaps Microsoft has "embraced and extended" the ZLIB Compressed Data Format?
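
One reading that might be worth testing, and which nothing in this thread confirms, is that the server sends the bare RFC 1951 deflate stream without the RFC 1950 header and checksum. The Python sketch below is only a way to try that idea: inflate_lenient and raw_block_header are hypothetical helper names, and the fallback uses the negative wbits convention of Python's zlib module to request headerless inflation.

```python
import zlib


def raw_block_header(first_byte: int) -> tuple[int, int]:
    """Read BFINAL (bit 0) and BTYPE (bits 1-2) as RFC 1951 lays them out."""
    return first_byte & 1, (first_byte >> 1) & 3


def inflate_lenient(data: bytes) -> bytes:
    """Try a zlib-wrapped stream first, then fall back to raw deflate."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        # Negative wbits asks zlib for a headerless (raw) deflate stream.
        return zlib.decompress(data, -zlib.MAX_WBITS)


# If 0xed were the start of a raw deflate stream, it would announce a final
# block (BFINAL = 1) using dynamic Huffman codes (BTYPE = 2).
print(raw_block_header(0xED))  # prints: (1, 2)

# Self-check with locally generated data, not with the abcnews.go.com body.
payload = b"<html><body>hello hello hello</body></html>"
compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
raw = compressor.compress(payload) + compressor.flush()
print(inflate_lenient(raw) == payload)
print(inflate_lenient(zlib.compress(payload)) == payload)
```

If the saved response body (after removing the chunk framing) inflates through the fallback path, that would point to a missing zlib wrapper; if it does not, some other explanation is needed.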
--
Paul A. Rombouts <[EMAIL PROTECTED]>
Vincent van Goghlaan 27
5246 GA Rosmalen
Netherlands