Paul Slootman <[EMAIL PROTECTED]> writes:

> I just ran into another site where this goes wrong
> (http://www.canon.co.jp/Imaging/PSG1/PSG1_Firmware-e.html)
> with the latest & greatest 2.7b (plain, no additional patches like in
> the version I initially talked about).
>
> On Sat 02 Mar 2002, Paul Slootman wrote:
>
> > I've noticed sometimes that when I request a page that's already in the
> > cache because I accessed it in another browser, it won't load properly.
> > This is when a page won't work in Opera (my main browser) because of
> > javascript problems, and then I try it in Mozilla or Netscape. Mozilla
> > shows a blank page (show source shows <html><body></body></html>, but
> > I've since learnt here that Mozilla does that to empty pages, grrr).
> > Netscape says "communications error".

When I request this URL, I get the following in the WWWOFFLE log file:

-------------------- wwwoffle log file --------------------
wwwoffled[20825] Information: Forked wwwoffles -fetch (pid=21401).
wwwoffles[21401] Information: URL='http://www.canon.co.jp/Imaging/PSG1/PSG1_Firmware-e.html'.
wwwoffles[21401] Debug: proto='http'; host='www.canon.co.jp'; path='/Imaging/PSG1/PSG1_Firmware-e.html'; args='(null)'; user:pass='(null):(null)'.
wwwoffles[21401] Information: Directory 'http/www.canon.co.jp' does not exist [No such file or directory]; creating one.
wwwoffles[21401] Debug: CensorRequestHeader (RefererSelfDir) replaced '(none)' by 'http://www.canon.co.jp/Imaging/PSG1/'.
wwwoffles[21401] Debug: CensorRequestHeader replaced 'User-Agent: Mozilla/5.0 Galeon/1.0.3 (X11; Linux i686; U;) Gecko/20020214' by 'User-Agent: WWWOFFLE/2.7 (http://www.gedanken.demon.co.uk/wwwoffle/)'.
wwwoffles[21401] Information: Cache Access Status='New Page'.
wwwoffles[21401] Debug: Parsing document of MIME Type 'text/html'.
wwwoffles[21401] Debug: Parsing document using HTML parser.
wwwoffles[21401] Debug: Image=http://www.canon.co.jp/Imaging/G7/CMP/belogo.gif
...
wwwoffles[21401] Debug: Image=http://www.canon.co.jp/Imaging/PSG1/FRM/IMG/404_serial.gif
wwwoffled[20825] Information: Child wwwoffles exited with status 4 (pid=21401).
-------------------- wwwoffle log file --------------------

The header in the lasttime directory is what I expect.
No gzip header, normal uncompressed data.

> Here's what I now get:
>
> $ nc localhost 8080 > /tmp/ps1
> get http://www.canon.co.jp/Imaging/PSG1/PSG1_Firmware-e.html HTTP/1.0
>
> $ less /tmp/ps1
> HTTP/1.0 200 OK
> Date: Wed, 24 Apr 2002 16:48:39 GMT
> Server: Apache/1.3.14 (Unix)
> Content-Type: text/html
> Content-Encoding: gzip
> Content-Length: 53407
> Connection: close
> Proxy-Connection: close

When I try this I get the following:

$ nc localhost 8080 | head -8
get http://www.canon.co.jp/Imaging/PSG1/PSG1_Firmware-e.html HTTP/1.0
HTTP/1.0 200 OK
Date: Sat, 27 Apr 2002 06:14:55 GMT
Server: Apache/1.3.4 (Unix)
Content-Type: text/html
Content-Length: 54756
Connection: close
Proxy-Connection: close

This is the same if I use the 'reply-compressed-data' option or not.

> Interesting detail:
>
> $ nc www.canon.co.jp 80 > /tmp/ps1
> GET /Imaging/PSG1/PSG1_Firmware-e.html HTTP/1.0
> Host: www.canon.co.jp
> Accept-Encoding: gzip, deflate, compress;q=0.9
>
> $ less /tmp/ps1
> HTTP/1.1 200 OK
> Date: Wed, 24 Apr 2002 17:03:32 GMT
> Server: Apache/1.3.4 (Unix)
> Transfer-Encoding: chunked
> Content-Type: text/html
>
> 1000
> <html>^M^M <head>^M <meta http-equiv="content-type" content=
> [...]
>
>
> Note how I request compressed data, but don't get it
> (I get "chunked" instead?!).

The server is broken. It should not send back chunked data since that is
an HTTP/1.1 feature and is only allowed if asked for.

> So why wwwoffle add the Content-Encoding: gzip header?

What does the WWWOFFLE debug log file say?

-- 
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop                             [EMAIL PROTECTED]
                      http://www.gedanken.demon.co.uk/

WWWOFFLE users page:
        http://www.gedanken.demon.co.uk/wwwoffle/version-2.7/user.html
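
A quick way to test a saved reply such as /tmp/ps1 for the two symptoms discussed in this thread is sketched below. It is not part of WWWOFFLE; the function names split_head_body and check_response are made up for illustration. It only checks whether a reply that claims "Content-Encoding: gzip" really starts with the gzip magic bytes (0x1f 0x8b), and whether "Transfer-Encoding: chunked" appears in a reply to an HTTP/1.0 request.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def split_head_body(raw):
    """Split a raw HTTP reply at the first blank line (CRLF or bare LF)."""
    found = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, sep) for pos, sep in found if pos != -1]
    if not found:
        return raw, b""
    pos, sep = min(found)  # whichever separator occurs first
    return raw[:pos], raw[pos + len(sep):]


def check_response(raw, request_version="HTTP/1.0"):
    """Return a list of inconsistencies found in a raw HTTP reply.

    Hypothetical helper: it looks only at the two headers discussed
    in this thread and does not validate anything else.
    """
    head, body = split_head_body(raw)
    headers = {}
    for line in head.decode("latin-1").splitlines()[1:]:  # skip status line
        name, colon, value = line.partition(":")
        if colon:
            headers[name.strip().lower()] = value.strip().lower()

    problems = []
    if "gzip" in headers.get("content-encoding", ""):
        if not body.startswith(GZIP_MAGIC):
            problems.append("Content-Encoding is gzip but body is not gzip data")
    if request_version == "HTTP/1.0":
        if "chunked" in headers.get("transfer-encoding", ""):
            problems.append("chunked reply sent to an HTTP/1.0 request")
    return problems
```

Assuming the reply was captured with nc as above, it could be checked with check_response(open("/tmp/ps1", "rb").read()); an empty list means neither symptom was found.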
