Stefan Hartmann wrote:

Amos Jeffries wrote:

Stefan Hartmann wrote:
Hello,

I am running squid as a reverse proxy in front of a web server farm. We
are trying to implement content compression, and it breaks from time
to time.

The www servers are Windows IIS 5, and the compression is done by an
ISAPI filter (no, not the original broken M$ filter from the server).

We are using version 2.7.STABLE6 in our setup. The www servers all
send a "Vary: Accept-Encoding" header, and the setup works
perfectly in my test scenarios. We have no "broken_vary_encoding"
configured, and no ETag in the responses (we are only using Expires:
headers).
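
For reference, the accelerator part of our squid.conf looks roughly
like this (a sketch only; the hostname and peer address stand in for
our real setup):

 # Squid 2.7 reverse proxy (accelerator) in front of the IIS farm
 http_port 80 accel defaultsite=real.server.de vhost
 # "default" is what shows up as DEFAULT_PARENT in the logs below
 cache_peer 10.0.0.31 parent 80 0 no-query originserver default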

We installed the ISAPI filters last week without putting the "Vary:
Accept-Encoding" header in place on the www servers, and blocked the
"Accept-Encoding:" and "Vary:" headers at squid, waiting for a
maintenance window to activate it. The site worked without any problem.
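
The blocking itself was done with header_access rules, roughly like
this (in 2.7, header_access applies to request and reply headers alike):

 # strip compression negotiation while the feature is inactive
 header_access Accept-Encoding deny all
 header_access Vary deny all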

During the last maintenance window, we activated the "Accept-Encoding:"
and "Vary:" headers (no longer blocking them in squid) and set up the
www servers to send "Vary: Accept-Encoding" headers, and it works -
sometimes, with some browsers.

The failure we see is content pages that end after some kB of
correct data. I.e. the homepage is about 150 kB uncompressed and around
30 kB compressed (this is why we want compression), and the server farm
delivered content pages consisting of only the first 18 to 25 kB
(uncompressed; the size varies) of the complete page, never coming to
an end. This never happened in our test setup.

The pages were (as intended) cached by squid, so we had the situation
that, for example, Internet Explorer was working but Firefox got the
short page. And vice versa: sometimes Firefox worked but IE failed. And
sometimes all browsers worked.

From tonight's logs:
11:30 pm to 01:00 am: IE pages broken
01:00 am to 09:30 am: all working
09:30 am to 10:15 am: Firefox pages broken
10:15 am to 11:00 am: IE and Firefox pages broken

OK, perhaps the ISAPI filter is faulty under some conditions we did not
test, with some browsers or bots or... so we uninstalled the ISAPI
filter from all www servers, but left the "Vary: Accept-Encoding" header
in place.

Result: the error did not stop! I had to block the "Accept-Encoding:"
and "Vary:" headers in squid again to get the site working properly.

The next step was to remove the Vary: header from the www servers while
no longer blocking the "Accept-Encoding:" and "Vary:" headers in squid:
the site works properly.

So... are there any known issues with squid and www servers that send
"Vary: Accept-Encoding" (without actually doing content compression)?

When the error occurs, our logs show connections with "short" pages
(i.e. 18 kB vs. 150 kB normally), which are obviously aborted after
900 seconds (the second field in the log line below, 900301, is the
elapsed time in milliseconds):

Mon Apr 27 15:38:29 2009 900301 111.111.111.111 TCP_MISS/200 18806 GET
http://real.server.de/ - DEFAULT_PARENT/real.server.de text/html
[
Accept-Encoding: gzip, deflate
User-Agent: Nutscrape/1.0 (CP/M; 8-bit)
Host: real.server.de
Cookie: WT_SET=id=213.253.......
Cache-Control: max-age=259200
]

[
HTTP/1.0 200 OK
Date: Mon, 27 Apr 2009 13:23:29 GMT
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Realserver-info: BuildTime: 27.04.2009 15:23:29; TimeSpan:
00:00:02.6719434; CacheTime: 120; Server: WWW31
Publisher: Real-Server
Expires: Mon, 27 Apr 2009 13:25:29 GMT
Content-Type: text/html; charset=iso-8859-1
Content-Length: 168830
X-Cache: HIT from accel3
Connection: close
]

Please help!

Since which browser is affected seems to be erratic, I assume that one
particular client request is causing some bad data to enter squid's
cache, which is then served to all following clients for a period.
Look at the requests from the beginning of the period when things break.
If you can find the exact conditions or client, it will be much easier
to track through the logs on later occurrences.
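
If grepping the full log is too painful at your volume, 2.x can also
write a filtered log on the side; a sketch (the acl name and log path
are made up):

 # log only requests for the suspect page to a separate file
 acl suspect url_regex ^http://real\.server\.de/$
 access_log /var/log/squid/suspect.log common suspect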

It sounds a little like:
  http://squidproxy.wordpress.com/2008/04/29/chunked-decoding/

except for a few factors that don't fit:
 IIS 5 is not known for this issue,
 2.7 has a decoding hack to fix it,
 and Vary: seemed to be relevant here.

I'd try raising the debug_options levels for request processing a bit
to see what becomes visible.
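
For example (section numbers from a default build; 11 is HTTP and 33 is
the client-side code, and anything above level 2 gets noisy on a busy box):

 # keep general logging quiet, raise HTTP and client-side detail
 debug_options ALL,1 11,3 33,3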

Amos,

thanks for the reply. Debugging is somewhat tricky, since the server farm
has to handle lots of traffic (around 200 million content pages per
month) and debugging the real servers would generate a (too) huge amount
of data. And in my test scenario I don't get the error...

I will try to filter the "bad" requests. The idea is to drop the
Accept-Encoding headers if they are "crazy", i.e. (all seen live):

Accept-Encoding: FFFF, FFFFFFF
Accept-Encoding: mzip, meflate
Accept-Encoding: identity, deflate, gzip
Accept-Encoding: gzip;q=1.0, deflate;q=0.8, chunked;q=0.6,
identity;q=0.4, *;q=0
Accept-Encoding: gzip, deflate, x-gzip, identity; q=0.9
Accept-Encoding: gzip,deflate,bzip2
Accept-Encoding: nnnnndeflate
Accept-Encoding: x-gzip, gzip
Accept-Encoding: gzip,identity
Accept-Encoding: gzip, deflate, compress;q=0.9
Accept-Encoding: gzip,deflate,X.509


lol. Thanks.

and only let these two pass:

Accept-Encoding: gzip,deflate
Accept-Encoding: gzip, deflate

The first one is Firefox, the other is IE. This will match about 80-90%
of all requests, which would be OK.

So I tried:

acl zipit req_header Accept-Encoding ^gzip,deflate$
acl zipit req_header Accept-Encoding ^gzip, deflate$
[...]
header_access Accept-Encoding allow zipit

But something seems to be wrong with the regexes above: squid lets
through not only "gzip,deflate", as I would expect, but also
"gzip,deflate,xx" and "gzip,xx". "bla" is blocked.

It seems squid lets the header pass if it starts with "gzip,",
disregarding the rest. Am I wrong with my regex?

Squid splits the ACL values on whitespace. The space in your second
pattern turns that line into two patterns, ^gzip, and deflate$, so the
ACL ends up with three patterns in total; ^gzip, matches any header
that starts with "gzip,".

I'd use:
 acl zipit req_header Accept-Encoding ^gzip,.deflate$

(the "." matches the space, since the pattern itself cannot contain
whitespace)
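
Putting it together, something like this (untested) should let only
those two browser forms through; the trailing deny is the usual
whitelist pattern from squid.conf.default:

 # match "gzip,deflate" (Firefox) and "gzip, deflate" (IE)
 acl zipit req_header Accept-Encoding ^gzip,deflate$
 acl zipit req_header Accept-Encoding ^gzip,.deflate$
 header_access Accept-Encoding allow zipit
 header_access Accept-Encoding deny all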


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
  Current Beta Squid 3.1.0.7
