Re: [squid-users] ecap adapter munging cached body

2011-01-26 Thread Jonathan Wolfe
Okay, I narrowed this down a bit more with some help from Alex Rousskov:

When it works (doing a string replace from asdf to fdsa for example, so 
same total content length):

2011/01/26 16:07:21.312| storeEntryValidLength: Checking 
'1078B4E8EC1D17CFEBCD533EE19F7FD6'
2011/01/26 16:07:21.312| storeEntryValidLength: object_len = 20366
2011/01/26 16:07:21.312| storeEntryValidLength: hdr_sz = 360
2011/01/26 16:07:21.312| storeEntryValidLength: content_length = 20006
2011/01/26 16:07:21.317| StoreEntry::setMemStatus: inserted mem node 
http://www.example.com/squid-test

When it doesn't work (asdf to just a):

2011/01/26 16:05:59.878| storeEntryValidLength: Checking 
'1078B4E8EC1D17CFEBCD533EE19F7FD6'
2011/01/26 16:05:59.878| storeEntryValidLength: object_len = 14843
2011/01/26 16:05:59.878| storeEntryValidLength: hdr_sz = 360
2011/01/26 16:05:59.878| storeEntryValidLength: content_length = 20006
2011/01/26 16:05:59.878| storeEntryValidLength: 5523 bytes too small; 
'1078B4E8EC1D17CFEBCD533EE19F7FD6'
2011/01/26 16:05:59.879| StoreEntry::checkCachable: NO: wrong content-length

The headers returned in both cases don't actually include a Content-Length 
header, which is removed by the module using adapted->header().removeAny.

It looks like squid is restoring the content length in the second case, and 
declaring it too small.

See https://answers.launchpad.net/ecap/+question/142965 for my discussion with 
Alex on this.  The diff he provided, which is repeated here, seems to have the 
effect of setting the message content length to -1 when removing the content 
length header from within the ecap module, and that results in this:

2011/01/26 17:21:46.539| storeEntryValidLength: Checking 
'1078B4E8EC1D17CFEBCD533EE19F7FD6'
2011/01/26 17:21:46.539| storeEntryValidLength: object_len = 16190
2011/01/26 17:21:46.539| storeEntryValidLength: hdr_sz = 360
2011/01/26 17:21:46.539| storeEntryValidLength: content_length = -1
2011/01/26 17:21:46.539| storeEntryValidLength: Unspecified content length: 
1078B4E8EC1D17CFEBCD533EE19F7FD6
2011/01/26 17:21:46.544| StoreEntry::setMemStatus: inserted mem node 
http://www.example.com/squid-test

Not the best behavior, but it does cache as expected now.
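
The arithmetic in the three log excerpts above can be reproduced with a short Python sketch. It is modeled only on the logged values (object_len, hdr_sz, content_length), not on the actual storeEntryValidLength() source; the function name and the returned strings are made up for illustration.

```python
def check_valid_length(object_len, hdr_sz, content_length):
    """Mimic the length check implied by the storeEntryValidLength log lines.

    Returns (cachable, reason). Assumption: a negative content_length means
    "unspecified" and is accepted, as the third log excerpt suggests.
    """
    if content_length < 0:
        return True, "unspecified content length"
    body_len = object_len - hdr_sz       # bytes stored after the headers
    diff = content_length - body_len     # positive: body shorter than advertised
    if diff == 0:
        return True, "length matches"
    if diff > 0:
        return False, f"{diff} bytes too small"
    return False, f"{-diff} bytes too big"


print(check_valid_length(20366, 360, 20006))  # same-length replacement (asdf to fdsa)
print(check_valid_length(14843, 360, 20006))  # shorter replacement (asdf to a)
print(check_valid_length(16190, 360, -1))     # content length reset to -1
```

The second call reports the same 5523-byte shortfall as the log: the stored body is 14843 - 360 = 14483 bytes, while the remembered content length is still 20006.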

Likely there's a better place to reset the content length, right?  Perhaps in 
src/adaptation/ecap/XactionRep.cc, in moveAbContent() when we've received the 
full adapted body?

Regards,
-Jon

On Jan 23, 2011, at 8:46 PM, Amos Jeffries wrote:

 On 24/01/11 13:43, Henrik Nordström wrote:
 Sat 2011-01-22 at 23:04 +1300, Amos Jeffries wrote:
 
 Squid caches only one of N variants so the expected behaviour is that
 each new variant is a MISS but becomes a HIT on repeated duplicate
 requests until a new variant pushes it out of cache.
 
 No it caches all N variants seen if the origin response has Vary:
 
 But not sure what happens with the gzip eCAP module in this regard.
 
 AFAIK, that proper variant handling was not yet ported to squid-3. Only in 
 squid-2 right now.
 This identical behaviour is causing some problems with recent Chrome using 
 sdch encoding. Thus clashing with the gzip|deflate cached variant from other 
 browsers.
 
 Though yes the adapter output seems to be borked anyway.
 
 Amos
 -- 
 Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.10
  Beta testers wanted for 3.2.0.4



Re: [squid-users] ecap adapter munging cached body

2011-01-24 Thread Henrik Nordström
Mon 2011-01-24 at 17:46 +1300, Amos Jeffries wrote:

 AFAIK, that proper variant handling was not yet ported to squid-3. Only 
 in squid-2 right now.

Correct, but even the basic variant handling is 1-N. The difference is
that the basic mode does not merge equal responses, and each possible
request variation will cause a new copy in the cache.

 This identical behaviour is causing some problems with recent Chrome 
 using sdch encoding. Thus clashing with the gzip|deflate cached variant 
 from other browsers.

?

Regards
Henrik



Re: [squid-users] ecap adapter munging cached body

2011-01-24 Thread Amos Jeffries
On Mon, 24 Jan 2011 20:57:08 +0100, Henrik Nordström wrote:
 Mon 2011-01-24 at 17:46 +1300, Amos Jeffries wrote:
 
 AFAIK, that proper variant handling was not yet ported to squid-3. Only
 in squid-2 right now.
 
 Correct, but even the basic variant handling is 1-N. The difference is
 that the basic mode does not merge equal responses, and each possible
 request variation will cause a new copy in the cache.
 
 This identical behaviour is causing some problems with recent Chrome 
 using sdch encoding. Thus clashing with the gzip|deflate cached variant
 from other browsers.
 
 ?

http://www.mail-archive.com/squid-users@squid-cache.org/msg76359.html

Amos



Re: [squid-users] ecap adapter munging cached body

2011-01-23 Thread Jonathan Wolfe
 Vary in Squid is currently treated as an exact-match text key. So when asked
 for a gzip,deflate variant Squid does not have enough smarts to serve the
 deflate variant. So it MISSes and gets a fresh one, which may or may not
 be gzipped, but is served gzipped to the client anyway.

Right on, that makes sense.  I was really trying to test gzip module
vs no zipping, for clients that don't support any zipping.

 When passing the second request through squid twice in a row does the reply
 change from a MISS to a HIT? or stay a MISS?

The second request stays a MISS.

 Squid caches only one of N variants so the expected behaviour is that each
 new variant is a MISS but becomes a HIT on repeated duplicate requests until
 a new variant pushes it out of cache.

Ah, well that would sort of explain it, except I don't get a
subsequent cache HIT when requesting a zipped version.

So, to test all this out, I have the webserver returning either:
a) a full HTML page (57580 bytes) when no Accept-Encoding header is present
b) some alternate content (the Accept-Encoding header echoed back 5000
times) when Accept-Encoding is present, such that the response is a
different size and dependent on the Accept-Encoding header.

Then, I issue the same request headers, just modifying the
Accept-Encoding header value (or excluding that header altogether).

I'm using the values of asdf for a bogus Accept-Encoding value that
shouldn't trigger gzipping, and gzip for when I actually want to
invoke the module.  To be clear, the webserver isn't zipping at all.
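
A rough sketch of how this request sequence could be scripted in Python, assuming the proxy answers on port 80 of the placeholder host used in this thread. HOST, PATH, PLAN and the helper names are illustrative and not part of the original test, which used curl. The one detail worth noting is skip_accept_encoding=True: without it, http.client adds its own "Accept-Encoding: identity" header, so the "no Accept-Encoding" case would not really be tested.

```python
import http.client

HOST = "www.example.com"   # placeholder host used throughout this thread
PATH = "/squid-test"

# (Accept-Encoding value, or None for "header absent", number of requests)
PLAN = [(None, 2), ("asdf", 2), (None, 1), ("gzip", 2)]


def cache_status(x_cache):
    """Reduce an X-Cache value such as 'HIT from www.example.com' to 'HIT'."""
    parts = (x_cache or "").split()
    return parts[0] if parts else "UNKNOWN"


def fetch_x_cache(host, path, accept_encoding=None):
    """Issue one GET and return the X-Cache response header (or None)."""
    conn = http.client.HTTPConnection(host, 80, timeout=10)
    try:
        # skip_accept_encoding=True keeps http.client from adding its default
        # "Accept-Encoding: identity", so the header can be truly absent.
        conn.putrequest("GET", path, skip_accept_encoding=True)
        conn.putheader("Accept", "*/*")
        if accept_encoding is not None:
            conn.putheader("Accept-Encoding", accept_encoding)
        conn.endheaders()
        response = conn.getresponse()
        response.read()  # drain the body before closing
        return response.getheader("X-Cache")
    finally:
        conn.close()


def run_plan(host=HOST, path=PATH):
    """Replay the request sequence and print HIT/MISS for each request."""
    for accept_encoding, count in PLAN:
        for attempt in range(1, count + 1):
            status = cache_status(fetch_x_cache(host, path, accept_encoding))
            print(f"Accept-Encoding={accept_encoding!r} request {attempt}: {status}")
```

Calling run_plan() against a real proxy replays the seven requests in order; nothing is sent until it is called.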

The request headers before optionally adding an Accept-Encoding header are:

GET /squid-test HTTP/1.1
User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7
OpenSSL/0.9.8l zlib/1.2.3
Accept: */*
Host: www.example.com

Here are the response headers, all requests issued serially in the
order listed here:

1. Empty Accept-Encoding header - two requests in a row, expect MISS
then HIT, full content, not zipped.

HTTP/1.0 200 OK
Cache-Control: max-age=600
Expires: Sun, 23 Jan 2011 21:00:19 GMT
Vary: Accept-Encoding
Mime-Version: 1.0
Date: Sun, 23 Jan 2011 20:50:19 GMT
Server: AOLserver/4.5.1
Content-Type: text/html; charset=utf-8
Content-Length: 57580
X-Cache: MISS from www.example.com
X-Cache-Lookup: MISS from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: keep-alive

HTTP/1.0 200 OK
Cache-Control: max-age=600
Expires: Sun, 23 Jan 2011 21:00:19 GMT
Vary: Accept-Encoding
Mime-Version: 1.0
Date: Sun, 23 Jan 2011 20:50:19 GMT
Server: AOLserver/4.5.1
Content-Type: text/html; charset=utf-8
Content-Length: 57580
Age: 2
X-Cache: HIT from www.example.com
X-Cache-Lookup: HIT from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: keep-alive

2. Accept-Encoding: asdf header - two requests in a row, expect MISS
then HIT, alternate content, not zipped.

HTTP/1.0 200 OK
Cache-Control: max-age=600
Expires: Sun, 23 Jan 2011 21:00:35 GMT
Vary: Accept-Encoding
Mime-Version: 1.0
Date: Sun, 23 Jan 2011 20:50:35 GMT
Server: AOLserver/4.5.1
Content-Type: text/html; charset=utf-8
Content-Length: 20006
X-Cache: MISS from www.example.com
X-Cache-Lookup: MISS from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: keep-alive

HTTP/1.0 200 OK
Cache-Control: max-age=600
Expires: Sun, 23 Jan 2011 21:00:35 GMT
Vary: Accept-Encoding
Mime-Version: 1.0
Date: Sun, 23 Jan 2011 20:50:35 GMT
Server: AOLserver/4.5.1
Content-Type: text/html; charset=utf-8
Content-Length: 20006
Age: 2
X-Cache: HIT from www.example.com
X-Cache-Lookup: HIT from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: keep-alive

3. Try no Accept-Encoding again - get a HIT, same full content from (1).

HTTP/1.0 200 OK
Cache-Control: max-age=600
Expires: Sun, 23 Jan 2011 21:00:19 GMT
Vary: Accept-Encoding
Mime-Version: 1.0
Date: Sun, 23 Jan 2011 20:50:19 GMT
Server: AOLserver/4.5.1
Content-Type: text/html; charset=utf-8
Content-Length: 57580
Age: 22
X-Cache: HIT from www.example.com
X-Cache-Lookup: HIT from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: keep-alive

4. Now try Accept-Encoding: gzip.  Two requests in a row, expect MISS
then HIT, get MISSes every time.
(I included the squid access log rows here to see the small zipped
content length - 660 bytes.)

HTTP/1.0 200 OK
Cache-Control: max-age=600
Expires: Sun, 23 Jan 2011 21:00:49 GMT
Vary: Accept-Encoding
Mime-Version: 1.0
Date: Sun, 23 Jan 2011 20:50:49 GMT
Server: AOLserver/4.5.1
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
X-Cache: MISS from www.example.com
X-Cache-Lookup: MISS from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: close

[23/Jan/2011:15:50:49 -0500] GET http://www.example.com/squid-test
HTTP/1.1 200 660 345 ms - curl/7.19.7
(universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
TCP_MISS:ROUNDROBIN_PARENT

HTTP/1.0 200 OK
Cache-Control: max-age=600
Expires: Sun, 23 Jan 2011 21:00:50 GMT
Vary: 

Re: [squid-users] ecap adapter munging cached body

2011-01-23 Thread Henrik Nordström
Sat 2011-01-22 at 23:04 +1300, Amos Jeffries wrote:

 Squid caches only one of N variants so the expected behaviour is that 
 each new variant is a MISS but becomes a HIT on repeated duplicate 
 requests until a new variant pushes it out of cache.

No it caches all N variants seen if the origin response has Vary:

But not sure what happens with the gzip eCAP module in this regard.

Regards
Henrik



Re: [squid-users] ecap adapter munging cached body

2011-01-23 Thread Henrik Nordström
Sun 2011-01-23 at 14:14 -0800, Jonathan Wolfe wrote:

 I'm using the values of asdf for a bogus Accept-Encoding value that
 shouldn't trigger gzipping, and gzip for when I actually want to
 invoke the module.  To be clear, the webserver isn't zipping at all.

Is the web server responding with Vary: Accept-Encoding?

 I can change the behavior of the webserver to not include Vary:
 Accept-Encoding for content meant to be cached by squid, but that
 results in responses of the cached (unzipped) version even for clients
 who accept zipped versions, once the cache is populated by a client
 not requesting a zipped version, and that defeats the point of the
 gzip module for me because I want to gzip cached content for clients
 that support it.

Sounds like the gzip eCAP module handles things in a bad manner. It
should add Vary, and its responses should be cacheable if the original
response is. Seems it does neither..

Regards
Henrik



Re: [squid-users] ecap adapter munging cached body

2011-01-23 Thread Jonathan Wolfe
In my test, yes, the web server was responding with Vary:
Accept-Encoding.  But that's only because of the behavior below, where
once a non-gzipped version is cached (i.e. a request comes in first
with no Accept-Encoding header at all) all subsequent requests get the
unzipped version, even if presenting gzip in the Accept-Encoding
header.

The eCAP module does add Vary: Accept-Encoding, actually.  Running the
same test without the webserver setting Vary results in the same
behavior, though - zipped response via the gzip module doesn't cache
(two MISSes in a row), and then once a nonzipped version enters the
cache, that nonzipped cached version gets served up on every request
for any incoming Accept-Encoding.

The module does not seem to touch Cache-Control or Expires headers at
all - they come through in the uncached gzipped responses just fine
(if requesting gzip encoding before anything else is cached).  Are
there headers besides Vary that the module needs to add or change to
ensure that the response can be cached?

-Jon

2011/1/23 Henrik Nordström hen...@henriknordstrom.net:
 Sun 2011-01-23 at 14:14 -0800, Jonathan Wolfe wrote:

 I'm using the values of asdf for a bogus Accept-Encoding value that
 shouldn't trigger gzipping, and gzip for when I actually want to
 invoke the module.  To be clear, the webserver isn't zipping at all.

 Is the web server responding with Vary: Accept-Encoding?

 I can change the behavior of the webserver to not include Vary:
 Accept-Encoding for content meant to be cached by squid, but that
 results in responses of the cached (unzipped) version even for clients
 who accept zipped versions, once the cache is populated by a client
 not requesting a zipped version, and that defeats the point of the
 gzip module for me because I want to gzip cached content for clients
 that support it.

 Sounds like the gzip eCAP module handles things in a bad manner. It
 should add Vary, and its responses should be cacheable if the original
 response is. Seems it does neither..

 Regards
 Henrik




Re: [squid-users] ecap adapter munging cached body

2011-01-23 Thread Amos Jeffries

On 24/01/11 13:43, Henrik Nordström wrote:

Sat 2011-01-22 at 23:04 +1300, Amos Jeffries wrote:


Squid caches only one of N variants so the expected behaviour is that
each new variant is a MISS but becomes a HIT on repeated duplicate
requests until a new variant pushes it out of cache.


No it caches all N variants seen if the origin response has Vary:

But not sure what happens with the gzip eCAP module in this regard.


AFAIK, that proper variant handling was not yet ported to squid-3. Only 
in squid-2 right now.
This identical behaviour is causing some problems with recent Chrome 
using sdch encoding. Thus clashing with the gzip|deflate cached variant 
from other browsers.


Though yes the adapter output seems to be borked anyway.

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.10
  Beta testers wanted for 3.2.0.4


Re: [squid-users] ecap adapter munging cached body

2011-01-22 Thread Amos Jeffries

On 22/01/11 19:22, Jonathan Wolfe wrote:

First, request and reply headers for a cached version, as gzip isn't included 
in the Accept-Encoding header.

Request:

GET /styles/media.css HTTP/1.1
User-Agent: httperf/0.9.0
Host: www.example.com
Accept-Encoding: deflate

Reply:

HTTP/1.0 200 OK
Cache-Control: max-age=86400
Expires: Sun, 23 Jan 2011 06:06:04 GMT
Vary: Accept-Encoding
Last-Modified: Thu, 30 Jul 2009 11:30:14 GMT
Mime-Version: 1.0
Date: Sat, 22 Jan 2011 06:06:04 GMT
Server: AOLserver/4.5.1
Content-Type: text/css; charset=utf-8
Content-Length: 2654
Age: 1
X-Cache: HIT from www.example.com
X-Cache-Lookup: HIT from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: keep-alive


Now, request and reply headers when accepting gzip.

Request:

GET /styles/media.css HTTP/1.1
User-Agent: httperf/0.9.0
Host: www.example.com
Accept-Encoding: gzip,deflate

Reply:

HTTP/1.0 200 OK
Cache-Control: max-age=86400
Expires: Sun, 23 Jan 2011 06:06:08 GMT
Vary: Accept-Encoding
Last-Modified: Thu, 30 Jul 2009 11:30:14 GMT
Mime-Version: 1.0
Date: Sat, 22 Jan 2011 06:06:08 GMT
Server: AOLserver/4.5.1
Content-Type: text/css; charset=utf-8
Content-Encoding: gzip
X-Cache: MISS from www.example.com
X-Cache-Lookup: MISS from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: keep-alive

I see that the gzipped version doesn't reply with a Content-Length header.


Not nice, but fine. The connection should get closed after the object 
despite the keep-alive. If not that is another bug.




Tried with Firefox for a more standard request, exactly the same response 
headers.

Regards,
Jonathan Wolfe


Ah, sorry. I'm not sure what has happened to my mind these last few 
days. I think that is a perfectly normal and working set of transactions 
with nothing to do with the ecap module.



Vary in Squid is currently treated as an exact-match text key. So when 
asked for a gzip,deflate variant Squid does not have enough smarts to 
serve the deflate variant. So it MISSes and gets a fresh one, which 
may or may not be gzipped, but is served gzipped to the client anyway.
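
A minimal Python sketch of what "exact-match text key" means in practice. The key layout and the variant_key() helper are invented for illustration and are not Squid's actual key format; the point is only that the literal header text takes part in the lookup.

```python
def variant_key(url, vary, request_headers):
    """Build a cache key from the literal values of the headers named in Vary."""
    parts = []
    for name in (h.strip().lower() for h in vary.split(",")):
        value = request_headers.get(name, "")
        parts.append(f"{name}={value}")
    return url + "|" + "&".join(parts)


url = "http://www.example.com/styles/media.css"
cache = {}

# First client asks with "deflate" only; its response is stored under that text.
stored_key = variant_key(url, "Accept-Encoding", {"accept-encoding": "deflate"})
cache[stored_key] = "uncompressed body"

# Second client asks with "gzip,deflate"; the text differs, so the key differs.
lookup_key = variant_key(url, "Accept-Encoding", {"accept-encoding": "gzip,deflate"})
print(lookup_key in cache)  # False: treated as a different variant, hence a MISS
```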



When passing the second request through squid twice in a row does the 
reply change from a MISS to a HIT? or stay a MISS?


Squid caches only one of N variants so the expected behaviour is that 
each new variant is a MISS but becomes a HIT on repeated duplicate 
requests until a new variant pushes it out of cache.



I think one of three things is needed to make this work nicely:
  (1) request modification to normalize and add gzip to the 
Accept-Encoding header prior to the cache variant lookup
  (2) de-zipping for clients who can't accept gzip
  (3) zipping for clients who can but are sent an un-zipped version

I gather that (3) is occurring. But some variant has entered your cache 
due to (1) not being done.



NOTE: patches very welcome to make Squid treat deflate as a subset of 
gzip,deflate when finding variants.
  It's a bit trickier than it sounds though. One will have to alter the 
variant key syntax to store the variant with the Content-Encoding type 
instead of the full Accept-Encoding header. Then check for all options 
in the client Accept-Encoding header when looking up the hash.
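
The idea in this note can be sketched in Python as follows. This is an illustration under simplifying assumptions, not a patch: the store layout (a dict keyed by URL and Content-Encoding) and both helper names are hypothetical, q-values other than 0 are ignored, and unencoded ("identity") responses are not handled.

```python
def accepted_codings(accept_encoding):
    """List the content codings a client accepts, in header order.

    Entries with q=0 are dropped; other q-values are ignored (a simplification).
    """
    codings = []
    for item in accept_encoding.split(","):
        token, _, params = item.partition(";")
        token = token.strip().lower()
        if not token:
            continue
        qvalue = 1.0
        param = params.replace(" ", "").lower()
        if param.startswith("q="):
            try:
                qvalue = float(param[2:])
            except ValueError:
                qvalue = 1.0  # unparsable q-value: treat the coding as accepted
        if qvalue > 0:
            codings.append(token)
    return codings


def find_variant(store, url, accept_encoding):
    """Return (coding, body) for the first stored variant the client accepts, else None."""
    for coding in accepted_codings(accept_encoding):
        if (url, coding) in store:
            return coding, store[(url, coding)]
    return None


url = "http://www.example.com/styles/media.css"
# Variant stored under its Content-Encoding, not under the request header text.
store = {(url, "deflate"): b"<deflated body>"}

print(find_variant(store, url, "deflate"))       # finds the deflate variant
print(find_variant(store, url, "gzip,deflate"))  # same variant: deflate is acceptable
print(find_variant(store, url, "gzip"))          # None: no stored coding is acceptable
```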


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.10
  Beta testers wanted for 3.2.0.4


Re: [squid-users] ecap adapter munging cached body

2011-01-21 Thread Amos Jeffries

On 22/01/11 11:43, Jonathan Wolfe wrote:

With squid (3.1.8 - .10) in reverse-proxy mode running an eCAP
adapter (gzip), does squid still pull the body to be gzipped from its
cache?  I'm setting Vary: Accept-Encoding, and seeing HITs when
Accept-Encoding doesn't include gzip, but only MISSes when the
adapter is gzipping.  Is this by design with a respmod_precache
adapter, or can I gzip content that's already cached?

Regards, Jonathan Wolfe


NP: please configure your mailer to wrap lines at under 80 characters.
The web archives are *very* difficult to read by scrolling sideways for 
a mile.


That sounds like the opposite of good behaviour. Can you produce full 
request and reply headers flowing between the client and Squid for your 
test transactions?



In the general background info:

eCAP gets pushed data from one of the in-memory data streams. 
Technically it does not matter which one pre or post gets zipped. However...


When working correctly the adapter should update the Content-Encoding 
and ETag headers when it changes the content. Due to the ETag 
alterations required I would expect it to work best on pre-cache. So 
that the validation IMS requests work.


Running it post-cache Squid would always report ETag X from the client 
does not match the ETag Y in cache and send out a full new copy 
(gzipping down again to X ... oops).
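
A small Python sketch of the header updates described above, written as a plain function rather than against the eCAP API. The "-gzip" ETag suffix is a hypothetical convention chosen for the example, and the Content-Length is recomputed only because this sketch buffers the whole body; a streaming adapter would not know the compressed length up front.

```python
import gzip


def gzip_adapt(headers, body):
    """Return (new_headers, new_body) for a gzip-compressed copy of a response.

    headers is a dict of header name to value; body is bytes.
    """
    new_body = gzip.compress(body)
    new_headers = dict(headers)
    new_headers["Content-Encoding"] = "gzip"
    new_headers["Content-Length"] = str(len(new_body))

    etag = headers.get("ETag")
    if etag:
        # Hypothetical convention: give the compressed variant a distinct ETag.
        if etag.endswith('"'):
            new_headers["ETag"] = etag[:-1] + '-gzip"'
        else:
            new_headers["ETag"] = etag + "-gzip"

    # Make sure caches know the response depends on Accept-Encoding.
    vary = [v.strip() for v in headers.get("Vary", "").split(",") if v.strip()]
    if "accept-encoding" not in (v.lower() for v in vary):
        vary.append("Accept-Encoding")
    new_headers["Vary"] = ", ".join(vary)
    return new_headers, new_body


original = {"ETag": '"abc"', "Content-Length": "8", "Content-Type": "text/css"}
adapted_headers, adapted_body = gzip_adapt(original, b"asdfasdf")
print(adapted_headers["ETag"], adapted_headers["Content-Encoding"])
```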


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.10
  Beta testers wanted for 3.2.0.4


Re: [squid-users] ecap adapter munging cached body

2011-01-21 Thread Jonathan Wolfe
First, request and reply headers for a cached version, as gzip isn't included 
in the Accept-Encoding header.

Request:

GET /styles/media.css HTTP/1.1
User-Agent: httperf/0.9.0
Host: www.example.com
Accept-Encoding: deflate

Reply:

HTTP/1.0 200 OK
Cache-Control: max-age=86400
Expires: Sun, 23 Jan 2011 06:06:04 GMT
Vary: Accept-Encoding
Last-Modified: Thu, 30 Jul 2009 11:30:14 GMT
Mime-Version: 1.0
Date: Sat, 22 Jan 2011 06:06:04 GMT
Server: AOLserver/4.5.1
Content-Type: text/css; charset=utf-8
Content-Length: 2654
Age: 1
X-Cache: HIT from www.example.com
X-Cache-Lookup: HIT from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: keep-alive


Now, request and reply headers when accepting gzip.

Request:

GET /styles/media.css HTTP/1.1
User-Agent: httperf/0.9.0
Host: www.example.com
Accept-Encoding: gzip,deflate

Reply:

HTTP/1.0 200 OK
Cache-Control: max-age=86400
Expires: Sun, 23 Jan 2011 06:06:08 GMT
Vary: Accept-Encoding
Last-Modified: Thu, 30 Jul 2009 11:30:14 GMT
Mime-Version: 1.0
Date: Sat, 22 Jan 2011 06:06:08 GMT
Server: AOLserver/4.5.1
Content-Type: text/css; charset=utf-8
Content-Encoding: gzip
X-Cache: MISS from www.example.com
X-Cache-Lookup: MISS from www.example.com:80
Via: 1.0 www.example.com (squid/3.1.10)
Connection: keep-alive

I see that the gzipped version doesn't reply with a Content-Length header.

Tried with Firefox for a more standard request, exactly the same response 
headers.

Regards,
Jonathan Wolfe

On Jan 21, 2011, at 9:00 PM, squid-users-digest-h...@squid-cache.org wrote:

 That sounds like the opposite of good behaviour. Can you produce full request 
 and reply headers flowing between the client and Squid for your test 
 transactions?