Re: [squid-users] Squid transparent not caching apt requests from deb.debian.org

2020-04-04 Thread Alex Rousskov
On 4/3/20 4:55 PM, zrm wrote:
> On 4/3/20 16:34, Alex Rousskov wrote:
>> On 4/3/20 4:26 PM, zrm wrote:
>>> In the first case we get TCP_MISS every time because it isn't caching
>>> the data, in the second case it's only the first time and after that we
>>> get TCP_REFRESH_UNMODIFIED. But how and why is this happening?

> squid to apt:
> -
> X-Cache: MISS from tproxy
> X-Cache-Lookup: MISS from tproxy:3130

> squid to wget:
> -
> X-Cache: MISS from tproxy
> X-Cache-Lookup: MISS from tproxy:3130

The headers you have posted tell us that the object was not in Squid
cache when apt and wget transactions started. Since wget was started
after apt, we can speculate that apt transaction did not cache the
object. This is consistent with your observations. I saw nothing in the
posted headers that would explain the difference between apt and wget
outcomes.

Unfortunately, you did not show the headers for the case where the
object actually got cached by Squid. You can probably do that by
repeating the wget transaction twice. I would also repeat the apt
transaction twice after that. It would also be interesting to see the
wget-apt and apt-wget sequences. In summary, I would do
wget-wget-apt-apt-wget-wget. Sleep for 10+ seconds between each
transaction to reduce chances of overlapping cache operations.

BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
out what happens based on Squid messages if you submit one transaction
at a time and disclose transaction sequence. You can just post (a link
to) raw (or sanitized) logs. Compress them if they are too big.

Before sharing the logs, please double check that the problem you want
to address was reproduced during the test.


> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
> ETag: "1389dc-58b605823fa6e"
> Cache-Control: public, max-age=2592000
> Content-Length: 1280476
> Age: 4248100

FWIW: The object is 4'248'100 seconds old. The object max-age is
2'592'000 seconds. Your Squid is probably using an internal max-age of
259'200 seconds, so Squid will require cache hit revalidation during
subsequent transactions after Squid caches the object (if it caches it).


HTH,

Alex.


> apt to squid:
> -
> GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
> Host: deb.debian.org
> User-Agent: Debian APT-HTTP/1.3 (1.8.2)
> 
> 
> --
> squid to deb.debian.org:
> -
> GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
> User-Agent: Debian APT-HTTP/1.3 (1.8.2)
> Host: deb.debian.org
> Via: 1.1 tproxy (squid/4.6)
> X-Forwarded-For: 192.168.111.55
> Cache-Control: max-age=259200
> Connection: keep-alive
> 
> 
> --
> deb.debian.org to squid:
> -
> HTTP/1.1 200 OK
> Server: Apache
> X-Content-Type-Options: nosniff
> X-Frame-Options: sameorigin
> Referrer-Policy: no-referrer
> X-Xss-Protection: 1
> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
> ETag: "1389dc-58b605823fa6e"
> X-Clacks-Overhead: GNU Terry Pratchett
> Cache-Control: public, max-age=2592000
> Content-Type: application/x-debian-package
> Via: 1.1 varnish
> Content-Length: 1280476
> Accept-Ranges: bytes
> Date: Fri, 03 Apr 2020 05:28:46 GMT
> Via: 1.1 varnish
> Age: 4248100
> Connection: keep-alive
> X-Served-By: cache-ams21028-AMS, cache-wdc5559-WDC
> X-Cache: HIT, HIT
> X-Cache-Hits: 1, 2
> X-Timer: S1585891726.434375,VS0,VE0
> 
> --
> squid to apt:
> -
> HTTP/1.1 200 OK
> Server: Apache
> X-Content-Type-Options: nosniff
> X-Frame-Options: sameorigin
> Referrer-Policy: no-referrer
> X-Xss-Protection: 1
> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
> ETag: "1389dc-58b605823fa6e"
> X-Clacks-Overhead: GNU Terry Pratchett
> Cache-Control: public, max-age=2592000
> Content-Type: application/x-debian-package
> Content-Length: 1280476
> Accept-Ranges: bytes
> Date: Fri, 03 Apr 2020 05:28:46 GMT
> Age: 4248100
> X-Served-By: cache-ams21028-AMS, cache-wdc5559-WDC
> X-Cache: HIT, HIT
> X-Cache-Hits: 1, 2
> X-Timer: S1585891726.434375,VS0,VE0
> X-Cache: MISS from tproxy
> X-Cache-Lookup: MISS from tproxy:3130
> Via: 1.1 varnish, 1.1 varnish, 1.1 tproxy (squid/4.6)
> Connection: keep-alive
> 
> 
> --
> 
> 
> 
> wget to squid:
> -
> GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
> User-Agent: Wget/1.20.1 (linux-gnu)
> Accept: */*
> Accept-Encoding: identity
> Host: deb.debian.org
> Connection: Keep-Alive
> 
> 
> --
> squid to deb.debian.org:
> --
> GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
> User-Agent: Wget/1.20.1 (linux-gnu)
> Accept: */*
> Accept-Encoding: identity
> Host: deb.debian.org
> Via: 1.1 tproxy (squid/4.6)
> X-Forwarded-For: 192.168.111.55
> Cache-Control: max-age=259200
> Connection: keep-alive
> 
> 
> --
> deb.debian.org to squid:
> -
> HTTP/1.1 200 OK
> Server: Apache
> X-Content-Type-Options: nosniff
> X-Frame-Options: sameorigin
> Referrer-Policy: no-referrer
> X-Xss-Protection: 1
> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
> ETag: 

Re: [squid-users] Squid transparent not caching apt requests from deb.debian.org

2020-04-04 Thread Amos Jeffries

On 5/04/20 2:53 am, Alex Rousskov wrote:
> On 4/3/20 4:55 PM, zrm wrote:
>> On 4/3/20 16:34, Alex Rousskov wrote:
>>> On 4/3/20 4:26 PM, zrm wrote:
 In the first case we get TCP_MISS every time because it isn't caching
 the data, in the second case it's only the first time and after that we
 get TCP_REFRESH_UNMODIFIED. But how and why is this happening?
> 
>> squid to apt:
>> -
>> X-Cache: MISS from tproxy
>> X-Cache-Lookup: MISS from tproxy:3130
> 
>> squid to wget:
>> -
>> X-Cache: MISS from tproxy
>> X-Cache-Lookup: MISS from tproxy:3130
> 
> The headers you have posted tell us that the object was not in Squid
> cache when apt and wget transactions started. Since wget was started
> after apt, we can speculate that apt transaction did not cache the
> object. This is consistent with your observations. I saw nothing in the
> posted headers that would explain the difference between apt and wget
> outcomes.
> 
> Unfortunately, you did not show the headers for the case where the
> object actually got cached by Squid. You can probably do that by
> repeating the wget transaction twice. I would also repeat the apt
> transaction twice after that. It would also be interesting to see the
> wget-apt and apt-wget sequences. In summary, I would do
> wget-wget-apt-apt-wget-wget. Sleep for 10+ seconds between each
> transaction to reduce chances of overlapping cache operations.
> 
> BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
> out what happens based on Squid messages if you submit one transaction
> at a time and disclose transaction sequence. You can just post (a link
> to) raw (or sanitized) logs. Compress them if they are too big.
> 
> Before sharing the logs, please double check that the problem you want
> to address was reproduced during the test.
> 
> 
>> Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
>> ETag: "1389dc-58b605823fa6e"
>> Cache-Control: public, max-age=2592000
>> Content-Length: 1280476
>> Age: 4248100
> 
> FWIW: The object is 4'248'100 seconds old. The object max-age is
> 2'592'000 seconds. Your Squid is probably using an internal max-age of
> 259'200 seconds, so Squid will require cache hit revalidation during
> subsequent transactions after Squid caches the object (if it caches it).
> 
> 

One thing to notice as well is that the object delivered by the upstream
caches expired over 19 days before these tests took place:

> Cache-Control: public, max-age=2592000
> Age: 4248100

The request from Squid in both transactions was to receive content no
longer than 3 days old. The other caches ignored that requirement and
served old content from their storage, apparently without even checking
an origin whether it was stale.

Amos
___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Squid transparent not caching apt requests from deb.debian.org

2020-04-04 Thread zrm

On 4/4/20 10:53, Alex Rousskov wrote:

The headers you have posted tell us that the object was not in Squid
cache when apt and wget transactions started. Since wget was started
after apt, we can speculate that apt transaction did not cache the
object. This is consistent with your observations. I saw nothing in the
posted headers that would explain the difference between apt and wget
outcomes.

Unfortunately, you did not show the headers for the case where the
object actually got cached by Squid. You can probably do that by
repeating the wget transaction twice. I would also repeat the apt
transaction twice after that. It would also be interesting to see the
wget-apt and apt-wget sequences. In summary, I would do
wget-wget-apt-apt-wget-wget. Sleep for 10+ seconds between each
transaction to reduce chances of overlapping cache operations.


Attached cache.log excerpt for wget-wget-apt-apt-wget-wget. It answers 
the apt requests from the cache once it's in there, it just won't cache 
it to begin with when apt makes the request:


[wget] 1586041686.600725 192.168.111.55 TCP_MISS/200 1281195 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[wget] 1586041710.518107 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281232 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[apt] 1586041733.058 69 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281234 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/151.101.200.204 application/x-debian-package
[apt] 1586041753.971101 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281234 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/151.101.200.204 application/x-debian-package
[wget] 1586041769.162160 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281232 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[wget] 1586041786.916 71 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281232 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/151.101.250.133 application/x-debian-package



BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
out what happens based on Squid messages if you submit one transaction
at a time and disclose transaction sequence. You can just post (a link
to) raw (or sanitized) logs. Compress them if they are too big.



Before sharing the logs, please double check that the problem you want
to address was reproduced during the test.


In this case we start with wget and then it is already in the cache for 
the requests made by apt. The problem is the data not being cached when 
apt makes the request and it isn't already there. The apt requests do 
get answered from the cache if it is already there.


The headers from the previous email show what happens when apt makes the 
request and it's not already in the cache.



Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
Cache-Control: public, max-age=2592000
Content-Length: 1280476
Age: 4248100


FWIW: The object is 4'248'100 seconds old. The object max-age is
2'592'000 seconds. Your Squid is probably using an internal max-age of
259'200 seconds, so Squid will require cache hit revalidation during
subsequent transactions after Squid caches the object (if it caches it).


That makes sense. These packages never really change at all -- the 
package version is part of the URI so if it's updated the package URI 
changes rather than the data at the old URI. I might set a longer max 
age in the config once this is worked out.



HTH,

Alex.




Thanks.
2020/04/04 19:08:05.874 kid1| 5,2| TcpAcceptor.cc(224) doAccept: New connection on FD 15
2020/04/04 19:08:05.874 kid1| 5,2| TcpAcceptor.cc(317) acceptNext: connection on local=[::]:3128 remote=[::] FD 15 flags=41
2020/04/04 19:08:05.874 kid1| 11,2| client_side.cc(1319) parseHttpRequest: HTTP Client local=199.232.66.133:80 remote=192.168.111.55:47028 FD 11 flags=33
2020/04/04 19:08:05.874 kid1| 11,2| client_side.cc(1323) parseHttpRequest: HTTP Client REQUEST:
-
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
User-Agent: Wget/1.20.1 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: deb.debian.org
Connection: Keep-Alive


--
2020/04/04 19:08:05.875 kid1| 85,2| client_side_request.cc(752) clientAccessCheckDone: The request GET http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb is ALLOWED; last ACL checked: localnet
2020/04/04 19:08:05.875 kid1| 85,2| client_side_request.cc(728) clientAccessCheck2: No adapted_http_access configuration. default: ALLOW
2020/04/04 19:08:05.875 kid1| 85,2| client_side_request.cc(752) clientAccessCheckDone: The request GET http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-