Re: [squid-users] Squid transparent not caching apt requests from deb.debian.org

2020-04-10 Thread zrm

On 4/8/20 13:13, Matus UHLAR - fantomas wrote:

On 08.04.20 13:01, zrm wrote:
I checked the DNS query apt is making to see why it's different. It's 
making a SRV query for _http._tcp.deb.debian.org and then using the IP 
address of the name (prod.debian.map.fastly.net) returned in the SRV 
query. By contrast, squid does the A record query for deb.debian.org 
and gets a CNAME for debian.map.fastly.net. Almost the same, but since 
it's a CDN with many IP addresses, enough different that they happen 
to not both return the same address and then validation fails.


Meanwhile wget does the same A record query as squid and gets the same 
address.


The question then becomes what to do about it. Maybe if squid fails 
the validation for the A query, it should try the SRV query and accept 
the address as valid if it matches that. Another possibility would be 
a config option to have squid completely ignore the address the client 
used and always use the address it gets by doing its own DNS query for 
the host, in which case the result would be safe to cache.


But these are obviously changes requiring a new version of squid. Is 
there any way to make it work without that?


I'd contact debian.org DNS masters. I believe CDN wasn't designedto 
cause this

kind of issues.



It seems to be like this on purpose:

https://deb.debian.org/

I'm not sure there is actually anything wrong with their DNS 
configuration, it's not required for the SRV record to point to the same 
name/address as the A record (which would in fact defeat the point of 
the SRV record).


It *might* be a bug in apt, since if the SRV record says to use 
prod.debian.map.fastly.net then maybe it ought to be using that as 
"Host: " in the HTTP headers, but I'm not sure about that. If it's 
supposed to use the original domain in the HTTP headers when using a SRV 
record then this could still be a failing of squid for not checking and 
considering valid the address from the HTTP SRV record.


I'll send them an email and see what they have to say about it.
___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Squid transparent not caching apt requests from deb.debian.org

2020-04-08 Thread zrm

On 4/8/20 10:46, Alex Rousskov wrote:

On 4/7/20 8:48 PM, zrm wrote:


https://www.trustiosity.com/squid/cache-debug.log.xz


I found the reason for the difference.

After the destination IP address of your apt requests fails Host header
validation, Squid marks the request as "not cachable":


I checked the DNS query apt is making to see why it's different. It's 
making a SRV query for _http._tcp.deb.debian.org and then using the IP 
address of the name (prod.debian.map.fastly.net) returned in the SRV 
query. By contrast, squid does the A record query for deb.debian.org and 
gets a CNAME for debian.map.fastly.net. Almost the same, but since it's 
a CDN with many IP addresses, enough different that they happen to not 
both return the same address and then validation fails.


Meanwhile wget does the same A record query as squid and gets the same 
address.


The question then becomes what to do about it. Maybe if squid fails the 
validation for the A query, it should try the SRV query and accept the 
address as valid if it matches that. Another possibility would be a 
config option to have squid completely ignore the address the client 
used and always use the address it gets by doing its own DNS query for 
the host, in which case the result would be safe to cache.


But these are obviously changes requiring a new version of squid. Is 
there any way to make it work without that?

___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Squid transparent not caching apt requests from deb.debian.org

2020-04-07 Thread zrm

cache.log for apt-apt-wget-wget with squid -k debug:

https://www.trustiosity.com/squid/cache-debug.log.xz

On 4/6/20 11:49, Alex Rousskov wrote:

On 4/4/20 8:02 PM, zrm wrote:

Attached cache.log excerpt for wget-wget-apt-apt-wget-wget. It answers
the apt requests from the cache once it's in there, it just won't cache
it to begin with when apt makes the request


Thank you for sharing this log. I agree with your conclusion. The apt
query results in cache revalidation and does not purge the already
cached copy. This conclusion eliminates a few suspects.

There is probably something special about the combination of an apt
request and a 200 OK miss response that prevents Squid from caching that
response. I do not see anything wrong in the logs you have already
already posted. Perhaps others will spot something.

If you get no better responses, please post a link to a compressed
apt-apt-wget-wget log, starting from a cache that does not contain the
response in question and after enabling elevated Squid debugging with
"squid -k debug" or similar. You can find more instructions about
debugging individual transactions at
https://wiki.squid-cache.org/SquidFaq/BugReporting#Debugging_a_single_transaction

A detailed apt-apt-wget-wget log should tell us why Squid is refusing to
cache a 200 OK response to the apt query but caches a very similar
response to a very similar wget query.


Thank you,

Alex.




[wget] 1586041686.600    725 192.168.111.55 TCP_MISS/200 1281195 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[wget] 1586041710.518    107 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281232 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[apt] 1586041733.058 69 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281234 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/151.101.200.204 application/x-debian-package
[apt] 1586041753.971    101 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281234 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/151.101.200.204 application/x-debian-package
[wget] 1586041769.162    160 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281232 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[wget] 1586041786.916 71 192.168.111.55 TCP_REFRESH_UNMODIFIED/200
1281232 GET
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb -
ORIGINAL_DST/151.101.250.133 application/x-debian-package


BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
out what happens based on Squid messages if you submit one transaction
at a time and disclose transaction sequence. You can just post (a link
to) raw (or sanitized) logs. Compress them if they are too big.



Before sharing the logs, please double check that the problem you want
to address was reproduced during the test.


In this case we start with wget and then it is already in the cache for
the requests made by apt. The problem is the data not being cached when
apt makes the request and it isn't already there. The apt requests do
get answered from the cache if it is already there.

The headers from the previous email show what happens when apt makes the
request and it's not already in the cache.


Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
Cache-Control: public, max-age=2592000
Content-Length: 1280476
Age: 4248100


FWIW: The object is 4'248'100 seconds old. The object max-age is
2'592'000 seconds. Your Squid is probably using an internal max-age of
259'200 seconds, so Squid will require cache hit revalidation during
subsequent transactions after Squid caches the object (if it caches it).


That makes sense. These packages never really change at all -- the
package version is part of the URI so if it's updated the package URI
changes rather than the data at the old URI. I might set a longer max
age in the config once this is worked out.


HTH,

Alex.




Thanks.



___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


Re: [squid-users] Squid transparent not caching apt requests from deb.debian.org

2020-04-04 Thread zrm

On 4/4/20 10:53, Alex Rousskov wrote:

The headers you have posted tell us that the object was not in Squid
cache when apt and wget transactions started. Since wget was started
after apt, we can speculate that apt transaction did not cache the
object. This is consistent with your observations. I saw nothing in the
posted headers that would explain the difference between apt and wget
outcomes.

Unfortunately, you did not show the headers for the case where the
object actually got cached by Squid. You can probably do that by
repeating the wget transaction twice. I would also repeat the apt
transaction twice after that. It would also be interesting to see the
wget-apt and apt-wget sequences. In summary, I would do
wget-wget-apt-apt-wget-wget. Sleep for 10+ seconds between each
transaction to reduce chances of overlapping cache operations.


Attached cache.log excerpt for wget-wget-apt-apt-wget-wget. It answers 
the apt requests from the cache once it's in there, it just won't cache 
it to begin with when apt makes the request:


[wget] 1586041686.600725 192.168.111.55 TCP_MISS/200 1281195 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[wget] 1586041710.518107 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281232 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[apt] 1586041733.058 69 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281234 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/151.101.200.204 application/x-debian-package
[apt] 1586041753.971101 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281234 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/151.101.200.204 application/x-debian-package
[wget] 1586041769.162160 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281232 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.66.133 application/x-debian-package
[wget] 1586041786.916 71 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 
1281232 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/151.101.250.133 application/x-debian-package



BTW, you probably do not need to make ALL,2 logs pretty -- we can figure
out what happens based on Squid messages if you submit one transaction
at a time and disclose transaction sequence. You can just post (a link
to) raw (or sanitized) logs. Compress them if they are too big.



Before sharing the logs, please double check that the problem you want
to address was reproduced during the test.


In this case we start with wget and then it is already in the cache for 
the requests made by apt. The problem is the data not being cached when 
apt makes the request and it isn't already there. The apt requests do 
get answered from the cache if it is already there.


The headers from the previous email show what happens when apt makes the 
request and it's not already in the cache.



Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
Cache-Control: public, max-age=2592000
Content-Length: 1280476
Age: 4248100


FWIW: The object is 4'248'100 seconds old. The object max-age is
2'592'000 seconds. Your Squid is probably using an internal max-age of
259'200 seconds, so Squid will require cache hit revalidation during
subsequent transactions after Squid caches the object (if it caches it).


That makes sense. These packages never really change at all -- the 
package version is part of the URI so if it's updated the package URI 
changes rather than the data at the old URI. I might set a longer max 
age in the config once this is worked out.



HTH,

Alex.




Thanks.
2020/04/04 19:08:05.874 kid1| 5,2| TcpAcceptor.cc(224) doAccept: New connection on FD 15
2020/04/04 19:08:05.874 kid1| 5,2| TcpAcceptor.cc(317) acceptNext: connection on local=[::]:3128 remote=[::] FD 15 flags=41
2020/04/04 19:08:05.874 kid1| 11,2| client_side.cc(1319) parseHttpRequest: HTTP Client local=199.232.66.133:80 remote=192.168.111.55:47028 FD 11 flags=33
2020/04/04 19:08:05.874 kid1| 11,2| client_side.cc(1323) parseHttpRequest: HTTP Client REQUEST:
-
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
User-Agent: Wget/1.20.1 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: deb.debian.org
Connection: Keep-Alive


--
2020/04/04 19:08:05.875 kid1| 85,2| client_side_request.cc(752) clientAccessCheckDone: The request GET http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb is ALLOWED; last ACL checked: localnet
2020/04/04 19:08:05.875 kid1| 85,2| client_side_request.cc(728) clientAccessCheck2: No adapted_http_access configuration. default: ALLOW
2020/04/04 19:08:05.875 kid1| 85,2| client_side_request.cc(752) clientAccessCheckDone: The request GET 

Re: [squid-users] Squid transparent not caching apt requests from deb.debian.org

2020-04-03 Thread zrm

On 4/3/20 16:34, Alex Rousskov wrote:

On 4/3/20 4:26 PM, zrm wrote:

In the first case we get TCP_MISS every time because it isn't caching
the data, in the second case it's only the first time and after that we
get TCP_REFRESH_UNMODIFIED. But how and why is this happening?


Those questions can often be answered by looking at HTTP headers
received and sent by Squid when talking to clients and origin servers.
You can see those headers in basic packet captures or if you set
debug_options to ALL,2.

Alex.



apt to squid:
-
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
Host: deb.debian.org
User-Agent: Debian APT-HTTP/1.3 (1.8.2)


--
squid to deb.debian.org:
-
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
User-Agent: Debian APT-HTTP/1.3 (1.8.2)
Host: deb.debian.org
Via: 1.1 tproxy (squid/4.6)
X-Forwarded-For: 192.168.111.55
Cache-Control: max-age=259200
Connection: keep-alive


--
deb.debian.org to squid:
-
HTTP/1.1 200 OK
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
X-Clacks-Overhead: GNU Terry Pratchett
Cache-Control: public, max-age=2592000
Content-Type: application/x-debian-package
Via: 1.1 varnish
Content-Length: 1280476
Accept-Ranges: bytes
Date: Fri, 03 Apr 2020 05:28:46 GMT
Via: 1.1 varnish
Age: 4248100
Connection: keep-alive
X-Served-By: cache-ams21028-AMS, cache-wdc5559-WDC
X-Cache: HIT, HIT
X-Cache-Hits: 1, 2
X-Timer: S1585891726.434375,VS0,VE0

--
squid to apt:
-
HTTP/1.1 200 OK
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
X-Clacks-Overhead: GNU Terry Pratchett
Cache-Control: public, max-age=2592000
Content-Type: application/x-debian-package
Content-Length: 1280476
Accept-Ranges: bytes
Date: Fri, 03 Apr 2020 05:28:46 GMT
Age: 4248100
X-Served-By: cache-ams21028-AMS, cache-wdc5559-WDC
X-Cache: HIT, HIT
X-Cache-Hits: 1, 2
X-Timer: S1585891726.434375,VS0,VE0
X-Cache: MISS from tproxy
X-Cache-Lookup: MISS from tproxy:3130
Via: 1.1 varnish, 1.1 varnish, 1.1 tproxy (squid/4.6)
Connection: keep-alive


--



wget to squid:
-
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
User-Agent: Wget/1.20.1 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: deb.debian.org
Connection: Keep-Alive


--
squid to deb.debian.org:
--
GET /debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb HTTP/1.1
User-Agent: Wget/1.20.1 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: deb.debian.org
Via: 1.1 tproxy (squid/4.6)
X-Forwarded-For: 192.168.111.55
Cache-Control: max-age=259200
Connection: keep-alive


--
deb.debian.org to squid:
-
HTTP/1.1 200 OK
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
X-Clacks-Overhead: GNU Terry Pratchett
Cache-Control: public, max-age=2592000
Content-Type: application/x-debian-package
Via: 1.1 varnish
Content-Length: 1280476
Accept-Ranges: bytes
Date: Fri, 03 Apr 2020 05:28:49 GMT
Via: 1.1 varnish
Age: 4248102
Connection: keep-alive
X-Served-By: cache-ams21028-AMS, cache-wdc5583-WDC
X-Cache: HIT, HIT
X-Cache-Hits: 1, 1
X-Timer: S1585891729.316362,VS0,VE0

--
squid to wget:
-
HTTP/1.1 200 OK
Server: Apache
X-Content-Type-Options: nosniff
X-Frame-Options: sameorigin
Referrer-Policy: no-referrer
X-Xss-Protection: 1
Last-Modified: Sat, 15 Jun 2019 17:46:35 GMT
ETag: "1389dc-58b605823fa6e"
X-Clacks-Overhead: GNU Terry Pratchett
Cache-Control: public, max-age=2592000
Content-Type: application/x-debian-package
Content-Length: 1280476
Accept-Ranges: bytes
Date: Fri, 03 Apr 2020 05:28:49 GMT
Age: 4248102
X-Served-By: cache-ams21028-AMS, cache-wdc5583-WDC
X-Cache: HIT, HIT
X-Cache-Hits: 1, 1
X-Timer: S1585891729.316362,VS0,VE0
X-Cache: MISS from tproxy
X-Cache-Lookup: MISS from tproxy:3130
Via: 1.1 varnish, 1.1 varnish, 1.1 tproxy (squid/4.6)
Connection: keep-alive


--

___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users


[squid-users] Squid transparent not caching apt requests from deb.debian.org

2020-04-03 Thread zrm

Greetings! Today I bring you a Squid cache mystery.

I configured a simple transparent proxy to cache package downloads for 
Debian, using Squid on Debian 10. When apt clients download packages 
from deb.debian.org, Squid says TCP_MISS, downloads the package, and 
then doesn't cache it. The mystery is, why not.


It caches other requests by other applications. It caches other requests 
by apt -- packages from security.debian.org get cached. It caches 
requests for deb.debian.org when the client is not apt -- using wget or 
curl on the same URL causes it to be cached, as does pasting the HTTP 
request into netcat. The same HTTP request that apt sends. I checked the 
packets with WireShark and the TCP payload for the request is byte for 
byte identical, but when apt does it, it isn't cached. I'm not sure how 
it's even distinguishing the requests in order to behave differently.


These are the changes from the default squid.conf packaged with Debian 10:
http_access allow localnet
http_port 3130
http_port 3128 intercept
maximum_object_size_in_memory 4 MB

Here are repeated requests using "apt remove -y vim;apt install vim", 
access.log:


1585891724.223560 192.168.111.55 TCP_MISS/200 1281195 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.64.204 application/x-debian-package
1585891726.697277 192.168.111.55 TCP_MISS/200 1281195 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.64.204 application/x-debian-package


store.log:

1585891535.154 RELEASE -1  040017120100  200 
1585891535 1560620795 1588483535 application/x-debian-package 
1280476/1280476 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb
1585891724.223 RELEASE -1  050017120100  200 
1585891723 1560620795 1588483723 application/x-debian-package 
1280476/1280476 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb


But here are repeated requests for the same url using wget, access.log:

1585891729.700433 192.168.111.55 TCP_MISS/200 1281195 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.66.133 application/x-debian-package
1585891731.089 70 192.168.111.55 TCP_REFRESH_UNMODIFIED/200 1281232 
GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb - 
ORIGINAL_DST/199.232.66.133 application/x-debian-package


store.log:

1585891726.697 RELEASE -1  060017120100  200 
1585891726 1560620795 1588483726 application/x-debian-package 
1280476/1280476 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb
1585891731.047 RELEASE -1  080017120100  304 
1585891731-1 1588483731 unknown -1/-293 GET 
http://deb.debian.org/debian/pool/main/v/vim/vim_8.1.0875-5_amd64.deb


In the first case we get TCP_MISS every time because it isn't caching 
the data, in the second case it's only the first time and after that we 
get TCP_REFRESH_UNMODIFIED. But how and why is this happening?

___
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users