Re: [squid-users] IPv4 addresses go missing - markAsBad wrong?

2024-01-16 Thread Alex Rousskov

On 2024-01-16 10:44, Stephen Borrill wrote:

On 16/01/2024 14:43, Stephen Borrill wrote:
I have created a local DNS entry for 
forcesafesearch.google.com that only returns the A record. I think 
that should work around it (for that site, but not others).


Huh, it appears not to work around it properly. See error of "no DNS 
records" when it has literally just found the address in the cache.


These level-7 debugging records are meant for developers. The snippet 
below is not as self-contradictory as it may appear to a casual 
observer. It implies that the transaction hit a cached _set_ of DNS 
lookups. That set was previously formed from a usable DNS A response 
record (216.239.38.120) and an empty DNS  response ("No DNS records").


Alex.


2024/01/16 15:40:06.409 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
forcesafesearch.google.com
2024/01/16 15:40:06.409 kid1| 14,4| ipcache.cc(657) 
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for 
'forcesafesearch.google.com'
2024/01/16 15:40:06.409 kid1| 14,7| ipcache.cc(253) forwardIp: 
216.239.38.120
2024/01/16 15:40:06.409 kid1| 44,2| peer_select.cc(1174) handlePath: 
PeerSelector260781 found conn1888968 local=0.0.0.0 
remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #1 for 
forcesafesearch.google.com:443
2024/01/16 15:40:06.409 kid1| 44,2| peer_select.cc(1180) handlePath: 
always_direct = ALLOWED
2024/01/16 15:40:06.409 kid1| 44,2| peer_select.cc(1181) handlePath: 
never_direct = DENIED
2024/01/16 15:40:06.409 kid1| 44,2| peer_select.cc(1182) handlePath:
timedout = 0
2024/01/16 15:40:06.409 kid1| 14,7| ipcache.cc(236) finalCallback: 
0x189fb5e38  lookup_err=No DNS records



On 10/01/2024 12:40, Stephen Borrill wrote:

On 09/01/2024 15:42, Alex Rousskov wrote:

On 2024-01-09 05:56, Stephen Borrill wrote:

On 09/01/2024 09:51, Stephen Borrill wrote:

On 09/01/2024 03:41, Alex Rousskov wrote:

On 2024-01-08 08:31, Stephen Borrill wrote:
I'm trying to determine why squid 6.x (seen with 6.5) 
connected via IPv4-only periodically fails to connect to the 
destination and then requires a restart to fix it (reload is 
not sufficient).


The problem appears to be that a host that has one address 
each of IPv4 and IPv6 occasionally has its IPv4 address go 
missing as a destination. On closer inspection, this appears 
to happen when the IPv6 address (not the IPv4) address is 
marked as bad.


ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in 
216.239.38.120 #1/2-0



Thank you for sharing more debugging info!


The following seemed odd to. It finds an IPv4 address (this host 
does not have IPv6), puts it in the cache and then says "No DNS 
records":


2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617) 
nbgethostbyname: schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313) ipcacheRelease: 
ipcacheRelease: Releasing entry for 'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670) 
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for 
'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480) ipcacheParse: 1 
answers for schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use 
first 69 from RR TTL 69
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood: 
schoolbase.online #1 20.54.32.34
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp: 
20.54.32.34
2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174) 
handlePath: PeerSelector72389 found conn564274 local=0.0.0.0 
remote=20.54.32.34:443 HIER_DIRECT flags=1, destination #1 for 
schoolbase.online:443
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError: 
ERROR: DNS failure while resolving schoolbase.online: No DNS records
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586) 
ipcacheHandleReply: done with schoolbase.online: 20.54.32.34 #1/1-0
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback: 
0x1b7381f38  lookup_err=No DNS records


It seemed to happen about the same time as the other failure, so 
perhaps another symptom of the same.


The above log line is self-contradictory AFAICT: It says that the 
cache has both IPv6-looking and IPv4-looking address at the same 
cache position (0) and, judging by the corresponding code, those 
two IP addresses are equal. This is not possible (for those 
specific IP address values). The subsequent Squid behavior can be 
explained by this (unexplained) conflict.


I assume you are running official Squid v6.5 code.


Yes, compiled from source on NetBSD. I have the patch I refer to 
here applied too:

https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html


I can suggest the following two steps for going forward:

1. Upgrade to the latest Squid v6 in hope that the problem goes away.


I have just upgraded to 6.6.

2. If 

Re: [squid-users] IPv4 addresses go missing - markAsBad wrong?

2024-01-16 Thread Stephen Borrill

On 16/01/2024 14:43, Stephen Borrill wrote:

On 16/01/2024 14:37, Alex Rousskov wrote:

On 2024-01-16 06:01, Stephen Borrill wrote:
The problem is no different with 6.6. Is there any more debugging I 
can provide, Alex?


Yes, but I need to give you a patch that adds that (temporary) 
debugging first (assuming I fail to reproduce the problem in the lab). 
The ball is on my side (unless somebody else steps in). Unfortunately, 
I do not have any free time for any of that right now. If you do not 
hear from me sooner, please ping me again on or after February 8, 2024.


Thanks. In the meantime, I have created a local DNS entry for 
forcesafesearch.google.com that only returns the A record. I think that 
should work around it (for that site, but not others).


Huh, it appears not to work around it properly. See error of "no DNS 
records" when it has literally just found the address in the cache.


2024/01/16 15:40:06.409 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
forcesafesearch.google.com
2024/01/16 15:40:06.409 kid1| 14,4| ipcache.cc(657) 
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for 
'forcesafesearch.google.com'
2024/01/16 15:40:06.409 kid1| 14,7| ipcache.cc(253) forwardIp: 
216.239.38.120
2024/01/16 15:40:06.409 kid1| 44,2| peer_select.cc(1174) handlePath: 
PeerSelector260781 found conn1888968 local=0.0.0.0 
remote=216.239.38.120:443 HIER_DIRECT flags=1, destination #1 for 
forcesafesearch.google.com:443
2024/01/16 15:40:06.409 kid1| 44,2| peer_select.cc(1180) handlePath: 
always_direct = ALLOWED
2024/01/16 15:40:06.409 kid1| 44,2| peer_select.cc(1181) handlePath: 
never_direct = DENIED
2024/01/16 15:40:06.409 kid1| 44,2| peer_select.cc(1182) handlePath: 
   timedout = 0
2024/01/16 15:40:06.409 kid1| 14,7| ipcache.cc(236) finalCallback: 
0x189fb5e38  lookup_err=No DNS records



On 10/01/2024 12:40, Stephen Borrill wrote:

On 09/01/2024 15:42, Alex Rousskov wrote:

On 2024-01-09 05:56, Stephen Borrill wrote:

On 09/01/2024 09:51, Stephen Borrill wrote:

On 09/01/2024 03:41, Alex Rousskov wrote:

On 2024-01-08 08:31, Stephen Borrill wrote:
I'm trying to determine why squid 6.x (seen with 6.5) connected 
via IPv4-only periodically fails to connect to the destination 
and then requires a restart to fix it (reload is not sufficient).


The problem appears to be that a host that has one address each 
of IPv4 and IPv6 occasionally has its IPv4 address go missing 
as a destination. On closer inspection, this appears to happen 
when the IPv6 address (not the IPv4) address is marked as bad.


ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in 
216.239.38.120 #1/2-0



Thank you for sharing more debugging info!


The following seemed odd to. It finds an IPv4 address (this host 
does not have IPv6), puts it in the cache and then says "No DNS 
records":


2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313) ipcacheRelease: 
ipcacheRelease: Releasing entry for 'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670) 
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for 
'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480) ipcacheParse: 1 
answers for schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use 
first 69 from RR TTL 69
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood: 
schoolbase.online #1 20.54.32.34
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp: 
20.54.32.34
2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174) handlePath: 
PeerSelector72389 found conn564274 local=0.0.0.0 
remote=20.54.32.34:443 HIER_DIRECT flags=1, destination #1 for 
schoolbase.online:443
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError: 
ERROR: DNS failure while resolving schoolbase.online: No DNS records
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586) 
ipcacheHandleReply: done with schoolbase.online: 20.54.32.34 #1/1-0
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback: 
0x1b7381f38  lookup_err=No DNS records


It seemed to happen about the same time as the other failure, so 
perhaps another symptom of the same.


The above log line is self-contradictory AFAICT: It says that the 
cache has both IPv6-looking and IPv4-looking address at the same 
cache position (0) and, judging by the corresponding code, those 
two IP addresses are equal. This is not possible (for those 
specific IP address values). The subsequent Squid behavior can be 
explained by this (unexplained) conflict.


I assume you are running official Squid v6.5 code.


Yes, compiled from source on NetBSD. I have the patch I refer to 
here applied too:

https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html


I can suggest the 

Re: [squid-users] IPv4 addresses go missing - markAsBad wrong?

2024-01-16 Thread Stephen Borrill

On 16/01/2024 14:37, Alex Rousskov wrote:

On 2024-01-16 06:01, Stephen Borrill wrote:
The problem is no different with 6.6. Is there any more debugging I 
can provide, Alex?


Yes, but I need to give you a patch that adds that (temporary) debugging 
first (assuming I fail to reproduce the problem in the lab). The ball is 
on my side (unless somebody else steps in). Unfortunately, I do not have 
any free time for any of that right now. If you do not hear from me 
sooner, please ping me again on or after February 8, 2024.


Thanks. In the meantime, I have created a local DNS entry for 
forcesafesearch.google.com that only returns the A record. I think that 
should work around it (for that site, but not others).



On 10/01/2024 12:40, Stephen Borrill wrote:

On 09/01/2024 15:42, Alex Rousskov wrote:

On 2024-01-09 05:56, Stephen Borrill wrote:

On 09/01/2024 09:51, Stephen Borrill wrote:

On 09/01/2024 03:41, Alex Rousskov wrote:

On 2024-01-08 08:31, Stephen Borrill wrote:
I'm trying to determine why squid 6.x (seen with 6.5) connected 
via IPv4-only periodically fails to connect to the destination 
and then requires a restart to fix it (reload is not sufficient).


The problem appears to be that a host that has one address each 
of IPv4 and IPv6 occasionally has its IPv4 address go missing as 
a destination. On closer inspection, this appears to happen when 
the IPv6 address (not the IPv4) address is marked as bad.


ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in 
216.239.38.120 #1/2-0



Thank you for sharing more debugging info!


The following seemed odd to. It finds an IPv4 address (this host does 
not have IPv6), puts it in the cache and then says "No DNS records":


2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313) ipcacheRelease: 
ipcacheRelease: Releasing entry for 'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670) 
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for 
'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480) ipcacheParse: 1 
answers for schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use 
first 69 from RR TTL 69
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood: 
schoolbase.online #1 20.54.32.34
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp: 
20.54.32.34
2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174) handlePath: 
PeerSelector72389 found conn564274 local=0.0.0.0 
remote=20.54.32.34:443 HIER_DIRECT flags=1, destination #1 for 
schoolbase.online:443
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError: 
ERROR: DNS failure while resolving schoolbase.online: No DNS records
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586) 
ipcacheHandleReply: done with schoolbase.online: 20.54.32.34 #1/1-0
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback: 
0x1b7381f38  lookup_err=No DNS records


It seemed to happen about the same time as the other failure, so 
perhaps another symptom of the same.


The above log line is self-contradictory AFAICT: It says that the 
cache has both IPv6-looking and IPv4-looking address at the same 
cache position (0) and, judging by the corresponding code, those two 
IP addresses are equal. This is not possible (for those specific IP 
address values). The subsequent Squid behavior can be explained by 
this (unexplained) conflict.


I assume you are running official Squid v6.5 code.


Yes, compiled from source on NetBSD. I have the patch I refer to here 
applied too:

https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html


I can suggest the following two steps for going forward:

1. Upgrade to the latest Squid v6 in hope that the problem goes away.


I have just upgraded to 6.6.

2. If the problem is still there, patch the latest Squid v6 to add 
more debugging in hope to explain what is going on. This may take a 
few iterations, and it will take me some time to produce the 
necessary debugging patch.


Unfortunately, I don't have a test case that will cause the problem 
so I need to run this at a customer's production site that is 
particularly affected by it. Luckily, the problem recurs pretty quickly.


Here's a run with 6.6 where the number of destinations drops from 2 
to 1 before reverting. Not seen this before - usually once it has 
dropped to 1 (the IPv6 address), it stays there until a restart (and 
this did happen about a minute after this log fragment). Happy to 
test out any debugging patch.


2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
forcesafesearch.google.com
2024/01/10 11:55:49.849 kid1| 14,3| Address.cc(389) lookupHostIP: 
Given Non-IP 'forcesafesearch.google.com': hostname or 

Re: [squid-users] IPv4 addresses go missing - markAsBad wrong?

2024-01-16 Thread Alex Rousskov

On 2024-01-16 06:01, Stephen Borrill wrote:
The problem is no different with 6.6. Is there any more debugging I can 
provide, Alex?


Yes, but I need to give you a patch that adds that (temporary) debugging 
first (assuming I fail to reproduce the problem in the lab). The ball is 
on my side (unless somebody else steps in). Unfortunately, I do not have 
any free time for any of that right now. If you do not hear from me 
sooner, please ping me again on or after February 8, 2024.



Thank you,

Alex.


On 10/01/2024 12:40, Stephen Borrill wrote:

On 09/01/2024 15:42, Alex Rousskov wrote:

On 2024-01-09 05:56, Stephen Borrill wrote:

On 09/01/2024 09:51, Stephen Borrill wrote:

On 09/01/2024 03:41, Alex Rousskov wrote:

On 2024-01-08 08:31, Stephen Borrill wrote:
I'm trying to determine why squid 6.x (seen with 6.5) connected 
via IPv4-only periodically fails to connect to the destination 
and then requires a restart to fix it (reload is not sufficient).


The problem appears to be that a host that has one address each 
of IPv4 and IPv6 occasionally has its IPv4 address go missing as 
a destination. On closer inspection, this appears to happen when 
the IPv6 address (not the IPv4) address is marked as bad.


ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in 
216.239.38.120 #1/2-0



Thank you for sharing more debugging info!


The following seemed odd to. It finds an IPv4 address (this host does 
not have IPv6), puts it in the cache and then says "No DNS records":


2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313) ipcacheRelease: 
ipcacheRelease: Releasing entry for 'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670) 
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for 
'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480) ipcacheParse: 1 
answers for schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use 
first 69 from RR TTL 69
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood: 
schoolbase.online #1 20.54.32.34
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp: 
20.54.32.34
2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174) handlePath: 
PeerSelector72389 found conn564274 local=0.0.0.0 
remote=20.54.32.34:443 HIER_DIRECT flags=1, destination #1 for 
schoolbase.online:443
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError: 
ERROR: DNS failure while resolving schoolbase.online: No DNS records
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586) 
ipcacheHandleReply: done with schoolbase.online: 20.54.32.34 #1/1-0
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback: 
0x1b7381f38  lookup_err=No DNS records


It seemed to happen about the same time as the other failure, so 
perhaps another symptom of the same.


The above log line is self-contradictory AFAICT: It says that the 
cache has both IPv6-looking and IPv4-looking address at the same 
cache position (0) and, judging by the corresponding code, those two 
IP addresses are equal. This is not possible (for those specific IP 
address values). The subsequent Squid behavior can be explained by 
this (unexplained) conflict.


I assume you are running official Squid v6.5 code.


Yes, compiled from source on NetBSD. I have the patch I refer to here 
applied too:

https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html


I can suggest the following two steps for going forward:

1. Upgrade to the latest Squid v6 in hope that the problem goes away.


I have just upgraded to 6.6.

2. If the problem is still there, patch the latest Squid v6 to add 
more debugging in hope to explain what is going on. This may take a 
few iterations, and it will take me some time to produce the 
necessary debugging patch.


Unfortunately, I don't have a test case that will cause the problem so 
I need to run this at a customer's production site that is 
particularly affected by it. Luckily, the problem recurs pretty quickly.


Here's a run with 6.6 where the number of destinations drops from 2 to 
1 before reverting. Not seen this before - usually once it has dropped 
to 1 (the IPv6 address), it stays there until a restart (and this did 
happen about a minute after this log fragment). Happy to test out any 
debugging patch.


2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
forcesafesearch.google.com
2024/01/10 11:55:49.849 kid1| 14,3| Address.cc(389) lookupHostIP: 
Given Non-IP 'forcesafesearch.google.com': hostname or servname not 
provided or not known
2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(657) 
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for 
'forcesafesearch.google.com'
2024/01/10 11:55:49.849 kid1| 14,7| 

Re: [squid-users] IPv4 addresses go missing - markAsBad wrong?

2024-01-16 Thread Stephen Borrill
The problem is no different with 6.6. Is there any more debugging I can 
provide, Alex?


On 10/01/2024 12:40, Stephen Borrill wrote:

On 09/01/2024 15:42, Alex Rousskov wrote:

On 2024-01-09 05:56, Stephen Borrill wrote:

On 09/01/2024 09:51, Stephen Borrill wrote:

On 09/01/2024 03:41, Alex Rousskov wrote:

On 2024-01-08 08:31, Stephen Borrill wrote:
I'm trying to determine why squid 6.x (seen with 6.5) connected 
via IPv4-only periodically fails to connect to the destination and 
then requires a restart to fix it (reload is not sufficient).


The problem appears to be that a host that has one address each of 
IPv4 and IPv6 occasionally has its IPv4 address go missing as a 
destination. On closer inspection, this appears to happen when the 
IPv6 address (not the IPv4) address is marked as bad.


ipcache.cc(990) have: [2001:4860:4802:32::78]:443 at 0 in 
216.239.38.120 #1/2-0



Thank you for sharing more debugging info!


The following seemed odd to. It finds an IPv4 address (this host does 
not have IPv6), puts it in the cache and then says "No DNS records":


2024/01/09 12:31:24.020 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(313) ipcacheRelease: 
ipcacheRelease: Releasing entry for 'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(670) 
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for 
'schoolbase.online'
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(480) ipcacheParse: 1 
answers for schoolbase.online
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(995) have:  no 
20.54.32.34 in [no cached IPs]
2024/01/09 12:31:24.020 kid1| 14,5| ipcache.cc(549) updateTtl: use first 
69 from RR TTL 69
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(535) addGood: 
schoolbase.online #1 20.54.32.34

2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(253) forwardIp: 20.54.32.34
2024/01/09 12:31:24.020 kid1| 44,2| peer_select.cc(1174) handlePath: 
PeerSelector72389 found conn564274 local=0.0.0.0 remote=20.54.32.34:443 
HIER_DIRECT flags=1, destination #1 for schoolbase.online:443
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(459) latestError: ERROR: 
DNS failure while resolving schoolbase.online: No DNS records
2024/01/09 12:31:24.020 kid1| 14,3| ipcache.cc(586) ipcacheHandleReply: 
done with schoolbase.online: 20.54.32.34 #1/1-0
2024/01/09 12:31:24.020 kid1| 14,7| ipcache.cc(236) finalCallback: 
0x1b7381f38  lookup_err=No DNS records


It seemed to happen about the same time as the other failure, so perhaps 
another symptom of the same.


The above log line is self-contradictory AFAICT: It says that the 
cache has both IPv6-looking and IPv4-looking address at the same cache 
position (0) and, judging by the corresponding code, those two IP 
addresses are equal. This is not possible (for those specific IP 
address values). The subsequent Squid behavior can be explained by 
this (unexplained) conflict.


I assume you are running official Squid v6.5 code.


Yes, compiled from source on NetBSD. I have the patch I refer to here 
applied too:

https://lists.squid-cache.org/pipermail/squid-users/2023-November/026279.html


I can suggest the following two steps for going forward:

1. Upgrade to the latest Squid v6 in hope that the problem goes away.


I have just upgraded to 6.6.

2. If the problem is still there, patch the latest Squid v6 to add 
more debugging in hope to explain what is going on. This may take a 
few iterations, and it will take me some time to produce the necessary 
debugging patch.


Unfortunately, I don't have a test case that will cause the problem so I 
need to run this at a customer's production site that is particularly 
affected by it. Luckily, the problem recurs pretty quickly.


Here's a run with 6.6 where the number of destinations drops from 2 to 1 
before reverting. Not seen this before - usually once it has dropped to 
1 (the IPv6 address), it stays there until a restart (and this did 
happen about a minute after this log fragment). Happy to test out any 
debugging patch.


2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(617) nbgethostbyname: 
forcesafesearch.google.com
2024/01/10 11:55:49.849 kid1| 14,3| Address.cc(389) lookupHostIP: Given 
Non-IP 'forcesafesearch.google.com': hostname or servname not provided 
or not known
2024/01/10 11:55:49.849 kid1| 14,4| ipcache.cc(657) 
ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for 
'forcesafesearch.google.com'
2024/01/10 11:55:49.849 kid1| 14,7| ipcache.cc(253) forwardIp: 
[2001:4860:4802:32::78]
2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1174) handlePath: 
PeerSelector300176 found conn2388484 local=[::] 
remote=[2001:4860:4802:32::78]:443 HIER_DIRECT flags=1, destination #1 
for forcesafesearch.google.com:443
2024/01/10 11:55:49.849 kid1| 44,2| peer_select.cc(1180) handlePath: 
always_direct = ALLOWED
2024/01/10 11:55:49.849 kid1| 44,2|