I have run into a strange situation in which Squid repeatedly
returns DNS resolution errors even though I can resolve the same
names at the command line, and even fetch the same pages with wget.
Running squid -k reconfigure fixes the problem immediately.

My first theory was that there is a sporadic network outage and
that, once it's fixed, Squid keeps answering from its negative DNS
cache of failed lookups.
So I tried lowering the negative_dns_ttl setting to a few seconds,
but it didn't help.

This problem is hard to diagnose because the server is in
production: I can't afford to crank up the logging and wait until it
happens again, and when it does happen I have to fix it quickly
rather than spend time investigating the cause.

I wrote a Perl script that "tails" the access log to detect when this
error condition happens, then automatically runs squid -k reconfigure
and informs me via email.
This script is useful for warning me about error conditions, even
ones unrelated to Squid, and I would like to keep it even once the
root problem is solved.

But at the same time I would like to improve it by using the Cache
Manager interface, or SNMP, and somehow get the number of failed
requests and total requests served in the last x minutes.  Is it
possible to get this information in this way instead of reading the
access log?
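
To illustrate what I have in mind: if the Cache Manager can return a
plain-text page of cumulative counters as "name = value" lines, the
figures for the last x minutes could be derived by sampling that page
twice and subtracting. The sketch below (Python) assumes that format
and assumes counter names client_http.requests and
client_http.errors; both are assumptions I have not verified against
my Squid version, and fetching the page itself is left out.

```python
def parse_counters(text):
    """Parse 'name = value' lines into a dict, skipping anything else."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue
        try:
            counters[key.strip()] = float(value.strip())
        except ValueError:
            continue  # value is not a plain number
    return counters


def interval_stats(earlier, later,
                   total_key="client_http.requests",   # assumed name
                   error_key="client_http.errors"):    # assumed name
    """Return (errors, total) between two samples, or None after a reset."""
    total = later[total_key] - earlier[total_key]
    errors = later[error_key] - earlier[error_key]
    if total < 0 or errors < 0:
        # Counters went backwards, e.g. because Squid was restarted.
        return None
    return errors, total
```

Taking one sample every x minutes and calling interval_stats() on
consecutive samples would give exactly the two numbers I am after.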

So I would appreciate any help with both issues:
1. the strange DNS cache problem
2. improving my monitoring via the Cache Manager or SNMP

Thanks.
