Hi,

I run a number of customer-facing resolvers that use dnsdist in front of several recursive servers, and I'm trying to track down a really strange issue.

Almost all of our traffic is Do53, but other listeners are configured in case someone wants to use them.

I have two different configuration files (both slightly edited for brevity: some downstream servers removed and the ACL shortened):

 * Old config (Lua): https://pastebin.com/v42Aateh
   This configuration has been in use for years. It generates warnings
   when dnsdist starts, but otherwise dnsdist performs as I would
   expect (handling 65K qps on the busiest server)
 * New config (dnsdist 2.0, YAML): https://pastebin.com/NmpU6uP0
   This is the latest attempt at removing these warnings and
   restructuring the configuration (I have also tried before with a
   Lua configuration, with the same outcome)

With the new version (as with previous attempts), I am seeing a huge drop in cache hit ratio, from >95% to <30%, and a corresponding increase in requests to the backends.
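To put rough numbers on the backend impact, here is a minimal sketch of the arithmetic. It assumes the 65K qps peak from the busiest server, takes the hit ratios at exactly 95% and 30% (the real figures are ">95%" and "<30%", so the actual gap is at least this large), and assumes every cache miss becomes one backend query. The function name `backend_qps` is just illustrative.

```python
def backend_qps(total_qps: float, hit_ratio: float) -> float:
    """Queries per second that miss the cache and are forwarded to the backends.

    Assumes each cache miss results in exactly one backend query.
    """
    return total_qps * (1.0 - hit_ratio)


total_qps = 65_000  # peak load on the busiest production server

old_backend = backend_qps(total_qps, 0.95)  # old Lua config, hit ratio taken as 95%
new_backend = backend_qps(total_qps, 0.30)  # new config, hit ratio taken as 30%

print(f"old config: ~{old_backend:.0f} backend qps")
print(f"new config: ~{new_backend:.0f} backend qps")
print(f"increase:   ~{new_backend / old_backend:.0f}x")
```

Under those assumptions, the backends go from about 3,250 qps to about 45,500 qps, roughly a 14-fold increase at the same client load.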

I have tested with dnspyre (Alexa domains file with 33575 hostnames, 10000 concurrent requests to localhost) on a test server, which is able to handle 100K+ qps from a single IP with a >80% cache hit ratio. The main difference between the two servers is hardware: production has six cores (no HT) and 16GB RAM, while the test server has eight cores (with HT) and 64GB RAM. As the old configuration works fine on the production server, I doubt the problem is hardware related.

Has anyone encountered a similar issue, or can anyone suggest possible reasons for the significant drop in cache hit ratio after switching configurations? Any guidance would be much appreciated.

tia,

--
Med venlig hilsen/kind regards
Allan Willems Joergensen - https://nowhere.dk

_______________________________________________
dnsdist mailing list
[email protected]
https://mailman.powerdns.com/mailman/listinfo/dnsdist
