On 7 Aug 2025, at 20:53, brent saner via NANOG wrote:

> On Thu, Aug 7, 2025, 20:45 DurgaPrasad - DatasoftComnet via NANOG <
> [email protected]> wrote:
>
>> Hello all,
>> Do you have any recommendations for recursive DNS servers for a medium
>> sized (20-30k users) ISP.
>> We have used powerdns and unbound but sometimes find the caching times a
>> bit on upper side. Any suggestions between these two or anything new?
>> Also need points on how much we tune the settings
>> pros and cons if any.
>>
>> Thank you /DP
>
> <https://lists.nanog.org/archives/list/[email protected]/message/SUTKDISSISPWQY3YGF25FBQNN2JD5HDP/>
>
>
> It's surprising that you didn't get the performance you hoped for out of
> PowerDNS. You already tried the suggestions in their tuning guide[0], I'm
> assuming?
>
> You may also want to load in entire zones to the hot cache[1].
>
> And there's always horizontal scaling; sometimes you just plain hit limits
> on vertical scale.
>
> I haven't tried it yet, but dnsdist[2] should let you do this.
> (Or keepalived and/or HAproxy, or... etc. Any loadbalancer that can handle
> raw TCP and UDP.)
> Dnsdist in particular seems explicitly targeted towards a large set of
> untrusted clients with additional optional "safeguarding/consumer
> protection" features. Quad9 uses it in some fashion, if I recall correctly.
>
> [0] https://doc.powerdns.com/recursor/performance.html
> [1] https://docs.powerdns.com/recursor/lua-config/ztc.html
> [2] https://www.dnsdist.org/index.html


You beat me to it - dnsdist is an exceptionally robust solution for 
front-ending recursive (or authoritative) servers. Quad9 is indeed using it for 
all our recursive systems, and we split traffic on the "back-end" between 
PowerDNS Recursor and Unbound. dnsdist has a "packet cache" feature which
handles much of the load once warmed, and it answers on DoT/DoH as well as
providing a very rich set of tooling for managing unwanted behaviors. The
combination of dnsdist plus a good recursive resolver should be able to handle
30k users on a single modest chassis with ease, though of course there are very
good reasons to have several similarly configured systems in fail-over models
using ECMP or your favorite routing protocol. (Hot caches work better - try not
to spread load too much.) At this point, I
can't imagine running a recursive system that is open to anything other than a 
tiny number of users without ensuring that dnsdist is in front of it - it's
exactly the right thing and has been sandblasted by a lot of trial-and-error to
make it fast and reliable with lots of features for ISP environments.
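
To make the packet-cache idea concrete, here is a minimal Python sketch of a
front-end cache that answers repeat queries until their TTL runs out. It is a
toy under stated assumptions, not dnsdist's implementation: the class name
ToyPacketCache, the (qname, qtype) key, the max_ttl cap and the
oldest-inserted eviction are all hypothetical choices made for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ToyPacketCache:
    """Illustrative TTL cache keyed on (qname, qtype); not dnsdist's code."""

    def __init__(self, max_entries: int = 100_000, max_ttl: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_entries = max_entries
        self.max_ttl = max_ttl  # assumed cap so nothing is cached "too long"
        self.clock = clock      # injectable clock, handy for testing
        self.entries: Dict[Tuple[str, str], Tuple[float, bytes]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, qname: str, qtype: str) -> Optional[bytes]:
        """Return the cached response, or None on a miss or expired entry."""
        key = (qname.lower(), qtype)
        entry = self.entries.get(key)
        if entry is not None and entry[0] > self.clock():
            self.hits += 1
            return entry[1]
        self.entries.pop(key, None)  # drop the expired entry, if any
        self.misses += 1
        return None

    def put(self, qname: str, qtype: str, response: bytes, ttl: float) -> None:
        """Store a response, capping its lifetime at max_ttl."""
        key = (qname.lower(), qtype)
        if key not in self.entries and len(self.entries) >= self.max_entries:
            # Evict the oldest-inserted entry (dicts keep insertion order).
            self.entries.pop(next(iter(self.entries)))
        self.entries[key] = (self.clock() + min(ttl, self.max_ttl), response)
```

The hits and misses counters are what make a "warmed" cache visible: once the
popular names are in, most get() calls return without touching the back-end.
The max_ttl cap is the kind of knob to look at if cached answers seem to live
too long.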

If a decent-sized system doesn't seem fast, there may be some other underlying
issue behind the perceived slowness. Useful data can be pulled out of dnsdist
as Prometheus-style output - I would suggest instrumenting things and seeing
where the problems are.
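
As a starting point for that instrumentation, here is a small Python sketch
that reads Prometheus-style text and computes a cache hit ratio. The metric
names dnsdist_cache_hits and dnsdist_cache_misses and the sample numbers are
assumptions for illustration - check the names your own dnsdist version
exports. The parser is deliberately simplistic and skips labelled series.

```python
from typing import Dict, Optional

# Made-up sample in Prometheus text exposition format.
SAMPLE = """\
# HELP dnsdist_cache_hits Number of packet cache hits (assumed metric name)
# TYPE dnsdist_cache_hits counter
dnsdist_cache_hits 900
# TYPE dnsdist_cache_misses counter
dnsdist_cache_misses 100
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse unlabelled 'name value' lines; comments and labels are skipped."""
    values: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            values[parts[0]] = float(parts[1])
        except ValueError:
            continue  # ignore anything that is not a plain number
    return values


def cache_hit_ratio(metrics: Dict[str, float],
                    hits_key: str = "dnsdist_cache_hits",
                    misses_key: str = "dnsdist_cache_misses") -> Optional[float]:
    """Return hits / (hits + misses), or None if there is no traffic yet."""
    hits = metrics.get(hits_key, 0.0)
    misses = metrics.get(misses_key, 0.0)
    total = hits + misses
    return hits / total if total else None


if __name__ == "__main__":
    ratio = cache_hit_ratio(parse_metrics(SAMPLE))
    print("no data" if ratio is None else f"cache hit ratio: {ratio:.1%}")
```

These are counters, so on a live system compare two scrapes taken some
minutes apart rather than reading the lifetime totals. A low ratio on a
warmed cache is a hint that load is spread too thin or that the cache is
undersized.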

Now, the original question of "points on how much we tune the settings" - that 
is a much longer discussion, but honestly you can get to 80% goodput without 
too much fiddling.

JT
_______________________________________________
NANOG mailing list 
https://lists.nanog.org/archives/list/[email protected]/message/J4WSKWYCIV7KTCVWXDWT64IGHKQZHERB/