Re: [dnsdist] backend drops metrics for TCP
Hi Christoph,

On 13/09/2023 07:30, Christoph via dnsdist wrote:
> I've switched back to using UDP.
> Is there an easy way to log queries that timeout (2s) - and not log
> any others? To investigate some examples further?

I don't think we have a way to log only these, unfortunately :-/

If you have the dnsdist console set up, you can use grepq('1000ms') to
look at all queries that took more than 1 second, which is usually
indicative of a problem, or even grepq('2000ms'), as dnsdist records
timeouts with a very high response time.

Best regards,
-- 
Remi Gacogne
PowerDNS.COM BV - https://www.powerdns.com/

___
dnsdist mailing list
dnsdist@mailman.powerdns.com
https://mailman.powerdns.com/mailman/listinfo/dnsdist
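
The console commands mentioned in this reply can be sketched as follows. This assumes a working console connection (for example via `dnsdist -c`) and that the slow queries are still present in dnsdist's in-memory ring buffers. The optional second argument limiting the number of entries comes from the grepq documentation, not from this thread, and the value 20 is only illustrative.

```lua
-- Run these in the dnsdist console, not in the configuration file.

-- Entries that took more than one second, usually a sign of trouble.
grepq('1000ms')

-- Entries that took more than two seconds; timeouts show up here
-- because dnsdist records them with a very high response time.
grepq('2000ms')

-- Same selector, limited to 20 entries (illustrative value).
grepq('2000ms', 20)
```

This only inspects recent traffic held in memory; it is not a persistent log of timed-out queries.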
Re: [dnsdist] backend drops metrics for TCP
> This counter will always be 0 for TCP backends indeed, it is only
> incremented when we give up waiting on a UDP response.

Thanks for confirming.

> The default timeout for TCP backends is set at 30s, while for UDP
> responses it is at 2s. So it is very possible that dnsdist no longer
> considers the response a timeout but the application now does. You
> might try to tune the 'tcpRecvTimeout' on `newServer`. Note that this
> suggests that the backend is slow to answer, so tuning dnsdist might
> not help at all and investigating why the backend struggles with these
> queries might be needed.

I've switched back to using UDP.
Is there an easy way to log queries that timeout (2s) - and not log any
others? To investigate some examples further?

https://dnsdist.org/rules-actions.html?highlight=addaction#ERCodeRule
https://dnsdist.org/reference/constants.html#dnsrcode
The only RCode with "time" in it: DNSRCode.BADTIME

Yes, I'm also investigating the increased timeout rate on the backend
Recursor side and I'm in contact with Otto about it. So far, disabling
aggressive NSEC caching has been the most significant workaround for
that problem.

> Do you enable out-of-order processing, via 'maxInFlight' on
> `newServer`?

yes (1k)

> If so, are you sure that the backend actually supports it?

A while back you pointed out a problem in our Recursor config; since
then, Recursor should work with the maxInFlight config.

best regards,
Christoph
Re: [dnsdist] backend drops metrics for TCP
Hello!

On 11/09/2023 22:34, Christoph via dnsdist wrote:
> when playing around with things to reduce the drop rate I noticed that
> TCP based backends always have 0 drops in showServers() output and
> these metrics:
>
> dnsdist_server_drops
> dnsdist_downstream_timeouts
>
> Is that always the case, so that the counter has no meaning for TCP
> based backends, or can this counter be non-zero for TCP backends as
> well?

This counter will always be 0 for TCP backends indeed, it is only
incremented when we give up waiting on a UDP response.

> dnsdist's CPU usage doubled after switching to TCP via tcpOnly=true
> and the DNS timeout rate as measured by the application generating the
> queries running on the same host as dnsdist actually increased after
> switching dnsdist to use TCP instead of UDP.
> So switching to TCP eliminated the drops problem when measured by
> dnsdist but it made things worse for the application.

The default timeout for TCP backends is set at 30s, while for UDP
responses it is at 2s. So it is very possible that dnsdist no longer
considers the response a timeout but the application now does. You might
try to tune the 'tcpRecvTimeout' on `newServer`. Note that this suggests
that the backend is slow to answer, so tuning dnsdist might not help at
all and investigating why the backend struggles with these queries might
be needed.

> All of these values are also at 0:
>
> dnsdist_server_tcpdiedsendingquery{address="127.0.0.1:54"} 0
> dnsdist_server_tcpdiedreadingresponse{address="127.0.0.1:54"} 0
> dnsdist_server_tcpgaveup{address="127.0.0.1:54"} 0
> dnsdist_server_tcpreadtimeouts{address="127.0.0.1:54"} 0
> dnsdist_server_tcpwritetimeouts{address="127.0.0.1:54"} 0
> dnsdist_server_tcpconnecttimeouts{address="127.0.0.1:54"} 0

These are indeed the ones that would indicate a problem between dnsdist
and a TCP backend, as seen by dnsdist.

> Since sockets=NUM in newServer() is only for UDP and
>
> dnsdist_server_tcpcurrentconnections{address="127.0.0.1:54"} 10
>
> suggests it uses only 10 TCP sockets.
> How can this be configured? sockets was set to 32, so this implicit
> change when switching from UDP to TCP might also have an effect here.

dnsdist will create as many outgoing TCP connections as needed by
default, unless instructed otherwise via 'maxConcurrentTCPConnections'
on `newServer`. So from dnsdist's point of view there was no need for
more TCP connections, apparently.

Do you enable out-of-order processing, via 'maxInFlight' on `newServer`?
If so, are you sure that the backend actually supports it?

Best regards,
-- 
Remi Gacogne
PowerDNS.COM BV - https://www.powerdns.com/
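
The `newServer` parameters named in this reply could be combined roughly as in the sketch below. Only the parameter names and the backend address come from the thread; every value is an assumption chosen for illustration (for instance, a 2 second receive timeout to mirror the UDP default), and availability of each parameter depends on the dnsdist version in use.

```lua
-- Hypothetical TCP-only backend definition; values are illustrative only.
newServer({
  address = "127.0.0.1:54",
  -- Forward queries to this backend over TCP only.
  tcpOnly = true,
  -- Give up reading a TCP response after 2s instead of the 30s default,
  -- so dnsdist's view of a timeout is closer to the application's.
  tcpRecvTimeout = 2,
  -- Allow several queries in flight per connection (out-of-order
  -- processing); the backend must support this.
  maxInFlight = 1000,
  -- Upper bound on concurrent TCP connections to this backend. Without
  -- it, dnsdist opens as many connections as it needs.
  maxConcurrentTCPConnections = 32,
})
```

Note that `maxConcurrentTCPConnections` is a cap rather than a target, so setting it would not by itself make dnsdist open more than the 10 connections observed.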
[dnsdist] backend drops metrics for TCP
Hello!

when playing around with things to reduce the drop rate I noticed that
TCP based backends always have 0 drops in showServers() output and these
metrics:

dnsdist_server_drops
dnsdist_downstream_timeouts

Is that always the case, so that the counter has no meaning for TCP
based backends, or can this counter be non-zero for TCP backends as
well?

dnsdist's CPU usage doubled after switching to TCP via tcpOnly=true and
the DNS timeout rate as measured by the application generating the
queries running on the same host as dnsdist actually increased after
switching dnsdist to use TCP instead of UDP.
So switching to TCP eliminated the drops problem when measured by
dnsdist but it made things worse for the application.

All of these values are also at 0:

dnsdist_server_tcpdiedsendingquery{address="127.0.0.1:54"} 0
dnsdist_server_tcpdiedreadingresponse{address="127.0.0.1:54"} 0
dnsdist_server_tcpgaveup{address="127.0.0.1:54"} 0
dnsdist_server_tcpreadtimeouts{address="127.0.0.1:54"} 0
dnsdist_server_tcpwritetimeouts{address="127.0.0.1:54"} 0
dnsdist_server_tcpconnecttimeouts{address="127.0.0.1:54"} 0

dnsdist_server_latency and dnsdist_server_tcplatency are on the same
level after switching to TCP for the specific backend.

Since sockets=NUM in newServer() is only for UDP and

dnsdist_server_tcpcurrentconnections{address="127.0.0.1:54"} 10

suggests it uses only 10 TCP sockets.
How can this be configured? sockets was set to 32, so this implicit
change when switching from UDP to TCP might also have an effect here.

best regards,
Christoph
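
The two backend setups compared in this message might look roughly like the sketch below. The full configuration is not shown in the thread, so this is an assumption built only from the address, `sockets=32` and `tcpOnly=true` mentioned above.

```lua
-- UDP variant: 32 UDP sockets toward the backend.
newServer({address = "127.0.0.1:54", sockets = 32})

-- TCP variant, as an alternative to the line above (use one or the
-- other). The sockets option only applies to UDP, so it is left out.
-- newServer({address = "127.0.0.1:54", tcpOnly = true})
```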