Hi,
Thanks for the reply.

We have configured setMaxUDPOutstanding(65535), but the backends are still 
being marked down. The following messages appear frequently in the logs:

Timeout while waiting for the health check response from backend 192.168.1.1:53
Timeout while waiting for the health check response from backend 192.168.1.2:53
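
For context, these timeouts relate to the per-backend health-check settings 
passed to newServer(). A minimal sketch of the knobs involved, assuming a 
dnsdist version whose newServer() accepts checkTimeout (in milliseconds); the 
checkTimeout value is illustrative only and is not part of our configuration:

```lua
-- Illustrative only: checkTimeout=2000 is an assumed value, not a recommendation.
newServer({
  address='192.168.1.1',
  checkInterval=5,      -- seconds between health checks (as in our config)
  maxCheckFailures=3,   -- consecutive failures before the backend is marked down
  checkTimeout=2000     -- assumed: wait up to 2 s for each health-check reply
})
```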

Please have a look at the dnsdist configuration below and help us find the 
misconfiguration. We added 16 listeners and 8+8 backend entries to match the 
available vCPUs (2 sockets x 8 cores):

controlSocket('127.0.0.1:5199')
setKey("")

---- Listen addresses
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })

---- Back-end server
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=1})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=2})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=3})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=4})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=5})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=6})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=7})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=8})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=9})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=10})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=11})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=12})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=13})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=14})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=15})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, 
weight=4, qps=40000, order=16})

setMaxUDPOutstanding(65535)

---- Server Load Balancing Policy
setServerPolicy(leastOutstanding)

---- Web-server
webserver('192.168.0.1:8083')
setWebserverConfig({acl='192.168.0.0/24', password='Secret'})

---- Customers Policy
customerACLs={'192.168.1.0/24'}
setACL(customerACLs)

pc = newPacketCache(300000, {maxTTL=86400, minTTL=0,
temporaryFailureTTL=60, staleTTL=60, dontAge=false})
getPool(""):setCache(pc)

setVerboseHealthChecks(true)
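
Since the dnsdist configuration file is Lua, the repeated addLocal() and 
newServer() lines above could probably be generated with plain loops. A minimal 
sketch, assuming the same addresses and parameters as above (the variable names 
backends and order are only for illustration):

```lua
-- Sketch only: same 16 listeners as above, each opened with reusePort
for i = 1, 16 do
  addLocal('192.168.0.1:53', { reusePort=true })
end

-- 8 entries per backend address, with order running from 1 to 16
local backends = { '192.168.1.1', '192.168.1.2' }
local order = 1
for _, addr in ipairs(backends) do
  for i = 1, 8 do
    newServer({address=addr, maxCheckFailures=3, checkInterval=5,
               weight=4, qps=40000, order=order})
    order = order + 1
  end
end
```

This should be equivalent to the explicit lines and makes it easier to change 
the number of listeners or backend entries in one place.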

Server specs are as follows:
dnsdist LB server: 16 vCPUs, 16 GB RAM, virtio NIC (10G) with 16 queues 
(multiqueue).
Backend BIND 9 servers: 16 vCPUs, 16 GB RAM, virtio NIC (10G) with 16 queues 
(multiqueue).

We are trying to handle 500K qps (we will increase the hardware specs if 
required), or at least 100K qps with the above specs.


Regards,
Rais 

-----Original Message-----
From: dnsdist <dnsdist-boun...@mailman.powerdns.com> On Behalf Of 
dnsdist-requ...@mailman.powerdns.com
Sent: Wednesday, March 23, 2022 5:00 PM
To: dnsdist@mailman.powerdns.com
Subject: dnsdist Digest, Vol 79, Issue 3


Today's Topics:

   1. dnsdist[29321]: Marking downstream IP:53 as 'down' (Rais Ahmed)
   2. Re: dnsdist[29321]: Marking downstream IP:53 as 'down'
      (Remi Gacogne)


----------------------------------------------------------------------

Message: 1
Date: Tue, 22 Mar 2022 23:00:25 +0000
From: Rais Ahmed <rais.ah...@tes.com.pk>
To: "dnsdist@mailman.powerdns.com" <dnsdist@mailman.powerdns.com>
Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
Message-ID:
        
<paxpr08mb70737e4e1ccefc4a7f61e1e6a0...@paxpr08mb7073.eurprd08.prod.outlook.com>
        
Content-Type: text/plain; charset="us-ascii"

Hi,

We have configured a dnsdist instance to handle around 500k QPS, but we are 
seeing the downstreams marked down frequently once QPS goes above 25k. Below 
are the logs we found related to the issue.

dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
dnsdist[29321]: Marking downstream server2 IP:53 as 'down'

------------------------------

Message: 2
Date: Wed, 23 Mar 2022 10:32:22 +0100
From: Remi Gacogne <remi.gaco...@powerdns.com>
To: Rais Ahmed <rais.ah...@tes.com.pk>, "dnsdist@mailman.powerdns.com"
        <dnsdist@mailman.powerdns.com>
Subject: Re: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as
        'down'
Message-ID: <5a95cbeb-7c82-9bc1-0b4c-8726f8144...@powerdns.com>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hi,

 > We have configured a dnsdist instance to handle around 500k QPS, but we are
 > seeing the downstreams marked down frequently once QPS goes above 25k. Below
 > are the logs we found related to the issue.
 >
 > dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
 >
 > dnsdist[29321]: Marking downstream server2 IP:53 as 'down'

You might be able to get more information about why the health-checks are 
failing by adding setVerboseHealthChecks(true) to your configuration.

It usually happens because the backend is overwhelmed and needs to be tuned to 
handle the load, but it might also be caused by a network issue, like a link 
reaching its maximum capacity, or by dnsdist itself being overwhelmed and 
needing tuning (like increasing the number of
newServer() directives, see [1]).

[1]: 
https://dnsdist.org/advanced/tuning.html#udp-and-incoming-dns-over-https
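
As a rough illustration of the two suggestions above (a sketch, not taken from 
the original message; 192.0.2.10 is a placeholder backend address and the count 
of four entries is arbitrary):

```lua
-- Log the reason whenever a health check fails
setVerboseHealthChecks(true)

-- Declare the same backend several times (placeholder address), as suggested
-- above for high-load setups; the count of 4 is arbitrary in this sketch
for i = 1, 4 do
  newServer({address='192.0.2.10:53'})
end
```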

Best regards,
--
Remi Gacogne
PowerDNS.COM BV - https://www.powerdns.com/


------------------------------

Subject: Digest Footer

_______________________________________________
dnsdist mailing list
dnsdist@mailman.powerdns.com
https://mailman.powerdns.com/mailman/listinfo/dnsdist


------------------------------

End of dnsdist Digest, Vol 79, Issue 3
**************************************