Re: [Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

2020-09-11 Thread Thomas Mieslinger via Pdns-users

On 9/10/20 3:40 PM, Christian Degenkolb wrote:

> what is a reasonable low value for udp-truncation-threshold? I tried
> with 900 and 600 (as low as half the default value) but found no
> improvements.


I use 1220 because the commonly recommended 1232 does not work for me
with IPv6.
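
For context, the 1232 figure is usually derived from the IPv6 minimum MTU of 1280 bytes, minus the 40-byte IPv6 header and the 8-byte UDP header. The sketch below only reproduces that arithmetic; the helper name is made up for illustration, and why 1232 fails in this particular network is not stated in the thread.

```python
# Illustrative arithmetic only; max_dns_payload is a hypothetical helper.
IPV4_HEADER = 20   # bytes, without IP options
IPV6_HEADER = 40   # bytes, fixed header without extension headers
UDP_HEADER = 8     # bytes


def max_dns_payload(mtu: int, ipv6: bool = True) -> int:
    """Largest UDP payload that fits in one unfragmented packet of size mtu."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - UDP_HEADER


ipv6_min = max_dns_payload(1280)   # 1280 is the IPv6 minimum MTU
print(ipv6_min)                    # 1232
print(ipv6_min - 1220)             # headroom left by a 1220 setting: 12
```

A value of 1220 therefore leaves 12 bytes of headroom below the IPv6 minimum.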

Some months ago the network team forgot to configure fragment handling
correctly on JunOS. As soon as I lowered the udp-truncation-threshold,
dhl.com and others started working immediately.


> Also I don't think this is a vmware.com problem since I have the same
> problem with multiple domains.


Another thing I noticed is that not well utilized recursors perform
badly because they have to work through the whole chain from . to the
zone's nameservers, including many extra queries for DNSSEC.

"not well utilized" as in less than 10k queries/second.

Please try to "preheat" your recursor and see what changes. For use at
home I've written https://github.com/miesi/DNS-Standheizung to have all
TLD nameservers with their A//... in the recursor's cache.
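
A rough idea of what preheating can look like, as a minimal sketch: it simply asks the recursor for the NS set of a list of names through dig. The helper names are hypothetical, and this is not a description of how DNS-Standheizung works internally.

```python
import subprocess


# Hypothetical helpers for illustration; not taken from DNS-Standheizung.
def dig_argv(name, qtype="NS", server="127.0.0.1"):
    """Build the dig command line for one warm-up query."""
    return ["dig", "+short", f"@{server}", name, qtype]


def preheat(names, server="127.0.0.1", run=subprocess.run):
    """Query the NS set of every name so the answers land in the cache.

    Returns the names for which dig exited with a non-zero status,
    for example because no reply was received.
    """
    failed = []
    for name in names:
        result = run(dig_argv(name, "NS", server),
                     capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed
```

Something like `preheat(["com.", "net.", "org.", "de."])` could then be run from cron or after a restart; the list of names is an assumption and would need to match the zones you actually care about.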

Cheers

Thomas
___
Pdns-users mailing list
Pdns-users@mailman.powerdns.com
https://mailman.powerdns.com/mailman/listinfo/pdns-users


Re: [Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

2020-09-11 Thread Otto Moerbeek via Pdns-users
On Thu, Sep 10, 2020 at 03:40:54PM +0200, Christian Degenkolb via Pdns-users 
wrote:

> Hi Thomas,
> 
> what is a reasonable low value for udp-truncation-threshold? I tried with
> 900 and 600 (as low as half the default value) but found no improvements.

Try edns-outgoing-bufsize, that is the one that influences traffic
between the recursor and the authoritative servers.
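
As a sketch, the two settings sit side by side in recursor.conf. The values below are examples only and would need to be adjusted to the network in question:

```
# recursor.conf (example values, adjust to your network)
# EDNS buffer size the recursor advertises to authoritative servers
edns-outgoing-bufsize=1232
# size above which UDP replies to the recursor's own clients are truncated
udp-truncation-threshold=1232
```

The first setting affects the recursor-to-authoritative leg, the second only the leg between the recursor and its clients.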

> 
> Also I don't think this is a vmware.com problem since I have the same
> problem with multiple domains.

Yes, there are clear indications your connectivity is hampered somewhere.

-Otto


Re: [Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

2020-09-10 Thread Christian Degenkolb via Pdns-users

Hi Thomas,

what is a reasonably low value for udp-truncation-threshold? I tried 
with 900 and 600 (as low as half the default value) but found no 
improvement.


Also I don't think this is a vmware.com problem since I have the same 
problem with multiple domains.


To illustrate I found the tool dnsperf from 
https://www.dns-oarc.net/tools/dnsperf and created a queryfile with the 
list of 500 domains from here https://moz.com/top500 see 
https://paste.ubuntu.com/p/DxGBqRvngv/


If I call the tool against my local resolver on a clean cache (even with 
udp-truncation-threshold=600) I get the following output.


# rec_control wipe-cache $
wiped 4154 records, 8 negative records, 500 packets
# ./dnsperf -d queryfile_top500_clean
DNS Performance Testing Tool
Version 2.3.4

[Status] Command line: dnsperf -d queryfile_top500_clean
[Status] Sending queries (to 127.0.0.1)
[Status] Started at: Thu Sep 10 15:29:26 2020
[Status] Stopping after 1 run through file

Warning: received a response with an unexpected (maybe timed out) id: 162


[Status] Testing complete (end of file)

Statistics:

  Queries sent:         500
  Queries completed:    278 (55.60%)
  Queries lost:         222 (44.40%)

  Response codes:       NOERROR 209 (75.18%), SERVFAIL 69 (24.82%)
  Average packet size:  request 29, response 56
  Run time (s):         16.455935
  Queries per second:   16.893601

  Average Latency (s):  1.313376 (min 0.000543, max 4.491949)
  Latency StdDev (s):   1.446709

# ./dnsperf -d queryfile_top500_clean
DNS Performance Testing Tool
Version 2.3.4

[Status] Command line: dnsperf -d queryfile_top500_clean
[Status] Sending queries (to 127.0.0.1)
[Status] Started at: Thu Sep 10 15:29:49 2020
[Status] Stopping after 1 run through file
[Status] Testing complete (end of file)

Statistics:

  Queries sent:         500
  Queries completed:    500 (100.00%)
  Queries lost:         0 (0.00%)

  Response codes:       NOERROR 281 (56.20%), SERVFAIL 219 (43.80%)
  Average packet size:  request 29, response 50
  Run time (s):         4.571526
  Queries per second:   109.372669

  Average Latency (s):  0.015253 (min 0.54, max 4.556146)
  Latency StdDev (s):   0.244755

As I see it, that is far too many lost queries on an empty cache and far 
too many SERVFAILs for this kind of domain, even on retries.

The SERVFAIL rate stays high on subsequent runs.

Whereas if I run it against 1.1.1.1 (or the hoster's DNS server) I get the 
following output.


# ./dnsperf -d queryfile_top500_clean -s 1.1.1.1
DNS Performance Testing Tool
Version 2.3.4

[Status] Command line: dnsperf -d queryfile_top500_clean -s 1.1.1.1
[Status] Sending queries (to 1.1.1.1)
[Status] Started at: Thu Sep 10 15:33:24 2020
[Status] Stopping after 1 run through file
[Status] Testing complete (end of file)

Statistics:

  Queries sent:         500
  Queries completed:    500 (100.00%)
  Queries lost:         0 (0.00%)

  Response codes:       NOERROR 499 (99.80%), SERVFAIL 1 (0.20%)
  Average packet size:  request 29, response 77
  Run time (s):         0.882704
  Queries per second:   566.441299

  Average Latency (s):  0.013521 (min 0.005065, max 0.863349)
  Latency StdDev (s):   0.054510

A near perfect score.
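
The three runs are easier to compare when lost queries and SERVFAIL answers are put on the same footing. dnsperf reports the response-code percentages relative to completed queries (69 of 278 is 24.82% in the first run), so they are not directly comparable between a run that lost queries and one that did not. The following is a minimal sketch that parses the statistics block and relates both kinds of failure to the number of queries sent; the function names are made up for illustration.

```python
import re


def parse_dnsperf(text):
    """Extract sent/completed/lost counts and per-rcode counts from dnsperf output."""
    stats = {}
    for key in ("sent", "completed", "lost"):
        match = re.search(rf"Queries {key}:\s*(\d+)", text)
        stats[key] = int(match.group(1)) if match else 0
    # e.g. "Response codes:   NOERROR 209 (75.18%), SERVFAIL 69 (24.82%)"
    rcodes = {}
    line = re.search(r"Response codes:(.*)", text)
    if line:
        for name, count in re.findall(r"([A-Z]+) (\d+)", line.group(1)):
            rcodes[name] = int(count)
    stats["rcodes"] = rcodes
    return stats


def failure_share(stats):
    """Fraction of sent queries that were lost or answered with SERVFAIL."""
    failed = stats["lost"] + stats["rcodes"].get("SERVFAIL", 0)
    return failed / stats["sent"]


# Numbers copied from the first run against the local resolver.
first_run = """
  Queries sent: 500
  Queries completed: 278 (55.60%)
  Queries lost: 222 (44.40%)

  Response codes: NOERROR 209 (75.18%), SERVFAIL 69 (24.82%)
"""
print(f"{failure_share(parse_dnsperf(first_run)):.1%}")  # 58.2%
```

By the same measure the second local run comes out at 43.8% (219 of 500) and the run against 1.1.1.1 at 0.2% (1 of 500).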

Doesn't this mean the problem lies within the local resolver, since 
dnsperf would make the same requests the local resolver would make to 
the external DNS server?
Or at least that there is no uplink problem, but rather something local 
to my server?


regards
Chris






Re: [Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

2020-09-09 Thread Thomas Mieslinger via Pdns-users

Hi Christian,

Hetzner might filter IP fragments. Please check whether your situation
improves if you set udp-truncation-threshold to a reasonably low value.

By default pdns-recursor does DNSSEC. I would suggest setting
+dnssec on your dig queries.

A possible workaround for the vmware.com problems is to add a negative
trust anchor for vmware.com. in the pdns config.

Cheers Thomas


Re: [Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

2020-09-08 Thread Christian Degenkolb via Pdns-users

Hi,

I set the trace=yes option in the recursor config and redid the tests for 
pubs.vmware.com.


The log can be found here https://paste.debian.net/hidden/07526601/

I found two timeouts in the logs

Line 41:
Sep  8 10:21:54 rho pdns_recursor[25208]: [3] pubs.vmware.com: Resolved 
'vmware.com' NS ns01.vmwdns.com to: 45.54.11.1
Sep  8 10:21:54 rho pdns_recursor[25208]: [3] pubs.vmware.com: Trying IP 
45.54.11.1:53, asking 'pubs.vmware.com|A'
Sep  8 10:21:56 rho pdns_recursor[25208]: [3] pubs.vmware.com: timeout 
resolving after 1501.63msec
Sep  8 10:21:56 rho pdns_recursor[25208]: [3] pubs.vmware.com: Trying to 
resolve NS 'ns04.vmwdns.com' (2/8)


But a request to 45.54.11.1 for pubs.vmware.com comes back within 11 
msec.


$ dig -t A @45.54.11.1 pubs.vmware.com

; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> -t A @45.54.11.1 
pubs.vmware.com

; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24122
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;pubs.vmware.com.               IN      A

;; ANSWER SECTION:
pubs.vmware.com.        30      IN      CNAME   pubs.vmware.com.ds.edgekey.net.

;; Query time: 11 msec
;; SERVER: 45.54.11.1#53(45.54.11.1)
;; WHEN: Tue Sep 08 13:29:57 CEST 2020
;; MSG SIZE  rcvd: 88

and a second timeout in line 159:

Sep  8 10:21:56 rho pdns_recursor[25208]: [3]   
e751.dscx.akamaiedge.net: Trying IP 2.16.106.23:53, asking 
'e751.dscx.akamaiedge.net|A'
Sep  8 10:21:57 rho pdns_recursor[25208]: [3]   
e751.dscx.akamaiedge.net: timeout resolving after 1501.74msec
Sep  8 10:21:57 rho pdns_recursor[25208]: [3]   
e751.dscx.akamaiedge.net: Trying to resolve NS 'n3dscx.akamaiedge.net' 
(2/8)


Same picture here with a very good response time.

$ dig -t A @2.16.106.23 e751.dscx.akamaiedge.net

; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> -t A @2.16.106.23 
e751.dscx.akamaiedge.net

; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7947
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;e751.dscx.akamaiedge.net.      IN      A

;; ANSWER SECTION:
e751.dscx.akamaiedge.net. 20    IN      A       104.111.214.47

;; Query time: 5 msec
;; SERVER: 2.16.106.23#53(2.16.106.23)
;; WHEN: Tue Sep 08 13:31:32 CEST 2020
;; MSG SIZE  rcvd: 69


To check that this is not a vmware.com problem I tested some more and 
got the same timeouts.
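
To see whether the timeouts cluster on particular servers, the trace log can be tallied per IP. The sketch below makes two assumptions: each syslog entry is on a single line (the excerpts in this mail are wrapped by the mail client), and the bracketed number after the process ID groups the lines that belong to one resolution. The function name is made up for illustration.

```python
import re
from collections import Counter

# "[3] pubs.vmware.com: Trying IP 45.54.11.1:53, asking 'pubs.vmware.com|A'"
TRYING = re.compile(r"\[(\d+)\] .*?: Trying IP (\S+?):53, asking")
# "[3] pubs.vmware.com: timeout resolving after 1501.63msec"
TIMEOUT = re.compile(r"\[(\d+)\] .*?: timeout resolving after")


def timeouts_per_server(lines):
    """Count 'timeout resolving' events per authoritative server IP."""
    last_ip = {}        # bracketed number -> IP most recently tried
    counts = Counter()
    for line in lines:
        match = TRYING.search(line)
        if match:
            last_ip[match.group(1)] = match.group(2)
            continue
        match = TIMEOUT.search(line)
        if match and match.group(1) in last_ip:
            counts[last_ip[match.group(1)]] += 1
    return counts
```

Called as `timeouts_per_server(open("/var/log/syslog"))`, it returns a counter keyed by server IP; the log path is the one used elsewhere in this thread.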



One more example, this time for nameservers.dnscheck.co:

$ dig nameservers.dnscheck.co @127.0.0.1

; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> nameservers.dnscheck.co 
@127.0.0.1

;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 23852
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nameservers.dnscheck.co.       IN      A

;; Query time: 3005 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Sep 08 12:15:29 CEST 2020
;; MSG SIZE  rcvd: 52

The log for this query can be found here https://paste.debian.net/hidden/b48a78a2/.

This time there are multiple timeouts for the root name servers, for example 
g.root-servers.net:


Sep  8 12:15:21 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: 
Resolved '.' NS g.root-servers.net to: 192.112.36.4
Sep  8 12:15:21 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: 
Trying IP 192.112.36.4:53, asking 'nameservers.dnscheck.co|A'
Sep  8 12:15:22 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: 
timeout resolving after 1501.63msec
Sep  8 12:15:22 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: 
Trying to resolve NS 'j.root-servers.net' (2/13)


Whereas a direct request via dig works like a charm.

$ dig -t A @192.112.36.4 nameservers.dnscheck.co

; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> -t A @192.112.36.4 
nameservers.dnscheck.co

; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18641
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 13
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: ce9eaf15bb34977b41354b5f5f576c3841785bfba5901e93 (good)
;; QUESTION SECTION:
;nameservers.dnscheck.co.       IN      A

;; AUTHORITY SECTION:
co.                     172800  IN      NS      ns5.cctld.co.
co.                     172800  IN      NS      ns1.cctld.co.
co.                     172800  IN      NS      ns6.cctld.co.
co.                     172800  IN      NS      ns4.cctld.co.
co.                     172800  IN      NS      ns3.cctld.co.
co.                     172800  IN      NS      ns2.cctld.co.

;; ADDITIONAL SECTION:
ns1.cctld.co.           172800  IN      A       156.154.100.25
ns2.cctld.co.           172800  IN      A       156.154.101.25
ns3.cctld.co.           172800  IN      A       156.154.102.25
ns4.cctld.co.           172800  IN      A       156.154.103.25
ns5.cctld.co.           172800  IN      A       156.154.104.25
ns6.cctld.co.           172800  IN      A       156.154.105.25
ns1.cctld.co.           172800  IN      AAAA    2001:502:2eda::21
ns2.cctld.co.   172800  

Re: [Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

2020-09-08 Thread Otto Moerbeek via Pdns-users
On Tue, Sep 08, 2020 at 09:22:31AM +0200, Christian Degenkolb wrote:


Re: [Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

2020-09-08 Thread Christian Degenkolb via Pdns-users

(sent again; the first answer was not cc'd to the ML)

Hi,

sorry for not sending any configs. pdns_recursor runs more or less with 
the vanilla config with the following changes:


forward-zones-recurse=zen.spamhaus.org=1.1.1.1;1.0.0.1 (that's why I 
wanted to use the local recursor; as mentioned, the server is located in 
the Hetzner IP range, which apparently is blocked by the Spamhaus DNSBL)

loglevel=6
log-common-errors=yes
quiet=no
root-nx-trust=no (found this as a suggested fix for the SERVFAIL, but it 
did not work)


and
# rec_control set-carbon-server 37.252.122.50 rho-test (for the graphs)


A trace for the same lookup from my last mail:

 $ time dig +trace pubs.vmware.com @127.0.0.1
; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> +trace pubs.vmware.com 
@127.0.0.1

;; global options: +cmd
.   86118   IN  NS  d.root-servers.net.
.   86118   IN  NS  c.root-servers.net.
.   86118   IN  NS  l.root-servers.net.
.   86118   IN  NS  b.root-servers.net.
.   86118   IN  NS  f.root-servers.net.
.   86118   IN  NS  m.root-servers.net.
.   86118   IN  NS  e.root-servers.net.
.   86118   IN  NS  a.root-servers.net.
.   86118   IN  NS  i.root-servers.net.
.   86118   IN  NS  k.root-servers.net.
.   86118   IN  NS  g.root-servers.net.
.   86118   IN  NS  h.root-servers.net.
.   86118   IN  NS  j.root-servers.net.
.   86118   IN  RRSIG   NS 8 0 518400 
2020092105 2020090804 46594 . 
wgnBz8tKA9hjwIxmMQgTVwnZaiUpAB9a1+oC5T/syHzqNj1e5qhApLQN 
NLok43hu5Ykt8RFe/IiDZuYxIdyyzItwk
4QN8xNgsQsfhVfBbZ26bWRz 
fskquwnFn6Gmvq2qI6o42tsBxXUw09X4sNlNYI2zHB3sKaaMu0AbN9WI 
Pe14jpX/PwaP3m78+XqMy9CiKmuDon6g3BuyecPhCZL5Pa8ZPC7nrKfV 
pfyNSiPoBODsJE96UHGlOCJTFcbu/6Ia4ek3AGOJf+WC84HPrxLT

riyk XHfbPl7EjTbFSPgT8D7jGBfVCTQU3JSfynv29VFAHWZu1gm5VJWNQGaw u5gatA==
;; Received 540 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms

com.        172800  IN  NS  a.gtld-servers.net.
com.        172800  IN  NS  b.gtld-servers.net.
com.        172800  IN  NS  c.gtld-servers.net.
com.        172800  IN  NS  d.gtld-servers.net.
com.        172800  IN  NS  e.gtld-servers.net.
com.        172800  IN  NS  f.gtld-servers.net.
com.        172800  IN  NS  g.gtld-servers.net.
com.        172800  IN  NS  h.gtld-servers.net.
com.        172800  IN  NS  i.gtld-servers.net.
com.        172800  IN  NS  j.gtld-servers.net.
com.        172800  IN  NS  k.gtld-servers.net.
com.        172800  IN  NS  l.gtld-servers.net.
com.        172800  IN  NS  m.gtld-servers.net.
com.        86400   IN  DS  30909 8 2 
E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
com.        86400   IN  RRSIG   DS 8 1 86400 
2020092105 2020090804 46594 . 
zz85z6R/YUHxyW+ywA6zrgiYILjPo0i248M3wU+2XCRCneBH6yknQfjM 
LIcbo3vADVUlkJd0l4W2TLd7NPgC255hr2
+ALojzzHa07jyFmE203Kdw 
ma7XL0C55TdFrCEMhARkZf4EncfJH9JH+fdWRWdMr0EQZd1A+FzMYemO 
o7/L/8ZYq4FOt0vz+zheAJNDveGii+QpXAoDyw4xt3HMUVM+40Z/VgD1 
tk9Y3K9e2wwRNISeHdlq21JFVA2SY/gDgPCzBtM1r9Yz7oFZ2ld5W

AD0 P84GPEUMgUceAGofwxlV9+dSawhunskb+yVrpdjpizLageyJRWEu/F9A zDXxew==
;; Received 1175 bytes from 198.97.190.53#53(h.root-servers.net) in 5 ms

vmware.com. 172800  IN  NS  dns1.p05.nsone.net.
vmware.com. 172800  IN  NS  dns2.p05.nsone.net.
vmware.com. 172800  IN  NS  dns3.p05.nsone.net.
vmware.com. 172800  IN  NS  dns4.p05.nsone.net.
vmware.com. 172800  IN  NS  ns01.vmwdns.com.
vmware.com. 172800  IN  NS  ns02.vmwdns.com.
vmware.com. 172800  IN  NS  ns03.vmwdns.com.
vmware.com. 172800  IN  NS  ns04.vmwdns.com.
vmware.com. 86400   IN  DS  48553 13 2 
AA2C697F3990472642AF01509A18224828E403CA8608EC75D5C83002 CE21847E
vmware.com. 86400   IN  RRSIG   DS 8 2 86400 
20200915062203 20200908051203 24966 com. 
FA2xsJKvT2LLn5UEy7hAE7PaYmds7FBkQB0SGhm8riwJRKnxbHAY0tvv 
I1T/k0EzXJ4wy1J5qzNLMjhzFgPxEQB
6BwBfJm8qo8Cnzxm4YC5Ko1/9 
pDWooVBHoFfMmJgu14Dk+u1AcHobxH9pPs7az16cLK/3YeaFW3dCrIVQ 
NK2fZc0d/pc7CY0Zl1LjYQdTq+MsZiL2kbepEHD6A/4J6g==
;; Received 523 bytes from 2001:503:eea3::30#53(g.gtld-servers.net) in 6 
ms


pubs.vmware.com.    30  IN  CNAME   pubs.vmware.com.ds.edgekey.net.
pubs.vmware.com.    30  IN  RRSIG   CNAME 13 3 30 
20200909071011 20200907071011 12752 

Re: [Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

2020-09-04 Thread Otto Moerbeek via Pdns-users
On Wed, Sep 02, 2020 at 09:44:37AM +0200, Christian Degenkolb via Pdns-users 
wrote:

> Hi,
> 
> I hope somebody on the ML can help me figure out what I'm doing wrong.
> I have a local pdns_recursor (version 4.1.11-1+deb10u1 from Debian 10)
> running and added it at the top of my /etc/resolv.conf as 127.0.0.1.
> 
> However, I see some strange SERVFAIL responses and, all in all, a
> slow DNS system.
> 
> For example see the following two consecutive resolves and a direct request
> to the NS.
> The first one takes nearly 3 seconds vs 11 ms from the same system if I
> query the NS directly.
> 
> $ dig pubs.vmware.com @127.0.0.1
> 
> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com @127.0.0.1
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4929
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
> 
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;pubs.vmware.com.               IN      A
> 
> ;; ANSWER SECTION:
> pubs.vmware.com.        30      IN      CNAME   pubs.vmware.com.ds.edgekey.net.
> pubs.vmware.com.ds.edgekey.net. 10 IN   CNAME   e751.dscx.akamaiedge.net.
> 
> ;; Query time: 3009 msec
> ;; SERVER: 127.0.0.1#53(127.0.0.1)
> ;; WHEN: Wed Sep 02 09:19:04 CEST 2020
> ;; MSG SIZE  rcvd: 123
> 
> $ dig pubs.vmware.com @127.0.0.1
> 
> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com @127.0.0.1
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1345
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
> 
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;pubs.vmware.com.               IN      A
> 
> ;; ANSWER SECTION:
> pubs.vmware.com.        18      IN      CNAME   pubs.vmware.com.ds.edgekey.net.
> pubs.vmware.com.ds.edgekey.net. 4 IN    CNAME   e751.dscx.akamaiedge.net.
> e751.dscx.akamaiedge.net. 16    IN      A       104.111.214.47
> 
> ;; Query time: 0 msec
> ;; SERVER: 127.0.0.1#53(127.0.0.1)
> ;; WHEN: Wed Sep 02 09:19:08 CEST 2020
> ;; MSG SIZE  rcvd: 139
> 
> $ dig pubs.vmware.com @ns03.vmwdns.com
> 
> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
> @ns03.vmwdns.com
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5509
> ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
> ;; WARNING: recursion requested but not available
> 
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;pubs.vmware.com.               IN      A
> 
> ;; ANSWER SECTION:
> pubs.vmware.com.        30      IN      CNAME   pubs.vmware.com.ds.edgekey.net.
> 
> ;; Query time: 11 msec
> ;; SERVER: 45.54.11.129#53(45.54.11.129)
> ;; WHEN: Wed Sep 02 09:34:42 CEST 2020
> ;; MSG SIZE  rcvd: 88
> 
> Also I have a number of SERVFAIL entries in /var/log/syslog (pdns_recursor is
> currently running with loglevel=6).
> For example:
> 
> Sep  2 08:45:35 rho pdns_recursor[19311]: Sending SERVFAIL to 127.0.0.1
> during resolve of 'pubs.vmware.com' because: Too much time waiting for
> pubs.vmware.com.ds.edgekey.net|A, timeouts: 5,
> throttles: 1, queries: 6, 7991msec
> 
> # grep 'Too much time waiting for' /var/log/syslog | wc -l
> 184
> 
> As per https://blog.powerdns.com/2014/12/11/powerdns-graphing-as-a-service/
> I send the metrics to 
> https://metronome1.powerdns.com/?server=pdns.rho-test.recursor=-172800
> 
> Does anybody have an idea what's wrong? This seems way too slow for DNS and
> the SERVFAIL shouldn't happen this often.
> The server in question is running in a DC of the German hoster hetzner.de.
> Besides the strange DNS behaviour I don't have any problems with the
> reliability of the network connection.
> 
> thanks
> Chris
> 

You did not share any config or traces, so it's hard to tell. A wild
guess: it might be that you enabled IPv6 but your IPv6 connectivity is bad.

-Otto