On Thu, Sep 6, 2018 at 3:05 PM John W. Blue <john.b...@rrcic.com> wrote:
>
> Alex,
>
> Have you uploaded this pcap with the SERVFAIL's? I didn't have time to look
> at your first upload but can review this one.

Thanks very much. I've uploaded the pcap file here. It's about 100MB
compressed and represents about 4 hours of data, I believe.

https://drive.google.com/file/d/1KUpDoQ2zuz5ITeKuO0BhlK7JvWSUAG3B/view?usp=sharing

Thanks,
Alex

>
> John
>
> -----Original Message-----
> From: bind-users [mailto:bind-users-boun...@lists.isc.org] On Behalf Of Alex
> Sent: Thursday, September 06, 2018 1:49 PM
> To: c...@byington.org; bind-users@lists.isc.org
> Subject: Re: Frequent timeout
>
> Hi,
>
> On Mon, Sep 3, 2018 at 12:45 PM Carl Byington <c...@byington.org> wrote:
> >
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA512
> >
> > On Sun, 2018-09-02 at 21:54 -0400, Alex wrote:
> > > Do you have any other ideas on how I can isolate this problem?
> >
> > Run tcpdump on the external ethernet connection.
> >
> > tcpdump -s0 -vv -i %s -nn -w /tmp/outputfile udp dst port domain
>
> I've captured some packets that I believe include the packets relating to the
> SERVFAIL errors I've been receiving. Now I have to figure out how to go
> through them.
>
> In the meantime, I've configured /etc/resolv.conf to send queries to a remote
> system of ours, and the errors have (mostly) stopped.
>
> I also notice some traces take an abnormal amount of time. Ping times to
> google.com are less than 20ms, but this trace shows reaching the root servers
> takes 104ms:
>
> # dig +trace +nodnssec google.com
>
> ; <<>> DiG 9.11.4-P1-RedHat-9.11.4-5.P1.fc28 <<>> +trace +nodnssec google.com
> ;; global options: +cmd
> . 3451 IN NS g.root-servers.net.
> . 3451 IN NS k.root-servers.net.
> . 3451 IN NS j.root-servers.net.
> . 3451 IN NS c.root-servers.net.
> . 3451 IN NS i.root-servers.net.
> . 3451 IN NS e.root-servers.net.
> . 3451 IN NS m.root-servers.net.
> . 3451 IN NS l.root-servers.net.
> . 3451 IN NS a.root-servers.net.
> . 3451 IN NS h.root-servers.net.
> . 3451 IN NS b.root-servers.net.
> . 3451 IN NS d.root-servers.net.
> . 3451 IN NS f.root-servers.net.
> ;; Received 839 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
>
> com. 172800 IN NS h.gtld-servers.net.
> com. 172800 IN NS g.gtld-servers.net.
> com. 172800 IN NS b.gtld-servers.net.
> com. 172800 IN NS j.gtld-servers.net.
> com. 172800 IN NS f.gtld-servers.net.
> com. 172800 IN NS m.gtld-servers.net.
> com. 172800 IN NS c.gtld-servers.net.
> com. 172800 IN NS d.gtld-servers.net.
> com. 172800 IN NS k.gtld-servers.net.
> com. 172800 IN NS i.gtld-servers.net.
> com. 172800 IN NS l.gtld-servers.net.
> com. 172800 IN NS a.gtld-servers.net.
> com. 172800 IN NS e.gtld-servers.net.
> ;; Received 835 bytes from 202.12.27.33#53(m.root-servers.net) in 104 ms
>
> google.com. 172800 IN NS ns2.google.com.
> google.com. 172800 IN NS ns1.google.com.
> google.com. 172800 IN NS ns3.google.com.
> google.com. 172800 IN NS ns4.google.com.
> ;; Received 287 bytes from 192.33.14.30#53(b.gtld-servers.net) in 44 ms
>
> ;; expected opt record in response
> google.com. 300 IN A 172.217.10.14
> ;; Received 44 bytes from 216.239.36.10#53(ns3.google.com) in 29 ms
>
> Running the same trace again showed 129ms.
>
> I also located this warning:
> 06-Sep-2018 12:03:33.304 client: warning: client @0x7f502c1d3d50
> 127.0.0.1#60968 (cmail20.com.multi.surbl.org): recursive-clients soft limit
> exceeded (901/900/1000), aborting oldest query
>
> I've increased recursive-clients to 2500 but the SERVFAIL errors continue.
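
For anyone who wants to see how often that soft limit is being hit and for
which names, here is a minimal sketch in Python that pulls the query name and
the three counters out of a warning line like the one quoted above. The
function name parse_soft_limit is made up for illustration, and reading the
three numbers as current/soft/hard is an assumption based on the 900 and 1000
values in the message, not something confirmed in this thread.

```python
import re

# Matches the soft-limit warning as it appears in the quoted log line.
# Assumption: the three numbers are current clients / soft limit / hard limit.
SOFT_LIMIT_RE = re.compile(
    r"\((?P<qname>[^)]+)\): recursive-clients soft limit exceeded "
    r"\((?P<current>\d+)/(?P<soft>\d+)/(?P<hard>\d+)\)"
)


def parse_soft_limit(line):
    """Return the query name and counters from a warning line, or None."""
    m = SOFT_LIMIT_RE.search(line)
    if m is None:
        return None
    return {
        "qname": m.group("qname"),
        "current": int(m.group("current")),
        "soft": int(m.group("soft")),
        "hard": int(m.group("hard")),
    }


# The warning from the message above, joined back onto one line.
sample = (
    "06-Sep-2018 12:03:33.304 client: warning: client @0x7f502c1d3d50 "
    "127.0.0.1#60968 (cmail20.com.multi.surbl.org): recursive-clients soft limit "
    "exceeded (901/900/1000), aborting oldest query"
)
print(parse_soft_limit(sample))
```

Running this over the whole named log, one line at a time, and counting the
results per query name would show whether the RBL lookups dominate the aborted
queries.
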
>
> There are also a ton of lame-server entries, many of which are related to one
> RBL or another, as part of my postscreen config:
> 06-Sep-2018 13:16:50.686 lame-servers: info: connection refused resolving
> '48.167.85.209.zz.countries.nerd.dk/A/IN': 195.182.36.121#53
> 06-Sep-2018 13:16:50.706 lame-servers: info: connection refused resolving
> '48.167.85.209.bb.barracudacentral.org/A/IN':
> 64.235.154.72#53
> 06-Sep-2018 13:16:51.308 lame-servers: info: connection refused resolving
> '48.167.85.209.bl.blocklist.de/A/IN': 185.21.103.31#53
> 06-Sep-2018 13:16:54.798 lame-servers: info: connection refused resolving
> 'e51dd24f684d212a7da1119b23603b0f.generic.ixhash.net/A/IN':
> 178.254.39.16#53
> 06-Sep-2018 13:16:54.799 lame-servers: info: connection refused resolving
> 'f4d997d8949e6dbd30f6a418ad364589.generic.ixhash.net/A/IN':
> 178.254.39.16#53
> 06-Sep-2018 13:16:55.762 lame-servers: info: connection refused resolving
> '2.164.177.209.bb.barracudacentral.org/A/IN':
> 64.235.145.15#53
> 06-Sep-2018 13:16:55.845 lame-servers: info: connection refused resolving
> '2.164.177.209.bb.barracudacentral.org/A/IN':
> 64.235.154.72#53
>
> What would be a cause of such a significant delay in reaching the root
> servers?
>
> Thanks,
> Alex
> _______________________________________________
> Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe
> from this list
>
> bind-users mailing list
> bind-users@lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind-users