[Mimedefang] long dns timeouts when first dns in /etc/resolv.conf is down
Hi,

I saw some strange behaviour this weekend: the first of my hoster's two DNS servers in /etc/resolv.conf was down, while the second was working. MIMEDefang/SpamAssassin didn't cope well with that. I had very long DNS timeouts for remote checks, and it took over a minute for an email to run through MIMEDefang. System-wide DNS worked fine, e.g. when pinging domains. After removing the "down" DNS server from /etc/resolv.conf and restarting sendmail and MIMEDefang, emails went through within a second again.

It shouldn't make a difference to MIMEDefang if one of the DNS servers is down. Any ideas?

Ciao
Marcus

___
NOTE: If there is a disclaimer or other legal boilerplate in the above message, it is NULL AND VOID. You may ignore it.
Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list MIMEDefang@lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang
Re: [Mimedefang] long dns timeouts when first dns in /etc/resolv.conf is down
On Mon, 14 Mar 2016 14:11:38 +0100 Marcus Schopen wrote:

> It shouldn't make a difference to mimedefang if one of
> the dns server is down. Any ideas?

I think this is an artifact of the Net::DNS Perl module, which doesn't seem to handle multiple name servers very well.

I ran the following test program, where 10.50.100.100 is a nonexistent machine and 192.168.10.23 is the real name server. Results of strace are shown below; it seems that by default Net::DNS only moves on to the next name server after 10s. If you do lots of DNS lookups, that can really slow things down.

Regards,

Dianne.

ns.pl test program:

    #!/usr/bin/perl
    use Net::DNS;
    use Net::DNS::Resolver;

    # First name server is unreachable, second one is real.
    my $r = Net::DNS::Resolver->new(nameservers => ['10.50.100.100', '192.168.10.23']);
    my $x = $r->query('colo3.roaringpenguin.com', 'A');

strace output:

    $ strace -t -esendto perl ns.pl
    10:03:49 sendto(4, "N\341\1\0\0\1\0\0\0\0\0\0\5colo3\16roaringpengui"..., 42, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.50.100.100")}, 16) = 42
    10:03:59 sendto(4, "N\341\1\0\0\1\0\0\0\0\0\0\5colo3\16roaringpengui"..., 42, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.10.23")}, 16) = 42
    10:03:59 +++ exited with 0 +++
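For reference, here is a minimal sketch of the ns.pl test above with the resolver's retry settings given explicitly instead of left at their defaults. The retrans, retry and udp_timeout attributes are documented by Net::DNS::Resolver, but the values chosen here are purely illustrative, the file name ns-tuned.pl is made up, and how much these settings shorten the failover delay may depend on the installed Net::DNS version.

```perl
#!/usr/bin/perl
# ns-tuned.pl (hypothetical name): the ns.pl test with explicit, shorter timeouts.
use strict;
use warnings;
use Net::DNS;
use Time::HiRes qw(time);

my $resolver = Net::DNS::Resolver->new(
    # Same addresses as in the ns.pl test: a dead server first, then a live one.
    nameservers => ['10.50.100.100', '192.168.10.23'],
    retrans     => 2,    # retransmission interval in seconds (illustrative value)
    retry       => 2,    # number of times to try the query (illustrative value)
    udp_timeout => 5,    # timeout for UDP queries in seconds (illustrative value)
);

# Show the settings the resolver object actually holds.
printf "retrans=%d retry=%d udp_timeout=%d\n",
    $resolver->retrans, $resolver->retry, $resolver->udp_timeout;

# Only touch the network when a name is given on the command line.
if (my $name = shift @ARGV) {
    my $start = time;
    my $reply = $resolver->query($name, 'A');
    printf "%s after %.2fs\n",
        ($reply ? 'answer' : 'no answer (' . $resolver->errorstring . ')'),
        time - $start;
}
```

Run as "perl ns-tuned.pl colo3.roaringpenguin.com" to time a lookup; without an argument it only prints the configured values.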
Re: [Mimedefang] long dns timeouts when first dns in /etc/resolv.conf is down
Hi there,

On Mon, 14 Mar 2016, Marcus Schopen wrote:

> ... It shouldn't make a difference to mimedefang if one of the dns
> server is down. Any ideas?

Run a nameserver of your own?

--
73,
Ged.
Re: [Mimedefang] long dns timeouts when first dns in /etc/resolv.conf is down
On Monday, 14 Mar 2016, 16:08 +, G.W. Haywood wrote:

> Hi there,
>
> On Mon, 14 Mar 2016, Marcus Schopen wrote:
>
> > ... It shouldn't make a difference to mimedefang if one of the dns
> > server is down. Any ideas?
>
> Run a nameserver of your own?

Your own DNS server can go down too.
Re: [Mimedefang] long dns timeouts when first dns in /etc/resolv.conf is down
Hi there,

On Tue, 15 Mar 2016, Marcus Schopen wrote:

> On Monday, 14 Mar 2016, 16:08 +, G.W. Haywood wrote:
>
> > On Mon, 14 Mar 2016, Marcus Schopen wrote:
> >
> > > ... It shouldn't make a difference to mimedefang if one of the dns
> > > server is down. Any ideas?
> >
> > Run a nameserver of your own?
>
> Your own DNS server can go down too.

Of course it can. But you can fix it. ;)

--
73,
Ged.
Re: [Mimedefang] long dns timeouts when first dns in /etc/resolv.conf is down
It seems this is also what the Apache SpamAssassin wiki recommends: https://wiki.apache.org/spamassassin/CachingNameserver

-----Original Message-----
From: MIMEDefang [mailto:mimedefang-boun...@lists.roaringpenguin.com] On Behalf Of G.W. Haywood
Sent: 15 March 2016 16:07
To: mimedefang@lists.roaringpenguin.com
Subject: Re: [Mimedefang] long dns timeouts when first dns in /etc/resolv.conf is down

> > Run a nameserver of your own?
>
> An own dns can go down too.

Of course it can. But you can fix it. ;)
Re: [Mimedefang] long dns timeouts when first dns in /etc/resolv.conf is down
On 14 Mar 2016, at 12:08, G.W. Haywood wrote:

> Hi there,
>
> On Mon, 14 Mar 2016, Marcus Schopen wrote:
>
> > ... It shouldn't make a difference to mimedefang if one of the dns
> > server is down. Any ideas?
>
> Run a nameserver of your own?

This is ALWAYS the right approach for a mail system that handles traffic directly from the Internet. A caching resolver on the same host as each MTA, or at least a resolver on the same LAN segment (i.e. LOW-LATENCY), is critical to performance in a modern mail system.
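As a rough illustration of that advice, an /etc/resolv.conf along the following lines would send lookups to a caching resolver on the mail host first. This is only a sketch: it assumes a caching resolver is already listening on 127.0.0.1, the fallback address 192.0.2.53 is a placeholder from the documentation range, and the option values are illustrative. The timeout and attempts options are honoured by the system (glibc) resolver; whether a given Net::DNS version picks them up from the file is not something this thread establishes.

```conf
# /etc/resolv.conf (sketch; all addresses and values are assumptions)

# Local caching resolver on the mail host itself, tried first.
nameserver 127.0.0.1

# Placeholder fallback resolver (hypothetical address).
nameserver 192.0.2.53

# System resolver: wait 2 seconds per server and make 2 attempts
# (the glibc defaults are timeout:5 attempts:2).
options timeout:2 attempts:2
```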
Re: [Mimedefang] long dns timeouts when first dns in /etc/resolv.conf is down
On 14 Mar 2016, at 10:06, Dianne Skoll wrote:

> On Mon, 14 Mar 2016 14:11:38 +0100 Marcus Schopen wrote:
>
> > It shouldn't make a difference to mimedefang if one of
> > the dns server is down. Any ideas?
>
> I think this is an artifact of the Net::DNS Perl module, which doesn't
> seem to handle multiple name servers very well.

The flaw is not intrinsic to Net::DNS. Net::DNS has roughly the same tunables as the system resolver, reads resolv.conf to get values for whatever it does not receive in a RES_OPTIONS environment variable, lets you set them all explicitly, and ultimately uses the system defaults if they aren't set explicitly.

> I ran the following test program, where 10.50.100.100 is a nonexistent
> machine and 192.168.10.23 is the real name server. Results of strace
> are shown below; it seems by default that Net::DNS only moves to the
> next name server after 10s. If you do lots of DNS lookups, that can
> really slow things down.

Try it with a modern version of Net::DNS and see if that changes. I haven't dug up the documentation, but on one older system with a "base" perl 5.10 & Net::DNS 0.65 it queries the nameserver list synchronously, in series. With perl 5.22 & Net::DNS 1.04 it seems to query all nameservers in the list somewhat asynchronously, in quasi-parallel. If the first one answers fast enough it never queries the second, but it's clearly not waiting around to exhaust all retries/retrans/timeouts.

It is important to note that Net::DNS has also had some ugly compatibility problems from rapid and essentially untested change in the 1.0x line, but it seems to work fine with MD.
So, using a modified version of your script with the debug flag set and the resolver state printed, and with 2 bogus nameservers and one that works (set via the RES_NAMESERVERS environment variable), here's the antique version:

    # PATH=/usr/bin/:$PATH time -p /tmp/DiaNneStest.pl
    0.65
    ;; RESOLVER state:
    ;;  domain       =
    ;;  searchlist   =
    ;;  nameservers  = 192.0.2.1 172.16.1.1 127.0.0.1
    ;;  port         = 53
    ;;  srcport      = 0
    ;;  srcaddr      = 0.0.0.0
    ;;  tcp_timeout  = 120
    ;;  retrans  = 5  retry    = 4
    ;;  usevc    = 0  stayopen = 0  igntc = 0
    ;;  defnames = 1  dnsrch   = 1
    ;;  recurse  = 1  debug    = 1
    ;;  force_v4 = 0 (IPv6 Transport is available)
    ;; query(colo3.roaringpenguin.com, A)
    ;; Trying to set up a AF_INET6() family type UDP socket with srcaddr: 0.0.0.0 ... done
    ;; setting up an AF_INET() family type UDP socket
    ;; send_udp(192.0.2.1:53)
    ;; send_udp(172.16.1.1:53)
    ;; send_udp(127.0.0.1:53)
    ;; answer from 127.0.0.1:53 : 94 bytes
    ;; HEADER SECTION
    ;; id = 53885
    ;; qr = 1  opcode = QUERY  aa = 0  tc = 0  rd = 1
    ;; ra = 1  ad = 0  cd = 0  rcode = NOERROR
    ;; qdcount = 1  ancount = 1  nscount = 2  arcount = 0

    ;; QUESTION SECTION (1 record)
    ;; colo3.roaringpenguin.com.  IN  A

    ;; ANSWER SECTION (1 record)
    colo3.roaringpenguin.com.  80684  IN  A  70.38.112.54

    ;; AUTHORITY SECTION (2 records)
    roaringpenguin.com.  21838  IN  NS  ns3.roaringpenguin.com.
    roaringpenguin.com.  21838  IN  NS  ns4.roaringpenguin.com.

    ;; ADDITIONAL SECTION (0 records)
    real 20.12
    user 0.08
    sys 0.02

Oh look, there seems to be a 10s timeout per bad server, even though the udp_timeout value isn't in that old version...
And here's the same run with the perl that anything other than the base OS would use:

    # time -p /tmp/DiaNneStest.pl
    1.04
    ;; RESOLVER state:
    ;; domain       =
    ;; searchlist   =
    ;; nameservers  = 192.0.2.1 172.16.1.1 127.0.0.1
    ;; defnames     = 1    dnsrch         = 1
    ;; retrans      = 5    retry          = 4
    ;; recurse      = 1    igntc          = 0
    ;; usevc        = 0    port           = 53
    ;; srcaddr      = 0    srcport        = 0
    ;; tcp_timeout  = 120  persistent_tcp = 0
    ;; udp_timeout  = 30   persistent_udp = 0
    ;; debug        = 1    force_v4       = 0
    ;; prefer_v6    = 0    force_v6       = 0
    ;; query( colo3.roaringpenguin.com A )
    ;; udp send [192.0.2.1]:53
    ;; udp send [172.16.1.1]:53
    ;; udp send [127.0.0.1]:53
    ;; answer from [127.0.0.1] length 94
    ;; HEADER SECTION
    ;; id = 31299
    ;; qr = 1  aa = 0  tc = 0  rd = 1  opcode = QUERY
    ;; ra = 1  z = 0  ad = 0  cd = 0  rcode = NOERROR
    ;; qdcount = 1  ancount = 1  nscount = 2  arcount = 0
    ;; do = 0

    ;; QUESTION SECTION (1 record)
    ;; colo3.roaringpenguin.com.  IN  A

    ;; ANSWER SECTION (1 record)
    colo3.roaringpenguin.com.  80625  IN  A  70.38.112.54

    ;; AUTHORITY SECTION (2 records)
    roaringpenguin.com.  21779  IN  NS  ns3.roaringpenguin.com.
    roaringpenguin.com.  21779  IN  NS  ns4.roaringpenguin.com.

    ;; ADDITIONAL SECTION (0 records)
    real 3.53
    user 0.15
    sys 0.02

And since 3.53s is still an awfully long time to wait, let's not gi