On 14 Mar 2016, at 10:06, Dianne Skoll wrote:

> On Mon, 14 Mar 2016 14:11:38 +0100
> Marcus Schopen <li...@localguru.de> wrote:
>
>> It shouldn't make a difference to mimedefang if one of
>> the DNS servers is down. Any ideas?
>
> I think this is an artifact of the Net::DNS Perl module, which doesn't
> seem to handle multiple name servers very well.

The flaw is not intrinsic in Net::DNS.

Net::DNS has roughly the same tunables as the system resolver. It reads resolv.conf for any values it does not receive in a RES_OPTIONS environment variable, lets you set them all explicitly, and ultimately falls back to the system defaults for anything that isn't set explicitly.
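As a minimal sketch of setting those tunables explicitly in code: the attribute names (retry, retrans, udp_timeout) are the ones that show up in the resolver state dumps further down, but the values and the nameserver addresses here are placeholders chosen for illustration, not recommendations. Older Net::DNS releases may not accept every attribute.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Net::DNS;

# Placeholder addresses: 192.0.2.1 is a documentation address that
# never answers, 127.0.0.1 assumes a local caching resolver.
my $r = Net::DNS::Resolver->new(
    nameservers => ['192.0.2.1', '127.0.0.1'],
    retry       => 1,    # illustrative value, not a recommendation
    retrans     => 1,    # illustrative value, not a recommendation
    udp_timeout => 5,    # illustrative value, not a recommendation
);

# Dump the resulting resolver state; no query is sent.
print $r->string, "\n";
```

Anything not passed to new() still comes from resolv.conf or the environment, so this only overrides the values named.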

> I ran the following test program, where 10.50.100.100 is a nonexistent
> machine and 192.168.10.23 is the real name server. Results of strace
> are shown below; by default, Net::DNS seems to move to the next name
> server only after 10s.  If you do lots of DNS lookups, that can really
> slow things down.

Try it with a modern version of Net::DNS and see if that changes.

I haven't dug up the documentation, but on one older system with a "base" perl 5.10 & Net::DNS 0.65, it queries the nameserver list synchronously, in series. With perl 5.22 & Net::DNS 1.04, it seems to query all nameservers in the list somewhat asynchronously, in quasi-parallel. If the first one answers fast enough it never queries the second, but it's clearly not waiting around to exhaust all retries/retrans/timeouts. Note that Net::DNS has also had some ugly compatibility problems from rapid and essentially untested change in the 1.0x line, but it seems to work fine with MD.

So, using a modified version of your script with the debug flag set and the resolver state printed, and with 2 bogus nameservers and one that works (set via the RES_NAMESERVERS environment variable), here's the antique version:

# PATH=/usr/bin/:$PATH time -p /tmp/DiaNneStest.pl
0.65
;; RESOLVER state:
;;  domain       =
;;  searchlist   =
;;  nameservers  = 192.0.2.1 172.16.1.1 127.0.0.1
;;  port         = 53
;;  srcport      = 0
;;  srcaddr      = 0.0.0.0
;;  tcp_timeout  = 120
;;  retrans  = 5  retry    = 4
;;  usevc    = 0  stayopen = 0    igntc = 0
;;  defnames = 1  dnsrch   = 1
;;  recurse  = 1  debug    = 1
;;  force_v4 = 0  (IPv6 Transport is available)

;; query(colo3.roaringpenguin.com, A)
;; Trying to set up a AF_INET6() family type UDP socket with srcaddr: 0.0.0.0 ... done
;; setting up an AF_INET() family type UDP socket
;; send_udp(192.0.2.1:53)
;; send_udp(172.16.1.1:53)
;; send_udp(127.0.0.1:53)
;; answer from 127.0.0.1:53 : 94 bytes
;; HEADER SECTION
;; id = 53885
;; qr = 1    opcode = QUERY    aa = 0    tc = 0    rd = 1
;; ra = 1    ad = 0    cd = 0    rcode  = NOERROR
;; qdcount = 1  ancount = 1  nscount = 2  arcount = 0

;; QUESTION SECTION (1 record)
;; colo3.roaringpenguin.com.    IN      A

;; ANSWER SECTION (1 record)
colo3.roaringpenguin.com.       80684   IN      A       70.38.112.54

;; AUTHORITY SECTION (2 records)
roaringpenguin.com.     21838   IN      NS      ns3.roaringpenguin.com.
roaringpenguin.com.     21838   IN      NS      ns4.roaringpenguin.com.

;; ADDITIONAL SECTION (0 records)

real        20.12
user         0.08
sys          0.02

=========================================================

Oh look, there seems to be a 10s timeout per bad server (20.12s total for two bad ones), even though the udp_timeout value isn't in that old version...

And here it is with the perl that anything other than the base OS would use:


# time -p /tmp/DiaNneStest.pl
1.04
;; RESOLVER state:
;; domain       =
;; searchlist   =
;; nameservers  = 192.0.2.1 172.16.1.1 127.0.0.1
;; defnames     = 1     dnsrch          = 1
;; retrans      = 5     retry           = 4
;; recurse      = 1     igntc           = 0
;; usevc        = 0     port            = 53
;; srcaddr      = 0     srcport         = 0
;; tcp_timeout  = 120   persistent_tcp  = 0
;; udp_timeout  = 30    persistent_udp  = 0
;; debug        = 1     force_v4        = 0
;; prefer_v6    = 0     force_v6        = 0


;; query( colo3.roaringpenguin.com A )

;; udp send [192.0.2.1]:53

;; udp send [172.16.1.1]:53

;; udp send [127.0.0.1]:53

;; answer from [127.0.0.1] length 94
;; HEADER SECTION
;;      id = 31299
;;      qr = 1  aa = 0  tc = 0  rd = 1  opcode = QUERY
;;      ra = 1  z  = 0  ad = 0  cd = 0  rcode  = NOERROR
;;      qdcount = 1     ancount = 1     nscount = 2     arcount = 0
;;      do = 0

;; QUESTION SECTION (1 record)
;; colo3.roaringpenguin.com.    IN      A

;; ANSWER SECTION (1 record)
colo3.roaringpenguin.com.       80625   IN      A       70.38.112.54

;; AUTHORITY SECTION (2 records)
roaringpenguin.com.     21779   IN      NS      ns3.roaringpenguin.com.
roaringpenguin.com.     21779   IN      NS      ns4.roaringpenguin.com.

;; ADDITIONAL SECTION (0 records)

real         3.53
user         0.15
sys          0.02

============================================================

And since 3.53s is still an awfully long time to wait, let's not give anyone a second chance to answer a simple question:

# RES_OPTIONS="retry:0 retrans:0" time -p /tmp/DiaNneStest.pl
1.04
;; RESOLVER state:
;; domain       =
;; searchlist   =
;; nameservers  = 192.0.2.1 172.16.1.1 127.0.0.1
;; defnames     = 1     dnsrch          = 1
;; retrans      = 0     retry           = 0
;; recurse      = 1     igntc           = 0
;; usevc        = 0     port            = 53
;; srcaddr      = 0     srcport         = 0
;; tcp_timeout  = 120   persistent_tcp  = 0
;; udp_timeout  = 30    persistent_udp  = 0
;; debug        = 1     force_v4        = 0
;; prefer_v6    = 0     force_v6        = 0


;; query( colo3.roaringpenguin.com A )

;; udp send [192.0.2.1]:53

;; udp send [172.16.1.1]:53

;; udp send [127.0.0.1]:53

;; answer from [127.0.0.1] length 94
;; HEADER SECTION
;;      id = 48510
;;      qr = 1  aa = 0  tc = 0  rd = 1  opcode = QUERY
;;      ra = 1  z  = 0  ad = 0  cd = 0  rcode  = NOERROR
;;      qdcount = 1     ancount = 1     nscount = 2     arcount = 0
;;      do = 0

;; QUESTION SECTION (1 record)
;; colo3.roaringpenguin.com.    IN      A

;; ANSWER SECTION (1 record)
colo3.roaringpenguin.com.       80510   IN      A       70.38.112.54

;; AUTHORITY SECTION (2 records)
roaringpenguin.com.     21664   IN      NS      ns4.roaringpenguin.com.
roaringpenguin.com.     21664   IN      NS      ns3.roaringpenguin.com.

;; ADDITIONAL SECTION (0 records)

real         0.87
user         0.14
sys          0.02

============================================================

That's not so bad. ~5x slower than if I just use the working resolver, but tolerable.
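Since "time -p" also counts perl startup and module loading, a rough way to time just the lookup is to wrap the query in Time::HiRes. This is only a sketch: the timed() helper is a hypothetical name of mine, and the script takes the hostname to look up as its first argument.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);
use Net::DNS;

# Hypothetical helper: run a code reference and return
# (elapsed seconds, the code's scalar result).
sub timed {
    my ($code) = @_;
    my $t0     = [gettimeofday];
    my $result = $code->();
    return (tv_interval($t0), $result);
}

# Only touch the network when a hostname is given on the command line.
if (my $name = shift @ARGV) {
    my $r = Net::DNS::Resolver->new();
    my ($secs, $reply) = timed(sub { $r->query($name, 'A') });
    printf "%s: %.2fs, %s\n", $name, $secs,
        $reply ? 'answered' : 'no answer (' . $r->errorstring . ')';
}
```

Run it with and without RES_OPTIONS set to compare the lookup time on its own.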

> Regards,
>
> Dianne.
>
> #!/usr/bin/perl
> #### ns.pl test program
> use Net::DNS;
> use Net::DNS::Resolver;
> my $r = Net::DNS::Resolver->new(nameservers => ['10.50.100.100', '192.168.10.23']);
> my $x = $r->query('colo3.roaringpenguin.com', 'A');

My derivative:

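#!/usr/bin/env perl
#### derived from Dianne Skoll's ns.pl test program
use strict;
use warnings;
use Net::DNS;
use Net::DNS::Resolver;
# No nameservers passed here: the list comes from resolv.conf or RES_NAMESERVERS
my $r = Net::DNS::Resolver->new();
$r->debug(1);                      # trace each send and reply
print Net::DNS->version, "\n";     # module version
print $r->string, "\n";            # dump the resolver state
my $x = $r->query('colo3.roaringpenguin.com', 'A');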