Package: dnsutils
Version: 1:9.3.1-2
Priority: normal

I've been banging my head against an odd DNS resolution problem
on a Linux server: it shows up with the bind9 utilities but not
with any other tool (i.e. those that use the system's resolver). 
Basically, whenever I tried to resolve anything using `dig', `host'
or `nslookup', the query would always time out. Other tools would do
DNS resolution just fine. 

More strangely, when tracing with tcpdump, I did not see any IP packets 
being sent to the configured nameservers on the network interface. However,
when capturing traffic on the 'lo' interface I did see the DNS
requests! The queries were being sent with source address 127.0.0.1
and destination address 127.0.0.1, port 53.

After some debugging, it turned out to be a configuration error that
the bind9 code did not complain about, but which ended up breaking
the tools. The configuration file /etc/resolv.conf was as follows:

-------------------
nameserver X Y
-------------------

Yes, this should have been:

-------------------
nameserver X
nameserver Y
-------------------

The funny thing is that the standard resolver library handles the
first case just fine, whereas dnsutils breaks and gives the end user
no warning at all. After debugging this issue, it seems that the
culprit is the lwconfig code in ISC's lwres library. The following
change in lib/lwres/lwconfig.c seems to be sufficient to prevent this
misconfiguration from breaking the utilities. Notice how the 
lwres_conf_parsenameserver() function has been changed so that it
will *not* abort if it finds non-whitespace text after a nameserver address:

$ diff -u lwconfig.c.orig lwconfig.c
--- lwconfig.c.orig     2005-08-19 01:51:11.000000000 +0200
+++ lwconfig.c  2005-08-19 01:48:39.000000000 +0200
@@ -287,7 +287,7 @@
        if (strlen(word) == 0U)
                return (LWRES_R_FAILURE); /* Nothing on line. */
        else if (res == ' ' || res == '\t')
-               res = eatwhite(fp);
+               res = eatline(fp);

        if (res != EOF && res != '\n')
                return (LWRES_R_FAILURE); /* Extra junk on line. */

---------------------------------------------------------------------
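To illustrate the effect of that one-line change, here is a minimal,
self-contained sketch. It is *not* the ISC code: the sketch_* names, the
result codes and the string-based parsing (instead of reading from a
FILE *) are all assumptions made for illustration. It only mimics the
check shown in the diff: after the first word, either skip whitespace
(original behaviour) or skip the rest of the line (patched behaviour).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for LWRES_R_SUCCESS / LWRES_R_FAILURE. */
enum { SKETCH_OK = 0, SKETCH_FAIL = -1 };

/* Skip blanks and tabs; return the first other character (0 at end). */
static int sketch_eatwhite(const char **p) {
    while (**p == ' ' || **p == '\t')
        (*p)++;
    return **p;
}

/* Skip everything up to the newline; return '\n', or 0 at end of string. */
static int sketch_eatline(const char **p) {
    while (**p != '\n' && **p != '\0')
        (*p)++;
    return **p;
}

/*
 * Parse the text that follows the "nameserver" keyword.
 * tolerant == 0 mimics the original check, tolerant != 0 the patched one.
 * The first word on the line is copied into addr (truncated if needed).
 */
int sketch_parse_nameserver(const char *rest, int tolerant,
                            char *addr, size_t addrlen) {
    const char *p = rest;
    size_t n = 0;
    int res;

    assert(rest != NULL && addr != NULL && addrlen > 0);

    sketch_eatwhite(&p);            /* blanks before the address */
    while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n') {
        if (n + 1 < addrlen)
            addr[n++] = *p;
        p++;
    }
    addr[n] = '\0';
    res = *p;                       /* delimiter that ended the word */

    if (n == 0)
        return SKETCH_FAIL;         /* Nothing on line. */
    if (res == ' ' || res == '\t')
        res = tolerant ? sketch_eatline(&p) : sketch_eatwhite(&p);
    if (res != '\0' && res != '\n')
        return SKETCH_FAIL;         /* Extra junk on line. */
    return SKETCH_OK;
}
```

With a line such as " X Y\n", the non-tolerant variant fails, while the
tolerant one succeeds and keeps only X. In this sketch the second address
is dropped silently, so a warning would still be a good idea.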

The same change could be made in lwres_conf_parsedomain() and, 
maybe, lwres_conf_parselwserver(). Again, I find it weird that the 
dnsutils do not even warn about this configuration problem and simply
go ahead and send their queries to the loopback address instead. 

A better fix would be for the setup_system() call to exit with an
error message if no nameservers have been found. I find the following
code comment in /lib/bind/resolv/res_init.c extremely enlightening:

---------------------------------------------------------------------
 * An interrim version of this code (BIND 4.9, pre-4.4BSD) used 127.0.0.1
 * rather than INADDR_ANY ("0.0.0.0") as the default name server address
 * since it was noted that INADDR_ANY actually meant ``the first interface
 * you "ifconfig"'d at boot time'' and if this was a SLIP or PPP interface,
 * it had to be "up" in order for you to reach your own name server.  It
 * was later decided that since the recommended practice is to always
 * install local static routes through 127.0.0.1 for all your network
 * interfaces, that we could solve this problem without a code change.
 *
 * The configuration file should always be used, since it is the only way
 * to specify a default domain.  If you are running a server on your local
 * machine, you should say "nameserver 0.0.0.0" or "nameserver 127.0.0.1"
 * in the configuration file.
---------------------------------------------------------------------

This is similar to what happens with dig, host and nslookup in bind9
when the configuration is broken: if no nameservers are configured, they
default to 127.0.0.1, which is (IMHO) plain wrong.
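
The check suggested above for setup_system() could look roughly like the
following sketch. The function name, its parameters and the message text
are hypothetical. I have not written this against the real dig/host
sources; it only shows the idea of refusing to continue instead of
silently falling back to 127.0.0.1.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of bind9: report an error when the
 * parsed configuration yielded no name servers.
 * Returns 0 if at least one server is configured; otherwise returns -1
 * and writes a message for the user into msg.
 */
int sketch_check_nameservers(int nscount, const char *conffile,
                             char *msg, size_t msglen) {
    assert(conffile != NULL && msg != NULL && msglen > 0);

    if (nscount > 0) {
        msg[0] = '\0';              /* nothing to report */
        return 0;
    }
    snprintf(msg, msglen,
             "no usable nameserver entries found in %s", conffile);
    return -1;
}
```

A caller would print msg and exit on -1, so the user sees the broken
/etc/resolv.conf immediately rather than a timeout.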


Regards

Javier Fernandez-Sanguino
