Daniel Kahn Gillmor wrote:
> Following up on the discussion about dnscache problems in
> https://bugs.debian.org/796118:
> 
> dnscache is, sadly, buggy even beyond the concerns about cache
> poisoning.  It is unable to resolve the domain name indymedia.org, while
> other DNS resolvers like unbound can do.

Hi, Daniel:

I suspect this issue is caused by a design flaw in dnscache: it has a
hardcoded maximum amount of "gluelessness" that it will tolerate from
domains that aren't set up using the preferred DJB style of
"in-bailiwick servers with glue":

    [...]

    ``As far as I know, the Internet has not yet lost any domains to
    gluelessness,'' I wrote in 2000. ``But there are an increasing
    number of glueless domains, and I've spotted a glueless domain with
    glueless DNS servers. How much gluelessness must a cache tolerate?
    Currently dnscache allows three levels of gluelessness. This seems
    to be enough for now, but will it be enough in the future?''

    [...]

    I recommend that all DNS servers be in-bailiwick servers with glue.
    External DNS servers should be given internal names, with address
    records copied automatically (preferably by some secure mechanism)
    from the external names to the internal names.

    DNS should have been designed with addresses, not names, in NS
    records and MX records. The ``additional section'' of DNS responses
    should have been eliminated. RFC 1035 observes correctly that NS
    indirection and MX indirection ``insure [sic] consistency'' of
    addresses; however, this indirection should have been handled by the
    server, not the client. [...]

    -- "Gluelessness" from http://cr.yp.to/djbdns/notes.html

The static "gluelessness" limit (QUERY_MAXLEVEL in the source) was later
increased from 3 to 5, about 15 (!) years ago:

    20010105
            ui: increased MAXLEVEL to 5. the Internet is becoming more
                    glueless every day.

    -- "CHANGES" from src:djbdns-1.05

The "QUERY_MAXLEVEL" constant in the source is then used to allocate
statically sized arrays when spawning query objects.  (Some servers that
used a more flexible design that permitted more gluelessness had to be
hardened recently, though: https://www.kb.cert.org/vuls/id/264212.)  You
could probably patch dnscache to increase this number by a small amount
in order to tolerate slightly more gluelessness, but increasing it by a
large amount would waste memory, due to the statically sized arrays.  If
I understand correctly, BIND and Unbound have various per-resolution
counters that might permit something like an order of magnitude more
"gluelessness".
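
To make the idea of a fixed limit concrete, here is a toy model in Python. It is only a sketch and is not dnscache's code: the zone data, the function names, and the exact way the limit is compared are all assumptions made for illustration.  Only the value 5 comes from the CHANGES entry above.

```python
# Toy model of a fixed "gluelessness" limit.  NOT dnscache's algorithm;
# all names and zone data here are hypothetical.
MAX_LEVEL = 5  # same value as the increased limit quoted above

# Nameserver names listed for each (hypothetical) zone.
NS = {
    "d.example": ["ns.c.example"],
    "c.example": ["ns.b.example"],
    "b.example": ["ns.a.example"],
    "a.example": ["ns.a.example"],
}

# Glue: nameserver addresses that are known without any further lookup.
GLUE = {"ns.a.example": "192.0.2.53"}


def zone_of(name):
    """Return the zone assumed to hold `name` (drop the first label)."""
    return name.split(".", 1)[1]


def nested_lookups(ns_name, seen=frozenset()):
    """Count the nested lookups needed before ns_name's address is reachable.

    Returns 0 when glue is available and None when no path to glue exists.
    """
    if ns_name in GLUE:
        return 0
    if ns_name in seen:
        return None  # circular dependency between glueless servers
    depths = []
    for parent_ns in NS.get(zone_of(ns_name), []):
        depth = nested_lookups(parent_ns, seen | {ns_name})
        if depth is not None:
            depths.append(depth + 1)
    return min(depths) if depths else None


def resolvable(ns_name, limit=MAX_LEVEL):
    """True if ns_name can be reached without exceeding the fixed limit."""
    depth = nested_lookups(ns_name)
    return depth is not None and depth <= limit
```

In this model, ns.c.example needs two nested lookups, so it is reachable with a limit of 5 but would not be with a limit of 1.  A resolver with a hard limit simply gives up on any chain longer than the constant allows.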

I might be wrong, but it looks like it's possible for indymedia.org to
exceed the gluelessness limit when resolving one of its nameservers,
ns2.fs-dl.net.  At least, that is what the cached records in your
dnscache log suggest:

    grep ^cached dnscache.indymedia.org.log | sort -u

dnscache appears to have found an address record for ns2.riseup.net
but not for ns2.fs-dl.net.  But if that's the case, why didn't it at least
try to resolve indymedia.org using the one nameserver address that it
found?  (I suspect it may have found the non-authoritative glue record
for ns2.riseup.net but wasn't able to find the authoritative address
record.)

ns2.riseup.net is 204.13.164.8, which is cc0da408 in DJB's hexadecimal
IPv4 presentation format.  ns2.fs-dl.net is 147.95.16.164, which is
935f10a4 in the same format.  According to your log, dnscache never
tries to contact either server:

    edmonds@chase{0}:/tmp$ grep ^tx dnscache.indymedia.org.log | grep cc0da408
    edmonds@chase{1}:/tmp$ grep ^tx dnscache.indymedia.org.log | grep 935f10a4
    edmonds@chase{1}:/tmp$ 
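
For anyone who wants to repeat this check for other addresses, the conversion and the filter can be sketched in Python.  The function names are hypothetical, and the only thing assumed about the log format is what the grep commands above already rely on: transmit lines begin with "tx" and contain the server address as eight hex digits.

```python
import ipaddress


def ipv4_to_hex(addr):
    """Convert a dotted-quad IPv4 address to eight lowercase hex digits."""
    return ipaddress.IPv4Address(addr).packed.hex()


def tx_lines_for(log_lines, addr):
    """Return the log lines that start with 'tx' and mention addr in hex.

    Equivalent in spirit to: grep ^tx LOGFILE | grep HEXADDR
    """
    needle = ipv4_to_hex(addr)
    return [line for line in log_lines
            if line.startswith("tx") and needle in line]
```

Calling ipv4_to_hex("204.13.164.8") gives cc0da408, and an empty list from tx_lines_for() corresponds to the empty grep output shown above.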

I agree with your conclusion that dnscache is buggy.  There are many
domains on the Internet that rely on amounts of gluelessness beyond the
small amount that dnscache is willing to tolerate.  The operators of
those domains still get good performance from modern DNS resolvers, and
the DNS standards do not specify concrete upper or lower bounds on the
amount of gluelessness that resolvers must support.  So I don't see a
good standards-based argument that those operators are doing something
wrong.

-- 
Robert Edmonds
edmo...@debian.org
