PLEASE NOTE: I am in no way related to the ISC or any other DNS authority... just a user who wants to play his part in helping to work toward the resolution of this vulnerability! And, I also want to thank the ISC for its Herculean efforts to protect us all! This FAQ started as an internal document that ballooned. Hope someone else finds this helpful.
*********************************
KAMINSKY VULNERABILITY MAILING LIST FAQ
*********************************

Traffic on the bind mailing list has exploded since word of the Kaminsky vulnerability was posted. I've had a hard time managing my inbox, and others seem to be affected too, as repeat questions have become prevalent. I've put together this compact summary of the pertinent topics that have hit the list in the last month, hoping to make it easier to find answers to the most frequently asked questions about the vulnerability and its prevention.

I have not included much attribution, honestly, because I don't have sufficient time - I hope that no one takes offense... Besides, a quick search will make it pretty easy to find the original source of any assertion contained herein. Also, I think we all know who the power hitters in this forum are!

Please don't hesitate to post corrections to any errors I may have introduced in creating this FAQ!

Regards,
Steven Stromer

IMPORTANT: ISC has announced that -P2 versions of the patch will be made available at the end of the week of July 28, 2008 (We wait with bated breath!). These upgrades promise to correct many (but not necessarily all) of the bugs reported in the -P1 patches and in the present beta versions. Until the -P2 patches are available, the ISC recommends applying the -P1 patch, despite the potential performance hit on high volume servers (>10k queries/sec). The -P2 patches will not contain the port customization capabilities (see below for a brief description) promised for the 9.5.1 version. I will refer to this announcement as 'the -P2 announcement' where applicable. For the full announcement, see:
http://marc.info/?l=bind-users&m=121726908015389&w=2

ALSO NOTE: This summary does not address Windows-specific issues. Sorry.

*************************
WHICH SERVERS ARE AT RISK
*************************

Installations of BIND are at risk.
Some other DNS servers have always randomized source ports, and are not likely at risk. If your server is authoritative only, it is not at risk. If it is recursive, it is at risk. No ifs, ands, or buts about it.

**********************************
CHANGING NATURE OF BIND PORT USAGE
**********************************

TCP
All BIND versions use high, random ports for TCP connections. The Kaminsky vulnerability doesn't affect TCP queries. And, no, DNS cannot be turned into a TCP only system.

UDP
The Kaminsky vulnerability does affect UDP queries.

Prior to 9.5.0, BIND chose a high, random UDP *source* port on startup and used that for the life of the process for outbound queries.

9.5.0 improved on that by choosing from a small pool of 8 ports and randomly changing ports every 15 minutes, but it contains no feature for customizing the ports used without altering source code.

The -P1 versions (containing the Kaminsky vulnerability fix) introduce per-query randomization across all available high ports (1024-65535).

The present betas (9.5.1b2 and 9.4.3b1), following on the -P1 versions, allow fine-grained control of the UDP ports used. This fine-grained control is not yet completely fleshed out; ISC docs explain the developing nature of these controls, which include use-v4-udp-ports, use-v6-udp-ports, and certain sysctl tunables such as net.inet.ip.portrange.hifirst and net.inet.ip.portrange.hilast. Important port selection considerations include:
  a) permitting at least 16384 ports, for 14 bits of entropy, to obtain a desirable amount of port randomization, and
  b) picking a range that will not interfere with ports required by other running services.
Note that the queryport options are obsoleted in 9.5.1, which uses a random source port for every query.

DESTINATION PORT
All DNS queries will continue to have a destination of port 53.

*******************************
MINIMUM MANDATORY UPGRADE STEPS
*******************************

1. Update bind to the -P1 release within your present branch (NOTE: Please see the -P2 announcement, above, as it pertains to this):
     9.3.x -> 9.3.5-P1
     9.4.x -> 9.4.2-P1
     9.5.0 -> 9.5.0-P1
2. Confirm that either there is no 'query-source port ##' statement in named.conf, or, if it does exist, that it is set to 'port *'.
3. Open unprivileged UDP ports on the firewall.
4. Test to confirm port randomization.

***********************
FIREWALL CONSIDERATIONS
***********************

Most modern firewalls have an option to do "udp keep state". In an ideal world, use this option for dns activity on unprivileged ports.

Alternatively, open all UDP ports >1024 to the name server's IP address. In practice this should not be a problem if no other services requiring these ports are running. If there are, use a combination of avoid-*-udp-ports in named.conf and firewall rules to block those specific ports, and allow all the others.

In the various beta versions of BIND there is an option to specify a range of ports for named to use for outgoing UDP queries (see above), which should make it easier to configure the firewall.

Confirm that NATing firewalls are not rewriting source ports on outbound dns queries, or all your patching and port randomization will be for naught. This is usually configurable, but confirm it's not occurring. Cisco FWSM and various PIX releases are being reported as particularly troublesome in this regard.

It is also possible to use iptables firewall rules to (further) randomize the source port. SEE:
http://cipherdyne.org/blog/2008/07/mitigating-dns-cache-poisoning-attacks-with-iptables.html
Or, to do the same with pf:
http://blog.spoofed.org/2008/07/mitigating-dns-cache-poisoning-with-pf.html

**********************************
TEST TO CONFIRM PORT RANDOMIZATION
**********************************

The DNS-OARC and Doxpara tests would previously pass nameservers with even 9.5.0 level port randomization, though this now seems to have been fixed.
It is still wise to confirm that more than 8 (16?) ports are being used.

-------------------------------
TEST OPTION #1: DNS-OARC at CLI
-------------------------------

At command prompt:
    dig +short porttest.dns-oarc.net TXT

To test a specific nameserver, at command prompt:
    dig @<ip_of_nameserver> +short porttest.dns-oarc.net TXT

Response will include one of the following ratings:

    Rating   Standard Deviation   Bits of Entropy
    GREAT    3980 -- 20,000+      13.75 -- 16.0
    GOOD     296 -- 3980          10.0 -- 13.75
    POOR     0 -- 296             0 -- 10.0

Note the standard deviation shown at the end of the response - you want 5 digits before the decimal point.

-------------------------------
TEST OPTION #2: DNS-OARC on Web
-------------------------------

Visit: https://www.dns-oarc.net/oarc/services/dnsentropy
Click 'Test My DNS'

-------------------------------------
TEST OPTION #3: Other Web-based tests
-------------------------------------

http://www.doxpara.com
http://member.dnsstuff.com/tools/vu800113.php

I'm sure this list is not comprehensive...

---------------------------------
TEST OPTION #4: PERL based tester
---------------------------------

Download at:
http://michael.toren.net/code/noclicky/noclicky-1.00.pl

Download patch to same directory as perl script, for more accurate results:
http://www.smtps.net/issues/01-noclicky.patch

Apply patch:
    $ patch -p0 <02-noclicky.patch

Run the perl script.

------------------
ADDITIONAL TESTING
------------------

In addition to running one of the above tests, run tcpdump and confirm it is showing multiple source ports on outbound queries.

***********************************
DOCUMENTED ISSUES WITH -P1 VERSIONS
***********************************

(NOTE: Please see the -P2 announcement, above, as it pertains to this.)

A number of threading issues appear when applying some -P1 patches to certain OS/hardware combinations. There also appear to be some configuration settings that reduce these errors, in some instances.
Betas of the next versions, applied to avoid the -P1 issues, introduce another set of bugs/errors (see below).

If you are concerned about upgrading a busy server, Jinmei Tatuya (ISC) has created a testing tool to help pre-determine whether your server will experience problems. The test tool is available at:
http://www.jinmei.org/selecttest.tgz
For more info, see:
http://marc.info/?l=bind-users&m=121745487721871&w=2

At this time, ISC is recommending that everyone stay within their branch and move to the -P1 releases (i.e., if you are at 9.3.x, move to 9.3.5-P1; 9.4.x users should move to 9.4.2-P1). Unless you have a definitive need to run the beta code, they recommend remaining with the -P1 releases to reduce the number of changes that are being introduced into your environment.

----------------------------
'TOO MANY OPEN FILES' ERROR:
----------------------------

NOTE: Seems to be affecting those running 9.5.0-P1. This is not necessarily a fatal error, but will return errors to clients.

LOGGED AS:
    general: error: socket.c:1996: unexpected error:
    general: error: internal_accept: fcntl() failed: Too many open files
OR:
    named[xxx]: [ID xxxxx daemon.error] general: error: socket: too many open file descriptors

EXPERIMENTAL SOLUTIONS:
1. Increase file-descriptors default from 256 to 4096 for 32-bit apps, or to 65535 for 64-bit apps. Set the #define __FD_SETSIZE in /usr/include/linux/posix_types.h to 4096, save, and recompile.
2. Increase max-cache-size to 512M.
3. Edit /etc/security/limits.conf to allow processes to have more open files:
   Run 'cat /proc/sys/fs/file-max' to see what the kernel thinks is the max number of open files.
   Edit /etc/security/limits.conf as follows:
       * - nofile 16384
   SEE: http://kbase.redhat.com/faq/FAQ_80_1540.shtm
4. Set tcp-clients and tcp-listen-queue to 1000 (seemingly discredited solution).
5. Increase ulimit in limits.conf:
   Use 'ulimit -n' to see how many open files you currently allow.
   Edit /etc/security/limits.conf as follows:
       * soft nofile 16384
       * hard nofile 16384
   (This changes the limits for everything, but on dedicated nameservers this isn't an issue.)

************************************
DOCUMENTED ISSUES WITH BETA VERSIONS
************************************

----------------------------
'ASSERTION FAILURE' ERROR:
----------------------------

LOGGED AS:
    #general: resolver.c:5494: REQUIRE((((query) != 0) && (((const isc__magic_t *)(query))->magic == ((('Q') << 24 | ('!') << 16 | ('!') << 8 | ('!')))))) failed
    #general: exiting (due to assertion failure)

EXPERIMENTAL SOLUTIONS:
1. ISC is asking that willing beta testers experiencing this error apply the following patch to help capture detailed debugging info:
   http://www.jinmei.org/bind-9.4.3b2-dispatch.diff
   and:
   http://www.jinmei.org/patch/dispatch.c.diff
2. A possible temporary solution is to recompile beta versions from source without threads (opens door to possible performance degradation).
3. Increase ISC_SOCKET_MAXEVENTS to 12.

----------------------------------
ANOTHER 'ASSERTION FAILURE' ERROR:
----------------------------------

LOGGED AS:
    named[xxxxx]: socket.c:1736: INSIST(!sock->pending_recv) failed
    named[xxxxx]: exiting (due to assertion failure)

ISC REPORTS:
We've [...] a fix to that in our development tree (btw: this bug is irrelevant to the recent port randomization change). Unfortunately the fix won't be in the next patch version (P2)[...]

--------------------------------------------------
'MAXIMUM NUMBER OF FD EVENTS (64) RECEIVED' ERROR:
--------------------------------------------------

NOTE: Seems to be affecting bind-9.4.3b2 (Possibly isolated to Solaris...)

LOGGED AS:
    general: sockmgr: maximum number of FD events (64) received

EXPERIMENTAL SOLUTIONS:
Increasing ISC_SOCKET_MAXEVENTS from 64 to 128 seems to reduce the frequency of the warning.
----------------------------
'BAD FILE HANDLE' ERROR:
----------------------------

EDITOR'S NOTE: Mentioned on the list, but I could not locate any documentation...

**************************************
REDHAT'S RESPONSE TO THE VULNERABILITY
**************************************

RedHat's advisory is at:
http://rhn.redhat.com/errata/RHSA-2008-0533.html

The advisory does not clearly specify whether the patches only remove the query port restriction from their sample named.conf, or whether the full code update needed to do real port randomization has been implemented. A dig test should confirm which is the case.

The advisory details that updates to the selinux-policy packages permit port randomization, so update these, too.

*******************************
HANDLING OLDER VERSIONS OF BIND
*******************************

OPTION #1: UPGRADE!

OPTION #2: For the moment, set older, unpatchable servers to use a newer server as a forwarder.

****************************************
DNSSEC AS PART OF A LONGER TERM SOLUTION
****************************************

Implement DNSSEC. This may be more practical for busy admins once 9.6 is released. Until then, zones need to be re-signed manually, on a regular schedule. Another consideration is security; implementation basically reveals the full contents of a zone, though this is also supposed to be resolved in 9.6.

EDITOR'S COMMENT: It seems that the discovery of this vulnerability has reinforced the need for universal implementation of dnssec. 9.6 seems to promise a lot. I, for one, look forward to its release, and hope that all pertinent and available resources are being thrown at its completion!

Good implementation instructions are at:
http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf

Posted corrections to this document (Version 1.4 contains these corrections):

On page 31:
    dnssec-keygen -a rsasha1 -b 4096 -n ZONE -k KSK zonename
Should be:
    dnssec-keygen -a rsasha1 -b 4096 -n ZONE -f KSK zonename

On page 49:
    dlv.isg.org. 3 257 "BEA[...]gDB";
Should be:
    dlv.isc.org. 257 3 5 "BEA[...]guDB";

Also SEE: http://alan.clegg.com/dnssec

*******************************************************************
CACHE SNOOPING IS A RELATED, STILL UNADDRESSED VULNERABILITY IN DNS
*******************************************************************

Chris Buxton, of Men & Mice, provided a fantastic explanation of this that all should read:
http://groups.google.com/group/comp.protocols.dns.bind/msg/b6c67170b468d693
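
*************************************************
APPENDIX: CHECKING PORT RANDOMIZATION BY HAND
*************************************************

The tcpdump check under ADDITIONAL TESTING and the DNS-OARC rating table can be tied together with a few lines of Python. What follows is only a rough sketch, not anything published by ISC or DNS-OARC. All function names are made up for illustration. The entropy formula is my own inference: it assumes ports are spread uniformly, and it happens to reproduce the table's figures (296 -> 10.0 bits, 3980 -> 13.75 bits), but I have not confirmed that this is how DNS-OARC calculates its numbers. The line pattern assumes IPv4 output from 'tcpdump -nn', and the handling of values sitting exactly on a band boundary is a guess.

```python
import math
import re
import statistics

# Source side of an IPv4 line from "tcpdump -nn", for example:
#   12:00:00.1 IP 192.0.2.10.34567 > 198.51.100.1.53: 1+ A? example.com. (29)
# Only packets sent TO port 53 match, so replies are ignored.
QUERY_LINE = re.compile(r"IP (?:\d{1,3}\.){4}(\d+) > (?:\d{1,3}\.){4}53:")


def source_ports(lines):
    """Return the source port of every outbound query found in lines."""
    ports = []
    for line in lines:
        match = QUERY_LINE.search(line)
        if match:
            ports.append(int(match.group(1)))
    return ports


def entropy_bits(stddev):
    """Bits of entropy implied by a standard deviation of source ports.

    Assumption: ports are spread uniformly. N evenly used values have a
    standard deviation of about N / sqrt(12), so N ~ stddev * sqrt(12).
    The result is clamped to 0..16 because ports are 16-bit numbers.
    """
    if stddev <= 0:
        return 0.0
    return max(0.0, min(16.0, math.log2(stddev * math.sqrt(12))))


def rating(stddev):
    """Map a standard deviation onto the GREAT/GOOD/POOR bands of the table."""
    if stddev >= 3980:
        return "GREAT"
    if stddev >= 296:
        return "GOOD"
    return "POOR"


def summarize(lines):
    """Summarize port usage, or return None with fewer than two queries."""
    ports = source_ports(lines)
    if len(ports) < 2:
        return None
    stddev = statistics.pstdev(ports)
    return {
        "queries": len(ports),
        "distinct_ports": len(set(ports)),
        "stddev": stddev,
        "bits": entropy_bits(stddev),
        "rating": rating(stddev),
    }
```

To try it, capture a few hundred outbound queries on the nameserver with something like 'tcpdump -nn -c 500 udp dst port 53', save the output to a file, and pass the file's lines to summarize(). A server stuck on one port, or on the 8-port pool of 9.5.0, should show a tiny distinct_ports count no matter how many queries were captured. The same arithmetic explains the 16384-port recommendation: log2(16384) is exactly 14 bits.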
