If you're a secondary MX for a domain, and your system can resolve the
MX record for the domain, but resolving the A record for the
more-preferred (lower-numbered) MX entries fails with a soft DNS error
(e.g. a timeout), qmail bounces the message as
best-preference-MX-without-further-instructions.

  $ dnsqr mx mail.test.sub-rosa.com
  15 mail.test.sub-rosa.com:
  94 bytes, 1+2+0+0 records, response, noerror
  query: 15 mail.test.sub-rosa.com
  answer: mail.test.sub-rosa.com 0 MX 0 mx.timeout.test.sub-rosa.com
  answer: mail.test.sub-rosa.com 0 MX 100 spool.mail.sub-rosa.com

  $ dnsqr a mx.timeout.test.sub-rosa.com
  1 mx.timeout.test.sub-rosa.com:
  temporary failure

  $ dnsq a mx.timeout.test.sub-rosa.com 63.141.2.19
  1 mx.timeout.test.sub-rosa.com:
  timed out

| Return-Path: <>
| Delivered-To: [EMAIL PROTECTED]
| Received: (qmail 32495 invoked for bounce); 19 Oct 2000 14:23:00 -0000
| Date: 19 Oct 2000 14:23:00 -0000
| From: [EMAIL PROTECTED]
| To: [EMAIL PROTECTED]
| Subject: failure notice
|
| Hi. This is the qmail-send program at califia.sub-rosa.com.
| I'm afraid I wasn't able to deliver your message to the following
| addresses.
| This is a permanent error; I've given up. Sorry it didn't work out.
|
| <[EMAIL PROTECTED]>:
| Sorry. Although I'm listed as a best-preference MX or A for that host,
| it isn't in my control/locals file, so I don't treat it as local. (#5.4.6)
|
| --- Below this line is a copy of the message.
|
| Return-Path: <[EMAIL PROTECTED]>
| Received: (qmail 32488 invoked by uid 1000); 19 Oct 2000 14:21:37 -0000
| Date: 19 Oct 2000 14:21:37 -0000
| Message-ID: <[EMAIL PROTECTED]>
| From: "Michael Handler" <[EMAIL PROTECTED]>
| Subject: test
| To: [EMAIL PROTECTED]
|
| test

Looking through qmail-remote.c, it becomes apparent that in this
situation, dns_mxip() only returns the IP addresses & preferences that
it could resolve completely, with no indication that there were
additional, more-preferred MX records that were omitted due to soft DNS
errors.
Thus, when qmail-remote walks through the list of addresses, it finds
itself as the best-preference MX for the domain, concludes the mail
should be handled locally, and bounces it because the host isn't in
control/locals. Empirical testing bears this diagnosis out:

  $ src/qmail-1.03/dnsmxip mail.test.sub-rosa.com
  64.0.106.44 100

Scenarios that would run afoul of this are not difficult to imagine: if
domain example.com has MX 0 mx.provider.net and MX 100
spool.mail.sub-rosa.com, and mx.provider.net has a lower TTL than the MX
for example.com, and provider.net's nameservers are unreachable when my
dnscache tries to go resolve mx.provider.net...

I think I'm starting to see why Dan's DNS software encourages using all
in-name zones; though even that is vulnerable if the TTL on the A record
is lower than the TTL on the MX record.

Note that I don't consider this a problem for hard DNS failures, e.g. an
MX record that points at a hostname that authoritatively doesn't exist;
that's what the smtproutes functionality is for. However, I think it's
reasonable for qmail to not bounce messages based on soft DNS failures.

Searching the archives, I note that Chuck Foster noted this problem
waaaaaaaaaaaay back in 1997:

  http://www.ornl.gov/its/archives/mailing-lists/qmail/1997/07/msg00802.html

It seems to me that the best way to address this is to have dns_mxip
return the full MX set, with the IP address set to null or 0.0.0.0 for
A records that could not be successfully resolved, and have
qmail-remote.c's for loop skip those MX entries. This would result in
temp_noconn() for these situations, rather than perm_ambigmx().

Note that all of the *.test.sub-rosa.com entries mentioned here exist,
and the tests were done live, with no post-production touchups. Feel
free to poke at my DNS and SMTP servers if you want to do your own
tests.

Thoughts?

--michael
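
P.S. To make the proposal a bit more concrete, here is a rough,
self-contained C sketch of the decision I have in mind. It is not a
patch against qmail 1.03: struct mx_entry, classify_route() and the
ROUTE_* names are made up for illustration, an IPv4 address is modeled
as a plain integer with 0 standing in for 0.0.0.0, and the prefme logic
is only loosely modeled on my reading of qmail-remote.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the MX/IP list; this is NOT
 * qmail's real struct. ip == 0 means "MX exists, but its A lookup hit a
 * soft DNS error", i.e. the 0.0.0.0 placeholder proposed above. */
struct mx_entry {
    uint32_t ip;   /* IPv4 address as an integer, 0 if unresolved */
    int pref;      /* MX preference (lower is more preferred) */
};

/* Hypothetical outcomes, named after the behavior they would map to. */
enum route_result {
    ROUTE_TRY,       /* a resolved, more-preferred MX exists: try to connect */
    ROUTE_TEMPFAIL,  /* only unresolved MXs are more preferred: defer */
    ROUTE_AMBIGMX    /* nothing is more preferred than us: ambiguous MX */
};

enum route_result classify_route(const struct mx_entry *list, size_t n,
                                 uint32_t self_ip)
{
    int prefme = 100000;  /* our own best preference, if we are listed */
    size_t better = 0;    /* entries more preferred than us */
    size_t usable = 0;    /* ...of which the address actually resolved */
    size_t i;

    assert(list != NULL || n == 0);

    /* Find the best preference at which we ourselves are listed. */
    for (i = 0; i < n; ++i)
        if (list[i].ip == self_ip && list[i].pref < prefme)
            prefme = list[i].pref;

    /* Count more-preferred entries, skipping none yet: unresolved ones
     * still prove that we are not the best-preference MX. */
    for (i = 0; i < n; ++i) {
        if (list[i].pref >= prefme)
            continue;
        ++better;
        if (list[i].ip != 0)
            ++usable;
    }

    if (better == 0)
        return ROUTE_AMBIGMX;   /* would be perm_ambigmx() */
    if (usable == 0)
        return ROUTE_TEMPFAIL;  /* would end up in temp_noconn() */
    return ROUTE_TRY;
}
```

Fed the list that dnsmxip printed above (only 64.0.106.44 at preference
100), this returns ROUTE_AMBIGMX, matching the bounce. With an extra
{0, 0} entry standing in for the unresolved MX 0 host, it returns
ROUTE_TEMPFAIL instead, so the message would be deferred and retried.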