Re: RES_TRUSTAD, was Trying again on SERVFAIL
>> So ... I can't get the glibc behaviour to mesh with the standard
>> on this particular point.
>
> It's set in RFC 6840:

I stand corrected, thanks.

- Håvard
___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
RES_TRUSTAD, was Trying again on SERVFAIL
On Thu 11/Feb/2021 17:44:20 +0100 Havard Eidnes wrote:
>> Yeah, by the time it lands on Debian's glibc we'll have grown a long
>> long beard. I'm still missing RES_TRUSTAD...
>
> Oh, this set me off on a tangent. I hadn't heard of RES_TRUSTAD
> before, so I found
>
> https://man7.org/linux/man-pages/man5/resolv.conf.5.html
>
> which under "trust-ad" contains this text:
>
>     If the trust-ad option is active, the stub resolver sets the AD
>     bit in outgoing DNS queries (to enable AD bit support), [...]

It's similar to dig's man page:

    +[no]adflag
        Set [do not set] the AD (authentic data) bit in the query. This requests the server to return whether all of the answer and authority sections have all been validated as secure according to the security policy of the server. AD=1 indicates that all records have been validated as secure and the answer is not from a OPT-OUT range. AD=0 indicate that some part of the answer was insecure or not validated. This bit is set by default.

> I could not get that to rhyme with what I had perceived to be the
> semantics of the AD bit, so I looked up RFC 4035 where near the end
> of section 3 (just before 3.1), I find this text:
>
>     The AD bit is controlled by name servers; a security-aware name
>     server MUST ignore the setting of the AD bit in queries.

That's the name server, not the resolver.

> So ... I can't get the glibc behaviour to mesh with the standard on
> this particular point.

It's set in RFC 6840:

    5.7. Setting the AD Bit on Queries

    The semantics of the Authentic Data (AD) bit in the query were previously undefined. Section 4.6 of [RFC4035] instructed resolvers to always clear the AD bit when composing queries.

    This document defines setting the AD bit in a query as a signal indicating that the requester understands and is interested in the value of the AD bit in the response. This allows a requester to indicate that it understands the AD bit without also requesting DNSSEC data via the DO bit.
Best
Ale
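To make the RFC 6840 behaviour concrete, here is a minimal sketch of what "setting the AD bit in a query" means at the wire level. It is not glibc's or dig's code; the function names `build_query` and `ad_bit` and the default query ID are made up for illustration. The only facts relied on are the standard DNS header layout, where AD is bit 0x0020 and RD is bit 0x0100 of the 16-bit flags field.

```python
import struct

FLAG_RD = 0x0100  # Recursion Desired
FLAG_AD = 0x0020  # Authentic Data


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234,
                set_ad: bool = True) -> bytes:
    """Build a bare DNS query (no EDNS), optionally with the AD bit set.

    Hypothetical helper for illustration only; qtype 1 is A, class 1 is IN.
    """
    flags = FLAG_RD | (FLAG_AD if set_ad else 0)
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question


def ad_bit(message: bytes) -> bool:
    """Return True if the AD bit is set in a DNS message header."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & FLAG_AD)
```

A requester that sets AD this way signals that it understands the AD bit in the response, without asking for DNSSEC records via the DO bit. Whether the answer's AD bit can then be trusted still depends on trusting the resolver and the path to it, which is what the trust-ad option is about.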
Re: Trying again on SERVFAIL
> The internet isn’t always on and it isn’t only composed of big tech
> companies with lots of resources.

like Google's gmail, which has had hours-long service outages from time to time? ;-)
Re: Trying again on SERVFAIL
> Yeah, by the time it lands on Debian's glibc we'll have grown a long
> long beard. I'm still missing RES_TRUSTAD...

Oh, this set me off on a tangent. I hadn't heard of RES_TRUSTAD before, so I found

https://man7.org/linux/man-pages/man5/resolv.conf.5.html

which under "trust-ad" contains this text:

    If the trust-ad option is active, the stub resolver sets the AD bit in outgoing DNS queries (to enable AD bit support), [...]

I could not get that to rhyme with what I had perceived to be the semantics of the AD bit, so I looked up RFC 4035 where near the end of section 3 (just before 3.1), I find this text:

    The AD bit is controlled by name servers; a security-aware name server MUST ignore the setting of the AD bit in queries.

So ... I can't get the glibc behaviour to mesh with the standard on this particular point.

Regards,

- Håvard
Re: Trying again on SERVFAIL
On Thu 11/Feb/2021 14:47:13 +0100 Ondřej Surý wrote:
> Mark is right. The internet isn’t always on and it isn’t only composed
> of big tech companies with lots of resources. The internet consists of
> a lot of small systems made by people like you and me and we don’t have
> infinite resources to keep everything always on.

100% agreed.

> And honestly I find your quote about Cargo Cult very offensive to all
> those normal people maintaining the rest of the internet infrastructure
> that isn’t the current -umvirate.

I don't share that point of view. I cited it as evidence of a way of thinking. I find it somewhat green, happy-go-lucky, but not offensive. After all, if you limit the range to personal messages, it's a legitimate way to conceive email services.

Best
Ale
Re: Trying again on SERVFAIL
Mark is right. The internet isn’t always on and it isn’t only composed of big tech companies with lots of resources. The internet consists of a lot of small systems made by people like you and me and we don’t have infinite resources to keep everything always on.

And honestly I find your quote about Cargo Cult very offensive to all those normal people maintaining the rest of the internet infrastructure that isn’t the current -umvirate.

Ondrej
--
Ondřej Surý (He/Him)
ond...@isc.org

> On 11. 2. 2021, at 14:13, Mark Andrews wrote:
>
> Machines still fall over. They take the same amount of time to fix now as
> they did 30 years ago.
>
> You still have to diagnose the fault. You still have to get the replacement
> part. You still have to potentially restore from backups. Sometimes you can
> switch to a standby machine which makes things faster.
>
> I’ve seen day-long outages in the last 7 days. They still happen. Personally
> I was happy the emails queued.
> --
> Mark Andrews
>
>> On 11 Feb 2021, at 23:26, Alessandro Vesely wrote:
>>
>> On Wed 10/Feb/2021 22:38:05 +0100 J Doe wrote:
>>> Out of curiosity, what servers have you encountered that no longer use the
>>> five-day cutoff?
>>
>> I didn't take note, but I read discussions on the topic. Users expect mail
>> to be delivered almost instantly. The "warning, still trying" messages
>> should come sometime in between. If it comes the next day, by various
>> people's experience, it is unacceptably too late. If you reduce that to a
>> few hours, the total max queue lifetime cannot remain five days.
>>
>> At mine, although I keep the default 5d, I cut queue time for specific
>> messages, such as complaints or dmarc reports, to ten hours.
>>
>> Quoting from the web:
>>
>>     Queue lifetimes over a day is just Cargo Cult system administration, and a
>>     holdover from when the internet was much less "always on".
>>
>> https://serverfault.com/questions/735269/is-it-a-good-idea-to-reduce-the-give-up-time-for-e-mail-delivery#answer-826351
>>
>> Best
>> Ale
Re: Trying again on SERVFAIL
Machines still fall over. They take the same amount of time to fix now as they did 30 years ago.

You still have to diagnose the fault. You still have to get the replacement part. You still have to potentially restore from backups. Sometimes you can switch to a standby machine which makes things faster.

I’ve seen day-long outages in the last 7 days. They still happen. Personally I was happy the emails queued.

--
Mark Andrews

> On 11 Feb 2021, at 23:26, Alessandro Vesely wrote:
>
> On Wed 10/Feb/2021 22:38:05 +0100 J Doe wrote:
>> Out of curiosity, what servers have you encountered that no longer use the
>> five-day cutoff?
>
> I didn't take note, but I read discussions on the topic. Users expect mail
> to be delivered almost instantly. The "warning, still trying" messages
> should come sometime in between. If it comes the next day, by various
> people's experience, it is unacceptably too late. If you reduce that to a
> few hours, the total max queue lifetime cannot remain five days.
>
> At mine, although I keep the default 5d, I cut queue time for specific
> messages, such as complaints or dmarc reports, to ten hours.
>
> Quoting from the web:
>
>     Queue lifetimes over a day is just Cargo Cult system administration, and a
>     holdover from when the internet was much less "always on".
>
> https://serverfault.com/questions/735269/is-it-a-good-idea-to-reduce-the-give-up-time-for-e-mail-delivery#answer-826351
>
> Best
> Ale
Re: Trying again on SERVFAIL
On Wed 10/Feb/2021 22:38:05 +0100 J Doe wrote:
> Out of curiosity, what servers have you encountered that no longer use
> the five-day cutoff?

I didn't take note, but I read discussions on the topic. Users expect mail to be delivered almost instantly. The "warning, still trying" messages should come sometime in between. If it comes the next day, by various people's experience, it is unacceptably too late. If you reduce that to a few hours, the total max queue lifetime cannot remain five days.

At mine, although I keep the default 5d, I cut queue time for specific messages, such as complaints or dmarc reports, to ten hours.

Quoting from the web:

    Queue lifetimes over a day is just Cargo Cult system administration, and a holdover from when the internet was much less "always on".

https://serverfault.com/questions/735269/is-it-a-good-idea-to-reduce-the-give-up-time-for-e-mail-delivery#answer-826351

Best
Ale
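The per-class policy described above (a five-day default, ten hours for complaints and DMARC reports) could be sketched roughly as follows. This is an illustration only, not the configuration of any particular MTA; the class labels and the names `max_queue_lifetime` and `should_give_up` are assumptions.

```python
from datetime import datetime, timedelta

DEFAULT_LIFETIME = timedelta(days=5)     # the traditional five-day cutoff
SHORT_LIFETIME = timedelta(hours=10)     # for time-sensitive automated mail
# Hypothetical class labels; a real MTA would classify messages its own way.
SHORT_LIVED_CLASSES = {"complaint", "dmarc-report"}


def max_queue_lifetime(message_class: str) -> timedelta:
    """Return how long a message of this class may sit in the queue."""
    if message_class in SHORT_LIVED_CLASSES:
        return SHORT_LIFETIME
    return DEFAULT_LIFETIME


def should_give_up(queued_at: datetime, now: datetime,
                   message_class: str) -> bool:
    """True once the message has outlived its class's queue lifetime."""
    return now - queued_at >= max_queue_lifetime(message_class)
```

With these values, a DMARC report queued at noon is bounced once it is still undelivered at 22:00 the same day, while an ordinary message queued at the same time stays in the queue for five days.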
Re: Trying again on SERVFAIL
On Thu 11/Feb/2021 10:44:58 +0100 Havard Eidnes wrote:
>> Still, being able to differentiate a local network congestion from a
>> remote bad configuration would help.
>
> That's true. There's
>
> https://tools.ietf.org/html/draft-ietf-dnsop-extended-error-16
>
> which looks promising, trying to make it possible to distinguish
> between the various reasons a recursor might choose to return a
> SERVFAIL response. It uses an EDNS option to communicate the
> additional information.

Commendable effort!

> As for its implementation status in general or in BIND in particular
> I'll admit that I don't know off-hand.

Yeah, by the time it lands on Debian's glibc we'll have grown a long long beard. I'm still missing RES_TRUSTAD...

Best
Ale
Re: Trying again on SERVFAIL
> Still, being able to differentiate a local network congestion from a
> remote bad configuration would help.

That's true. There's

https://tools.ietf.org/html/draft-ietf-dnsop-extended-error-16

which looks promising, trying to make it possible to distinguish between the various reasons a recursor might choose to return a SERVFAIL response. It uses an EDNS option to communicate the additional information.

As for its implementation status in general or in BIND in particular I'll admit that I don't know off-hand.

Regards,

- Håvard
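As a rough illustration of how that EDNS option carries the extra information, the sketch below walks the options in an OPT record's RDATA and decodes any Extended DNS Error entries. It assumes the layout from the draft as later published in RFC 8914 (option code 15, a 16-bit info-code followed by optional UTF-8 extra text). The function names are made up, and only a handful of info-codes are listed here.

```python
import struct

EDE_OPTION_CODE = 15  # Extended DNS Error, per RFC 8914

# A small subset of the registered info-codes, for illustration.
EDE_NAMES = {
    6: "DNSSEC Bogus",
    22: "No Reachable Authority",
    23: "Network Error",
}


def iter_edns_options(rdata: bytes):
    """Yield (option_code, option_data) pairs from OPT record RDATA."""
    offset = 0
    while offset + 4 <= len(rdata):
        code, length = struct.unpack_from("!HH", rdata, offset)
        offset += 4
        yield code, rdata[offset:offset + length]
        offset += length


def extended_errors(rdata: bytes):
    """Return a list of (info_code, name, extra_text) for each EDE option."""
    errors = []
    for code, data in iter_edns_options(rdata):
        if code == EDE_OPTION_CODE and len(data) >= 2:
            (info_code,) = struct.unpack_from("!H", data)
            extra_text = data[2:].decode("utf-8", errors="replace")
            name = EDE_NAMES.get(info_code, "unknown")
            errors.append((info_code, name, extra_text))
    return errors
```

A mail filter that receives SERVFAIL together with info-code 6 could reasonably treat the failure as a validation problem at the remote end, while 22 or 23 points at reachability problems, which is the distinction asked for above. Whether a given resolver actually sends the option is a separate question.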
Re: Trying again on SERVFAIL
On 2021-02-10 3:05 a.m., Alessandro Vesely wrote:
> Hi Havard,
>
> That's what I've been doing. For an incoming message, a temporary
> failure means replying with a 4xx code. The sender keeps the message in
> its queue, and eventually gives up. Once upon a time, MTAs used to
> retry sending for five days. Nowadays, several servers don't let
> queued messages grow older than one day. In the most severe case, a
> failed DKIM signature might entail a reject. So the best course of
> action seems to be to reserve temporary failures to this case.
>
> Still, being able to differentiate a local network congestion from a
> remote bad configuration would help.
>
> Best
> Ale

Hi Ale and list,

This isn't an answer to your original question, but I was curious about something you mentioned near the end of your message, where you wrote: "Once upon a time . . . Nowadays, several servers don't let queued messages grow older than one day".

Out of curiosity, what servers have you encountered that no longer use the five-day cutoff?

Thanks,

- J
Re: Trying again on SERVFAIL
Hi Havard,

thanks for your reply.

On Tue 09/Feb/2021 18:15:43 +0100 Havard Eidnes wrote:
>> is there a way to know that a query has already been tried a few
>> minutes ago, and failed?
>
> From whose perspective?
>
> A well-behaved application could remember it asked the same query a
> short while ago, of course, but that's up to the application.

For an application, caching queries feels like stealing the resolver's job.

> Or is the perspective that of a recursive resolver? As far as I
> remember, BIND used as a recursive resolver will "cache" this
> knowledge, but I'm not entirely certain for how long, since it can't
> use the method from an NXDOMAIN reply which includes the SOA record
> (and uses the re-purposed "minimum" field for the TTL for the
> negative cache entry).

I too recall that NXDOMAIN can be cached for a while. I'd guess some kinds of failures are also cached.

>> It happens seldom, but sometimes the DKIM mail filter gets a
>> SERVFAIL when it tries to authenticate an incoming message.
>> SERVFAIL occurs when DNSSEC check fails.
>
> ...or when none of the name servers for the containing zone responds
> with an answer. I.e. it's not *just* DNSSEC failure which can
> trigger SERVFAIL.

Yes, of course. Yet, however sporadic, DNSSEC failure seems to be the most frequent case.

>> Trying again is useless, it has to be treated as a permanent
>> error.
>
> Well, now... Basically nothing in the DNS is permanent, because it
> is not completely static; hence most information in the DNS has a TTL
> attached to it.
>
> So the question then becomes how an application, say a mail server
> should treat SERVFAIL. It may very well be that the "maximum retry
> time" of the mail server is far longer than any of the TTLs for the
> pieces of DNS data that you could not look up, so it may be
> appropriate to treat SERVFAIL as a signal to "re-queue the message
> and try again in 30 minutes", so in essence converting SERVFAIL into
> a "temporary failure" in the context of the mail server.

That's what I've been doing. For an incoming message, a temporary failure means replying with a 4xx code. The sender keeps the message in its queue, and eventually gives up. Once upon a time, MTAs used to retry sending for five days. Nowadays, several servers don't let queued messages grow older than one day. In the most severe case, a failed DKIM signature might entail a reject. So the best course of action seems to be to reserve temporary failures to this case.

Still, being able to differentiate a local network congestion from a remote bad configuration would help.

Best
Ale
Re: Trying again on SERVFAIL
> is there a way to know that a query has already been tried a few
> minutes ago, and failed?

From whose perspective?

A well-behaved application could remember it asked the same query a short while ago, of course, but that's up to the application.

Or is the perspective that of a recursive resolver? As far as I remember, BIND used as a recursive resolver will "cache" this knowledge, but I'm not entirely certain for how long, since it can't use the method from an NXDOMAIN reply which includes the SOA record (and uses the re-purposed "minimum" field for the TTL for the negative cache entry).

> It happens seldom, but sometimes the DKIM mail filter gets a
> SERVFAIL when it tries to authenticate an incoming message.
> SERVFAIL occurs when DNSSEC check fails.

...or when none of the name servers for the containing zone responds with an answer. I.e. it's not *just* DNSSEC failure which can trigger SERVFAIL.

> Trying again is useless, it has to be treated as a permanent
> error.

Well, now... Basically nothing in the DNS is permanent, because it is not completely static; hence most information in the DNS has a TTL attached to it.

So the question then becomes how an application, say a mail server should treat SERVFAIL. It may very well be that the "maximum retry time" of the mail server is far longer than any of the TTLs for the pieces of DNS data that you could not look up, so it may be appropriate to treat SERVFAIL as a signal to "re-queue the message and try again in 30 minutes", so in essence converting SERVFAIL into a "temporary failure" in the context of the mail server. SERVFAIL doesn't mean that the domain name you tried to look up currently doesn't exist in the DNS, you just can't know one way or the other.

> Any idea about how to tell a really temporary error?

You again have to specify the context.

Regards,

- Håvard
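The "convert SERVFAIL into a temporary failure" idea above might look something like this in a mail filter. It is a sketch under stated assumptions rather than the behaviour of any real DKIM filter: the `Verdict` names, `classify_lookup`, and the choice to treat NXDOMAIN as permanent are all illustrative. The RCODE values (0 NOERROR, 2 SERVFAIL, 3 NXDOMAIN) are the standard ones.

```python
from datetime import timedelta
from enum import Enum

RCODE_NOERROR = 0
RCODE_SERVFAIL = 2
RCODE_NXDOMAIN = 3

RETRY_INTERVAL = timedelta(minutes=30)  # "try again in 30 minutes"


class Verdict(Enum):
    PROCEED = "proceed"    # lookup succeeded, carry on with verification
    TEMPFAIL = "tempfail"  # reply 4xx / re-queue and retry later
    PERMFAIL = "permfail"  # give up on this lookup


def classify_lookup(rcode: int, queued_for: timedelta,
                    max_retry_time: timedelta) -> Verdict:
    """Map a DNS response code to what the mail server should do next.

    SERVFAIL says nothing about whether the name exists, so it is
    retried until the server's own maximum retry time has run out.
    """
    if rcode == RCODE_NOERROR:
        return Verdict.PROCEED
    if rcode == RCODE_SERVFAIL:
        if queued_for < max_retry_time:
            return Verdict.TEMPFAIL
        return Verdict.PERMFAIL
    # Assumption: anything else, NXDOMAIN included, is treated as permanent
    # for as long as the resolver's negative cache entry lasts.
    return Verdict.PERMFAIL
```

The point of passing `max_retry_time` in is that the retry window belongs to the mail server, not to the DNS: as long as it is longer than the TTLs involved, a later attempt has a real chance of seeing different data.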
Trying again on SERVFAIL
Hi,

is there a way to know that a query has already been tried a few minutes ago, and failed?

It happens seldom, but sometimes the DKIM mail filter gets a SERVFAIL when it tries to authenticate an incoming message. SERVFAIL occurs when DNSSEC check fails. Trying again is useless, it has to be treated as a permanent error.

Any idea about how to tell a really temporary error?

Best
Ale