Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
(no hats)

On Oct 31, 2014, at 11:53 AM, Warren Kumari wrote:

> On Fri, Oct 31, 2014 at 10:26 AM, Paul Ebersman wrote:
>>
>> I'd hope it would be good operational sense for folks to have automated
>> checks of critical things and checks of DNS logs for DNSSEC validation
>> failures and that we shouldn't have to spell that out.
>>
>> But do we want to at least have a mention of doing such kinds of checks
>> as a better way of noticing DNSSEC failures than pissed off customers or
>> puzzled NOC folks?
>
> Nope -- because now you have the problem of where to draw the line. Do
> we also suggest the folk monitor error rates on WAN circuits? Failing
> RAID arrays? Excessive BIND memory usage?
>
> I think that would be document creep, creep!

Well, there might be a useful heuristic for drawing the line though -- it might well be that there's *operational* guidance on what monitoring is useful for this specific purpose, as opposed to "all possible monitoring of everything".

I'm fine with "This is what real operators are finding useful in the context of deploying this particular optimization to our service."

Suzanne

___
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
On Fri, Oct 31, 2014 at 10:26 AM, Paul Ebersman wrote:
>
> srose> Should there be text describing auto-adding of NTA's based on
> srose> important domains (for the ISP/resolver's definition of
> srose> important)? So that domains that are used by low level services
> srose> don't fail that also aren't normally visible to end users? One
> srose> example is nist.gov. When nist.gov messed up and went DNSSEC
> srose> BOGUS, time.nist.gov was unreachable by validating resolvers.
>
> warren> Sorry, but to me this sounds like a bad idea -- you should find
> warren> out that your "not normally visible to end users" failures are
> warren> happening because your network monitoring system goes "Beep Beep
> warren> Beep" when low level important services die. The NOC then goes
> warren> off and investigates and discovers that e.g. the NTP monitor is
> warren> sad because time.nist.gov is unresolvable.
>
> warren> At this point there really needs to be a *human* in the loop to
> warren> decide what to do, if the failure really *is* a DNSSEC failure
> warren> and, more importantly, if installing an NTA is the right answer.
>
> I'd hope it would be good operational sense for folks to have automated
> checks of critical things and checks of DNS logs for DNSSEC validation
> failures and that we shouldn't have to spell that out.
>
> But do we want to at least have a mention of doing such kinds of checks
> as a better way of noticing DNSSEC failures than pissed off customers or
> puzzled NOC folks?

Nope -- because now you have the problem of where to draw the line. Do we also suggest the folk monitor error rates on WAN circuits? Failing RAID arrays? Excessive BIND memory usage?

I think that would be document creep, creep!

>
> I do agree that we should not be inserting NTAs automatically for
> anything.

Yah.
W

--
I don't think the execution is relevant when it was obviously a bad idea in the first place.
This is like putting rabid weasels in your pants, and later expressing regret at having chosen those particular rabid weasels and that pair of pants.
   ---maf
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
srose> Should there be text describing auto-adding of NTA's based on
srose> important domains (for the ISP/resolver's definition of
srose> important)? So that domains that are used by low level services
srose> don't fail that also aren't normally visible to end users? One
srose> example is nist.gov. When nist.gov messed up and went DNSSEC
srose> BOGUS, time.nist.gov was unreachable by validating resolvers.

warren> Sorry, but to me this sounds like a bad idea -- you should find
warren> out that your "not normally visible to end users" failures are
warren> happening because your network monitoring system goes "Beep Beep
warren> Beep" when low level important services die. The NOC then goes
warren> off and investigates and discovers that e.g. the NTP monitor is
warren> sad because time.nist.gov is unresolvable.

warren> At this point there really needs to be a *human* in the loop to
warren> decide what to do, if the failure really *is* a DNSSEC failure
warren> and, more importantly, if installing an NTA is the right answer.

I'd hope it would be good operational sense for folks to have automated checks of critical things and checks of DNS logs for DNSSEC validation failures and that we shouldn't have to spell that out.

But do we want to at least have a mention of doing such kinds of checks as a better way of noticing DNSSEC failures than pissed off customers or puzzled NOC folks?

I do agree that we should not be inserting NTAs automatically for anything.
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
On Thu, Oct 30, 2014 at 3:43 PM, Rose, Scott wrote:
> On the subject of NTA's that should be there -
>
> Should there be text describing auto-adding of NTA's based on important
> domains (for the ISP/resolver's definition of important)?
> So that domains that are used by low level services don't fail that also
> aren't normally visible to end users? One example is nist.gov. When
> nist.gov messed up and went DNSSEC BOGUS, time.nist.gov was unreachable by
> validating resolvers. Unless the log files were detailed, a user may not
> know what is going on or that NTP is having issues.
>

Sorry, but to me this sounds like a bad idea -- you should find out that your "not normally visible to end users" failures are happening because your network monitoring system goes "Beep Beep Beep" when low level important services die. The NOC then goes off and investigates and discovers that e.g. the NTP monitor is sad because time.nist.gov is unresolvable.

At this point there really needs to be a *human* in the loop to decide what to do, if the failure really *is* a DNSSEC failure and, more importantly, if installing an NTA is the right answer.

This requires a human judgment call if the outage requires serious action. For example, for the *huge* majority of folk time.nist.gov going unresolvable was simply not an issue -- but many years ago I worked for a digital timestamp / nonrepudiation company, and things failed spectacularly if *they* couldn't talk to time.nist.gov.

This is, and should be, a "Break Glass" type event, not an automated best guess. Neither Google Public DNS nor Comcast has yet installed an NTA for fema.net...

> This could be a monitor, or a pre-loaded NTA for certain domains. Not crazy
> about the pre-loaded idea, but it would avoid a period of scrambling.

I think that some scrambling is preferable to incorrectly overriding DNSSEC validation in the case of an "actual" issue.
If the thingie you are talking to is sufficiently critical to your operations you *really* should be monitoring it. When the monitor fires your well trained NOC leaps into action, opens the binder and flips to the checklist - if they discover that the failure is truly a DNSSEC issue (e.g. they call up the operators of criticalservice.org who tell them that the HSM and hidden master are both under 3 ft of piranha infested water) they flip to the section marked "Installing an NTA". This has another checklist that confirms that it really is A: a DNSSEC failure and B: that criticalservice.org is actually critical. Once confirming this, the NOC folder references Appendix A in [draft-livingood-dnsop-negative-trust-anchors] and installs an NTA for criticalservice.org. They then note this in the ticketing system, add it to the daily NOC log and go back to playing Minecraft. The following NOC shift has an open ticket and periodically tests if the issue is resolved. Once it is, the NTA gets removed.

This should all be a very rare occurrence - but if and when you need it, you *really* need it. Having the NTA option available in an emergency is really useful, but it should not be used in an automated manner.

If a name is critical enough that you are considering automatically installing an NTA there are probably 3 questions you need to ask:

1: DNSSEC is designed to prevent MITM attacks -- if it is this critical to me, is failing "open" the right answer?
2: If the service is *this* critical to me, perhaps I have too many eggs in one basket and need to find a second provider?
3: If I'm seeing these failures often enough that I'm considering this, have I selected the wrong supplier here?
W

>
> Scott
>
> On Oct 29, 2014, at 5:11 PM, Warren Kumari wrote:
>
>> Over on the BIND-Users list there is currently a discussion of
>> fema.net (one of the "Federal Emergency Management Agency" domains)
>> being DNSSEC borked
>> (https://lists.isc.org/pipermail/bind-users/2014-October/094142.html)
>>
>> This is an example of the sort of issues that an NTA could address --
>> I'd like to note that currently neither Google Public DNS (8.8.8.8)
>> nor Comcast (75.75.75.75) have put in an NTA for it, but if it were
>> fema.gov, and this were during some sort of national disaster in the
>> US, things might be different...
>> W
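The first step of the checklist described above, confirming that a failure really is a DNSSEC failure, is often done by comparing a normal query against one sent with the CD (Checking Disabled) bit set, for example `dig` versus `dig +cdflag`. The following is only a minimal Python sketch of that comparison. The function name and the labels it returns are hypothetical, and it assumes the two RCODEs have already been collected from the same validating resolver. It deliberately stops at triage: a human still decides whether to install an NTA.

```python
# Sketch only: coarse triage of a lookup failure from two RCODE strings.
# Assumption: the caller queried the same validating resolver twice,
# once normally and once with the CD (Checking Disabled) bit set.

def classify_failure(rcode_normal: str, rcode_cd: str) -> str:
    """Return a hypothetical triage label for the NOC to act on."""
    if rcode_normal == "NOERROR":
        return "ok"
    if rcode_normal == "SERVFAIL" and rcode_cd == "NOERROR":
        # Data is retrievable, but the validator rejected it.
        return "possible-dnssec-validation-failure"
    if rcode_normal == "NXDOMAIN":
        return "name-does-not-exist"
    # SERVFAIL with and without CD points away from validation
    # (lame delegation, unreachable servers, and so on).
    return "other-failure"


if __name__ == "__main__":
    print(classify_failure("SERVFAIL", "NOERROR"))
```

A "possible" label is used on purpose: the same pattern would also appear during a real attack, which is why the thread insists on a human in the loop.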
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
On the subject of NTA's that should be there -

Should there be text describing auto-adding of NTA's based on important domains (for the ISP/resolver's definition of important)? So that domains that are used by low level services don't fail that also aren't normally visible to end users? One example is nist.gov. When nist.gov messed up and went DNSSEC BOGUS, time.nist.gov was unreachable by validating resolvers. Unless the log files were detailed, a user may not know what is going on or that NTP is having issues.

This could be a monitor, or a pre-loaded NTA for certain domains. Not crazy about the pre-loaded idea, but it would avoid a period of scrambling.

Scott

On Oct 29, 2014, at 5:11 PM, Warren Kumari wrote:

> Over on the BIND-Users list there is currently a discussion of
> fema.net (one of the "Federal Emergency Management Agency" domains)
> being DNSSEC borked
> (https://lists.isc.org/pipermail/bind-users/2014-October/094142.html)
>
> This is an example of the sort of issues that an NTA could address --
> I'd like to note that currently neither Google Public DNS (8.8.8.8)
> nor Comcast (75.75.75.75) have put in an NTA for it, but if it were
> fema.gov, and this were during some sort of national disaster in the
> US, things might be different...
> W
>

===
Scott Rose
NIST
scott.r...@nist.gov
+1 301-975-8439
Google Voice: +1 571-249-3671
http://www.dnsops.gov/
===
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
In message <10d9f4dd-1be6-41ff-954d-fd223547d...@virtualized.org>, David Conrad writes:
> Tim,
>
> On Oct 29, 2014, at 2:55 PM, Morizot Timothy S wrote:
> > If an authoritative domain (e.g. irs.gov) screwed up its delegation NS
> > records so it effectively went dark or made some similar sort of
> > authoritative DNS or nameserver error, we wouldn't expect the recursive,
> > caching side to resolve those sorts of errors. The domain's DNS would
> > simply be unavailable until they resolved their problem.
> >
> > I'm not sure I understand why DNSSEC is somehow different.
>
> Because folks who aren't validating see no problems, thus discouraging
> people from leaving validation on.
>
> To wit, on NANOG:
>
> > From: Ray Van Dolson
>
> "I saw the same errors in dnsviz, but was unsure if they were sufficient
> to cause lookup failures (they were "warnings" only).
>
> # dig @8.8.8.8 disa.mil MX +dnssec
> ...
> I do note that once we disabled DNSSEC on our resolvers we were able to
> push mail out to these domains. May have been coincidental -- needs
> further testing."
>
> I figure it would be nice to give people the option of disabling
> validation for a single domain instead of turning validation off for
> everything.

I suspect you will find there are ways to do this in all the validators. BIND has had the following for ages which I know David knows.

	disable-algorithms <domain> { <algorithm>; ... };

BIND 9.11 will allow for disabling via rndc with automatic periodic testing and re-enabling when validation of the SOA succeeds. Validation will also be automatically re-enabled after a timer goes off.

Mark

> Regards,
> -drc

--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: ma...@isc.org
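The timer behaviour mentioned above (validation is re-enabled automatically once an NTA's lifetime runs out) can be illustrated with a small sketch. This is not BIND's implementation: the class and method names are hypothetical, the table lives only in memory, and the periodic re-test of the SOA is left out. It shows just two ideas: an NTA covers a domain and everything below it, and an expired entry stops having any effect.

```python
import time
from typing import Callable, Dict


class NegativeTrustAnchors:
    """Hypothetical in-memory NTA table with per-entry lifetimes."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock                  # injectable for testing
        self._expiry: Dict[str, float] = {}  # domain -> expiry timestamp

    @staticmethod
    def _norm(name: str) -> str:
        return name.lower().rstrip(".")

    def add(self, domain: str, lifetime_s: float) -> None:
        """Install an NTA that lapses after lifetime_s seconds."""
        self._expiry[self._norm(domain)] = self._clock() + lifetime_s

    def remove(self, domain: str) -> None:
        """Remove an NTA early, e.g. once the zone is fixed."""
        self._expiry.pop(self._norm(domain), None)

    def covers(self, qname: str) -> bool:
        """True if qname is at or below a domain with an unexpired NTA."""
        now = self._clock()
        labels = self._norm(qname).split(".")
        for i in range(len(labels)):
            candidate = ".".join(labels[i:])
            expires = self._expiry.get(candidate)
            if expires is None:
                continue
            if expires > now:
                return True
            del self._expiry[candidate]  # timer went off: validate again
        return False
```

A resolver using such a table would skip validation only while `covers()` returns true, so a forgotten NTA cannot stay in place indefinitely.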
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
Tim,

On Oct 29, 2014, at 2:55 PM, Morizot Timothy S wrote:
> If an authoritative domain (e.g. irs.gov) screwed up its delegation NS
> records so it effectively went dark or made some similar sort of
> authoritative DNS or nameserver error, we wouldn't expect the recursive,
> caching side to resolve those sorts of errors. The domain's DNS would simply
> be unavailable until they resolved their problem.
>
> I'm not sure I understand why DNSSEC is somehow different.

Because folks who aren't validating see no problems, thus discouraging people from leaving validation on.

To wit, on NANOG:

> From: Ray Van Dolson

"I saw the same errors in dnsviz, but was unsure if they were sufficient to cause lookup failures (they were "warnings" only).

# dig @8.8.8.8 disa.mil MX +dnssec
...

I do note that once we disabled DNSSEC on our resolvers we were able to push mail out to these domains. May have been coincidental -- needs further testing."

I figure it would be nice to give people the option of disabling validation for a single domain instead of turning validation off for everything.

Regards,
-drc
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
On Sun, 26 Oct 2014, Paul Hoffman wrote:

> On Oct 26, 2014, at 6:08 PM, Paul Wouters wrote:
>> That's my problem with the document. It describes a local policy that a
>> site might have. And documents three software implementations on how
>> to make such a negative trust anchor. Is that what an IETF document
>> should do?
>
> Yes, that's a fine thing to do as compared to being silent on something
> that has operational impact.
>
>> So I'm confused. What is the goal of this document? How does it help us?
>
> The first part is stated quite well in the document. The second part is
> that it documents a practice that affects the operation of DNS.

Okay, if those are the goals then the document seems fine to me. I guess I was confused about any kind of planned collecting, distributing or "red button" type signaling methods.

Thanks,

Paul
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
On Oct 26, 2014, at 6:08 PM, Paul Wouters wrote:

> That's my problem with the document. It describes a local policy that a
> site might have. And documents three software implementations on how
> to make such a negative trust anchor. Is that what an IETF document
> should do?

Yes, that's a fine thing to do as compared to being silent on something that has operational impact.

> So I'm confused. What is the goal of this document? How does it help us?

The first part is stated quite well in the document. The second part is that it documents a practice that affects the operation of DNS.

--Paul Hoffman
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
On Sun, Oct 26, 2014 at 9:52 AM, Livingood, Jason < jason_living...@cable.comcast.com> wrote:
>
> Warren - Your suspicions are right on the money. A good reference is
> http://dns.comcast.net/images/files/dnssec_validation_failure_nasagov_20120118_final.pdf.
> Take a look at the flak we got on page 9 – truly fascinating. In any case,
> posting on Twitter and our DNS website is what we have been trying to do
> based on how people respond. And in nearly every case I have seen so far we
> have had PR involved since we have usually gotten press calls about it or
> could expect to (in the NASA.gov example it was ironically enough MSNBC).
>
>

Unless and until DNSSEC and DNS management is fully automated everywhere, use of DNSSEC information is going to require curation.

The problem here is a very common one in security: Every security problem is simple if you only recognize one problem. It is the need to balance different concerns that makes security hard. This is also why I try to resist the narrowly focused WG charters designed to produce a quick specification that only meets half the real world concerns.

In the real world, the goal is to be able to connect to the correct site. This means that there are two (main) failure modes, not one:

1) Client connects to the wrong site.
2) Client is unable to connect to the correct one.

Any decision process that is going to be implemented in the end point has to be able to respond to every set of inputs in a deterministic way.

One of the reasons the DKIM policy discussion was so nonsensical was that people would suggest solving problem A by adopting decision strategy X and problem B with strategy Y. But there was no deterministic means of choosing between X and Y. So the implementations would have to work by magic.

Moving the decision point to the resolver allows a more powerful decision strategy because a resolver can bring more information to bear. Instead of discarding expired records, a high volume resolver can make use of them to make better decisions. So if the DNSSEC records have expired but the A records have not changed, keep serving them.

Negative trust anchors are one approach to curating the DNSSEC data but there is a much larger family of possible approaches and curating DNSSEC for A records is not the same as curating for DANE or security policy records.
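To make the "deterministic" requirement above concrete, here is a minimal Python sketch of a decision function in which every combination of inputs maps to exactly one outcome. The field names and outcome labels are hypothetical, and the rule itself (keep serving A records that are unchanged since the last successful validation when the only problem is expired signatures) is just the example given in the message above, not something specified in the NTA draft.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Observation:
    """Hypothetical inputs a resolver might have for one name."""
    signatures_valid: bool             # validation succeeded
    signatures_expired: bool           # failure is an expired signature only
    current_a: FrozenSet[str]          # A records seen now
    last_validated_a: FrozenSet[str]   # A records at last good validation


def decide(obs: Observation) -> str:
    """Map every observation to exactly one outcome."""
    if obs.signatures_valid:
        return "serve"
    unchanged = bool(obs.current_a) and obs.current_a == obs.last_validated_a
    if obs.signatures_expired and unchanged:
        # Guards against failure mode 2 without accepting new data.
        return "serve-unchanged"
    # Anything else is treated as failure mode 1 being possible.
    return "fail"
```

Because the function has a single fall-through result, there is no input for which the resolver has to choose between strategies "by magic".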
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
On 10/25/14, 10:12 PM, "Warren Kumari" wrote:

> On Sat, Oct 25, 2014 at 9:50 PM, Paul Hoffman wrote:
>> On Oct 25, 2014, at 6:43 PM, Olafur Gudmundsson wrote:
>>> We want humans in the loop, I would love to see a twitter feed when ever
>>> Comcast does a Negative Trust Anchor.
>>
>> Like https://twitter.com/ComcastDNS, for example? Either things haven't been
>> failing much lately, or they're not updating it as often as we had hoped.
>
> Or both... I suspect it might also be: Installing an NTA is annoying. It
> requires poking at running servers, you may have to talk to lawyers
> (shudder), you may have to get PR people in the loop, etc. This means
> that they only get put in for actual issues that affect a large number
> of users. If maryandjohnsvacation.photo goes bogus (because Mary
> typo'ed the entry in her crontab) it is highly unlikely that you will
> get DNS operators to go through the rigmarole of installing an NTA.

Warren - Your suspicions are right on the money. A good reference is http://dns.comcast.net/images/files/dnssec_validation_failure_nasagov_20120118_final.pdf. Take a look at the flak we got on page 9 – truly fascinating. In any case, posting on Twitter and our DNS website is what we have been trying to do based on how people respond. And in nearly every case I have seen so far we have had PR involved since we have usually gotten press calls about it or could expect to (in the NASA.gov example it was ironically enough MSNBC).

- Jason Livingood
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
On Sat, Oct 25, 2014 at 9:50 PM, Paul Hoffman wrote:
> On Oct 25, 2014, at 6:43 PM, Olafur Gudmundsson wrote:
>> We want humans in the loop, I would love to see a twitter feed when ever
>> Comcast does a Negative Trust Anchor.
>
> Like https://twitter.com/ComcastDNS, for example? Either things haven't been
> failing much lately, or they're not updating it as often as we had hoped.
>

Or both... I suspect it might also be: Installing an NTA is annoying. It requires poking at running servers, you may have to talk to lawyers (shudder), you may have to get PR people in the loop, etc. This means that they only get put in for actual issues that affect a large number of users. If maryandjohnsvacation.photo goes bogus (because Mary typo'ed the entry in her crontab) it is highly unlikely that you will get DNS operators to go through the rigmarole of installing an NTA.

W

> --Paul Hoffman
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
On Oct 25, 2014, at 6:43 PM, Olafur Gudmundsson wrote:

> We want humans in the loop, I would love to see a twitter feed when ever
> Comcast does a Negative Trust Anchor.

Like https://twitter.com/ComcastDNS, for example? Either things haven't been failing much lately, or they're not updating it as often as we had hoped.

--Paul Hoffman
Re: [DNSOP] New Version Notification for draft-livingood-dnsop-negative-trust-anchors-01.txt
Doug,

On Oct 24, 2014, at 3:06 PM, Doug Barton wrote:
>> I know that there will be some philosophical objections / discussions on
>> this...
>
> It's not just a philosophical objection, it's an operational one. When DNSSEC
> fails for a domain there are 2 main reasons. Operator error, and an actual
> MITM or similar attack.

In these early days of DNSSEC deployment, we've seen numerous occasions in which the first has applied. Can you provide URLs to the cases where the second has applied?

> If the operators of validating resolvers simply turn off validation for
> domains that should be validating but are not,* it kicks the door open for
> the exact problem that DNSSEC was designed to solve.

In the cases I'm aware of when an actual attack that DNSSEC was designed to solve has occurred, there are artifacts in logs. I would presume network operators would, when they get notified that a domain isn't validating, check those logs in their process to determine what's going on. If they notice the artifacts, I suspect they won't install an NTA.

> But worse yet, in the operator error case you make such errors painless.

Only if you believe there will be universal and instantaneous deployment of NTAs any time the zone operator screws up. I suspect this unlikely.

> Of course I realize that the counter-argument is that if DNSSEC fails are too
> painful then people will simply not do it. But to the extent that people use
> that as a line of reasoning it's simply one more in a long string of excuses.

No. It is an acceptance of the operational reality that, at least today, the most common failure mode (by far) is for zone operators to screw up and the remedy for that other person screwing up is to either tell the resolver operators' customers "sorry, the guys at <zone> made a booboo and you can't talk to them as long as you use our resolver" or turn validation off for ALL zones.
If a zone is sufficiently popular that a resolver operator gets multiple calls when it doesn't validate because the zone's operator boobooed, the resolver operator _will_ turn off validation. Do that enough times and I suspect it becomes questionable whether they'll bother to turn it on again.

> The other problem is that this feature is only really useful in the DNSSEC
> ramp-up period. Sure, mistakes are more common now, software is immature,
> etc. etc. But if DNSSEC is successful, the software will get better (it
> already is a lot better than even a few years ago), and mistakes will be less
> common (both on an absolute, and on a percentage basis). But once you
> introduce a feature like this, you cannot remove it.

You seem to believe that installing NTAs is without cost. First, it requires operator intervention in response to a notification. Second, I believe it opens the resolver operator up to liability -- by installing an NTA, they are, by direct intervention of their staff, purposefully defeating a protection that they explicitly configured that puts all of their customers at risk. The longer the NTA is left in place, the greater the risk.

> ... and of course, I should point out that adding this as a knob is utterly
> pointless, since any reasonably competent validating resolver operator can
> engineer their own trapdoor.

Since not all validating resolvers support NTAs, if you want to supply text, I'd support adding a section describing alternative approaches to implementing NTA functionality.

Regards,
-drc