Re: SECURITY RELEVANT FOR CAs: The curious case of the Dangerous Delegated Responder Cert

Peter Bowen via dev-security-policy Sat, 04 Jul 2020 12:02:26 -0700

On Sat, Jul 4, 2020 at 11:06 AM Ryan Sleevi via dev-security-policy
<dev-security-policy@lists.mozilla.org> wrote:
>
> On Sat, Jul 4, 2020 at 12:52 PM mark.arnott1--- via dev-security-policy <
> dev-security-policy@lists.mozilla.org> wrote:
>
> > This is insane!
> > Those 300 certificates are used to secure healthcare information systems
> > at a time when the global healthcare system is strained by a global
> > pandemic.  I have to coordinate with more than 30 people to make this
> > happen.  This includes three subsidiaries and three contract partner
> > organizations as well as dozens of managers and systems engineers.  One of
> > my contract partners follows the guidance of an HL7 specification that
> > requires them to do certificate pinning.  When we replace these
> > certificates we must give them 30 days lead time to make the change.
> >
>
> As part of this, you should re-evaluate certificate pinning. As one of the
> authors of that specification, I can say (and my co-authors on the
> specification agree) that certificate pinning does more harm than good, for
> precisely this reason.
>
> Ultimately, the CA is responsible for following the rules, as covered in
> https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation . If
> they're not going to revoke, such as for the situation you describe,
> they're required to treat this as an incident and establish a remediation
> plan to ensure it does not happen again. In this case, a remediation plan
> certainly involves no longer certificate pinning (it is not safe to do),
> and also involves implementing controls so that it doesn't require 30
> people, across three subsidiaries, to replace "only" 300 certificates. The
> Baseline Requirements require those certificates to be revoked in as short
> as 24 hours, and so you need to design your systems robustly to meet that.


One of the things that can be very non-obvious to many people is that
"incident" as Ryan describes it is not a binary thing.  When Ryan says
"treat this as an incident" he does not necessarily mean the kind of
incident system whose goal is to have zero incidents forever.
In some environments the culture is that any incident is a
career-limiting event or has financial impacts - for example, a factory might
pay out bonuses to employees for every month in which zero incidents
are reported.  This does not align with what Ryan speaks about.
Instead, based on my experience working with Ryan, incidents are the
trigger for blameless postmortems which are used to teach.  Google
documented this in their SRE book
(https://landing.google.com/sre/sre-book/chapters/postmortem-culture/)
and AWS includes this as part of their Well-Architected Framework
(https://wa.aws.amazon.com/wat.concept.coe.en.html).

One of the challenges is that not everyone in the WebPKI ecosystem has
aligned around the same view of incidents as learning opportunities.
This makes it very challenging for CAs to find a path that suits all
participants and frequently results in hesitancy to use the blameless
postmortem version of incidents.

> > After wading through this very long chain of messages I see little
> > discussion of the impact this will have on end users.  Ryan Sleevi, in the
> > name of Google, is purporting to speak for the end users, but it is obvious
> > that Ryan does not understand the implication of applying these rules.
> >
>
> I realize you're new here, and so I encourage you to read
> https://wiki.mozilla.org/CA/Policy_Participants for context about the
> nature of participation.

To clarify what Ryan is saying here: he is pointing out that he is not
representing the position of Google or Alphabet; rather, he is stating
that he is acting as an independent party.

As you can see from earlier messages, Mozilla has clearly stated that
they are NOT requiring revocation in 7 days in this case, as they
judge the risk from revocation to be greater than the risk from not
revoking on that same timeframe. Ben Wilson, who does represent
Mozilla, stated:

"Mozilla does not need the certificates that incorrectly have the
id-kp-OCSPSigning EKU to be revoked within the next 7 days, as per
section 4.9.1.2 of the BRs. We want to work with CAs to identify a
path forward, which includes determining a reasonable timeline and
approach to replacing the certificates that incorrectly have the
id-kp-OCSPSigning EKU (and performing key destruction for them)."

The reason this discussion is ongoing is that Ryan does work for
Google, and it is widely understood that: 1) certificates that are not
trusted by the Google Chrome browser in its default configuration
(e.g. installed on a home version of Windows with no further
configuration), or not trusted by default on widely used Android
devices, are not commercially viable, as they do not meet the needs of
many organizations and individuals who request certificates; and 2)
Ryan appears to be highly influential in Chrome and Android decision
making about what certificates to trust.

If Google were to officially state something similar to Mozilla, then
this thread would likely resolve itself quickly.  Yes, there are other
trust stores to deal with, but they have historically not engaged in
this Mozilla forum, so discussion here is not helpful for them.

> This is the unfortunate nature of PKI: as a system, the cost of revocation
> is often not properly accounted for when CAs or Subscribers are designing
> their systems, and so folks engage in behaviours that increase risk, such
> as lacking automation or certificate pinning.

This is the nature of PKIs that are used with browsers today.  As you,
Ryan, have frequently stated, one of the big challenges is when a PKI
is used for multiple purposes.  It seems that what Mark is pointing
out is that HL7 has a set of requirements that contradict those of
Chrome.  In some environments, it would be completely reasonable to
have a certificate policy that certificates must NOT be revoked with
less than 180 days notice (or 30 days or similar).  In these
environments availability is far more critical than confidentiality.
I've seen environments that would strongly prefer to use
TLS_ECDHE_NONE_WITH_AES_256_GCM if such a thing existed, meaning they
would simply encrypt data without any authentication of the remote
party.  I've also seen environments that would prefer a scheme whereby
certificates never expire and are only invalidated if the relying
party records another certificate for the same subject with a newer
issued date. This makes sense for them, given other controls in place.
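
To make the last scheme concrete, here is a minimal sketch of how a
relying party might implement it. This is only an illustration under
stated assumptions, not a description of any real deployment: the
CertRecord and NewestWinsStore names are hypothetical, and a real
system would parse and verify actual X.509 certificates rather than
use this simplified record.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CertRecord:
    # Hypothetical, simplified stand-in for a parsed certificate.
    subject: str
    issued_at: datetime
    fingerprint: str


class NewestWinsStore:
    """Relying-party store: per subject, only the newest-issued cert is valid."""

    def __init__(self):
        self._current = {}  # subject -> CertRecord

    def record(self, cert):
        """Record a certificate; return True if it is now current for its subject."""
        existing = self._current.get(cert.subject)
        if existing is None or cert.issued_at > existing.issued_at:
            # A newer issued date implicitly invalidates the older certificate.
            self._current[cert.subject] = cert
        return self._current[cert.subject] == cert

    def is_valid(self, cert):
        # Deliberately no expiry check: certificates never expire in this scheme.
        return self._current.get(cert.subject) == cert


store = NewestWinsStore()
old = CertRecord("CN=lab.example", datetime(2019, 1, 1, tzinfo=timezone.utc), "aa")
new = CertRecord("CN=lab.example", datetime(2020, 7, 1, tzinfo=timezone.utc), "bb")
store.record(old)
store.record(new)  # supersedes `old` for the same subject
```

Note that there is no revocation or expiry path at all in this sketch;
the only way a certificate stops being accepted is for the relying
party to record a newer one for the same subject, which is why such a
scheme depends on the other controls in place.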

For the future, HL7 probably would be well served by working to create
a separate PKI that meets their needs.  This would enable a different
risk calculation to be used - one that is specific to the HL7 health
data interoperability world.  I don't know if you or your organization
has influence in HL7, but it is something worth pushing if you can.

Thanks,
Peter
_______________________________________________
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy
