On Wed, Oct 03, 2018 at 09:31:08AM -0700, Wayne Thayer wrote:
> On Mon, Oct 1, 2018 at 4:49 AM Matt Palmer via dev-security-policy <
> dev-security-policy@lists.mozilla.org> wrote:
> > Thank you for this clear statement of your validation interface design.
> > Unfortunately, this sounds like a design which is extremely risky, from an
> > unintentional errors perspective.  What form(s) of review for UX, including
> > but not limited to human factors of safety, were applied to this interface
> > prior to it being deployed?
> >
> If the implication here is that CAs should apply a high level of UX
> expertise to their internal validation interfaces, I would wager that very
> few CAs meet your standards.

Whilst I would not want to take the other side of that wager, it's still
unfortunate that safety-critical UX is treated with such a cavalier attitude
across the industry.

It's useful, for the ecosystem, that Certigna has been willing to identify
this publicly as a risk in this instance, specifically *because* I agree
with you that very few CAs probably meet my standards in UX design and
review, and it is something that all CAs *should*, ideally, be taking into
account.

My standards, incidentally, are informed by the approaches taken by other
safety-critical industries, which have seen great benefits from taking UX
seriously, and I'd like to see a similar approach taken by CAs who wish to
participate in the web PKI.

> > > Each operation performed by an operator is traced so that it can be
> > > audited.  The periodic audits of registration requests are also intended
> > > to ensure the conduct of controls by RA and the conformity of their
> > > results.
> >
> > Based on your initial report, I got the impression that the misissuance
> > we're discussing was not picked up by an ordinary operational audit, but
> > was instead identified by some sort of extraordinary review.  Is that an
> > accurate impression?  If so, can you provide more detail around the
> > criteria you use for selecting operations for auditing, and the
> > frequency of those audits, with particular reference to how such an
> > unusual event (overriding a CAA validation failure) wasn't picked up by
> > ordinary auditing?
> >
> Agreed - the misissuance was caught by an audit that was only performed
> after Certigna's CAA validation practices were questioned. CAs are required
> to audit 3% of issued certificates (BR section 8.7), and I would be
> surprised if there are many that exceed that requirement, so this seems
> like an industry-wide concern.

Perhaps the BRs, or Mozilla policy, needs to specify some increased
vigilance over certificates whose issuance required any sort of override or
deviation from the "standard" practice?  (Tricky to define "standard"
practice in a useful fashion, I know...)

Not that such a policy would have been likely to catch this particular
problem any quicker, since this specific issue under discussion wasn't due
to a mistake or malfeasance by an RA, but was instead due to a systemic,
organisation-wide misinterpretation of the BRs.  In general, though, when
unusual things happen, it's useful to pay closer attention to what's going
on, as the chances of mistakes are much higher, and they happen rarely
enough that the standard 3% sampling is unlikely to catch them by accident.
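
To put a rough number on that last point, here is a minimal sketch in
Python.  It assumes the audit sample is drawn uniformly at random, without
replacement, from all issued certificates; the function name and the figures
(10,000 certificates, 2 of them issued via an override) are hypothetical
illustrations, not Certigna's data or anything specified by the BRs.

```python
from math import comb


def prob_sample_hits(total: int, unusual: int, sample_size: int) -> float:
    """Chance that a uniform random sample (without replacement) of
    `sample_size` certificates contains at least one of the `unusual` ones."""
    if unusual <= 0 or sample_size <= 0:
        return 0.0
    # P(sample misses every unusual certificate)
    #   = C(total - unusual, sample_size) / C(total, sample_size)
    miss = comb(total - unusual, sample_size) / comb(total, sample_size)
    return 1.0 - miss


# Hypothetical figures: 10,000 certificates issued in the audit period,
# 2 of which required an override, sampled at the 3% minimum.
total, overrides = 10_000, 2
sample_size = total * 3 // 100  # 300 certificates
print(f"{prob_sample_hits(total, overrides, sample_size):.1%}")
```

Under those assumptions the sample includes an override-issued certificate
only about 6% of the time, which is why selecting such certificates for
review deliberately, rather than hoping the random sample lands on one,
seems worth considering.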

> > Alternately, if the BRs *are*, in fact, sufficiently clear in all respects,
> > the only other possibility that comes to my mind is that Certigna failed to
> > correctly interpret the BRs, which is far more concerning -- for Certigna,
> > at least.  It would mean that there could be any number of other, as yet
> > unidentified, misunderstandings in Certigna's procedures.  I would imagine
> > there would need to be a very comprehensive review of Certigna's processes
> > and procedures, with a detailed public report of the findings of that
> > review, for confidence in Certigna to be restored.
>
> I think we have established that the problem was with Certigna's chosen
> interpretation of the BRs. I am not clear on how you are proposing to have
> a "comprehensive review of Certigna's processes and procedures, with a
> detailed public report of the findings of that review" performed. This
> sounds like an audit to me?

Not by my understanding of an audit (particularly a WebTrust audit).  A
WebTrust audit will validate that the organisation does what it says it
does, but I can't see how it would identify that management has
misinterpreted the BRs, in any but the most egregious of ways.  I doubt ETSI
is any better, given that the impression I get is that those audits are even
less closely aligned with the specific requirements of the BRs.

What I'm envisioning in my suggestion is a review of all Certigna's BR-related
processes and procedures, with a view to identifying any potential issues
based on the two factors already identified:

1. That Certigna misinterpreted the BRs to believe that they could
   substitute controls they considered to be equivalent for BR-mandated
   controls.  Since it is entirely possible that this same misinterpretation
   may have coloured other aspects of their operations, I would expect a
   review of all processes and procedures to identify whether they, too, could
   have suffered from the same misinterpretation in their design or
   implementation.

2. That Certigna managed to misinterpret the BRs *in general*.  Thus, which
   other Certigna processes and procedures could have been influenced by
   other misinterpretations of the BRs?  This may require an
   external party, or at least someone who wasn't involved in the initial
   analysis of the BRs to determine Certigna's processes and procedures, to
   ensure that prior misinterpretations are not repeated.

> > > We would recommend to the other CAs to segment each control even if
> > > their objective is the same, in order to be sure that the result of
> > > each control is taken into account and can not be circumvented under
> > > the pretext that a control, having the same purpose, is positive.
> >
> > At the risk of re-iterating the previous point, I'm still at a loss to
> > understand what led Certigna management to believe that substitution of
> > controls that were deemed equivalent was permitted by the BRs or Mozilla
> > policy?
>
> A key point that I hope has been made here is that compensating controls
> are not a substitute for meeting BR requirements. I don't know what more
> there is to say on this.

Without understanding how an error occurred, it is significantly more
difficult to comprehensively remediate a problem, and also to provide
guidance for others in equivalent, but not identical, situations.  "Someone
at Certigna made a mistake" would appear to be the *proximate* cause of the
problem, but I don't really care that someone made a mistake.  I'm keen to
know *why* the mistake was made.

As an example of one *possible* underlying cause: was the person who read
the BRs in this case a non-native English speaker, and some problematic
phrasing tripped them up?  Idioms which seem perfectly comprehensible to a
native speaker can cause all manner of confusion (my use of Australianisms
regularly trips up my professional colleagues).  If that were to be the
cause of this misinterpretation, then it would behoove other CAs to be
especially mindful of that possibility in their own operations, and consider
mitigations (get a native English speaker to review high-risk areas of
interpretation, for example).  Even better (from a risk mitigation
perspective) would be to fix the BRs to remove all instances of the
troublesome phrasing entirely, and give notice to drafters of future BR
amendments to avoid specific problematic language constructions.

All of that useful remediation can't take place, though, unless the
underlying cause of the mistake can be identified.  To repeat: I don't care
that someone at Certigna made a mistake.  That's done, and can't be fixed. 
What *can* be fixed is the circumstances that allowed the mistake to happen,
so that it (and other similar mistakes) can't happen in the future -- to
Certigna or anyone else.

> > > The certification body in charge of our audit includes in these audit
> > > criteria the ETSI standards but also the BR and EV guidelines as we
> > > requested.  As said above, new auditors are selected for our next
> > > certification audit.
> >
> > This statement suggests that Certigna believes there was some degree of
> > blame to be attached to Certigna's auditors regarding this misissuance
> > (otherwise why raise it as part of this discussion?).  If Certigna believes
> > that their auditors are partially at fault, can you expand on why that is?
> > Were they simply incompetent in their duties?  If so, have you reported
> > this incompetence to either Mozilla or the CA/B Forum, so that appropriate
> > steps can be taken?  If you do not believe the auditor to have been
> > incompetent, what leads you to believe that selecting alternate auditors
> > will necessarily lead to a more satisfactory outcome?  Are there changes or
> > improvements to the audit criteria that auditors are required to follow
> > that you can suggest, to prevent future occurrences of this kind?
>
> I think that Certigna is just arguing that audits will uncover any other
> problems that they might have.

That would be a troubling argument in and of itself, if it were made. 
There's a long and distinguished history of auditors missing all sorts of
problems at CAs, and I'd be worried about anyone who has been on the
receiving end of an AICPA-overseen security audit who thinks the auditors
are a useful adjunct to their security program.  Certainly, any organisation
that were to *rely* on auditors to find their problems for them would be
disconcertingly misguided, almost dangerously so.

- Matt

