Re: CA disclosure of revocations that exceed 5 days [Was: Re: Incident report D-TRUST: syntax error in one tls certificate]

Dimitris Zacharopoulos via dev-security-policy Tue, 04 Dec 2018 10:30:02 -0800

Fotis,

You have quoted only one part of my message which doesn't capture the entire concept.

CAs that mis-issue and must revoke these mis-issued certificates, already violated the BRs. Delaying revocation for more than what the BRs require, is also a violation. There was never doubt about that. I never proposed that "extended revocation" would somehow "not be considered a BR violation" or "make it legal".

I tried to highlight in this discussion that there were real cases in m.d.s.p. where the revocation was delayed in practice. However, the circumstances of these extended revocations remain unclear. Yet, the community didn't ask for more details. Seeing this repeated, was the reason I suggested that more disclosure is necessary for CAs that require more time to revoke than the BRs require. At the very minimum, it would help the community understand in more detail the circumstances why a CA asks for more time to revoke.


I think Jakob made an accurate summary.


Dimitris.



On 4/12/2018 8:00 PM, Fotis Loukos via dev-security-policy wrote:

Hello,

On 4/12/18 4:30 PM, Jakob Bohm via dev-security-policy wrote:

Hello to you too.

It seems that you are both misunderstanding what the proposal was.

The proposal was apparently to further restrict the ability of CAs to
make exceptions on their own, by requiring all such exceptions to go
through the public forums where the root programs can challenge or even
deny a proposed exception, after hearing the case by case arguments for
why an exception should be granted.

Can you please point me to the exact place where this is mentioned?

The initial proposal is the following:

Mandating that CAs disclose revocation situations that exceed the 5-day
requirement with some risk analysis information, might be a good place
to start.

I see nothing related to public discussion and root programs challenging
or denying the proposed exception.

In a follow-up email, Dimitris mentions the following:

The reason for requiring disclosure is meant as a first step for
understanding what's happening in reality and collect some meaningful
data by policy. [...] If, for example, m.d.s.p. receives 10 or 20
revocation exception cases within a 12-month period and none of them is
convincing to the community and module owners to justify the exception,
the policy can be updated with clear rules about the risk of distrust if
the revocation doesn't happen within 5 days.

In this proposal it is clear that the CA will *disclose* and not ask for
permission for extending the 24h/5 day period, and furthermore he
accepts the fact that these exceptions may not be later accepted by the
community, which may lead to changing the policy.

A better example would be that if someone broke their leg for some
reason, and therefore wants to delay payment of a debt by a short while,
they should be able to ask for it, and the request would be considered
on its merits, not based on a hard-nosed principle of never granting any
extensions.

I think that the proper analogy is if someone broke their leg, and
therefore wants to delay payment of a bank debt, he should be able to
delay it without notifying the bank in time, but after he has decided
that he is fine and he can walk, he can go to the bank and explain them
why he delayed the payment. I do not consider this a good practice.

Now because CAs making exceptions can be technically considered against
the letter of the BRs, specifying how exceptions should be reviewed
would constitute an admission by the community that exceptions might be
ok in some cases.  Thus from a purely legalistic perspective it would
constitute a weakening of the rules.  But only if one ignores the
reality that such exceptions currently happen with little or no
oversight.

Please see above, there is no review in the original proposal.

As for doing risk assessments and reporting, no deep thinking and no
special logging of considerations is needed when revoking as quickly
as possible, well within the current 24 hour and 5 day deadlines (as
applicable), which hopefully constitutes the vast majority of revocations.

So, is deep thinking needed in the rest of the cases? If yes, how do you
think that a CA will be able to do this risk assessment and how can root
store operators decide on this within 24h in order to extend this
period? If no, would you trust such a risk assessment?

Regards,
Fotis


On 04/12/2018 11:02, Fotis Loukos wrote:

Hello everybody,
First of all, I would like to note that I am writing as an individual
and my opinion does not necessarily represent the opinion of my employer.

An initial comment is that statements such as "I disagree that CAs are
"doing their best" to comply with the rules." because some CAs are
indeed not doing their best is simply a fallacy in Ryan's argumentation,
the fallacy of composition. Dimitris does not represent all CAs, and I'm
pretty sure that you are aware of this Ryan. Generalizations and the
distinction of two teams, our team (the browsers) and their team (the
CAs), where by default our team are the good guys and their team are
malicious is plain demagoguery. Since you like extreme examples, please
note that generalizations (we don't like a member of a demographic thus
all people from that demographic are bad) have led humanity to
committing atrocities, let's not go down that road, especially since I
know you Ryan and you're definitely not that type of person.

I believe that the arguments presented by Dimitris are simply red
herrings. Whether there is a blackout period, the CA lost internet
connectivity or a 65 character OU does not pose a risk to relying
parties is a form of ignoratio elenchi, a fallacy identified even by
Aristotle thousands of years ago. Using the same deductive reasoning,
someone could argue that if a person was scammed into participating in a
Ponzi scheme and lost all his fortune, he can steal someone else's money.

The true point of the argument is whether CAs should be allowed to break
the BRs based on their own risk analysis. So, what is a certificate?
It's more or less an assertion. And making an assertion is equally
important as revoking it. As Ryan correctly mentioned, if this becomes a
norm, why shouldn't CAs be allowed to make a risk analysis and decide
that they will break the BRs in making the assertion too, effectively
issuing certificates with their own validation methods? Where would this
lead us? Who would be able to trust the WebPKI afterwards? Are we
looking into making it the wild west of the internet?

In addition, do you think that CAs should be audited regarding their
criteria for their risk analysis?

Furthermore, this poses a great risk for the CAs too. If this becomes a
practice, how can CAs be assured that the browsers won't make a risk
analysis and decide that an issuance made in accordance with all the
requirements in the BRs is a misissuance? Until now, we have seen that
browsers have distrusted CAs based on concrete evidence of misissuances.
Do you think Dimitris that they should be allowed to distrust CAs based
on some risk analysis?

Regards,
Fotis


On 30/11/18 6:13 PM, Ryan Sleevi via dev-security-policy wrote:

On Fri, Nov 30, 2018 at 4:24 AM Dimitris Zacharopoulos <ji...@it.auth.gr>
wrote:


On 30/11/2018 1:49 AM, Ryan Sleevi wrote:



On Thu, Nov 29, 2018 at 4:03 PM Dimitris Zacharopoulos via
dev-security-policy <dev-security-policy@lists.mozilla.org> wrote:

I didn't want to hijack the thread so here's a new one.


Times and circumstances change.


You have to demonstrate that.


It's self-proved :-)

This sort of glib reply shows a lack of good-faith effort to meaningfully
engage. It's like forcing the discussion every minute, since, yanno, "times
and circumstances have changed".

I gave you concrete reasons why saying something like this is a
demonstration of a weak and bad-faith argument. If you would like to
meaningfully assert this, you would need to demonstrate what circumstances
have changed in such a way as to warrant a rediscussion of something that
gets 'relitigated' regularly - and, in fact, was something discussed in the
CA/Browser Forum for the past two years. Just because you're unsatisfied
with the result and now we're in a month that ends in "R" doesn't mean time
and circumstances have changed meaningfully to support the discussion.

Concrete suggestions involved a holistic look at _all_ revocations, since
the discussion of exceptions is relevant to know whether we are discussing
something that is 10%, 1%, .1%, or .00001%. Similarly, having the framework
in place to consistently and objectively measure that helps us assess
whether any proposals for exceptions would change that "1%" from being
exceptional to seeing "10%" or "100%" being claimed as exceptional under
some new regime.

In the absence of that, it's an abusive and harmful act.

I already mentioned that this is separate from the incident report (of the
actual mis-issuance). We have repeatedly seen post-mortems that say that
for some specific cases (not all of them), the revocation of certificates
will require more time.

No. We've seen the claim it will require more time, frequently without
evidence. However, I do think you're not understanding - there is nothing
preventing CAs from sharing details, for all revocations they do, about the
factors they considered, and the 'exceptional' cases to the customers,
without requiring any BR violations (of the 24 hour / 5 day rule). That CAs
don't do this only undermines any validity of the argument you are making.

There is zero legitimate reason to normalize aberrant behaviour.

Even the underscore revocation deadline creates problems for some large
organizations as Jeremy pointed out. I understand the compatibility
argument and CAs are doing their best to comply with the rules but you are
advocating there should be no exceptions and you say that without having
looked at specific evidence that would be provided by CAs asking for
exceptions. You would rather have Relying Parties lose their internet
services from one of the Fortune 500 companies. As a Relying Party myself,
I would hate it if I couldn't connect to my favorite online e-shop or bank
or webmail. So I'm still confused about which Relying Party we are trying
to help/protect by requiring the immediate revocation of a Certificate that
has 65 characters in the OU field.

I also see your point that "if we start making exceptions..." it's too
risky. I'm just suggesting that there should be some tolerance for extended
revocations (to help with collecting more information) which doesn't
necessarily mean that we are dealing with a "bad" CA. I trust the Mozilla
module owner's judgement to balance that. If the community believes that
this problem is already solved, I'm happy with that :)

The argument being made here is as odious as saying "We should have one day
where all crime is legal, including murder" or "Those who knowingly buy
stolen goods should be able to keep them, because they're using them".

I disagree that CAs are "doing their best" to comply with the rules. The
post-mortems continually show a lack of applied best practice. DigiCert's
example is, I think, a good one - because I do not believe it's reasonable
for DigiCert to have argued that there was ambiguity, given that prior to
the ballot, it was agreed they were forbidden, a ballot to explicitly
permit them failed, and the discussion of that ballot explicitly cited why
they weren't valid. From that, several non-DigiCert CAs took steps to
migrate their customers and cease issuance. As such, you cannot reasonably
argue DigiCert was doing "their best", unless you're willing to accept that
DigiCert's best is, in fact, far lower than the industry norm.

The framing about "Think about harm to the Subscriber" is, again, one that
is actively harmful, and, as coming from a CA, somewhat offensive, because
it shows a difference in perspective that further emphasizes why CA's
judgement cannot be trusted. In this regard, we're in agreement that the
certificates we're discussing are clearly misissued - the CA was never
authorized to have issued that certificate, and thus the Subscriber has
obtained it illegitimately. Regardless of whether the fault was their own
or not, the CA has "stolen", if you will, from the public bank of trust and
compatibility, that certificate, and then sold it to the Subscriber.

The arguments for why that should be OK have basically boiled into some
segment of trying to figure out whether the victims "deserved" it (that is,
was the car stolen from a church-going grandma or from a violent criminal)
and whether the buyer of the illicit goods really needs it ("They have very
important meetings"). To continue the argument-through-analogy (or, more
aptly, to highlight why I find the underlying argument offensive), it's like
saying it's OK to speed if you have a really important meeting to get to.
This is not about "medical emergencies" as a justification for speeding -
the situation here is entirely predictable to the Subscriber and entirely
under their control. In the Subscriber Agreement, every single Subscriber
agrees to and acknowledges that their CA will revoke if the CA screws up.
Thus, every single Subscriber needs to be prepared. The argument that "They
didn't know" or "They couldn't predict" is demonstrably and factually false.

How do CAs provide this? For *all* revocations, provide meaningful data. I
do not see there being any value to discussing further extensions until we
have systemic transparency in place, and I do not see any good coming from
trying to change at the same time as placing that systemic transparency in
place, because there’s no way to measure the (negative) impact such change
would have.


I don't see how data and evidence for "all revocations" somehow makes
things better, unless I misunderstood your proposal. It's not a balanced
request. It would be a huge effort for CAs to write risk assessment reports
for each revocation. Why not focus on the rare cases which justify the
extra effort from CAs to write a disclosure letter requesting more days for
revocation? Why not add some rules on what's the minimum information that's
expected for these cases? If you want this to be part of the incident
report, that's fine.

As explained above, the core to the assertion being made here is that a
system of extended revocation is only usable for "exceptional" situations.
But we clearly know that everyone has an incentive to claim their situation
is exceptional. Without a structured analysis, before any changes, about
the nature of revocations, no one can assess whether this is .0001% of
revocations or 100% of revocations. Thus, this is an absolute and
non-negotiable pre-condition for such discussions about exceptional
situations.

The systemic transparency you are asking for, as I understand it, would be
m.d.s.p. We already see incident reports being published here. CAs who seek
more than 5 days for revoking affected certificates would disclose more
details about the specifics of these revocations.

Your failure to plan does not make an emergency. The idea that the
community should have only 5 days to discuss. As we've seen during
"exceptional" discussions, Subscribers and CAs tend to assume that the
exception will be granted, and thus fail to take steps to prepare for it
being rejected. So, in effect, not only is the argument "There should be a
discussion" but "All representations from CAs and Subscribers should be
deemed as valid", despite there being ample evidence that such approaches
fundamentally and critically weaken security. That's extremely naive to
think "times and circumstances change".

CAs are evaluated using schemes based on Risk Management.

That is a problem with the schemes, and why significant effort is being
placed to improve those schemes. "The status quo is bad, so what's the harm
in making things worse" is not a compelling narrative.

There is no zero risk. It's like saying there is 100% security. You can
add controls to minimize risk to acceptable levels. Even when mitigations
are added, you have residual risk. However, layered controls and
compensating controls help to avoid disasters. I just don't believe it's
black or white and I think the module owners probably agree with that
statement (
https://groups.google.com/d/msg/mozilla.dev.security.policy/tbSkcGHg1kA/CkrM6taBAwAJ).
If that was the case, every single BR violation or Root Policy violation
would be treated as a trigger for a complete distrust.

Every single BR violation and Root Policy Violation is absolutely a
consideration for complete distrust. Whether or not it triggers complete
distrust is based on weighing the impact of that distrust. We absolutely
want to move to a world where BR violations are exceptions, not the rules.
Your proposal for how to do that is to make sure things aren't BR
violations. That would certainly solve the problem, by making the ecosystem
less reliable and trustworthy. My proposal is that CAs need to do better,
and their failure to adequately inform Subscribers of their Subscriber
Agreements does not a community problem make.

Go read
https://zakird.com/papers/zlint.pdf to see a systemic, thorough, analysis
that supports what I described to you, and disagrees with your framing. We
know what the warning signs are - and it’s continued framing of “low” risk
that collectively presents “severe” risk.


I wasn't aware of that paper, it contains valuable information, thank you
for sharing. Notice the abstract that says "We find that the number of
errors has drastically reduced since 2012. In 2017, only 0.02% of
certificates have errors". To me, this is a positive indicator that the
ecosystem is continuously improving.

Yes. Because CAs are being distrusted.

I have listened to this argument before but unfortunately it leads
nowhere. How badly are we interested in interop to justify being "the bad
guys" and how "disruptive" will our actions be for Relying Parties? It is a
very difficult problem to solve but the ecosystem has made progress:
- disclosure of intermediate CA Certificates
- identifying and fixing problematic OCSP responders
- increased supervision to the issued certificates with CT and linters
providing public information about mis-issuances
- browsers enforcing BR requirements with code (e.g. certificate validity
duration)

With these controls in place, CAs are very much obligated to follow the
rules or face the consequences. Browsers use telemetry to detect violations
of the standards and create plans on addressing those issues. These plans
usually include discussions in m.d.s.p. or the CA/B Forum in order for the
CAs to participate and create the necessary rules -along with the browsers-
to address these incompatibilities.

This is a fundamentally disgusting framing. And I do mean it with that
extreme and emotive word, because it's disgusting to suggest that enforcing
contracts and norms is the "bad guys", and is an appeal to the listeners of
this argument to try to place yourself as somehow arguing for the "good"
guys - to use the earlier analogy, to try and suggest you're Robin Hood
rather than Enron.

CAs have a critical and fundamental role in issuing certificates. They
choose whether or not to violate the BRs. Every Subscriber agrees to a
contract that acknowledges that if the CA screws up, their certificate will
be revoked. Now, on the basis that some people (typically, large
corporations) haven't really read their contract, or are convinced that
somehow it's unfair to actually follow the thing they agreed to, they would
like to renegotiate the contract when it no longer benefits them.

We have seen, over the past two decades, incredible harm from not ensuring
these are consistently followed and applied. We've seen, over the past two
decades, incredibly poor judgement. We have not seen any data to suggest
things have improved - rather, we've seen some of the worst offenders
removed as CAs. Even if the mean has gone up, the median and mode have
remained unchanged. That's not improvement.

It’s a perfect example of why your argument DOESN’T work. As Mozilla has
shared in the CA/B Forum, people don’t fix their site - they blame the
browser, and keep on with the brokenness. Firefox is the one having to
change to “accommodate” that.


Or, they might blame the CA for providing them a "thing" that doesn't work
with all major browsers :)

Or the world might end tomorrow. I told you something that exactly is
happening, and you respond with a hypothetical that isn't. That's not as
insightful as the :) may be trying to capture.

This statement underestimates the reflexes of the Root programs. The
reason for requiring disclosure is meant as a first step for understanding
what's happening in reality and collect some meaningful data by policy.
Once Mozilla collects enough information to make a safe estimation, the
policy can be updated to allow or forbid certain situations. If, for
example, m.d.s.p. receives 10 or 20 revocation exception cases within a
12-month period and none of them is convincing to the community and module
owners to justify the exception, the policy can be updated with clear rules
about the risk of distrust if the revocation doesn't happen within 5 days.
That would be a simple, clear rule. Does Mozilla have the information to
make such an aggressive rule change today? Maybe.

This position presumes the argument is valid, which I've tried to show why
it isn't. It further tries to say that the best thing to do is accept the
harm your proposal would cause - which I've shown based on repeated
real-world application of the principles you propose to use - and then
re-evaluate it. Here's a better solution: Don't accept the harm. There's no
reason to hold a Purge to see "if it works out". There's no reason to allow
rampant theft, to see if people are happier once they get what their heart
most desires. That's negligent and irresponsible, and that's what's being
proposed here.

That's an extreme and emotive take - but that's because these arguments are
by no means new, they're ones that have been discussed for years. Even when
wrapped up as a "thinking about how to help" package, they're still
fundamentally flawed and based on a lack of consideration or analysis of
how things have worked or are working. Regardless of good intent, much like
Swift's "Modest Proposal" was rather heinous, the proposal here is
fundamentally flawed in a way that will cause real and lasting harm. A more
negative read would suggest it's an attempt to move the Overton Window of
discourse, by suggesting that somehow browsers are "the bad guys" for
requiring that CAs do what they say they'll do as a condition of trust.
We're not "the bad guys" for pointing out deceptive practices and holding
the bearers of keys to the Internet accountable for what they said and did.

I disagree that we’ve seen systemic improvements as a whole. There are a
few CAs trying to do better, but the incident reporting of today clearly
shows exactly what I’m saying - that the industry has not actually matured
as you suggest. What has changed has largely been driven by those outside
CAs - whether those who were wanting to become CAs (Amazon with certlint)
or those analyzing CAs’ failures (ZLint).


If we truly care about the ecosystem, it doesn't really matter where the
systemic improvements come from. CAs and Browsers have contributed in the
Network Security Guidelines, the BRs (to improve and limit validation
methods, add CAA and so much more). I agree we should expect every CA to
develop tools or use existing ones to ensure they are complying with all
rules. We occasionally see some exceptions and this is evaluated on a
case-by-case basis. "Accidents" and mistakes do happen and as it has been
discussed in the past, it's collective failures that pose the greatest risk
and we have seen hard decisions being made to minimize or eliminate these
risks.

I would believe we'd seen systemic improvements once we saw legacy CAs no
longer believing they had "exceptional" situations where they did not do
what they say they would do (correct issuance), do not want to do what they
said they would do in those situations (revoke), and somehow want to
present it as Subscribers not knowing what would happen (when it's baked
right into their contract).

I also protest against the "grossly negligent and irresponsible" part and
I'm afraid statements like that alienate people from participating and
proposing anything. Simply disagreeing would ultimately have the same
effect in this conversation. You have already provided good arguments
against my proposal for people to evaluate.

That's the thing, though. Much like proposals to hold the purge, eat
babies, or legalize theft, there are some arguments that are so deeply
odious, that whether presented in jest or good faith, are actively harmful
and damaging. I don't use this lightly to shut down conversation, but as a
long-standing participant in the Forum and the list, you know that the
ideas you're proposing here are nothing new and have been debated for
years. The community knows and can see the consequences of the principles
you propose - with real harm and real potential for loss of life or serious
disruption - and so this isn't "Hey, I had some new idea to share," but an
idea that has been thoroughly explored and completely rejected.

I've been trying to capture more productive ways forward to countenance
such a re-discussion - and I think the core of it goes to the claim that
"exceptional" situations have incredible cost, to all participants, and so
concrete data is needed. If CAs find it expensive to analyze their
revocations, the reasons, and the challenges, then it suggests that supplying
"exceptional" access to their customers / industry is not, in fact, a
priority they're willing to meaningfully engage on.


Enjoy

Jakob

_______________________________________________
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy

