Re: when do things really need to be revoked? who decides?

Wayne Thu, 30 May 2024 07:34:24 -0700

In the recent delayed revocation incidents, the main barrier to 
replacing a certificate has been deployment. I've not heard of validation 
being an issue as yet, but it may just not have been mentioned.


On Thursday, May 30, 2024 at 6:49:04 AM UTC+1 Suchan Seo wrote:

> I wonder what makes certificate replacement slow and unwanted: is it the 
> validation step, or deploying the new certificate everywhere the old 
> certificate was?
> OV/EV-related validations are valid for 398 days per 3.2.2.14.3, so most 
> of the revalidation should be about validating domains.
>
> To simplify the latter part, there could be an OCSP extension that points 
> to another certificate (one for the same SKID/public key) and says that, 
> while this certificate itself is revoked, there is a replacement that is 
> likely to be valid. This would in effect skip the certificate deployment 
> process and reduce replacement to a single email to the webmaster to 
> authorize the replacement certificate.
>
> On Tuesday, May 21, 2024 at 9:46:00 AM UTC+9 Mike Shaver wrote:
>
>> DELAYED REVOCATION IS TOO COMMON
>>
>> This is long enough, so I’ll spare readers dozens of links to 
>> delayed-revocation incidents collected in Bugzilla; we all know that pretty 
>> much any other incident that involves misissuance will come with a 
>> delayed-revocation chaser these days.
>>
>> In *many* of those delrev (?) incidents, we see a phrase like “we 
>> requested that our subscribers revoke and reissue”. They are not informing 
>> their subscribers as to a fixed revocation timeline, but rather simply 
>> asking those subscribers if they would please do the revocation process 
>> when they’re able. In one case, I heard of a revocation request from a 
>> major CA that didn’t even have a timeline *suggested*. Of course, the 
>> subscriber gets no value out of replacing their certs: it’s pure overhead, 
>> and if WebPKI were operated perfectly, it would never be necessary. This is 
>> an externality of, most often, a CA’s failure to sufficiently invest in 
>> understanding, implementing, and verifying the processes that they use to 
>> twirl the keys to the entire web’s security.
>>
>> Indeed in a number of cases the CAs didn’t even stop issuing once they 
>> realized that they were misissuing certs! Intentionally issuing certs that 
>> are known to be bad, what a world.
>>
>> While CAs generally claim that they would be able to handle a mass 
>> revocation incident (such as due to leaked key material), the evidence we 
>> have for CAs aggressively revoking as called for by the BRs and the root 
>> programs is…scant. We’ve seen “it was a long weekend” as a reason for 
>> delaying revocation for certs—including some used by a different part of 
>> the CA’s company! One CA has proposed a “global fire drill” to stress-test 
>> revocation procedures, but we’re seeing revocation timelines reaching 
>> multiple months right now, so…a lot of stuff would end up burning in that 
>> fire.
>>
>> CAs also tell us that they advocate and recommend for their subscribers 
>> to implement automation for cert management, but we never see any concrete 
>> targets or success criteria for those efforts, so they certainly seem to me 
>> to just be more “asking nicely”. (I’m not sure that all of the CAs claiming 
>> to be pushing for subscriber automation actually have robust ACME or 
>> similar support yet, in fact.)
>>
>> (Some of the CAs made explicit promises years ago to not delay 
>> revocation, some of them issued even though they knew that zlint showed an 
>> error—there are lots of additional twists on simply “issuing bad certs and 
>> not cleaning them up as agreed”.)
>>
>> Now, in the wake of these *many* delrev incidents, over years of history, 
>> the root programs have responded with pretty much no consequences 
>> whatsoever as far as I can tell. There’s one case open about Entrust’s 
>> overall behaviour, who are certainly over-achieving when it comes to ways 
>> to get location fields wrong, but they are definitely not the only ones who 
>> treat the BRs’ 1/5-day revocation instruction as instead meaning “when it’s 
>> convenient for the customer”.
>>
>> THE QUESTION
>>
>> So: what should be done to make revocations of misissued 
>> certificates—sometimes *intentionally* misissued certificates—as prompt as 
>> the BRs require?
>>
>> The cost equation for CAs is obviously skewed against the health of the web 
>> PKI, if we are to believe that the BRs are important. Once a CA has 
>> violated the BRs and misissued, it is *in their commercial interest* to not 
>> revoke promptly: it causes embarrassment and subscriber frustration, or 
>> even disruption to subscriber services. At the limit it might even lead a 
>> subscriber to change CAs if the reissuance events are frequent and 
>> disruptive enough.
>>
>> On the other hand, the more bad certs there are floating around, even if 
>> it’s “only” a matter of a case mismatch, the less interoperable the web PKI 
>> is, and the harder it is for a relying party to make effective use of 
>> WebPKI’s guarantees. Let’s please not end up with a “quirks mode” for TLS 
>> certificate handling!
>>
>> SOME OPTIONS
>>
>> One option: decide that there really are some BR violations that “don’t 
>> matter”, such that revocation can happen on a more relaxed, accommodating 
>> timeline—or perhaps not at all, just letting them expire as has been seen 
>> in some delrev incidents already. This would mean that we would still see 
>> incident reports that in theory help other CAs learn to put the postal code 
>> in the right field or similar, but subscribers and CAs and root programs 
>> would have to do less work.
>>
>> Another option: have affected certificates added to OneCRL after 72 
>> hours. It would benefit from some automation, but it’s probably feasible to 
>> make relatively smooth. It is sometimes the case, worryingly, that it takes 
>> CAs a fair bit of time and multiple attempts to find all the affected 
>> certificates, so this might require some linter running off CT logs or 
>> similar as a watchdog.
>>
>> Another another option: forbid CAs from selling WebPKI certificates into 
>> environments where a) revocation within a 5-day limit is operationally 
>> infeasible, and b) disruption of the related services would cause risk to 
>> human health and safety or similar. There are apparently many organizations 
>> out there which are critical to national economies or whatever, but need 
>> literal Earth months to replace a certificate. These are clearly cases 
>> where the requirements of WebPKI are incompatible with the operational 
>> constraints of the subscriber, so it’s not a good idea to mix them. (I’m 
>> sure some CAs could offer help with private PKI systems, probably with 
>> compelling margins.)
>>
>> Yet another, this time somewhat more preventative: if a CA repeatedly 
>> demonstrates that they are unable or (always the case?) unwilling to honour 
>> their commitments to the BRs, impose validity length restrictions on certs 
>> that they issue. At least in that case future misissued certificates would 
>> not be in the wild for as long, and it would also show nicely that CAs’ advocacy 
>> for certificate automation was fruitful. Ignoring Entrust’s diatribe 
>> against 90-day validity periods in that weird blog post, I don’t think that 
>> any CA has made a credible case that their customers would not be able to 
>> handle rotating certificates every 90 days, even if they have to carve the 
>> new fingerprint into a mountain using a toothbrush or whatever. They’d even 
>> know it’s coming.
>>
>> One more: make delayed revocation incidents, specifically, more visible 
>> to subscribers and potential subscribers, and see if business pressure does 
>> what merely “agreeing legally to follow the BRs” (and optionally making 
>> empty “it’ll never happen again” promises) has been unable to do in too 
>> many cases.
>>
>> THANKS FOR READING
>>
>> I think the WebPKI is being poorly served by the *realities* of 
>> certificate integrity and misissuance responses. If nothing else, it’s 
>> causing a ton of delrev incidents for Ben to have to shepherd, without even 
>> module peers to assist him.
>>
>> Something needs to change.
>>
>
