On Mon, Mar 23, 2020 at 03:01:34PM +0000, Jeremy Rowley wrote: > Ryan's post was the part I thought was relevant, but I understood it > differently. The cert was issued, but we should have now revoked it (24 > hours after receiving notice). I do see your interpretation though, and > the language does support 24 hours after issuing the new cert.
Aha, righto. Glad we've gotten on the same page there. > What I need is a tool that scans after revocation to ensure there are no > additional certs with the same key. I can give you a certwatch SQL query that'll do that, if you like -- "show me all certs with this SPKI (or set of SPKIs) which aren't expired or OCSP-revoked". I use pretty much that query to get periodic reports of new certificates that have appeared with keys already in the dungeon. It's not ideal for your purposes, though -- you might get some false positives because your OCSP responders aren't up-to-date, and false negatives are possible if certwatch is backlogged (or a cert wasn't logged). Beyond that, though, if your internal certificate archive isn't indexed on SPKI fingerprint, and updated in near-real-time, those are problems that I think you'll have to fix, because the Internet's propensity to post their private keys on the Internet, and then reuse them to get new certs after the old one got revoked for key compromise, does not seem to be one that is going away any time soon. > The frustration is that this was > where the cert was issued after our scan of all keys but just before > revocation. As a side note, our system blacklists the keys when a cert is > revoked for key compromise, which means I don't have a way to blacklist a > key before a cert is ever issued. To the software developers! *blows trumpets* > >> I don't think that supports your point, though, so I wonder if I've got > >> the wrong part. That last part of Ryan's: "shenanigans, such as CAs > >> arguing the[y?] require per-cert evidence rather than systemic > >> demonstrations", seems to me like it's describing your statement, > >> above, that you (only?) "need to revoke a cert that is key compromised > >> once we're the key is compromised *for that cert*" (emphasis added). I > >> don't read Ryan's use of "shenanigans" as approving of that sort of > >> thing. 
> > I don't think its shenanigans, but I do think it's a pretty common > interpretation. Such information would help determine the common > interpretation of this section. I agree that CAs should scan all certs > for the same key and revoke where they find them. Is that actually > happening? A lot to unpack here, let me make up some specific questions and answer them as best I can. "Have any other CAs failed to revoke certificates issued for keys for which they had previously revoked a certificate for key compromise?" Yes, one CA has failed to do so, and I've reported that to this list. "Have any other CAs successfully revoked a certificate within the BR-mandated (am I OK using that phrase now?) time period, that they issued for a private key for which they had previously revoked a certificate due to key compromise?" I don't know, I haven't checked (yet). It's on my (lengthy) list of Interesting Questions For Which I Need To Write Insanely Complicated SQL Queries. "Have any other CAs blacklisted a private key that was reported as compromised and prevented issuance before they had issued a certificate for that key?" Naturally I can't answer that one directly, because I don't have internal access to CA systems. *But*, I have one test case, in which a private key known to have "hopped" between CAs after revocation was reported to a number of CAs proactively, but there hasn't been a resolution to that test case one way or the other. No new certs have been issued for the same name, with the compromised key or any other, so it's impossible to infer what the outcome may be. Other very specific questions welcomed. The answers will probably be "I haven't looked yet", but there are a lot of questions I've got at least a vague idea of how to answer, once I have time2SQL. > Do other CAs object to there being a lack of specificity if you give the > keys without a cert attached? 
Since I have been sending (links to) CSR format compromise attestations, no CA has communicated an objection to the format of my reports, nor have they failed to act (in some fashion) on any of my reports, as far as I am aware. > >> Bim bam bom, all done and dusted, and we can get back to washing our > >> hands. > > That you're *not* doing that is perplexing, and a little disconcerting. My wife is immuno-suppressed -- about the only time I'm not washing my hands is when I'm typing at my keyboard (the suds get in the switches and cause all sorts of problems). If I could get a reliable supply of sanitizer, I'd be swimming in the stuff. > That's an oversimplification of the incident report process. I'm not > resisting the incident report itself since incident reports are a good and > healthy way for the public to have transparency into a CAs operation. In > fact, I wouldn't mind filing one just to give public documentation on what > DigiCert is doing for revocation. I was objecting to actually calling > this an incident/breach of the guidelines since I disagreed with that, and > there's a trend where the interpretation of these sections seems to evolve > over time. My main emphasis was that the guidelines are ambiguous and > need refinement. I'd support refining them to be more clear, but calling > something shenanigans when the language is unclear is unfair. I can see your point there -- there's no denying that the BR language can be misinterpreted. Given a suitably motivated reader, of course, *anything* can be creatively misinterpreted, but that use of "obtains evidence" is undoubtedly tricky. From what Ryan said in the previous thread, my impression is that attempts were made to improve the language, but were blocked. Unfortunately, in that sense, you're beholden to the most reactionary majority of your fellow CAs in improving the clarity of the language in the BRs.
I suppose these sorts of enduring ambiguity are part of the reason why all CAs are expected to follow this list -- you could think of it as the "judiciary branch", interpreting the "laws" and providing precedent. > For the SPKI, you're right. The slow part is downloading it from your > website. I guess I should have said "a link to the key compromise proof" > rather than the SPKI. Just for clarity, was it the HTTP requests themselves that were slow (which is something I want to know about, and fix) or was it that your front-line people aren't tooled up for ad-hoc automation? (To be clear, I'm not criticising you for having front-line people who don't know shell scripting, just trying to clarify the problem) > 3) Have the tool process the revocation at the 24 hour mark. You'll want to factor in the delays in getting the message to your OCSP responders and CRLs. I don't think that arguing that "publishing" doesn't need to include actually making the revocation public would get very far. > >> Even if Digicert would just prefer that I did something differently in > >> my problem reports in the future -- something that would make it > >> legitimately easier for Digicert to process my reports, without making > >> my life a misery (after all, y'all have a lot more resources than I do) > >> -- all you have to do is ask. My e-mail address is right there in > >> every problem report I send. > > Nah - although I'd prefer just to get the CSRs in a zip file, I can work > with what you send us. Well, there shouldn't be any more mass key compromise notifications coming through, at least not of the scale of that last round, so "CSRs in a zip" shouldn't be necessary. I'm intending to get to the point in the near future where the notifications go out one by one, as soon as the key is discovered to be compromised. > The reason I brought up the format is that I'm not sure when the 24 hour > clock actually kicks off for revocation. 
Is it 24 hours after we get your
> email or 24 hours after we can confirm key compromise? I've always
> regarded these as a certificate problem report under 4.9.5 (requiring us
> to kick off the investigation) but then revocation should happen 24 hours
> after we have reason to suspect the key compromise is legit.

It's surprising that you say that, because to me, 4.9.5 isn't nearly as ambiguous as the relevant point in 4.9.1.1. 4.9.5 says "The period from receipt of the Certificate Problem Report or revocation-related notice to published revocation". While "receipt" could, I suppose, still be argued as meaning "when we checked the mailbox", I don't think it's reasonable to claim that "receipt" means "when we validated the CSR" -- especially since it's "receipt of the Certificate Problem Report", not "receipt of concrete evidence" or anything like that.

At any rate, for the specific issue we're discussing -- failure to revoke a cert for a previously-reported compromised private key -- I don't think 4.9.5 really matters. 4.9.1.1 says the CA has to revoke within 24 hours of "obtaining evidence of key compromise". For a certificate issued after a report of key compromise has been sent to the CA, the moment at which it can be said that the CA has "obtained evidence of key compromise" *for that specific certificate*, depending on your interpretation, could be:

* the time at which the problem report which initially reported the compromised key landed on the CA's mail server (which is the last moment that an external observer can know what's going on);

* the time at which the CA validates that the evidence is valid;

* the time at which the new certificate using the compromised private key is issued; or

* when someone says "hey, you issued another cert for that key I mentioned before!"

(If you think there's another one, please add it. Again, remember, this is *only* for the case of a certificate issued after an initial key compromise report is received by the CA.)
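To make the gap between those readings concrete, here is a rough Python sketch that computes the 24-hour revocation deadline under each of the four candidate moments. It is only an illustration: the timestamps, the event names and the `revocation_deadlines` helper are all made up for the example, and none of it reflects any real report or any CA's actual tooling.

```python
from datetime import datetime, timedelta, timezone

# The 24-hour window from BR 4.9.1.1 for key compromise.
REVOCATION_WINDOW = timedelta(hours=24)


def revocation_deadlines(events):
    """Map each candidate "evidence obtained" moment to its 24-hour deadline."""
    return {name: moment + REVOCATION_WINDOW for name, moment in events.items()}


# Hypothetical timeline for a cert issued *after* the key was first reported.
# Every timestamp below is invented purely for illustration.
utc = timezone.utc
events = {
    "report_received": datetime(2020, 3, 1, 9, 0, tzinfo=utc),
    "evidence_validated": datetime(2020, 3, 1, 15, 0, tzinfo=utc),
    "new_cert_issued": datetime(2020, 3, 10, 12, 0, tzinfo=utc),
    "reissue_reported": datetime(2020, 3, 20, 8, 0, tzinfo=utc),
}

deadlines = revocation_deadlines(events)
for name, deadline in deadlines.items():
    print(f"{name:20s} revoke by {deadline.isoformat()}")
```

With a timeline like this one, the deadlines under the first and last readings end up weeks apart, which is roughly the size of the disagreement.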
The last of those moments is what it looks (to me) like you were relying on, but which Ryan's statement seems to describe as "shenanigans". It implies to me that CAs have no memory of the past when it comes to key compromises, which isn't a situation that is reasonable, to my mind. The same "obtains evidence" phrasing is used in the next point (regarding domain control validation that cannot be relied upon), and the idea that a CA would argue that they could legitimately issue a certificate for a name which they had previously accepted had not been properly validated is, I hope you'd agree, disconcerting in the extreme. > To be clear, I'm not complaining about the format. I'm wondering when we > obtained the private key for the 24 hour purposes. With automation, the > time between when we get the email and when we confirm key compromise > should be nearly zero. However, with a more manual process, that time is > not insignificant. What I don't like about the interpretation that the > revocation event is 24 hours from when we get an email is some emails are > very vague about key compromise. With that reading, if we get an email > without proof that is later followed up by proof, the 24 hour period could > start when we get the initial email even if the proof is provided 25 hours > later. Well, if that were the case, then within 24 hours of the receipt of the problem report you should have provided a preliminary report which says "evidence is insufficient", and thus that problem report could reasonably be considered to have been handled. The "followup" e-mail that did contain useful evidence would reasonably be considered a "new" problem report. 
Even when the useful evidence is provided after the initial e-mail, but before you put out the preliminary compromise report, if the initial e-mail didn't contain anything actionable I doubt that a root store is going to class it as an incident if you don't revoke within 24 hours of the initial e-mail -- as long as you got it done within 24 hours of when the actual, useful evidence landed on your MTA. Certainly, if someone came here and tried to drop a CA in it with the above set of facts, I'd be in the front row of people saying "pull the other one, it plays jingle bells". > That does happen, which is why I think the time period should be > 24 hours from after the CA receives proof of key compromise. But even > that is ambiguous. When did we receive proof of key compromise? I'd say > it's when all the CSRs finished downloading. You could have started processing revocations for the first CSRs downloaded -- waiting for the last CSR to download before starting to process the first is unlikely to be a valid argument for delayed revocation. It would almost certainly be worse for you, though, if it *was* a valid argument, because the solution to that, from my perspective, would be to send 800-odd separate certificate problem reports, each with one key. I can't imagine a situation in which that pile of e-mails would be *easier* to process than one big list of compromised keys, but it would prevent a CA from arguing that it took a long time to download all the CSRs. Claiming that you didn't have enough staff to process that volume of key compromise reports would be unlikely to be accepted, because if "lack of staff" were seen as a valid excuse it would encourage CAs to understaff their problem report processing, which would be Extremely Terribad. > If that's not the case, then you are encouraging CAs to be myopic in the > way they accept key compromise information.
Yes, it is possible that being strict on revocation deadlines might cause some CAs to react by trying to reject valid-but-tricky certificate problem reports. There's all *sorts* of reasons why CAs might react in ways that the root stores (and the surrounding community) would find problematic. At the end of the day, the ultimate arbiters of the reasonableness of those behaviours are the root stores.

As I have already done once before, if a CA rejected one of my problem reports for what I felt were spurious reasons, I'd check with Mozilla and the community, via this list, to see if my expectations were unreasonable. If the response was "your problem report was reasonable", then I'd lay out the full details of the case, and the CA gets to explain their behaviour, and root stores get to act as they see fit. If, on the other hand, the response is "stop being a goose, Matt" then I don't make a report, and I do something else in the future.

- Matt

_______________________________________________
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy