Thank you. The answers to your questions are below.

On 04.12.2018 00:47, Jakob Bohm via dev-security-policy wrote:
On 03/12/2018 12:06, Wojciech Trapczyński wrote:

Please find our incident report below.

This post links to https://bugzilla.mozilla.org/show_bug.cgi?id=1511459.

---

1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

10.11.2018 10:10 (UTC±00:00) – We received a notification from our internal monitoring system concerning issues with publishing CRLs.

2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

(All times in UTC±00:00)

10.11.2018 10:10 – We received a notification from our internal monitoring system for issuing certificates and CRLs concerning issues with publishing CRLs. We started verification.
10.11.2018 12:00 – We established that one of about 50 CRLs had a corrupted digital signature value. We noticed that this CRL was much larger than the others. We verified that over 30 000 certificates had been added to this CRL in a short period of time.
10.11.2018 15:30 – We confirmed that the signing module had trouble signing CRLs larger than 1 MB. We started working on it.
10.11.2018 18:00 – We disabled the automatic publication of this CRL. We verified that the other CRLs had correct signatures.
11.11.2018 07:30 – As part of the post-failure verification procedure, we started an inspection of the whole system, including all certificates issued at that time.
11.11.2018 10:00 – We verified that some of the issued certificates had corrupted digital signatures.
11.11.2018 10:40 – We established that one of several signing modules working in parallel was producing corrupted signatures. We turned it off.
11.11.2018 18:00 – We confirmed that the cause of the corrupted certificate signatures was a large CRL, which prevented that signing module from operating correctly afterwards.
11.11.2018 19:30 – We left only one signing module working, which prevented further mis-issuance.
19.11.2018 11:00 – We deployed to production an additional digital signature verification in an external module, outside the signing module.
19.11.2018 21:00 – We deployed to production a new version of the signing module which correctly handles large CRLs.

Question 1: Was there a period during which this issuing CA had no validly signed non-expired CRL due to this incident?
Between 10.11.2018 01:05 (UTC±00:00) and 14.11.2018 07:35 (UTC±00:00) we were serving one CRL with a corrupted signature.
Question 2: How long were ordinary revocations (via CRL) delayed by this incident?
There was no delay in ordinary revocations. All CRLs were generated and published in accordance with the CABF BR.
Question 3: Was Certum's OCSP handling for any issuing or root CA affected by this incident (for example, were any OCSP responses incorrectly signed, were OCSP servers not responding, or were OCSP servers returning outdated revocation data until the large-CRL signing was operational on 2018-11-19 21:00 UTC)?
No, OCSP was not impacted. We were serving correct OCSP responses all the time.
3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

11.11.2018 17:47

4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

355.
The first one: 10.11.2018 01:26:10
The last one: 11.11.2018 17:47:36
All certificates were revoked.

5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

Full list of certificates in attachment.

6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

The main reason for the faulty operation of the signing module was the lack of proper handling of a large CRL, greater than 1 MB. When the signing module received such a large list for signing, it was not able to sign it correctly. In addition, the signing module started to incorrectly sign the objects it received afterwards, i.e. after it had received the large CRL for signature.

Because we were using several signing modules simultaneously when the problem occurred, the problem did not affect all certificates issued at that time. Our analysis shows that the problem affected about 10% of all certificates issued at that time.

We have been using this signing module for the last few years, and at the time of its implementation the tests did not include creating a signature for such a large CRL. None of our CRLs for SSL certificates have exceeded 100 KB so far.
Such a significant increase in the size of one of the CRLs was associated with the mass revocation of certificates by one of our partners (the revocations were due to business reasons). In a short time, almost 30,000 certificates ended up on the CRL, which is extremely rare.

All of the mis-issued certificates were unusable due to their corrupted signatures.

7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

We have deployed a new version of the signing module that correctly signs large CRLs. From now on, we are able to sign CRLs of up to 128 MB. In addition, we have improved the part of the signing module responsible for verification of signatures (at the time of failure it did not work properly).

We have deployed additional verification of certificate and CRL signatures in an external component, in addition to the signing module. This module blocks the issuance of certificates and CRLs that have a corrupted signature.

We have extended the monitoring system tests, which will allow us to detect incorrectly signed certificates or CRLs faster.

Recommended additional precaution for all CAs (not just Certum):

- Ensure that your CRL and revocation processes (including CRL signing, OCSP-response pre-signing, database sizes etc.) can handle the hypothetical extreme of all issued certificates being revoked. For example, if an issuing Intermediary CA is configured to not issue more than 1 million certificates, the associated revocation mechanisms should be configured and tested to handle 1 million revocations.
- Maintain at least one hierarchy of not-publicly-trusted CAs that run on the same platform (or an exact clone in a staging environment) and routinely run such worst case scenarios through it.

Note that the first proposed precaution can be achieved by rolling new Intermediary CAs more often (e.g. every 20000 certs for a 20000 cert CRL signing limit) or by increasing the revocation capacity, at the CA's discretion. Of course, once an Intermediary CA has issued a certain number of certificates, the capacity to revoke them all cannot be denied.

Also note that the worst case scenario is not the performance optimization point. It is OK if running in this mode entails horribly slow performance, as long as it stays within the absolute maximums set by standards, BRs, CPS etc. For example, OCSP responders might start taking seconds to return each response and the CRL download webserver might slow to a crawl. The ability to sign new CRLs may slow to one every 23 hours and 59 minutes. This is why running these tests on non-production hardware (with no physical security) is probably a smart choice.
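As a rough illustration of the first precaution, the worst-case CRL size can be estimated from the issuance cap and compared with the signing limit. This is only a back-of-the-envelope sketch: the per-entry size and fixed overhead are assumptions picked for illustration (real values depend on serial number length and per-entry extensions), and the function names are hypothetical, not part of any CA software.

```python
# Back-of-the-envelope worst-case CRL size check.
# Both constants are ASSUMPTIONS for illustration, not measured values.
BYTES_PER_ENTRY = 40    # assumed: serial number + revocation date + DER framing
FIXED_OVERHEAD = 1_000  # assumed: issuer name, dates, extensions, signature


def estimated_crl_bytes(revoked_count: int) -> int:
    """Approximate encoded size of a CRL listing `revoked_count` certificates."""
    return FIXED_OVERHEAD + revoked_count * BYTES_PER_ENTRY


def fits_signing_limit(max_issuable: int, signing_limit_bytes: int) -> bool:
    """True if a CRL revoking every issuable certificate stays within the limit."""
    return estimated_crl_bytes(max_issuable) <= signing_limit_bytes


if __name__ == "__main__":
    one_mb = 1024 * 1024
    for cap, limit in ((30_000, one_mb), (1_000_000, 128 * one_mb)):
        size = estimated_crl_bytes(cap)
        verdict = "fits" if fits_signing_limit(cap, limit) else "exceeds"
        print(f"{cap} revocations: about {size} bytes, {verdict} a {limit}-byte limit")
```

Under these assumed numbers, about 30,000 entries already exceed a 1 MB limit, while 1 million entries stay well below 128 MB. An estimate like this does not replace actually running the worst case through a staging hierarchy, as suggested above.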
_______________________________________________ dev-security-policy mailing list dev-security-policy@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-security-policy