Ryan, I’m not sure where we go from here. We have customers that need certificates and they have demonstrated they can comply with not permitting the creation and use of certificates for domains other than those that the hosting company is hosting for that customer. All certificates will continue to be posted to CT logs.
As far as the wildcard question, when someone asks for a wildcard cert for a domain like *.us.example.com, we validate on that minus the * (so, us.example.com in this case).

We’d like to move forward with issuing certificates with controls in place. If there are any other controls you need us to implement to resume issuance, let us know. For example, if we limit validity to 1 year (possibly up to 15 months) and if we put a firm end date for OneClick for July 1, 2018, would that suffice?

Doug

From: Ryan Sleevi [mailto:r...@sleevi.com]
Sent: Monday, January 15, 2018 2:31 PM
To: Doug Beattie <doug.beat...@globalsign.com>
Cc: r...@sleevi.com; Wayne Thayer <wtha...@mozilla.com>; Gervase Markham <g...@mozilla.org>; mozilla-dev-security-pol...@lists.mozilla.org
Subject: Re: Possible Issue with Domain Validation Method 9 in a shared hosting environment

On Mon, Jan 15, 2018 at 1:18 PM, Doug Beattie <doug.beat...@globalsign.com> wrote:

From: Ryan Sleevi [mailto:r...@sleevi.com]
Sent: Friday, January 12, 2018 5:53 PM
To: Doug Beattie <doug.beat...@globalsign.com>
Cc: Wayne Thayer <wtha...@mozilla.com>; Gervase Markham <g...@mozilla.org>; r...@sleevi.com; mozilla-dev-security-pol...@lists.mozilla.org
Subject: Re: Possible Issue with Domain Validation Method 9 in a shared hosting environment

(Wearing a Google Hat)

Doug,

Thanks for sharing additional details. On the basis of what you've shared so far, we do not believe this results in an appropriate level of security for the ecosystem, and request that you do not re-enable issuance at this time. This applies for any CA using methods similar to what you're using.
Broadly speaking, https://groups.google.com/d/msg/mozilla.dev.security.policy/RHsIInIjJA0/HACyY9tMAAAJ has shared some of the principles we've used in this consideration. If there are additional details that GlobalSign can share, related to those principles, this would be invaluable.

Ryan, I had a hard time digesting that email because it compared so many different items, many of which aren’t directly applicable to the OneClick vs. method 10 that I want to focus on. The key points I took away from your email are:

- “weak” manual method comparison with methods 9 and 10 (not applicable to the methods 9-10 comparison since we’re not comparing them to manual methods).
- Short validity certificates represent more risk to ecosystem (expiration) and less risk (certs issued under the exploit will expire within 90 days – badness lasts for only 90 days). I’ll address this point below, but given LE will allow renewals of possibly bad validations and attackers generally only operate with short periods of attacks before moving on, I don’t see the value of short lived certificates having meaningful reduction in risk within this context.
- Ease with which an alternate method exists and can be used (discussion of manual vs. automated methods): Not applicable to the methods 9-10 comparison since they are both automated and have the same characteristics.
- Risk is applicable to shared service providers and an accepted risk mitigation is to block SNI negotiations that contain “.invalid”. We also propose working with our customers on an account by account basis to assure they comply with the guidelines for use of method 9 until such time it’s re-affirmed, improved or deprecated from the BRs.

Perhaps I missed some other key points from that email.

I think these points may not have been fully appreciated.
I don't see evidence from this mail, or from the ecosystem, that the OneClick method poses both the same risk and the same level of review as ACME's TLS-SNI, and I think we may fundamentally disagree about the risk profile of certificates with long validity periods, and both the detrimental effect they have on reasoning about ecosystem security AND the ways in which they mitigate the need to 'quickly re-enable this'.

This assessment is based on a number of factors, but includes:

- The validity period of certificates issued via this method means that there is an unacceptably large window for certificates improperly issued to be used.

Risk should not be based so heavily on the validity period, which seems to be one of your consistent points. The number of certificates issued along with the probability of a failure should both be used in the ecosystem risk computation.

We must disagree then. Risk is profoundly dependent on the validity period - one of the key mitigations to making an improper evaluation is the knowledge that the risk is bounded, whereas greater than 90 days represents significantly greater risk.

Given LE issues orders of magnitude more certificates to unique endpoints, I think the risk to the ecosystem at large with the GlobalSign issuance is lower than that with LE (when it comes to the topic of validity periods).

As I tried to explain in our considerations, we believe quite the opposite. The large number of ACME (since this is, to be clear, not Let's Encrypt specific) endpoints, combined with the shorter-lived validity period, presents significant risk to immediately turning it off, while GlobalSign's longer period, combined with lesser issuance, reduces that risk of impact to the ecosystem from taking the defensibly conservative choice.
Risk = impact x probability: With the number of LE endpoints (or anyone using Method 10 in high volumes), the probability of a successful attack is vastly higher due to the sheer number of servers, and the impact for both methods is the same (a certificate issued to a successful attacker).

I believe your assessment of the impact is inverted. An impact of a misissued 3-year certificate is profoundly worse than that of a 90 day certificate. This should be rather self-evident, given the ample discussion about certificate lifetimes in the Forum, but consider a CA that issued 100 3 year certificates versus a CA that issued 1000 30 day certificates. Let's further presume that all turn out to be problematic. The size of the CRLs, the impact to clients, and the cost to mitigate that are substantially greater for those 3 year certs, even though there are an order of magnitude fewer of them than the 30 day certificates. The risk exposure, to the overall ecosystem (e.g. including those outside of the browser space) is equally profoundly greater.

- Based on the available information of expiration times and the potential difficulty in renewing certificates using this method, the ecosystem risk of disallowing this method is much less.

How did you come to the conclusion that validity periods and renewal challenges substantially increase the risk of method 9?

1) While a GlobalSign certificate would be valid for a longer period than LE (typically 1 year, but up to 3), typical attacks are done, detected, resolved within days or weeks. I don’t believe that the validity period of certificates significantly increases the risk when exploited in the way as described (the target site would typically notice they were compromised and it would be reported and the certificate revoked within days or weeks). A more important factor is the number of certificates that may be issued, not their validity period.

We disagree on this point, for the reasons explained above.
This is a long-standing position regarding risk in the ecosystem, as reflected in our past discussions regarding validity periods, and thus is entirely consistent with past policies and actions.

2) While LE’s validity period is shorter, they re-use the validation for subsequent issuance thus the time between validation and expiration is longer than 90 days (I believe the domain validations can be cached for 60 days). This equates to 5 months vs. generally 12 months for GlobalSign. And since LE will permit domain renewal of possibly bad authentications, the 5 months could average out to be substantially higher.

3) While the renewal process is currently not optimal, it’s been working for 5 years without significant pushback from our customers.

I fail to see how this factors into risk in a meaningful way. I may have missed your point.

To be honest, I'm not quite sure what you mean by "domain renewal" in this respect. However, as explained, when factoring in risks to the ecosystem, we take a holistic view about the impact of making a security-negative decision (such as improperly allowing something that should be rejected) and a security-positive decision (such as rejecting something that might otherwise be acceptable). This holistic calculus informed our position here, and while it's clear that we disagree with respect to the 'improperly allow' risk (and for which I tried to explain why that may be), it's also unclear that there's any new information about the impact from taking a security-positive decision here.
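The two pieces of arithmetic quoted above can be made concrete with a minimal sketch. It assumes a crude exposure proxy (number of certificates multiplied by days of validity), which is an illustrative simplification and not a metric either party proposed. The figures are the ones quoted in the thread: 100 three-year certificates versus 1000 thirty-day certificates, and a 60-day validation reuse period plus a 90-day certificate. The function names are hypothetical.

```python
def certificate_days(count: int, validity_days: int) -> int:
    """Crude exposure proxy (an assumption): total days of validity
    summed across all certificates."""
    return count * validity_days


def validation_to_expiry_days(reuse_days: int, validity_days: int) -> int:
    """Longest gap between a domain validation and the expiry of a
    certificate that relied on it, if the validation may be reused for
    up to reuse_days after it was performed."""
    return reuse_days + validity_days


# Figures from the thread; 3 years is taken as 3 * 365 days.
long_lived = certificate_days(100, 3 * 365)
short_lived = certificate_days(1000, 30)
print(long_lived, short_lived)

# 60-day reuse plus a 90-day certificate: roughly five months.
print(validation_to_expiry_days(60, 90))
```

Under this proxy the 100 long-lived certificates carry more total exposure than the 1000 short-lived ones, even though there are ten times fewer of them. The proxy says nothing about the probability of misissuance, which is the factor the other side of the argument emphasizes.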
- The subtleties regarding Authorization Domain Names means that the risk analysis provided is not sufficient - namely, it is unclear, as described, whether it is possible to obtain a certificate for "www.example.com", on a host that has a customer already configured on that domain (and checking/enforcing certificates), by first applying for a certificate for "example.com" as an attacker, providing and provisioning a test certificate using that method (which is not configured to serve a certificate by the 'victim'), and then using that subsequent authorization of the Authorization Domain Name to then apply for "www.example.com".

Each and every certificate undergoes its own validation – there is no re-use of validation data when issuing subsequent certificates. I should have stated this earlier. I hope this answers this question.

It doesn't. A reuse of validation is distinct from the validation of the Authorization Domain Name. I was highlighting the scoping issues of ADN vs FQDN, and your response doesn't seem to address that.

- The potential risk in maintaining this whitelist, given both the statements provided by plans to move to deprecate this method post-haste (e.g. no such plans) and the validity period of issued certificates (up to 39 months or, soon, 825 days).

Since LE can continue to renew certificates issued under this method prior to this change, doesn’t that effectively allow longer effective validity periods? I recognize there is a difference between renewing and long validity certs, but allowing renewal of certs issued under the flawed method seems to reduce value of your argument here.

No, it doesn't, because in the event of misissuance, the attacker's ability is not the full duration (or 5 months, as you suggest), but bounded by the lifetime of the certificate.
These are fundamentally different risks - and that's why the validity period of the certificate itself is far more important than the reuse period of the information. A victim can contact an ACME using CA to invalidate the information, thus preventing renewal, and the attacker is still bound to the lifetime of the existing certificate. Compare this with a certificate issued for 1-3 years by GlobalSign, in which even if a victim contacts GlobalSign, the most that GlobalSign can do is to revoke that certificate, which is ineffective at scale. This permits the attacker a far greater 'attack' window, even though GS might have revoked it, and is a key and fundamental difference.

- The lack of preexisting review and documentation of the specific protocol being employed

The process was discussed as part of the BR update and it’s documented in the BRs, and I hope the supplemental information provided here helps. Also, the OneClick plugins and intended use are not susceptible to this attack because the plugin does not permit upload/use of certificates for domains other than the one the plug-in is managing (yes, clients can change the code or develop their own, so the protocol is susceptible in the end).

Right, I'm focusing on the protocol, and I don't believe that the protocol implemented by OneClick is equivalent to what's described in 3.2.2.4.9. I believe it is compatible with/an implementation of, but just as 3.2.2.4.10 does not describe ACME TLS-SNI (nor does 3.2.2.4.6 describe ACME HTTP-01), despite being compatible, we're treating this based on the available information of and public review and discussion of the specific protocol. As this thread shows, we anticipate we will continue to find variants of risk, and thus the whitelist approach, combined with the validity periods caused by the risk (both of issued certificates and "completed validations"), poses a long-term risk, even if we catch issues 'within days'.
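The Authorization Domain Name scoping question raised earlier in this message, together with the wildcard handling described at the top of the thread, can be sketched as follows. Under the BR definition of Authorization Domain Name, a CA removes any wildcard label and may then prune labels from the left until it reaches the Base Domain Name, validating against any of the intermediate names. The sketch is illustrative only: `candidate_adns` is a hypothetical helper, and the base domain is supplied by the caller as an assumption rather than derived from a public suffix list.

```python
def candidate_adns(fqdn: str, base_domain: str) -> list[str]:
    """Return the names that could serve as the Authorization Domain
    Name for fqdn: the wildcard-stripped name, then each name obtained
    by pruning one label from the left, down to base_domain."""
    name = fqdn.lower().rstrip(".")
    base = base_domain.lower().rstrip(".")
    # Drop a leading wildcard label: *.us.example.com -> us.example.com
    if name.startswith("*."):
        name = name[2:]
    if name != base and not name.endswith("." + base):
        raise ValueError(f"{fqdn!r} is not within {base_domain!r}")
    candidates = [name]
    while name != base:
        # Prune the left-most label and record the shorter name.
        name = name.split(".", 1)[1]
        candidates.append(name)
    return candidates


print(candidate_adns("*.us.example.com", "example.com"))
print(candidate_adns("www.example.com", "example.com"))
```

The second call shows where the concern comes from: if a CA uses the pruned name as the ADN, a validation performed against example.com can stand in for www.example.com, even though a shared host may serve the two names for different customers.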
Validity period discussions above hopefully help to mitigate the risk of OneClick as compared with method 10 @ LE

As I hope you can see, the validity period conversation again highlights that we believe it presents significant risk to the ecosystem to allow such certificates to be valid for greater than 90 days, at the most. This applies to any user of ACME's TLS-SNI, and more than likely, any (existing) validation method under the 3.2.2.4.9 / 3.2.2.4.10 aegis.

Doug’s summary:

- Implementing mitigations to prevent exploitation can be coordinated with our major customers to reduce or eliminate the risk (similar to LE)

We don't believe the proposal meaningfully does so, for the reasons outlined.

- GlobalSign performs domain validation for every issuance, vs caching validation info.

The issue is not with revalidation at issuance, but about the permitted use of the Authorization Domain Name.

- Total active OneClick certificates <100K
- Total number of active OneClick customers: < 10

This highlights that the risk (to not allowing issuance) is potentially small, and can meaningfully be addressed by engaging with those 10 customers. As you no doubt know, our goal is to ensure a consistent response, and we see significant danger in posing 'whitelisting' as a solution, as it fundamentally devolves into a question about how the CA manages and maintains that whitelist, which exists outside the remit of the BRs. This is where steps, such as those proposed by Let's Encrypt, to move to disable and/or prevent the usage of TLS-SNI (particularly in new deployments), pose a way to both mitigate the immediate risk and to move towards deprecation overall. GlobalSign's in a unique position, given the small number of deployments, to be able to work to devise technical solutions that may mitigate the risk, and effective deployment may be a better solution for the ecosystem both medium- and long-term.
Inevitably, however, validity periods must play a part as any short-term mitigation (in addition to exploring technical solutions), so that we can arrive at something that is consistent across CAs and consistent among risk factors.

- Risk to Ecosystem with GlobalSign is lower based on active certificates (100K vs. tens of millions). Perhaps the more important factor is not active certificates, but certificates issued, which increases the difference between GlobalSign and LE.

Agreed that the risk to disallowing issuance is lower. I believe you were trying to highlight something different, but hopefully the responses above highlight why we may disagree that the GlobalSign solution is less risky.

- Time between domain validation and certificate expiration is 12 months (typical GlobalSign cert) vs. 5-8 or more months with LE (since domains can be renewed with method 10 without mitigations, even if prior and current validation methods are vulnerable)

As noted above, the time period noted here is incorrect in its assessment.