From: Ryan Sleevi [mailto:r...@sleevi.com]
Sent: Friday, January 12, 2018 5:53 PM
To: Doug Beattie <doug.beat...@globalsign.com>
Cc: Wayne Thayer <wtha...@mozilla.com>; Gervase Markham <g...@mozilla.org>; r...@sleevi.com; mozilla-dev-security-pol...@lists.mozilla.org
Subject: Re: Possible Issue with Domain Validation Method 9 in a shared hosting environment
(Wearing a Google Hat)

Doug,

Thanks for sharing additional details. On the basis of what you've shared so far, we do not believe this results in an appropriate level of security for the ecosystem, and request that you do not re-enable issuance at this time. This applies for any CA using methods similar to what you're using.

Broadly speaking, https://groups.google.com/d/msg/mozilla.dev.security.policy/RHsIInIjJA0/HACyY9tMAAAJ has shared some of the principles we've used in this consideration. If there are additional details that GlobalSign can share, related to those principles, this would be invaluable.

Ryan,

I had a hard time digesting that email because it compared so many different items, many of which aren’t directly applicable to the OneClick vs. method 10 comparison that I want to focus on. The key points I took away from your email are:

- “Weak” manual method comparison with methods 9 and 10: not applicable to the methods 9-10 comparison, since we’re not comparing them to manual methods.
- Short validity certificates represent both more risk to the ecosystem (expiration) and less risk (certs issued under the exploit will expire within 90 days, so the badness lasts for only 90 days). I’ll address this point below, but given that LE will allow renewals of possibly bad validations, and that attackers generally operate for only short periods before moving on, I don’t see short-lived certificates providing a meaningful reduction in risk within this context.
- Ease with which an alternate method exists and can be used (discussion of manual vs. automated methods): not applicable to the methods 9-10 comparison, since they are both automated and have the same characteristics.
- Risk is applicable to shared service providers, and an accepted risk mitigation is to block SNI negotiations that contain “.invalid”.
We also propose working with our customers on an account-by-account basis to ensure they comply with the guidelines for use of method 9 until such time as it’s re-affirmed, improved or deprecated from the BRs.

Perhaps I missed some other key points from that email.

This assessment is based on a number of factors, but includes:

- The validity period of certificates issued via this method means that there is an unacceptably large window for certificates improperly issued to be used.

Risk should not be based so heavily on the validity period, which seems to be one of your consistent points. The number of certificates issued along with the probability of a failure should both be used in the ecosystem risk computation. Given that LE issues orders of magnitude more certificates to unique endpoints, I think the risk to the ecosystem at large with the GlobalSign issuance is lower than that with LE (when it comes to the topic of validity periods).

Risk = impact x probability: with the number of LE endpoints (or anyone using Method 10 in high volumes), the probability of a successful attack is vastly higher due to the sheer number of servers, and the impact for both methods is the same (a certificate issued to a successful attacker).

- Based on the available information of expiration times and the potential difficulty in renewing certificates using this method, the ecosystem risk of disallowing this method is much less.

How did you come to the conclusion that validity periods and renewal challenges substantially increase the risk of method 9?

1) While a GlobalSign certificate would be valid for a longer period than LE (typically 1 year, but up to 3), typical attacks are done, detected, and resolved within days or weeks. I don’t believe that the validity period of certificates significantly increases the risk when exploited in the way described (the target site would typically notice they were compromised, and it would be reported and the certificate revoked within days or weeks).
A more important factor is the number of certificates that may be issued, not their validity period.

2) While LE’s validity period is shorter, they re-use the validation for subsequent issuance, thus the time between validation and expiration is longer than 90 days (I believe the domain validations can be cached for 60 days). This equates to 5 months vs. generally 12 months for GlobalSign. And since LE will permit domain renewal of possibly bad authentications, the 5 months could average out to be substantially higher.

3) While the renewal process is currently not optimal, it’s been working for 5 years without significant pushback from our customers. I fail to see how this factors into risk in a meaningful way. I may have missed your point.

- The subtleties regarding Authorization Domain Names mean that the risk analysis provided is not sufficient - namely, it is unclear, as described, whether it is possible to obtain a certificate for "www.example.com", on a host that has a customer already configured on that domain (and checking/enforcing certificates), by first applying for a certificate for "example.com" as an attacker, providing and provisioning a test certificate using that method (which is not configured to serve a certificate by the 'victim'), and then using that subsequent authorization of the Authorization Domain Name to then apply for "www.example.com".

Each and every certificate undergoes its own validation; there is no re-use of validation data when issuing subsequent certificates. I should have stated this earlier. I hope this answers this question.

Mitigating this as a site operator would necessitate blocking not just on existent domains, but also by the notion of Authorization Domain Name, and thus represents a significantly greater complexity in both assessing compliance (for those on the whitelist) and for minimizing risk.

Given the statement above, this isn’t a risk.
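
For illustration only, the window arithmetic in point 2) above (time from a domain validation until the last certificate relying on it expires) can be sketched as follows. The 60-day reuse figure is the one stated above as a belief, not a confirmed value; the function name and the 30-day month are assumptions made purely for this sketch.

```python
# Illustrative sketch, not an authoritative model of any CA's policy.
# The 60-day validation reuse figure is the email's stated belief;
# the helper name and the 30-day month conversion are assumptions.

DAYS_PER_MONTH = 30  # coarse conversion, used only for comparison


def exposure_days(validation_reuse_days: int, cert_validity_days: int) -> int:
    """Days from a validation until the last cert relying on it expires."""
    return validation_reuse_days + cert_validity_days


# LE: validation reused for up to 60 days, then a 90-day certificate.
le_days = exposure_days(validation_reuse_days=60, cert_validity_days=90)

# GlobalSign: no validation reuse, typical 1-year certificate.
gs_days = exposure_days(validation_reuse_days=0, cert_validity_days=365)

print(f"LE: {le_days} days (~{le_days / DAYS_PER_MONTH:.0f} months)")
print(f"GlobalSign: {gs_days} days (~{gs_days / DAYS_PER_MONTH:.0f} months)")
```

This covers only a single validation; renewals based on a fresh but still vulnerable validation would extend the LE figure, which is the point made about the 5 months averaging out higher.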
- The potential risk in maintaining this whitelist, given both the statements provided by plans to move to deprecate this method post-haste (e.g. no such plans) and the validity period of issued certificates (up to 39 months or, soon, 825 days).

Since LE can continue to renew certificates issued under this method prior to this change, doesn’t that effectively allow longer effective validity periods? I recognize there is a difference between renewing and long validity certs, but allowing renewal of certs issued under the flawed method seems to reduce the value of your argument here.

- The lack of preexisting review and documentation of the specific protocol being employed

The process was discussed as part of the BR update and it’s documented in the BRs, and I hope the supplemental information provided here helps. Also, the OneClick plugins and their intended use are not susceptible to this attack because the plugin does not permit upload/use of certificates for domains other than the one the plug-in is managing (yes, clients can change the code or develop their own, so the protocol is susceptible in the end).

- The potential risk of both domain name wildcards and of wildcard issuance, which remains

I hope I addressed this above. We don’t re-use domain validation data for subsequent requests.

While it is possible that you may be correct that the underlying root cause of TLS-SNI presents greater risk, compared to this method, the many mitigating factors that influenced our decision are not applicable here.

I attempted (likely unsuccessfully) to reiterate the risk topics you raised with both methods and hope this makes the risks associated with method 9 similar to those of method 10.

As this thread shows, we anticipate we will continue to find variants of risk, and thus the whitelist approach, combined with the validity periods caused by the risk (both of issued certificates and "completed validations"), poses a long-term risk, even if we catch issues 'within days'.
The validity period discussion above hopefully helps to mitigate the risk of OneClick as compared with method 10 @ LE.

We'd like to continue discussing with GlobalSign and the community as to the risk posed by immediately and permanently disabling this method, as well as possible mitigations to the risk - both through issuance policies of GlobalSign and technical measures in the usage - that may permit its usage for a short time to transition this method away. This is a conversation we look forward to having over the next week. In the interim, we'd ask you not re-enable this method.

Thanks for keeping the dialog open. The domain validation method remains disabled, but we want to continue the discussion so the community better understands the risks and how we can re-enable this for some specific customer accounts who have demonstrated the proper controls are in place.

Doug’s summary:

- Mitigations to prevent exploitation can be implemented in coordination with our major customers to reduce or eliminate the risk (similar to LE).
- GlobalSign performs domain validation for every issuance, vs. caching validation info.
- Total active OneClick certificates: < 100K
- Total number of active OneClick customers: < 10
- Risk to the ecosystem with GlobalSign is lower based on active certificates (100K vs. tens of millions). Perhaps the more important factor is not active certificates but certificates issued, which increases the difference between GlobalSign and LE.
- Time between domain validation and certificate expiration is 12 months (typical GlobalSign cert) vs. 5-8 or more months with LE (since domains can be renewed with method 10 without mitigations, even if prior and current validation methods are vulnerable).

I hope the information provided above will help advance the discussion.

_______________________________________________
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy