Hey all,

At Let's Encrypt we've been thinking for a while now about designing some kind of renewal information API as an extension to ACME. Recent events have brought this back to the forefront of our minds. Below I've attached a write-up detailing our proposal. I'd really like to get input on it, especially from those working on ACME clients, as this work mostly represents thoughts from ACME server developers and as such may not accurately capture issues faced by clients.

If the working group is interested in this as a work product I'll spend some time developing an Internet-Draft based on this outline.

Thanks!
Roland

----

This proposal aims to address two issues that affect both ACME and the wider web PKI.

The first of these issues is how a CA should inform subscribers of a CA-initiated or third-party-initiated certificate revocation event. In most cases this is done via email or other out-of-band notification channels, which may be appropriate for CAs that rely on manual processes but seems clunky for ACME-based CAs, which rely heavily on automation. For automated ACME clients the probability that a user will act upon a revocation notice (or even receive one, if they do not provide an account contact) is lower than for manually maintained certificates, leading to the possibility of serving a revoked certificate until the next renewal window. In cases where the CA has a buffer before performing the revocation, being able to inform the client of this impending event would allow for seamless renewal before the revocation takes place, and in cases where the revocation has already taken place it would significantly reduce the impact on the subscriber.

The second issue is how ACME clients should determine when to renew a regular, non-revoked certificate. Most clients take one of two routes. They are either manually configured to renew at a specific interval (e.g. via `cron` or similar), or they parse the issued certificate to determine the expiration date and choose some date preceding it to attempt renewal. While the latter option is better than the former, each can cause issues for both the client and the issuing CA. The first option makes it significantly harder for the issuing CA to change certificate lifetimes, as the static renewal interval makes assumptions about that lifetime that must be manually updated. Both options can also cause load clustering for the issuing CA.
Being able to indicate to the client a period in which the issuing CA suggests renewal would allow dynamic changes to the certificate lifetime and smearing of load.

## ACME vs. OCSP

The two obvious options for transmitting renewal suggestions to the subscriber are an extension to the ACME protocol or an extension to the OCSP protocol. Each option has advantages and disadvantages.

For OCSP, one of the obvious advantages is that the protocol is already designed to carry revocation information, which some clients already poll for. An extension could be added to OCSP responses containing 'recommended renewal' windows and/or indicators of impending revocation. Using OCSP also has the advantage of existing serving and caching infrastructure.

The disadvantages of using OCSP mainly revolve around usage of the protocol by relying parties. In order to avoid intolerance for new OCSP extensions we would likely want to require clients to indicate that they want this information, via either an OCSP request extension or an HTTP header. This could increase caching requirements for the issuing CA and possibly require changes to its existing caching infrastructure.

For ACME, the most obvious advantage is that every ACME client already understands the protocol and should have a relatively easy path to being extended to understand a new API endpoint. Using ACME also allows a more descriptive, extensible API, rather than requiring us to stuff more information into a strictly defined ASN.1 extension. Using ACME would also allow for requesting information on multiple certificates in a single request, which, while technically possible via OCSP, is in reality rarely supported.

The disadvantages of using ACME mainly revolve around increased load on the ACME API for the issuing CA. ACME currently has no endpoints that are designed to be routinely polled; adding one could introduce a significant load vector that existing infrastructure has not been designed for.
Another disadvantage is that if the API were authenticated it wouldn't be viable to cache the renewal information at a CDN layer.

On balance it seems like ACME is the better choice for this API.

## Push vs. Pull

The CA could either push information to ACME clients, for instance via webhook, or it could rely on clients polling for information.

The push method is challenging because many ACME clients run behind firewalls or don’t have full access to provide external-facing services. For instance, an ACME client might only have the ability to provision files under /.well-known/acme-challenge/, or it might only have access to modify DNS records.

The pull method, on the other hand, is straightforward. ACME clients, by necessity, need to send HTTPS requests to the CA. They can use that same channel to poll periodically.

The disadvantage of polling is that it provides less timely results than pushing. The most relevant constraint is the Baseline Requirement that CAs must revoke within 24 hours on key compromise, or when validation information “cannot be relied on.” Polling must be frequent enough that the ACME client will receive notification within this 24-hour window, with enough remaining time for manual escalation if the automated client fails to act. Polling on a 12-hour interval should provide this.

## Cacheability

An important question to answer is whether the results of this API need to be cacheable, and if so what level of cacheability they should have.

One reason for designing the API around cacheability would be high request load from repeated requests. The number of clients making repeated identical requests is likely to be small, and these requests are not likely to be made rapidly, suggesting that the API doesn't need to be highly cacheable.
That said, given that the information returned by the API isn't likely to be dynamic (for instance, over the lifetime of the certificate it is unlikely to change, barring a revocation event), it seems likely that the issuing CA would like some way to cache the results in order to reduce unnecessary resource usage.

## An OCSP-based design (rejected)

Here we’ll sketch out an OCSP-based design for contrast with the design proposed below.

OCSP is frequently fetched by Relying Parties (RPs). We do not want to increase the bandwidth usage for normal RP fetches, since that would worsen performance for many normal web browsing requests. Also, when ACME clients poll, they will want different caching semantics than RPs. CAs will want ACME clients to get fresh information about every 12 hours, while OCSP responses are commonly cacheable up to their NextUpdate, which according to the Microsoft Root Program can be up to 7 days after ThisUpdate. While CAs could shorten their NextUpdate interval to accommodate ACME clients, this would be an unnecessary coupling of concerns.

Under this proposal, ACME clients that poll OCSP for renewal information MUST add an HTTP header, “ACME-Renewal: 1”, to their requests. CAs that use CDNs to serve OCSP responses MUST treat the ACME-Renewal header as part of their cache key, so that responses to ACME clients can have a different Cache-Control: max-age than those sent to RPs.

In the normal case, when no renewal is needed soon, the OCSP response will be unchanged. For the “renewal needed soon” case, we have two choices to convey that information: an OCSP extension or an HTTP header. The OCSP extension has the advantage that it’s signed, but has the disadvantage that it requires extra signatures from a CA’s HSM, at a time when the HSM may already be burdened by signing bulk revocation responses.
In the case where a CA wants an ACME client to renew a certificate, the CA responds to all requests that have “ACME-Renewal: 1” in the header with a response that has the header “ACME-Renewal: renew-by=<datetime>; key-rotate=<true/false>”. The ACME client then attempts renewal by the specified datetime.

In both cases the CA MUST include the “Vary” header in its response, and MUST include “ACME-Renewal” among the header names listed.

Advantages of this proposal: It does not require a discovery mechanism for ACME clients to find out where to check the status of a certificate; the OCSP URL is already available in the certificate itself. ACME clients can also check the revocation status in the OCSP response. For CAs that don’t support renewal notifications, these clients could trigger renewal immediately on noticing a certificate was revoked.

Disadvantages of this proposal: It combines two different types of requests with different caching semantics at a single URL, inviting subtle mistakes with cache keys. Because OCSP URLs embedded in certificates necessarily use HTTP, the response triggering renewal is unauthenticated. A MITM attacker could use this to trigger early certificate renewal.

We reject this design.

## Proposed API

Here we propose a roughly sketched out ACME API extension, taking into account the topics discussed above.

Conformant ACME servers should include a new field, “renewalInformation”, in the JSON objects for finalized orders. The value of this field should contain a unique URL from which renewal information can be retrieved. To request renewal information, conforming ACME clients should make a GET request to this URL.

The ACME server should respond to a request with a JSON object containing renewal hints for the associated certificate.

    {
      "suggestedRenewalWindow": {
        "start": "...",
        "end": "..."
      },
      "keyRotate": true
    }

The structure of this object is as follows:

suggestedRenewalWindow (object, required): A JSON object containing two strings, "start" and "end", which indicate the window in which the CA recommends renewing the certificate. Conformant ACME clients should pick a random time within this window at which to renew the certificate. If this window is in the past, conforming clients SHOULD immediately attempt to renew the certificate.

keyRotate (boolean, optional): A boolean indicating whether the ACME server requires the renewed certificate to use a new key pair.

The HTTP response should contain a Retry-After header indicating the polling interval that the ACME server recommends. Conforming ACME clients SHOULD use this value to determine their polling schedule, using the returned date as a lower bound for requesting information again, rather than using a fixed interval.

This API is explicitly unauthenticated, and does not use the ACME POST-as-GET scheme, as none of the information it returns is considered confidential.

Conforming ACME servers may construct the renewal URLs included in order objects in any fashion they wish, as long as the URL is stable for the lifetime of the certificate. Conforming clients should store this URL locally so that the ACME server does not need to be queried in order to learn the URL, as the server may delete, or otherwise make unavailable, the related order object while the certificate is still valid.

### Discoverability & URL Construction

Determining how the ACME server offers renewal information, and how the ACME client discovers this information, is a big question. We’ve proposed one design above, but acknowledge that there are trade-offs in our design which may make more sense to ACME server implementers than to ACME client implementers. This section details the trade-offs of our proposed design, and of an initial design that we rejected.

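To make the client side of the Proposed API a bit more concrete, here is a minimal Python sketch of how a client might act on a renewal information response. It is only an illustration, not part of the proposal: the function names are hypothetical, and it assumes that "start" and "end" are RFC 3339 timestamps in UTC, which the outline above does not yet specify. It also assumes Retry-After may arrive in either of its two standard HTTP forms (a delay in seconds or an HTTP-date).

```python
import random
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def parse_timestamp(value: str) -> datetime:
    # Assumption: RFC 3339 in UTC, e.g. "2021-01-03T00:00:00Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def pick_renewal_time(info: dict, now: datetime, rng: random.Random) -> datetime:
    """Choose when to renew, given a parsed renewal information object."""
    window = info["suggestedRenewalWindow"]
    start = parse_timestamp(window["start"])
    end = parse_timestamp(window["end"])
    if end <= now:
        # The whole window is in the past: renew immediately.
        return now
    # Otherwise pick a uniformly random time within the window.
    offset = rng.uniform(0, (end - start).total_seconds())
    return start + timedelta(seconds=offset)


def next_poll_time(retry_after: str, now: datetime) -> datetime:
    """Earliest time at which the client should poll again."""
    value = retry_after.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "43200" for 12 hours.
        return now + timedelta(seconds=int(value))
    # HTTP-date form, e.g. "Fri, 01 Jan 2021 12:00:00 GMT".
    return parsedate_to_datetime(value)
```

A client would call `pick_renewal_time` once per certificate with the decoded JSON body, presumably persisting the chosen time rather than re-drawing it on every poll, and would treat the result of `next_poll_time` as a lower bound on its next request.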
The design proposed uses a static URL, the format of which is not specified. These URLs are provided via the order object and, as they have no specified structure, cannot be derived from the certificate itself. This means that clients must either store the URL locally in order to access the API, or access the order to learn the URL. The latter is not an ideal approach, as orders may expire or become inaccessible during the lifetime of the certificate. This also means that ACME servers must continue to serve this specific URL for the lifetime of the certificate and cannot dynamically change where they serve this information from.

Our initial design specified the construction of the URL, using a directory resource to point to the API endpoint and the SHA256 hash of the certificate as the token. The upside of this is that it would allow clients to construct the URL without any required local state other than the certificate itself. It would also allow the ACME server to change where it was serving the API endpoint from dynamically, as the client would need to query the directory to learn the first portion of the URL to append its hash to.

The main downside here is that it requires clients to make two requests each time they want to access the API: one to the directory endpoint, and then another to the API endpoint. Depending on the design of the ACME server this could cause a significant increase in load, specifically on the directory endpoint. Another downside is that this requires that we specify the construction of the token beyond simply being unique, which adds complexity to the specification.

We could possibly merge these two designs, such that the specification specifies how to construct the URL and provides a directory entry, but RECOMMENDS that the ACME client store this URL locally in order to reduce load on the ACME server.
In the case where the client makes a request and receives a 404, for instance because the server has changed where it serves the API endpoint from, it would then re-query the directory in order to reconstruct the URL. This would provide the benefits of both designs while inducing lower load on the ACME server, but would require a somewhat more complex client design.

## Acknowledgements

This document draws heavily from an internal write-up of the issue by Jacob Hoffman-Andrews.

_______________________________________________
Acme mailing list
Acme@ietf.org
https://www.ietf.org/mailman/listinfo/acme