Re: [Acme] Practical concerns of draft-ietf-acme-ari

Ilari Liusvaara Wed, 19 Jul 2023 23:16:19 -0700

On Wed, Jul 19, 2023 at 03:05:52PM -0700, Aaron Gable wrote:
> Hi Matt,
> 
> On Fri, Jun 23, 2023 at 9:21 AM Matthew Holt <m...@dyanim.com> wrote:
> 
> > But when a renewal window does change, what does that mean? Well,
> > something is wrong. Either the certificate is being revoked, or the CA
> > anticipates downtime or availability issues.
> >
> 
> This is not true. Explicitly, by the spec, the renewal window changing
> means nothing. The situations you list are the motivations for writing the
> spec in the first place, but they are not the only motivations for changing
> the window in any given case. In fact, Let's Encrypt is currently
> considering adding random jitter to the renewal window every time it is
> requested, specifically to prevent interpretations like this, and to
> naturally even-out renewal spikes through Brownian motion.


However, clients might not react well to the window being jittered
while they are inside it or very near it.

E.g., the client might be deterministically generating the renewal time
from the window (the client I wrote does this). This works nicely if the
renewal window does not shift around. However, the result becomes heavily
biased toward the beginning of the window if the window shifts around.
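
One plausible reading of that bias, as a minimal sketch: a client that
re-derives its target time on every poll renews as soon as any one of
the jittered targets has passed, so it effectively takes the earliest of
many draws. Everything below (the `Window` type, the fixed fraction, the
hourly polling, the jitter size, the toy random generator) is an
assumption made for illustration, not the code of any real client or CA.

```rust
/// Renewal window as Unix timestamps in seconds.
#[derive(Clone, Copy)]
struct Window {
    start: u64,
    end: u64,
}

/// Deterministic pick: a fixed per-certificate fraction (in 1/1000ths)
/// of the way through the window.
fn pick(w: Window, permille: u64) -> u64 {
    w.start + (w.end - w.start) * permille / 1000
}

/// Tiny linear congruential generator so the sketch needs no crates.
struct Lcg(u64);

impl Lcg {
    /// Returns a value in -max..=max.
    fn jitter(&mut self, max: i64) -> i64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as i64 % (2 * max + 1)) - max
    }
}

/// Polls every `step` seconds and returns the time at which the client
/// renews. The (hypothetical) server shifts the whole window by up to
/// `max_jitter` seconds on each poll.
fn renew_time(base: Window, permille: u64, step: u64, max_jitter: i64, rng: &mut Lcg) -> u64 {
    let mut now = 0;
    loop {
        let shift = rng.jitter(max_jitter);
        let w = Window {
            start: (base.start as i64 + shift) as u64,
            end: (base.end as i64 + shift) as u64,
        };
        // The target is recomputed from whatever window was just served.
        if now >= pick(w, permille) {
            return now;
        }
        now += step;
    }
}

/// Mean renewal time over `runs` differently seeded simulations.
fn mean_renew_time(base: Window, max_jitter: i64, runs: u64) -> u64 {
    let total: u64 = (1..=runs)
        .map(|seed| renew_time(base, 500, 3_600, max_jitter, &mut Lcg(seed)))
        .sum();
    total / runs
}

fn main() {
    let base = Window { start: 1_000_000, end: 1_200_000 };
    println!("stable window:   mean renewal at {}", mean_renew_time(base, 0, 200));
    println!("jittered window: mean renewal at {}", mean_renew_time(base, 50_000, 200));
}
```

With a stable window the client renews at the first poll after the
midpoint; with jitter the mean moves toward the start of the window.
Remembering the first computed target instead of recomputing it would
avoid the effect, at the cost of ignoring later window changes.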

 
> > 3) ARI does not scale well. Some ACME clients manage 10K+ certificates,
> > and in that case the client would have to check the ARI for at least 24
> > certificates per hour to get through them in a month. Deferring to the
> > Retry-After header may result in insufficient throughput. The current
> > expectation or convention is to check every certificate every 6-12 hours,
> > or tens of thousands of checks per day. One endpoint per certificate
> > multiple times per day is quite saturating. This is a considerable burden
> > for both ACME clients and servers. I would like to explore options that do
> > not involve 2+ HTTP requests per certificate.
> >
> 
> Totally agreed, we don't love the heavy-polling nature of ARI as it stands
> either. It's a lot of requests, and that's a large part of why we've
> striven to keep the response size so small. The original version of this
> was just a single timestamp. It's grown to two timestamps and an optional
> URL thanks to community feedback, but I'd be happy to reduce the response
> size again if we decide that prioritizing efficiency is more important than
> prioritizing third-party certificate monitoring tools.

The reason for having a window is third-party monitoring tools?

What the client I wrote ends up doing is de facto collapsing the window
into a single time, as anything else would be biased.

 
> Unfortunately, I don't currently have a different approach that I love. The
> 24-hour revocation timeline enforced by the BRs for certain kinds of
> revocations means that clients should be checking at least once every 24
> hours, regardless of mechanism. I'll comment more on your specific
> proposals to address this below.

One thing I said earlier was that it might make sense to split the info
endpoint from the certificate control endpoint, in case a CA wishes to
put the info endpoint on a CDN.

 
> > 4) Crafting the URL is convoluted. As Peter Cooper described it, "The core
> > issue is that the URL you need to construct is based on an OCSP structure
> > identifying the certificate, which requires taking one's existing
> > certificate and parsing out the serial number and issuer, and also taking
> > the intermediate certificate that signed it and getting its public key too.
> > So rather than just, like, using the fingerprint of the existing leaf or
> > something similarly simple that a lot of tooling can already give you, one
> > needs to really dig into both the leaf, and the intermediate, and hash
> > various pieces thereof, and then take all that to build a new ASN.1
> > structure." Why are we striving for near-parity with an OCSP request?? This
> > should be orthogonal to OCSP, right?
> >
> 
> This is great feedback. We picked this request format specifically because
> we thought it would be easy. It's good to know that we were wrong, and
> investigate what other request formats would work better.

Looking at the code of the client I wrote, some of the data processing
seems to be shared with OCSP (e.g., extracting the key from the SPKI),
but the final serialization is different, because OCSP is de facto still
stuck on SHA-1.

The single most annoying part of the process is the hash of the issuer
key. For that, you need the issuer certificate, while everything else
can be pulled from the subject certificate.

And the issuer key hash is not even needed in practice, because the
issuer name is de facto the lookup key for the issuer key. A lot of
stuff would likely break if one had two different keys for the same
issuer name.

 
> Allow me to provide a little bit of context for how we arrived at using the
> OCSP CertID structure:
> 
> We need a way to uniquely identify the certificate in question. ACME has
> one mechanism for doing so already: the URL provided by a finalized Order.
> Personally, my ideal would be to say "the ARI url is the Certificate URL
> concatenated with /ari". Unfortunately we can't do that, because there's
> nothing to prevent the URL provided by an Order from having query
> parameters, in which case appending a new path component would be
> incorrect. So, we could follow ACME's example, and provide a second
> "renewalInfo" URL in finalized Orders as well. Unfortunately, this a) means
> that clients have to persist this URL in order to use it, and b) clients
> which did not persist the URL (either ephemeral clients, or third-party
> certificate monitoring clients) cannot construct the URL at all.

Thinking about the client I have written, implementing that would not
be very nasty: I would persist the URL into a comment in the certificate
PEM (the code already does stuff like persisting the keyfile path into
the CSR PEM).
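
A minimal sketch of that idea, assuming the comment is a single
`key: value` line placed before the BEGIN line (text outside the
BEGIN/END boundaries is generally ignored by PEM parsers). The
`renewal-info-url` key and both function names are made up for this
example; the actual comment format of the client is not shown here.

```rust
/// Hypothetical key for the comment line; chosen only for this sketch.
const KEY: &str = "renewal-info-url: ";

/// Prepends the URL as explanatory text before the PEM block.
fn with_renewal_info(pem: &str, url: &str) -> String {
    format!("{}{}\n{}", KEY, url, pem)
}

/// Reads the URL back, looking only at lines before the first BEGIN line.
fn renewal_info(pem_file: &str) -> Option<&str> {
    pem_file
        .lines()
        .take_while(|l| !l.starts_with("-----BEGIN "))
        .find_map(|l| l.strip_prefix(KEY))
        .map(str::trim)
}

fn main() {
    // Placeholder body; a real file would hold the base64 certificate.
    let pem = "-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----\n";
    let stored = with_renewal_info(pem, "https://ca.example/renewal-info/abc");
    println!("{:?}", renewal_info(&stored));
}
```

This keeps the URL next to the certificate it belongs to, so no separate
state file is needed. It does not help third-party monitors, which only
see the certificate itself.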


> So we need a way to uniquely identify a certificate which can be
> constructed from the certificate itself. The serial seems like an obvious
> candidate. However, serials are only required to be unique on a per-issuer
> basis, and a single ACME server may issue from multiple issuer
> certificates. It turns out that OCSP already has a solution for this:
> combine the serial with a unique identifier of the issuer. And OCSP's
> solution even comes with algorithm agility for how the unique identifier of
> the issuer is computed! That's nice. So we took OCSP's request format,
> stripped away the pieces not pertaining to identifying a single
> certificate, et voila, the CertID.

It turns out I faced a similar problem when designing the assume-revoked
mechanism in the ACME client I wrote (to replace force-renew with
something less of a footgun).

What it does is combine the hex form of the issuer name hash with the
serial (with a separator in the middle).

let issuer = tbs_cert.asn1_sequence_out().map_err(Issuer2)?;

... So it includes the SEQUENCE tag and the length field.
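
To make "includes the SEQUENCE tag and the length field" concrete, here
is a minimal sketch that slices one complete DER element (tag, length and
contents) off the front of a buffer and then joins an already computed
hash with a serial. The function names, the `-` separator and the hex
encoding of the serial are assumptions for illustration; the hash itself
is left to whatever library is in use.

```rust
/// Returns the first complete DER element (tag, length and contents) at
/// the start of `der`, or None if the input is truncated or malformed.
/// Only single-byte tags are handled, which covers SEQUENCE (0x30).
fn first_tlv(der: &[u8]) -> Option<&[u8]> {
    let first_len = *der.get(1)? as usize;
    let (header, content) = if first_len < 0x80 {
        // Short form: the byte itself is the content length.
        (2, first_len)
    } else {
        // Long form: the low 7 bits give the number of length bytes.
        let n = first_len & 0x7f;
        if n == 0 || n > 4 {
            return None;
        }
        let len = der
            .get(2..2 + n)?
            .iter()
            .fold(0usize, |acc, b| (acc << 8) | *b as usize);
        (2 + n, len)
    };
    der.get(..header.checked_add(content)?)
}

/// Lowercase hex encoding.
fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Joins the issuer name hash and the serial. The separator and the hex
/// form of the serial are assumptions made for this sketch.
fn cert_key(issuer_name_hash: &[u8], serial: &[u8]) -> String {
    format!("{}-{}", hex(issuer_name_hash), hex(serial))
}

fn main() {
    // A tiny Name (CN=R3) followed by unrelated trailing bytes.
    let der = [
        0x30, 0x0d, 0x31, 0x0b, 0x30, 0x09, 0x06, 0x03, 0x55, 0x04, 0x03,
        0x13, 0x02, 0x52, 0x33, 0x30, 0x00,
    ];
    let name = first_tlv(&der).expect("well-formed DER");
    println!("bytes to hash: {}", hex(name));
    println!("example key:   {}", cert_key(&[0xbd, 0xe1], &[0x03, 0xa1]));
}
```

The bytes fed to the hash start with `30` and the length byte, not with
the first element inside the SEQUENCE.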


The known Let's Encrypt issuer hashes are:

E1: e498ea6f9d0ab27ba56e7cec29600a572a0323c659b23fb08fdc05ce43961a2e
E2: 53e671ad99b92c914dece8377ff2fada1e9273cad39505771ad6623dd93bb6ba
R3: bde18ac64b30e7e33c6407fcc625b80a8be4e59000aefe703506d2bf7645f810
R4: 314f6f029369791f0e310e483dfcecaebdef469e6e168a28f0d426dc5fc8d516



> We believed this would be easy because many ACME clients are written in
> languages or running in environments that already have access to robust
> OCSP libraries. I wrote the first version of this
> <https://github.com/letsencrypt/boulder/blob/73b72e8fa2d852a40753926c34f38313a7db083d/wfe2/wfe_test.go#L3517-L3538>
> (constructing
> an OCSP request, parsing it, extracting the relevant parameters, and
> serializing them into a CertID) in a few minutes. Again, it's useful to
> know that we were wrong.

That would require having OCSP request URL generation code support
SHA-256, instead of hardcoding the de-facto required SHA-1.

 
> This leads to the question of: what should we use to uniquely identify the
> certificate instead? Certainly we could go with the "fingerprint" or
> "thumbprint" (a sha256 hash of DER bytes or PEM encoding, depending on who
> you ask, of the certificate) if people think that is sufficiently simple,
> easy to specify, unique, and future-proof. We could also go with "just the
> Serial", and force existing ACME servers to choose between either keeping
> serials unique across all issuers they represent, or splitting the server
> into multiple servers which each represent just a single issuer. Or we
> could return to the "url in the Order object" approach we started with. I'm
> curious what path forward people think is best.

See above for what I ended up with when trying to solve that problem.
 

> Now, I *am* a fan of adding a field to newOrder requests which uniquely
> identifies the cert being replaced. If such a field is populated, the CA
> would treat it the same as if the client had made a POST request to mark
> the certificate as replaced (Section 4.2 of the current draft). This has
> many nice effects, like letting the CA track renewals explicitly (instead
> of attempting to identify them with heuristics), letting renewal requests
> bypass rate limits, and more. I just don't think it elegantly replaces the
> renewalInfo endpoint itself.

Presumably the replacing would actually only happen when the order is
finalized. Replacing at order creation time could cause problems if the
order creation response is lost and the client retries it.

And then, even if the certificate is obtained, it might not be deployed
into production immediately, for multiple reasons, e.g.:

- The system might batch renewals and only reload the reverse proxy at
  the end of the operation (because the reload might be slow).

- There is a Chrome bug that causes some users to get invalid
  certificate errors if the certificate is too new. Working around
  that would require holding the certificate uninstalled for some
  time (I think one hour should be enough).
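
The hold-off in the second point could look like the following minimal
sketch. The one-hour constant is only the guess from the text above, and
measuring from the time the certificate was obtained (rather than from
some field inside it) is an assumption.

```rust
use std::time::{Duration, SystemTime};

/// Hold-off before installing a fresh certificate; one hour is a guess.
const HOLD_OFF: Duration = Duration::from_secs(3600);

/// True once the certificate has been held for at least `HOLD_OFF`.
fn ready_to_deploy(obtained_at: SystemTime, now: SystemTime) -> bool {
    match now.duration_since(obtained_at) {
        Ok(age) => age >= HOLD_OFF,
        // `obtained_at` is later than `now` (clock moved): keep waiting.
        Err(_) => false,
    }
}

fn main() {
    let obtained_at = SystemTime::now();
    println!("deploy now? {}", ready_to_deploy(obtained_at, SystemTime::now()));
}
```

Either delay means a CA cannot treat "new certificate issued" as "old
certificate no longer in use".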

 
> On the one hand, I'm in complete agreement, it would be great to have a
> "batch" endpoint that returns suggested windows for all certificates
> associated with a given account, or matching some other criteria. On the
> other hand, there's a reason that Let's Encrypt diverges from RFC8555 and
> does not implement the "orders" field on account objects: endpoints which
> serve unboundedly-large documents and require paging are difficult to
> implement correctly on both the server and client side, and can quickly
> lead to disruptive database queries.

Yeah, there are probably some "whale" accounts with lots of
certificates. And perversely, those would be exactly the ones for
whom this would be most useful!

However, since HTTPS connection setup is much more expensive than an
HTTP query, querying the status of N certificates is much less than N
times as expensive as querying the status of one certificate.

 
> > And finally, I want to bring attention to the longer-term prospects for
> > ARI: it's quite possible that ARI will become irrelevant before it is
> > widely adopted by most clients. This itself may discourage adoption. As
> > stated above, ARI has two primary use cases: revocation and traffic
> > smoothing. As we push for shorter certificate lifetimes, revocation should
> > become irrelevant. And traffic smoothing will perhaps become a natural
> > consequence as clients are renewing more frequently anyway. We all know
> > revocation and long-lived certificates are broken, so I'd rather WebPKI
> > developers focus our energy on the ACTUAL goal: short-lived certificates.
> > We should not be focusing our ecosystem resources on infrastructure that
> > acts as a band-aid for a broken leg.
> >
> 
> This is an interesting point. ARI was first conceived
> <https://bugzilla.mozilla.org/show_bug.cgi?id=1619179#c7> as a way to
> improve business continuity across mass revocation events, and grew from
> there. The idea that 10-day certs might be a reality, and that revocation
> would be wholly optional for them, was almost unimaginable at that time.
> But even today, the reality is that CAs such as Let's Encrypt will likely
> have to support revocation for a very long time to come: migrating the
> whole world to 10-day certs will not happen overnight. So I think that this
> work is worthwhile, even if other solutions are also on the horizon.

Yes, dealing with revocations is maybe the most important use case for
ARI.

Well, I don't think short-lived certs were unimaginable at the time.
What was thought was that CAs would not issue them.




-Ilari

_______________________________________________
Acme mailing list
Acme@ietf.org
https://www.ietf.org/mailman/listinfo/acme
