Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Aaron Gable via dev-security-policy
On Fri, Feb 26, 2021 at 5:18 PM Ryan Sleevi  wrote:

> I do believe it's problematic for the OCSP and CRL versions of the
> repository to be out of sync, but also agree this is an area that is useful
> to clarify. To that end, I filed
> https://github.com/cabforum/servercert/issues/252 to make sure we don't
> lose track of this for the BRs.
>

Thanks! I like that bug, and commented on it to provide a little more
clarity for how the question arose in my mind and what language we might
want to update. It sounds like maybe what we want is language to the effect
that, if a CA is publishing both OCSP and CRLs, then a certificate is not
considered Revoked until it shows up as Revoked in both revocation
mechanisms. (And it must be Revoked within 24 hours.)

We'll make sure our parallel CRL infrastructure re-issues CRLs
close-to-immediately after a certificate in that shard's scope is revoked,
just as we do for OCSP today.

Thanks again,
Aaron
___
dev-security-policy mailing list
dev-security-policy@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security-policy


Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Ryan Sleevi via dev-security-policy
On Fri, Feb 26, 2021 at 6:01 PM Aaron Gable  wrote:

> On Fri, Feb 26, 2021 at 12:05 PM Ryan Sleevi  wrote:
>
>> You can still do parallel signing. I was trying to account for that
>> explicitly with the notion of the “pre-reserved” set of URLs. However, that
>> also makes an assumption I should have been more explicit about: whether
>> the expectation is “you declare, then fill, CRLs”, or whether it’s
>> acceptable to “fill, then declare, CRLs”. I was trying to cover the former,
>> but I don’t think there is any innate prohibition on the latter, and it was
>> what I was trying to call out in the previous mail.
>>
>> I do take your point about deterministically, because the process I’m
>> describing is implicitly assuming you have a work queue (e.g. pub/sub, go
>> channel, etc), in which certs to revoke go in, and one or more CRL signers
>> consume the queue and produce CRLs. The order of that consumption would be
>> non-deterministic, but it very much would be parallelizable, and you’d be
>> in full control over what the work unit chunks were sized at.
>>
>> Right, neither of these are required if you can “produce, then declare”.
>> From the client perspective, a consuming party cannot observe any
>> meaningful difference from the “declare, then produce” or the “produce,
>> then declare”, since in both cases, they have to wait for the CRL to be
>> published on the server before they can consume. The fact that they know
>> the URL, but the content is stale/not yet updated (I.e. the declare then
>> produce scenario) doesn’t provide any advantages. Ostensibly, the “produce,
>> then declare” gives greater advantage to the client/root program, because
>> then they can say “All URLs must be correct at time of declaration” and use
>> that to be able to quantify whether or not the CA met their timeline
>> obligations for the mass revocation event.
>>
>
> I think we managed to talk slightly past each other, but we're well into
> the weeds of implementation details so it probably doesn't matter much :)
> The question in my mind was not "can there be multiple CRL signers
> consuming revocations from the queue?"; but rather "assuming there are
> multiple CRL signers consuming revocations from the queue, what
> synchronization do they have to do to ensure that multiple signers don't
> decide the old CRL is full and allocate new ones at the same time?". In the
> world where every certificate is pre-allocated to a CRL shard, no such
> synchronization is necessary at all.
>

Oh, I meant they could be signing independent CRLs (e.g. each has an IDP
with a prefix indicating which shard-generator is running), and at the end
of the queue-draining ceremony, you see what CRLs each worker created, and
add those to the JSON. So you could have multiple "small" CRLs (one or more
for each worker, depending on how you manage things), allowing them to
process revocations wholly independently. This, of course, relies again on
the assumption that the cRLDP is not baked into the certificate, which
enables you to have maximum flexibility in how CRL URLs are allocated and
sharded, provided the union of all of their contents reflects the CA's
state.
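Under that "produce, then declare" flow, assembling the CCADB JSON after a queue-draining run could be as simple as taking the union of every worker's shard URLs — a minimal sketch with hypothetical URLs and function names:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// After a queue-draining run, each worker reports the URLs of the CRL
// shards it produced. The declared JSON array is simply the union of all
// workers' URLs; no coordination is needed while they sign.
func assembleCRLJSON(perWorkerURLs [][]string) (string, error) {
	var all []string
	for _, urls := range perWorkerURLs {
		all = append(all, urls...)
	}
	b, err := json.Marshal(all)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Hypothetical per-worker shard URLs, prefixed per worker as described.
	out, err := assembleCRLJSON([][]string{
		{"https://crl.example/w1/1.crl", "https://crl.example/w1/2.crl"},
		{"https://crl.example/w2/1.crl"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```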


> This conversation does raise a different question in my mind. The Baseline
> Requirements do not have a provision that requires that a CRL be re-issued
> within 24 hours of the revocation of any certificate which falls within its
> scope. CRLs and OCSP responses for Intermediate CAs are clearly required to
> receive updates within 24 hours of the revocation of a relevant certificate
> (sections 4.9.7 and 4.9.10 respectively), but no such requirement appears
> to exist for end-entity CRLs. The closest is the requirement that
> subscriber certificates be revoked within 24 hours after certain conditions
> are met, but the same structure exists for the conditions under which
> Intermediate CAs must be revoked, suggesting that the BRs believe there is
> a difference between revoking a certificate and *publishing* that
> revocation via OCSP or CRLs. Is this distinction intended by the root
> programs, and does anyone intend to change this status quo as more emphasis
> is placed on end-entity CRLs?
>
> Or more bluntly: in the presence of OCSP and CRLs being published side by
> side, is it expected that the CA MUST re-issue a sharded end-entity CRL
> within 24 hours of revoking a certificate in its scope, or may the CA wait
> to re-issue the CRL until its next 7-day re-issuance time comes up as
> normal?
>

I recall this came up in the past (with DigiCert, [1]), in which
"revocation" was enacted by setting a flag in a database (or perhaps that
was an *extra* incident, with a different CA), but not through the actual
publication and propagation of that revocation information from DigiCert's
systems through the CDN. The issue at the time was with respect to
4.9.1.1's requirements of whether "SHALL revoke" is a matter of merely a
server-side bit, or whether it's the actual publication of that 

Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Aaron Gable via dev-security-policy
On Fri, Feb 26, 2021 at 12:05 PM Ryan Sleevi  wrote:

> You can still do parallel signing. I was trying to account for that
> explicitly with the notion of the “pre-reserved” set of URLs. However, that
> also makes an assumption I should have been more explicit about: whether
> the expectation is “you declare, then fill, CRLs”, or whether it’s
> acceptable to “fill, then declare, CRLs”. I was trying to cover the former,
> but I don’t think there is any innate prohibition on the latter, and it was
> what I was trying to call out in the previous mail.
>
> I do take your point about deterministically, because the process I’m
> describing is implicitly assuming you have a work queue (e.g. pub/sub, go
> channel, etc), in which certs to revoke go in, and one or more CRL signers
> consume the queue and produce CRLs. The order of that consumption would be
> non-deterministic, but it very much would be parallelizable, and you’d be
> in full control over what the work unit chunks were sized at.
>
> Right, neither of these are required if you can “produce, then declare”.
> From the client perspective, a consuming party cannot observe any
> meaningful difference from the “declare, then produce” or the “produce,
> then declare”, since in both cases, they have to wait for the CRL to be
> published on the server before they can consume. The fact that they know
> the URL, but the content is stale/not yet updated (I.e. the declare then
> produce scenario) doesn’t provide any advantages. Ostensibly, the “produce,
> then declare” gives greater advantage to the client/root program, because
> then they can say “All URLs must be correct at time of declaration” and use
> that to be able to quantify whether or not the CA met their timeline
> obligations for the mass revocation event.
>

I think we managed to talk slightly past each other, but we're well into
the weeds of implementation details so it probably doesn't matter much :)
The question in my mind was not "can there be multiple CRL signers
consuming revocations from the queue?"; but rather "assuming there are
multiple CRL signers consuming revocations from the queue, what
synchronization do they have to do to ensure that multiple signers don't
decide the old CRL is full and allocate new ones at the same time?". In the
world where every certificate is pre-allocated to a CRL shard, no such
synchronization is necessary at all.
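The pre-allocation described here can be as simple as a deterministic hash of the serial into a fixed number of shards — a sketch only (the function name and hash choice are illustrative, not Let's Encrypt's actual scheme):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor deterministically maps a certificate serial to one of n CRL
// shards. Because the mapping is fixed at issuance time, any number of
// signers can process revocations for "their" shards with no
// synchronization: two signers never contend for the same shard.
func shardFor(serial string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(serial))
	return h.Sum32() % n
}

func main() {
	// Hypothetical serial; the same serial always lands in the same shard.
	fmt.Println(shardFor("03a18f42", 128))
}
```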

This conversation does raise a different question in my mind. The Baseline
Requirements do not have a provision that requires that a CRL be re-issued
within 24 hours of the revocation of any certificate which falls within its
scope. CRLs and OCSP responses for Intermediate CAs are clearly required to
receive updates within 24 hours of the revocation of a relevant certificate
(sections 4.9.7 and 4.9.10 respectively), but no such requirement appears
to exist for end-entity CRLs. The closest is the requirement that
subscriber certificates be revoked within 24 hours after certain conditions
are met, but the same structure exists for the conditions under which
Intermediate CAs must be revoked, suggesting that the BRs believe there is
a difference between revoking a certificate and *publishing* that
revocation via OCSP or CRLs. Is this distinction intended by the root
programs, and does anyone intend to change this status quo as more emphasis
is placed on end-entity CRLs?

Or more bluntly: in the presence of OCSP and CRLs being published side by
side, is it expected that the CA MUST re-issue a sharded end-entity CRL
within 24 hours of revoking a certificate in its scope, or may the CA wait
to re-issue the CRL until its next 7-day re-issuance time comes up as
normal?

> Agreed - I do think having a well-tested, reliable path for programmatic
> update is an essential property to mandating the population. My hope and
> belief, however, is that this is fairly light-weight and doable.
>

Thanks, I look forward to hearing more about what this will look like.

Aaron


Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Ryan Sleevi via dev-security-policy
On Fri, Feb 26, 2021 at 1:46 PM Aaron Gable  wrote:

> If we leave out the "new url for each re-issuance of a given CRL" portion
> of the design (or offer both url-per-thisUpdate and
> static-url-always-pointing-at-the-latest), then we could in fact include
> CRLDP urls in the certificates using the rolling time-based shards model.
> And frankly we may want to do that in the near future: maintaining both CRL
> *and* OCSP infrastructure when the BRs require only one or the other is an
> unnecessary expense, and turning down our OCSP infrastructure would
> constitute a significant savings, both in tangible bills and in engineering
> effort.
>

This isn’t quite correct. You MUST support OCSP for EE certs. It is only
optional for intermediates. So you can’t really contemplate turning down
the OCSP side, and that’s intentional, because clients use OCSP, rather
than CRLs, as the fallback mechanism for when the aggregated-CRLs fail.

I think it would be several years off before we could practically talk
about removing the OCSP requirement, once much more reliable CRL profiles
are in place, which by necessity would also mean profiling the acceptable
sharding algorithms.

Further, under today’s model, while you COULD place the CRLDP within the
certificate, that seems like it would only introduce additional cost and
limitation without providing you benefit. This is because major clients
won’t fetch the CRLDP for EE certs (especially if OCSP is present, which
the BRs MUST/REQUIRE). You would end up with some clients querying (such as
Java, IIRC), so you’d be paying for bandwidth, especially in your mass
revocation scenario, that would largely be unnecessary compared to the
status quo.

Thus, in my mind, the dynamic sharding idea you outlined has two major
> downsides:
> 1) It requires us to maintain our parallel OCSP infrastructure
> indefinitely, and
>

To the above, I think this should be treated as a foregone conclusion in
today’s requirements. So I think mostly the discussion here focuses on #2,
which is really useful.

> 2) It is much less resilient in the face of a mass revocation event.
>
> Fundamentally, we need our infrastructure to be able to handle the
> revocation of 200M certificates in 24 hours without any difference from how
> it handles the revocation of one certificate in the same period. Already
> having certificates pre-allocated into CRL shards means that we can
> deterministically sign many CRLs in parallel.
>

You can still do parallel signing. I was trying to account for that
explicitly with the notion of the “pre-reserved” set of URLs. However, that
also makes an assumption I should have been more explicit about: whether
the expectation is “you declare, then fill, CRLs”, or whether it’s
acceptable to “fill, then declare, CRLs”. I was trying to cover the former,
but I don’t think there is any innate prohibition on the latter, and it was
what I was trying to call out in the previous mail.

I do take your point about deterministically, because the process I’m
describing is implicitly assuming you have a work queue (e.g. pub/sub, go
channel, etc), in which certs to revoke go in, and one or more CRL signers
consume the queue and produce CRLs. The order of that consumption would be
non-deterministic, but it very much would be parallelizable, and you’d be
in full control over what the work unit chunks were sized at.
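The work-queue model described above might look like this in miniature (Go, since channels were mentioned; the actual signing is stubbed out): revoked serials go into a channel, several workers drain it, and each worker accumulates its own shard:

```go
package main

import (
	"fmt"
	"sync"
)

// drainQueue sketches the queue-draining ceremony: revoked serials are
// pushed into a channel and consumed by `workers` independent CRL
// signers. Consumption order is non-deterministic, but the union of all
// shards covers every revocation exactly once.
func drainQueue(serials []string, workers int) [][]string {
	queue := make(chan string, len(serials))
	for _, s := range serials {
		queue <- s
	}
	close(queue)

	shards := make([][]string, workers)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			for s := range queue {
				// A real signer would add s to its CRL and re-sign here.
				shards[i] = append(shards[i], s)
			}
		}(i)
	}
	wg.Wait()
	return shards
}

func main() {
	shards := drainQueue([]string{"a", "b", "c", "d", "e"}, 3)
	total := 0
	for _, sh := range shards {
		total += len(sh)
	}
	fmt.Println(total) // every revocation landed in exactly one shard
}
```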

>
> Dynamically assigning certificates to CRLs as they are revoked requires
> taking a lock to determine if a new CRL needs to be created or not, and
> then atomically creating a new one. Or it requires a separate,
> not-operation-as-normal process to allocate a bunch of new CRLs, assign
> certs to them, and then sign those in parallel. Neither of these --
> dramatically changing not just the quantity but the *quality* of the
> database access, nor introducing additional processes -- is acceptable in
> the face of a mass revocation event.
>

Right, neither of these are required if you can “produce, then declare”.
From the client perspective, a consuming party cannot observe any
meaningful difference from the “declare, then produce” or the “produce,
then declare”, since in both cases, they have to wait for the CRL to be
published on the server before they can consume. The fact that they know
the URL, but the content is stale/not yet updated (I.e. the declare then
produce scenario) doesn’t provide any advantages. Ostensibly, the “produce,
then declare” gives greater advantage to the client/root program, because
then they can say “All URLs must be correct at time of declaration” and use
that to be able to quantify whether or not the CA met their timeline
obligations for the mass revocation event.

> In any case, I think this conversation has served the majority of its
> purpose. This discussion has led to several ideas that would allow us to
> update our JSON document only when we create new shards (which will still
> likely be every 6 to 24 hours), as opposed to on every re-issuance of a
> 

Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Aaron Gable via dev-security-policy
Thanks for the reminder that CCADB automatically dereferences URLs for
archival purposes, and for the info about existing automation! I don't
personally have CCADB credentials, so all of my knowledge of it is based on
what I've learned from others at LE and from this list.

If we leave out the "new url for each re-issuance of a given CRL" portion
of the design (or offer both url-per-thisUpdate and
static-url-always-pointing-at-the-latest), then we could in fact include
CRLDP urls in the certificates using the rolling time-based shards model.
And frankly we may want to do that in the near future: maintaining both CRL
*and* OCSP infrastructure when the BRs require only one or the other is an
unnecessary expense, and turning down our OCSP infrastructure would
constitute a significant savings, both in tangible bills and in engineering
effort.

Thus, in my mind, the dynamic sharding idea you outlined has two major
downsides:
1) It requires us to maintain our parallel OCSP infrastructure
indefinitely, and
2) It is much less resilient in the face of a mass revocation event.

Fundamentally, we need our infrastructure to be able to handle the
revocation of 200M certificates in 24 hours without any difference from how
it handles the revocation of one certificate in the same period. Already
having certificates pre-allocated into CRL shards means that we can
deterministically sign many CRLs in parallel.

Dynamically assigning certificates to CRLs as they are revoked requires
taking a lock to determine if a new CRL needs to be created or not, and
then atomically creating a new one. Or it requires a separate,
not-operation-as-normal process to allocate a bunch of new CRLs, assign
certs to them, and then sign those in parallel. Neither of these --
dramatically changing not just the quantity but the *quality* of the
database access, nor introducing additional processes -- is acceptable in
the face of a mass revocation event.
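For contrast, the synchronization that dynamic assignment forces might be sketched like this (hypothetical types; a real implementation would be against a database, not in-memory): every revocation must serialize on a lock to decide whether the current CRL is full and, if so, atomically open a new one:

```go
package main

import (
	"fmt"
	"sync"
)

// allocator illustrates the contention point: dynamic assignment means
// every revocation takes a lock to check whether the current CRL is
// full, opening a new CRL under that same lock when it is.
type allocator struct {
	mu      sync.Mutex
	maxSize int
	crls    [][]string // each inner slice is one CRL's revoked serials
}

func (a *allocator) assign(serial string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	n := len(a.crls)
	if n == 0 || len(a.crls[n-1]) >= a.maxSize {
		a.crls = append(a.crls, nil) // open a new CRL under the lock
		n++
	}
	a.crls[n-1] = append(a.crls[n-1], serial)
}

func main() {
	a := &allocator{maxSize: 2}
	var wg sync.WaitGroup
	for _, s := range []string{"a", "b", "c", "d", "e"} {
		wg.Add(1)
		go func(s string) { defer wg.Done(); a.assign(s) }(s)
	}
	wg.Wait()
	fmt.Println(len(a.crls)) // 3 CRLs of at most 2 entries each
}
```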

In any case, I think this conversation has served the majority of its
purpose. This discussion has led to several ideas that would allow us to
update our JSON document only when we create new shards (which will still
likely be every 6 to 24 hours), as opposed to on every re-issuance of a
shard. We'd still greatly prefer that CCADB be willing to
accept-and-dereference a URL to a JSON document, as it would allow our
systems to have fewer dependencies and fewer failure modes, but understand
that our arguments may not be persuasive enough :)

If Mozilla et al. do go forward with this proposal as-is, I'd like to
specifically request that CCADB surfaces an API to update this field before
any root programs require that it be populated, and does so with sufficient
lead time for development against the API to occur.

Thanks again,
Aaron

On Fri, Feb 26, 2021 at 8:47 AM Ryan Sleevi  wrote:

>
>
> On Fri, Feb 26, 2021 at 5:49 AM Rob Stradling  wrote:
>
>> > We already have automation for CCADB. CAs can and do use it for
>> disclosure of intermediates.
>>
>> Any CA representatives that are surprised by this statement might want to
>> go and read the "CCADB Release Notes" (click the hyperlink when you
>> login to the CCADB).  That's the only place I've seen the CCADB API
>> "announced".
>>
>> > Since we're talking Let's Encrypt, the assumption here is that the CRL URLs
>> > will not be present within the crlDistributionPoints of the certificates,
>> > otherwise, this entire discussion is fairly moot, since those
>> > crlDistributionPoints can be obtained directly from Certificate Transparency.
>>
>> AIUI, Mozilla is moving towards requiring that the CCADB holds all CRL
>> URLs, even the ones that also appear in crlDistributionPoints extensions.
>> Therefore, I think that this entire discussion is not moot at all.
>>
>
> Rob,
>
> I think you misparsed, but that's understandable, because I worded it
> poorly. The discussion is mooted by whether or not the CA includes the
> cRLDP within the certificate itself - i.e. that the CA has to allocate the
> shard at issuance time and that it's fixed for the lifetime of the
> certificate. That's not a requirement - EEs don't need cRLDPs - and so
> there's no inherent need to do static assignment, nor does it sound like LE
> is looking to go that route, since it would be incompatible with the design
> they outlined. Because of this, the dynamic sharding discussed seems
> significantly _less_ complex, both for producers and for consumers of this
> data, than the static sharding-and-immutability scheme proposed.
>


Re: Policy 2.7.1: MRSP Issue #206: Limit re-use of domain name verification to 398 days

2021-02-26 Thread Ryan Sleevi via dev-security-policy
On Thu, Feb 25, 2021 at 7:55 PM Clint Wilson via dev-security-policy <
dev-security-policy@lists.mozilla.org> wrote:

> I think it makes sense to separate out the date for domain validation
> expiration from the issuance of server certificates with previously
> validated domain names, but agree with Ben that the timeline doesn’t seem
> to need to be prolonged. What about something like this:
>
> 1. Domain name or IP address verifications performed on or after July 1,
> 2021 may be reused for a maximum of 398 days.
> 2. Server certificates issued on or after September 1, 2021 must have
> completed domain name or IP address verification within the preceding 398
> days.
>
> This effectively stretches the “cliff” out across ~6 months (now through
> the end of August), which seems reasonable.
>

Yeah, that does sound reasonable.


Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Ryan Sleevi via dev-security-policy
On Fri, Feb 26, 2021 at 5:49 AM Rob Stradling  wrote:

> > We already have automation for CCADB. CAs can and do use it for
> disclosure of intermediates.
>
> Any CA representatives that are surprised by this statement might want to
> go and read the "CCADB Release Notes" (click the hyperlink when you login
> to the CCADB).  That's the only place I've seen the CCADB API "announced".
>
> > Since we're talking Let's Encrypt, the assumption here is that the CRL URLs
> > will not be present within the crlDistributionPoints of the certificates,
> > otherwise, this entire discussion is fairly moot, since those
> > crlDistributionPoints can be obtained directly from Certificate Transparency.
>
> AIUI, Mozilla is moving towards requiring that the CCADB holds all CRL
> URLs, even the ones that also appear in crlDistributionPoints extensions.
> Therefore, I think that this entire discussion is not moot at all.
>

Rob,

I think you misparsed, but that's understandable, because I worded it
poorly. The discussion is mooted by whether or not the CA includes the
cRLDP within the certificate itself - i.e. that the CA has to allocate the
shard at issuance time and that it's fixed for the lifetime of the
certificate. That's not a requirement - EEs don't need cRLDPs - and so
there's no inherent need to do static assignment, nor does it sound like LE
is looking to go that route, since it would be incompatible with the design
they outlined. Because of this, the dynamic sharding discussed seems
significantly _less_ complex, both for producers and for consumers of this
data, than the static sharding-and-immutability scheme proposed.


Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Rob Stradling via dev-security-policy
> We already have automation for CCADB. CAs can and do use it for disclosure of 
> intermediates.

Any CA representatives that are surprised by this statement might want to go 
and read the "CCADB Release Notes" (click the hyperlink when you login to the 
CCADB).  That's the only place I've seen the CCADB API "announced".

> Since we're talking Let's Encrypt, the assumption here is that the CRL URLs
> will not be present within the crlDistributionPoints of the certificates,
> otherwise, this entire discussion is fairly moot, since those
> crlDistributionPoints can be obtained directly from Certificate Transparency.

AIUI, Mozilla is moving towards requiring that the CCADB holds all CRL URLs, 
even the ones that also appear in crlDistributionPoints extensions.  Therefore, 
I think that this entire discussion is not moot at all.

Ben's placeholder text:
https://github.com/BenWilson-Mozilla/pkipolicy/commit/26c1ee4ea8be1a07f86253e38fbf0cc043e12d48


From: dev-security-policy  on 
behalf of Ryan Sleevi via dev-security-policy 

Sent: 26 February 2021 06:02
To: Aaron Gable 
Cc: Ryan Sleevi ; mozilla-dev-security-policy 
; Kathleen Wilson 

Subject: Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs 
Issued By This CA


On Thu, Feb 25, 2021 at 8:21 PM Aaron Gable  wrote:

> If I may, I believe that the problem is less that it is a reference (which
> is true of every URL stored in CCADB), and more that it is a reference to
> an unsigned object.
>

While that's a small part, it really is as I said: the issue of being a
reference. We've already had this issue with the other URL fields, and thus
there exists logic to dereference and archive those URLs within CCADB.
Issues like audit statements, CP, and CPSes are all things that are indeed
critical to understanding the posture of a CA over time, and so actually
having those materials in something stable and maintained (without a
dependence on the CA) is important.  It's the lesson from those
various past failure modes that had Google very supportive of the non-URL
based approach, putting the JSON directly in CCADB, rather than forcing yet
another "update-and-fetch" system. You're absolutely correct
that the "configured by CA" element has the nice property of being assured
that the change came from the CA themselves, without requiring signing, but
I wouldn't want to reduce the concern to just that.

* I'm not aware of any other automation system with write-access to CCADB
> (I may be very wrong!), and I imagine there would need to be some sort of
> further design discussion with CCADB's maintainers about what it means to
> give write credentials to an automated system, what sorts of protections
> would be necessary around those credentials, how to scope those credentials
> as narrowly as possible, and more.
>

We already have automation for CCADB. CAs can and do use it for disclosure
of intermediates.


> * I'm not sure CCADB's maintainers want updates to it to be in the
> critical path of ongoing issuance, as opposed to just in the critical path
> for beginning issuance with a new issuer.
>

Without wanting to sound dismissive, whether or not it's in a critical path
of updating is the CA's choice on their design. I understand that there are
designs that could put it there, I think the question is whether it's
reasonable for the CA to have done that in the first place, which is why
it's important to drill down into these concerns. I know you merely
qualified it as undesirable, rather than actually being a blocker, and I
appreciate that, but I do think some of these concerns are perhaps less
grounded or persuasive than others :)

Taking a step back here, I think there's been a fundamental design error in
your proposed design, and I think that it, combined with the (existing)
automation, may make much of this not actually be the issue you anticipate.

Since we're talking Let's Encrypt, the assumption here is that the CRL URLs
will not be present within the crlDistributionPoints of the certificates,
otherwise, this entire discussion is fairly moot, since those
crlDistributionPoints can be obtained directly from Certificate
Transparency.

The purpose of this field is to help discover CRLs that are otherwise not
discoverable (e.g. from CT), but this also means that these CRLs do not
suffer from the same design limitations of PKI. Recall that there's nothing
intrinsic to a CRL that expresses its sharding algorithm (ignoring, for a
second, reasonCodes within the IDP extension). The only observability that
an external (not-the-CA) party has, whether the Subscriber or the RP, is
merely that "the CRL DP for this certificate is different from the CRLDP
for that certificate". It is otherwise opaque how the CA used it, even if
through a large enough corpus from CT,