Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Aaron Gable via dev-security-policy
On Fri, Feb 26, 2021 at 5:18 PM Ryan Sleevi  wrote:

> I do believe it's problematic for the OCSP and CRL versions of the
> repository to be out of sync, but also agree this is an area that is useful
> to clarify. To that end, I filed
> https://github.com/cabforum/servercert/issues/252 to make sure we don't
> lose track of this for the BRs.
>

Thanks! I like that bug, and commented on it to provide a little more
clarity for how the question arose in my mind and what language we might
want to update. It sounds like maybe what we want is language to the effect
that, if a CA is publishing both OCSP and CRLs, then a certificate is not
considered Revoked until it shows up as Revoked in both revocation
mechanisms. (And it must be Revoked within 24 hours.)

We'll make sure our parallel CRL infrastructure re-issues CRLs
close-to-immediately after a certificate in that shard's scope is revoked,
just as we do for OCSP today.

Thanks again,
Aaron


Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Ryan Sleevi via dev-security-policy
On Fri, Feb 26, 2021 at 6:01 PM Aaron Gable  wrote:

> On Fri, Feb 26, 2021 at 12:05 PM Ryan Sleevi  wrote:
>
>> You can still do parallel signing. I was trying to account for that
>> explicitly with the notion of the “pre-reserved” set of URLs. However, that
>> also makes an assumption I should have been more explicit about: whether
>> the expectation is “you declare, then fill, CRLs”, or whether it’s
>> acceptable to “fill, then declare, CRLs”. I was trying to cover the former,
>> but I don’t think there is any innate prohibition on the latter, and it was
>> what I was trying to call out in the previous mail.
>>
>> I do take your point about deterministically, because the process I’m
>> describing is implicitly assuming you have a work queue (e.g. pub/sub, go
>> channel, etc), in which certs to revoke go in, and one or more CRL signers
>> consume the queue and produce CRLs. The order of that consumption would be
>> non-deterministic, but it very much would be parallelizable, and you’d be
>> in full control over what the work unit chunks were sized at.
>>
>> Right, neither of these is required if you can “produce, then declare”.
>> From the client perspective, a consuming party cannot observe any
>> meaningful difference from the “declare, then produce” or the “produce,
>> then declare”, since in both cases, they have to wait for the CRL to be
>> published on the server before they can consume. The fact that they know
>> the URL, but the content is stale/not yet updated (i.e. the declare then
>> produce scenario) doesn’t provide any advantages. Ostensibly, the “produce,
>> then declare” gives greater advantage to the client/root program, because
>> then they can say “All URLs must be correct at time of declaration” and use
>> that to be able to quantify whether or not the CA met their timeline
>> obligations for the mass revocation event.
>>
>
> I think we managed to talk slightly past each other, but we're well into
> the weeds of implementation details so it probably doesn't matter much :)
> The question in my mind was not "can there be multiple CRL signers
> consuming revocations from the queue?"; but rather "assuming there are
> multiple CRL signers consuming revocations from the queue, what
> synchronization do they have to do to ensure that multiple signers don't
> decide the old CRL is full and allocate new ones at the same time?". In the
> world where every certificate is pre-allocated to a CRL shard, no such
> synchronization is necessary at all.
>

Oh, I meant they could be signing independent CRLs (e.g. each has an IDP
with a prefix indicating which shard-generator is running), and at the end
of the queue-draining ceremony, you see what CRLs each worker created, and
add those to the JSON. So you could have multiple "small" CRLs (one or more
for each worker, depending on how you manage things), allowing them to
process revocations wholly independently. This, of course, relies again on
the assumption that the cRLDP is not baked into the certificate, which
enables you to have maximum flexibility in how CRL URLs are allocated and
sharded, provided the union of all of their contents reflects the CA's
state.
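
To make that concrete, here's a rough Go sketch of the queue-draining
arrangement I have in mind (signAndPublish is a hypothetical stand-in for
the CA's real signer, which would set the per-worker IDP and upload the
CRL; it is nobody's actual API):

    package crl

    import (
        "crypto/x509/pkix"
        "encoding/json"
        "fmt"
    )

    // signAndPublish stands in for the real signer: build a CRL whose IDP
    // URL embeds the worker ID, sign it, upload it, and return its URL.
    func signAndPublish(worker int, batch []pkix.RevokedCertificate) string {
        _ = batch // a real implementation signs and uploads here
        return fmt.Sprintf("http://c.example/crls/worker-%d.crl", worker)
    }

    // DrainQueue: N workers race to consume the revocation queue (the
    // caller closes the channel once the mass-revocation set is enqueued);
    // each worker builds its own small CRL, and once the queue is drained
    // the URLs they produced become the JSON array disclosed via CCADB.
    func DrainQueue(revocations <-chan pkix.RevokedCertificate, numWorkers int) []byte {
        urls := make(chan string, numWorkers)
        for i := 0; i < numWorkers; i++ {
            go func(worker int) {
                var batch []pkix.RevokedCertificate
                for rc := range revocations {
                    batch = append(batch, rc) // consumption order is non-deterministic
                }
                urls <- signAndPublish(worker, batch)
            }(i)
        }
        var shardURLs []string
        for i := 0; i < numWorkers; i++ {
            shardURLs = append(shardURLs, <-urls)
        }
        out, _ := json.Marshal(shardURLs)
        return out
    }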


> This conversation does raise a different question in my mind. The Baseline
> Requirements do not have a provision that requires that a CRL be re-issued
> within 24 hours of the revocation of any certificate which falls within its
> scope. CRLs and OCSP responses for Intermediate CAs are clearly required to
> receive updates within 24 hours of the revocation of a relevant certificate
> (sections 4.9.7 and 4.9.10 respectively), but no such requirement appears
> to exist for end-entity CRLs. The closest is the requirement that
> subscriber certificates be revoked within 24 hours after certain conditions
> are met, but the same structure exists for the conditions under which
> Intermediate CAs must be revoked, suggesting that the BRs believe there is
> a difference between revoking a certificate and *publishing* that
> revocation via OCSP or CRLs. Is this distinction intended by the root
> programs, and does anyone intend to change this status quo as more emphasis
> is placed on end-entity CRLs?
>
> Or more bluntly: in the presence of OCSP and CRLs being published side by
> side, is it expected that the CA MUST re-issue a sharded end-entity CRL
> within 24 hours of revoking a certificate in its scope, or may the CA wait
> to re-issue the CRL until its next 7-day re-issuance time comes up as
> normal?
>

I recall this came up in the past (with DigiCert, [1]), in which
"revocation" was enacted by setting a flag in a database (or perhaps that
was an *extra* incident, with a different CA), but not through the actual
publication and propagation of that revocation information from DigiCert's
systems through the CDN. The issue at the time was with respect to
4.9.1.1's requirements of whether "SHALL revoke" is a matter of merely a
server-side bit, or whether it's the actual publication of that

Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Aaron Gable via dev-security-policy
On Fri, Feb 26, 2021 at 12:05 PM Ryan Sleevi  wrote:

> You can still do parallel signing. I was trying to account for that
> explicitly with the notion of the “pre-reserved” set of URLs. However, that
> also makes an assumption I should have been more explicit about: whether
> the expectation is “you declare, then fill, CRLs”, or whether it’s
> acceptable to “fill, then declare, CRLs”. I was trying to cover the former,
> but I don’t think there is any innate prohibition on the latter, and it was
> what I was trying to call out in the previous mail.
>
> I do take your point about deterministically, because the process I’m
> describing is implicitly assuming you have a work queue (e.g. pub/sub, go
> channel, etc), in which certs to revoke go in, and one or more CRL signers
> consume the queue and produce CRLs. The order of that consumption would be
> non-deterministic, but it very much would be parallelizable, and you’d be
> in full control over what the work unit chunks were sized at.
>
> Right, neither of these is required if you can “produce, then declare”.
> From the client perspective, a consuming party cannot observe any
> meaningful difference from the “declare, then produce” or the “produce,
> then declare”, since in both cases, they have to wait for the CRL to be
> published on the server before they can consume. The fact that they know
> the URL, but the content is stale/not yet updated (i.e. the declare then
> produce scenario) doesn’t provide any advantages. Ostensibly, the “produce,
> then declare” gives greater advantage to the client/root program, because
> then they can say “All URLs must be correct at time of declaration” and use
> that to be able to quantify whether or not the CA met their timeline
> obligations for the mass revocation event.
>

I think we managed to talk slightly past each other, but we're well into
the weeds of implementation details so it probably doesn't matter much :)
The question in my mind was not "can there be multiple CRL signers
consuming revocations from the queue?"; but rather "assuming there are
multiple CRL signers consuming revocations from the queue, what
synchronization do they have to do to ensure that multiple signers don't
decide the old CRL is full and allocate new ones at the same time?". In the
world where every certificate is pre-allocated to a CRL shard, no such
synchronization is necessary at all.

This conversation does raise a different question in my mind. The Baseline
Requirements do not have a provision that requires that a CRL be re-issued
within 24 hours of the revocation of any certificate which falls within its
scope. CRLs and OCSP responses for Intermediate CAs are clearly required to
receive updates within 24 hours of the revocation of a relevant certificate
(sections 4.9.7 and 4.9.10 respectively), but no such requirement appears
to exist for end-entity CRLs. The closest is the requirement that
subscriber certificates be revoked within 24 hours after certain conditions
are met, but the same structure exists for the conditions under which
Intermediate CAs must be revoked, suggesting that the BRs believe there is
a difference between revoking a certificate and *publishing* that
revocation via OCSP or CRLs. Is this distinction intended by the root
programs, and does anyone intend to change this status quo as more emphasis
is placed on end-entity CRLs?

Or more bluntly: in the presence of OCSP and CRLs being published side by
side, is it expected that the CA MUST re-issue a sharded end-entity CRL
within 24 hours of revoking a certificate in its scope, or may the CA wait
to re-issue the CRL until its next 7-day re-issuance time comes up as
normal?

> Agreed - I do think having a well-tested, reliable path for programmatic
> update is an essential property to mandating the population. My hope and
> belief, however, is that this is fairly light-weight and doable.
>

Thanks, I look forward to hearing more about what this will look like.

Aaron


Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Ryan Sleevi via dev-security-policy
On Fri, Feb 26, 2021 at 1:46 PM Aaron Gable  wrote:

> If we leave out the "new url for each re-issuance of a given CRL" portion
> of the design (or offer both url-per-thisUpdate and
> static-url-always-pointing-at-the-latest), then we could in fact include
> CRLDP urls in the certificates using the rolling time-based shards model.
> And frankly we may want to do that in the near future: maintaining both CRL
> *and* OCSP infrastructure when the BRs require only one or the other is an
> unnecessary expense, and turning down our OCSP infrastructure would
> constitute a significant savings, both in tangible bills and in engineering
> effort.
>

This isn’t quite correct. You MUST support OCSP for EE certs. It is only
optional for intermediates. So you can’t really contemplate turning down
the OCSP side, and that’s intentional, because clients use OCSP, rather
than CRLs, as the fallback mechanism for when the aggregated-CRLs fail.

I think it would be several years before we could practically talk
about removing the OCSP requirement, once much more reliable CRL profiles
are in place, which by necessity would also mean profiling the acceptable
sharding algorithms.

Further, under today’s model, while you COULD place the CRLDP within the
certificate, that seems like it would only introduce additional cost and
limitation without providing you benefit. This is because major clients
won’t fetch the CRLDP for EE certs (especially if OCSP is present, which
the BRs MUST/REQUIRE). You would end up with some clients querying (such as
Java, IIRC), so you’d be paying for bandwidth, especially in your mass
revocation scenario, that would largely be unnecessary compared to the
status quo.

> Thus, in my mind, the dynamic sharding idea you outlined has two major
> downsides:
> 1) It requires us to maintain our parallel OCSP infrastructure
> indefinitely, and
>

To the above, I think this should be treated as a foregone conclusion in
today’s requirements. So I think mostly the discussion here focuses on #2,
which is really useful.

> 2) It is much less resilient in the face of a mass revocation event.
>
> Fundamentally, we need our infrastructure to be able to handle the
> revocation of 200M certificates in 24 hours without any difference from how
> it handles the revocation of one certificate in the same period. Already
> having certificates pre-allocated into CRL shards means that we can
> deterministically sign many CRLs in parallel.
>

You can still do parallel signing. I was trying to account for that
explicitly with the notion of the “pre-reserved” set of URLs. However, that
also makes an assumption I should have been more explicit about: whether
the expectation is “you declare, then fill, CRLs”, or whether it’s
acceptable to “fill, then declare, CRLs”. I was trying to cover the former,
but I don’t think there is any innate prohibition on the latter, and it was
what I was trying to call out in the previous mail.

I do take your point about deterministically, because the process I’m
describing is implicitly assuming you have a work queue (e.g. pub/sub, go
channel, etc), in which certs to revoke go in, and one or more CRL signers
consume the queue and produce CRLs. The order of that consumption would be
non-deterministic, but it very much would be parallelizable, and you’d be
in full control over what the work unit chunks were sized at.

>
> Dynamically assigning certificates to CRLs as they are revoked requires
> taking a lock to determine if a new CRL needs to be created or not, and
> then atomically creating a new one. Or it requires a separate,
> not-operation-as-normal process to allocate a bunch of new CRLs, assign
> certs to them, and then sign those in parallel. Neither of these --
> dramatically changing not just the quantity but the *quality* of the
> database access, nor introducing additional processes -- is acceptable in
> the face of a mass revocation event.
>

Right, neither of these is required if you can “produce, then declare”.
From the client perspective, a consuming party cannot observe any
meaningful difference from the “declare, then produce” or the “produce,
then declare”, since in both cases, they have to wait for the CRL to be
published on the server before they can consume. The fact that they know
the URL, but the content is stale/not yet updated (i.e. the declare then
produce scenario) doesn’t provide any advantages. Ostensibly, the “produce,
then declare” gives greater advantage to the client/root program, because
then they can say “All URLs must be correct at time of declaration” and use
that to be able to quantify whether or not the CA met their timeline
obligations for the mass revocation event.

> In any case, I think this conversation has served the majority of its
> purpose. This discussion has led to several ideas that would allow us to
> update our JSON document only when we create new shards (which will still
> likely be every 6 to 24 hours), as opposed to on every re-issuance of a
> 

Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Aaron Gable via dev-security-policy
Thanks for the reminder that CCADB automatically dereferences URLs for
archival purposes, and for the info about existing automation! I don't
personally have CCADB credentials, so all of my knowledge of it is based on
what I've learned from others at LE and from this list.

If we leave out the "new url for each re-issuance of a given CRL" portion
of the design (or offer both url-per-thisUpdate and
static-url-always-pointing-at-the-latest), then we could in fact include
CRLDP urls in the certificates using the rolling time-based shards model.
And frankly we may want to do that in the near future: maintaining both CRL
*and* OCSP infrastructure when the BRs require only one or the other is an
unnecessary expense, and turning down our OCSP infrastructure would
constitute a significant savings, both in tangible bills and in engineering
effort.

Thus, in my mind, the dynamic sharding idea you outlined has two major
downsides:
1) It requires us to maintain our parallel OCSP infrastructure
indefinitely, and
2) It is much less resilient in the face of a mass revocation event.

Fundamentally, we need our infrastructure to be able to handle the
revocation of 200M certificates in 24 hours without any difference from how
it handles the revocation of one certificate in the same period. Already
having certificates pre-allocated into CRL shards means that we can
deterministically sign many CRLs in parallel.

Dynamically assigning certificates to CRLs as they are revoked requires
taking a lock to determine if a new CRL needs to be created or not, and
then atomically creating a new one. Or it requires a separate,
not-operation-as-normal process to allocate a bunch of new CRLs, assign
certs to them, and then sign those in parallel. Neither of these --
dramatically changing not just the quantity but the *quality* of the
database access, nor introducing additional processes -- is acceptable in
the face of a mass revocation event.
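
By contrast, pre-allocation needs nothing more than a pure function at
issuance time. A minimal sketch (the mod-500 bucketing mirrors the example
in my earlier message, not our production layout):

    package crl

    import "math/big"

    const numShards = 500 // illustrative shard count

    // ShardFor buckets a certificate into its CRL shard at issuance time,
    // so revocation-time signers never coordinate over which CRL a serial
    // belongs to; any number of them can re-sign shards in parallel.
    func ShardFor(serial *big.Int) int64 {
        return new(big.Int).Mod(serial, big.NewInt(numShards)).Int64()
    }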

In any case, I think this conversation has served the majority of its
purpose. This discussion has led to several ideas that would allow us to
update our JSON document only when we create new shards (which will still
likely be every 6 to 24 hours), as opposed to on every re-issuance of a
shard. We'd still greatly prefer that CCADB be willing to
accept-and-dereference a URL to a JSON document, as it would allow our
systems to have fewer dependencies and fewer failure modes, but understand
that our arguments may not be persuasive enough :)

If Mozilla et al. do go forward with this proposal as-is, I'd like to
specifically request that CCADB surfaces an API to update this field before
any root programs require that it be populated, and does so with sufficient
lead time for development against the API to occur.

Thanks again,
Aaron

On Fri, Feb 26, 2021 at 8:47 AM Ryan Sleevi  wrote:

>
>
> On Fri, Feb 26, 2021 at 5:49 AM Rob Stradling  wrote:
>
>> > We already have automation for CCADB. CAs can and do use it for
>> disclosure of intermediates.
>>
>> Any CA representatives that are surprised by this statement might want to
>> go and read the "CCADB Release Notes" (click the hyperlink when you
>> log in to the CCADB).  That's the only place I've seen the CCADB API
>> "announced".
>>
>> > Since we're talking Let's Encrypt, the assumption here is that the CRL
>> URLs
>> > will not be present within the crlDistributionPoints of the
>> certificates,
>> > otherwise, this entire discussion is fairly moot, since those
>> > crlDistributionPoints can be obtained directly from Certificate Transparency.
>>
>> AIUI, Mozilla is moving towards requiring that the CCADB holds all CRL
>> URLs, even the ones that also appear in crlDistributionPoints extensions.
>> Therefore, I think that this entire discussion is not moot at all.
>>
>
> Rob,
>
> I think you misparsed, but that's understandable, because I worded it
> poorly. The discussion is mooted by whether or not the CA includes the
> cRLDP within the certificate itself - i.e. that the CA has to allocate the
> shard at issuance time and that it's fixed for the lifetime of the
> certificate. That's not a requirement - EEs don't need cRLDPs - and so
> there's no inherent need to do static assignment, nor does it sound like LE
> is looking to go that route, since it would be incompatible with the design
> they outlined. Because of this, the dynamic sharding discussed seems
> significantly _less_ complex, both for producers and for consumers of this
> data, than the static sharding-and-immutability scheme proposed.
>


Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Ryan Sleevi via dev-security-policy
On Fri, Feb 26, 2021 at 5:49 AM Rob Stradling  wrote:

> > We already have automation for CCADB. CAs can and do use it for
> disclosure of intermediates.
>
> Any CA representatives that are surprised by this statement might want to
> go and read the "CCADB Release Notes" (click the hyperlink when you log in
> to the CCADB).  That's the only place I've seen the CCADB API "announced".
>
> > Since we're talking Let's Encrypt, the assumption here is that the CRL
> URLs
> > will not be present within the crlDistributionPoints of the certificates,
> > otherwise, this entire discussion is fairly moot, since those
> > crlDistributionPoints can be obtained directly from Certificate Transparency.
>
> AIUI, Mozilla is moving towards requiring that the CCADB holds all CRL
> URLs, even the ones that also appear in crlDistributionPoints extensions.
> Therefore, I think that this entire discussion is not moot at all.
>

Rob,

I think you misparsed, but that's understandable, because I worded it
poorly. The discussion is mooted by whether or not the CA includes the
cRLDP within the certificate itself - i.e. that the CA has to allocate the
shard at issuance time and that it's fixed for the lifetime of the
certificate. That's not a requirement - EEs don't need cRLDPs - and so
there's no inherent need to do static assignment, nor does it sound like LE
is looking to go that route, since it would be incompatible with the design
they outlined. Because of this, the dynamic sharding discussed seems
significantly _less_ complex, both for producers and for consumers of this
data, than the static sharding-and-immutability scheme proposed.


Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-26 Thread Rob Stradling via dev-security-policy
> We already have automation for CCADB. CAs can and do use it for disclosure of 
> intermediates.

Any CA representatives that are surprised by this statement might want to go 
and read the "CCADB Release Notes" (click the hyperlink when you log in to the 
CCADB).  That's the only place I've seen the CCADB API "announced".

> Since we're talking Let's Encrypt, the assumption here is that the CRL URLs
> will not be present within the crlDistributionPoints of the certificates,
> otherwise, this entire discussion is fairly moot, since those
> crlDistributionPoints can be obtained directly from Certificate Transparency.

AIUI, Mozilla is moving towards requiring that the CCADB holds all CRL URLs, 
even the ones that also appear in crlDistributionPoints extensions.  Therefore, 
I think that this entire discussion is not moot at all.

Ben's placeholder text:
https://github.com/BenWilson-Mozilla/pkipolicy/commit/26c1ee4ea8be1a07f86253e38fbf0cc043e12d48



Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-25 Thread Ryan Sleevi via dev-security-policy
On Thu, Feb 25, 2021 at 8:21 PM Aaron Gable  wrote:

> If I may, I believe that the problem is less that it is a reference (which
> is true of every URL stored in CCADB), and more that it is a reference to
> an unsigned object.
>

While that's a small part, it really is as I said: the issue of being a
reference. We've already had this issue with the other URL fields, and thus
there exists logic to dereference and archive those URLs within CCADB.
Issues like audit statements, CP, and CPSes are all things that are indeed
critical to understanding the posture of a CA over time, and so actually
having those materials in something stable and maintained (without a
dependence on the CA) is important. It's the lesson from those
various past failure modes that had Google very supportive of the non-URL
based approach, putting the JSON directly in CCADB, rather than forcing yet
another "update-and-fetch" system. You're absolutely correct
that the "configured by CA" element has the nice property of being assured
that the change came from the CA themselves, without requiring signing, but
I wouldn't want to reduce the concern to just that.

> * I'm not aware of any other automation system with write-access to CCADB
> (I may be very wrong!), and I imagine there would need to be some sort of
> further design discussion with CCADB's maintainers about what it means to
> give write credentials to an automated system, what sorts of protections
> would be necessary around those credentials, how to scope those credentials
> as narrowly as possible, and more.
>

We already have automation for CCADB. CAs can and do use it for disclosure
of intermediates.


> * I'm not sure CCADB's maintainers want updates to it to be in the
> critical path of ongoing issuance, as opposed to just in the critical path
> for beginning issuance with a new issuer.
>

Without wanting to sound dismissive, whether or not it's in a critical path
of updating is the CA's choice on their design. I understand that there are
designs that could put it there, I think the question is whether it's
reasonable for the CA to have done that in the first place, which is why
it's important to drill down into these concerns. I know you merely
qualified it as undesirable, rather than actually being a blocker, and I
appreciate that, but I do think some of these concerns are perhaps less
grounded or persuasive than others :)

Taking a step back here, I think there's been a fundamental design error in
your proposed design, and I think that it, combined with the (existing)
automation, may make much of this not actually be the issue you anticipate.

Since we're talking Let's Encrypt, the assumption here is that the CRL URLs
will not be present within the crlDistributionPoints of the certificates,
otherwise, this entire discussion is fairly moot, since those
crlDistributionPoints can be obtained directly from Certificate
Transparency.

The purpose of this field is to help discover CRLs that are otherwise not
discoverable (e.g. from CT), but this also means that these CRLs do not
suffer from the same design limitations of PKI. Recall that there's nothing
intrinsic to a CRL that expresses its sharding algorithm (ignoring, for a
second, reasonCodes within the IDP extension). The only observability that
an external (not-the-CA) party has, whether the Subscriber or the RP, is
merely that "the CRL DP for this certificate is different from the CRLDP
for that certificate". It is otherwise opaque how the CA used it, even if
through a large enough corpus from CT, you can infer the algorithm from the
pattern. Further, when such shards are being used, you can observe that a
given CRL that you have (whose provenance may be unknown) can be known
whether or not it covers a given certificate by matching the CRLDP of the
cert against the IDP of the CRL.  We're talking about a scenario in which
the certificate lacks a CRLDP, and so there's no way to know that, indeed,
a given CRL "covers" the certificate unambiguously. The only thing we have
is the CRL having an IDP, because if it didn't, it'd have to be a full CRL,
and then you'd be back to only having one URL to worry about.

Because of all of this, it means that the consumers of this JSON are
expected to combine all of the CRLs present, union all the revoked serials,
and be done with it. However, it's that unioning that I think you've
overlooked here in working out your math. In the "classic" PKI sense (i.e.
CRLDP present), the CA has to plan for revocation for the lifetime of the
certificate, it's fixed when the certificate is created, and it's immutable
once created. Further, changes in revocation frequency mean you need to
produce new versions of that specific CRL. However, the scenario we're
discussing, in which these CRLs are unioned, you're entirely flexible at
all points in time for how you balance your CRLs. Further, in the 'ideal'
case (no revocations), you need only produce a single empty CRL. There's no
need to produce an 
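
To spell out the consumer side of that unioning, a rough Go sketch
(signature verification and error handling elided; assumes a recent Go
standard library for x509.ParseRevocationList and its entry fields):

    package crl

    import (
        "crypto/x509"
        "io"
        "net/http"
    )

    // UnionRevoked fetches every per-shard CRL listed in the CCADB JSON
    // array and unions their revoked serials; the union, not any single
    // shard, is the CA's full revocation state.
    func UnionRevoked(urls []string) (map[string]bool, error) {
        revoked := make(map[string]bool)
        for _, u := range urls {
            resp, err := http.Get(u)
            if err != nil {
                return nil, err
            }
            der, err := io.ReadAll(resp.Body)
            resp.Body.Close()
            if err != nil {
                return nil, err
            }
            list, err := x509.ParseRevocationList(der)
            if err != nil {
                return nil, err
            }
            for _, e := range list.RevokedCertificateEntries {
                revoked[e.SerialNumber.String()] = true
            }
        }
        return revoked, nil
    }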

Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-25 Thread Aaron Gable via dev-security-policy
Similarly, snipping and replying to portions of your message below:

On Thu, Feb 25, 2021 at 12:52 PM Ryan Sleevi  wrote:

> Am I understanding your proposal correctly that "any published JSON
> document be valid for a certain period of time" effectively means that each
> update of the JSON document also gets a distinct URL (i.e. same as the
> CRLs)?
>

No, the (poorly expressed) idea is this: suppose you fetch our
rapidly-changing document and get version X. Over the next five minutes,
you fetch every CRL URL in that document. But during that same five
minutes, we've published versions X+1 and X+2 of that JSON document at that
same URL. There should be a guarantee that, as long as you fetch the CRLs
in your document "fast enough" (for some to-be-determined value of "fast"),
all of those URLs will still be valid (i.e. not return a 404 or similar),
*even though* some of them are not referenced by the most recent version of
the JSON document.

This may seem like a problem that arises only in our rapidly-changing JSON
version of things. But I believe it should be a concern even in the system
as proposed by Kathleen: when a CA updates the JSON array contained in
CCADB, how long does a consumer of CCADB have to get a snapshot of the
contents of the previous set of URLs? To posit an extreme hypothetical, can
a CA hide misissuance of a CRL by immediately hosting their fixed CRL at a
new URL and updating their CCADB JSON list to include that new URL instead?
Not to put too fine a point on it, but I believe that this sort of
hypothetical is the underlying worry about having the JSON list live
outside CCADB where it can be changed on a whim, but I'm not sure that
having the list live inside CCADB without any requirements on the validity
of the URLs inside it provides significantly more auditability.

> The issue I see with the "URL stored in CCADB" is that it's a reference,
> and the dereferencing operation (retrieving the URL) puts the onus on the
> consumer (e.g. root stores) and can fail, or result in different content
> for different parties, undetectably.
>

If I may, I believe that the problem is less that it is a reference (which
is true of every URL stored in CCADB), and more that it is a reference to
an unsigned object. URLs directly to CRLs don't have this issue, because
the CRL is signed. And storing the JSON array directly doesn't have this
issue, because it is implicitly signed by the credentials of the user who
signed in to CCADB to modify it. One possible solution here would be to
require that the JSON document be signed by the same CA certificate which
issued all of the CRLs contained in it. I don't think I like this solution,
but it is within the possibility space.


> If there is an API that allows you to modify the JSON contents directly
> (e.g. a CCADB API call you could make with an OAuth token), would that
> address your concern?
>

If Mozilla and the other stakeholders in CCADB decide to go with this
thread's proposal as-is, then I suspect that yes, we would develop
automation to talk to CCADB's API in exactly this way. This is undesired
from our perspective for a variety of reasons:
* I'm not aware of a well-maintained Go library for interacting with the
Salesforce API.
* I'm not aware of any other automation system with write-access to CCADB
(I may be very wrong!), and I imagine there would need to be some sort of
further design discussion with CCADB's maintainers about what it means to
give write credentials to an automated system, what sorts of protections
would be necessary around those credentials, how to scope those credentials
as narrowly as possible, and more.
* I'm not sure CCADB's maintainers want updates to it to be in the critical
path of ongoing issuance, as opposed to just in the critical path for
beginning issuance with a new issuer.

> I think the question was with respect to the frequency of change of those
> documents.
>

Frankly, I think the least frequent creation of a new time-sharded CRL we
would be willing to do is once every 24 hours (that's still >60MB per CRL
in the worst case). That's going to require automation no matter what.
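
(To spell out the worst-case arithmetic from the numbers in my earlier
message: ~150M active certificates at ~38 bytes per CRL entry is roughly
5.7 GB in total, and spread across the ~90 day-long shards implied by
90-day certificates and 24-hour shard creation, that's about 63 MB per
shard.)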


> There is one thing you mentioned that's also non-obvious to me, because I
> would expect you already have to deal with this exact issue with respect to
> OCSP, which is "overwriting files is a dangerous operation prone to many
> forms of failure". Could you expand more about what some of those
> top-concerns are? I ask, since, say, an OCSP Responder is frequently
> implemented as "Spool /ocsp/:issuerDN/:serialNumber", with the CA
> overwriting :serialNumber whenever they produce new responses. It sounds
> like you're saying that common design pattern may be problematic for y'all,
> and I'm curious to learn more.
>

Sure, happy to expand. For those following along at home, this last bit is
relatively off-topic compared to the other sections above, so skip if you
feel like it :)

OCSP consists of hundreds of millions of small entries. Thus our 

Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-25 Thread Ryan Sleevi via dev-security-policy
Hugely useful! Thanks for sharing - this is incredibly helpful.

I've snipped a good bit, just to keep the thread small, and have some
further questions inline.

On Thu, Feb 25, 2021 at 2:15 PM Aaron Gable  wrote:

> I believe that there is an argument to be made here that this plan
> increases the auditability of the CRLs, rather than decreases it. Root
> programs could require that any published JSON document be valid for a
> certain period of time, and that all CRLs within that document remain
> available for that period as well. Or even that historical versions of CRLs
> remain available until every certificate they cover has expired (which is
> what we intend to do anyway). Researchers can crawl our history of CRLs and
> examine revocation events in more detail than previously available.
>

So I think unpacking this a little: Am I understanding your proposal
correctly that "any published JSON document be valid for a certain period
of time" effectively means that each update of the JSON document also gets
a distinct URL (i.e. same as the CRLs)? I'm not sure if that's what you
meant, because it would still mean regularly updating CCADB whenever your
shard-set changes (which seems to be the concern), but at the same time, it
would seem that any validity requirement imposes on you a lower-bound for
how frequently you can change or introduce new shards, right?

The issue I see with the "URL stored in CCADB" is that it's a reference,
and the dereferencing operation (retrieving the URL) puts the onus on the
consumer (e.g. root stores) and can fail, or result in different content
for different parties, undetectably. If it was your proposal to change to
distinct URLs, that issue would still unfortunately exist.

If there is an API that allows you to modify the JSON contents directly
(e.g. a CCADB API call you could make with an OAuth token), would that
address your concern? It would allow CCADB to still canonically record the
change history and contents, facilitating that historic research. It would
also facilitate better compliance tracking - since we know policies like
"could require that any published JSON" don't really mean anything in
practice for a number of CAs, unless the requirements are also monitored
and enforced.


> Regardless, even without statically-pathed, timestamped CRLs, I believe
> that the merits of rolling time-based shards are sufficient to be a strong
> argument in favor of dynamic JSON documents.
>

Right, I don't think there's any fundamental opposition to that. I'm very
much in favor of time-sharded CRLs over hash-sharded CRLs, for exactly the
reasons you highlight. I think the question was with respect to the
frequency of change of those documents (i.e. how often you introduce new
shards, and how those shards are represented).

There is one thing you mentioned that's also non-obvious to me, because I
would expect you already have to deal with this exact issue with respect to
OCSP, which is "overwriting files is a dangerous operation prone to many
forms of failure". Could you expand more about what some of those
top-concerns are? I ask, since, say, an OCSP Responder is frequently
implemented as "Spool /ocsp/:issuerDN/:serialNumber", with the CA
overwriting :serialNumber whenever they produce new responses. It sounds
like you're saying that common design pattern may be problematic for y'all,
and I'm curious to learn more.
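
For clarity, the pattern I mean is the overwrite-in-place spool, roughly
like this (a sketch, not any particular responder's code):

    package ocsp

    import (
        "os"
        "path/filepath"
    )

    // SpoolResponse writes each refreshed response over the previous one
    // at a fixed path keyed by issuer and serial, i.e. the "overwrite
    // :serialNumber" pattern described above.
    func SpoolResponse(root, issuerDN, serial string, der []byte) error {
        p := filepath.Join(root, "ocsp", issuerDN, serial)
        if err := os.MkdirAll(filepath.Dir(p), 0755); err != nil {
            return err
        }
        return os.WriteFile(p, der, 0644)
    }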


Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-25 Thread Aaron Gable via dev-security-policy
Sure, happy to provide more details! The fundamental issue here is the
scale at which Let's Encrypt issues, and the automated nature by which
clients interact with Let's Encrypt.

LE currently has 150M certificates active, all (as of March 1st) signed by
the same issuer certificate, R3. In the event of a mass revocation, that
means a CRL with 150M entries in it. At an average of 38 bytes per entry in
a CRL, that means nearly 6GB worth of CRL. Passing around a single 6GB file
isn't good for reliability (it's much better to fail-and-retry downloading
one of a hundred 60MB files than fail-and-retry a single 6GB file), so
sharding seems like an operational necessity.

Even without a LE-initiated mass revocation event, one of our large
integrators (such as a hosting provider with millions of domains) could
decide for any reason to revoke every single certificate we have issued to
them. We need to be resilient to these kinds of events.

Once we've decided that sharding is necessary, the next question is "static
or dynamic sharding?". It's easy to imagine a world in which we usually
have only one or two CRL shards, but dynamically scale that number up to
keep individual CRL sizes small if/when revocation rises sharply. There are
a lot of "interesting" (read: difficult) engineering problems here, and
we've decided not to go the dynamic route, but even if we did it would
obviously require being able to change the list of URLs in the JSON array
on the fly.

For static sharding, we would need to constantly maintain a large set of
small CRLs, such that even in the worst case no individual CRL would become
too large. I see two main approaches: maintaining a fully static set of
shards into which our certificates are bucketed, or maintaining rolling
time-based shards (much like CT shards).

Maintaining a static set of shards has the primary advantage of "working
like CRLs usually work". A given CRL has a scope (e.g. "all certs issued by
R3 whose serial number is equal to 1 mod 500"), it has a nextUpdate, and a
new CRL with the same scope will be re-issued at the same path before that
nextUpdate is reached. However, it makes re-sharding difficult. If Let's
Encrypt's issuance rises enough that we want to have 1000 shards instead of
500, we'll have to re-shard every cert, re-issue every CRL, and update the
list of URLs in the JSON. And if we're updating the list, we should have
standards around how that list is updated and how its history is stored,
and then we'd prefer that those standards allow for rapid updates.

The alternative is to have rolling time-based shards. In this case, every X
hours we would create a new CRL, and every certificate we issue over the
next period would belong to that CRL. Similar to the above, these CRLs have
nice scopes (e.g. "all certs issued by R3 between AA:BB and XX:YY"). When every
certificate in one of these time-based shards has expired, we can simply
stop re-issuing it. This has the advantage of solving the resharding
problem: if we want to make our CRLs smaller, we just increase the
frequency at which we initialize a new one, and 90 days later we've fully
switched over to the new size. It has the disadvantage from your
perspective of requiring us to add a new URL to the JSON array every period
(and we get to drop an old URL from the array every period as well).
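
In code, the bucketing is nothing more than this (a sketch; the window
length is the tunable knob, and shrinking it is exactly the re-sharding
story above):

    package crl

    import "time"

    const shardWindow = 6 * time.Hour // illustrative; a new shard every 6 hours

    // TimeShard buckets a certificate into the CRL covering the window in
    // which it was issued, giving each shard the scope "all certs issued
    // by R3 between AA:BB and XX:YY".
    func TimeShard(issued time.Time) int64 {
        return issued.Unix() / int64(shardWindow/time.Second)
    }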

So why would we want to put each CRL re-issuance at a new path, and update
our JSON even more frequently? Because we have reason to believe that
various root programs will soon seek CRL re-issuance on the order of every
6 hours, not every 7 days as currently required; we will have many shards;
and overwriting files is a dangerous operation prone to many forms of
failure. Our current plan is to surface CRLs at paths like
`/crls/:issuerID/:shardID/:thisUpdate.der`, so that we never have to
overwrite a file. Similarly, our JSON document can always be written to a
new file, and the path in CCADB can point to a simple handler which always
serves the most recent file. Additionally, this means that anyone in
possession of one of our JSON documents can fetch all the CRLs listed in it
and get a *consistent* view of our revocation information as of that time.
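
A sketch of the publish step under that layout (illustrative, not our
actual code):

    package crl

    import (
        "fmt"
        "os"
        "path/filepath"
        "time"
    )

    // PublishCRL writes each re-issuance to a fresh, timestamped path
    // under crls/:issuerID/:shardID/:thisUpdate.der, so no file is ever
    // overwritten and old CRLs stay fetchable until their certs expire.
    func PublishCRL(root, issuerID string, shardID int64, thisUpdate time.Time, der []byte) (string, error) {
        p := filepath.Join(root, "crls", issuerID, fmt.Sprint(shardID),
            thisUpdate.UTC().Format("20060102T150405Z")+".der")
        if err := os.MkdirAll(filepath.Dir(p), 0755); err != nil {
            return "", err
        }
        return p, os.WriteFile(p, der, 0644)
    }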

I believe that there is an argument to be made here that this plan
increases the auditability of the CRLs, rather than decreases it. Root
programs could require that any published JSON document be valid for a
certain period of time, and that all CRLs within that document remain
available for that period as well. Or even that historical versions of CRLs
remain available until every certificate they cover has expired (which is
what we intend to do anyway). Researchers can crawl our history of CRLs and
examine revocation events in more detail than previously available.

Regardless, even without statically-pathed, timestamped CRLs, I believe
that the merits of rolling time-based shards are sufficient to be a strong
argument in favor of dynamic JSON documents.

I hope this helps and that I 

Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-25 Thread Ryan Sleevi via dev-security-policy
On Thu, Feb 25, 2021 at 12:33 PM Aaron Gable via dev-security-policy <
dev-security-policy@lists.mozilla.org> wrote:

> Obviously this plan may have changed due to other off-list conversations,
> but I would like to express a strong preference for the original plan. At
> the scale at which Let's Encrypt issues, it is likely that our JSON array
> will contain on the order of 1000 CRL URLs, and will add a new one (and age
> out an entirely-expired one) every 6 hours or so. I am not aware of any
> existing automation which updates CCADB at that frequency.
>
> Further, from a resiliency perspective, we would prefer that the CRLs we
> generate live at fully static paths. Rather than overwriting CRLs with new
> versions when they are re-issued prior to their nextUpdate time, we would
> leave the old (soon-to-be-expired) CRL in place, offer its replacement at
> an adjacent path, and update the JSON to point at the replacement. This
> process would have us updating the JSON array on the order of minutes, not
> hours.


This seems like a very inefficient design choice, and runs contrary to how
CRLs are deployed by, well, literally anyone using CRLs as specified, since
the URL is fixed within the issued certificate.

Could you share more about the “why” of this design? Both for the choice
to use sharded CRLs (since that is the essence of the first concern), and
the motivation to use fixed URLs.

> We believe that the earlier "URL to a JSON array..." approach makes room for
> significantly simpler automation on behalf of CAs without significant
> loss of auditability. I believe it may be helpful for the CCADB field
> description (or any upcoming portion of the MRSP which references it) to
> include specific requirements around the cache lifetime of the JSON
> document and the CRLs referenced within it.


Indirectly, you’ve highlighted exactly why the approach you propose loses
auditability. Using the URL-based approach puts the onus on the consumer to
try and detect and record changes, introduces greater operational risks
that evade detection (e.g. stale caches on the CAs side for the content of
that URL), and encourages or enables designs that put greater burden on
consumers.

I don’t think this is suggested because of malice, but I do think it makes
it significantly easier for malice to go undetected, for accurate historic
information to be hidden or made too complex to maintain.

This is already a known and recently studied problem with CRLs [1].
Unquestionably, you are right to highlight and emphasize that this
constrains and limits how CAs perform certain operations. You highlight it
as a potential bug, but I’d personally been thinking about it as a
potential feature. To figure out the disconnect, I’m hoping you could
further expand on the “why” of the design factors for your proposed design.

Additionally, it’d be useful to understand how you would suggest CCADB
consumers maintain an accurate, CA-attested log of changes. Understanding
such changes is an essential part of root program maintenance, and it does
seem reasonable to expect CAs to need to adjust to provide that, rather
than give up on the goal.

[1]
https://arxiv.org/abs/2102.04288



Re: CCADB Proposal: Add field called JSON Array of Partitioned CRLs Issued By This CA

2021-02-25 Thread Aaron Gable via dev-security-policy
Hi Kathleen,

It was my impression from earlier discussions that
the plan was for the new CCADB field to contain a URL which points to a
document containing only a JSON array of partitioned CRL URLs, rather than
the new CCADB field containing such an array directly.

Obviously this plan may have changed due to other off-list conversations,
but I would like to express a strong preference for the original plan. At
the scale at which Let's Encrypt issues, it is likely that our JSON array
will contain on the order of 1000 CRL URLs, and will add a new one (and age
out an entirely-expired one) every 6 hours or so. I am not aware of any
existing automation which updates CCADB at that frequency.

Further, from a resiliency perspective, we would prefer that the CRLs we
generate live at fully static paths. Rather than overwriting CRLs with new
versions when they are re-issued prior to their nextUpdate time, we would
leave the old (soon-to-be-expired) CRL in place, offer its replacement at
an adjacent path, and update the JSON to point at the replacement. This
process would have us updating the JSON array on the order of minutes, not
hours.

We believe that the earlier "URL to a JSON array..." approach makes room for
significantly simpler automation on behalf of CAs without significant
loss of auditability. I believe it may be helpful for the CCADB field
description (or any upcoming portion of the MRSP which references it) to
include specific requirements around the cache lifetime of the JSON
document and the CRLs referenced within it.

Thanks,
Aaron

On Wed, Feb 24, 2021 at 12:36 PM Kathleen Wilson via dev-security-policy <
dev-security-policy@lists.mozilla.org> wrote:

> All,
>
> As previously discussed, there is a section on root and intermediate
> certificate pages in the CCADB called ‘Pertaining to Certificates Issued
> by this CA’, and it currently has one field called 'Full CRL Issued By
> This CA'.
>
> Proposal: Add field called 'JSON Array of Partitioned CRLs Issued By
> This CA'
>
> Description of this proposed field:
> When there is no full CRL for certificates issued by this CA, provide a
> JSON array whose elements are URLs of partitioned, DER-encoded CRLs that
> when combined are the equivalent of a full CRL. The JSON array may omit
> obsolete partitioned CRLs whose scopes only include expired certificates.
>
> Example:
>
> [
>"http://cdn.example/crl-1.crl;,
>"http://cdn.example/crl-2.crl;
> ]
>
>
>
> Additionally, I propose adding a new section to
> https://www.ccadb.org/cas/fields called “Revocation Information”.
>
> The proposed draft for this new section is here:
>
> https://docs.google.com/document/d/1uVK0h4q5BSrFv6e86f2SwR5m2o9Kl1km74vG4HnkABw/edit?usp=sharing
>
>
> I will appreciate your input on this proposal.
>
> Thanks,
> Kathleen
>