Re: Google Trust Services - CRL handling of expired certificates not fully compliant with RFC 5280 Section 3.3

2019-09-13 Thread Wayne Thayer via dev-security-policy
Thank you for the report and follow-up Andy. I created
https://bugzilla.mozilla.org/show_bug.cgi?id=1581183 to track this issue.

- Wayne

Re: Google Trust Services - CRL handling of expired certificates not fully compliant with RFC 5280 Section 3.3

2019-09-13 Thread Andy Warner via dev-security-policy
A quick follow-up to close this out.

The push to fully address the issue was completed globally shortly before 16:00 
UTC on 2019-09-02.

After additional review, we're confident the only certificates affected were 
these two:
https://crt.sh/?id=760396354
https://crt.sh/?id=759833603

Google Trust Services considers this matter fully addressed. We will of course 
continue our ongoing internal review program, but no other work or information 
is outstanding at this point.

--
Andy Warner
Google Trust Services

On Friday, August 30, 2019 at 2:39:51 PM UTC-4, Andy Warner wrote:
> This is an initial report. We expect to provide additional details and the 
> completion timeline after a bit more verification and full deployment of 
> in-flight mitigations. We are posting the most complete information we 
> currently have to comply with Mozilla reporting timelines and will follow up 
> with additional details soon.
> 
> 1. How your CA first became aware of the problem and the time and date.
> 
> While performing an internal review and assessment of the CRL generation 
> system for Google Trust Services' GTS CA 1O1 on August 16, 2019, it was 
> discovered that the CRL generation service did not include CRL entries of 
> expired certificates. The periodic job only considered certificates with 
> valid lifetimes. This does not conform to RFC 5280 Section 3.3 which states 
> that “An entry MUST NOT be removed from the CRL until it appears on one 
> regularly scheduled CRL issued beyond the revoked certificate's validity 
> period.”  We expect that few, if any, clients have been impacted.  For a 
> client to be impacted, it would have to have a clock skewed to a time before 
> the not-after field of the certificate and have a CRL, published after 
> expiration, that dropped the revoked certificate.
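
The difference between the behaviour described above and the RFC 5280 Section 3.3 rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GTS implementation: the RevokedCert record, its fields, and both function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RevokedCert:
    # Hypothetical record; field names are assumptions for this sketch.
    serial: int
    not_after: datetime   # end of the certificate's validity period
    revoked_at: datetime
    # thisUpdate of the first scheduled CRL issued after not_after that
    # listed this entry; None if no such CRL has been issued yet.
    first_listed_after_expiry: Optional[datetime] = None


def buggy_entries(revoked, this_update):
    """Behaviour described in the report: only unexpired certificates are listed."""
    return [c for c in revoked if c.not_after > this_update]


def compliant_entries(revoked, this_update):
    """Keep an entry until it has appeared on one CRL issued beyond not_after."""
    return [
        c for c in revoked
        if c.not_after > this_update or c.first_listed_after_expiry is None
    ]


def day(d):
    return datetime(2019, 8, d, tzinfo=timezone.utc)


cert = RevokedCert(serial=1, not_after=day(10), revoked_at=day(1))
# First CRL after expiry: the buggy filter drops the entry, the compliant one keeps it.
print(buggy_entries([cert], day(11)), compliant_entries([cert], day(11)))
```

Once the entry has appeared on a CRL issued after not_after (first_listed_after_expiry is set), compliant_entries is free to drop it on later updates.
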
> 
> 
> 2. A timeline of the actions your CA took in response. A timeline is a 
> date-and-time-stamped sequence of all relevant events. This may include 
> events before the incident was reported, such as when a particular 
> requirement became applicable, or a document changed, or a bug was 
> introduced, or an audit was done.
> 
> August 16, 2019 15:00 UTC - Reviewer realizes that the CRL will not list 
> revoked certificates for one update past their expiration
> August 16, 2019 16:00 UTC - Reviewer checks for other issues & talks to peers 
> to confirm problem
> August 16, 2019 17:00 UTC - Bug is filed to fix the issue with a proposed 
> design fix
> August 16, 2019 23:30 UTC - Fix is sent for review
> August 20, 2019 16:00 UTC - Remediation work is discussed & assigned
> August 20, 2019 18:00 UTC - Query to inspect revoked certificates is created 
> and sent to be run by production team for initial analysis.
> August 21, 2019 10:40 UTC - Production team runs query and returns result
> August 21, 2019 15:00 UTC - Reviewer analyzes data
> August 21, 2019 20:30 UTC - Reviewer asks for a follow up query to ascertain 
> if any certificates did not make it onto the CRL 
> August 22, 2019 07:00 UTC - Initial attempt at updating test systems with fix.
> August 22, 2019 09:00 UTC - Updating of test systems aborted due to 
> (unrelated) issues.
> August 22, 2019 07:00 UTC - Production team runs query for CRLs that may have 
> missed a certificate
> August 22, 2019 15:00 UTC - Reviewer ascertains that certificates under 
> question were on a CRL
> August 26, 2019 11:00 UTC - Second attempt at updating test systems with fix.
> August 26, 2019 13:00 UTC - Test systems updated, confirmed integrity of 
> fixed software.
> August 27, 2019 09:00 UTC - Confirmed fix is effective on test systems.
> August 27, 2019 10:00 UTC - present: Ongoing staged deployment to production 
> systems. Should complete fully by September 3, 2019 17:00 UTC (slightly 
> extended window due to push policies around holiday weekends). The rollout 
> was staged in accordance with Google's standard rollout procedures.
> 
> 
> 3. Whether your CA has stopped, or has not yet stopped, issuing certificates 
> with the problem. 
> 
> The affected CA software has been patched.  It now keeps revoked 
> certificates in the CRL for 7 days after their expiration to ensure they 
> appear in at least one regularly issued CRL update.  Automated testing was 
> added as part of the same patch to check that revoked certificates are kept 
> in the CRL.  The patch was developed, tested, reviewed and landed within the 
> codebase by August 19, 2019.  The CRL entry removal bug has been fully 
> remediated.
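
A rough sketch of how a fixed retention window like the one described above interacts with the CRL schedule. The 7-day figure comes from the report; the function names, the daily and 8-day intervals, and the dates are assumptions made up for illustration, not details of the GTS system.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)  # retention window stated in the report


def keep_on_crl(not_after, this_update, retention=RETENTION):
    """True if a revoked certificate is still listed on a CRL dated this_update."""
    return this_update < not_after + retention


def crls_listing_after_expiry(not_after, first_update, interval, count,
                              retention=RETENTION):
    """thisUpdate times of scheduled CRLs issued after not_after that list the entry."""
    updates = [first_update + i * interval for i in range(count)]
    return [u for u in updates
            if u > not_after and keep_on_crl(not_after, u, retention)]


not_after = datetime(2019, 8, 10, 12, 0, tzinfo=timezone.utc)

# Assumed daily schedule: several post-expiry CRLs still carry the entry.
daily = crls_listing_after_expiry(
    not_after, datetime(2019, 8, 1, tzinfo=timezone.utc), timedelta(days=1), 30)

# Assumed 8-day schedule: the window can fall between two updates.
sparse = crls_listing_after_expiry(
    not_after, datetime(2019, 8, 2, tzinfo=timezone.utc), timedelta(days=8), 4)

print(len(daily), len(sparse))
```

The point of the sketch is that a fixed window only guarantees an appearance on a post-expiry CRL when the regular issuance interval is shorter than the window, which is also the kind of property an automated test could assert.
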
> 
> 
> 4. A summary of the problematic certificates. For each problem: number of 
> certs, and the date the first and last certs with that problem were issued.
> 
> Investigation began on August 20, 2019 to discover the potential impact of 
> the logic bug. The CRL generation had contained the bug since its inception, 
> affecting all issuance under GTS 1O1 since March 2018. There were 200,263 
> revoked certificates during that time window. Almost all certificates were 
> for internal monitoring speci