The candidate list goes down to 20k occurrences in order to catch resources
that were updated mid-crawl and may have multiple entries with different
hashes that add up to 100k+ occurrences. In the candidate list, without any
filtering, the 100k cutoff falls at around 600 entries. I'd estimate that
well under 25% of those candidates make it through the filtering for stable
pattern, correct resource type and reliable pattern. The first release will
likely be 100-200 entries and I don't expect it will ever grow above 500.
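
To illustrate why the list goes below the 100k line, here is a minimal
sketch of summing the per-hash entries for a URL before applying the cutoff.
The row layout (url, content hash, occurrences), the function names and the
sample data are all made up for illustration; this is not the actual query
used to build the candidate list.

```python
from collections import defaultdict


def total_occurrences(rows):
    """Sum occurrence counts per URL across all content hashes.

    rows: iterable of (url, content_hash, occurrences) tuples.
    This layout is a hypothetical stand-in for the candidate sheet.
    """
    totals = defaultdict(int)
    for url, _content_hash, occurrences in rows:
        totals[url] += occurrences
    return dict(totals)


def over_cutoff(rows, cutoff=100_000):
    """Return the URLs whose combined count reaches the cutoff, sorted."""
    totals = total_occurrences(rows)
    return sorted(url for url, count in totals.items() if count >= cutoff)


# Made-up sample: a.js was updated mid-crawl, so it shows up under two hashes.
rows = [
    ("https://cdn.example/a.js", "hash-1", 60_000),
    ("https://cdn.example/a.js", "hash-2", 45_000),
    ("https://cdn.example/b.js", "hash-3", 30_000),
]

print(over_cutoff(rows))
```

Neither a.js entry reaches 100k on its own, but the two together do, which
is the case the 20k floor is meant to catch.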

As far as cadence goes, I expect there will be a lot of activity for the
next few releases as individual patterns are coordinated with the origin
owners but then it will settle down to a much more bursty pattern of
updates every few Chrome releases (likely linked with an origin changing
their application and adding more/different resources). And yes, it is
manual.

As far as the process goes, resource owners need to actively assert that
their resource is appropriate for the single-keyed cache and that they
would like it included (usually in response to active outreach from us but
we have the external-facing list for owner-initiated contact as well).  The
design doc has the documentation for what it means to be appropriate (and
the doc will be moved to a readme page in the repository next to the actual
list so it's not a hard-to-find Google doc):

5. Require resource owner opt-in
For each URL to be included, reach out to the team/company responsible for
the resource to validate the URL pattern and get assurances that the
pattern will always serve the same content to all sites and not be abused
for tracking (by using unique URLs within the pattern mask as a bit-mask
for fingerprinting). They will also need to validate that the URLs covered
by the pattern will not rely on being able to set cookies over HTTP using a
Set-Cookie HTTP response header because they will not be re-applied across
cache boundaries (the set-cookie is not cached with the resource).
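
The mechanical parts of the curation (resource type, no query parameters,
no Set-Cookie on the response) could be sketched roughly as below. The
function name, the resource type labels and the header representation are
assumptions for illustration only. The remaining checks (versioning
behavior, whether a pattern reliably targets a single resource, and owner
opt-in) stay manual and are not modeled here.

```python
from urllib.parse import urlsplit

# Hypothetical labels for the allowed resource types.
ALLOWED_TYPES = {"script", "stylesheet", "dictionary"}


def passes_mechanical_checks(url, resource_type, response_headers):
    """Return True if a candidate survives the automatable filters.

    response_headers: mapping (or iterable) of header names; only the
    names are inspected, case-insensitively.
    """
    if resource_type not in ALLOWED_TYPES:
        return False
    # Reject URLs that carry a query string.
    if urlsplit(url).query:
        return False
    # Reject responses that try to set cookies.
    if any(name.lower() == "set-cookie" for name in response_headers):
        return False
    return True


print(passes_mechanical_checks(
    "https://cdn.example/lib/abc123-common.js", "script",
    {"Content-Type": "text/javascript"}))
```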



On Wed, Oct 22, 2025 at 5:31 PM Mike Taylor <[email protected]> wrote:

> On 10/18/25 8:34 a.m., Patrick Meenan wrote:
>
> Sorry, I missed a step in making the candidate resource list public. I
> have moved it to my chromium account and made it public here
> <https://docs.google.com/spreadsheets/d/1TgWhdeqKbGm6hLM9WqnnXLn-iiO4Y9HTjDXjVO2aBqI/edit?usp=sharing>.
>
>
> Not everything in that list meets all of the criteria - it's just the
> first step in the manual curation (same URL served the same content across
> more than 20k sites in the HTTP Archive dataset).
>
> The manual steps from there for meeting the criteria are basically:
>
> - Cull the list for scripts, stylesheets and compression dictionaries.
> - Remove any URLs that use query parameters.
> - Exclude any responses that set cookies.
> - Identify URLs that are not manually versioned by site embedders (i.e.
> the embedded resource cannot get stale). These are either resources that
> are updated in place or resources that are automatically versioned.
> - Only include URLs that can reliably target a single resource by pattern
> (i.e. ..../<hash>-common.js but not ..../<hash>.js)
> - Get confirmation from the resource owner that the given URL Pattern is
> and will continue to be appropriate for the single-keyed cache
>
> A few questions on list curation:
>
> Can you clarify how big the list will be? The privacy review at
> https://chromestatus.com/feature/5202380930678784?gate=5174931459145728
> mentions ~500, while the design doc mentions 1000. I see the candidate
> resource list starts at ~5000, then presumably manual curation begins to
> get to one of those numbers.
>
> What is the expected list curation/update cadence? Is it actually manual?
>
> Is there any recourse process for owners of resources that don't want to
> be included? Do we have documentation on what it means to be appropriate for
> the single-keyed cache?
>
> thanks,
> Mike
>
