I'm extremely supportive of this effort, with multiple hats on.

I'd have loved it if this weren't restricted to users with 3P cookies
enabled, but one can imagine abuse where a pervasive resource *pattern* is
matched with unique hashes that are never deployed in the wild, and each
such URL is used as a cross-origin bit of entropy.
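
To make the abuse concrete, here is a hypothetical sketch (the pattern, URLs
and helpers are made up, not anything on the candidate list): if a listed
pattern matched URLs that no real deployment serves, a colluding embedder
could treat each such URL as one cache-probe bit.

  # Hypothetical abuse sketch: suppose a single-keyed pattern like
  # https://cdn.example/lib/*-common.js matches URLs nobody actually ships.
  URLS = [f"https://cdn.example/lib/{i:04x}-common.js" for i in range(32)]

  def write_id(user_id: int, fetch) -> None:
      # On site A: fetch (and therefore cache) only the URLs whose bit is set.
      for bit, url in enumerate(URLS):
          if (user_id >> bit) & 1:
              fetch(url)

  def read_id(probe_is_cache_hit) -> int:
      # On site B: a hit in the shared cache reveals the bit, rebuilding the
      # 32-bit cross-site identifier.
      return sum(1 << bit for bit, url in enumerate(URLS)
                 if probe_is_cache_hit(url))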

On Sat, Nov 8, 2025 at 7:04 AM Patrick Meenan <[email protected]> wrote:

> The list construction should already be completely objective. I changed
> the manual origin-owner validation to trust and require "cache-control:
> public" instead. The rest of the criteria
> <https://docs.google.com/document/d/1xaoF9iSOojrlPrHZaKIJMK4iRZKA3AD6pQvbSy4ueUQ/edit?tab=t.0>
> should be well-defined and objective. I'm not sure if they can be fully
> automated yet (though that might just be my pre-AI thinking).
>
> The main need for humans in the loop right now is to create the patterns
> so that each one represents a "single" resource that is stable over time
> across URL changes (version/hash), and to distinguish those stable files
> from random hash bundles that aren't stable from release to release. That's
> fairly easy for a human to do (and get right).
>

This is something that origins using compression dictionaries already do
themselves - they define the "match" pattern that covers the URL's
semantics. Can we somehow use that for automation where it exists?
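
For reference, a dictionary response already declares the URL scope it stands
in for via its response headers, roughly like this (hypothetical values):

  Use-As-Dictionary: match="/lib/*-common.js", match-dest=("script")
  Cache-Control: public, max-age=31536000

That match pattern is essentially the "this is one logical resource across
versions" signal the curation step needs, so reusing it where it already
exists seems plausible.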


>
>
>
> On Fri, Nov 7, 2025 at 4:47 PM Rick Byers <[email protected]> wrote:
>
>> Thanks Pat. I am personally a big fan of things which increase publisher
>> ad revenue broadly across the web without hurting (and ideally while
>> improving) the user experience, and this seems likely to do exactly that. In
>> particular I recall all the debate around stale-while-revalidate
>> <https://web.dev/articles/stale-while-revalidate> and am proud that we
>> pushed
>> <https://groups.google.com/a/chromium.org/g/blink-dev/c/rspPrQHfFkI/m/c5j3xJQRDAAJ?e=48417069>
>> through it with urgency and confirmed it indeed increased publisher ad
>> revenue across the web
>> <https://web.dev/case-studies/ads-case-study-stale-while-revalidate>.
>>
>> Reading the Mozilla feedback carefully, the point that resonates most with
>> me is the risk of "gatekeeping" and the potential to mitigate that by
>> establishing objective rules for inclusion. Is it plausible to imagine a
>> version of this where the list construction would be entirely objective?
>> What would the tradeoffs be?
>>
>> Thanks,
>>    Rick
>>
>>
>>
>>
>> On Thu, Oct 30, 2025 at 3:50 PM Patrick Meenan <[email protected]>
>> wrote:
>>
>>> Reaching out to site owners was mostly for a sanity check that the
>>> resource is not expected to be partitioned for some reason (even though
>>> the payloads are known to be identical). If it helps, we can replace the
>>> reach-out step with a requirement that the responses be "Cache-Control:
>>> public" (and hard-enforce it in the browser by not writing the resource to
>>> cache if it isn't). That is an explicit indicator that the resources are
>>> cacheable in shared upstream caches.
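>>>
>>> A minimal sketch of what that hard enforcement could look like (the names,
>>> structure and pattern are illustrative, not the actual Chromium code):
>>>
>>>   from fnmatch import fnmatch
>>>
>>>   # Illustrative stand-in for the shipped pervasive-resource pattern list.
>>>   PERVASIVE_PATTERNS = ["https://cdn.example/lib/*-common.js"]
>>>
>>>   def may_use_shared_cache(url: str, response_headers: dict) -> bool:
>>>       # Only single-key responses that match a curated pattern AND
>>>       # explicitly opt into shared caching via "Cache-Control: public".
>>>       cache_control = response_headers.get("cache-control", "").lower()
>>>       directives = {d.strip() for d in cache_control.split(",")}
>>>       return (any(fnmatch(url, pattern) for pattern in PERVASIVE_PATTERNS)
>>>               and "public" in directives)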
>>>
>>> I removed the 2 items from the design doc that were specifically
>>> targeted at direct fingerprinting since that's moot with the 3PC link (as
>>> well as the fingerprinting bits from the validation with resource owners).
>>>
>>> On the site-preferencing concern, it doesn't actually preference large
>>> sites but it does preference currently-popular third-party resources (most
>>> of which are provided by large corporations). The benefit is spread across
>>> all of the sites that they are embedded in (funnily enough, most large
>>> sites won't benefit because they don't tend to use third-parties).
>>>
>>> Determining the common resources at a local level exposes the same XS
>>> Leak issues as allowing all resources (e.g. your local map tiles will show
>>> up in multiple cache partitions because those sites all reference your
>>> current location, but since the tiles are not globally common they can be
>>> used to identify your location). Instead of using the HTTP Archive to
>>> collect the candidates, we could presumably build a centralized list based
>>> on aggregated common resources that are seen across cache partitions by
>>> each user, but that feels like an awful lot of complexity for a very small
>>> number of resulting resources.
>>>
>>> On the test results, sorry, I thought I had included the experiment
>>> results in the I2S but it looks like I may not have.
>>>
>>> The test was specifically just with the patterns for the Google ads
>>> scripts because we aren't expecting this feature to impact the vitals for
>>> the main page/content since most of the pervasive resources are third-party
>>> content that is usually async already and not critical-path. It's possible
>>> some video or map embeds might trigger LCP in some cases but that's the
>>> exception more than the norm. This is more geared to making those
>>> supporting things work better while maintaining the user experience. Ads
>>> has the kind of instrumentation that we'd need to be able to get visibility
>>> into the success (or failure) of that assumption and to be able to measure
>>> small changes.
>>>
>>> The results were stat-sig positive but relatively small. The ad iframes
>>> displayed their content slightly faster and transmitted fewer bytes for
>>> each frame (very low single digit percentages).
>>>
>>> The guardrail metrics (including vitals) were all neutral, which is what
>>> we were hoping for (improvement without the cost of increased contention).
>>>
>>> If you'd feel more comfortable with gathering more data, I wouldn't be
>>> opposed to running the full list at 1% to check the guardrail metrics again
>>> before fully launching. We won't necessarily expect to see positive
>>> movement to justify a launch since the resources are still async but we can
>>> validate that assumption with the full list at least (if that is the only
>>> remaining concern).
>>>
>>>
>>> On Thu, Oct 30, 2025 at 5:28 PM Rick Byers <[email protected]> wrote:
>>>
>>>> Thanks Erik and Patrick, of course that makes sense. Sorry for the
>>>> naive question. My naive reading of the design doc suggested to me that a
>>>> lot of the privacy mitigations were about preventing the cross-site
>>>> tracking risk. Could the design be simplified by removing some of those
>>>> mitigations? For example, the section about reaching out to the resource
>>>> owners, to what extent is that really necessary when all we're trying to
>>>> mitigate is XS leaks? Don't the popularity properties alone mitigate that
>>>> sufficiently?
>>>>
>>>> What can you share about the magnitude of the performance benefit in
>>>> practice in your experiments? In particular for LCP, since we know
>>>> <https://wpostats.com/> that correlates well with user engagement (and
>>>> inversely with abandonment) and so presumably with user value.
>>>>
>>>> The concern about not wanting to further advantage more popular sites
>>>> over less popular ones resonates with me. Part of that argument seems to
>>>> apply broadly to the idea of any LRU cache (especially one with a reuse
>>>> bias which I believe ours has
>>>> <https://www.chromium.org/developers/design-documents/network-stack/disk-cache/#eviction>?).
>>>> But perhaps an important distinction here is that the benefits are
>>>> determined globally vs. on a user-by-user basis? But I think any solution
>>>> that worked on a user-by-user basis would have the XS leak problem, right?
>>>> Perhaps it's worth reflecting on our stance on using crowd-sourced data to
>>>> try to improve the experience for all users while still being fair to sites
>>>> broadly. In general I think this is something Chromium is much more open to
>>>> (where it brings significant user benefit) than other engines. For example,
>>>> our Media Engagement Index <https://developer.chrome.com/blog/autoplay>
>>>> system has some similar properties in terms of using aggregate user
>>>> behaviour to help decide which sites have the power to play audio on page
>>>> load and which don't. I was personally uncertain at the time if the
>>>> complexity would prove to be worth the benefit, but now I'm quite convinced
>>>> it is. Playing audio on load is just something users and developers want in
>>>> a few cases, but not most cases. I wonder if perhaps cross-site caching is
>>>> similar?
>>>>
>>>> Rick
>>>>
>>>> On Thu, Oct 30, 2025 at 9:09 AM Matt Menke <[email protected]> wrote:
>>>>
>>>>> Note that even with Vary: Origin, we still have to load the HTTP
>>>>> request headers from the disk cache to apply the vary header, which leaks
>>>>> timing information, so "Vary: Origin" is not a sufficient security
>>>>> mechanism to prevent that sort of cross-site attack.
>>>>>
>>>>> On Wednesday, October 29, 2025 at 5:08:42 PM UTC-4 Erik Anderson wrote:
>>>>>
>>>>>> My understanding was that there was believed to be a meaningful
>>>>>> security benefit to partitioning the cache. That’s because it would
>>>>>> limit a party’s ability to infer that you’ve visited some other site by
>>>>>> measuring a side effect tied to how quickly a resource loads. That
>>>>>> observation could potentially be made even if that specific adversary
>>>>>> doesn’t have any of their own content loaded on the other site.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Of course, if there is an entity with a resource loaded across both
>>>>>> sites with a 3p cookie *and* they’re willing to share that
>>>>>> info/collude, there’s not much benefit. And even when partitioned, if 3p
>>>>>> cookies are enabled, there are potentially measurable side effects that
>>>>>> differ based on if the resource request had some specific state in a 3p
>>>>>> cookie.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Does that incremental security benefit of partitioning the cache
>>>>>> justify the performance costs when 3p cookies are still enabled? I’m not
>>>>>> sure.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Even if partitioning were eliminated, a site could protect itself
>>>>>> a bit by specifying Vary: Origin, but that probably doesn’t
>>>>>> sufficiently cover iframe scenarios (nor would I expect most sites to
>>>>>> get it right).
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Rick Byers <[email protected]>
>>>>>> *Sent:* Wednesday, October 29, 2025 11:56 AM
>>>>>> *To:* Patrick Meenan <[email protected]>
>>>>>> *Cc:* Mike Taylor <[email protected]>; blink-dev <
>>>>>> [email protected]>
>>>>>> *Subject:* [EXTERNAL] Re: [blink-dev] Intent to ship: Cache sharing
>>>>>> for extremely-pervasive resources
>>>>>>
>>>>>>
>>>>>>
>>>>>> If this is enabled only when 3PCs are enabled, then what are the
>>>>>> tradeoffs of going through all this complexity and governance vs. just
>>>>>> broadly coupling HTTP cache keying behavior to 3PC status in some way?
>>>>>> What can a tracker credibly do with a single-keyed HTTP cache that they
>>>>>> cannot do with 3PCs? Are there also concerns about accidental cross-site
>>>>>> resource sharing which could be mitigated more simply by other means,
>>>>>> e.g. by scoping to just ETag-based caching?
>>>>>>
>>>>>>
>>>>>>
>>>>>> I remember the controversy and some real evidence of harm to users
>>>>>> and businesses in 2020 when we partitioned the HTTP cache, but I was
>>>>>> convinced that we had to accept that harm in order to credibly achieve
>>>>>> 3PCD. At the time I was personally a fan of a proposal like this (even for
>>>>>> users without 3PCs) in order to mitigate the harm. But now it seems to me
>>>>>> that if we're going to start talking about poking holes in that decision,
>>>>>> perhaps we should be doing a larger review of the options in that space
>>>>>> with the knowledge that most Chrome users are likely to continue to
>>>>>> have 3PCs enabled. WDYT?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>    Rick
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 27, 2025 at 10:27 AM Patrick Meenan <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> I don't believe the security/privacy protections actually rely on the
>>>>>> assertions (and it's unlikely those would be public). It's more for
>>>>>> awareness and to make sure they don't accidentally break something with
>>>>>> their app if they were relying on the responses being partitioned by 
>>>>>> site.
>>>>>>
>>>>>>
>>>>>>
>>>>>> As far as query params go, the browser code already only filters for
>>>>>> requests with no query params so any that do rely on query params won't
>>>>>> get included anyway.
>>>>>>
>>>>>>
>>>>>>
>>>>>> The same goes for cookies. Since the feature is only enabled when
>>>>>> third-party cookies are enabled, adding cookies to these responses or
>>>>>> putting unique content in them won't actually pierce any new boundaries,
>>>>>> but it goes against the intent of only using it for public/static
>>>>>> resources and they'd lose the benefit of the shared cache when it gets
>>>>>> updated. The same goes for the fingerprinting risks if the pattern were
>>>>>> abused.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 27, 2025 at 9:39 AM Mike Taylor <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> On 10/22/25 5:48 p.m., Patrick Meenan wrote:
>>>>>>
>>>>>> The candidate list goes down to 20k occurrences in order to catch
>>>>>> resources that were updated mid-crawl and may have multiple entries with
>>>>>> different hashes that add up to 100k+ occurrences. In the candidate list,
>>>>>> without any filtering, the 100k cutoff is around 600 entries; I'd
>>>>>> estimate that well under 25% of the candidates make it through the
>>>>>> filtering for a stable pattern, correct resource type, and reliable
>>>>>> pattern. The first release will likely be 100-200 and I don't expect it
>>>>>> will ever grow above 500.
>>>>>>
>>>>>> Thanks - I see the living document has been updated to mention 500 as
>>>>>> a ceiling.
>>>>>>
>>>>>>
>>>>>>
>>>>>> As far as cadence goes, I expect there will be a lot of activity for
>>>>>> the next few releases as individual patterns are coordinated with the
>>>>>> origin owners but then it will settle down to a much more bursty pattern
>>>>>> of updates every few Chrome releases (likely linked with an origin changing
>>>>>> their application and adding more/different resources). And yes, it is
>>>>>> manual.
>>>>>>
>>>>>> As far as the process goes, resource owners need to actively assert
>>>>>> that their resource is appropriate for the single-keyed cache and that
>>>>>> they would like it included (usually in response to active outreach from
>>>>>> us but we have the external-facing list for owner-initiated contact as
>>>>>> well). The design doc has the documentation for what it means to be
>>>>>> appropriate (and the doc will be moved to a readme page in the repository
>>>>>> next to the actual list so it's not a hard-to-find Google doc):
>>>>>>
>>>>>> Will there be any kind of public record of this assertion? What
>>>>>> happens if a site starts using query params or sending cookies? Does the
>>>>>> person in charge of manual list curation discover that in the next 
>>>>>> release?
>>>>>> Does that require a new release (I don't know if this lives in component
>>>>>> updater, or in the binary itself)?
>>>>>>
>>>>>>
>>>>>>
>>>>>> *5. Require resource owner opt-in*
>>>>>> For each URL to be included, reach out to the team/company
>>>>>> responsible for the resource to validate the URL pattern and get
>>>>>> assurances that the pattern will always serve the same content to all
>>>>>> sites and not be abused for tracking (by using unique URLs within the
>>>>>> pattern mask as a bit-mask for fingerprinting). They will also need to
>>>>>> validate that the URLs covered by the pattern will not rely on being able
>>>>>> to set cookies over HTTP using a Set-Cookie HTTP response header because
>>>>>> they will not be re-applied across cache boundaries (the set-cookie is
>>>>>> not cached with the resource).
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 22, 2025 at 5:31 PM Mike Taylor <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> On 10/18/25 8:34 a.m., Patrick Meenan wrote:
>>>>>>
>>>>>> Sorry, I missed a step in making the candidate resource list public.
>>>>>> I have moved it to my chromium account and made it public here
>>>>>> <https://docs.google.com/spreadsheets/d/1TgWhdeqKbGm6hLM9WqnnXLn-iiO4Y9HTjDXjVO2aBqI/edit?usp=sharing>.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Not everything in that list meets all of the criteria - it's just the
>>>>>> first step in the manual curation (same URL served the same content
>>>>>> across >20k sites in the HTTP Archive dataset).
>>>>>>
>>>>>>
>>>>>>
>>>>>> The manual steps from there for meeting the criteria are basically (a
>>>>>> rough automation sketch follows the list):
>>>>>>
>>>>>>
>>>>>>
>>>>>> - Cull the list for scripts, stylesheets and compression dictionaries.
>>>>>>
>>>>>> - Remove any URLs that use query parameters.
>>>>>>
>>>>>> - Exclude any responses that set cookies.
>>>>>>
>>>>>> - Identify URLs that are not manually versioned by site embedders
>>>>>> (i.e. the embedded resource cannot get stale). These are either in-place
>>>>>> updated resources or automatically versioned resources.
>>>>>>
>>>>>> - Only include URLs that can reliably target a single resource by
>>>>>> pattern (i.e. ..../<hash>-common.js but not ..../<hash>.js)
>>>>>>
>>>>>> - Get confirmation from the resource owner that the given URL Pattern
>>>>>> is and will continue to be appropriate for the single-keyed cache
>>>>>>
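>>>>>> A rough sketch of how the first few culling steps could be automated over
>>>>>> the candidate export (the field names and helper are assumptions about the
>>>>>> export format, not the real HTTP Archive schema; the stable-pattern and
>>>>>> owner-confirmation steps still need a human):
>>>>>>
>>>>>>   from urllib.parse import urlparse
>>>>>>
>>>>>>   ALLOWED_TYPES = {"script", "stylesheet", "dictionary"}
>>>>>>
>>>>>>   def keep(row: dict) -> bool:
>>>>>>       url = urlparse(row["url"])
>>>>>>       headers = {name.lower() for name in row["response_headers"]}
>>>>>>       return (row["resource_type"] in ALLOWED_TYPES  # scripts/css/dictionaries
>>>>>>               and not url.query                      # no query parameters
>>>>>>               and "set-cookie" not in headers)       # no cookie-setting responses
>>>>>>
>>>>>>   def cull(rows: list[dict]) -> list[dict]:
>>>>>>       return [row for row in rows if keep(row)]
>>>>>>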
>>>>>> A few questions on list curation:
>>>>>>
>>>>>> Can you clarify how big the list will be? The privacy review at
>>>>>> https://chromestatus.com/feature/5202380930678784?gate=5174931459145728
>>>>>> mentions ~500, while the design doc mentions 1000. I see the candidate
>>>>>> resource list starts at ~5000, then presumably manual curation begins to
>>>>>> get to one of those numbers.
>>>>>>
>>>>>> What is the expected list curation/update cadence? Is it actually
>>>>>> manual?
>>>>>>
>>>>>> Is there any recourse process for owners of resources that don't want
>>>>>> to be included? Do we have documentation on what it means to be
>>>>>> appropriate for the single-keyed cache?
>>>>>>
>>>>>> thanks,
>>>>>> Mike
>>>>>>
