Ah yes, I recall we have actually observed this in the past, but it fell off the task queue. It was problematic in particular for asm.js -- there was a patch to fix it just for that case, but it never ended up landing ( https://chromium-review.googlesource.com/c/v8/v8/+/2508404).
I think something like what you're describing is doable; we could in fact make the isolate script cache entries always be Scripts, they already contain a WeakFixedArray of all the SFIs in the Script so it'd "just" be a matter of clearing out Scripts from the cache whose SFI arrays are empty. The tricky part will be, as you say, plumbing this into streaming and deserialization -- I think plumbing into the existing streaming implementation is off the table (since that is only executed with a new source anyway), and the deserialization would need to become a bit more aware of what it contains, since right now the serialized cache is just a "dumb" dump of the object graph with no real way of hooking in existing objects. I don't think we have the resources to prioritise a project like this at the moment, particularly since our isolate cache hit rate is still pretty high, but I'd be happy to discuss potential implementation a bit more if you're interested. - Leszek On Mon, Feb 28, 2022 at 5:38 PM '[email protected]' via v8-dev < [email protected]> wrote: > Sure thing, I've opened 12668 - isolate script cache stores streamed > scripts incorrectly - v8 (chromium.org) > <https://bugs.chromium.org/p/v8/issues/detail?id=12668>. However, I think > I was incorrect earlier: I didn't figure out *the* problem, just *a* problem. > The isolate script cache does a very nice job of deduplicating > SharedFunctionInfos when a bunch of tabs are opened quickly, but that's not > a very realistic customer behavior. Customers are more likely to interact > with a tab for a while, causing plenty of allocations and GC cycles, before > opening another tab, so ageing is a problem in this scenario. I completely > understand that we should clear out the root SFI for a script after a while > because it's unlikely to be used again. However, other SFIs for functions > defined within the script may stay alive much longer. I think our main > problem arises from the fact that when generating a new root SFI, V8 > attaches it to a new Script instance, meaning SFIs for nested functions end > up getting duplicated. I have a proposal that I think would help; what do > you think of the following? > > When an entry in the isolate script cache becomes old, instead of deleting > it, we could replace the value with a weak pointer to the Script and update > the source pointer in the key to be weak too. That way, the entry would not > retain any large objects. When the cache needs resizing, we could iterate > the table and delete any entries that point to cleared weak pointers. > > If the cache lookup returns a Script, that means the calling code must > create a new root SFI but attach it to the existing Script. I imagine this > would require some additional plumbing through the various code paths > (synchronous parse, finalizing a streaming parse, and finalizing a code > cache deserialization), but I think that the additional complexity would be > worthwhile for the result of fully reliable deduplication. In places where > the parsing or deserialization code currently creates a new SFI other than > the root, that code would first need to check whether the Script already > has one. If the SFI already exists but has been flushed, the parsing or > deserialization code should attach new bytecode to it. > > This proposal is only concerning the script cache, not the eval or regexp > caches. > > Looking forward to hearing your thoughts. Thanks! > > On Monday, February 28, 2022 at 12:51:12 AM UTC-8 [email protected] > wrote: > >> That's a great catch! Indeed the key should be based on the outer >> language mode, can you file a bug? >> >> On Sat, Feb 26, 2022 at 1:07 AM '[email protected]' via v8-dev < >> [email protected]> wrote: >> >>> I think I've figured out the problem and it should be simple to fix. >>> When streaming, V8 checks the isolate script cache table based on the >>> task->language_mode(), which may be the outer language mode of the >>> compilation task if it hasn't finished yet. However, V8 then inserts the >>> result into the table based on task->language_mode() again, which at that >>> point is the language mode deduced from the script content. If the task was >>> started in sloppy mode but the script says 'use strict', then the script >>> gets stored in the table with a wrong key. I assume the key in both cases >>> should be based on the outer language mode, not the text content of the >>> script (which is already checked to match). >>> >>> On Wednesday, February 23, 2022 at 2:22:20 PM UTC-8 >>> [email protected] wrote: >>> >>>> Excellent, thanks for the useful information! I tried the scenario >>>> again with --js-flags=--no-isolate-script-cache-ageing and still see a >>>> bunch of duplicated BytecodeArrays in a heap snapshot, so something else >>>> must be preventing this script from using the Isolate script cache. I'll >>>> continue investigating; thanks again for pointing me in the right >>>> direction. >>>> >>>> On Wednesday, February 23, 2022 at 12:03:39 AM UTC-8 >>>> [email protected] wrote: >>>> >>>>> On Wed, Feb 23, 2022 at 1:26 AM '[email protected]' via v8-dev < >>>>> [email protected]> wrote: >>>>> >>>>>> Hello all, >>>>>> >>>>>> An idea came up recently which I imagine some of you have probably >>>>>> already considered at some point, so I'd love to hear any thoughts you >>>>>> have, or summaries of past discussions you'd be willing to share. Or if >>>>>> this idea is fundamentally infeasible, I'd love to hear that too. >>>>>> >>>>>> The scenario: we've been investigating a case where many open tabs >>>>>> all embed the same cross-site iframe, and all of those iframes get put >>>>>> into >>>>>> the same process due to the heuristics described in 899838 - Improve >>>>>> process reuse policies - chromium >>>>>> <https://bugs.chromium.org/p/chromium/issues/detail?id=899838>. >>>>>> Ignoring for the moment whether those heuristics are optimal, the result >>>>>> is >>>>>> a single Isolate with many NativeContexts, where each NativeContext loads >>>>>> mostly the same scripts. >>>>>> >>>>>> The idea: could V8 share bytecode and/or Sparkplug code for functions >>>>>> in those scripts? I know that TurboFan code is native context dependent, >>>>>> but as far as I know, both bytecode and Sparkplug code are native context >>>>>> independent. If V8 could avoid generating duplicates, then this scenario >>>>>> would use substantially less memory, plus tabs after the first wouldn't >>>>>> have to wait on tiering up to Sparkplug. >>>>>> >>>>> >>>>> I think this already happens, at least to some extent. >>>>> SharedFunctionInfos are shared between native contexts through the >>>>> compilation cache >>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/compilation-cache-table.h;l=90;drc=a2d4701bea545269ad3f5fe6e111adb65c46b8da>, >>>>> and bytecode/baseline code hangs off the SFI >>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/shared-function-info.h;l=340;drc=a2d4701bea545269ad3f5fe6e111adb65c46b8da>. >>>>> It's not 100% reliable since the cache is aged. Is the cache not hit in >>>>> your scenario? >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> Seth >>>>>> >>>>>> -- >>>>>> -- >>>>>> v8-dev mailing list >>>>>> [email protected] >>>>>> http://groups.google.com/group/v8-dev >>>>>> --- >>>>>> You received this message because you are subscribed to the Google >>>>>> Groups "v8-dev" group. >>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>> send an email to [email protected]. >>>>>> To view this discussion on the web visit >>>>>> https://groups.google.com/d/msgid/v8-dev/bbeb2a47-a26c-4623-81a5-7cad2c0dfec1n%40googlegroups.com >>>>>> <https://groups.google.com/d/msgid/v8-dev/bbeb2a47-a26c-4623-81a5-7cad2c0dfec1n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>> . >>>>>> >>>>> -- >>> -- >>> v8-dev mailing list >>> [email protected] >>> http://groups.google.com/group/v8-dev >>> --- >>> You received this message because you are subscribed to the Google >>> Groups "v8-dev" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to [email protected]. >>> >> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/v8-dev/21626811-c09d-43bf-a5aa-733aa7a6bfcan%40googlegroups.com >>> <https://groups.google.com/d/msgid/v8-dev/21626811-c09d-43bf-a5aa-733aa7a6bfcan%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> >> -- > -- > v8-dev mailing list > [email protected] > http://groups.google.com/group/v8-dev > --- > You received this message because you are subscribed to the Google Groups > "v8-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To view this discussion on the web visit > https://groups.google.com/d/msgid/v8-dev/eca14a0a-1dba-4f2c-968d-bf9cb1dc66cbn%40googlegroups.com > <https://groups.google.com/d/msgid/v8-dev/eca14a0a-1dba-4f2c-968d-bf9cb1dc66cbn%40googlegroups.com?utm_medium=email&utm_source=footer> > . > -- -- v8-dev mailing list [email protected] http://groups.google.com/group/v8-dev --- You received this message because you are subscribed to the Google Groups "v8-dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/v8-dev/CAGRskv8Txhfkc4%3DdWhftdt86HdUHRLg1R0ZN5cr1vpPOrCErEA%40mail.gmail.com.
