Thanks, Leszek! This issue is high priority for Microsoft because it affects Office Online, so we'd be happy to provide the implementation with your guidance and code reviews.
> we could in fact make the isolate script cache entries always be Scripts Would that option cause a lower hit rate? The current implementation keeps the root SFI alive until its bytecode gets flushed, meaning it survives five GC cycles after nobody is using it. If we only held a pointer to the Script, then I imagine the root SFI could get collected earlier since the Script points to its SFIs only via a WeakFixedArray. On Tuesday, March 1, 2022 at 6:30:46 AM UTC-8 [email protected] wrote: > Ah yes, I recall we have actually observed this in the past, but it fell > off the task queue. It was problematic in particular for asm.js -- there > was a patch to fix it just for that case, but it never ended up landing ( > https://chromium-review.googlesource.com/c/v8/v8/+/2508404). > > I think something like what you're describing is doable; we could in fact > make the isolate script cache entries always be Scripts, they already > contain a WeakFixedArray of all the SFIs in the Script so it'd "just" be a > matter of clearing out Scripts from the cache whose SFI arrays are empty. > The tricky part will be, as you say, plumbing this into streaming and > deserialization -- I think plumbing into the existing streaming > implementation is off the table (since that is only executed with a new > source anyway), and the deserialization would need to become a bit more > aware of what it contains, since right now the serialized cache is just a > "dumb" dump of the object graph with no real way of hooking in existing > objects. > > I don't think we have the resources to prioritise a project like this at > the moment, particularly since our isolate cache hit rate is still pretty > high, but I'd be happy to discuss potential implementation a bit more if > you're interested. > > - Leszek > > On Mon, Feb 28, 2022 at 5:38 PM '[email protected]' via v8-dev < > [email protected]> wrote: > >> Sure thing, I've opened 12668 - isolate script cache stores streamed >> scripts incorrectly - v8 (chromium.org) >> <https://bugs.chromium.org/p/v8/issues/detail?id=12668>. However, I >> think I was incorrect earlier: I didn't figure out *the* problem, just >> *a* problem. The isolate script cache does a very nice job of >> deduplicating SharedFunctionInfos when a bunch of tabs are opened quickly, >> but that's not a very realistic customer behavior. Customers are more >> likely to interact with a tab for a while, causing plenty of allocations >> and GC cycles, before opening another tab, so ageing is a problem in this >> scenario. I completely understand that we should clear out the root SFI for >> a script after a while because it's unlikely to be used again. However, >> other SFIs for functions defined within the script may stay alive much >> longer. I think our main problem arises from the fact that when generating >> a new root SFI, V8 attaches it to a new Script instance, meaning SFIs for >> nested functions end up getting duplicated. I have a proposal that I think >> would help; what do you think of the following? >> >> When an entry in the isolate script cache becomes old, instead of >> deleting it, we could replace the value with a weak pointer to the Script >> and update the source pointer in the key to be weak too. That way, the >> entry would not retain any large objects. When the cache needs resizing, we >> could iterate the table and delete any entries that point to cleared weak >> pointers. >> >> If the cache lookup returns a Script, that means the calling code must >> create a new root SFI but attach it to the existing Script. I imagine this >> would require some additional plumbing through the various code paths >> (synchronous parse, finalizing a streaming parse, and finalizing a code >> cache deserialization), but I think that the additional complexity would be >> worthwhile for the result of fully reliable deduplication. In places where >> the parsing or deserialization code currently creates a new SFI other than >> the root, that code would first need to check whether the Script already >> has one. If the SFI already exists but has been flushed, the parsing or >> deserialization code should attach new bytecode to it. >> >> This proposal is only concerning the script cache, not the eval or regexp >> caches. >> >> Looking forward to hearing your thoughts. Thanks! >> >> On Monday, February 28, 2022 at 12:51:12 AM UTC-8 [email protected] >> wrote: >> >>> That's a great catch! Indeed the key should be based on the outer >>> language mode, can you file a bug? >>> >>> On Sat, Feb 26, 2022 at 1:07 AM '[email protected]' via v8-dev < >>> [email protected]> wrote: >>> >>>> I think I've figured out the problem and it should be simple to fix. >>>> When streaming, V8 checks the isolate script cache table based on the >>>> task->language_mode(), which may be the outer language mode of the >>>> compilation task if it hasn't finished yet. However, V8 then inserts the >>>> result into the table based on task->language_mode() again, which at that >>>> point is the language mode deduced from the script content. If the task >>>> was >>>> started in sloppy mode but the script says 'use strict', then the script >>>> gets stored in the table with a wrong key. I assume the key in both cases >>>> should be based on the outer language mode, not the text content of the >>>> script (which is already checked to match). >>>> >>>> On Wednesday, February 23, 2022 at 2:22:20 PM UTC-8 >>>> [email protected] wrote: >>>> >>>>> Excellent, thanks for the useful information! I tried the scenario >>>>> again with --js-flags=--no-isolate-script-cache-ageing and still see a >>>>> bunch of duplicated BytecodeArrays in a heap snapshot, so something else >>>>> must be preventing this script from using the Isolate script cache. I'll >>>>> continue investigating; thanks again for pointing me in the right >>>>> direction. >>>>> >>>>> On Wednesday, February 23, 2022 at 12:03:39 AM UTC-8 >>>>> [email protected] wrote: >>>>> >>>>>> On Wed, Feb 23, 2022 at 1:26 AM '[email protected]' via v8-dev >>>>>> <[email protected]> wrote: >>>>>> >>>>>>> Hello all, >>>>>>> >>>>>>> An idea came up recently which I imagine some of you have probably >>>>>>> already considered at some point, so I'd love to hear any thoughts you >>>>>>> have, or summaries of past discussions you'd be willing to share. Or if >>>>>>> this idea is fundamentally infeasible, I'd love to hear that too. >>>>>>> >>>>>>> The scenario: we've been investigating a case where many open tabs >>>>>>> all embed the same cross-site iframe, and all of those iframes get put >>>>>>> into >>>>>>> the same process due to the heuristics described in 899838 - >>>>>>> Improve process reuse policies - chromium >>>>>>> <https://bugs.chromium.org/p/chromium/issues/detail?id=899838>. >>>>>>> Ignoring for the moment whether those heuristics are optimal, the >>>>>>> result is >>>>>>> a single Isolate with many NativeContexts, where each NativeContext >>>>>>> loads >>>>>>> mostly the same scripts. >>>>>>> >>>>>>> The idea: could V8 share bytecode and/or Sparkplug code for >>>>>>> functions in those scripts? I know that TurboFan code is native context >>>>>>> dependent, but as far as I know, both bytecode and Sparkplug code are >>>>>>> native context independent. If V8 could avoid generating duplicates, >>>>>>> then >>>>>>> this scenario would use substantially less memory, plus tabs after the >>>>>>> first wouldn't have to wait on tiering up to Sparkplug. >>>>>>> >>>>>> >>>>>> I think this already happens, at least to some extent. >>>>>> SharedFunctionInfos are shared between native contexts through the >>>>>> compilation cache >>>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/compilation-cache-table.h;l=90;drc=a2d4701bea545269ad3f5fe6e111adb65c46b8da>, >>>>>> >>>>>> and bytecode/baseline code hangs off the SFI >>>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/shared-function-info.h;l=340;drc=a2d4701bea545269ad3f5fe6e111adb65c46b8da>. >>>>>> >>>>>> It's not 100% reliable since the cache is aged. Is the cache not hit in >>>>>> your scenario? >>>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Seth >>>>>>> >>>>>>> -- >>>>>>> -- >>>>>>> v8-dev mailing list >>>>>>> [email protected] >>>>>>> http://groups.google.com/group/v8-dev >>>>>>> --- >>>>>>> You received this message because you are subscribed to the Google >>>>>>> Groups "v8-dev" group. >>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>> send an email to [email protected]. >>>>>>> To view this discussion on the web visit >>>>>>> https://groups.google.com/d/msgid/v8-dev/bbeb2a47-a26c-4623-81a5-7cad2c0dfec1n%40googlegroups.com >>>>>>> >>>>>>> <https://groups.google.com/d/msgid/v8-dev/bbeb2a47-a26c-4623-81a5-7cad2c0dfec1n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>> . >>>>>>> >>>>>> -- >>>> -- >>>> v8-dev mailing list >>>> [email protected] >>>> http://groups.google.com/group/v8-dev >>>> --- >>>> You received this message because you are subscribed to the Google >>>> Groups "v8-dev" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to [email protected]. >>>> >>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/v8-dev/21626811-c09d-43bf-a5aa-733aa7a6bfcan%40googlegroups.com >>>> >>>> <https://groups.google.com/d/msgid/v8-dev/21626811-c09d-43bf-a5aa-733aa7a6bfcan%40googlegroups.com?utm_medium=email&utm_source=footer> >>>> . >>>> >>> -- >> -- >> v8-dev mailing list >> [email protected] >> http://groups.google.com/group/v8-dev >> --- >> You received this message because you are subscribed to the Google Groups >> "v8-dev" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> > To view this discussion on the web visit >> https://groups.google.com/d/msgid/v8-dev/eca14a0a-1dba-4f2c-968d-bf9cb1dc66cbn%40googlegroups.com >> >> <https://groups.google.com/d/msgid/v8-dev/eca14a0a-1dba-4f2c-968d-bf9cb1dc66cbn%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > -- -- v8-dev mailing list [email protected] http://groups.google.com/group/v8-dev --- You received this message because you are subscribed to the Google Groups "v8-dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/v8-dev/5de23b2b-1586-4e42-a3a2-7b07ce1f4859n%40googlegroups.com.
