Ah yes, I recall we have actually observed this in the past, but it fell
off the task queue. It was problematic in particular for asm.js -- there
was a patch to fix it just for that case, but it never ended up landing (
https://chromium-review.googlesource.com/c/v8/v8/+/2508404).

I think something like what you're describing is doable; we could in fact
make the isolate script cache entries always be Scripts, they already
contain a WeakFixedArray of all the SFIs in the Script so it'd "just" be a
matter of clearing out Scripts from the cache whose SFI arrays are empty.
The tricky part will be, as you say, plumbing this into streaming and
deserialization -- I think plumbing into the existing streaming
implementation is off the table (since that is only executed with a new
source anyway), and the deserialization would need to become a bit more
aware of what it contains, since right now the serialized cache is just a
"dumb" dump of the object graph with no real way of hooking in existing
objects.

I don't think we have the resources to prioritise a project like this at
the moment, particularly since our isolate cache hit rate is still pretty
high, but I'd be happy to discuss potential implementation a bit more if
you're interested.

- Leszek

On Mon, Feb 28, 2022 at 5:38 PM '[email protected]' via v8-dev <
[email protected]> wrote:

> Sure thing, I've opened  12668 - isolate script cache stores streamed
> scripts incorrectly - v8 (chromium.org)
> <https://bugs.chromium.org/p/v8/issues/detail?id=12668>. However, I think
> I was incorrect earlier: I didn't figure out *the* problem, just *a* problem.
> The isolate script cache does a very nice job of deduplicating
> SharedFunctionInfos when a bunch of tabs are opened quickly, but that's not
> a very realistic customer behavior. Customers are more likely to interact
> with a tab for a while, causing plenty of allocations and GC cycles, before
> opening another tab, so ageing is a problem in this scenario. I completely
> understand that we should clear out the root SFI for a script after a while
> because it's unlikely to be used again. However, other SFIs for functions
> defined within the script may stay alive much longer. I think our main
> problem arises from the fact that when generating a new root SFI, V8
> attaches it to a new Script instance, meaning SFIs for nested functions end
> up getting duplicated. I have a proposal that I think would help; what do
> you think of the following?
>
> When an entry in the isolate script cache becomes old, instead of deleting
> it, we could replace the value with a weak pointer to the Script and update
> the source pointer in the key to be weak too. That way, the entry would not
> retain any large objects. When the cache needs resizing, we could iterate
> the table and delete any entries that point to cleared weak pointers.
>
> If the cache lookup returns a Script, that means the calling code must
> create a new root SFI but attach it to the existing Script. I imagine this
> would require some additional plumbing through the various code paths
> (synchronous parse, finalizing a streaming parse, and finalizing a code
> cache deserialization), but I think that the additional complexity would be
> worthwhile for the result of fully reliable deduplication. In places where
> the parsing or deserialization code currently creates a new SFI other than
> the root, that code would first need to check whether the Script already
> has one. If the SFI already exists but has been flushed, the parsing or
> deserialization code should attach new bytecode to it.
>
> This proposal is only concerning the script cache, not the eval or regexp
> caches.
>
> Looking forward to hearing your thoughts. Thanks!
>
> On Monday, February 28, 2022 at 12:51:12 AM UTC-8 [email protected]
> wrote:
>
>> That's a great catch! Indeed the key should be based on the outer
>> language mode, can you file a bug?
>>
>> On Sat, Feb 26, 2022 at 1:07 AM '[email protected]' via v8-dev <
>> [email protected]> wrote:
>>
>>> I think I've figured out the problem and it should be simple to fix.
>>> When streaming, V8 checks the isolate script cache table based on the
>>> task->language_mode(), which may be the outer language mode of the
>>> compilation task if it hasn't finished yet. However, V8 then inserts the
>>> result into the table based on task->language_mode() again, which at that
>>> point is the language mode deduced from the script content. If the task was
>>> started in sloppy mode but the script says 'use strict', then the script
>>> gets stored in the table with a wrong key. I assume the key in both cases
>>> should be based on the outer language mode, not the text content of the
>>> script (which is already checked to match).
>>>
>>> On Wednesday, February 23, 2022 at 2:22:20 PM UTC-8
>>> [email protected] wrote:
>>>
>>>> Excellent, thanks for the useful information! I tried the scenario
>>>> again with --js-flags=--no-isolate-script-cache-ageing and still see a
>>>> bunch of duplicated BytecodeArrays in a heap snapshot, so something else
>>>> must be preventing this script from using the Isolate script cache. I'll
>>>> continue investigating; thanks again for pointing me in the right 
>>>> direction.
>>>>
>>>> On Wednesday, February 23, 2022 at 12:03:39 AM UTC-8
>>>> [email protected] wrote:
>>>>
>>>>> On Wed, Feb 23, 2022 at 1:26 AM '[email protected]' via v8-dev <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> An idea came up recently which I imagine some of you have probably
>>>>>> already considered at some point, so I'd love to hear any thoughts you
>>>>>> have, or summaries of past discussions you'd be willing to share. Or if
>>>>>> this idea is fundamentally infeasible, I'd love to hear that too.
>>>>>>
>>>>>> The scenario: we've been investigating a case where many open tabs
>>>>>> all embed the same cross-site iframe, and all of those iframes get put 
>>>>>> into
>>>>>> the same process due to the heuristics described in 899838 - Improve
>>>>>> process reuse policies - chromium
>>>>>> <https://bugs.chromium.org/p/chromium/issues/detail?id=899838>.
>>>>>> Ignoring for the moment whether those heuristics are optimal, the result 
>>>>>> is
>>>>>> a single Isolate with many NativeContexts, where each NativeContext loads
>>>>>> mostly the same scripts.
>>>>>>
>>>>>> The idea: could V8 share bytecode and/or Sparkplug code for functions
>>>>>> in those scripts? I know that TurboFan code is native context dependent,
>>>>>> but as far as I know, both bytecode and Sparkplug code are native context
>>>>>> independent. If V8 could avoid generating duplicates, then this scenario
>>>>>> would use substantially less memory, plus tabs after the first wouldn't
>>>>>> have to wait on tiering up to Sparkplug.
>>>>>>
>>>>>
>>>>> I think this already happens, at least to some extent.
>>>>> SharedFunctionInfos are shared between native contexts through the
>>>>> compilation cache
>>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/compilation-cache-table.h;l=90;drc=a2d4701bea545269ad3f5fe6e111adb65c46b8da>,
>>>>> and bytecode/baseline code hangs off the SFI
>>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/objects/shared-function-info.h;l=340;drc=a2d4701bea545269ad3f5fe6e111adb65c46b8da>.
>>>>> It's not 100% reliable since the cache is aged. Is the cache not hit in
>>>>> your scenario?
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Seth
>>>>>>
>>>>>> --
>>>>>> --
>>>>>> v8-dev mailing list
>>>>>> [email protected]
>>>>>> http://groups.google.com/group/v8-dev
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "v8-dev" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/v8-dev/bbeb2a47-a26c-4623-81a5-7cad2c0dfec1n%40googlegroups.com
>>>>>> <https://groups.google.com/d/msgid/v8-dev/bbeb2a47-a26c-4623-81a5-7cad2c0dfec1n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>> --
>>> --
>>> v8-dev mailing list
>>> [email protected]
>>> http://groups.google.com/group/v8-dev
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "v8-dev" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>>
>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/v8-dev/21626811-c09d-43bf-a5aa-733aa7a6bfcan%40googlegroups.com
>>> <https://groups.google.com/d/msgid/v8-dev/21626811-c09d-43bf-a5aa-733aa7a6bfcan%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>> --
> --
> v8-dev mailing list
> [email protected]
> http://groups.google.com/group/v8-dev
> ---
> You received this message because you are subscribed to the Google Groups
> "v8-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/v8-dev/eca14a0a-1dba-4f2c-968d-bf9cb1dc66cbn%40googlegroups.com
> <https://groups.google.com/d/msgid/v8-dev/eca14a0a-1dba-4f2c-968d-bf9cb1dc66cbn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

-- 
-- 
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
--- 
You received this message because you are subscribed to the Google Groups 
"v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/v8-dev/CAGRskv8Txhfkc4%3DdWhftdt86HdUHRLg1R0ZN5cr1vpPOrCErEA%40mail.gmail.com.

Reply via email to