On Tue, Jan 14, 2025 at 5:59 AM Jakob Kummerow <[email protected]>
wrote:

> - from what you describe, perhaps it would be feasible to craft a
> reproducer. It'd probably have to be a custom V8 embedder that, in a loop,
> creates many fresh isolates and instantiates/runs the same demo Wasm module
> (or several?) in them.
>

I tried exactly that yesterday, and was able to see that "external memory"
was indeed correlated across isolates, but after creating/destroying
thousands of isolates it seemed to converge on a reasonable number rather
than keep growing forever.

But in prod, something that gets counted as external memory keeps growing and
growing.


> - it could make sense to verify (with printfs in their destructors) that
> both Isolates and NativeModules get destroyed as expected. It's
> conceivable that the memory growth you're observing is intentional caching
> (of generated code, or something?) because the WasmEngine thinks that the
> cached data is still needed/useful.
>
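
On the embedder side we can at least count isolate lifetimes without touching
V8; something along these lines (sketch, helper names are mine). Confirming
NativeModule destruction would need a printf inside V8 itself (in its
destructor in the wasm code manager, if I remember the layout right):

// Wrappers around isolate creation/disposal that keep a live count; a
// steadily growing count would show that isolates are not being destroyed.
#include <atomic>
#include <cstdio>

#include "include/v8.h"

static std::atomic<int> g_live_isolates{0};

v8::Isolate* NewIsolateLogged(const v8::Isolate::CreateParams& params) {
  v8::Isolate* isolate = v8::Isolate::New(params);
  printf("created isolate %p (live: %d)\n", static_cast<void*>(isolate),
         ++g_live_isolates);
  return isolate;
}

void DisposeIsolateLogged(v8::Isolate* isolate) {
  printf("disposing isolate %p (live after: %d)\n",
         static_cast<void*>(isolate), --g_live_isolates);
  isolate->Dispose();
}
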
> How/where exactly are you seeing this increased "external memory"?
> I.e. what reporting system are you using to get memory consumption numbers?
>
>
> On Tue, Jan 14, 2025 at 1:09 AM Kenton Varda <[email protected]>
> wrote:
>
>> To add context here:
>>
>> The problem appears to show up only after running in production for an
>> hour or two. During that time we will have created thousands of isolates to
>> handle millions of requests.
>>
>> But the problem seems to affect *new* isolates, even when those isolates
>> are loaded with applications that had been loaded into previous isolates
>> without problems. Startup of an application should be 100% deterministic
>> since we disallow any I/O during startup, but we're seeing that after the
>> host has been running a while, new isolates are showing much higher
>> "external memory" on startup. (E.g. 400MB external memory, but we enforce a
>> 128MB limit on the whole isolate.)
>>
>> We observed that the wasm native module cache causes identical wasm
>> modules to be shared across isolates, and that wasm lazy compilation causes
>> memory usage of a wasm module -- as accounted by all isolates that have
>> loaded it -- to change.
>>
>> Could it be that there is a memory leak in lazy compilation, such that
>> these shared cached modules are gradually growing over time, to the point
>> where new isolates that try to load these modules are being hit with
>> extremely high "external memory" numbers right off the bat?
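
One way to probe that from the embedder might be to share the module across
isolates explicitly via CompiledWasmModule (rather than relying on the
implicit cache), trigger lazy compilation in a second isolate, and watch
whether the first isolate's reported external memory moves. Rough sketch (the
ExternalMemory() helper, the "main" export name, and an import-free module are
my assumptions):

// Sketch: does lazy compilation in isolate B change the external memory
// reported by isolate A when both share the same compiled module?
#include <cstdio>

#include "include/v8.h"

size_t ExternalMemory(v8::Isolate* isolate) {
  v8::HeapStatistics stats;
  isolate->GetHeapStatistics(&stats);
  return stats.external_memory();
}

void CheckSharedModuleAccounting(const v8::Isolate::CreateParams& params,
                                 v8::MemorySpan<const uint8_t> wire_bytes) {
  v8::Isolate* a = v8::Isolate::New(params);
  v8::Isolate* b = v8::Isolate::New(params);

  // Compile in isolate A and grab a handle to the shared compiled module.
  v8::CompiledWasmModule shared = [&] {
    v8::Isolate::Scope is(a);
    v8::HandleScope hs(a);
    v8::Local<v8::Context> ctx = v8::Context::New(a);
    v8::Context::Scope cs(ctx);
    return v8::WasmModuleObject::Compile(a, wire_bytes)
        .ToLocalChecked()
        ->GetCompiledModule();
  }();

  size_t a_before = ExternalMemory(a);

  {
    // Re-materialize the same module in isolate B, instantiate it, and call
    // an export so Liftoff lazily compiles something. Assumes the module has
    // no imports and exports a function named "main".
    v8::Isolate::Scope is(b);
    v8::HandleScope hs(b);
    v8::Local<v8::Context> ctx = v8::Context::New(b);
    v8::Context::Scope cs(ctx);
    v8::Local<v8::WasmModuleObject> mod =
        v8::WasmModuleObject::FromCompiledModule(b, shared).ToLocalChecked();
    ctx->Global()
        ->Set(ctx, v8::String::NewFromUtf8Literal(b, "sharedModule"), mod)
        .Check();
    v8::Local<v8::String> src = v8::String::NewFromUtf8Literal(
        b, "new WebAssembly.Instance(sharedModule).exports.main();");
    v8::Script::Compile(ctx, src).ToLocalChecked()->Run(ctx).ToLocalChecked();
  }

  size_t a_after = ExternalMemory(a);
  printf("isolate A external memory: before=%zu after=%zu\n", a_before,
         a_after);

  b->Dispose();
  a->Dispose();
}
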
>>
>> -Kenton
>>
>> On Mon, Jan 13, 2025 at 5:31 PM Erik Corry <[email protected]>
>> wrote:
>>
>>> It looks like it's related to shared objects between isolates. Is there
>>> a newer document than
>>> https://docs.google.com/document/d/18lYuaEsDSudzl2TDu-nc-0sVXW7WTGAs14k64GEhnFg/edit?usp=drivesdk
>>> that describes how this works today? In particular cross-isolate GCs?
>>>
>>> On Mon, 13 Jan 2025, 15:25 Jakob Kummerow, <[email protected]>
>>> wrote:
>>>
>>>> Sounds like a bug, but without more details (or a repro) I don't have a
>>>> more specific guess than that.
>>>>
>>>> If you're desperate, you could try to bisect it (even with a flaky
>>>> repro). Or review the ~500 changes between those branches:
>>>> https://chromium.googlesource.com/v8/v8/+log/branch-heads/13.1..branch-heads/13.2?n=10000
>>>>
>>>>
>>>> On Mon, Jan 13, 2025 at 2:48 PM 'Dan Lapid' via v8-dev <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> In V8 13.2 and 13.3 we sometimes see the external memory usage of wasm
>>>>> isolates blowing up (up to gigabytes).
>>>>> Under V8 13.1 the same code would never use more than 80-100MB.
>>>>> The issue doesn't happen every time for the same wasm bytecode, and it
>>>>> doesn't reproduce locally at all, but in production it happens a
>>>>> significant percentage of the time.
>>>>> This only started happening in 13.2; what are we missing? Should we be
>>>>> enabling/disabling some flags?
>>>>> 13.3 also seems to be significantly worse in terms of error rate.
>>>>> The problem happens under "--liftoff-only".
>>>>> We use pointer compression but not the sandbox.
>>>>> We've tried enabling --turboshaft-wasm in 13.1 and the problem did not
>>>>> reproduce.
>>>>> Has anything changed that we need to adapt to?
>>>>> Would really appreciate your help!
>>>>>
