On Tue, Jan 14, 2025 at 5:59 AM Jakob Kummerow <[email protected]> wrote:
> - from what you describe, perhaps it would be feasible to craft a reproducer. It'd probably have to be a custom V8 embedder that, in a loop, creates many fresh isolates and instantiates/runs the same (or several?) demo Wasm module in them.

I tried exactly that yesterday, and was able to see that "external memory" was indeed correlated across isolates, but after creating/destroying thousands of isolates it seemed to converge on a reasonable number rather than keep growing forever. But in prod we see something in external memory growing and growing.
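A minimal sketch of that kind of loop, for concreteness (not the exact harness: it assumes V8::InitializePlatform()/V8::Initialize() have already run, that params carries a valid array_buffer_allocator, and that wasm_js is a small script that compiles and instantiates the demo module):

  // Sketch only. Assumes V8 is already initialized and wasm_js does
  // something like `new WebAssembly.Instance(new WebAssembly.Module(bytes))`.
  #include <cstdio>
  #include "v8.h"

  void RunOnce(const v8::Isolate::CreateParams& params, const char* wasm_js) {
    v8::Isolate* isolate = v8::Isolate::New(params);
    {
      v8::Isolate::Scope isolate_scope(isolate);
      v8::HandleScope handle_scope(isolate);
      v8::Local<v8::Context> context = v8::Context::New(isolate);
      v8::Context::Scope context_scope(context);

      // Compile and run the script that instantiates the demo Wasm module.
      v8::Local<v8::String> src =
          v8::String::NewFromUtf8(isolate, wasm_js).ToLocalChecked();
      v8::Script::Compile(context, src).ToLocalChecked()
          ->Run(context).ToLocalChecked();

      // Log what this isolate reports as "external memory".
      v8::HeapStatistics stats;
      isolate->GetHeapStatistics(&stats);
      std::printf("external_memory = %zu\n", stats.external_memory());
    }
    isolate->Dispose();
  }

Calling that a few thousand times in a loop is enough to watch whether the reported number converges or keeps climbing.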
> - it could make sense to verify (with printfs in their destructors) that both Isolates and NativeModules get destroyed as expected. It's conceivable that the memory growth you're observing is intentional caching (of generated code, or something?) because the WasmEngine thinks that the cached data is still needed/useful.
>
> How/where exactly are you seeing this increased "external memory"? I.e. what reporting system are you using to get memory consumption numbers?
>
> On Tue, Jan 14, 2025 at 1:09 AM Kenton Varda <[email protected]> wrote:
>
>> To add context here:
>>
>> The problem appears to show up only after running in production for an hour or two. During that time we will have created thousands of isolates to handle millions of requests.
>>
>> But the problem seems to affect *new* isolates, even when those isolates are loaded with applications that had been loaded into previous isolates without problems. Startup of an application should be 100% deterministic since we disallow any I/O during startup, but we're seeing that after the host has been running a while, new isolates are showing much higher "external memory" on startup. (E.g. 400MB external memory, but we enforce a 128MB limit on the whole isolate.)
>>
>> We observed that the wasm native module cache causes identical wasm modules to be shared across isolates, and that wasm lazy compilation causes the memory usage of a wasm module -- as accounted by all isolates that have loaded it -- to change.
>>
>> Could it be that there is a memory leak in lazy compilation, such that these shared cached modules are gradually growing over time, to the point where new isolates that try to load these modules are being hit with extremely high "external memory" numbers right off the bat?
>>
>> -Kenton
>>
>> On Mon, Jan 13, 2025 at 5:31 PM Erik Corry <[email protected]> wrote:
>>
>>> It looks like it's related to shared objects between isolates. Is there a newer document than https://docs.google.com/document/d/18lYuaEsDSudzl2TDu-nc-0sVXW7WTGAs14k64GEhnFg/edit?usp=drivesdk that describes how this works today? In particular cross-isolate GCs?
>>>
>>> On Mon, 13 Jan 2025, 15:25 Jakob Kummerow, <[email protected]> wrote:
>>>
>>>> Sounds like a bug, but without more details (or a repro) I don't have a more specific guess than that.
>>>>
>>>> If you're desperate, you could try to bisect it (even with a flaky repro). Or review the ~500 changes between those branches: https://chromium.googlesource.com/v8/v8/+log/branch-heads/13.1..branch-heads/13.2?n=10000
>>>>
>>>> On Mon, Jan 13, 2025 at 2:48 PM 'Dan Lapid' via v8-dev <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> In V8 13.2 and 13.3 we see Wasm isolates' external memory usage blowing up sometimes (up to gigabytes). Under V8 13.1 the same code would never use more than 80-100MB.
>>>>> The issue doesn't happen every time for the same wasm bytecode, and it doesn't even reproduce locally. But some significant percentage of the time it does happen.
>>>>> This only started happening in 13.2; what are we missing? Should we be enabling/disabling some flags?
>>>>> It also seems that 13.3 is significantly worse in terms of error rate.
>>>>> The problem happens under "--liftoff-only". We use pointer compression but not the sandbox.
>>>>> We've tried enabling --turboshaft-wasm in 13.1 and the problem did not reproduce.
>>>>> Has anything changed that we need to adapt to? Would really appreciate your help!
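For reference, flag configurations like the ones named above are typically pinned from the embedder before V8 is initialized, e.g. via V8::SetFlagsFromString. A minimal sketch: the function name is illustrative, and treating --liftoff-only and --turboshaft-wasm as alternative configurations is an assumption (the thread doesn't say whether the 13.1 experiment kept --liftoff-only).

  // Sketch only: applies the flags discussed in this thread. Flags must be
  // set before V8::Initialize() and before the first isolate is created.
  #include "v8.h"

  void ConfigureWasmCompilation(bool turboshaft_wasm_experiment) {
    if (turboshaft_wasm_experiment) {
      // The 13.1 experiment mentioned above: allow tier-up through Turboshaft.
      v8::V8::SetFlagsFromString("--turboshaft-wasm");
    } else {
      // The production configuration described above: baseline (Liftoff) code only.
      v8::V8::SetFlagsFromString("--liftoff-only");
    }
  }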
