Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On 7/13/18 5:22 PM, gsquel...@mozilla.com wrote:
> E.g., could I instrument one class, so that every allocation would be tracked automatically, and I'd get nice stats at the end?

You mean apart from just having a memory reporter for it?

> Including wasted space because of larger allocation blocks?

Memory reporters using mallocSizeOf include that space, yes.

> Could I even run what-if scenarios, where I could instrument a class and extract its current size but also provide an alternate size (based on what I think I could make it shrink), and in the end I'll know how much I could save overall?

You could hack the relevant memory reporter, sure.

> Do we have Try tests that simulate real-world usage, so we could collect memory-usage data that's relevant to our users, but also reproducible?

See the "awsy-e10s" test suite, which sort of aims to do that.

> Should there be some kind of Talos-like CI tests that focus on memory usage, so we'd get some warning if a particular patch suddenly eats too much memory?

This is what awsy-e10s aims to do, yes.

-Boris
___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Wednesday, July 11, 2018 at 4:19:15 AM UTC+10, Kris Maglione wrote: > [...] > Essentially what this means, though, is that if we identify an area of > overhead that's 50KB[3] or larger that can be eliminated, it *has* to be > eliminated. There just aren't that many large chunks to remove. They all need > to go. And if an area of code has a dozen 5KB chunks that can be eliminated, > maybe they don't all have to go, but at least half of them do. The more the > better. Some questions: -- Sorry if some of this is already common knowledge or has been discussed. Are there tools available, that could easily track memory usage of specific things? E.g., could I instrument one class, so that every allocation would be tracked automatically, and I'd get nice stats at the end? Including wasted space because of larger allocation blocks? Could I even run what-if scenarios, where I could instrument a class and extract its current size but also provide an alternate size (based on what I think I could make it shrink), and in the end I'll know how much I could save overall? Do we have Try tests that simulate real-world usage, so we could collect memory-usage data that's relevant to our users, but also reproducible? Should there be some kind of Talos-like CI tests that focus on memory usage, so we'd get some warning if a particular patch suddenly eats too much memory? ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Fri, Jul 13, 2018 at 11:14:24AM -0400, Randell Jesup wrote:
> Hash tables are a big issue. There are a lot of 64K/128K/256K allocations at the moment for hashtables. When we started looking at this in bug 1436250, we had a 256K, ~4 128K, and a whole bunch of 64K hashtable allocs (on linux). Some may be smaller or gone now, but it's still big.
>
> I wonder if it's worth the perf hit to realloc to exact size hash tables that are build-once - probably. hashtable->Finalize()? (I wonder if that would let us make any other memory/speed optimizations if we know the table is now static.)

I think, as much as possible, we really want static or mostly-static hash tables to be shared between processes. I've already been working on this in a few areas, e.g., bug 1470365 for string bundles, which are completely static, and bug 1471025 for preferences, which are mostly static.

And those patches add helpers which should make it pretty easy to do the same for more things in the future, so that should probably be our go-to strategy for reducing per-process overhead, when possible.
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
>> Also note that dealing with the "importance" of a page is not just a matter of visibility and focus. There are other factors to take into account such as if the page is playing audio or video (like listening to music on YouTube), if it's self-updating and so on.
>
> Absolutely

We should think about how we can make different performance and memory trade-offs for processes that are hosting top-level frames and processes hosting 3rd-party subframes.
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
>On 13/07/2018 04:55, Randell Jesup wrote: >> Correct - we need to have observers/what-have-you for >> background/foreground state (and we may want an intermediate state or >> two - foreground-but-not-focused (for example a visible window that >> isn't the focused window); recently-in-foreground (switching back and >> forth); background-for-longer-than-delta, etc. >> >> Modules can use these to drop caches, shut down unnecessary threads, >> change strategies, force GCs/CCs, etc. >Also note that dealing with the "importance" of a page is not just a >matter of visibility and focus. There are other factors to take into >account such as if the page is playing audio or video (like listening to >music on YouTube), if it's self-updating and so on. Absolutely >The only mechanism to reduce memory consumption we have now is >memory-pressure events which while functional are still under-used. We >might also need more fine grained mechanisms than "drop as much memory >as you can". This is also very important for GeckoView -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
>On Thu, Jul 12, 2018 at 08:56:28AM -0700, Andrew McCreight wrote: >>On Thu, Jul 12, 2018 at 3:57 AM, Emilio Cobos Álvarez >>wrote: >> >>> Just curious, is there a bug on file to measure excess capacity on >>> nsTArrays and hash tables? [snip] >I kind of suspect that improving the storage efficiency of hashtables (and >probably nsTArrays too) will have an out-sized effect on per-process >memory. Just at startup, for a mostly empty process, we have a huge amount >of memory devoted to hashtables that would otherwise be shared across a >bunch of origins—enough that removing just 4 bytes of padding per entry >would save 87K per process. And that number tends to grow as we populate >caches that we need for things like layout and atoms. Hash tables are a big issue. There are a lot of 64K/128K/256K allocations at the moment for hashtables. When we started looking at this in bug 1436250, we had a 256K, ~4 128K, and a whole bunch of 64K hashtable allocs (on linux). Some may be smaller or gone now, but it's still big. I wonder if it's worth the perf hit to realloc to exact size hash tables that are build-once - probably. hashtable->Finalize()? (I wonder if that would let us make any other memory/speed optimizations if we know the table is now static.) -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
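To put a rough number on the "realloc to exact size" idea above, here is a minimal sketch. It assumes a hypothetical table whose capacity is always a power of two and which grows once a maximum load factor is exceeded; the entry size, entry count, and load factor are illustrative values, not measurements of Gecko's hashtables.

```python
import math


def pow2_capacity(entry_count, max_load_factor):
    """Smallest power-of-two slot count that keeps the load factor in bounds."""
    min_slots = math.ceil(entry_count / max_load_factor)
    return 1 << max(min_slots - 1, 0).bit_length()


def table_bytes(entry_count, entry_size, max_load_factor):
    """Bytes used by the (hypothetical) power-of-two table."""
    return pow2_capacity(entry_count, max_load_factor) * entry_size


def exact_bytes(entry_count, entry_size):
    """Bytes used if a build-once table is stored with no spare slots."""
    return entry_count * entry_size


# Illustrative numbers only: 5000 entries of 16 bytes, max load factor 0.75.
grown = table_bytes(5000, 16, 0.75)
exact = exact_bytes(5000, 16)
print(f"power-of-two table: {grown} bytes, exact size: {exact} bytes, "
      f"difference: {grown - exact} bytes")
```

With these made-up inputs the table lands on a 128K allocation, the same order of magnitude as the allocations mentioned above, while the entries themselves need well under two thirds of that.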
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
This touches on a really important point: we're not the only ones allocating memory. Just a few that come to mind: GPU drivers, system media codecs, a11y tools, and especially on Windows we have to deal with "utility" applications, corporate-mandated gunk, and downright crapware. When we're measuring progress toward our goals, look at not only your own pristine dev box but also that one neighbor whose adware you're always cleaning out.

On Fri, Jul 13, 2018 at 7:57 AM Gabriele Svelto wrote:
> Just another bit of info to raise awareness on a thorny issue we have to face if we want to significantly raise the number of content processes. On 64-bit Windows we often consume significantly more commit space than physical memory. This consumption is currently unaccounted for in about:memory though I've seen hints of it being caused by the GPU driver (or other parts of the graphics pipeline). I've filed bug 1475518 [1] so that I don't forget and I encourage anybody with Windows experience to have a look because it's something we _need_ to solve to reduce content process memory usage.
>
> Gabriele
>
> [1] Commit-space usage investigation
> https://bugzilla.mozilla.org/show_bug.cgi?id=1475518
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
Just another bit of info to raise awareness on a thorny issue we have to face if we want to significantly raise the number of content processes. On 64-bit Windows we often consume significantly more commit space than physical memory. This consumption is currently unaccounted for in about:memory though I've seen hints of it being caused by the GPU driver (or other parts of the graphics pipeline). I've filed bug 1475518 [1] so that I don't forget and I encourage anybody with Windows experience to have a look because it's something we _need_ to solve to reduce content process memory usage.

Gabriele

[1] Commit-space usage investigation
https://bugzilla.mozilla.org/show_bug.cgi?id=1475518
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On 13/07/2018 04:55, Randell Jesup wrote:
> Correct - we need to have observers/what-have-you for background/foreground state (and we may want an intermediate state or two - foreground-but-not-focused (for example a visible window that isn't the focused window); recently-in-foreground (switching back and forth); background-for-longer-than-delta, etc.)
>
> Modules can use these to drop caches, shut down unnecessary threads, change strategies, force GCs/CCs, etc.
>
> Some of this certainly already exists, but may need to be extended (and used a lot more).

We already had most of this stuff in the ProcessPriorityManager [1], which has only ever been used in Firefox OS. Since we had one-process-per-tab there it was designed that way, so it might need some reworking to deal with one tab consisting of multiple content processes.

Also note that dealing with the "importance" of a page is not just a matter of visibility and focus. There are other factors to take into account such as if the page is playing audio or video (like listening to music on YouTube), if it's self-updating and so on.

The only mechanism to reduce memory consumption we have now is memory-pressure events, which while functional are still under-used. We might also need more fine-grained mechanisms than "drop as much memory as you can".

Gabriele

[1] https://searchfox.org/mozilla-central/rev/46292b1212d2d61d7b5a7df184406774727085b8/dom/ipc/ProcessPriorityManager.cpp
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
>On 07/12/2018 11:08 PM, Randell Jesup wrote: >> We may need to trade first-load time against memory use by lazy-initing >> more things than now, though we did quite a bit on that already for >> reducing startup time. > >One thing to remember that some of the child processes will be more >important than others. For example all the processes used for browsing >contexts in the foreground tab should probably prefer performance over >memory (in cases that is something we can choose from), but if a >process is only used for browsing contexts in background tabs and isn't >playing any audio or such, it can probably use less memory hungry >approaches. Correct - we need to have observers/what-have-you for background/foreground state (and we may want an intermediate state or two - foreground-but-not-focused (for example a visible window that isn't the focused window); recently-in-foreground (switching back and forth); background-for-longer-than-delta, etc. Modules can use these to drop caches, shut down unnecessary threads, change strategies, force GCs/CCs, etc. Some of this certainly already exists, but may need to be extended (and used a lot more). -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
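As a rough illustration of the observer idea described above, the sketch below models a few intermediate foreground/background states and a module that reacts in steps rather than with a single "drop everything" signal. All names here (`Importance`, `ImportanceNotifier`, `Cache`) are hypothetical and are not existing Gecko APIs.

```python
from enum import IntEnum


class Importance(IntEnum):
    """Hypothetical process states, ordered from most to least important."""
    FOREGROUND_FOCUSED = 0
    FOREGROUND_UNFOCUSED = 1
    RECENTLY_FOREGROUND = 2
    BACKGROUND = 3
    BACKGROUND_LONG = 4


class ImportanceNotifier:
    """Tracks the current state and notifies observers when it changes."""

    def __init__(self):
        self.state = Importance.FOREGROUND_FOCUSED
        self._observers = []

    def add_observer(self, callback):
        self._observers.append(callback)

    def set_state(self, new_state):
        if new_state == self.state:
            return  # no change, nothing to notify
        self.state = new_state
        for callback in self._observers:
            callback(new_state)


class Cache:
    """Hypothetical module that trims itself more aggressively over time."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def on_importance_change(self, state):
        if state >= Importance.BACKGROUND_LONG:
            self.entries.clear()  # long-lived background: drop everything
        elif state >= Importance.BACKGROUND:
            # Recently backgrounded: keep the newest half of the entries.
            keep = len(self.entries) // 2
            items = list(self.entries.items())
            self.entries = dict(items[len(items) - keep:])


notifier = ImportanceNotifier()
cache = Cache({key: key * key for key in range(4)})
notifier.add_observer(cache.on_importance_change)
```

Calling `notifier.set_state(Importance.BACKGROUND)` halves the cache, and a later `Importance.BACKGROUND_LONG` empties it; a real implementation would also need to restore threads and caches when the process returns to the foreground.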
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Fri, Jul 13, 2018 at 1:56 AM, Andrew McCreight wrote: > > > > Just curious, is there a bug on file to measure excess capacity on > > nsTArrays and hash tables? > > njn looked at that kind of issue at some point (he changed how arrays grow, > for instance, to reduce overhead), but it has probably been around 5 years, > so there may be room for improvement for things added in the meanwhile. > For a trip down memory lane, check out https://blog.mozilla.org/nnethercote/2011/08/05/clownshoes-available-in-sizes-2101-and-up/. The size classes described in that post are still in use today. More usefully: if anyone wants to investigate slop -- which is only one kind of wasted space, but an important one -- it's now really easy with DMD: - Invoke DMD in "Live" mode (i.e. generic heap profiling mode, rather than dark matter detection mode). - Use the `--sort-by slop` flag with dmd.py. Full instructions are at https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD. Nick ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
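For readers unfamiliar with the term, slop is the gap between what a caller asks for and what the allocator actually hands back. The sketch below uses a deliberately simplified, hypothetical allocator that rounds every request up to the next power of two; the real size classes are different and are described in the blog post linked above.

```python
def usable_size(requested):
    """Hypothetical allocator: round each request up to a power of two."""
    return 1 << max(requested - 1, 0).bit_length()


def slop(requested):
    """Bytes allocated but never asked for."""
    return usable_size(requested) - requested


# A request of exactly 1024 bytes wastes nothing under this model, while
# adding a 16-byte header to it pushes the request into the next size class.
for size in (1024, 1024 + 16):
    print(f"requested {size}, usable {usable_size(size)}, slop {slop(size)}")
```

Under this model, a buffer that is "a power of two plus a small header" wastes almost half of its allocation, which is the pattern DMD's slop sorting is meant to surface.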
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Fri, Jul 13, 2018, at 6:51 AM, Kris Maglione wrote: > I actually have a patch sitting around with helpers to make it super easy to > use smart pointers as tagged pointers :) I never wound up putting it up for > review, since my original use case went away, but it you can think of any > specific cases where it would be useful, I'd be happy to try and get it > landed. Speaking of tagged pointers, I've used lower one or two bits for tagging a number of times, but I've never tried packing things into the high bits of a 64 bit pointer. Is that inadvisable for any reason? How many bits can I use, given the 64 bit platforms we need to support? ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
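For context on the low-bit technique mentioned above, here is a small sketch that simulates it on plain integers standing in for addresses. It assumes 8-byte alignment, which leaves the low three bits of every valid address zero; the function names are made up for illustration, and the sketch deliberately says nothing about how many high bits are safe to use on any platform.

```python
ALIGNMENT = 8                        # assumed alignment of the pointed-to objects
TAG_BITS = ALIGNMENT.bit_length() - 1  # 3 spare low bits for 8-byte alignment
TAG_MASK = (1 << TAG_BITS) - 1


def pack(address, tag):
    """Store a small tag in the unused low bits of an aligned address."""
    if address & TAG_MASK:
        raise ValueError("address is not aligned, low bits are in use")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in the spare bits")
    return address | tag


def unpack(word):
    """Split a tagged word back into (address, tag)."""
    return word & ~TAG_MASK, word & TAG_MASK


word = pack(0x7F0012345678, 5)
address, tag = unpack(word)
```

The point of the trick is that the flag costs no extra storage: the tagged word is the same size as the original pointer, so a separate boolean field (and its padding) disappears.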
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Fri, Jul 13, 2018, at 7:08 AM, smaug wrote: > One thing to remember that some of the child processes will be more > important than others. For example all the processes used for browsing > contexts in > the foreground tab should probably prefer performance over memory (in > cases that is something we can choose from), but if a process > is only used for browsing contexts in background tabs and isn't playing > any audio or such, it can probably use less memory hungry approaches. > Like, could stylo use fewer threads when used in background-tabs-only- > processes, and once the process becomes foreground, more threads are > created. I've filed a bug for this after I saw this email thread: https://bugzilla.mozilla.org/show_bug.cgi?id=1475091 - Xidorn ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On 07/12/2018 11:08 PM, Randell Jesup wrote:
>> I do hope that the 100 process figures scenario that was given is a worse case scenario though...
>
> It's not. Worst case is a LOT worse.
>
> Shutting down threads/threadpools when not needed or off an idle timer is a Good thing. There may be some perf hit since it may mean starting a thread instead of just sending a message at times; this may require some tuning in specific cases, or leaving 1 thread or more running anyways. Stylo will be an interesting case here.
>
> We may need to trade first-load time against memory use by lazy-initing more things than now, though we did quite a bit on that already for reducing startup time.

One thing to remember is that some of the child processes will be more important than others. For example, all the processes used for browsing contexts in the foreground tab should probably prefer performance over memory (in cases where that is something we can choose), but if a process is only used for browsing contexts in background tabs and isn't playing any audio or such, it can probably use less memory-hungry approaches. For instance, could stylo use fewer threads when used in background-tabs-only processes, and create more threads once the process becomes foreground?

We have a similar approach in many cases for performance and responsiveness reasons, but less often for memory usage reasons.
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Thu, Jul 12, 2018 at 10:27:13PM +0200, Gabriele Svelto wrote:
> On 12/07/2018 22:19, Kris Maglione wrote:
>> I've actually been thinking of filing a bug to do something similar, to measure cumulative effects of excess padding in certain types, since I began looking into bug 1460674, and Sylvestre mentioned that clang-analyzer can generate reports on excess padding.
>
> I've encountered at least one structure where a boolean flag is 64 bits in size on 64-bit builds. If we really want to go the last mile we might want to also evaluate things like tagged pointers; there's probably some KiB to be saved there too.

I actually have a patch sitting around with helpers to make it super easy to use smart pointers as tagged pointers :) I never wound up putting it up for review, since my original use case went away, but if you can think of any specific cases where it would be useful, I'd be happy to try and get it landed.
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Thu, Jul 12, 2018 at 04:08:49PM -0400, Randell Jesup wrote:
>> I do hope that the 100 process figures scenario that was given is a worse case scenario though...
>
> It's not. Worst case is a LOT worse.
>
> Shutting down threads/threadpools when not needed or off an idle timer is a Good thing. There may be some perf hit since it may mean starting a thread instead of just sending a message at times; this may require some tuning in specific cases, or leaving 1 thread or more running anyways. Stylo will be an interesting case here.
>
> We may need to trade first-load time against memory use by lazy-initing more things than now, though we did quite a bit on that already for reducing startup time.

This is a really important point: memory usage and performance are deeply intertwined. There are hard limits on the amount of memory we can use, and the more of it we waste needlessly, the less we have available for performance optimizations that need it. In the worst (performance) case, we wind up swapping, at which point performance may as well not exist.

We're going to have to make hard decisions about when/how often/how aggressively we flush caches, spin down threads, unload tabs, ... The more unnecessary overhead we save, the less extreme we're going to have to be about this. And the better we get at spinning down unused threads and evicting low-impact cache entries, the less aggressive we're going to have to be about the high-impact ones. Throwing those things away will have a performance impact, but not throwing them away will, in the end, have a bigger one.
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Thu, Jul 12, 2018 at 08:56:28AM -0700, Andrew McCreight wrote:
> On Thu, Jul 12, 2018 at 3:57 AM, Emilio Cobos Álvarez wrote:
>> Thanks for doing this!
>>
>> Just curious, is there a bug on file to measure excess capacity on nsTArrays and hash tables?
>
> njn looked at that kind of issue at some point (he changed how arrays grow, for instance, to reduce overhead), but it has probably been around 5 years, so there may be room for improvement for things added in the meanwhile.
>
> However, our focus here is really on reducing per-process memory overhead, rather than generic memory improvements, because we've had a lot of focus on the latter as part of MemShrink, but not the former, so there's likely easier improvements to be had.

I kind of suspect that improving the storage efficiency of hashtables (and probably nsTArrays too) will have an out-sized effect on per-process memory. Just at startup, for a mostly empty process, we have a huge amount of memory devoted to hashtables that would otherwise be shared across a bunch of origins, enough that removing just 4 bytes of padding per entry would save 87K per process. And that number tends to grow as we populate caches that we need for things like layout and atoms.

As much as I'd like to be able to share many of those caches between processes, we are always going to need process-specific hashtables on top of the shared ones for things that can't be/shouldn't be/aren't yet shared. And that extra overhead tends to grow proportionally to the number of processes we have.
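The 87K figure in this message can be turned around to estimate how many hashtable entries a mostly empty process holds at startup. The sketch below does that arithmetic; it assumes "87K" means 87 KiB and reuses the 100-process figure from the newsletter, both of which are interpretations rather than numbers stated in this message.

```python
KIB = 1024


def implied_entries(saved_bytes, padding_per_entry):
    """Number of entries needed for per-entry padding to add up to saved_bytes."""
    return saved_bytes // padding_per_entry


def total_saving_kib(per_process_kib, process_count):
    """Saving across all content processes, in KiB."""
    return per_process_kib * process_count


entries = implied_entries(87 * KIB, 4)   # entries per process at startup
session = total_saving_kib(87, 100)      # KiB saved across 100 processes
print(f"about {entries} entries per process, {session} KiB per session")
```

Under those assumptions, a 4-byte change per entry touches roughly twenty thousand entries in each process, and the saving scales linearly with the number of content processes.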
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On 12/07/2018 22:19, Kris Maglione wrote:
> I've actually been thinking of filing a bug to do something similar, to measure cumulative effects of excess padding in certain types, since I began looking into bug 1460674, and Sylvestre mentioned that clang-analyzer can generate reports on excess padding.

I've encountered at least one structure where a boolean flag is 64 bits in size on 64-bit builds. If we really want to go the last mile we might want to also evaluate things like tagged pointers; there's probably some KiB to be saved there too. There's also more than one place where we're using strings to identify stuff where we could use enums/integers instead.

And yeah, my much delayed refactoring of the observer service got a lot higher on my priority list after reading this thread.

Gabriele
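To show how a one-byte flag can end up costing a full word, the sketch below uses Python's ctypes module, which lays out structures with the platform's native C alignment rules. The two structures are invented for illustration and do not correspond to any Gecko type.

```python
import ctypes


class FlagsAroundPointer(ctypes.Structure):
    """A pointer sandwiched between two one-byte flags."""
    _fields_ = [
        ("flag_a", ctypes.c_bool),
        ("ptr", ctypes.c_void_p),
        ("flag_b", ctypes.c_bool),
    ]


class PointerFirst(ctypes.Structure):
    """Same fields, with the flags grouped after the pointer."""
    _fields_ = [
        ("ptr", ctypes.c_void_p),
        ("flag_a", ctypes.c_bool),
        ("flag_b", ctypes.c_bool),
    ]


def padding_bytes(struct_type):
    """Bytes in the structure that hold no field data."""
    payload = sum(ctypes.sizeof(field_type) for _, field_type in struct_type._fields_)
    return ctypes.sizeof(struct_type) - payload


for struct_type in (FlagsAroundPointer, PointerFirst):
    print(struct_type.__name__, ctypes.sizeof(struct_type),
          "bytes,", padding_bytes(struct_type), "of them padding")
```

In the first layout each flag is padded out to the pointer's alignment, so on a typical 64-bit build every flag effectively occupies a whole word; grouping the small fields together recovers part of that space without changing what the structure stores.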
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Thu, Jul 12, 2018 at 12:57:35PM +0200, Emilio Cobos Álvarez wrote:
> Thanks for doing this!
>
> Just curious, is there a bug on file to measure excess capacity on nsTArrays and hash tables?

I don't think so, but it's a good idea. I've actually been thinking of filing a bug to do something similar, to measure cumulative effects of excess padding in certain types, since I began looking into bug 1460674, and Sylvestre mentioned that clang-analyzer can generate reports on excess padding. It would probably be a good idea to try to roll this into the same project.

One nice change coming up on this front is that bug 1402910 will probably allow us to increase the load factors of most of our hashtables without losing performance. Having up-to-date numbers for these things would probably help decide how to prioritize those sorts of bugs.

> On 07/10/2018 08:19 PM, Kris Maglione wrote:
>> Welcome to the first edition of the Fission MemShrink newsletter.[1]
>>
>> In this edition, I'll sum up what the project is, and why it matters to you. In subsequent editions, I'll give updates on progress that we've made, and areas that we'll need to focus on next.[2]
>>
>> The Fission MemShrink project is one of the most easily overlooked aspects of Project Fission (also known as Site Isolation), but is absolutely critical to its success. And it will require a company- and community-wide effort to meet its goals.
>>
>> The problem is thus: In order for site isolation to work, we need to be able to run *at least* 100 content processes in an average Firefox session. Each of those processes has its own base memory overhead: memory we use just for creating the process, regardless of what's running in it. In the post-Fission world, that overhead needs to be less than 10MB per process in order to keep the extra overhead from Fission below 1GB. Right now, on our best-case platform, Windows 10, it is somewhere between 17 and 21MB. Linux and OS-X hover between 25 and 35MB. In other words, between 2 and 3.5GB for an ordinary session.
>>
>> That means that, in the best case, we need to reduce the memory we use in content processes by *at least* 7MB. The problem, of course, is that there are only so many places we can cut memory without losing functionality, and even fewer places where we can make big wins. But, there are lots of places we can make small and medium-sized wins.
>>
>> So, to put the task into perspective, of all of the places we can cut a certain amount of overhead, here are the number of each that we need to fix in order to reach 1MB:
>>
>> 250KB: 4
>> 100KB: 10
>> 75KB: 13
>> 50KB: 20
>> 20KB: 50
>> 10KB: 100
>> 5KB: 200
>>
>> Now remember: we need to do *all* of these in order to reach our goal. It's not a matter of one 250KB improvement or 50 5KB improvements. It's 4 250KB *and* 200 5KB improvements. There just aren't enough places we can cut 250KB. If we fall short in any of those areas, Project Fission will fail, and Firefox will be the only major browser without site isolation.
>>
>> But it won't fail, because all of you are awesome, and this is a totally achievable goal if we all throw our effort behind it.
>>
>> Essentially what this means, though, is that if we identify an area of overhead that's 50KB[3] or larger that can be eliminated, it *has* to be eliminated. There just aren't that many large chunks to remove. They all need to go. And if an area of code has a dozen 5KB chunks that can be eliminated, maybe they don't all have to go, but at least half of them do. The more the better.
>>
>> To help us triage these issues, we have a tracking bug (https://bugzil.la/memshrink-content), and a per-bug whiteboard tag ([overhead:...]) which gives an estimate of how much per-process overhead we believe fixing that bug would eliminate. Please feel free to add blockers to the tracking bug if you think they're relevant, and to add or update [overhead] tags if you have reasonable estimates.
>>
>> With all of that said, here's a brief update of the progress we've made so far:
>>
>> In the past month, unique memory per process[4] has dropped 3-4MB[5], and JS memory usage in particular has dropped 1.1-1.9MB.
>>
>> Particular credit goes to:
>>
>> * Eric Rahm added an AWSY test suite to track base content process memory (https://bugzil.la/1442361). Results:
>>
>> Resident unique: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1684862,1,4&series=mozilla-central,1684846,1,4&series=mozilla-central,1685133,1,4&series=mozilla-central,1685127,1,4
>> Explicit allocations: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-inbound,1706218,1,4&series=mozilla-inbound,1706220,1,4&series=mozilla-inbound,1706216,1,4
>> JS: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1684866,1,4&series=mozilla-central,1685137,1,4&series=mozilla-central,1685131,1,4
>>
>> * Andrew McCreight created a tool for tracking JS memory usage, and figuring out which scripts and objects are responsible for how muc
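The figures in the quoted newsletter follow from simple arithmetic, which the sketch below reproduces. It assumes 1MB is treated as 1000KB and that counts are rounded to the nearest whole fix, since that is what matches the table (for example, 13 fixes at 75KB); the variable and function names are made up for illustration.

```python
PROCESS_COUNT = 100          # content processes in an average session
TARGET_MB_PER_PROCESS = 10   # per-process overhead goal
BEST_CASE_MB = 17            # lower bound on Windows 10 today
WORST_CASE_MB = 35           # upper bound on Linux and OS X today
TARGET_KB = 1000             # 1MB, assuming decimal units


def fixes_needed(chunk_kb):
    """How many fixes of a given size add up to roughly 1MB."""
    return round(TARGET_KB / chunk_kb)


budget_mb = PROCESS_COUNT * TARGET_MB_PER_PROCESS        # total Fission budget
current_low_mb = PROCESS_COUNT * BEST_CASE_MB            # best case today
current_high_mb = PROCESS_COUNT * WORST_CASE_MB          # worst case today
required_cut_mb = BEST_CASE_MB - TARGET_MB_PER_PROCESS   # per-process cut

table = {size: fixes_needed(size) for size in (250, 100, 75, 50, 20, 10, 5)}
print(budget_mb, current_low_mb, current_high_mb, required_cut_mb, table)
```

Note that the table covers a single megabyte; reaching the 7MB best-case reduction means repeating that effort roughly seven times over, which is why the newsletter insists that every size class matters.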
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
>I do hope that the 100 process figures scenario that was given is a worse case >scenario though... It's not. Worst case is a LOT worse. Shutting down threads/threadpools when not needed or off an idle timer is a Good thing. There may be some perf hit since it may mean starting a thread instead of just sending a message at times; this may require some tuning in specific cases, or leaving 1 thread or more running anyways. Stylo will be an interesting case here. We may need to trade first-load time against memory use by lazy-initing more things than now, though we did quite a bit on that already for reducing startup time. -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
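The trade-off described above, paying a thread start instead of a message send in exchange for not keeping idle threads around, can be sketched as follows. This is a minimal single-worker illustration with a hypothetical `IdleWorker` class, not how Gecko's thread pools are implemented.

```python
import queue
import threading


class IdleWorker:
    """One worker thread that exits after `idle_timeout` seconds without work."""

    def __init__(self, idle_timeout):
        self._idle_timeout = idle_timeout
        self._tasks = queue.Queue()
        self._lock = threading.Lock()
        self._thread = None

    def submit(self, task):
        """Queue a task, starting the worker thread if it has shut down."""
        with self._lock:
            self._tasks.put(task)
            if self._thread is None:
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()

    def is_running(self):
        with self._lock:
            return self._thread is not None

    def _run(self):
        while True:
            try:
                task = self._tasks.get(timeout=self._idle_timeout)
            except queue.Empty:
                # Re-check under the lock so a task submitted during the
                # timeout is never left behind without a worker.
                with self._lock:
                    if self._tasks.empty():
                        self._thread = None
                        return
                continue
            task()
```

The idle timeout is the tuning knob mentioned above: a longer timeout avoids repeated thread start-up for bursty work, while a shorter one returns the thread's stack and bookkeeping memory sooner.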
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Thu, Jul 12, 2018 at 3:57 AM, Emilio Cobos Álvarez wrote:
> Thanks for doing this!
>
> Just curious, is there a bug on file to measure excess capacity on
> nsTArrays and hash tables?
>
> WebKit has a bunch of bugs like:
>
> https://bugs.webkit.org/show_bug.cgi?id=186709
>
> Which seem relevant.

njn looked at that kind of issue at some point (he changed how arrays grow, for instance, to reduce overhead), but it has probably been around 5 years, so there may be room for improvement for things added in the meanwhile.

However, our focus here is really on reducing per-process memory overhead, rather than generic memory improvements, because we've had a lot of focus on the latter as part of MemShrink, but not the former, so there's likely easier improvements to be had.

Andrew

> -- Emilio
>
> On 07/10/2018 08:19 PM, Kris Maglione wrote:
>> [...]
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Wed, Jul 11, 2018 at 6:25 PM, Karl Tomlinson wrote:
> Is there a guideline that should be used to evaluate what can
> acceptably run in the same process for different sites?

This is on me to write. I have been slow at doing so mainly because there's a lot of "What does X look like and where do its parts run" investigation I feel I need to do to write it. (For X in at least { WebExtensions, WebRTC, Compositing, Filters, ... })

> I assume the primary goal is to prevent one site from reading
> information that should only be available to another site?

Yep.

On Wed, Jul 11, 2018 at 6:56 PM, Robert O'Callahan wrote:
> On Thu, Jul 12, 2018 at 11:25 AM, Karl Tomlinson wrote:
>
> > Would it be easier to answer the opposite question? What should
> > not run in a shared process? JS is a given. Others?
>
> Currently when an exploitable bug is found in content process code,
> attackers use JS to weaponize it with an arsenal of known techniques (e.g.
> heap spraying and shaping). An important question is whether, assuming a
> similar bug were found in a shared non-content process, how difficult would
> it be for content JS to apply those techniques remotely across the process
> boundary?

You're completely correct.

> That would be a pretty interesting problem for security
> researchers to work on.

It's always illustrative to have exploits that demonstrate this goal in the target of interest - they may have created generic techniques that we can address fundamentally (like with Memory Partitioning or Allocator Hardening). But people have been writing exploits for targets that don't have a scripting environment for two decades or more, so all of those are prior art for this sort of exploitation. This isn't a reason not to pursue this work, and it's not saying this work isn't a net security win though!

I have been pondering (and brainstormed with a few people) about creating something Google native-client-like to enforce process-like state separation between threads in a single process. That might make it safer to share utility processes between content processes. But it's considerably less straightforward than I was hoping. Big open research question.

> > Use of system font, graphics, or audio servers is in a similar bucket I
> > guess.
>
> Taking control of an audio server would let you listen into phone calls,
> which seems interesting.
>
> Another question is whether you can exfiltrate cross-origin data by
> performing side-channel attacks against those shared processes. You
> probably need to assume that Spectre-ish attacks will be blocked at process
> boundaries by hardware/OS mitigations, but there could be
> browser-implementation-specific timing attacks etc. E.g. do IPDL IDs
> exposed to content processes leak useful information about the activities
> of other processes? Of course there are cross-origin timing-based
> information leaks that are already known and somewhat unfixable :-(.

Yup!

-tom
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
Thanks for doing this!

Just curious, is there a bug on file to measure excess capacity on nsTArrays and hash tables?

WebKit has a bunch of bugs like:

https://bugs.webkit.org/show_bug.cgi?id=186709

Which seem relevant.

-- Emilio

On 07/10/2018 08:19 PM, Kris Maglione wrote:

Welcome to the first edition of the Fission MemShrink newsletter.[1]

In this edition, I'll sum up what the project is, and why it matters to you. In subsequent editions, I'll give updates on progress that we've made, and areas that we'll need to focus on next.[2]

The Fission MemShrink project is one of the most easily overlooked aspects of Project Fission (also known as Site Isolation), but is absolutely critical to its success. And will require a company- and community-wide effort to meet its goals.

The problem is thus: In order for site isolation to work, we need to be able to run *at least* 100 content processes in an average Firefox session. Each of those processes has its own base memory overhead: memory we use just for creating the process, regardless of what's running in it. In the post-Fission world, that overhead needs to be less than 10MB per process in order to keep the extra overhead from Fission below 1GB. Right now, on our best-case platform, Windows 10, it is somewhere between 17 and 21MB. Linux and OS-X hover between 25 and 35MB. In other words, between 2 and 3.5GB for an ordinary session.

That means that, in the best case, we need to reduce the memory we use in content processes by *at least* 7MB. The problem, of course, is that there are only so many places we can cut memory without losing functionality, and even fewer places where we can make big wins. But, there are lots of places we can make small and medium-sized wins.
So, to put the task into perspective, of all of the places we can cut a certain amount of overhead, here are the number of each that we need to fix in order to reach 1MB: 250KB: 4 100KB: 10 75KB: 13 50KB: 20 20KB: 50 10KB: 100 5KB: 200 Now remember: we need to do *all* of these in order to reach our goal. It's not a matter of one 250KB improvement or 50 5KB improvements. It's 4 250KB *and* 200 5KB improvements. There just aren't enough places we can cut 250KB. If we fall short in any of those areas, Project Fission will fail, and Firefox will be the only major browser without site isolation. But it won't fail, because all of you are awesome, and this is a totally achievable goal if we all throw our effort behind it. Essentially what this means, though, is that if we identify an area of overhead that's 50KB[3] or larger that can be eliminated, it *has* to be eliminated. There just aren't that many large chunks to remove. They all need to go. And if an area of code has a dozen 5KB chunks that can be eliminated, maybe they don't all have to go, but at least half of them do. The more the better. To help us triage these issues, we have a tracking bug (https://bugzil.la/memshrink-content), and a per-bug whiteboard tag ([overhead:...]) which gives an estimate of how much per-process overhead we believe fixing that bug would eliminate. Please feel free to add blockers to the tracking bug if you think they're relevant, and to add or update [overhead] tags if you have reasonable estimates. With all of that said, here's a brief update of the progress we've made so far: In the past month, unique memory per process[4] has dropped 3-4MB[5], and JS memory usage in particular has dropped 1.1-1.9MB. Particular credit goes to: * Eric Rahm added an AWSY test suite to track base content process memory (https://bugzil.la/1442361). 
Results: Resident unique: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1684862,1,4&series=mozilla-central,1684846,1,4&series=mozilla-central,1685133,1,4&series=mozilla-central,1685127,1,4 Explicit allocations: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-inbound,1706218,1,4&series=mozilla-inbound,1706220,1,4&series=mozilla-inbound,1706216,1,4 JS: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1684866,1,4&series=mozilla-central,1685137,1,4&series=mozilla-central,1685131,1,4 * Andrew McCreight created a tool for tracking JS memory usage, and figuring out which scripts and objects are responsible for how much of it (https://bugzil.la/1463569). * Andrew and Nika Layzell also completely rewrote the way we handle XPIDL type info so that it's statically compiled into the executable and shared between all processes (https://bugzil.la/1438688, https://bugzil.la/1444745). * Felipe Gomes split a bunch of code out of frame scripts so that it could be lazily loaded only when needed (https://bugzil.la/1467278, ...) and added a whitelist of JSMs that are allowed to be loaded at content process startup (https://bugzil.la/1471066) * I did a bit of this too, and also prevented us from loading s
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Thu, Jul 12, 2018 at 11:25 AM, Karl Tomlinson wrote:
> Would it be easier to answer the opposite question? What should
> not run in a shared process? JS is a given. Others?

Currently when an exploitable bug is found in content process code, attackers use JS to weaponize it with an arsenal of known techniques (e.g. heap spraying and shaping). An important question is whether, assuming a similar bug were found in a shared non-content process, how difficult would it be for content JS to apply those techniques remotely across the process boundary? That would be a pretty interesting problem for security researchers to work on.

> Use of system font, graphics, or audio servers is in a similar bucket I
> guess.

Taking control of an audio server would let you listen into phone calls, which seems interesting.

Another question is whether you can exfiltrate cross-origin data by performing side-channel attacks against those shared processes. You probably need to assume that Spectre-ish attacks will be blocked at process boundaries by hardware/OS mitigations, but there could be browser-implementation-specific timing attacks etc. E.g. do IPDL IDs exposed to content processes leak useful information about the activities of other processes? Of course there are cross-origin timing-based information leaks that are already known and somewhat unfixable :-(.

Rob
-- 
Su ot deraeppa sah dna Rehtaf eht htiw saw hcihw, efil lanrete eht uoy ot mialcorp ew dna, ti ot yfitset dna ti nees evah ew; deraeppa efil eht. Efil fo Drow eht gninrecnoc mialcorp ew siht - dehcuot evah sdnah ruo dna ta dekool evah ew hcihw, seye ruo htiw nees evah ew hcihw, draeh evah ew hcihw, gninnigeb eht morf saw hcihw taht.
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
Is there a guideline that should be used to evaluate what can acceptably run in the same process for different sites? I assume the primary goal is to prevent one site from reading information that should only be available to another site? There would also be defense-in-depth value from having each site sandboxed separately because a security breach from one site could not compromise another. I guess a single compositor process is acceptable because there is essentially no information returning from the compositor? A font server may be acceptable, because information returned is of limited power? Use of system font, graphics, or audio servers is in a similar bucket I guess. Would using a single process for network be acceptable, not because information returned is limited, but because we're willing to have some compromise because there is a small API surface? Or would that be acceptable because content JS does not run in that process? Would it be acceptable to perform layout in a single process for multiple sites (if that were practical)? Would it be easier to answer the opposite question? What should not run in a shared process? JS is a given. Others? ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Wed, Jul 11, 2018 at 11:42:01PM +0200, Jean-Yves Avenard wrote:
> On 11 Jul 2018, at 10:10 pm, Kris Maglione wrote:
>> It looks like it will be helpful, but unfortunately won't give us the 2MB
>> simple arithmetic would suggest. On Windows, at least, (and probably
>> elsewhere, but need to confirm) thread stacks are lazily committed, so as
>> long as the decoders aren't used in a process, the overhead is probably
>> closer to 25KB per thread.
>
> I haven’t looked much in details, not being an expert on this and having just
> finished watching the world cup…
>
> A quick glance at the code gives me:
>
> On mac/linux using pthread:
> when a thread is created, the stack size is set using
> pthread_attr_setstacksize
> https://searchfox.org/mozilla-central/source/nsprpub/pr/src/pthreads/ptthread.c#355
>
> On Linux, the man page is clear:
> "The stack size attribute determines the minimum size (in bytes) that will be
> allocated for threads created using the thread attributes object attr."

Right, but allocation size doesn't imply that the memory is committed, just that it's mapped. In general, anonymous mapped memory isn't actually committed (and therefore doesn't become part of the process's USS) until it's touched.

> On Windows:
> https://searchfox.org/mozilla-central/source/nsprpub/pr/src/md/windows/w95thred.c#151
>
> the thread is created with STACK_SIZE_PARAM_IS_A_RESERVATION flag set. This
> will allocate the memory immediately.

Allocate, yes, but not commit. That flag is actually what ensures that our Windows thread stacks don't consume system memory until they're actually touched.

> The saving I was mentioning earlier isn’t just due to media decoder
> threadpool thread stack no longer needing to be that big, but that all other
> threadpools can be reduced too. Threadpools aren’t used only when playing a
> video/audio file.

Reducing thread pool sizes would certainly be helpful. One unfortunate side-effect of large thread pools is that, even with lazy commit thread stacks, the more threads you run code on, the more stacks wind up with committed pages.
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Wed, Jul 11, 2018 at 11:42:01PM +0200, Jean-Yves Avenard wrote: > Hi > > > On 11 Jul 2018, at 10:10 pm, Kris Maglione wrote: > > Thanks. Boris added this as a blocker. > > > > It looks like it will be helpful, but unfortunately won't give us the 2MB > > simple arithmetic would suggest. On Windows, at least, (and probably > > elsewhere, but need to confirm) thread stacks are lazily committed, so as > > long as the decoders aren't used in a process, the overhead is probably > > closer to 25KB per thread. > > > > Shrinking the size of the thread pool and lazily spinning up threads when > > they're first needed would probably save us 200KB per process, though... > > I haven’t looked much in details, not being an expert on this and having just > finished watching the world cup… > > A quick glance at the code gives me: > > On mac/linux using pthread: > when a thread is created, the stack size is set using > pthread_attr_setstacksize > https://searchfox.org/mozilla-central/source/nsprpub/pr/src/pthreads/ptthread.c#355 > > On Linux, the man page is clear: > "The stack size attribute determines the minimum size (in bytes) that will be > allocated for threads created using the thread attributes object attr.” > > On mac, less so, I’m not sure what’s the behaviour there is, if it’s > allocated or not… > > On Windows: > https://searchfox.org/mozilla-central/source/nsprpub/pr/src/md/windows/w95thred.c#151 > > the thread is created with STACK_SIZE_PARAM_IS_A_RESERVATION flag set. This > will allocate the memory immediately. Allocate in this context means address space being consumed. It doesn't mean memory being actually committed. Memory is only committed once used, so only as much as what the code running in the thread actually uses is committed (rounded to page size). This means at least 4k per thread, so the more threads we have at initialization, the more memory is committed. 
That being said, we're talking about something akin to NUWA here, and presumably, we're talking about processes that don't initialize everything. Mike ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
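The reserved-versus-committed distinction above can be put into numbers with a small model. This is a sketch under stated assumptions: 4KiB pages (the "at least 4k per thread" figure), plus the 8-thread pool with 256kB stacks and the roughly 25KB-per-thread estimate quoted elsewhere in this thread. `committed_stack_bytes` is a hypothetical helper, not a real measurement API.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, per "at least 4k per thread"


def committed_stack_bytes(touched_bytes, page_size=PAGE_SIZE):
    """Committed stack for one thread: bytes touched, rounded up to whole
    pages, with a floor of one page once the thread has run at all."""
    pages = max(1, -(-touched_bytes // page_size))  # ceiling division
    return pages * page_size


THREADS = 8                        # media thread pool size quoted in the thread
RESERVED_PER_THREAD = 256 * 1024   # default stack reservation quoted in the thread

# Address space reserved for stacks; not memory that is actually in use.
reserved_total = THREADS * RESERVED_PER_THREAD

# Lower bound: every thread has touched at least one page.
floor_total = THREADS * committed_stack_bytes(1)

# Using the ~25KB-per-thread estimate for threads that have started up.
estimate_total = THREADS * committed_stack_bytes(25 * 1024)

print(reserved_total, floor_total, estimate_total)
```

Under these assumptions the pool reserves 2MB of address space but commits only a fraction of that, which is why shrinking the reservation alone saves less than the simple 8 x 256kB arithmetic suggests, while starting fewer threads saves the committed pages outright.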
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
Hi

> On 11 Jul 2018, at 10:10 pm, Kris Maglione wrote:
> Thanks. Boris added this as a blocker.
>
> It looks like it will be helpful, but unfortunately won't give us the 2MB
> simple arithmetic would suggest. On Windows, at least, (and probably
> elsewhere, but need to confirm) thread stacks are lazily committed, so as
> long as the decoders aren't used in a process, the overhead is probably
> closer to 25KB per thread.
>
> Shrinking the size of the thread pool and lazily spinning up threads when
> they're first needed would probably save us 200KB per process, though...

I haven’t looked at this in much detail, not being an expert on this and having just finished watching the world cup…

A quick glance at the code gives me:

On mac/linux using pthread:
when a thread is created, the stack size is set using pthread_attr_setstacksize
https://searchfox.org/mozilla-central/source/nsprpub/pr/src/pthreads/ptthread.c#355

On Linux, the man page is clear:
"The stack size attribute determines the minimum size (in bytes) that will be allocated for threads created using the thread attributes object attr."

On mac, less so; I’m not sure what the behaviour is there, whether it’s allocated or not…

On Windows:
https://searchfox.org/mozilla-central/source/nsprpub/pr/src/md/windows/w95thred.c#151

the thread is created with the STACK_SIZE_PARAM_IS_A_RESERVATION flag set. This will allocate the memory immediately.

The saving I was mentioning earlier isn’t just due to the media decoder threadpool thread stacks no longer needing to be that big, but that all other threadpools can be reduced too. Threadpools aren’t used only when playing a video/audio file.

Anyway, this needs further inspection… we’ll know soon :)

I do hope that the 100-process scenario that was given is a worst-case scenario though...

JY
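For experimenting with per-thread stack sizes outside of NSPR, Python's standard `threading.stack_size()` sets the stack size requested for threads created afterwards; on POSIX builds of CPython this is, as far as I know, applied through `pthread_attr_setstacksize`, the same call referenced above. A minimal sketch, where the 256KiB value mirrors the default discussed in this thread and `worker` is a hypothetical placeholder:

```python
import threading

REQUESTED_STACK = 256 * 1024  # mirrors the 256kB default discussed above


def worker(out):
    """Hypothetical placeholder for work done on the new thread."""
    out.append(threading.current_thread().name)


# stack_size(n) applies to threads created after the call and returns the
# previously configured value (0 means "platform default").
previous = threading.stack_size(REQUESTED_STACK)
try:
    seen = []
    t = threading.Thread(target=worker, args=(seen,), name="small-stack")
    t.start()
    t.join()
finally:
    # Restore the earlier setting so later threads are unaffected.
    threading.stack_size(previous)

print(seen)
```

Note that this only controls how much stack is requested for the thread. Whether that memory is committed up front or lazily is decided by the OS, which is the point being debated in the replies.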
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Wed, Jul 11, 2018 at 01:49:04PM +0200, Jean-Yves Avenard wrote:
> One place where we could gain heaps is the media stack.
>
> Currently, each content process allocates a thread pool with at least 8
> threads for use with the media decoders, each thread with a default stack
> size of 256kB.
> (https://searchfox.org/mozilla-central/source/xpcom/threads/nsIThreadManager.idl#53)
>
> That stack size has been increased over the years due to the growing use of
> system frameworks (in particular the Mac CoreVideo framework, which uses over
> 200kB alone), and right now 256kB itself isn’t enough for the new AV1 decoder
> from libaom.
>
> One piece of work the media team has started is to have all those decoders
> run in a dedicated process. This work was started mostly for security
> reasons, but there will be side gains memory-wise.
>
> This work is tracked in bug 1471535
> (https://bugzilla.mozilla.org/show_bug.cgi?id=1471535)
>
> Once this is done, and we no longer call decoders in the content process,
> the decoder process could use an increased stack size, while reducing the
> content process default stack size to 128kB (and maybe even 64kB).
>
> That alone may be sufficient to achieve your mentioned goals.

Thanks. Boris added this as a blocker.

It looks like it will be helpful, but unfortunately won't give us the 2MB simple arithmetic would suggest. On Windows, at least, (and probably elsewhere, but need to confirm) thread stacks are lazily committed, so as long as the decoders aren't used in a process, the overhead is probably closer to 25KB per thread.

Shrinking the size of the thread pool and lazily spinning up threads when they're first needed would probably save us 200KB per process, though...

> An immediate intermediary step could be to use two different stack sizes, as
> we pretty much know which ones need more than others.
>
> JY
>
> On 10 Jul 2018, at 8:19 pm, Kris Maglione wrote:
>> [...]
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
>On 7/11/18 5:42 AM, David Bruant wrote: >> I've seen this information of 100 content processes in a couple places but >> i haven't been able to find the rationale for it. How was the 100 number >> picked? > >I believe this is based on telemetry for number of distinct sites involved >in browsing sessions. As an example, 10 randomly chosen tabs in Chrome site isolation (a few months ago) yielded ~80 renderers (Content processes). Some sites generate a lot; that list of 10 included some which likely don't generate more than 1 or 2: google.com, mozilla.org, facebook login page, wikipedia (might spawn a few?). >> Would 90 prevent a release of project fission? > >It would make it harder to ship to users, yes... Whether it "prevents" >would depend on other considerations. It's a continuum - the more memory we use, the more OOMs, the worse we'll look (relative to Chrome), the larger impact on system perf, etc. There's likely no hard line, but there may be a defined "we need to get at least here" line, and for now that's 100 apparently (I wasn't directly involved in picking it, so I don't know how "hard" it is). We'll have to do more than just limit process sizes, but limiting process sizes is basically table stakes, IMO. -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Wed, Jul 11, 2018 at 02:42:11PM +0200, David Bruant wrote:
> 2018-07-10 20:19 GMT+02:00 Kris Maglione:
>> The problem is thus: In order for site isolation to work, we need to be able to run *at least* 100 content processes in an average Firefox session
>
> I've seen this information of 100 content processes in a couple places but I haven't been able to find the rationale for it. How was the 100 number picked?

So, the basic problem here is that we don't get to choose the number of content processes we'll have. It will depend entirely on the number of origins that we load documents from at any given time. In practice, the biggest contributing factor to that number tends to be iframes (mostly for things like ads and social widgets).

The "100 processes" number was initially chosen based on experimentation (basically, counting the number of origins loaded by typical pages on certain popular sites) and our knowledge of typical usage patterns. It's meant to be a conservative estimate of the number of processes typical users are likely to hit on a regular basis, though hopefully not all the time.

For heavy users, we expect the number to be much higher[1]. And while those users typically have more RAM to spare, they also tend not to be happy when we waste it.

We also need to add to that number the Activity Stream process that hosts things like about:newtab and about:home, the system extension process, processes for any other extensions the user has installed (which will each likely need their own processes for the same reasons each content origin will), and the pre-loaded web content process[4].

We've been working on improving our estimates by collecting telemetry on the number of document groups[2] per tab group[3]:

https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=1&end_date=2018-06-30&keys=__none__!__none__!__none__&max_channel_version=nightly%252F63&measure=TOTAL_HTTP_DOCGROUPS_PER_TABGROUP&min_channel_version=null&processType=*&product=Firefox&sanitize=0&sort_keys=submissions&start_date=2018-06-25&table=0&trim=1&use_submission_date=0

But we don't have enough data to draw conclusions yet.

> Would 90 prevent a release of project fission?

This isn't really something we get to choose. The closest I can come is something like "would an overhead of 1.1GB prevent a release of project Fission". And, while the answer may turn out to be "no", I'd prefer not to speculate, because that's a decision we'd wind up paying for with user dissatisfaction.

There are some other hacks that we can use to decrease the overall overhead, like aggressively unloading background tabs and flushing their resources. We're almost certainly going to wind up having to do some of that regardless, but it comes at a performance cost. The more aggressive we have to be about it, the less responsive the browser is going to wind up being. So, again, the shorter we fall on our memory reduction efforts, the more we're going to pay in terms of user satisfaction.

> How will the rollout happen? Will the rollout happen progressively (like 2 content processes soon, 4 soon after, 10 some time after, etc.) or does it have to be 1 (current situation IIUC) then 100?
>
>> * Andrew McCreight created a tool for tracking JS memory usage, and figuring out which scripts and objects are responsible for how much of it (https://bugzil.la/1463569).
>
> How often is this code run? Is there a place to find the daily output of this tool applied to a nightly build for instance?

For the moment, it requires a patched build of Firefox, so we've been running it locally as we try to track down and fix memory issues, and Andrew has been periodically updating the numbers in the bug. I believe Andrew has been working on updating the patch to a land-able state (which is non-trivial), after which we'll hopefully be able to get up-to-date numbers from automation.

[1]: Particularly readers of TechCrunch, which regularly loads 30 origins on a single page.
[2]: Essentially documents of different origin.
[3]: Essentially sets of tabs that are tied together because they were opened by things like window.open() calls or link clicks from other tabs.
[4]: Which we currently have only one of, but may need more of in the future in order to support loading several iframes in a given page without noticeable lag or jank.
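To make the arithmetic behind these overhead estimates concrete, here is a minimal sketch that multiplies a per-process base overhead by a process count. The 100-process figure and the per-process numbers (a 10MB target, 17 to 21MB on Windows 10, 25 to 35MB on Linux and OS X) are taken from the newsletter this thread replies to. The function name `total_overhead_gb` and the 1GB = 1000MB convention are assumptions made for the example, not anything from Firefox's tooling.

```python
def total_overhead_gb(processes: int, per_process_mb: float) -> float:
    """Total base overhead in GB, assuming 1GB = 1000MB (an assumption of this sketch)."""
    return processes * per_process_mb / 1000


if __name__ == "__main__":
    # Per-process figures quoted in the newsletter; labels are illustrative only.
    scenarios = [
        ("post-Fission target", 10),
        ("Windows 10, low end", 17),
        ("Linux/OS X, high end", 35),
    ]
    for label, per_process_mb in scenarios:
        total = total_overhead_gb(100, per_process_mb)
        print(f"{label}: {total:.1f}GB for 100 processes")
```

The same function can be used to see how the total moves when either the process count or the per-process overhead changes, which is the trade-off described above.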
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On Wed, Jul 11, 2018 at 5:42 AM, David Bruant wrote:
>> * Andrew McCreight created a tool for tracking JS memory usage, and figuring out which scripts and objects are responsible for how much of it (https://bugzil.la/1463569).
>
> How often is this code run? Is there a place to find the daily output of this tool applied to a nightly build for instance?

You have to manually run this using a special build (hopefully I'll be able to at least land code so that a special build is not needed).

It isn't clear from that description, but the focus here is on the chrome JS that is part of the browser, rather than on websites. Reducing content process chrome JS memory usage is going to have to be a big focus for this effort, because I believe other browsers don't write their UI in JS, and because of the way JIT compilation works, it is harder to share code memory between processes than with AOT-compiled code.

If you look at about:memory, there's already a decent breakdown of how much memory is used in JS for different things, but that doesn't help you figure out which individual scripts are taking up memory. JSMs and content scripts are run in only a few globals (to save memory), but that means that looking up how much memory a global uses doesn't tell you much.

Andrew
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
On 7/11/18 5:42 AM, David Bruant wrote:
> I've seen this information of 100 content processes in a couple places but I haven't been able to find the rationale for it. How was the 100 number picked?

I believe this is based on telemetry for number of distinct sites involved in browsing sessions.

> Would 90 prevent a release of project fission?

It would make it harder to ship to users, yes... Whether it "prevents" would depend on other considerations.

> Will the rollout happen progressively (like 2 content processes soon, 4 soon after, 10 some time after, etc.) or does it have to be 1 (current situation IIUC)

Current situation is 4 processes. How we scale up from there is TBD.

-Boris
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
Thanks Kris for all this information and the beginning of the first issue of this newsletter!

2018-07-10 20:19 GMT+02:00 Kris Maglione:
> The problem is thus: In order for site isolation to work, we need to be able to run *at least* 100 content processes in an average Firefox session

I've seen this information of 100 content processes in a couple places but I haven't been able to find the rationale for it. How was the 100 number picked? Would 90 prevent a release of project fission?

How will the rollout happen? Will the rollout happen progressively (like 2 content processes soon, 4 soon after, 10 some time after, etc.) or does it have to be 1 (current situation IIUC) then 100?

> * Andrew McCreight created a tool for tracking JS memory usage, and figuring out which scripts and objects are responsible for how much of it (https://bugzil.la/1463569).

How often is this code run? Is there a place to find the daily output of this tool applied to a nightly build for instance?

Thanks again,

David
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
Hi

That’s great info, thank you.

One place where we could gain heaps is the media stack. Currently, each content process allocates a thread pool with at least 8 threads for use with the media decoders, and each thread has a default stack size of 256kB (https://searchfox.org/mozilla-central/source/xpcom/threads/nsIThreadManager.idl#53).

That stack size has been increased over the years due to the growing use of system frameworks (in particular the Mac CoreVideo framework, which uses over 200kB alone), and right now 256kB itself isn’t enough for the new AV1 decoder from libaom.

One piece of work the media team has started is to have all those decoders run in a dedicated process. This work was started mostly for security reasons, but there will be side gains memory-wise. It is tracked in bug 1471535 (https://bugzilla.mozilla.org/show_bug.cgi?id=1471535).

Once this is done, and we no longer call decoders in the content process, the decoder process could use an increased stack size, while the content process default stack size could be reduced to 128kB (and maybe even 64kB).

That alone may be sufficient to achieve your mentioned goals.

An immediate intermediary step could be to use two different stack sizes, as we pretty much know which threads need more than others.

JY

> On 10 Jul 2018, at 8:19 pm, Kris Maglione wrote:
>
> Welcome to the first edition of the Fission MemShrink newsletter.[1]
>
> In this edition, I'll sum up what the project is, and why it matters to you. In subsequent editions, I'll give updates on progress that we've made, and areas that we'll need to focus on next.[2]
>
> The Fission MemShrink project is one of the most easily overlooked aspects of Project Fission (also known as Site Isolation), but is absolutely critical to its success. And will require a company- and community-wide effort to meet its goals.
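The idea of giving different threads different stack sizes, raised in JY's message above, can be sketched in a few lines. This is only an illustration in Python using the standard `threading.stack_size()` call; it is not Gecko's nsIThreadManager API, and the helper name `run_with_stack_size` and the two task labels are hypothetical. The 256kB and 128kB figures are the ones mentioned in the message.

```python
import threading

KIB = 1024


def run_with_stack_size(target, stack_kib: int) -> None:
    """Run `target` to completion in a new thread created with the given stack size.

    threading.stack_size() is a process-wide setting that applies to threads
    started afterwards, so the previous value is restored once the thread
    has been started.
    """
    previous = threading.stack_size(stack_kib * KIB)
    try:
        worker = threading.Thread(target=target)
        worker.start()
    finally:
        threading.stack_size(previous)
    worker.join()


if __name__ == "__main__":
    # Hypothetical split: a larger stack for decoder-like work, a smaller one elsewhere.
    run_with_stack_size(lambda: print("decoder-like task on a 256KiB stack"), 256)
    run_with_stack_size(lambda: print("ordinary task on a 128KiB stack"), 128)
```

For scale, a pool of at least 8 threads at 256kB each accounts for up to about 2MB of stack per content process, which is why the default size matters once there are many processes.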
Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
> Welcome to the first edition of the Fission MemShrink newsletter.[1]

This is awesome and critical.

I'll note (and many of you know this well) that in addition to getting rid of allocations (or making them lazy), another primary solution is to move data out of the Content processes and into the master process (or some other shared process, if that's advisable for security or other reasons), and access the data over IPC. Or you can move it to a shared memory block (with appropriate locking if not static).

On Linux, for example, one of our worst offenders is fontconfig; Chrome remotes much of that to the master process.

--
Randell Jesup, Mozilla Corp
remove "news" for personal email
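As a small illustration of the shared-memory option mentioned above, the sketch below publishes a block of static data once and lets a reader attach to it by name instead of holding its own copy. It uses Python's standard `multiprocessing.shared_memory` module purely for demonstration; it is not how Gecko or Chrome implement this, and the helper names `publish` and `read_shared` and the sample payload are hypothetical. Because the data is written once and then only read, no locking is shown.

```python
from multiprocessing import shared_memory


def publish(data: bytes) -> shared_memory.SharedMemory:
    """Copy `data` into a new shared block. The caller owns it and must unlink it."""
    block = shared_memory.SharedMemory(create=True, size=len(data))
    block.buf[:len(data)] = data
    return block


def read_shared(name: str, length: int) -> bytes:
    """Attach to an existing block by name, as another process would, and copy out `length` bytes."""
    view = shared_memory.SharedMemory(name=name)
    try:
        # The block may be rounded up to a page size, so the length is passed explicitly.
        return bytes(view.buf[:length])
    finally:
        view.close()


if __name__ == "__main__":
    payload = b"static lookup table"
    block = publish(payload)
    try:
        print(read_shared(block.name, len(payload)))
    finally:
        block.close()
        block.unlink()
```

In a real multi-process setup the block name and length would have to be passed to each reader over some IPC channel, and mutable data would need the locking the message refers to.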
Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)
Welcome to the first edition of the Fission MemShrink newsletter.[1]

In this edition, I'll sum up what the project is, and why it matters to you. In subsequent editions, I'll give updates on progress that we've made, and areas that we'll need to focus on next.[2]

The Fission MemShrink project is one of the most easily overlooked aspects of Project Fission (also known as Site Isolation), but is absolutely critical to its success. It will require a company- and community-wide effort to meet its goals.

The problem is thus: In order for site isolation to work, we need to be able to run *at least* 100 content processes in an average Firefox session. Each of those processes has its own base memory overhead: memory we use just for creating the process, regardless of what's running in it. In the post-Fission world, that overhead needs to be less than 10MB per process in order to keep the extra overhead from Fission below 1GB. Right now, our best-case platform, Windows 10, is somewhere between 17 and 21MB. Linux and OS X hover between 25 and 35MB. In other words, between 2 and 3.5GB for an ordinary session.

That means that, in the best case, we need to reduce the memory we use in content processes by *at least* 7MB. The problem, of course, is that there are only so many places we can cut memory without losing functionality, and even fewer places where we can make big wins. But there are lots of places we can make small and medium-sized wins.

So, to put the task into perspective, of all of the places we can cut a certain amount of overhead, here is the number of each that we need to fix in order to reach 1MB:

  250KB: 4
  100KB: 10
  75KB: 13
  50KB: 20
  20KB: 50
  10KB: 100
  5KB: 200

Now remember: we need to do *all* of these in order to reach our goal. It's not a matter of one 250KB improvement or 50 5KB improvements. It's 4 250KB *and* 200 5KB improvements. There just aren't enough places we can cut 250KB. If we fall short in any of those areas, Project Fission will fail, and Firefox will be the only major browser without site isolation.

But it won't fail, because all of you are awesome, and this is a totally achievable goal if we all throw our effort behind it.

Essentially what this means, though, is that if we identify an area of overhead that's 50KB[3] or larger that can be eliminated, it *has* to be eliminated. There just aren't that many large chunks to remove. They all need to go. And if an area of code has a dozen 5KB chunks that can be eliminated, maybe they don't all have to go, but at least half of them do. The more the better.

To help us triage these issues, we have a tracking bug (https://bugzil.la/memshrink-content), and a per-bug whiteboard tag ([overhead:...]) which gives an estimate of how much per-process overhead we believe fixing that bug would eliminate. Please feel free to add blockers to the tracking bug if you think they're relevant, and to add or update [overhead] tags if you have reasonable estimates.

With all of that said, here's a brief update of the progress we've made so far:

In the past month, unique memory per process[4] has dropped 3-4MB[5], and JS memory usage in particular has dropped 1.1-1.9MB.

Particular credit goes to:

* Eric Rahm added an AWSY test suite to track base content process memory (https://bugzil.la/1442361). Results:

  Resident unique: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1684862,1,4&series=mozilla-central,1684846,1,4&series=mozilla-central,1685133,1,4&series=mozilla-central,1685127,1,4
  Explicit allocations: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-inbound,1706218,1,4&series=mozilla-inbound,1706220,1,4&series=mozilla-inbound,1706216,1,4
  JS: https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1684866,1,4&series=mozilla-central,1685137,1,4&series=mozilla-central,1685131,1,4

* Andrew McCreight created a tool for tracking JS memory usage, and figuring out which scripts and objects are responsible for how much of it (https://bugzil.la/1463569).

* Andrew and Nika Layzell also completely rewrote the way we handle XPIDL type info so that it's statically compiled into the executable and shared between all processes (https://bugzil.la/1438688, https://bugzil.la/1444745).

* Felipe Gomes split a bunch of code out of frame scripts so that it could be lazily loaded only when needed (https://bugzil.la/1467278, ...) and added a whitelist of JSMs that are allowed to be loaded at content process startup (https://bugzil.la/1471066)

* I did a bit of this too, and also prevented us from loading some other JSMs before we need them (https://bugzil.la/1470333, https://bugzil.la/1469719, ...)

* Nick Nethercote made dynamic nsAtoms allocate their string storage inline rather than use a refcounted StringBuffer (https://bugzil.la/1447951)

* Emilio Álvarez reduced the amount of memory the Gecko Profiler use