Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting
> On Dec 3, 2020, at 2:13 PM, Nicholas Piggin wrote:
>
> Excerpts from Peter Zijlstra's message of December 3, 2020 6:44 pm:
>>> On Wed, Dec 02, 2020 at 09:25:51PM -0800, Andy Lutomirski wrote:
>>>
>>> power: same as ARM, except that the loop may be rather larger since
>>> the systems are bigger. But I imagine it's still faster than Nick's
>>> approach -- a cmpxchg to a remote cacheline should still be faster than
>>> an IPI shootdown.
>>
>> While a single atomic might be cheaper than an IPI, the comparison
>> doesn't work out nicely. You do the xchg() on every unlazy, while the
>> IPI would be once per process exit.
>>
>> So over the life of the process, it might do very many unlazies, adding
>> up to a total cost far in excess of what the single IPI would've been.
>
> Yeah this is the concern, I looked at things that add cost to the
> idle switch code and it gets hard to justify the scalability improvement
> when you slow these fundamental things down even a bit.

v2 fixes this and is generally much nicer. I’ll send it out in a couple hours.

>
> I still think working on the assumption that IPIs = scary expensive
> might not be correct. An IPI itself is, but you only issue them when
> you've left a lazy mm on another CPU which just isn't that often.
>
> Thanks,
> Nick

Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting
Excerpts from Peter Zijlstra's message of December 3, 2020 6:44 pm:
> On Wed, Dec 02, 2020 at 09:25:51PM -0800, Andy Lutomirski wrote:
>
>> power: same as ARM, except that the loop may be rather larger since
>> the systems are bigger. But I imagine it's still faster than Nick's
>> approach -- a cmpxchg to a remote cacheline should still be faster than
>> an IPI shootdown.
>
> While a single atomic might be cheaper than an IPI, the comparison
> doesn't work out nicely. You do the xchg() on every unlazy, while the
> IPI would be once per process exit.
>
> So over the life of the process, it might do very many unlazies, adding
> up to a total cost far in excess of what the single IPI would've been.

Yeah this is the concern, I looked at things that add cost to the
idle switch code and it gets hard to justify the scalability improvement
when you slow these fundamental things down even a bit.

I still think working on the assumption that IPIs = scary expensive
might not be correct. An IPI itself is, but you only issue them when
you've left a lazy mm on another CPU which just isn't that often.

Thanks,
Nick

Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting
On Thu, 2020-12-03 at 12:31 +, Matthew Wilcox wrote:

> And this just makes me think RCU freeing of mm_struct. I'm sure it's
> more complicated than that (then, or now), but if an anonymous process
> is borrowing a freed mm, and the mm is freed by RCU then it will not go
> away until the task context switches. When we context switch back to
> the anon task, it'll borrow some other task's MM and won't even notice
> that the MM it was using has gone away.

One major complication here is that most of the active_mm
borrowing is done by the idle task, but RCU does not wait
for idle tasks to context switch.

That means RCU, as it is today, is not a mechanism that
mm_struct freeing could just piggyback off.

-- 
All Rights Reversed.

Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting
On Wed, Dec 02, 2020 at 09:25:51PM -0800, Andy Lutomirski wrote:
> This code compiles, but I haven't even tried to boot it. The earlier
> part of the series isn't terribly interesting -- it's a handful of
> cleanups that remove all reads of ->active_mm from arch/x86. I've
> been meaning to do that for a while, and now I did it. But, with
> that done, I think we can move to a totally different lazy mm refcounting
> model.

I went back and read Documentation/vm/active_mm.rst recently.

I think it's useful to think about how this would have been handled if
we'd had RCU at the time. Particularly:

Linus wrote:
> To support all that, the "struct mm_struct" now has two counters: a
> "mm_users" counter that is how many "real address space users" there are,
> and a "mm_count" counter that is the number of "lazy" users (ie anonymous
> users) plus one if there are any real users.

And this just makes me think RCU freeing of mm_struct. I'm sure it's
more complicated than that (then, or now), but if an anonymous process
is borrowing a freed mm, and the mm is freed by RCU then it will not go
away until the task context switches. When we context switch back to
the anon task, it'll borrow some other task's MM and won't even notice
that the MM it was using has gone away.

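The two-counter scheme in the quoted documentation can be pictured with a small userspace model. This is only a sketch, not kernel code: struct toy_mm and every toy_* helper are hypothetical names made up for illustration, there is no locking or atomicity, and the only rule taken from the text above is that mm_count holds the lazy users plus one while any real user exists.

```c
#include <stdbool.h>

/* Toy userspace model only; all toy_* names are hypothetical. */
struct toy_mm {
	int mm_users;   /* real address space users */
	int mm_count;   /* lazy users, plus one while mm_users > 0 */
	bool torn_down; /* last real user is gone */
	bool freed;     /* last reference of any kind is gone */
};

/* A new mm starts with one real user, which pins one mm_count reference. */
static void toy_mm_init(struct toy_mm *mm)
{
	mm->mm_users = 1;
	mm->mm_count = 1;
	mm->torn_down = false;
	mm->freed = false;
}

/* A lazy (anonymous) user borrows the mm. */
static void toy_mmgrab(struct toy_mm *mm)
{
	mm->mm_count++;
}

/* A lazy user lets go; the struct is freed once nothing references it. */
static void toy_mmdrop(struct toy_mm *mm)
{
	if (--mm->mm_count == 0)
		mm->freed = true;
}

/* A real user takes the address space. */
static void toy_mmget(struct toy_mm *mm)
{
	mm->mm_users++;
}

/* A real user leaves; the last one also drops the shared mm_count reference. */
static void toy_mmput(struct toy_mm *mm)
{
	if (--mm->mm_users == 0) {
		mm->torn_down = true;
		toy_mmdrop(mm);
	}
}
```

In this model, a lazy borrower that called toy_mmgrab() keeps the struct alive after the last toy_mmput(), and the struct is only marked freed when that borrower calls toy_mmdrop().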
Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting
On Wed, Dec 02, 2020 at 09:25:51PM -0800, Andy Lutomirski wrote:

> power: same as ARM, except that the loop may be rather larger since
> the systems are bigger. But I imagine it's still faster than Nick's
> approach -- a cmpxchg to a remote cacheline should still be faster than
> an IPI shootdown.

While a single atomic might be cheaper than an IPI, the comparison
doesn't work out nicely. You do the xchg() on every unlazy, while the
IPI would be once per process exit.

So over the life of the process, it might do very many unlazies, adding
up to a total cost far in excess of what the single IPI would've been.

And while I appreciate all the work to get rid of the active_mm
accounting, the worry I have with pushing this all into arch code is
that it will be so very easy to get this subtly wrong.

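The lifetime-cost argument above is simple arithmetic, sketched here in C. Nothing in it is measured: the function names are hypothetical, the cost units are arbitrary (cycles, say), and the lazy_cpus_at_exit parameter is an assumption that generalizes "once per process exit" to one IPI per remote CPU still holding the mm lazily.

```c
#include <stdint.h>

/* Lifetime cost of paying one atomic on every unlazy. */
static uint64_t per_unlazy_total(uint64_t unlazies, uint64_t atomic_cost)
{
	return unlazies * atomic_cost;
}

/*
 * Cost of shooting down lazy references at exit: one IPI per remote CPU
 * that still holds the mm lazily (often zero or one).
 */
static uint64_t exit_ipi_total(uint64_t lazy_cpus_at_exit, uint64_t ipi_cost)
{
	return lazy_cpus_at_exit * ipi_cost;
}

/*
 * Smallest number of unlazies at which the per-unlazy atomics cost
 * strictly more than the exit-time IPIs.
 */
static uint64_t break_even_unlazies(uint64_t exit_total, uint64_t atomic_cost)
{
	if (atomic_cost == 0)
		return UINT64_MAX; /* free atomics never exceed the IPI cost */
	return exit_total / atomic_cost + 1;
}
```

Whatever values are plugged in, the shape is the same: the atomic side grows linearly with the number of unlazies, while the IPI side is bounded by the number of CPUs left holding a lazy reference at exit.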
Re: [MOCKUP] x86/mm: Lightweight lazy mm refcounting
Hi Andy,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[also build test ERROR on tip/x86/mm soc/for-next linus/master v5.10-rc6 next-20201201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Andy-Lutomirski/x86-mm-Lightweight-lazy-mm-refcounting/20201203-132706
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970
config: m68k-randconfig-s032-20201203 (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.3-179-ga00755aa-dirty
        # https://github.com/0day-ci/linux/commit/0d9b23b22e621d8e588095b4d0f9f39110a57901
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Andy-Lutomirski/x86-mm-Lightweight-lazy-mm-refcounting/20201203-132706
        git checkout 0d9b23b22e621d8e588095b4d0f9f39110a57901
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=m68k

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All errors (new ones prefixed by >>):

   m68k-linux-ld: kernel/fork.o: in function `__mmput':
>> fork.c:(.text+0x214): undefined reference to `arch_fixup_lazy_mm_refs'
   m68k-linux-ld: kernel/fork.o: in function `mmput_async_fn':
   fork.c:(.text+0x990): undefined reference to `arch_fixup_lazy_mm_refs'
   m68k-linux-ld: kernel/fork.o: in function `mmput':
   fork.c:(.text+0xa68): undefined reference to `arch_fixup_lazy_mm_refs'
   m68k-linux-ld: kernel/fork.o: in function `dup_mm.isra.0':
   fork.c:(.text+0x1192): undefined reference to `arch_fixup_lazy_mm_refs'
   m68k-linux-ld: kernel/fork.o: in function `mm_access':
   fork.c:(.text+0x13c6): undefined reference to `arch_fixup_lazy_mm_refs'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
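The link errors above mean kernel/fork.c calls arch_fixup_lazy_mm_refs() on an architecture (m68k here) that provides no definition. One common way to keep generic code linking is a no-op fallback guarded by #ifndef, sketched below as a standalone model. This is an assumption about one possible fix, not what the series does: the signature of arch_fixup_lazy_mm_refs() is not shown in this thread, so the one used here is invented, and struct toy_mm and toy_mmput_path() are hypothetical stand-ins.

```c
/* Hypothetical stand-in for struct mm_struct. */
struct toy_mm {
	int fixups; /* how many times an arch hook adjusted this mm */
};

/*
 * Generic fallback. An architecture that implements the hook would define
 * its own function and "#define arch_fixup_lazy_mm_refs arch_fixup_lazy_mm_refs"
 * in its header before this point, which skips the stub.
 * The signature is assumed for illustration.
 */
#ifndef arch_fixup_lazy_mm_refs
static inline void arch_fixup_lazy_mm_refs(struct toy_mm *mm)
{
	(void)mm; /* nothing to fix up on architectures without the hook */
}
#define arch_fixup_lazy_mm_refs arch_fixup_lazy_mm_refs
#endif

/* Generic caller, playing the role of the mmput() path in the errors above. */
static int toy_mmput_path(struct toy_mm *mm)
{
	arch_fixup_lazy_mm_refs(mm);
	return mm->fixups;
}
```

With the fallback in place the generic caller compiles and links on every architecture, and the stub leaves the mm untouched.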