Re: [PATCH 3/4] mm: Add free()
On Tue, Apr 03, 2018 at 10:50:59AM +0200, Pavel Machek wrote: > > gcc already does some nice optimisations around free(). For example, it > > can eliminate dead stores: > > Are we comfortable with that optimalization for kernel? > > us: "Hey, let's remove those encryption keys before freeing memory." > gcc: :-). > > us: "Hey, we want to erase lock magic values not to cause confusion > later." > gcc: "I like confusion!" > > Yes, these probably can be fixed by strategic "volatile" and/or > barriers, but... Exactly, we should mark those sites explicitly with volatile so that they aren't dead stores.
Re: [PATCH 3/4] mm: Add free()
Hi! > > And sure, your free() implementation obviously also has that property, > > but I'm worried that they might one day decide to warn about the > > prototype mismatch (actually, I'm surprised it doesn't warn now, given > > that it obviously pretends to know what free() function I'm calling...), > > or make some crazy optimization that will break stuff in very subtle ways. > > > > Also, we probably don't want people starting to use free() (or whatever > > name is chosen) if they do know the kind of memory they're freeing? > > Maybe it should not be advertised that widely (i.e., in kernel.h). > > All that you've said I see as an advantage, not a disadvantage. > Maybe I should change the prototype to match the userspace > free(), although gcc is deliberately lax about the constness of > function arguments when determining compatibility with builtins. > See match_builtin_function_types() if you're really curious. > > gcc already does some nice optimisations around free(). For example, it > can eliminate dead stores: Are we comfortable with that optimalization for kernel? us: "Hey, let's remove those encryption keys before freeing memory." gcc: :-). us: "Hey, we want to erase lock magic values not to cause confusion later." gcc: "I like confusion!" Yes, these probably can be fixed by strategic "volatile" and/or barriers, but... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html signature.asc Description: Digital signature
Re: [PATCH 3/4] mm: Add free()
On Fri, Mar 23, 2018 at 08:14:21AM -0700, Matthew Wilcox wrote: > On Fri, Mar 23, 2018 at 04:33:24PM +0300, Kirill Tkhai wrote: > > > + page = virt_to_head_page(ptr); > > > + if (likely(PageSlab(page))) > > > + return kmem_cache_free(page->slab_cache, (void *)ptr); > > > > It seems slab_cache is not generic for all types of slabs. SLOB does not > > care about it: > > Oof. I was sure I checked that. You're quite right that it doesn't ... > this should fix that problem: This patch was complete rubbish. The point of SLOB is that it mixes sizes within the same page, and doesn't store the size when allocating from a slab. So there is no way to tell. I'm going to think about this some more.
Re: [PATCH 3/4] mm: Add free()
Hi Matthew, I love your patch! Yet something to improve: [auto build test ERROR on rcu/rcu/next] [also build test ERROR on v4.16-rc6 next-20180323] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Matthew-Wilcox/Add-free-function/20180324-140756 base: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next config: s390-default_defconfig (attached as .config) compiler: s390x-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=s390 All errors (new ones prefixed by >>): >> drivers/s390/cio/blacklist.c:43:20: error: 'free' redeclared as different >> kind of symbol typedef enum {add, free} range_action; ^~~~ In file included from include/linux/list.h:9:0, from include/linux/preempt.h:11, from include/linux/spinlock.h:51, from include/linux/vmalloc.h:5, from drivers/s390/cio/blacklist.c:15: include/linux/kernel.h:935:6: note: previous declaration of 'free' was here void free(const void *); ^~~~ vim +/free +43 drivers/s390/cio/blacklist.c ^1da177e Linus Torvalds 2005-04-16 30 ^1da177e Linus Torvalds 2005-04-16 31 /* ^1da177e Linus Torvalds 2005-04-16 32 * "Blacklisting" of certain devices: ^1da177e Linus Torvalds 2005-04-16 33 * Device numbers given in the commandline as cio_ignore=... won't be known ^1da177e Linus Torvalds 2005-04-16 34 * to Linux. ^1da177e Linus Torvalds 2005-04-16 35 * ^1da177e Linus Torvalds 2005-04-16 36 * These can be single devices or ranges of devices ^1da177e Linus Torvalds 2005-04-16 37 */ ^1da177e Linus Torvalds 2005-04-16 38 fb6958a5 Cornelia Huck 2006-01-06 39 /* 65536 bits for each set to indicate if a devno is blacklisted or not */ a8237fc4 Cornelia Huck 2006-01-06 40 #define __BL_DEV_WORDS ((__MAX_SUBCHANNEL + (8*sizeof(long) - 1)) / \ ^1da177e Linus Torvalds 2005-04-16 41 (8*sizeof(long))) fb6958a5 Cornelia Huck 2006-01-06 42 static unsigned long bl_dev[__MAX_SSID + 1][__BL_DEV_WORDS]; ^1da177e Linus Torvalds 2005-04-16 @43 typedef enum {add, free} range_action; ^1da177e Linus Torvalds 2005-04-16 44 :: The code at line 43 was first introduced by commit :: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 Linux-2.6.12-rc2 :: TO: Linus Torvalds :: CC: Linus Torvalds --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH 3/4] mm: Add free()
On Fri, Mar 23, 2018 at 08:14:21AM -0700, Matthew Wilcox wrote: > > One more thing, there is > > some kasan checks on the main way of kfree(), and there is no guarantee they > > reflected in kmem_cache_free() identical. > > Which function are you talking about here? > > slub calls slab_free() for both kfree() and kmem_cache_free(). > slab calls __cache_free() for both kfree() and kmem_cache_free(). > Each of them do their kasan handling in the called function. ... except for where slub can free large objects without calling slab_free(): if (unlikely(!PageSlab(page))) { BUG_ON(!PageCompound(page)); kfree_hook(object); __free_pages(page, compound_order(page)); return; } slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_); If you call kmalloc(16384, GFP_KERNEL), slub will hand back an order-2 page without setting PageSlab on it. So if that gets passed to free(), it'll call __put_page() which calls free_compound_page() which calls __free_pages_ok(). Looks like we want another compound_dtor to be sure we call the kfree hook.
Re: [PATCH 3/4] mm: Add free()
On 23.03.2018 18:14, Matthew Wilcox wrote: > On Fri, Mar 23, 2018 at 04:33:24PM +0300, Kirill Tkhai wrote: >>> + page = virt_to_head_page(ptr); >>> + if (likely(PageSlab(page))) >>> + return kmem_cache_free(page->slab_cache, (void *)ptr); >> >> It seems slab_cache is not generic for all types of slabs. SLOB does not >> care about it: > > Oof. I was sure I checked that. You're quite right that it doesn't ... > this should fix that problem: > > diff --git a/mm/slob.c b/mm/slob.c > index 623e8a5c46ce..96339420c6fc 100644 > --- a/mm/slob.c > +++ b/mm/slob.c > @@ -266,7 +266,7 @@ static void *slob_page_alloc(struct page *sp, size_t > size, int align) > /* > * slob_alloc: entry point into the slob allocator. > */ > -static void *slob_alloc(size_t size, gfp_t gfp, int align, int node) > +static void *slob_alloc(size_t size, gfp_t gfp, int align, int node, void *c) > { > struct page *sp; > struct list_head *prev; > @@ -324,6 +324,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int > align, int node) > sp->units = SLOB_UNITS(PAGE_SIZE); > sp->freelist = b; > INIT_LIST_HEAD(&sp->lru); > + sp->slab_cache = c; > set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE)); > set_slob_page_free(sp, slob_list); > b = slob_page_alloc(sp, size, align); > @@ -440,7 +441,7 @@ __do_kmalloc_node(size_t size, gfp_t gfp, int node, > unsigned long caller) > if (!size) > return ZERO_SIZE_PTR; > > - m = slob_alloc(size + align, gfp, align, node); > + m = slob_alloc(size + align, gfp, align, node, NULL); > > if (!m) > return NULL; > @@ -544,7 +545,7 @@ static void *slob_alloc_node(struct kmem_cache *c, gfp_t > flags, int node) > fs_reclaim_release(flags); > > if (c->size < PAGE_SIZE) { > - b = slob_alloc(c->size, flags, c->align, node); > + b = slob_alloc(c->size, flags, c->align, node, c); > trace_kmem_cache_alloc_node(_RET_IP_, b, c->object_size, > SLOB_UNITS(c->size) * SLOB_UNIT, > flags, node); > @@ -600,6 +601,8 @@ static void kmem_rcu_free(struct rcu_head *head) > > void kmem_cache_free(struct kmem_cache *c, void *b) > { > + if (!c) > + return kfree(b); > kmemleak_free_recursive(b, c->flags); > if (unlikely(c->flags & SLAB_TYPESAFE_BY_RCU)) { > struct slob_rcu *slob_rcu; > >> Also, using kmem_cache_free() for kmalloc()'ed memory will connect them >> hardly, >> and this may be difficult to maintain in the future. > > I think the win from being able to delete all the little RCU callbacks > that just do a kmem_cache_free() is big enough to outweigh the > disadvantage of forcing slab allocators to support kmem_cache_free() > working on kmalloced memory. > >> One more thing, there is >> some kasan checks on the main way of kfree(), and there is no guarantee they >> reflected in kmem_cache_free() identical. > > Which function are you talking about here? > > slub calls slab_free() for both kfree() and kmem_cache_free(). > slab calls __cache_free() for both kfree() and kmem_cache_free(). > Each of them do their kasan handling in the called function. Maybe not KASAN, I never dived deeply into sl[*]b. But they look like just three different functions, doing different actions... Kirill
Re: [PATCH 3/4] mm: Add free()
On Fri, Mar 23, 2018 at 04:33:24PM +0300, Kirill Tkhai wrote: > > + page = virt_to_head_page(ptr); > > + if (likely(PageSlab(page))) > > + return kmem_cache_free(page->slab_cache, (void *)ptr); > > It seems slab_cache is not generic for all types of slabs. SLOB does not care > about it: Oof. I was sure I checked that. You're quite right that it doesn't ... this should fix that problem: diff --git a/mm/slob.c b/mm/slob.c index 623e8a5c46ce..96339420c6fc 100644 --- a/mm/slob.c +++ b/mm/slob.c @@ -266,7 +266,7 @@ static void *slob_page_alloc(struct page *sp, size_t size, int align) /* * slob_alloc: entry point into the slob allocator. */ -static void *slob_alloc(size_t size, gfp_t gfp, int align, int node) +static void *slob_alloc(size_t size, gfp_t gfp, int align, int node, void *c) { struct page *sp; struct list_head *prev; @@ -324,6 +324,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node) sp->units = SLOB_UNITS(PAGE_SIZE); sp->freelist = b; INIT_LIST_HEAD(&sp->lru); + sp->slab_cache = c; set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE)); set_slob_page_free(sp, slob_list); b = slob_page_alloc(sp, size, align); @@ -440,7 +441,7 @@ __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller) if (!size) return ZERO_SIZE_PTR; - m = slob_alloc(size + align, gfp, align, node); + m = slob_alloc(size + align, gfp, align, node, NULL); if (!m) return NULL; @@ -544,7 +545,7 @@ static void *slob_alloc_node(struct kmem_cache *c, gfp_t flags, int node) fs_reclaim_release(flags); if (c->size < PAGE_SIZE) { - b = slob_alloc(c->size, flags, c->align, node); + b = slob_alloc(c->size, flags, c->align, node, c); trace_kmem_cache_alloc_node(_RET_IP_, b, c->object_size, SLOB_UNITS(c->size) * SLOB_UNIT, flags, node); @@ -600,6 +601,8 @@ static void kmem_rcu_free(struct rcu_head *head) void kmem_cache_free(struct kmem_cache *c, void *b) { + if (!c) + return kfree(b); kmemleak_free_recursive(b, c->flags); if (unlikely(c->flags & SLAB_TYPESAFE_BY_RCU)) { struct slob_rcu *slob_rcu; > Also, using kmem_cache_free() for kmalloc()'ed memory will connect them > hardly, > and this may be difficult to maintain in the future. I think the win from being able to delete all the little RCU callbacks that just do a kmem_cache_free() is big enough to outweigh the disadvantage of forcing slab allocators to support kmem_cache_free() working on kmalloced memory. > One more thing, there is > some kasan checks on the main way of kfree(), and there is no guarantee they > reflected in kmem_cache_free() identical. Which function are you talking about here? slub calls slab_free() for both kfree() and kmem_cache_free(). slab calls __cache_free() for both kfree() and kmem_cache_free(). Each of them do their kasan handling in the called function.
Re: [PATCH 3/4] mm: Add free()
On Fri, Mar 23, 2018 at 09:04:10AM +0100, Rasmus Villemoes wrote: > On 2018-03-22 20:58, Matthew Wilcox wrote: > > From: Matthew Wilcox > > > > free() can free many different kinds of memory. > > I'd be a bit worried about using that name. gcc very much knows about > the C standard's definition of that function, as can be seen on > godbolt.org by compiling > > void free(const void *); > void f(void) > { > free((void*)0); > } > > with -O2 -Wall -Wextra -c. Anything from 4.6 onwards simply compiles this to > > f: > repz retq > > And sure, your free() implementation obviously also has that property, > but I'm worried that they might one day decide to warn about the > prototype mismatch (actually, I'm surprised it doesn't warn now, given > that it obviously pretends to know what free() function I'm calling...), > or make some crazy optimization that will break stuff in very subtle ways. > > Also, we probably don't want people starting to use free() (or whatever > name is chosen) if they do know the kind of memory they're freeing? > Maybe it should not be advertised that widely (i.e., in kernel.h). All that you've said I see as an advantage, not a disadvantage. Maybe I should change the prototype to match the userspace free(), although gcc is deliberately lax about the constness of function arguments when determining compatibility with builtins. See match_builtin_function_types() if you're really curious. gcc already does some nice optimisations around free(). For example, it can eliminate dead stores: #include void f(char *foo) { foo[1] = 3; free(foo); } becomes: : 0: e9 00 00 00 00 jmpq 5 1: R_X86_64_PLT32 free-0x4 You can see more things it knows about free() by grepping for BUILT_IN_FREE. I absolutely do want to see people using free() instead of kfree() if it's not important that the memory was kmalloced. I wouldn't go through and change existing code, but I do want to see #define malloc(x) kvmalloc((x), GFP_KERNEL)
Re: [PATCH 3/4] mm: Add free()
Hi, Matthew, On 22.03.2018 22:58, Matthew Wilcox wrote: > From: Matthew Wilcox > > free() can free many different kinds of memory. > > Signed-off-by: Matthew Wilcox > --- > include/linux/kernel.h | 2 ++ > mm/util.c | 39 +++ > 2 files changed, 41 insertions(+) > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h > index 3fd291503576..8bb578938e65 100644 > --- a/include/linux/kernel.h > +++ b/include/linux/kernel.h > @@ -933,6 +933,8 @@ static inline void ftrace_dump(enum ftrace_dump_mode > oops_dump_mode) { } >"pointer type mismatch in container_of()");\ > ((type *)(__mptr - offsetof(type, member))); }) > > +void free(const void *); > + > /* Rebuild everything on CONFIG_FTRACE_MCOUNT_RECORD */ > #ifdef CONFIG_FTRACE_MCOUNT_RECORD > # define REBUILD_DUE_TO_FTRACE_MCOUNT_RECORD > diff --git a/mm/util.c b/mm/util.c > index dc4c7b551aaf..8aa2071059b0 100644 > --- a/mm/util.c > +++ b/mm/util.c > @@ -26,6 +26,45 @@ static inline int is_kernel_rodata(unsigned long addr) > addr < (unsigned long)__end_rodata; > } > > +/** > + * free() - Free memory > + * @ptr: Pointer to memory > + * > + * This function can free almost any type of memory. It can safely be > + * called on: > + * * NULL pointers. > + * * Pointers to read-only data (will do nothing). > + * * Pointers to memory allocated from kmalloc(). > + * * Pointers to memory allocated from kmem_cache_alloc(). > + * * Pointers to memory allocated from vmalloc(). > + * * Pointers to memory allocated from alloc_percpu(). > + * * Pointers to memory allocated from __get_free_pages(). > + * * Pointers to memory allocated from page_frag_alloc(). > + * > + * It cannot free memory allocated by dma_pool_alloc() or > dma_alloc_coherent(). > + */ > +void free(const void *ptr) > +{ > + struct page *page; > + > + if (unlikely(ZERO_OR_NULL_PTR(ptr))) > + return; > + if (is_kernel_rodata((unsigned long)ptr)) > + return; > + > + page = virt_to_head_page(ptr); > + if (likely(PageSlab(page))) > + return kmem_cache_free(page->slab_cache, (void *)ptr); It seems slab_cache is not generic for all types of slabs. SLOB does not care about it: ~/linux-next$ git grep -w slab_cache mm include | grep -v kasan include/linux/mm.h: * slab code uses page->slab_cache, which share storage with page->ptl. include/linux/mm_types.h: struct kmem_cache *slab_cache; /* SL[AU]B: Pointer to slab */ mm/slab.c: return page->slab_cache; mm/slab.c: cachep = page->slab_cache; mm/slab.c: page->slab_cache = cache; mm/slab.c: cachep = page->slab_cache; mm/slab.h: cachep = page->slab_cache; mm/slub.c: if (unlikely(s != page->slab_cache)) { mm/slub.c: } else if (!page->slab_cache) { mm/slub.c: page->slab_cache = s; mm/slub.c: __free_slab(page->slab_cache, page); mm/slub.c: df->s = page->slab_cache; mm/slub.c: s = page->slab_cache; mm/slub.c: return slab_ksize(page->slab_cache); mm/slub.c: slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_); mm/slub.c: p->slab_cache = s; mm/slub.c: p->slab_cache = s; Also, using kmem_cache_free() for kmalloc()'ed memory will connect them hardly, and this may be difficult to maintain in the future. One more thing, there is some kasan checks on the main way of kfree(), and there is no guarantee they reflected in kmem_cache_free() identical. Maybe, we will use kfree() for now, and skip kmemcache free() support? If there is no different way to differ kmemcache memory from kmalloc()'ed memory, of course. Kirill
Re: [PATCH 3/4] mm: Add free()
On 2018-03-22 20:58, Matthew Wilcox wrote: > From: Matthew Wilcox > > free() can free many different kinds of memory. I'd be a bit worried about using that name. gcc very much knows about the C standard's definition of that function, as can be seen on godbolt.org by compiling void free(const void *); void f(void) { free((void*)0); } with -O2 -Wall -Wextra -c. Anything from 4.6 onwards simply compiles this to f: repz retq And sure, your free() implementation obviously also has that property, but I'm worried that they might one day decide to warn about the prototype mismatch (actually, I'm surprised it doesn't warn now, given that it obviously pretends to know what free() function I'm calling...), or make some crazy optimization that will break stuff in very subtle ways. Also, we probably don't want people starting to use free() (or whatever name is chosen) if they do know the kind of memory they're freeing? Maybe it should not be advertised that widely (i.e., in kernel.h). Rasmus
[PATCH 3/4] mm: Add free()
From: Matthew Wilcox free() can free many different kinds of memory. Signed-off-by: Matthew Wilcox --- include/linux/kernel.h | 2 ++ mm/util.c | 39 +++ 2 files changed, 41 insertions(+) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 3fd291503576..8bb578938e65 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -933,6 +933,8 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { } "pointer type mismatch in container_of()");\ ((type *)(__mptr - offsetof(type, member))); }) +void free(const void *); + /* Rebuild everything on CONFIG_FTRACE_MCOUNT_RECORD */ #ifdef CONFIG_FTRACE_MCOUNT_RECORD # define REBUILD_DUE_TO_FTRACE_MCOUNT_RECORD diff --git a/mm/util.c b/mm/util.c index dc4c7b551aaf..8aa2071059b0 100644 --- a/mm/util.c +++ b/mm/util.c @@ -26,6 +26,45 @@ static inline int is_kernel_rodata(unsigned long addr) addr < (unsigned long)__end_rodata; } +/** + * free() - Free memory + * @ptr: Pointer to memory + * + * This function can free almost any type of memory. It can safely be + * called on: + * * NULL pointers. + * * Pointers to read-only data (will do nothing). + * * Pointers to memory allocated from kmalloc(). + * * Pointers to memory allocated from kmem_cache_alloc(). + * * Pointers to memory allocated from vmalloc(). + * * Pointers to memory allocated from alloc_percpu(). + * * Pointers to memory allocated from __get_free_pages(). + * * Pointers to memory allocated from page_frag_alloc(). + * + * It cannot free memory allocated by dma_pool_alloc() or dma_alloc_coherent(). + */ +void free(const void *ptr) +{ + struct page *page; + + if (unlikely(ZERO_OR_NULL_PTR(ptr))) + return; + if (is_kernel_rodata((unsigned long)ptr)) + return; + + page = virt_to_head_page(ptr); + if (likely(PageSlab(page))) + return kmem_cache_free(page->slab_cache, (void *)ptr); + + if (is_vmalloc_addr(ptr)) + return vfree(ptr); + if (is_kernel_percpu_address((unsigned long)ptr)) + free_percpu((void __percpu *)ptr); + if (put_page_testzero(page)) + __put_page(page); +} +EXPORT_SYMBOL(free); + /** * kfree_const - conditionally free memory * @x: pointer to the memory -- 2.16.2