Re: [RFC PATCH 7/9] housekeeping: Use own boot option, independant from nohz

2017-08-16 Thread Christopher Lameter
On Tue, 15 Aug 2017, Mike Galbraith wrote: > On Tue, 2017-08-15 at 10:52 -0500, Christopher Lameter wrote: > > On Tue, 15 Aug 2017, Paul E. McKenney wrote: > > > > > Don't the HPC guys just disable idle_balance(), or am I out of date again? > > > > Ummm..

Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-16 Thread Christopher Lameter
On Mon, 16 Oct 2017, Michal Hocko wrote: > But putting that aside. Pinning a lot of memory might cause many > performance issues and misbehavior. There are still kernel users > who need high order memory to work properly. On top of that you are > basically allowing an untrusted user to deplete hig

Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-16 Thread Christopher Lameter
On Mon, 16 Oct 2017, Michal Hocko wrote: > > So I mmap(MAP_CONTIG) 1GB working of working memory, prefer some data > > structures there, maybe receive from network, then decide to write > > some and not write some other. > > Why would you want this? Because we are receiving a 1GB block of data an

Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-16 Thread Christopher Lameter
On Mon, 16 Oct 2017, Michal Hocko wrote: > On Mon 16-10-17 11:02:24, Christopher Lameter wrote: > > On Mon, 16 Oct 2017, Michal Hocko wrote: > > > > > > So I mmap(MAP_CONTIG) 1GB working of working memory, prefer some data > > > > structures there, maybe receive from network, then decide to write >

Re: [RFC PATCH 3/3] mm/map_contig: Add mmap(MAP_CONTIG) support

2017-10-16 Thread Christopher Lameter
On Mon, 16 Oct 2017, Michal Hocko wrote: > > We already have that issue and have ways to control that by tracking > > pinned and mlocked pages as well as limits on their allocations. > > Ohh, it is very different because mlock limit is really small (64kB) > which is not even close to what this is

Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages

2017-11-06 Thread Christopher Lameter
On Mon, 6 Nov 2017, Vlastimil Babka wrote: > I'm not sure what exactly is the EPERM intention. Should really the > capability of THIS process override the cpuset restriction of the TARGET > process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to CAP_SYS_NICE never overrides cpuse

Re: [PATCH v16 00/13] support "task_isolation" mode

2017-11-06 Thread Christopher Lameter
On Fri, 3 Nov 2017, Chris Metcalf wrote: > However, it doesn't seem possible to do the synchronous cancellation of > the vmstat deferred work with irqs disabled, though if there's a way, > it would be a little cleaner to do that; Christoph? We can certainly > update the statistics with interrupts

Re: [PATCH 1/2] mm: extract common code for calculating total memory size

2017-10-27 Thread Christopher Lameter
On Thu, 26 Oct 2017, Yang Shi wrote: > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 935c4d4..e21b81e 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2050,6 +2050,31 @@ extern int __meminit __early_pfn_to_nid(unsigned long > pfn, > static inline void zero_resv_u

Re: [tip:sched/core] sched/isolation: Document the isolcpus= flags

2017-10-30 Thread Christopher Lameter
On Fri, 27 Oct 2017, Peter Zijlstra wrote: > I _strongly_ object to this statement, isolcpus is _not_ the preferred > way, cpusets are. > > And yes, while cpusets suffers some problems, we _should_ really fix > those and not promote this piece of shit isolcpus crap. Well low level control at the

Re: [tip:sched/core] sched/isolation: Document the isolcpus= flags

2017-10-30 Thread Christopher Lameter
On Mon, 30 Oct 2017, Peter Zijlstra wrote: > > isolcpus is the *right* approach here because you are micromanaging the OS > > and are putting dedicated pieces of software on each core. > > That is what you want, and cpusets should allow for that just fine. Well yes a cpuset of one processor I gue

Re: [RFC 1/2] slub: Avoid trying to allocate memory on offline nodes

2018-08-02 Thread Christopher Lameter
On Wed, 1 Aug 2018, Jeremy Linton wrote: > diff --git a/mm/slub.c b/mm/slub.c > index 51258eff4178..e03719bac1e2 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -2519,6 +2519,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t > gfpflags, int node, > if (unlikely(!node_match

Re: [PATCH] proc: add percpu populated pages count to meminfo

2018-08-07 Thread Christopher Lameter
On Mon, 6 Aug 2018, Dennis Zhou wrote: > diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c > index 2fb04846ed11..ddd5249692e9 100644 > --- a/fs/proc/meminfo.c > +++ b/fs/proc/meminfo.c > @@ -7,6 +7,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @

Re: [PATCH] slub: track number of slabs irrespective of CONFIG_SLUB_DEBUG

2018-06-20 Thread Christopher Lameter
On Wed, 20 Jun 2018, Shakeel Butt wrote: > For !CONFIG_SLUB_DEBUG, SLUB does not maintain the number of slabs > allocated per node for a kmem_cache. Thus, slabs_node() in > __kmem_cache_empty(), __kmem_cache_shrink() and __kmem_cache_destroy() > will always return 0 for such config. This is wrong

Re: [GIT PULL] XArray for 4.19

2018-08-22 Thread Christopher Lameter
On Mon, 13 Aug 2018, Matthew Wilcox wrote: > Please consider pulling the XArray patch set. The XArray provides an > improved interface to the radix tree data structure, providing locking > as part of the API, specifying GFP flags at allocation time, eliminating > preloading, less re-walking the t

Re: [GIT PULL] XArray for 4.19

2018-08-24 Thread Christopher Lameter
On Fri, 24 Aug 2018, Vlastimil Babka wrote: > > I think you can just post those for review and say that they apply on > top of xarray git? Maybe also with your own git URL with those applied > for easier access? I'm curious but also sceptical that something so > major would get picked up to mmotm

Re: [PATCH v3 1/7] mm, slab: combine kmalloc_caches and kmalloc_dma_caches

2018-07-30 Thread Christopher Lameter
On Wed, 18 Jul 2018, Vlastimil Babka wrote: > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -295,12 +295,28 @@ static inline void __check_heap_object(const void *ptr, > unsigned long n, > #define SLAB_OBJ_MIN_SIZE (KMALLOC_MIN_SIZE < 16 ? \ > (

Re: [PATCH v3 2/7] mm, slab/slub: introduce kmalloc-reclaimable caches

2018-07-30 Thread Christopher Lameter
On Wed, 18 Jul 2018, Vlastimil Babka wrote: > index 4299c59353a1..d89e934e0d8b 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -296,11 +296,12 @@ static inline void __check_heap_object(const void *ptr, > unsigned long n, > (KMALLOC_MIN_SIZE) :

Re: [PATCH v3 3/7] mm, slab: allocate off-slab freelists as reclaimable when appropriate

2018-07-30 Thread Christopher Lameter
On Wed, 18 Jul 2018, Vlastimil Babka wrote: > In SLAB, OFF_SLAB caches allocate management structures (currently just the > freelist) from kmalloc caches when placement in a slab page together with > objects would lead to suboptimal memory usage. For SLAB_RECLAIM_ACCOUNT > caches, > we can alloca

Re: [PATCH v3 5/7] mm: rename and change semantics of nr_indirectly_reclaimable_bytes

2018-07-30 Thread Christopher Lameter
Acked-by: Christoph Lameter

Re: [PATCH v3 7/7] mm, slab: shorten kmalloc cache names for large sizes

2018-07-30 Thread Christopher Lameter
On Wed, 18 Jul 2018, Vlastimil Babka wrote: > +static const char * > +kmalloc_cache_name(const char *prefix, unsigned int size) > +{ > + > + static const char units[3] = "\0kM"; > + int idx = 0; > + > + while (size >= 1024 && (size % 1024 == 0)) { > + size /= 1024; > +
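
The quoted helper shortens large cache sizes by dividing by 1024 while the size is still a whole multiple of the unit. A minimal userspace sketch of that idea follows. The snippet above is cut off before the name is formatted, so the "<prefix>-<number><unit>" output form, the function name cache_name_sketch, the caller-supplied buffer, and the idx bound are all assumptions made for illustration, not the patch's actual code.

```c
#include <stdio.h>

/*
 * Userspace sketch of the size-shortening loop quoted above.
 * Assumptions (not from the quoted patch): the output form
 * "<prefix>-<number><unit>", the caller-supplied buffer, and the
 * idx < 2 bound that keeps the unit index inside the table.
 */
static const char *cache_name_sketch(char *buf, size_t buflen,
                                     const char *prefix, unsigned int size)
{
    /* Unit suffixes: none, kilo, mega (same content as "\0kM"). */
    static const char units[3] = { '\0', 'k', 'M' };
    int idx = 0;

    /* Divide only while the size stays a whole number of units. */
    while (size >= 1024 && (size % 1024 == 0) && idx < 2) {
        size /= 1024;
        idx++;
    }

    if (units[idx])
        snprintf(buf, buflen, "%s-%u%c", prefix, size, units[idx]);
    else
        snprintf(buf, buflen, "%s-%u", prefix, size);
    return buf;
}
```

Under these assumptions, 8192 becomes "kmalloc-8k" and 1048576 becomes "kmalloc-1M", while a size such as 1536 is not a multiple of 1024 and keeps its full number.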

Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting

2018-01-10 Thread Christopher Lameter
On Tue, 9 Jan 2018, Kees Cook wrote: > +struct kmem_cache *kmem_cache_create_usercopy(const char *name, > + size_t size, size_t align, slab_flags_t flags, > + size_t useroffset, size_t usersize, > + void (*ctor)(void *)); Hmmm... At some

Re: [PATCH 05/36] usercopy: WARN() on slab cache usercopy region violations

2018-01-10 Thread Christopher Lameter
On Tue, 9 Jan 2018, Kees Cook wrote: > @@ -3823,11 +3825,9 @@ int __check_heap_object(const void *ptr, unsigned long > n, struct page *page, Could we do the check in mm_slab_common.c for all allocators and just have a small function in each allocators that give you the metadata needed for the ob

Re: [PATCH 02/38] usercopy: Enhance and rename report_usercopy()

2018-01-11 Thread Christopher Lameter
On Wed, 10 Jan 2018, Kees Cook wrote: > diff --git a/mm/slab.h b/mm/slab.h > index ad657ffa44e5..7d29e69ac310 100644 > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -526,4 +526,10 @@ static inline int cache_random_seq_create(struct > kmem_cache *cachep, > static inline void cache_random_seq_destroy(str

RE: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting

2018-01-12 Thread Christopher Lameter
On Fri, 12 Jan 2018, David Laight wrote: > > Hmmm... At some point we should switch kmem_cache_create to pass a struct > > containing all the parameters. Otherwise the API will blow up with > > additional functions. > > Or add an extra function to 'configure' the kmem_cache with the > extra parame

Re: [PATCH 3/6] struct page: add field for vm_struct

2018-01-31 Thread Christopher Lameter
On Tue, 30 Jan 2018, Igor Stoppa wrote: > @@ -1769,6 +1774,9 @@ void *__vmalloc_node_range(unsigned long size, unsigned > long align, > > kmemleak_vmalloc(area, size, gfp_mask); > > + for (page_counter = 0; page_counter < area->nr_pages; page_counter++) > + area->pages[page_

Re: [PATCH] mm: numa: Do not trap faults on shared data section pages.

2018-01-19 Thread Christopher Lameter
On Thu, 18 Jan 2018, Henry Willard wrote: > If MPOL_MF_LAZY were allowed and specified things would not work > correctly. change_pte_range() is unaware of and can’t honor the > difference between MPOL_MF_MOVE_ALL and MPOL_MF_MOVE. Not sure how that relates to what I said earlier... Sorry. > > Fo

Re: [PATCH 1/3] mm, numa: rework do_pages_move

2018-01-05 Thread Christopher Lameter
On Fri, 5 Jan 2018, Michal Hocko wrote: > Yes. I am really wondering because there shouldn't be anything specific to > improve the situation with patch 2 and 3. Likewise the only overhead > from the patch 1 I can see is the reduced batching of the mmap_sem. But > then I am wondering what would compens

Re: [PATCH 1/3] mm, numa: rework do_pages_move

2018-01-05 Thread Christopher Lameter
On Fri, 5 Jan 2018, Michal Hocko wrote: > I believe there should be some cap on the number of pages. We shouldn't > keep it held for million of pages if all of them are moved to the same > node. I would really like to postpone that to later unless it causes > some noticeable regressions because th

Re: [PATCH 1/3] mm, numa: rework do_pages_move

2018-01-05 Thread Christopher Lameter
On Fri, 5 Jan 2018, Michal Hocko wrote: > > Also why are you migrating the pages on pagelist if a > > add_page_for_migration() fails? One could simply update > > the status in user space and continue. > > I am open to further cleanups. Care to send a full patch with the > changelog? I would rather

Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-16 Thread Christopher Lameter
On Tue, 16 Jan 2018, Matthew Wilcox wrote: > I think that's a good thing! /proc/slabinfo really starts to get grotty > above 16 bytes. I'd like to chop off "_cache" from the name of every > single slab! If ext4_allocation_context has to become ext4_alloc_ctx, > I don't think we're going to lose

Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-16 Thread Christopher Lameter
On Tue, 16 Jan 2018, Matthew Wilcox wrote: > > Sure this data is never changed. It can be const. > > It's changed at initialisation. Look: > > kmem_cache_create(const char *name, size_t size, size_t align, > slab_flags_t flags, void (*ctor)(void *)) > s = create_cache(ca

Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-16 Thread Christopher Lameter
Draft patch of how the data structs could change. kmem_cache_attr is read only. Index: linux/include/linux/slab.h === --- linux.orig/include/linux/slab.h +++ linux/include/linux/slab.h @@ -135,9 +135,17 @@ struct mem_cgroup; void __i

Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-17 Thread Christopher Lameter
On Tue, 16 Jan 2018, Matthew Wilcox wrote: > On Tue, Jan 16, 2018 at 12:17:01PM -0600, Christopher Lameter wrote: > > Draft patch of how the data structs could change. kmem_cache_attr is read > > only. > > Looks good. Although I would add Kees' user feature: Sure I t

Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Christopher Lameter
On Tue, 16 Jan 2018, Mike Galbraith wrote: > > I tried to remove isolcpus or at least change the way it works so that its > > effects are reversible (ie: affine the init task instead of isolating > > domains) > > but that got nacked due to the behaviour's expectations for userspace. > > So we pai

Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Christopher Lameter
On Wed, 17 Jan 2018, Mike Galbraith wrote: > Domain connectivity very much is a property of a set of CPUs, a rather > important one, and one managed by cpusets.  NOHZ_FULL is a property of > a set of cpus, thus a most excellent fit.  Other things are as well. Not sure to what domain refers to in

Re: [PATCH] mm: numa: Do not trap faults on shared data section pages.

2018-01-17 Thread Christopher Lameter
On Tue, 16 Jan 2018, Mel Gorman wrote: > My main source of discomfort is the fact that this is permanent as two > processes perfectly isolated but with a suitably shared COW mapping > will never migrate the data. A potential improvement to get the reported > bandwidth up in the test program would

Re: [PATCH 02/36] usercopy: Include offset in overflow report

2018-01-10 Thread Christopher Lameter
On Tue, 9 Jan 2018, Kees Cook wrote: > -static void report_usercopy(unsigned long len, bool to_user, const char > *type) > +int report_usercopy(const char *name, const char *detail, bool to_user, > + unsigned long offset, unsigned long len) > { > - pr_emerg("kernel memory %s

kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-16 Thread Christopher Lameter
On Sun, 14 Jan 2018, Matthew Wilcox wrote: > > Hmmm... At some point we should switch kmem_cache_create to pass a struct > > containing all the parameters. Otherwise the API will blow up with > > additional functions. > > Obviously I agree with you. I'm inclined to not let that delay Kees' > patc

Re: [PATCH] mm/slab.c: remove duplicated check of colour_next

2018-03-12 Thread Christopher Lameter
Acked-by: Christoph Lameter

Re: [PATCH] slub: Fix misleading 'age' in verbose slub prints

2018-03-07 Thread Christopher Lameter
On Wed, 7 Mar 2018, Chintan Pandya wrote: > In this case, object got freed later but 'age' shows > otherwise. This could be because, while printing > this info, we print allocation traces first and > free traces thereafter. In between, if we get schedule > out, (jiffies - t->when) could become mea

Re: [PATCH] mm, slab: reschedule cache_reap() on the same CPU

2018-04-11 Thread Christopher Lameter
On Wed, 11 Apr 2018, Pekka Enberg wrote: > Acked-by: Pekka Enberg Good to hear from you again. Acked-by: Christoph Lameter

Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-11 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > diff --git a/mm/slab.h b/mm/slab.h > index 3cd4677953c6..896818c7b30a 100644 > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -515,6 +515,13 @@ static inline void dump_unreclaimable_slab(void) > > void ___cache_free(struct kmem_cache *cache, void *x, unsigned

Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-11 Thread Christopher Lameter
On Wed, 11 Apr 2018, Matthew Wilcox wrote: > > > slab_post_alloc_hook(s, gfpflags, 1, &object); > > > > Please put this in a code path that is enabled by specifying > > > > slub_debug > > > > on the kernel command line. > > I don't understand. First, I had: > > if (unlikely(gfpflags & __G

Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-12 Thread Christopher Lameter
On Wed, 11 Apr 2018, Matthew Wilcox wrote: > > I don't see how that works ... can you explain a little more? > > I see ___slab_alloc() is called from __slab_alloc(). And I see > slab_alloc_node does this: > > object = c->freelist; > page = c->page; > if (unlikely(!object |

Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-12 Thread Christopher Lameter
On Thu, 12 Apr 2018, Matthew Wilcox wrote: > > Thus the next invocation of the fastpath will find that c->freelist is > > NULL and go to the slowpath. ... > > _ah_. I hadn't figured out that c->page was always NULL in the debugging > case too, so ___slab_alloc() always hits the 'new_slab' case.

Re: [PATCH] slab_common: remove test if cache name is accessible

2018-03-23 Thread Christopher Lameter
On Fri, 23 Mar 2018, Mikulas Patocka wrote: > Since the commit db265eca7700 ("mm/sl[aou]b: Move duping of slab name to > slab_common.c"), the kernel always duplicates the slab cache name when > creating a slab cache, so the test if the slab name is accessible is > useless. Acked-by: Christoph Lam

Re: [RFC] mm, slab: reschedule cache_reap() on the same CPU

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Vlastimil Babka wrote: > cache_reap() is initially scheduled in start_cpu_timer() via > schedule_delayed_work_on(). But then the next iterations are scheduled via > schedule_delayed_work(), thus using WORK_CPU_UNBOUND. That is a bug.. cache_reap must run on the same cpu since

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > __GFP_ZERO requests that the object be initialised to all-zeroes, > while the purpose of a constructor is to initialise an object to a > particular pattern. We cannot do both. Add a warning to catch any > users who mistakenly pass a __GFP_ZERO flag wh
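
The conflict described above (zeroing an object versus initialising it to a constructor's pattern) can be modelled in userspace. This is only a sketch: SKETCH_ZERO, struct cache_sketch, and the function names are hypothetical stand-ins, malloc() replaces the slab allocator, and the constructor runs on every allocation here, which simplifies how a real slab cache applies constructors. Letting the constructor win after the warning is also an assumption.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ZERO 0x1u            /* hypothetical stand-in for __GFP_ZERO */

struct cache_sketch {               /* hypothetical stand-in for a slab cache */
    size_t size;
    void (*ctor)(void *);
};

/* Example constructor: sets the object to a known non-zero pattern. */
static void ctor_set_magic(void *obj)
{
    *(unsigned int *)obj = 0xC0FFEEu;
}

/* True when the request is contradictory: zeroing plus a constructor. */
static bool zero_conflicts_with_ctor(const struct cache_sketch *c,
                                     unsigned int flags)
{
    return (flags & SKETCH_ZERO) && c->ctor != NULL;
}

static void *cache_alloc_sketch(const struct cache_sketch *c,
                                unsigned int flags)
{
    void *obj;

    if (zero_conflicts_with_ctor(c, flags)) {
        fprintf(stderr, "warning: zeroing requested on a cache with a constructor\n");
        flags &= ~SKETCH_ZERO;      /* assumption: the constructor wins */
    }

    obj = malloc(c->size);
    if (!obj)
        return NULL;

    if (c->ctor)
        c->ctor(obj);               /* simplified: run on every allocation */
    else if (flags & SKETCH_ZERO)
        memset(obj, 0, c->size);
    return obj;
}
```

A cache without a constructor honours the zero flag as usual. A cache with one triggers the warning, which is the class of caller mistake the patch aims to catch.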

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Christopher Lameter wrote: > On Tue, 10 Apr 2018, Matthew Wilcox wrote: > > > __GFP_ZERO requests that the object be initialised to all-zeroes, > > while the purpose of a constructor is to initialise an object to a > > particular pattern. We cannot do

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > Are you willing to have this kind of bug go uncaught for a while? There will be frequent allocations and this will show up at some point. Also you could put this into the debug only portions somewhere so we always catch it when debugging is on, '

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > If we want to get rid of the concept of constructors, it's doable, > but somebody needs to do the work to show what the effects will be. How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches? Those must have a defined state of the objec

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > > How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches? > > Those must have a defined state of the objects at all times and a > > constructor is > > required for that. And their use of RCU is required for numerous lockless > > lookup a

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > > Objects can be freed and reused and still be accessed from code that > > thinks the object is the old and not the new object > > Yes, I know, that's the point of RCU typesafety. My point is that an > object *which has never been used* can't be ac

Re: [PATCH] slub: Fix misleading 'age' in verbose slub prints

2018-03-08 Thread Christopher Lameter
On Thu, 8 Mar 2018, Chintan Pandya wrote: > > If you print the raw value, then you can do the subtraction yourself; > > if you've subtracted it from jiffies each time, you've at least introduced > > jitter, and possibly enough jitter to confuse and mislead. > > > This is exactly what I was thinkin

Re: [PATCH v2] slub: use jitter-free reference while printing age

2018-03-08 Thread Christopher Lameter
On Thu, 8 Mar 2018, Chintan Pandya wrote: > In this case, object got freed later but 'age' > shows otherwise. This could be because, while > printing this info, we print allocation traces > first and free traces thereafter. In between, > if we get schedule out or jiffies increment, > (jiffies - t-

Re: [PATCH] slab, slub: remove size disparity on debug kernel

2018-03-13 Thread Christopher Lameter
On Tue, 13 Mar 2018, Shakeel Butt wrote: > However for SLUB in debug kernel, the sizes were same. On further > inspection it is found that SLUB always use kmem_cache.object_size to > measure the kmem_cache.size while SLAB use the given kmem_cache.size. In > the debug kernel the slab's size can be

Re: WARNING: kmalloc bug in input_mt_init_slots

2018-09-27 Thread Christopher Lameter
On Thu, 27 Sep 2018, Dmitry Vyukov wrote: > On Tue, Sep 25, 2018 at 4:04 PM, Christopher Lameter wrote: > > On Tue, 25 Sep 2018, Dmitry Vyukov wrote: > > > >> Assuming that the size is large enough to fail in all allocators, is > >> this warning still

Re: WARNING: kmalloc bug in input_mt_init_slots

2018-09-27 Thread Christopher Lameter
On Thu, 27 Sep 2018, Dmitry Vyukov wrote: > On Thu, Sep 27, 2018 at 4:16 PM, Christopher Lameter wrote: > > On Thu, 27 Sep 2018, Dmitry Vyukov wrote: > > > >> On Tue, Sep 25, 2018 at 4:04 PM, Christopher Lameter > >> wrote: > >>

Re: [STABLE PATCH] slub: make ->cpu_partial unsigned int

2018-09-27 Thread Christopher Lameter
On Thu, 27 Sep 2018, zhong jiang wrote: > From: Alexey Dobriyan > > /* > * cpu_partial determined the maximum number of objects > * kept in the per cpu partial lists of a processor. > */ > > Can't be negative. True. > I hit a real issue that it will result in

Re: WARNING: kmalloc bug in input_mt_init_slots

2018-09-27 Thread Christopher Lameter
On Thu, 27 Sep 2018, Dmitry Vyukov wrote: > > Please post on the mailing list > > It is on the mailing lists: > https://lkml.org/lkml/2018/9/27/802 Ok then lets continue the discussion there.

Re: [PATCH] mm: don't warn about large allocations for slab

2018-09-27 Thread Christopher Lameter
On Thu, 27 Sep 2018, Dmitry Vyukov wrote: > From: Dmitry Vyukov > > This warning does not seem to be useful. Most of the time it fires when > allocation size depends on syscall arguments. We could add __GFP_NOWARN > to these allocation sites, but having a warning only to suppress it > does not ma

Re: [PATCH v3] slub: extend slub debug to handle multiple slabs

2018-09-28 Thread Christopher Lameter
On Fri, 28 Sep 2018, Aaron Tomlin wrote: > Extend the slub_debug syntax to "slub_debug=[,]*", where > may contain an asterisk at the end. For example, the following would poison > all kmalloc slabs: Acked-by: Christoph Lameter

Re: [PATCHi v2] mm: put_and_wait_on_page_locked() while page is migrated

2018-11-27 Thread Christopher Lameter
On Tue, 27 Nov 2018, Mike Rapoport wrote: > > * @page: The page to wait for. > > * > > * The caller should hold a reference on @page. They expect the page to > > * become unlocked relatively soon, but do not wish to hold up migration > > * (for example) by holding the reference while waiting

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-19 Thread Christopher Lameter
On Mon, 19 Nov 2018, Jerome Glisse wrote: > > IIRC this is solved in IB by automatically calling > > madvise(MADV_DONTFORK) before creating the MR. > > > > MADV_DONTFORK > > .. This is useful to prevent copy-on-write semantics from changing the > > physical location of a page if the parent wri
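
For reference, the madvise(MADV_DONTFORK) call mentioned above looks roughly like this from userspace on Linux. This is a sketch only: the helper name map_dontfork is hypothetical, an anonymous mapping stands in for whatever buffer would later be registered with a device, and nothing here reflects how the IB stack itself issues the call.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1           /* for MAP_ANONYMOUS and madvise() */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of anonymous memory and mark the range MADV_DONTFORK,
 * so a child created by fork() does not get the range in its address
 * space. Returns NULL on failure. The helper name is hypothetical.
 */
static void *map_dontfork(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    if (madvise(p, len, MADV_DONTFORK) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

Because the child never maps the range, a write in the parent after fork() does not trigger copy-on-write for those pages, so their physical location stays put.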

Re: [PATCH] slab: fix 'dubious: x & !y' warning from Sparse

2018-11-16 Thread Christopher Lameter
On Fri, 16 Nov 2018, Masahiro Yamada wrote: > diff --git a/include/linux/slab.h b/include/linux/slab.h > index 918f374..d395c73 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type > kmalloc_type(gfp_t flags) >

Re: [PATCH v2 5/6] mm: track gup pages with page->dma_pinned_* fields

2018-07-02 Thread Christopher Lameter
On Mon, 2 Jul 2018, John Hubbard wrote: > > > > These two are just wrong. You cannot make any page reference for > > PageDmaPinned() account against a pin count. First, it is just conceptually > > wrong as these references need not be long term pins, second, you can > > easily race like: > > > > P

Re: [RFC PATCH for 4.18] rseq: use __u64 for rseq_cs fields, validate user inputs

2018-07-02 Thread Christopher Lameter
On Mon, 2 Jul 2018, Mathieu Desnoyers wrote: > Are there any kind of guarantees that a __u64 update on a 32-bit architecture > won't be torn into something daft like byte-per-byte stores when performed > from C code ? > > I don't worry whether the upper bits get updated or how, but I really care >

Re: [RFC PATCH for 4.18] rseq: use __u64 for rseq_cs fields, validate user inputs

2018-07-02 Thread Christopher Lameter
On Mon, 2 Jul 2018, Mathieu Desnoyers wrote: > > > > Platforms with 32 bit word size only guarantee atomicity of a 32 bit > > write or RMW instruction. > > > > Special instructions may exist on a platform to perform 64 bit atomic > > updates. We use cmpxchg64 f.e. on Intel 32 bit platforms to guar

Re: kernel BUG at mm/slab.c:LINE! (2)

2018-07-09 Thread Christopher Lameter
On Sun, 8 Jul 2018, syzbot wrote: > kernel BUG at mm/slab.c:4421! Classic location that indicates memory corruption. Can we rerun this with CONFIG_SLAB_DEBUG? Alternatively use SLUB debugging for better debugging without rebuilding.

Re: [PATCH -V2 -mm 0/4] mm, huge page: Copy target sub-page last when copy huge page

2018-05-25 Thread Christopher Lameter
On Thu, 24 May 2018, Huang, Ying wrote: > If the cache contention is heavy when copying the huge page, and we > copy the huge page from the begin to the end, it is possible that the > begin of huge page is evicted from the cache after we finishing > copying the end of the huge page. And it is pos

Re: [RFC PATCH 1/5] mm, slab/slub: introduce kmalloc-reclaimable caches

2018-05-25 Thread Christopher Lameter
On Thu, 24 May 2018, Vlastimil Babka wrote: > diff --git a/include/linux/slab.h b/include/linux/slab.h > index 9ebe659bd4a5..5bff0571b360 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -296,11 +296,16 @@ static inline void __check_heap_object(const void *ptr, > unsigned lon

Re: [RFC PATCH 4/5] mm: rename and change semantics of nr_indirectly_reclaimable_bytes

2018-05-25 Thread Christopher Lameter
On Thu, 24 May 2018, Vlastimil Babka wrote: > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index 32699b2dc52a..4343948f33e5 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -180,7 +180,7 @@ enum node_stat_item { > NR_VMSCAN_IMMEDIATE,/* Prioriti

Re: [RFC] mm, THP: Map read-only text segments using large THP pages

2018-05-14 Thread Christopher Lameter
On Mon, 14 May 2018, William Kucharski wrote: > The idea is that the kernel will attempt to allocate and map the range using a > PMD sized THP page upon first fault; if the allocation is successful the page > will be populated (at present using a call to kernel_read()) and the page will > be mappe

Re: [PATCH 0/7] psi: pressure stall information for CPU, memory, and IO

2018-05-14 Thread Christopher Lameter
On Mon, 7 May 2018, Johannes Weiner wrote: > What to make of this number? If CPU utilization is at 100% and CPU > pressure is 0, it means the system is perfectly utilized, with one > runnable thread per CPU and nobody waiting. At two or more runnable > tasks per CPU, the system is 100% overcommitt

Re: [PATCH 0/7] psi: pressure stall information for CPU, memory, and IO

2018-05-14 Thread Christopher Lameter
On Mon, 14 May 2018, Johannes Weiner wrote: > Since I'm using the same model and infrastructure for memory and IO > load as well, IMO it makes more sense to present them in a coherent > interface instead of trying to retrofit and change the loadavg file, > which might not even be possible. Well I

Re: [PATCH 0/2] mm: gup: don't unmap or drop filesystem buffers

2018-06-17 Thread Christopher Lameter
On Sat, 16 Jun 2018, john.hubb...@gmail.com wrote: > I've come up with what I claim is a simple, robust fix, but...I'm > presuming to burn a struct page flag, and limit it to 64-bit arches, in > order to get there. Given that the problem is old (Jason Gunthorpe noted > that RDMA has been living wi

Re: Can kfree() sleep at runtime?

2018-05-31 Thread Christopher Lameter
On Thu, 31 May 2018, Jia-Ju Bai wrote: > I write a static analysis tool (DSAC), and it finds that kfree() can sleep. That should not happen. > Here is the call path for kfree(). > Please look at it *from the bottom up*. > > [FUNC] alloc_pages(GFP_KERNEL) > arch/x86/mm/pageattr.c, 756: alloc_page

Re: Can kfree() sleep at runtime?

2018-05-31 Thread Christopher Lameter
On Thu, 31 May 2018, Matthew Wilcox wrote: > On Thu, May 31, 2018 at 09:10:07PM +0800, Jia-Ju Bai wrote: > > I write a static analysis tool (DSAC), and it finds that kfree() can sleep. > > > > Here is the call path for kfree(). > > Please look at it *from the bottom up*. > > > > [FUNC] alloc_pages

Re: Can kfree() sleep at runtime?

2018-05-31 Thread Christopher Lameter
On Thu, 31 May 2018, Matthew Wilcox wrote: > > Freeing a page in the page allocator also was traditionally not sleeping. > > That has changed? > > No. "Your bug" being "The bug in your static analysis tool". It probably > isn't following the data flow correctly (or deeply enough). Well ok this

Re: [PATCH v2 5/6] mm: track gup pages with page->dma_pinned_* fields

2018-07-03 Thread Christopher Lameter
On Mon, 2 Jul 2018, John Hubbard wrote: > > If you establish a reference to a page then increase the page count. If > > the reference is a dma pin action also then increase the pinned count. > > > > That way you know how many of the references to the page are dma > > pins and you can correctly man

Re: [PATCH v2 5/6] mm: track gup pages with page->dma_pinned_* fields

2018-07-03 Thread Christopher Lameter
On Tue, 3 Jul 2018, John Hubbard wrote: > The page->_refcount field is used normally, in addition to the > dma_pinned_count. > But the problem is that, unless the caller knows what kind of page it is, > the page->dma_pinned_count cannot be looked at, because it is unioned with > page->lru.prev.

Re: [PATCH v2 5/6] mm: track gup pages with page->dma_pinned_* fields

2018-07-05 Thread Christopher Lameter
On Wed, 4 Jul 2018, Jan Kara wrote: > > So this seems unsolvable without having the caller specify that it knows the > > page type, and that it is therefore safe to decrement > > page->dma_pinned_count. > > I was hoping I'd found a way, but clearly I haven't. :) > > Well, I think the misconceptio

Re: [PATCH] mm: Add new vma flag VM_LOCAL_CPU

2018-05-18 Thread Christopher Lameter
On Tue, 15 May 2018, Boaz Harrosh wrote: > > I don't think page tables work the way you think they work. > > > > + err = vm_insert_pfn_prot(zt->vma, zt_addr, pfn, prot); > > > > That doesn't just insert it into the local CPU's page table. Any CPU > > which directly accesses or even

Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

2018-12-13 Thread Christopher Lameter
On Wed, 12 Dec 2018, Jerome Glisse wrote: > On Thu, Dec 13, 2018 at 11:51:19AM +1100, Dave Chinner wrote: > > > > > [O1] Avoid write back from a page still being written by either a > > > > > device or some direct I/O or any other existing user of GUP. > > > > IOWs, you need to mark p

[RFC 5/7] Slab defrag core

2018-12-20 Thread Christopher Lameter
Slab defragmentation may occur: 1. Unconditionally when kmem_cache_shrink is called on a slab cache by the kernel calling kmem_cache_shrink. 2. Through the use of the slabinfo command. 3. Per node defrag conditionally when kmem_cache_defrag() is called (can be called from reclaim code with

Re: BUG: unable to handle kernel NULL pointer dereference in setup_kmem_cache_node

2019-01-02 Thread Christopher Lameter
On Wed, 2 Jan 2019, Dmitry Vyukov wrote: > Am I missing something or __alloc_alien_cache misses check for > kmalloc_node result? > > static struct alien_cache *__alloc_alien_cache(int node, int entries, > int batch, gfp_t gfp) > { > size_t me

Re: [RFC][PATCH v2 08/21] mm: introduce and export pgdat peer_node

2018-12-27 Thread Christopher Lameter
On Wed, 26 Dec 2018, Fengguang Wu wrote: > Each CPU socket can have 1 DRAM and 1 PMEM node, we call them "peer nodes". > Migration between DRAM and PMEM will by default happen between peer nodes. Which one does numa_node_id() point to? I guess that is the DRAM node and then we fall back to the PM

Re: [patch] mm, slab: avoid high-order slab pages when it does not reduce waste

2018-10-15 Thread Christopher Lameter
On Fri, 12 Oct 2018, Andrew Morton wrote: > > If the amount of waste is the same at higher cachep->gfporder values, > > there is no significant benefit to allocating higher order memory. There > > will be fewer calls to the page allocator, but each call will require > > zone->lock and finding the

Re: [patch] mm, slab: avoid high-order slab pages when it does not reduce waste

2018-10-15 Thread Christopher Lameter
On Fri, 12 Oct 2018, David Rientjes wrote: > @@ -1803,6 +1804,20 @@ static size_t calculate_slab_order(struct kmem_cache > *cachep, >*/ > if (left_over * 8 <= (PAGE_SIZE << gfporder)) > break; > + > + /* > + * If a highe

Re: [patch] mm, slab: avoid high-order slab pages when it does not reduce waste

2018-10-16 Thread Christopher Lameter
On Mon, 15 Oct 2018, David Rientjes wrote: > On Mon, 15 Oct 2018, Christopher Lameter wrote: > > > > > If the amount of waste is the same at higher cachep->gfporder values, > > > > there is no significant benefit to allocating higher order memory. > > &

Re: WARNING: kmalloc bug in input_mt_init_slots

2018-10-17 Thread Christopher Lameter
On Tue, 16 Oct 2018, Dmitry Torokhov wrote: > On Thu, Sep 27, 2018 at 07:35:37AM -0700, Matthew Wilcox wrote: > > On Mon, Sep 24, 2018 at 11:41:58AM -0700, Dmitry Torokhov wrote: > > > > How large is the allocation? AFAICT requests larger than > > > > KMALLOC_MAX_SIZE > > > > are larger than the

Re: [patch] mm, slab: avoid high-order slab pages when it does not reduce waste

2018-10-17 Thread Christopher Lameter
On Wed, 17 Oct 2018, Vlastimil Babka wrote: > I.e. the benefits vs drawbacks of higher order allocations for SLAB are > out of scope here. It would be nice if somebody evaluated them, but the > potential resulting change would be much larger than what concerns this > patch. But it would arguably a

Re: WARNING: kmalloc bug in input_mt_init_slots

2018-10-17 Thread Christopher Lameter
On Wed, 17 Oct 2018, Dmitry Torokhov wrote: > >What is a "contact" here? Are we talking about SG segments? > > No, we are talking about maximum number of fingers a person can have. Devices > don't usually track more than 10 distinct contacts on the touch surface at a > time. Ohh... Way off my u

Re: [PATCH] mm: fix race between kmem_cache destroy, create and deactivate

2018-05-22 Thread Christopher Lameter
On Mon, 21 May 2018, Andrew Morton wrote: > The patch seems depressingly complex. > > And a bit underdocumented... Maybe separate out the bits that rename refcount to alias_count? > > + refcount_t refcount; > > + int alias_count; > > The semantic meaning of these two? What locking protects

Re: [PATCH] mm: Add new vma flag VM_LOCAL_CPU

2018-05-22 Thread Christopher Lameter
On Tue, 22 May 2018, Dave Hansen wrote: > On 05/22/2018 09:05 AM, Boaz Harrosh wrote: > > How can we implement "Private memory"? > > Per-cpu page tables would do it. We already have that for percpu subsystem. See alloc_percpu()

Re: [PATCH] mm: Add new vma flag VM_LOCAL_CPU

2018-05-22 Thread Christopher Lameter
On Tue, 22 May 2018, Dave Hansen wrote: > On 05/22/2018 09:46 AM, Christopher Lameter wrote: > > On Tue, 22 May 2018, Dave Hansen wrote: > > > >> On 05/22/2018 09:05 AM, Boaz Harrosh wrote: > >>> How can we implement "Private memory"? > >> Pe

Re: [PATCH] slab, slub: skip unnecessary kasan_cache_shutdown()

2018-03-28 Thread Christopher Lameter
On Tue, 27 Mar 2018, Shakeel Butt wrote: > The kasan quarantine is designed to delay freeing slab objects to catch > use-after-free. The quarantine can be large (several percent of machine > memory size). When kmem_caches are deleted related objects are flushed > from the quarantine but this requi

Re: [PATCH RESEND] slab: introduce the flag SLAB_MINIMIZE_WASTE

2018-04-26 Thread Christopher Lameter
On Wed, 25 Apr 2018, Mikulas Patocka wrote: > > > > Could you move that logic into slab_order()? It does something awfully > > similar. > > But slab_order (and its caller) limits the order to "max_order" and we > want more. > > Perhaps slab_order should be dropped and calculate_order totally > rewr

Re: [PATCH RESEND] slab: introduce the flag SLAB_MINIMIZE_WASTE

2018-04-26 Thread Christopher Lameter
On Wed, 25 Apr 2018, Mikulas Patocka wrote: > Do you want this? It deletes slab_order and replaces it with the > "minimize_waste" logic directly. Well yes that looks better. Now we need to make it easy to read and less complicated. Maybe try to keep as much as possible of the old code and also th

Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

2018-04-27 Thread Christopher Lameter
On Fri, 27 Apr 2018, Michal Hocko wrote: > On Thu 26-04-18 22:35:56, Christoph Hellwig wrote: > > On Thu, Apr 26, 2018 at 09:54:06PM +, Luis R. Rodriguez wrote: > > > In practice if you don't have a floppy device on x86, you don't need > > > ZONE_DMA, > > > > I call BS on that, and you actual
