On Tue, 15 Aug 2017, Mike Galbraith wrote:
> On Tue, 2017-08-15 at 10:52 -0500, Christopher Lameter wrote:
> > On Tue, 15 Aug 2017, Paul E. McKenney wrote:
> >
> > > Don't the HPC guys just disable idle_balance(), or am I out of date again?
> >
> > Ummm..
On Mon, 16 Oct 2017, Michal Hocko wrote:
> But putting that aside. Pinning a lot of memory might cause many
> performance issues and misbehavior. There are still kernel users
> who need high order memory to work properly. On top of that you are
> basically allowing an untrusted user to deplete hig
On Mon, 16 Oct 2017, Michal Hocko wrote:
> > So I mmap(MAP_CONTIG) 1GB of working memory, prefer some data
> > structures there, maybe receive from network, then decide to write
> > some and not write some other.
>
> Why would you want this?
Because we are receiving a 1GB block of data an
On Mon, 16 Oct 2017, Michal Hocko wrote:
> On Mon 16-10-17 11:02:24, Christopher Lameter wrote:
> > On Mon, 16 Oct 2017, Michal Hocko wrote:
> >
> > > > So I mmap(MAP_CONTIG) 1GB of working memory, prefer some data
> > > > structures there, maybe receive from network, then decide to write
>
On Mon, 16 Oct 2017, Michal Hocko wrote:
> > We already have that issue and have ways to control that by tracking
> > pinned and mlocked pages as well as limits on their allocations.
>
> Ohh, it is very different because mlock limit is really small (64kB)
> which is not even close to what this is
On Mon, 6 Nov 2017, Vlastimil Babka wrote:
> I'm not sure what exactly is the EPERM intention. Should really the
> capability of THIS process override the cpuset restriction of the TARGET
> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
CAP_SYS_NICE never overrides cpuse
On Fri, 3 Nov 2017, Chris Metcalf wrote:
> However, it doesn't seem possible to do the synchronous cancellation of
> the vmstat deferred work with irqs disabled, though if there's a way,
> it would be a little cleaner to do that; Christoph? We can certainly
> update the statistics with interrupts
On Thu, 26 Oct 2017, Yang Shi wrote:
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 935c4d4..e21b81e 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2050,6 +2050,31 @@ extern int __meminit __early_pfn_to_nid(unsigned long
> pfn,
> static inline void zero_resv_u
On Fri, 27 Oct 2017, Peter Zijlstra wrote:
> I _strongly_ object to this statement, isolcpus is _not_ the preferred
> way, cpusets are.
>
> And yes, while cpusets suffers some problems, we _should_ really fix
> those and not promote this piece of shit isolcpus crap.
Well low level control at the
On Mon, 30 Oct 2017, Peter Zijlstra wrote:
> > isolcpus is the *right* approach here because you are micromanaging the OS
> > and are putting dedicated pieces of software on each core.
>
> That is what you want, and cpusets should allow for that just fine.
Well yes a cpuset of one processor I gue
On Wed, 1 Aug 2018, Jeremy Linton wrote:
> diff --git a/mm/slub.c b/mm/slub.c
> index 51258eff4178..e03719bac1e2 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2519,6 +2519,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t
> gfpflags, int node,
> if (unlikely(!node_match
On Mon, 6 Aug 2018, Dennis Zhou wrote:
> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> index 2fb04846ed11..ddd5249692e9 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -7,6 +7,7 @@
> #include
> #include
> #include
> +#include
> #include
> #include
> #include
> @
On Wed, 20 Jun 2018, Shakeel Butt wrote:
> For !CONFIG_SLUB_DEBUG, SLUB does not maintain the number of slabs
> allocated per node for a kmem_cache. Thus, slabs_node() in
> __kmem_cache_empty(), __kmem_cache_shrink() and __kmem_cache_destroy()
> will always return 0 for such config. This is wrong
On Mon, 13 Aug 2018, Matthew Wilcox wrote:
> Please consider pulling the XArray patch set. The XArray provides an
> improved interface to the radix tree data structure, providing locking
> as part of the API, specifying GFP flags at allocation time, eliminating
> preloading, less re-walking the t
On Fri, 24 Aug 2018, Vlastimil Babka wrote:
>
> I think you can just post those for review and say that they apply on
> top of xarray git? Maybe also with your own git URL with those applied
> for easier access? I'm curious but also sceptical that something so
> major would get picked up to mmotm
On Wed, 18 Jul 2018, Vlastimil Babka wrote:
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -295,12 +295,28 @@ static inline void __check_heap_object(const void *ptr,
> unsigned long n,
> #define SLAB_OBJ_MIN_SIZE (KMALLOC_MIN_SIZE < 16 ? \
> (
On Wed, 18 Jul 2018, Vlastimil Babka wrote:
> index 4299c59353a1..d89e934e0d8b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -296,11 +296,12 @@ static inline void __check_heap_object(const void *ptr,
> unsigned long n,
> (KMALLOC_MIN_SIZE) :
On Wed, 18 Jul 2018, Vlastimil Babka wrote:
> In SLAB, OFF_SLAB caches allocate management structures (currently just the
> freelist) from kmalloc caches when placement in a slab page together with
> objects would lead to suboptimal memory usage. For SLAB_RECLAIM_ACCOUNT
> caches,
> we can alloca
Acked-by: Christoph Lameter
On Wed, 18 Jul 2018, Vlastimil Babka wrote:
> +static const char *
> +kmalloc_cache_name(const char *prefix, unsigned int size)
> +{
> +
> + static const char units[3] = "\0kM";
> + int idx = 0;
> +
> + while (size >= 1024 && (size % 1024 == 0)) {
> + size /= 1024;
> +
On Tue, 9 Jan 2018, Kees Cook wrote:
> +struct kmem_cache *kmem_cache_create_usercopy(const char *name,
> + size_t size, size_t align, slab_flags_t flags,
> + size_t useroffset, size_t usersize,
> + void (*ctor)(void *));
Hmmm... At some
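For reference, a caller of this interface whitelists just the part of an object that is legitimately copied to/from user space. A minimal sketch (struct, field and cache names are made up):

	struct foo {
		spinlock_t lock;
		char data[64];	/* only this region may be copied to/from user space */
	};

	static struct kmem_cache *foo_cachep;

	static int __init foo_init(void)
	{
		foo_cachep = kmem_cache_create_usercopy("foo", sizeof(struct foo),
				0, SLAB_HWCACHE_ALIGN,
				offsetof(struct foo, data),		/* useroffset */
				sizeof(((struct foo *)0)->data),	/* usersize */
				NULL);
		return foo_cachep ? 0 : -ENOMEM;
	}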
On Tue, 9 Jan 2018, Kees Cook wrote:
> @@ -3823,11 +3825,9 @@ int __check_heap_object(const void *ptr, unsigned long
> n, struct page *page,
Could we do the check in mm/slab_common.c for all allocators and just have
a small function in each allocator that gives you the metadata needed for
the ob
On Wed, 10 Jan 2018, Kees Cook wrote:
> diff --git a/mm/slab.h b/mm/slab.h
> index ad657ffa44e5..7d29e69ac310 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -526,4 +526,10 @@ static inline int cache_random_seq_create(struct
> kmem_cache *cachep,
> static inline void cache_random_seq_destroy(str
On Fri, 12 Jan 2018, David Laight wrote:
> > Hmmm... At some point we should switch kmem_cache_create to pass a struct
> > containing all the parameters. Otherwise the API will blow up with
> > additional functions.
>
> Or add an extra function to 'configure' the kmem_cache with the
> extra parame
On Tue, 30 Jan 2018, Igor Stoppa wrote:
> @@ -1769,6 +1774,9 @@ void *__vmalloc_node_range(unsigned long size, unsigned
> long align,
>
> kmemleak_vmalloc(area, size, gfp_mask);
>
> + for (page_counter = 0; page_counter < area->nr_pages; page_counter++)
> + area->pages[page_
On Thu, 18 Jan 2018, Henry Willard wrote:
> If MPOL_MF_LAZY were allowed and specified things would not work
> correctly. change_pte_range() is unaware of and can’t honor the
> difference between MPOL_MF_MOVE_ALL and MPOL_MF_MOVE.
Not sure how that relates to what I said earlier... Sorry.
>
> Fo
On Fri, 5 Jan 2018, Michal Hocko wrote:
> Yes. I am really wondering because there shouldn't be anything specific to
> improve the situation with patch 2 and 3. Likewise the only overhead
> from the patch 1 I can see is the reduced batching of the mmap_sem. But
> then I am wondering what would compens
On Fri, 5 Jan 2018, Michal Hocko wrote:
> I believe there should be some cap on the number of pages. We shouldn't
> keep it held for million of pages if all of them are moved to the same
> node. I would really like to postpone that to later unless it causes
> some noticeable regressions because th
On Fri, 5 Jan 2018, Michal Hocko wrote:
> > Also why are you migrating the pages on the pagelist if
> > add_page_for_migration() fails? One could simply update
> > the status in user space and continue.
>
> I am open to further cleanups. Care to send a full patch with the
> changelog? I would rather
On Tue, 16 Jan 2018, Matthew Wilcox wrote:
> I think that's a good thing! /proc/slabinfo really starts to get grotty
> above 16 bytes. I'd like to chop off "_cache" from the name of every
> single slab! If ext4_allocation_context has to become ext4_alloc_ctx,
> I don't think we're going to lose
On Tue, 16 Jan 2018, Matthew Wilcox wrote:
> > Sure this data is never changed. It can be const.
>
> It's changed at initialisation. Look:
>
> kmem_cache_create(const char *name, size_t size, size_t align,
> slab_flags_t flags, void (*ctor)(void *))
> s = create_cache(ca
Draft patch of how the data structs could change. kmem_cache_attr is read
only.
Index: linux/include/linux/slab.h
===
--- linux.orig/include/linux/slab.h
+++ linux/include/linux/slab.h
@@ -135,9 +135,17 @@ struct mem_cgroup;
void __i
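The draft itself is cut off above; purely as a sketch of the direction being discussed (names here are hypothetical, not the actual patch), a read-only attribute struct lets new parameters be added without multiplying kmem_cache_create_*() variants:

	/* Hypothetical sketch, not the draft patch quoted above. */
	struct kmem_cache_attr {
		const char *name;
		unsigned int size;
		unsigned int align;
		slab_flags_t flags;
		unsigned int useroffset;	/* Kees' usercopy window */
		unsigned int usersize;
		void (*ctor)(void *);
	};

	struct kmem_cache *kmem_cache_create_attr(const struct kmem_cache_attr *attr);

	/* Callers pass a const descriptor, so the attributes stay read only: */
	static const struct kmem_cache_attr foo_attr = {
		.name	= "foo",
		.size	= 128,
		.align	= 0,
		.flags	= SLAB_HWCACHE_ALIGN,
	};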
On Tue, 16 Jan 2018, Matthew Wilcox wrote:
> On Tue, Jan 16, 2018 at 12:17:01PM -0600, Christopher Lameter wrote:
> > Draft patch of how the data structs could change. kmem_cache_attr is read
> > only.
>
> Looks good. Although I would add Kees' user feature:
Sure I t
On Tue, 16 Jan 2018, Mike Galbraith wrote:
> > I tried to remove isolcpus or at least change the way it works so that its
> > effects are reversible (ie: affine the init task instead of isolating
> > domains)
> > but that got nacked due to the behaviour's expectations for userspace.
>
> So we pai
On Wed, 17 Jan 2018, Mike Galbraith wrote:
> Domain connectivity very much is a property of a set of CPUs, a rather
> important one, and one managed by cpusets. NOHZ_FULL is a property of
> a set of cpus, thus a most excellent fit. Other things are as well.
Not sure what domain refers to in
On Tue, 16 Jan 2018, Mel Gorman wrote:
> My main source of discomfort is the fact that this is permanent as two
> processes perfectly isolated but with a suitably shared COW mapping
> will never migrate the data. A potential improvement to get the reported
> bandwidth up in the test program would
On Tue, 9 Jan 2018, Kees Cook wrote:
> -static void report_usercopy(unsigned long len, bool to_user, const char
> *type)
> +int report_usercopy(const char *name, const char *detail, bool to_user,
> + unsigned long offset, unsigned long len)
> {
> - pr_emerg("kernel memory %s
On Sun, 14 Jan 2018, Matthew Wilcox wrote:
> > Hmmm... At some point we should switch kmem_cache_create to pass a struct
> > containing all the parameters. Otherwise the API will blow up with
> > additional functions.
>
> Obviously I agree with you. I'm inclined to not let that delay Kees'
> patc
Acked-by: Christoph Lameter
On Wed, 7 Mar 2018, Chintan Pandya wrote:
> In this case, object got freed later but 'age' shows
> otherwise. This could be because, while printing
> this info, we print allocation traces first and
> free traces thereafter. In between, if we get scheduled
> out, (jiffies - t->when) could become mea
On Wed, 11 Apr 2018, Pekka Enberg wrote:
> Acked-by: Pekka Enberg
Good to hear from you again.
Acked-by: Christoph Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote:
> diff --git a/mm/slab.h b/mm/slab.h
> index 3cd4677953c6..896818c7b30a 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -515,6 +515,13 @@ static inline void dump_unreclaimable_slab(void)
>
> void ___cache_free(struct kmem_cache *cache, void *x, unsigned
On Wed, 11 Apr 2018, Matthew Wilcox wrote:
> > > slab_post_alloc_hook(s, gfpflags, 1, &object);
> >
> > Please put this in a code path that is enabled by specifying
> >
> > slub_debug
> >
> > on the kernel command line.
>
> I don't understand. First, I had:
>
> if (unlikely(gfpflags & __G
On Wed, 11 Apr 2018, Matthew Wilcox wrote:
>
> I don't see how that works ... can you explain a little more?
>
> I see ___slab_alloc() is called from __slab_alloc(). And I see
> slab_alloc_node does this:
>
> object = c->freelist;
> page = c->page;
> if (unlikely(!object |
On Thu, 12 Apr 2018, Matthew Wilcox wrote:
> > Thus the next invocation of the fastpath will find that c->freelist is
> > NULL and go to the slowpath. ...
>
> _ah_. I hadn't figured out that c->page was always NULL in the debugging
> case too, so ___slab_alloc() always hits the 'new_slab' case.
On Fri, 23 Mar 2018, Mikulas Patocka wrote:
> Since the commit db265eca7700 ("mm/sl[aou]b: Move duping of slab name to
> slab_common.c"), the kernel always duplicates the slab cache name when
> creating a slab cache, so the test if the slab name is accessible is
> useless.
Acked-by: Christoph Lameter
On Tue, 10 Apr 2018, Vlastimil Babka wrote:
> cache_reap() is initially scheduled in start_cpu_timer() via
> schedule_delayed_work_on(). But then the next iterations are scheduled via
> schedule_delayed_work(), thus using WORK_CPU_UNBOUND.
That is a bug. cache_reap must run on the same cpu since
On Tue, 10 Apr 2018, Matthew Wilcox wrote:
> __GFP_ZERO requests that the object be initialised to all-zeroes,
> while the purpose of a constructor is to initialise an object to a
> particular pattern. We cannot do both. Add a warning to catch any
> users who mistakenly pass a __GFP_ZERO flag wh
On Tue, 10 Apr 2018, Christopher Lameter wrote:
> On Tue, 10 Apr 2018, Matthew Wilcox wrote:
>
> > __GFP_ZERO requests that the object be initialised to all-zeroes,
> > while the purpose of a constructor is to initialise an object to a
> > particular pattern. We cannot do
On Tue, 10 Apr 2018, Matthew Wilcox wrote:
> Are you willing to have this kind of bug go uncaught for a while?
There will be frequent allocations and this will show up at some point.
Also you could put this into the debug only portions somewhere so we
always catch it when debugging is on,
On Tue, 10 Apr 2018, Matthew Wilcox wrote:
> If we want to get rid of the concept of constructors, it's doable,
> but somebody needs to do the work to show what the effects will be.
How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches?
Those must have a defined state of the objec
On Tue, 10 Apr 2018, Matthew Wilcox wrote:
> > How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches?
> > Those must have a defined state of the objects at all times and a
> > constructor is
> > required for that. And their use of RCU is required for numerous lockless
> > lookup a
On Tue, 10 Apr 2018, Matthew Wilcox wrote:
> > Objects can be freed and reused and still be accessed from code that
> > thinks the object is the old and not the new object
>
> Yes, I know, that's the point of RCU typesafety. My point is that an
> object *which has never been used* can't be ac
On Thu, 8 Mar 2018, Chintan Pandya wrote:
> > If you print the raw value, then you can do the subtraction yourself;
> > if you've subtracted it from jiffies each time, you've at least introduced
> > jitter, and possibly enough jitter to confuse and mislead.
> >
> This is exactly what I was thinkin
On Thu, 8 Mar 2018, Chintan Pandya wrote:
> In this case, object got freed later but 'age'
> shows otherwise. This could be because, while
> printing this info, we print allocation traces
> first and free traces thereafter. In between,
> if we get scheduled out or jiffies increment,
> (jiffies - t-
On Tue, 13 Mar 2018, Shakeel Butt wrote:
> However for SLUB in debug kernel, the sizes were same. On further
> inspection it is found that SLUB always use kmem_cache.object_size to
> measure the kmem_cache.size while SLAB use the given kmem_cache.size. In
> the debug kernel the slab's size can be
On Thu, 27 Sep 2018, Dmitry Vyukov wrote:
> On Tue, Sep 25, 2018 at 4:04 PM, Christopher Lameter wrote:
> > On Tue, 25 Sep 2018, Dmitry Vyukov wrote:
> >
> >> Assuming that the size is large enough to fail in all allocators, is
> >> this warning still
On Thu, 27 Sep 2018, Dmitry Vyukov wrote:
> On Thu, Sep 27, 2018 at 4:16 PM, Christopher Lameter wrote:
> > On Thu, 27 Sep 2018, Dmitry Vyukov wrote:
> >
> >> On Tue, Sep 25, 2018 at 4:04 PM, Christopher Lameter
> >> wrote:
> >>
On Thu, 27 Sep 2018, zhong jiang wrote:
> From: Alexey Dobriyan
>
> /*
> * cpu_partial determined the maximum number of objects
> * kept in the per cpu partial lists of a processor.
> */
>
> Can't be negative.
True.
> I hit a real issue that it will result in
On Thu, 27 Sep 2018, Dmitry Vyukov wrote:
> > Please post on the mailing list
>
> It is on the mailing lists:
> https://lkml.org/lkml/2018/9/27/802
Ok then lets continue the discussion there.
On Thu, 27 Sep 2018, Dmitry Vyukov wrote:
> From: Dmitry Vyukov
>
> This warning does not seem to be useful. Most of the time it fires when
> allocation size depends on syscall arguments. We could add __GFP_NOWARN
> to these allocation sites, but having a warning only to suppress it
> does not ma
On Fri, 28 Sep 2018, Aaron Tomlin wrote:
> Extend the slub_debug syntax to "slub_debug=<flags>[,<slab name>]*", where
> <slab name> may contain an asterisk at the end. For example, the following would poison
> all kmalloc slabs:
Acked-by: Christoph Lameter
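The changelog's example is cut off above; an invocation of the extended syntax could look like this, poisoning (P) every cache whose name matches "kmalloc*":

	slub_debug=P,kmalloc*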
On Tue, 27 Nov 2018, Mike Rapoport wrote:
> > * @page: The page to wait for.
> > *
> > * The caller should hold a reference on @page. They expect the page to
> > * become unlocked relatively soon, but do not wish to hold up migration
> > * (for example) by holding the reference while waiting
On Mon, 19 Nov 2018, Jerome Glisse wrote:
> > IIRC this is solved in IB by automatically calling
> > madvise(MADV_DONTFORK) before creating the MR.
> >
> > MADV_DONTFORK
> > .. This is useful to prevent copy-on-write semantics from changing the
> > physical location of a page if the parent wri
On Fri, 16 Nov 2018, Masahiro Yamada wrote:
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374..d395c73 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type
> kmalloc_type(gfp_t flags)
>
On Mon, 2 Jul 2018, John Hubbard wrote:
> >
> > These two are just wrong. You cannot make any page reference for
> > PageDmaPinned() account against a pin count. First, it is just conceptually
> > wrong as these references need not be long term pins, second, you can
> > easily race like:
> >
> > P
On Mon, 2 Jul 2018, Mathieu Desnoyers wrote:
> Are there any kind of guarantees that a __u64 update on a 32-bit architecture
> won't be torn into something daft like byte-per-byte stores when performed
> from C code ?
>
> I don't worry whether the upper bits get updated or how, but I really care
>
On Mon, 2 Jul 2018, Mathieu Desnoyers wrote:
> >
> > Platforms with 32 bit word size only guarantee atomicity of a 32 bit
> > write or RMW instruction.
> >
> > Special instructions may exist on a platform to perform 64 bit atomic
> > updates. We use cmpxchg64 f.e. on Intel 32 bit platforms to guar
On Sun, 8 Jul 2018, syzbot wrote:
> kernel BUG at mm/slab.c:4421!
Classic location that indicates memory corruption. Can we rerun this with
CONFIG_SLAB_DEBUG? Alternatively use SLUB debugging for better debugging
without rebuilding.
On Thu, 24 May 2018, Huang, Ying wrote:
> If the cache contention is heavy when copying the huge page, and we
> copy the huge page from the beginning to the end, it is possible that the
> beginning of the huge page is evicted from the cache after we finish
> copying the end of the huge page. And it is pos
On Thu, 24 May 2018, Vlastimil Babka wrote:
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 9ebe659bd4a5..5bff0571b360 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -296,11 +296,16 @@ static inline void __check_heap_object(const void *ptr,
> unsigned lon
On Thu, 24 May 2018, Vlastimil Babka wrote:
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 32699b2dc52a..4343948f33e5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -180,7 +180,7 @@ enum node_stat_item {
> NR_VMSCAN_IMMEDIATE,/* Prioriti
On Mon, 14 May 2018, William Kucharski wrote:
> The idea is that the kernel will attempt to allocate and map the range using a
> PMD sized THP page upon first fault; if the allocation is successful the page
> will be populated (at present using a call to kernel_read()) and the page will
> be mappe
On Mon, 7 May 2018, Johannes Weiner wrote:
> What to make of this number? If CPU utilization is at 100% and CPU
> pressure is 0, it means the system is perfectly utilized, with one
> runnable thread per CPU and nobody waiting. At two or more runnable
> tasks per CPU, the system is 100% overcommitt
On Mon, 14 May 2018, Johannes Weiner wrote:
> Since I'm using the same model and infrastructure for memory and IO
> load as well, IMO it makes more sense to present them in a coherent
> interface instead of trying to retrofit and change the loadavg file,
> which might not even be possible.
Well I
On Sat, 16 Jun 2018, john.hubb...@gmail.com wrote:
> I've come up with what I claim is a simple, robust fix, but...I'm
> presuming to burn a struct page flag, and limit it to 64-bit arches, in
> order to get there. Given that the problem is old (Jason Gunthorpe noted
> that RDMA has been living wi
On Thu, 31 May 2018, Jia-Ju Bai wrote:
> I write a static analysis tool (DSAC), and it finds that kfree() can sleep.
That should not happen.
> Here is the call path for kfree().
> Please look at it *from the bottom up*.
>
> [FUNC] alloc_pages(GFP_KERNEL)
> arch/x86/mm/pageattr.c, 756: alloc_page
On Thu, 31 May 2018, Matthew Wilcox wrote:
> On Thu, May 31, 2018 at 09:10:07PM +0800, Jia-Ju Bai wrote:
> > I write a static analysis tool (DSAC), and it finds that kfree() can sleep.
> >
> > Here is the call path for kfree().
> > Please look at it *from the bottom up*.
> >
> > [FUNC] alloc_pages
On Thu, 31 May 2018, Matthew Wilcox wrote:
> > Freeing a page in the page allocator also was traditionally not sleeping.
> > That has changed?
>
> No. "Your bug" being "The bug in your static analysis tool". It probably
> isn't following the data flow correctly (or deeply enough).
Well ok this
On Mon, 2 Jul 2018, John Hubbard wrote:
> > If you establish a reference to a page then increase the page count. If
> > the reference is a dma pin action also then increase the pinned count.
> >
> > That way you know how many of the references to the page are dma
> > pins and you can correctly man
On Tue, 3 Jul 2018, John Hubbard wrote:
> The page->_refcount field is used normally, in addition to the
> dma_pinned_count.
> But the problem is that, unless the caller knows what kind of page it is,
> the page->dma_pinned_count cannot be looked at, because it is unioned with
> page->lru.prev.
On Wed, 4 Jul 2018, Jan Kara wrote:
> > So this seems unsolvable without having the caller specify that it knows the
> > page type, and that it is therefore safe to decrement
> > page->dma_pinned_count.
> > I was hoping I'd found a way, but clearly I haven't. :)
>
> Well, I think the misconceptio
On Tue, 15 May 2018, Boaz Harrosh wrote:
> > I don't think page tables work the way you think they work.
> >
> > + err = vm_insert_pfn_prot(zt->vma, zt_addr, pfn, prot);
> >
> > That doesn't just insert it into the local CPU's page table. Any CPU
> > which directly accesses or even
On Wed, 12 Dec 2018, Jerome Glisse wrote:
> On Thu, Dec 13, 2018 at 11:51:19AM +1100, Dave Chinner wrote:
> > > > > [O1] Avoid write back from a page still being written by either a
> > > > > device or some direct I/O or any other existing user of GUP.
> >
> > IOWs, you need to mark p
Slab defragmentation may occur:
1. Unconditionally when kmem_cache_shrink() is called on a slab cache by the
kernel.
2. Through the use of the slabinfo command.
3. Per node defrag conditionally when kmem_cache_defrag() is called
(can be called from reclaim code with
On Wed, 2 Jan 2019, Dmitry Vyukov wrote:
> Am I missing something or __alloc_alien_cache misses check for
> kmalloc_node result?
>
> static struct alien_cache *__alloc_alien_cache(int node, int entries,
> int batch, gfp_t gfp)
> {
> size_t me
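The function body is cut off above; the missing check being pointed out would look roughly like this (a sketch, not the patch that was eventually merged):

	alc = kmalloc_node(memsize, gfp, node);
	if (!alc)		/* kmalloc_node() can return NULL ... */
		return NULL;	/* ... so bail out before initializing it */
	init_arraycache(&alc->ac, entries, batch);
	spin_lock_init(&alc->lock);
	return alc;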
On Wed, 26 Dec 2018, Fengguang Wu wrote:
> Each CPU socket can have 1 DRAM and 1 PMEM node, we call them "peer nodes".
> Migration between DRAM and PMEM will by default happen between peer nodes.
Which one does numa_node_id() point to? I guess that is the DRAM node and
then we fall back to the PM
On Fri, 12 Oct 2018, Andrew Morton wrote:
> > If the amount of waste is the same at higher cachep->gfporder values,
> > there is no significant benefit to allocating higher order memory. There
> > will be fewer calls to the page allocator, but each call will require
> > zone->lock and finding the
On Fri, 12 Oct 2018, David Rientjes wrote:
> @@ -1803,6 +1804,20 @@ static size_t calculate_slab_order(struct kmem_cache
> *cachep,
>*/
> if (left_over * 8 <= (PAGE_SIZE << gfporder))
> break;
> +
> + /*
> + * If a highe
On Mon, 15 Oct 2018, David Rientjes wrote:
> On Mon, 15 Oct 2018, Christopher Lameter wrote:
>
> > > > If the amount of waste is the same at higher cachep->gfporder values,
> > > > there is no significant benefit to allocating higher order memory.
> > &
On Tue, 16 Oct 2018, Dmitry Torokhov wrote:
> On Thu, Sep 27, 2018 at 07:35:37AM -0700, Matthew Wilcox wrote:
> > On Mon, Sep 24, 2018 at 11:41:58AM -0700, Dmitry Torokhov wrote:
> > > > How large is the allocation? AFAICT nRequests larger than
> > > > KMALLOC_MAX_SIZE
> > > > are larger than the
On Wed, 17 Oct 2018, Vlastimil Babka wrote:
> I.e. the benefits vs drawbacks of higher order allocations for SLAB are
> out of scope here. It would be nice if somebody evaluated them, but the
> potential resulting change would be much larger than what concerns this
> patch. But it would arguably a
On Wed, 17 Oct 2018, Dmitry Torokhov wrote:
> >What is a "contact" here? Are we talking about SG segments?
>
> No, we are talking about maximum number of fingers a person can have. Devices
> don't usually track more than 10 distinct contacts on the touch surface at a
> time.
Ohh... Way off my u
On Mon, 21 May 2018, Andrew Morton wrote:
> The patch seems depressingly complex.
>
> And a bit underdocumented...
Maybe separate out the bits that rename refcount to alias_count?
> > + refcount_t refcount;
> > + int alias_count;
>
> The semantic meaning of these two? What locking protects
On Tue, 22 May 2018, Dave Hansen wrote:
> On 05/22/2018 09:05 AM, Boaz Harrosh wrote:
> > How can we implement "Private memory"?
>
> Per-cpu page tables would do it.
We already have that for the percpu subsystem. See alloc_percpu().
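A minimal illustration of the alloc_percpu() interface referred to (type and field names are made up):

	struct my_stats {
		unsigned long count;
	};

	static struct my_stats __percpu *stats;

	static int __init stats_init(void)
	{
		stats = alloc_percpu(struct my_stats);	/* one instance per CPU */
		return stats ? 0 : -ENOMEM;
	}

	static void bump(void)
	{
		get_cpu_ptr(stats)->count++;	/* touch only this CPU's copy */
		put_cpu_ptr(stats);
	}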
On Tue, 22 May 2018, Dave Hansen wrote:
> On 05/22/2018 09:46 AM, Christopher Lameter wrote:
> > On Tue, 22 May 2018, Dave Hansen wrote:
> >
> >> On 05/22/2018 09:05 AM, Boaz Harrosh wrote:
> >>> How can we implement "Private memory"?
> >> Pe
On Tue, 27 Mar 2018, Shakeel Butt wrote:
> The kasan quarantine is designed to delay freeing slab objects to catch
> use-after-free. The quarantine can be large (several percent of machine
> memory size). When kmem_caches are deleted related objects are flushed
> from the quarantine but this requi
On Wed, 25 Apr 2018, Mikulas Patocka wrote:
> >
> > Could you move that logic into slab_order()? It does something awfully
> > similar.
>
> But slab_order (and its caller) limits the order to "max_order" and we
> want more.
>
> Perhaps slab_order should be dropped and calculate_order totally
> rewr
On Wed, 25 Apr 2018, Mikulas Patocka wrote:
> Do you want this? It deletes slab_order and replaces it with the
> "minimize_waste" logic directly.
Well yes that looks better. Now we need to make it easy to read and less
complicated. Maybe try to keep as much as possible of the old code
and also th
On Fri, 27 Apr 2018, Michal Hocko wrote:
> On Thu 26-04-18 22:35:56, Christoph Hellwig wrote:
> > On Thu, Apr 26, 2018 at 09:54:06PM +, Luis R. Rodriguez wrote:
> > > In practice if you don't have a floppy device on x86, you don't need
> > > ZONE_DMA,
> >
> > I call BS on that, and you actual