Re: [PATCH 2/2] mm, util: account_locked_vm() does not hold mmap_lock

2020-08-02 Thread Pengfei Li
On Thu, 30 Jul 2020 16:57:05 -0400
Daniel Jordan  wrote:

> On Wed, Jul 29, 2020 at 12:21:11PM -0700, Hugh Dickins wrote:
> > On Sun, 26 Jul 2020, Pengfei Li wrote:
> > 
> > > Since mm->locked_vm is already an atomic counter,
> > > account_locked_vm() does not need to hold mmap_lock.
> > 
> > I am worried that this patch, already added to mmotm, along with its
> > 1/2 making locked_vm an atomic64, might be rushed into v5.9 with
> > just that two-line commit description, and no discussion at all.
> > 
> > locked_vm belongs fundamentally to mm/mlock.c, and the lock to guard
> > it is mmap_lock; and mlock() has some complicated stuff to do under
> > that lock while it decides how to adjust locked_vm.
> >
> > It is very easy to convert an unsigned long to an atomic64_t, but
> > "atomic read, check limit and do stuff, atomic add" does not give
> > the same guarantee as holding the right lock around it all.
> 
> Yes, this is why I withdrew my attempt to do something similar last
> year, I didn't want to make the accounting racy. Stack and heap
> growing and mremap would be affected in addition to mlock.
>
> It'd help to hear more about the motivation for this.
> 

Thanks for your comments.

My motivation is to allow mm-related counters to be read and written
safely without holding mmap_lock. But sorry, I didn't do it well.

-- 
Pengfei


Re: [PATCH 2/2] mm, util: account_locked_vm() does not hold mmap_lock

2020-08-02 Thread Pengfei Li
On Wed, 29 Jul 2020 12:21:11 -0700 (PDT)
Hugh Dickins  wrote:

Sorry for the late reply.

> On Sun, 26 Jul 2020, Pengfei Li wrote:
> 
> > Since mm->locked_vm is already an atomic counter,
> > account_locked_vm() does not need to hold mmap_lock.  
> 
> I am worried that this patch, already added to mmotm, along with its
> 1/2 making locked_vm an atomic64, might be rushed into v5.9 with just
> that two-line commit description, and no discussion at all.
> 
> locked_vm belongs fundamentally to mm/mlock.c, and the lock to guard
> it is mmap_lock; and mlock() has some complicated stuff to do under
> that lock while it decides how to adjust locked_vm.
> 
> It is very easy to convert an unsigned long to an atomic64_t, but
> "atomic read, check limit and do stuff, atomic add" does not give
> the same guarantee as holding the right lock around it all.
> 
> (At the very least, __account_locked_vm() in 1/2 should be changed to
> replace its atomic64_add by an atomic64_cmpxchg, to enforce the limit
> that it just checked.  But that will be no more than lipstick on a
> pig, when the right lock that everyone else agrees upon is not being
> held.)
> 

Thank you for your detailed comment.

You are right, I should use atomic64_cmpxchg to enforce the
RLIMIT_MEMLOCK limit.
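
Roughly, I am thinking of something like the following (an untested
sketch only, not an actual patch; the helper name, the limit parameter
and the loop structure are just for illustration):

	/*
	 * Sketch only: retry with atomic64_cmpxchg() so that the
	 * RLIMIT_MEMLOCK check and the counter update act on the same
	 * observed value.
	 */
	static int __account_locked_vm_sketch(struct mm_struct *mm,
					      unsigned long pages,
					      unsigned long limit,
					      bool bypass_rlim)
	{
		s64 locked_vm, new_locked_vm, old;

		locked_vm = atomic64_read(&mm->locked_vm);
		for (;;) {
			new_locked_vm = locked_vm + pages;
			if (!bypass_rlim && new_locked_vm > (s64)limit)
				return -ENOMEM;

			old = atomic64_cmpxchg(&mm->locked_vm, locked_vm,
					       new_locked_vm);
			if (old == locked_vm)
				return 0;
			/* lost a race; retry with the value we observed */
			locked_vm = old;
		}
	}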

> Now, it can be argued that our locked_vm and pinned_vm maintenance
> is so random and deficient, and too difficult to keep right across
> a sprawl of drivers, that we should just be grateful for those that
> do volunteer to subject themselves to RLIMIT_MEMLOCK limitation,
> and never mind if it's a little racy.
> 
> And it may well be that all those who have made considerable efforts
> in the past to improve the situation, have more interesting things to
> devote their time to, and would prefer not to get dragged back here.
> 
> But let's at least give this a little more visibility, and hope
> to hear opinions one way or the other from those who care.


Thank you. My patch should have been more thoughtful.

I will send an email to Stephen soon asking to remove these two patches
from -mm tree.

-- 
Pengfei


[PATCH 2/2] mm, util: account_locked_vm() does not hold mmap_lock

2020-07-26 Thread Pengfei Li
Since mm->locked_vm is already an atomic counter, account_locked_vm()
does not need to hold mmap_lock.

Signed-off-by: Pengfei Li 
---
 drivers/vfio/vfio_iommu_type1.c |  8 ++--
 mm/util.c   | 15 +++
 2 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 78013be07fe7..53818fce78a6 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -376,12 +376,8 @@ static int vfio_lock_acct(struct vfio_dma *dma, long 
npage, bool async)
if (!mm)
return -ESRCH; /* process exited */
 
-   ret = mmap_write_lock_killable(mm);
-   if (!ret) {
-   ret = __account_locked_vm(mm, abs(npage), npage > 0, dma->task,
- dma->lock_cap);
-   mmap_write_unlock(mm);
-   }
+   ret = __account_locked_vm(mm, abs(npage), npage > 0,
+   dma->task, dma->lock_cap);
 
if (async)
mmput(mm);
diff --git a/mm/util.c b/mm/util.c
index 473add0dc275..320fdd537aea 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -424,8 +424,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct 
rlimit *rlim_stack)
  * @task:task used to check RLIMIT_MEMLOCK
  * @bypass_rlim: %true if checking RLIMIT_MEMLOCK should be skipped
  *
- * Assumes @task and @mm are valid (i.e. at least one reference on each), and
- * that mmap_lock is held as writer.
+ * Assumes @task and @mm are valid (i.e. at least one reference on each).
  *
  * Return:
  * * 0   on success
@@ -437,8 +436,6 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long 
pages, bool inc,
unsigned long locked_vm, limit;
int ret = 0;
 
-   mmap_assert_write_locked(mm);
-
locked_vm = atomic64_read(&mm->locked_vm);
if (inc) {
if (!bypass_rlim) {
@@ -476,17 +473,11 @@ EXPORT_SYMBOL_GPL(__account_locked_vm);
  */
 int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc)
 {
-   int ret;
-
if (pages == 0 || !mm)
return 0;
 
-   mmap_write_lock(mm);
-   ret = __account_locked_vm(mm, pages, inc, current,
- capable(CAP_IPC_LOCK));
-   mmap_write_unlock(mm);
-
-   return ret;
+   return __account_locked_vm(mm, pages, inc,
+   current, capable(CAP_IPC_LOCK));
 }
 EXPORT_SYMBOL_GPL(account_locked_vm);
 
-- 
2.26.2



[PATCH 1/2] mm: make mm->locked_vm an atomic64 counter

2020-07-26 Thread Pengfei Li
Like commit 70f8a3ca68d3 ("mm: make mm->pinned_vm an atomic64 counter").

By making mm->locked_vm an atomic64 counter, we can safely modify it
without holding mmap_lock.

The reason for using atomic64 instead of atomic_long is to stay
consistent with mm->pinned_vm; with a 64-bit counter there is also no
need to worry about overflow.

Signed-off-by: Pengfei Li 
---
 drivers/infiniband/sw/siw/siw_verbs.c | 12 +++-
 drivers/vfio/vfio_iommu_type1.c   |  6 --
 fs/io_uring.c |  4 ++--
 fs/proc/task_mmu.c|  2 +-
 include/linux/mm_types.h  |  4 ++--
 kernel/fork.c |  2 +-
 mm/debug.c|  5 +++--
 mm/mlock.c|  4 ++--
 mm/mmap.c | 18 +-
 mm/mremap.c   |  6 +++---
 mm/util.c |  6 +++---
 11 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_verbs.c 
b/drivers/infiniband/sw/siw/siw_verbs.c
index adafa1b8bebe..bf78d7988442 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -1293,14 +1293,16 @@ struct ib_mr *siw_reg_user_mr(struct ib_pd *pd, u64 
start, u64 len,
goto err_out;
}
if (mem_limit != RLIM_INFINITY) {
-   unsigned long num_pages =
-   (PAGE_ALIGN(len + (start & ~PAGE_MASK))) >> PAGE_SHIFT;
+   unsigned long num_pages, locked_pages;
+
+   num_pages = (PAGE_ALIGN(len + (start & ~PAGE_MASK)))
+   >> PAGE_SHIFT;
+   locked_pages = atomic64_read(&current->mm->locked_vm);
mem_limit >>= PAGE_SHIFT;
 
-   if (num_pages > mem_limit - current->mm->locked_vm) {
+   if (num_pages > mem_limit - locked_pages) {
siw_dbg_pd(pd, "pages req %lu, max %lu, lock %lu\n",
-  num_pages, mem_limit,
-  current->mm->locked_vm);
+  num_pages, mem_limit, locked_pages);
rv = -ENOMEM;
goto err_out;
}
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 9d41105bfd01..78013be07fe7 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -509,7 +509,8 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, 
unsigned long vaddr,
 * pages are already counted against the user.
 */
if (!rsvd && !vfio_find_vpfn(dma, iova)) {
-   if (!dma->lock_cap && current->mm->locked_vm + 1 > limit) {
+   if (!dma->lock_cap &&
+   atomic64_read(&current->mm->locked_vm) + 1 > limit) {
put_pfn(*pfn_base, dma->prot);
pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n", __func__,
limit << PAGE_SHIFT);
@@ -536,7 +537,8 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, 
unsigned long vaddr,
 
if (!rsvd && !vfio_find_vpfn(dma, iova)) {
if (!dma->lock_cap &&
-   current->mm->locked_vm + lock_acct + 1 > limit) {
+   atomic64_read(&current->mm->locked_vm) +
+   lock_acct + 1 > limit) {
put_pfn(pfn, dma->prot);
pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n",
__func__, limit << PAGE_SHIFT);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7cf2f295fba7..f1241c6314e6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7371,7 +7371,7 @@ static void io_unaccount_mem(struct io_ring_ctx *ctx, 
unsigned long nr_pages,
 
if (ctx->sqo_mm) {
if (acct == ACCT_LOCKED)
-   ctx->sqo_mm->locked_vm -= nr_pages;
+   atomic64_sub(nr_pages, &ctx->sqo_mm->locked_vm);
else if (acct == ACCT_PINNED)
atomic64_sub(nr_pages, &ctx->sqo_mm->pinned_vm);
}
@@ -7390,7 +7390,7 @@ static int io_account_mem(struct io_ring_ctx *ctx, 
unsigned long nr_pages,
 
if (ctx->sqo_mm) {
if (acct == ACCT_LOCKED)
-   ctx->sqo_mm->locked_vm += nr_pages;
+   atomic64_add(nr_pages, &ctx->sqo_mm->locked_vm);
else if (acct == ACCT_PINNED)
atomic64_add(nr_pages, &ctx->sqo_mm->pinned_vm);
}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index df2f0f05f5ba..2af56e68766e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -58,

Re: [PATCH v6 0/3] mm, slab: Make kmalloc_info[] contain all types of names

2019-10-08 Thread Pengfei Li
On Thu, Oct 3, 2019 at 7:06 AM Andrew Morton  wrote:
>
> On Mon, 23 Sep 2019 20:27:25 +0800 Pengfei Li  wrote:
>
> > Changes in v6
> > --
> > 1. abandon patch 4-7 (Because there is not enough reason to explain
> > that they are beneficial)
>
> So http://lkml.kernel.org/r/20190923004022.GC15734@shao2-debian can no
> longer occur?
>

Sorry for such a late reply.

Yes, it was caused by [patch v5 5/7], so it can no longer occur in v6.

> > Changes in v5
> > --
> > 1. patch 1/7:
> > - rename SET_KMALLOC_SIZE to INIT_KMALLOC_INFO
> > 2. patch 5/7:
> > - fix build errors (Reported-by: kbuild test robot)
> > - make all_kmalloc_info[] static (Reported-by: kbuild test robot)
> > 3. patch 6/7:
> > - for robustness, determine kmalloc_cache is !NULL in
> >   new_kmalloc_cache()
> > 4. add ack tag from David Rientjes
> >
>
>
>


[PATCH v6 2/3] mm, slab: Remove unused kmalloc_size()

2019-09-23 Thread Pengfei Li
The size of each kmalloc cache can be obtained from kmalloc_info[],
so remove kmalloc_size(), which is no longer used.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
Acked-by: David Rientjes 
---
 include/linux/slab.h | 20 
 mm/slab.c|  5 +++--
 mm/slab_common.c |  5 ++---
 3 files changed, 5 insertions(+), 25 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 56c9c7e..e773e57 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -557,26 +557,6 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
return __kmalloc(size, flags);
 }
 
-/*
- * Determine size used for the nth kmalloc cache.
- * return size or 0 if a kmalloc cache for that
- * size does not exist
- */
-static __always_inline unsigned int kmalloc_size(unsigned int n)
-{
-#ifndef CONFIG_SLOB
-   if (n > 2)
-   return 1U << n;
-
-   if (n == 1 && KMALLOC_MIN_SIZE <= 32)
-   return 96;
-
-   if (n == 2 && KMALLOC_MIN_SIZE <= 64)
-   return 192;
-#endif
-   return 0;
-}
-
 static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 #ifndef CONFIG_SLOB
diff --git a/mm/slab.c b/mm/slab.c
index c42b621..7bc4e90 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,8 +1248,9 @@ void __init kmem_cache_init(void)
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
-   kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
-   0, kmalloc_size(INDEX_NODE));
+   kmalloc_info[INDEX_NODE].size,
+   ARCH_KMALLOC_FLAGS, 0,
+   kmalloc_info[INDEX_NODE].size);
slab_state = PARTIAL_NODE;
setup_kmalloc_cache_index_table();
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 4e78490..df030cf 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1239,11 +1239,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-   unsigned int size = kmalloc_size(i);
-
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
kmalloc_info[i].name[KMALLOC_DMA],
-   size, SLAB_CACHE_DMA | flags, 0, 0);
+   kmalloc_info[i].size,
+   SLAB_CACHE_DMA | flags, 0, 0);
}
}
 #endif
-- 
2.7.4



[PATCH v6 1/3] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-23 Thread Pengfei Li
There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

This patch predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

Besides, remove the kmalloc_cache_name() that is no longer used.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
Acked-by: David Rientjes 
---
 mm/slab.c|  2 +-
 mm/slab.h|  2 +-
 mm/slab_common.c | 91 ++--
 3 files changed, 51 insertions(+), 44 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9df3705..c42b621 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1247,7 +1247,7 @@ void __init kmem_cache_init(void)
 * structures first.  Without this, further allocations will bug.
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
-   kmalloc_info[INDEX_NODE].name,
+   kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
0, kmalloc_size(INDEX_NODE));
slab_state = PARTIAL_NODE;
diff --git a/mm/slab.h b/mm/slab.h
index 9057b80..2fc8f95 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -76,7 +76,7 @@ extern struct kmem_cache *kmem_cache;
 
 /* A table of kmalloc cache names and sizes */
 extern const struct kmalloc_info_struct {
-   const char *name;
+   const char *name[NR_KMALLOC_TYPES];
unsigned int size;
 } kmalloc_info[];
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 807490f..4e78490 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1092,26 +1092,56 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t 
flags)
return kmalloc_caches[kmalloc_type(flags)][index];
 }
 
+#ifdef CONFIG_ZONE_DMA
+#define INIT_KMALLOC_INFO(__size, __short_size)\
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .name[KMALLOC_DMA] = "dma-kmalloc-" #__short_size,  \
+   .size = __size, \
+}
+#else
+#define INIT_KMALLOC_INFO(__size, __short_size)\
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .size = __size, \
+}
+#endif
+
 /*
  * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
  * kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
  * kmalloc-67108864.
  */
 const struct kmalloc_info_struct kmalloc_info[] __initconst = {
-   {NULL,  0}, {"kmalloc-96", 96},
-   {"kmalloc-192",   192}, {"kmalloc-8",   8},
-   {"kmalloc-16", 16}, {"kmalloc-32", 32},
-   {"kmalloc-64", 64}, {"kmalloc-128",   128},
-   {"kmalloc-256",   256}, {"kmalloc-512",   512},
-   {"kmalloc-1k",   1024}, {"kmalloc-2k",   2048},
-   {"kmalloc-4k",   4096}, {"kmalloc-8k",   8192},
-   {"kmalloc-16k", 16384}, {"kmalloc-32k", 32768},
-   {"kmalloc-64k", 65536}, {"kmalloc-128k",   131072},
-   {"kmalloc-256k",   262144}, {"kmalloc-512k",   524288},
-   {"kmalloc-1M",1048576}, {"kmalloc-2M",2097152},
-   {"kmalloc-4M",4194304}, {"kmalloc-8M",8388608},
-   {"kmalloc-16M",  16777216}, {"kmalloc-32M",  33554432},
-   {"kmalloc-64M",  67108864}
+   INIT_KMALLOC_INFO(0, 0),
+   INIT_KMALLOC_INFO(96, 96),
+   INIT_KMALLOC_INFO(192, 192),
+   INIT_KMALLOC_INFO(8, 8),
+   INIT_KMALLOC_INFO(16, 16),
+   INIT_KMALLOC_INFO(32, 32),
+   INIT_KMALLOC_INFO(64, 64),
+   INIT_KMALLOC_INFO(128, 128),
+   INIT_KMALLOC_INFO(256, 256),
+   INIT_KMALLOC_INFO(512, 512),
+   INIT_KMALLOC_INFO(1024, 1k),
+   INIT_KMALLOC_INFO(2048, 2k),
+   INIT_KMALLOC_INFO(4096, 4k),
+   INIT_KMALLOC_INFO(8192, 8k),
+   INIT_KMALLOC_INFO(16384, 16k),
+   INIT_KMALLOC_INFO(32768, 32k

[PATCH v6 3/3] mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc caches

2019-09-23 Thread Pengfei Li
The type of local variable *type* of new_kmalloc_cache() should
be enum kmalloc_cache_type instead of int, so correct it.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
Acked-by: David Rientjes 
---
 mm/slab_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index df030cf..23054b8 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
 }
 
 static void __init
-new_kmalloc_cache(int idx, int type, slab_flags_t flags)
+new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
@@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t flags)
  */
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
-   int i, type;
+   int i;
+   enum kmalloc_cache_type type;
 
for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
-- 
2.7.4



[PATCH v6 0/3] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-23 Thread Pengfei Li
Changes in v6
--
1. abandon patch 4-7 (Because there is not enough reason to explain
that they are beneficial)

Changes in v5
--
1. patch 1/7:
- rename SET_KMALLOC_SIZE to INIT_KMALLOC_INFO
2. patch 5/7:
- fix build errors (Reported-by: kbuild test robot)
- make all_kmalloc_info[] static (Reported-by: kbuild test robot)
3. patch 6/7:
- for robustness, determine kmalloc_cache is !NULL in
  new_kmalloc_cache()
4. add ack tag from David Rientjes

Changes in v4
--
1. [old] abandon patch 4/4
2. [new] patch 4/7:
- return ZERO_SIZE_ALLOC instead 0 for zero sized requests
3. [new] patch 5/7:
- reorder kmalloc_info[], kmalloc_caches[] (in order of size)
- hard to split, so slightly larger
4. [new] patch 6/7:
- initialize kmalloc_cache[] with the same size but different
  types
5. [new] patch 7/7:
- modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]

Patch 4-7 are newly added, more information can be obtained from
commit messages.

Changes in v3
--
1. restore __initconst (patch 1/4)
2. rename patch 3/4
3. add more clarification for patch 4/4
4. add ack tag from Roman Gushchin

Changes in v2
--
1. remove __initconst (patch 1/5)
2. squash patch 2/5
3. add ack tag from Vlastimil Babka


There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

Patch1 predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

These changes make sense, and the time spent by new_kmalloc_cache()
has been reduced by approximately 36.3%.

 Time spent by new_kmalloc_cache()
  (CPU cycles)
5.3-rc7  66264
5.3-rc7+patch42188

Pengfei Li (3):
  mm, slab: Make kmalloc_info[] contain all types of names
  mm, slab: Remove unused kmalloc_size()
  mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc
caches

 include/linux/slab.h | 20 ---
 mm/slab.c|  7 ++--
 mm/slab.h|  2 +-
 mm/slab_common.c | 99 
 4 files changed, 58 insertions(+), 70 deletions(-)

-- 
2.7.4



Re: [RESEND v4 5/7] mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE

2019-09-17 Thread Pengfei Li
On Mon, Sep 16, 2019 at 5:38 AM David Rientjes  wrote:
>
> On Mon, 16 Sep 2019, Pengfei Li wrote:
>
> > Currently, kmalloc_cache[] is not sorted by size, kmalloc_cache[0]
> > is kmalloc-96, kmalloc_cache[1] is kmalloc-192 (when ARCH_DMA_MINALIGN
> > is not defined).
> >
> > As suggested by Vlastimil Babka,
> >
> > "Since you're doing these cleanups, have you considered reordering
> > kmalloc_info, size_index, kmalloc_index() etc so that sizes 96 and 192
> > are ordered naturally between 64, 128 and 256? That should remove
> > various special casing such as in create_kmalloc_caches(). I can't
> > guarantee it will be possible without breaking e.g. constant folding
> > optimizations etc., but seems to me it should be feasible. (There are
> > definitely more places to change than those I listed.)"
> >
> > So this patch reordered kmalloc_info[], kmalloc_caches[], and modified
> > kmalloc_index() and kmalloc_slab() accordingly.
> >
> > As a result, there is no subtle judgment about size in
> > create_kmalloc_caches(). And initialize kmalloc_cache[] from 0 instead
> > of KMALLOC_SHIFT_LOW.
> >
> > I used ./scripts/bloat-o-meter to measure the impact of this patch on
> > performance. The results show that it brings some benefits.
> >
> > Considering the size change of kmalloc_info[], the size of the code is
> > actually about 641 bytes less.
> >
>
> bloat-o-meter is reporting a net benefit of -241 bytes for this, so not
> sure about relevancy of the difference for only kmalloc_info.
>

Thanks for your comments.

The size of kmalloc_info has increased from 432 to 832 bytes (it was
renamed to all_kmalloc_info), i.e. +400 bytes of data. That +400 bytes
of data growth together with the net -241 bytes reported by
bloat-o-meter means the code itself actually shrinks by about 641 bytes.

> This, to me, looks like increased complexity for the statically allocated
> arrays vs the runtime complexity when initializing the caches themselves.

For runtime kmalloc requests, the implementation of kmalloc_slab() is
no different from before.
For constant-size kmalloc requests, the smaller .text is a win, since
the compiler constant-folds the size-to-index calculation.
Therefore, I don't think this patch adds complexity.
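
To illustrate what I mean by constant optimization (a hedged
illustration, not code from this series): because kmalloc() and
kmalloc_index() are __always_inline, a call with a compile-time
constant size is resolved at build time, e.g.

	void *buf;

	/*
	 * Illustration only: the size checks in kmalloc_index(64) fold
	 * away at compile time, so this call site compiles down to
	 * roughly
	 *
	 *	kmem_cache_alloc_trace(kmalloc_caches[KMALLOC_NORMAL][<const>],
	 *			       GFP_KERNEL, 64);
	 *
	 * i.e. only the constant index changes with the reordering; no
	 * extra work is done per allocation at runtime.
	 */
	buf = kmalloc(64, GFP_KERNEL);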

> Not sure that this is an improvement given that you still need to do
> things like
>
> +#if KMALLOC_SIZE_96_EXIST == 1
> +   if (size > 64 && size <= 96) return (7 - KMALLOC_IDX_ADJ_0);
> +#endif
> +
> +#if KMALLOC_SIZE_192_EXIST == 1
> +   if (size > 128 && size <= 192) return (8 - KMALLOC_IDX_ADJ_1);
> +#endif

kmalloc_index() was difficult for me to handle.

At first, I ordered the size checks in kmalloc_index() strictly by size,


/* Order 96, 192 */
static __always_inline unsigned int kmalloc_index(size_t size)
{
...
if (size <=8) return ( 3 - KMALLOC_IDX_ADJ_0);
if (size <=   16) return ( 4 - KMALLOC_IDX_ADJ_0);
if (size <=   32) return ( 5 - KMALLOC_IDX_ADJ_0);
if (size <=   64) return ( 6 - KMALLOC_IDX_ADJ_0);
#if KMALLOC_SIZE_96_EXIST == 1
if (size <=   96) return ( 7 - KMALLOC_IDX_ADJ_0);
#endif
if (size <=  128) return ( 7 - KMALLOC_IDX_ADJ_1);
#if KMALLOC_SIZE_192_EXIST == 1
if (size <=  192) return ( 8 - KMALLOC_IDX_ADJ_1);
#endif
if (size <=  256) return ( 8 - KMALLOC_IDX_ADJ_2);
...
}

but bloat-o-meter shows that I did a bad job.

$ ./scripts/bloat-o-meter vmlinux-base vmlinux-patch_1-5-order_96_192
add/remove: 3/7 grow/shrink: 129/167 up/down: 3691/-2530 (1161)
Function old new   delta
all_kmalloc_info   - 832+832
jhash7441119+375
__regmap_init   32523411+159
drm_mode_atomic_ioctl   23732479+106
apply_wqattrs_prepare449 531 +82
process_preds   17721851 +79
amd_uncore_cpu_up_prepare251 327 +76
property_entries_dup.part789 861 +72
pnp_register_port_resource98 167 +69
pnp_register_mem_resource 98 167 +69
pnp_register_irq_resource146 206 +60
pnp_register_dma_resource 61 121 +60
pcpu_get_vm_areas   30863139 +53
sr_probe13601409 +49
fl_create675 724 +49
ext4_expand_extra_isize_ea  22182265 +47
fib6_info_alloc   60 105 +45
init_worker_pool 247 291 +44
ctnetlink_alloc_filter.part

Re: [RFC PATCH] mm/slub: remove left-over debugging code

2019-09-17 Thread Pengfei Li
On Tue, Sep 17, 2019 at 2:32 AM David Rientjes  wrote:
>
> On Mon, 16 Sep 2019, Qian Cai wrote:
>
> > SLUB_RESILIENCY_TEST and SLUB_DEBUG_CMPXCHG look like some left-over
> > debugging code during the internal development that probably nobody uses
> > it anymore. Remove them to make the world greener.
>
> Adding Pengfei Li who has been working on a patchset for modified handling
> of kmalloc cache initialization and touches the resiliency test.
>

Thanks for looping me in.

My opinion is the same as David Rientjes':
the resiliency test should not be removed, but improved.

> I still find the resiliency test to be helpful/instructional for handling
> unexpected conditions in these caches, so I'd suggest against removing it:
> the only downside is that it's additional source code.  But it's helpful
> source code for reference.
>
> The cmpxchg failures could likely be more generalized beyond SLUB since
> there will be other dependencies in the kernel than just this allocator.
>
> (I assume you didn't send a Signed-off-by line because this is an RFC.)
>
> > ---
> >  mm/slub.c | 110 
> > --
> >  1 file changed, 110 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 8834563cdb4b..f97155ba097d 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -150,12 +150,6 @@ static inline bool kmem_cache_has_cpu_partial(struct 
> > kmem_cache *s)
> >   * - Variable sizing of the per node arrays
> >   */
> >
> > -/* Enable to test recovery from slab corruption on boot */
> > -#undef SLUB_RESILIENCY_TEST
> > -
> > -/* Enable to log cmpxchg failures */
> > -#undef SLUB_DEBUG_CMPXCHG
> > -
> >  /*
> >   * Mininum number of partial slabs. These will be left on the partial
> >   * lists even if they are empty. kmem_cache_shrink may reclaim them.
> > @@ -392,10 +386,6 @@ static inline bool __cmpxchg_double_slab(struct 
> > kmem_cache *s, struct page *page
> >   cpu_relax();
> >   stat(s, CMPXCHG_DOUBLE_FAIL);
> >
> > -#ifdef SLUB_DEBUG_CMPXCHG
> > - pr_info("%s %s: cmpxchg double redo ", n, s->name);
> > -#endif
> > -
> >   return false;
> >  }
> >
> > @@ -433,10 +423,6 @@ static inline bool cmpxchg_double_slab(struct 
> > kmem_cache *s, struct page *page,
> >   cpu_relax();
> >   stat(s, CMPXCHG_DOUBLE_FAIL);
> >
> > -#ifdef SLUB_DEBUG_CMPXCHG
> > - pr_info("%s %s: cmpxchg double redo ", n, s->name);
> > -#endif
> > -
> >   return false;
> >  }
> >
> > @@ -2004,45 +1990,11 @@ static inline unsigned long next_tid(unsigned long 
> > tid)
> >   return tid + TID_STEP;
> >  }
> >
> > -static inline unsigned int tid_to_cpu(unsigned long tid)
> > -{
> > - return tid % TID_STEP;
> > -}
> > -
> > -static inline unsigned long tid_to_event(unsigned long tid)
> > -{
> > - return tid / TID_STEP;
> > -}
> > -
> >  static inline unsigned int init_tid(int cpu)
> >  {
> >   return cpu;
> >  }
> >
> > -static inline void note_cmpxchg_failure(const char *n,
> > - const struct kmem_cache *s, unsigned long tid)
> > -{
> > -#ifdef SLUB_DEBUG_CMPXCHG
> > - unsigned long actual_tid = __this_cpu_read(s->cpu_slab->tid);
> > -
> > - pr_info("%s %s: cmpxchg redo ", n, s->name);
> > -
> > -#ifdef CONFIG_PREEMPT
> > - if (tid_to_cpu(tid) != tid_to_cpu(actual_tid))
> > - pr_warn("due to cpu change %d -> %d\n",
> > - tid_to_cpu(tid), tid_to_cpu(actual_tid));
> > - else
> > -#endif
> > - if (tid_to_event(tid) != tid_to_event(actual_tid))
> > - pr_warn("due to cpu running other code. Event %ld->%ld\n",
> > - tid_to_event(tid), tid_to_event(actual_tid));
> > - else
> > - pr_warn("for unknown reason: actual=%lx was=%lx target=%lx\n",
> > - actual_tid, tid, next_tid(tid));
> > -#endif
> > - stat(s, CMPXCHG_DOUBLE_CPU_FAIL);
> > -}
> > -
> >  static void init_kmem_cache_cpus(struct kmem_cache *s)
> >  {
> >   int cpu;
> > @@ -2751,7 +2703,6 @@ static __always_inline void *slab_alloc_node(struct 
> > kmem_cache *s,
> >   object, tid,
> >   next_object, next_tid(tid {
> >
> > - note_cmpxc

Re: [PATCH v5 0/7] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-17 Thread Pengfei Li
On Tue, Sep 17, 2019 at 12:04 AM Christopher Lameter  wrote:
>
> On Mon, 16 Sep 2019, Pengfei Li wrote:
>
> > The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
> > but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
> > generated by kmalloc_cache_name().
> >
> > Patch1 predefines the names of all types of kmalloc to save
> > the time spent dynamically generating names.
> >
> > These changes make sense, and the time spent by new_kmalloc_cache()
> > has been reduced by approximately 36.3%.
>
> This is time spend during boot and does not affect later system
> performance.
>

Yes.

> >  Time spent by new_kmalloc_cache()
> >   (CPU cycles)
> > 5.3-rc7  66264
> > 5.3-rc7+patch_1-342188
>
> Ok. 15k cycles during boot saved. So we save 5 microseconds during bootup?
>

Yes, this is a very small benefit.

> The current approach was created with the view on future setups allowing a
> dynamic configuration of kmalloc caches based on need. I.e. ZONE_DMA may
> not be needed once the floppy driver no longer makes use of it.

Yes. With this in mind, I also used an #ifdef for INIT_KMALLOC_INFO.
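
For reference, this is the shape used in the v5 patch (the DMA name is
only defined when CONFIG_ZONE_DMA is enabled):

	#ifdef CONFIG_ZONE_DMA
	#define INIT_KMALLOC_INFO(__size, __short_size)			\
	{								\
		.name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,	\
		.name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,	\
		.name[KMALLOC_DMA]     = "dma-kmalloc-" #__short_size,	\
		.size = __size,						\
	}
	#else
	#define INIT_KMALLOC_INFO(__size, __short_size)			\
	{								\
		.name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,	\
		.name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,	\
		.size = __size,						\
	}
	#endif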


Re: [RESEND v4 5/7] mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE

2019-09-16 Thread Pengfei Li
On Mon, Sep 16, 2019 at 12:54 PM kbuild test robot  wrote:
>
> Hi Pengfei,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on linus/master]
> [cannot apply to v5.3 next-20190915]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Pengfei-Li/mm-slab-Make-kmalloc_info-contain-all-types-of-names/20190916-065820
> reproduce:
> # apt-get install sparse
> # sparse version: v0.6.1-rc1-7-g2b96cd8-dirty
> make ARCH=x86_64 allmodconfig
> make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot 
>
>
> sparse warnings: (new ones prefixed by >>)
>
> >> mm/slab_common.c:1121:34: sparse: sparse: symbol 'all_kmalloc_info' was 
> >> not declared. Should it be static?

Thanks. I will fix it in v5.

>
> Please review and possibly fold the followup patch.
>
> ---
> 0-DAY kernel test infrastructureOpen Source Technology Center
> https://lists.01.org/pipermail/kbuild-all   Intel Corporation


Re: [RESEND v4 5/7] mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE

2019-09-16 Thread Pengfei Li
On Mon, Sep 16, 2019 at 9:46 AM kbuild test robot  wrote:
>
> Hi Pengfei,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on linus/master]
> [cannot apply to v5.3 next-20190904]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Pengfei-Li/mm-slab-Make-kmalloc_info-contain-all-types-of-names/20190916-065820
> config: parisc-allmodconfig (attached as .config)
> compiler: hppa-linux-gcc (GCC) 7.4.0
> reproduce:
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> GCC_VERSION=7.4.0 make.cross ARCH=parisc
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot 
>
> All errors (new ones prefixed by >>):
>
> >> mm/slab_common.c:1144:34: error: 'KMALLOC_INFO_START_IDX' undeclared here 
> >> (not in a function); did you mean 'VMALLOC_START'?
> kmalloc_info = &all_kmalloc_info[KMALLOC_INFO_START_IDX];
>  ^~
>  VMALLOC_START
>
> vim +1144 mm/slab_common.c
>
>   1142
>   1143  const struct kmalloc_info_struct * const __initconst
> > 1144  kmalloc_info = &all_kmalloc_info[KMALLOC_INFO_START_IDX];
>   1145
>

Thanks.

This error is caused by my mistakenly placing KMALLOC_INFO_SHIFT_LOW
and KMALLOC_INFO_START_IDX in the wrong place. (ARCH=sh fails in the
same way.)

I will fix it in v5.

> ---
> 0-DAY kernel test infrastructureOpen Source Technology Center
> https://lists.01.org/pipermail/kbuild-all   Intel Corporation


Re: [RESEND v4 6/7] mm, slab_common: Initialize the same size of kmalloc_caches[]

2019-09-16 Thread Pengfei Li
On Mon, Sep 16, 2019 at 5:38 AM David Rientjes  wrote:

Thanks for your review comments!

>
> On Mon, 16 Sep 2019, Pengfei Li wrote:
>
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 2aed30deb071..e7903bd28b1f 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -1165,12 +1165,9 @@ void __init setup_kmalloc_cache_index_table(void)
> >   size_index[size_index_elem(i)] = 0;
> >  }
> >
> > -static void __init
> > +static __always_inline void __init
> >  new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t 
> > flags)
> >  {
> > - if (type == KMALLOC_RECLAIM)
> > - flags |= SLAB_RECLAIM_ACCOUNT;
> > -
> >   kmalloc_caches[type][idx] = create_kmalloc_cache(
> >   kmalloc_info[idx].name[type],
> >   kmalloc_info[idx].size, flags, 0,
> > @@ -1185,30 +1182,22 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type 
> > type, slab_flags_t flags)
> >  void __init create_kmalloc_caches(slab_flags_t flags)
> >  {
> >   int i;
> > - enum kmalloc_cache_type type;
> >
> > - for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
> > - for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
> > - if (!kmalloc_caches[type][i])
> > - new_kmalloc_cache(i, type, flags);
> > - }
> > - }
> > + for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
> > + if (!kmalloc_caches[KMALLOC_NORMAL][i])
> > + new_kmalloc_cache(i, KMALLOC_NORMAL, flags);
> >
> > - /* Kmalloc array is now usable */
> > - slab_state = UP;
> > + new_kmalloc_cache(i, KMALLOC_RECLAIM,
> > + flags | SLAB_RECLAIM_ACCOUNT);
>
> This seems less robust, no?  Previously we verified that the cache doesn't
> exist before creating a new cache over top of it (for NORMAL and RECLAIM).
> Now we presume that the RECLAIM cache never exists.
>

Agree, this is really less robust.

I checked the code and found that nothing initializes kmalloc-rcl-xxx
before create_kmalloc_caches(), so I assumed that kmalloc-rcl-xxx was
still NULL.

> Can we just move a check to new_kmalloc_cache() to see if
> kmalloc_caches[type][idx] already exists and, if so, just return?  This
> should be more robust and simplify create_kmalloc_caches() slightly more.

For better robustness, I will do it as you suggested in v5.
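
Roughly, the v5 version of new_kmalloc_cache() will look like this
(a sketch; the final patch may differ slightly):

	static __always_inline void __init
	new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
	{
		/* be robust if a cache of this type/size already exists */
		if (kmalloc_caches[type][idx])
			return;

		kmalloc_caches[type][idx] = create_kmalloc_cache(
				kmalloc_info[idx].name[type],
				kmalloc_info[idx].size, flags, 0,
				kmalloc_info[idx].size);
	}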


Re: [RESEND v4 1/7] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-16 Thread Pengfei Li
On Mon, Sep 16, 2019 at 5:38 AM David Rientjes  wrote:
>
> On Mon, 16 Sep 2019, Pengfei Li wrote:
>
> > There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
> > and KMALLOC_DMA.
> >
> > The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
> > but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
> > generated by kmalloc_cache_name().
> >
> > This patch predefines the names of all types of kmalloc to save
> > the time spent dynamically generating names.
> >
> > Besides, remove the kmalloc_cache_name() that is no longer used.
> >
> > Signed-off-by: Pengfei Li 
> > Acked-by: Vlastimil Babka 
> > Acked-by: Roman Gushchin 
>
> Acked-by: David Rientjes 
>

Thanks.

> It's unfortunate the existing names are kmalloc-, dma-kmalloc-, and
> kmalloc-rcl- since they aren't following any standard naming convention.
>
> Also not sure I understand the SET_KMALLOC_SIZE naming since this isn't
> just setting a size.  Maybe better off as INIT_KMALLOC_INFO?

Yes, this name is really better. I will rename SET_KMALLOC_SIZE to
INIT_KMALLOC_INFO in v5.

>
> Nothing major though, so:
>
> Acked-by: David Rientjes 


[PATCH v5 7/7] mm, slab_common: Modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]

2019-09-16 Thread Pengfei Li
KMALLOC_NORMAL is the most frequently accessed type, and
kmalloc_caches[] is now initialized with the different types of the
same size one after another.

So changing kmalloc_caches[type][idx] to kmalloc_caches[idx][type]
will benefit performance.

$ ./scripts/bloat-o-meter vmlinux.patch_1-6 vmlinux.patch_1-7
add/remove: 0/0 grow/shrink: 2/57 up/down: 8/-458 (-450)
Function old new   delta
tg3_self_test   42554259  +4
nf_queue 666 670  +4
kmalloc_slab  97  93  -4
i915_sw_fence_await_dma_fence441 437  -4
__igmp_group_dropped 619 615  -4
gss_import_sec_context   176 170  -6
xhci_alloc_command   212 205  -7
xprt_switch_alloc136 128  -8
xhci_segment_alloc   297 289  -8
xhci_ring_alloc  369 361  -8
xhci_mem_init   36643656  -8
xhci_alloc_virt_device   496 488  -8
xhci_alloc_tt_info   346 338  -8
xhci_alloc_stream_info   718 710  -8
xhci_alloc_container_ctx 215 207  -8
xfrm_policy_alloc271 263  -8
tcp_sendmsg_locked  31203112  -8
tcp_md5_do_add   774 766  -8
tcp_fastopen_defer_connect   270 262  -8
sr_read_tochdr.isra  251 243  -8
sr_read_tocentry.isra328 320  -8
sr_is_xa 376 368  -8
sr_get_mcn   260 252  -8
selinux_sk_alloc_security113 105  -8
sdev_evt_send_simple 118 110  -8
sdev_evt_alloc79  71  -8
scsi_probe_and_add_lun  29382930  -8
sbitmap_queue_init_node  418 410  -8
ring_buffer_read_prepare  94  86  -8
request_firmware_nowait  396 388  -8
regulatory_hint_found_beacon 394 386  -8
ohci_urb_enqueue31763168  -8
nla_strdup   142 134  -8
nfs_alloc_seqid   87  79  -8
nfs4_get_state_owner10401032  -8
nfs4_do_close578 570  -8
nf_ct_tmpl_alloc  85  77  -8
mempool_create_node  164 156  -8
ip_setup_cork362 354  -8
ip6_setup_cork  10211013  -8
gss_create_cred  140 132  -8
drm_flip_work_allocate_task   70  62  -8
dma_pool_alloc   410 402  -8
devres_open_group214 206  -8
create_kmalloc_caches203 195  -8
cfg80211_stop_iface  260 252  -8
cfg80211_sinfo_alloc_tid_stats77  69  -8
cfg80211_port_authorized 212 204  -8
cfg80211_parse_mbssid_data  23972389  -8
cfg80211_ibss_joined 335 327  -8
call_usermodehelper_setup149 141  -8
bpf_prog_alloc_no_stats  182 174  -8
blk_alloc_flush_queue191 183  -8
bdi_alloc_node   195 187  -8
audit_log_d_path 196 188  -8
_netlbl_catmap_getnode   247 239  -8
ip_mc_inc_group  475 467  -8
__i915_sw_fence_await_sw_fence   417 405 -12
ida_alloc_range  955 934 -21
Total: Before=14788957, After=14788507, chg -0.00%

Signed-off-by: Pengfei Li 
Acked-by: David Rientjes 
---
 include/linux/slab.h |  6 +++---
 mm/slab.c|  4 ++--
 mm/slab_common.c |  8 
 mm/slub.c| 12 ++--
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index f53bb6980110..0842db5f7053 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -340,7 +340,7 @@ enum kmalloc_cache_type {
 
 #ifndef CONFIG_SLOB
 extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];
+kmalloc_caches[KMALLOC_CACHE_NUM][NR_KMALLOC_TYPES];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
@@ -582,7 +582,7

[PATCH v5 6/7] mm, slab_common: Initialize the same size of kmalloc_caches[]

2019-09-16 Thread Pengfei Li
In the current implementation, KMALLOC_RECLAIM is not initialized
until all the KMALLOC_NORMAL sizes have been initialized.

But create_kmalloc_caches() can be executed faster by initializing,
for each size, the different kmalloc types one after another.

$ ./scripts/bloat-o-meter vmlinux.patch_1-5 vmlinux.patch_1-6
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-11 (-11)
Function old new   delta
create_kmalloc_caches214 203 -11
Total: Before=14788968, After=14788957, chg -0.00%

Although the benefit is small (an extra check is made for
robustness), create_kmalloc_caches() becomes much simpler.

Besides, KMALLOC_DMA is currently initialized after "slab_state = UP",
which does not seem to be necessary.

Commit f97d5f634d3b ("slab: Common function to create the kmalloc
array") introduces create_kmalloc_caches().

And I found that for SLAB, KMALLOC_DMA is initialized before
"slab_state = UP". But for SLUB, KMALLOC_DMA is initialized after
"slab_state = UP".

Based on this fact, I think it is okay to initialize KMALLOC_DMA
before "slab_state = UP".

Signed-off-by: Pengfei Li 
---
 mm/slab_common.c | 35 +--
 1 file changed, 13 insertions(+), 22 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index eeef5ac8d04d..00f2cfc66dbd 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1168,11 +1168,11 @@ void __init setup_kmalloc_cache_index_table(void)
size_index[size_index_elem(i)] = 0;
 }
 
-static void __init
+static __always_inline void __init
 new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
-   if (type == KMALLOC_RECLAIM)
-   flags |= SLAB_RECLAIM_ACCOUNT;
+   if (kmalloc_caches[type][idx])
+   return;
 
kmalloc_caches[type][idx] = create_kmalloc_cache(
kmalloc_info[idx].name[type],
@@ -1188,30 +1188,21 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type 
type, slab_flags_t flags)
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
int i;
-   enum kmalloc_cache_type type;
 
-   for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
-   for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
-   if (!kmalloc_caches[type][i])
-   new_kmalloc_cache(i, type, flags);
-   }
-   }
+   for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
+   new_kmalloc_cache(i, KMALLOC_NORMAL, flags);
 
-   /* Kmalloc array is now usable */
-   slab_state = UP;
+   new_kmalloc_cache(i, KMALLOC_RECLAIM,
+   flags | SLAB_RECLAIM_ACCOUNT);
 
 #ifdef CONFIG_ZONE_DMA
-   for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
-   struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
-
-   if (s) {
-   kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
-   kmalloc_info[i].name[KMALLOC_DMA],
-   kmalloc_info[i].size,
-   SLAB_CACHE_DMA | flags, 0, 0);
-   }
-   }
+   new_kmalloc_cache(i, KMALLOC_DMA,
+   flags | SLAB_CACHE_DMA);
 #endif
+   }
+
+   /* Kmalloc array is now usable */
+   slab_state = UP;
 }
 #endif /* !CONFIG_SLOB */
 
-- 
2.21.0



[PATCH v5 5/7] mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE

2019-09-16 Thread Pengfei Li
96  -9
ohci_urb_enqueue31853176  -9
nfs_alloc_seqid   96  87  -9
nfs4_get_state_owner10491040  -9
nfs4_do_close587 578  -9
mempool_create_node  173 164  -9
ip6_setup_cork  10301021  -9
dma_pool_alloc   419 410  -9
devres_open_group223 214  -9
cfg80211_parse_mbssid_data  24062397  -9
__igmp_group_dropped 629 619 -10
gss_import_sec_context   187 176 -11
ip_setup_cork374 362 -12
__i915_sw_fence_await_sw_fence   429 417 -12
kmalloc_caches   336 312 -24
create_kmalloc_caches270 214 -56
kmalloc_cache_name57   - -57
new_kmalloc_cache112   --112
kmalloc_info 432   8-424
Total: Before=14789209, After=14788968, chg -0.00%

Signed-off-by: Pengfei Li 
---
 include/linux/slab.h |  96 --
 mm/slab.h|  10 +--
 mm/slab_common.c | 161 +--
 mm/slub.c|  12 ++--
 4 files changed, 136 insertions(+), 143 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 1f05f68f2c3e..f53bb6980110 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -297,6 +297,23 @@ static inline void __check_heap_object(const void *ptr, 
unsigned long n,
 #define KMALLOC_MIN_SIZE (1 << KMALLOC_SHIFT_LOW)
 #endif
 
+#define KMALLOC_CACHE_MIN_NUM  (KMALLOC_SHIFT_HIGH - KMALLOC_SHIFT_LOW + 1)
+
+#if KMALLOC_MIN_SIZE > 64
+   #define KMALLOC_SIZE_96_EXIST   (0)
+   #define KMALLOC_SIZE_192_EXIST  (0)
+#elif KMALLOC_MIN_SIZE > 32
+   #define KMALLOC_SIZE_96_EXIST   (0)
+   #define KMALLOC_SIZE_192_EXIST  (1)
+#else
+   #define KMALLOC_SIZE_96_EXIST   (1)
+   #define KMALLOC_SIZE_192_EXIST  (1)
+#endif
+
+#define KMALLOC_CACHE_NUM  (KMALLOC_CACHE_MIN_NUM  \
+   + KMALLOC_SIZE_96_EXIST \
+   + KMALLOC_SIZE_192_EXIST)
+
 /*
  * This restriction comes from byte sized index implementation.
  * Page size is normally 2^12 bytes and, in this case, if we want to use
@@ -323,7 +340,7 @@ enum kmalloc_cache_type {
 
 #ifndef CONFIG_SLOB
 extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
+kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
@@ -345,13 +362,18 @@ static __always_inline enum kmalloc_cache_type 
kmalloc_type(gfp_t flags)
 #endif
 }
 
+/* kmalloc_index adjust size: (0, 96] */
+#define KMALLOC_IDX_ADJ_0  (KMALLOC_SHIFT_LOW)
+
+/* kmalloc_index adjust size: (96, 192] */
+#define KMALLOC_IDX_ADJ_1  (KMALLOC_IDX_ADJ_0 - KMALLOC_SIZE_96_EXIST)
+
+/* kmalloc_index adjust size: (192, N] */
+#define KMALLOC_IDX_ADJ_2  (KMALLOC_IDX_ADJ_1 - KMALLOC_SIZE_192_EXIST)
+
 /*
  * Figure out which kmalloc slab an allocation of a certain size
  * belongs to.
- * 0 = zero alloc
- * 1 =  65 .. 96 bytes
- * 2 = 129 .. 192 bytes
- * n = 2^(n-1)+1 .. 2^n
  */
 static __always_inline unsigned int kmalloc_index(size_t size)
 {
@@ -359,36 +381,40 @@ static __always_inline unsigned int kmalloc_index(size_t 
size)
return ZERO_SIZE_ALLOC;
 
if (size <= KMALLOC_MIN_SIZE)
-   return KMALLOC_SHIFT_LOW;
-
-   if (KMALLOC_MIN_SIZE <= 32 && size > 64 && size <= 96)
-   return 1;
-   if (KMALLOC_MIN_SIZE <= 64 && size > 128 && size <= 192)
-   return 2;
-   if (size <=  8) return 3;
-   if (size <= 16) return 4;
-   if (size <= 32) return 5;
-   if (size <= 64) return 6;
-   if (size <=128) return 7;
-   if (size <=256) return 8;
-   if (size <=512) return 9;
-   if (size <=   1024) return 10;
-   if (size <=   2 * 1024) return 11;
-   if (size <=   4 * 1024) return 12;
-   if (size <=   8 * 1024) return 13;
-   if (size <=  16 * 1024) return 14;
-   if (size <=  32 * 1024) return 15;
-   if (size <=  64 * 1024) return 16;
-   if (size <= 128 * 1024) return 17;
-   if (size <= 256 * 1024) return 18;
-   if (size <= 512 * 1024) return 19;
-   if (size <= 1024 * 1024) return 20;
-   if (size <=  2 * 1024 * 1024) return 21;
-   if (size <=  4 * 1024 * 1024) return 22;
-   if (size <=  8 * 1024 * 1024) ret

[PATCH v5 4/7] mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests

2019-09-16 Thread Pengfei Li
This is a preparation patch that just replaces 0 with ZERO_SIZE_ALLOC
as the return value for zero-sized requests.

Signed-off-by: Pengfei Li 
Acked-by: David Rientjes 
---
 include/linux/slab.h | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index e773e5764b7b..1f05f68f2c3e 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -121,14 +121,20 @@
 #define SLAB_DEACTIVATED   ((slab_flags_t __force)0x1000U)
 
 /*
- * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
+ * ZERO_SIZE_ALLOC will be returned by kmalloc_index() if it was zero sized
+ * requests.
  *
+ * After that, ZERO_SIZE_PTR will be returned by the function that called
+ * kmalloc_index().
+
  * Dereferencing ZERO_SIZE_PTR will lead to a distinct access fault.
  *
  * ZERO_SIZE_PTR can be passed to kfree though in the same way that NULL can.
  * Both make kfree a no-op.
  */
-#define ZERO_SIZE_PTR ((void *)16)
+#define ZERO_SIZE_ALLOC(UINT_MAX)
+
+#define ZERO_SIZE_PTR  ((void *)16)
 
 #define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
(unsigned long)ZERO_SIZE_PTR)
@@ -350,7 +356,7 @@ static __always_inline enum kmalloc_cache_type 
kmalloc_type(gfp_t flags)
 static __always_inline unsigned int kmalloc_index(size_t size)
 {
if (!size)
-   return 0;
+   return ZERO_SIZE_ALLOC;
 
if (size <= KMALLOC_MIN_SIZE)
return KMALLOC_SHIFT_LOW;
@@ -546,7 +552,7 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
 #ifndef CONFIG_SLOB
index = kmalloc_index(size);
 
-   if (!index)
+   if (index == ZERO_SIZE_ALLOC)
return ZERO_SIZE_PTR;
 
return kmem_cache_alloc_trace(
@@ -564,7 +570,7 @@ static __always_inline void *kmalloc_node(size_t size, 
gfp_t flags, int node)
size <= KMALLOC_MAX_CACHE_SIZE) {
unsigned int i = kmalloc_index(size);
 
-   if (!i)
+   if (i == ZERO_SIZE_ALLOC)
return ZERO_SIZE_PTR;
 
return kmem_cache_alloc_node_trace(
-- 
2.21.0



[PATCH v5 3/7] mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc caches

2019-09-16 Thread Pengfei Li
The type of local variable *type* of new_kmalloc_cache() should
be enum kmalloc_cache_type instead of int, so correct it.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
Acked-by: David Rientjes 
---
 mm/slab_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index df030cf9f44f..23054b8b75b6 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
 }
 
 static void __init
-new_kmalloc_cache(int idx, int type, slab_flags_t flags)
+new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
@@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t flags)
  */
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
-   int i, type;
+   int i;
+   enum kmalloc_cache_type type;
 
for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
-- 
2.21.0



[PATCH v5 2/7] mm, slab: Remove unused kmalloc_size()

2019-09-16 Thread Pengfei Li
The size of each kmalloc cache can be obtained from kmalloc_info[],
so remove kmalloc_size(), which is no longer used.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
Acked-by: David Rientjes 
---
 include/linux/slab.h | 20 
 mm/slab.c|  5 +++--
 mm/slab_common.c |  5 ++---
 3 files changed, 5 insertions(+), 25 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 56c9c7eed34e..e773e5764b7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -557,26 +557,6 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
return __kmalloc(size, flags);
 }
 
-/*
- * Determine size used for the nth kmalloc cache.
- * return size or 0 if a kmalloc cache for that
- * size does not exist
- */
-static __always_inline unsigned int kmalloc_size(unsigned int n)
-{
-#ifndef CONFIG_SLOB
-   if (n > 2)
-   return 1U << n;
-
-   if (n == 1 && KMALLOC_MIN_SIZE <= 32)
-   return 96;
-
-   if (n == 2 && KMALLOC_MIN_SIZE <= 64)
-   return 192;
-#endif
-   return 0;
-}
-
 static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 #ifndef CONFIG_SLOB
diff --git a/mm/slab.c b/mm/slab.c
index c42b6211f42e..7bc4e90e1147 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,8 +1248,9 @@ void __init kmem_cache_init(void)
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
-   kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
-   0, kmalloc_size(INDEX_NODE));
+   kmalloc_info[INDEX_NODE].size,
+   ARCH_KMALLOC_FLAGS, 0,
+   kmalloc_info[INDEX_NODE].size);
slab_state = PARTIAL_NODE;
setup_kmalloc_cache_index_table();
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 4e78490292df..df030cf9f44f 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1239,11 +1239,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-   unsigned int size = kmalloc_size(i);
-
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
kmalloc_info[i].name[KMALLOC_DMA],
-   size, SLAB_CACHE_DMA | flags, 0, 0);
+   kmalloc_info[i].size,
+   SLAB_CACHE_DMA | flags, 0, 0);
}
}
 #endif
-- 
2.21.0



[PATCH v5 1/7] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-16 Thread Pengfei Li
There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

This patch predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

Besides, remove the kmalloc_cache_name() that is no longer used.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
Acked-by: David Rientjes 
---
 mm/slab.c|  2 +-
 mm/slab.h|  2 +-
 mm/slab_common.c | 91 ++--
 3 files changed, 51 insertions(+), 44 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9df370558e5d..c42b6211f42e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1247,7 +1247,7 @@ void __init kmem_cache_init(void)
 * structures first.  Without this, further allocations will bug.
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
-   kmalloc_info[INDEX_NODE].name,
+   kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
0, kmalloc_size(INDEX_NODE));
slab_state = PARTIAL_NODE;
diff --git a/mm/slab.h b/mm/slab.h
index 9057b8056b07..2fc8f956906a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -76,7 +76,7 @@ extern struct kmem_cache *kmem_cache;
 
 /* A table of kmalloc cache names and sizes */
 extern const struct kmalloc_info_struct {
-   const char *name;
+   const char *name[NR_KMALLOC_TYPES];
unsigned int size;
 } kmalloc_info[];
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 807490fe217a..4e78490292df 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1092,26 +1092,56 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t 
flags)
return kmalloc_caches[kmalloc_type(flags)][index];
 }
 
+#ifdef CONFIG_ZONE_DMA
+#define INIT_KMALLOC_INFO(__size, __short_size)\
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .name[KMALLOC_DMA] = "dma-kmalloc-" #__short_size,  \
+   .size = __size, \
+}
+#else
+#define INIT_KMALLOC_INFO(__size, __short_size)\
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .size = __size, \
+}
+#endif
+
 /*
  * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
  * kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
  * kmalloc-67108864.
  */
 const struct kmalloc_info_struct kmalloc_info[] __initconst = {
-   {NULL,  0}, {"kmalloc-96", 96},
-   {"kmalloc-192",   192}, {"kmalloc-8",   8},
-   {"kmalloc-16", 16}, {"kmalloc-32", 32},
-   {"kmalloc-64", 64}, {"kmalloc-128",   128},
-   {"kmalloc-256",   256}, {"kmalloc-512",   512},
-   {"kmalloc-1k",   1024}, {"kmalloc-2k",   2048},
-   {"kmalloc-4k",   4096}, {"kmalloc-8k",   8192},
-   {"kmalloc-16k", 16384}, {"kmalloc-32k", 32768},
-   {"kmalloc-64k", 65536}, {"kmalloc-128k",   131072},
-   {"kmalloc-256k",   262144}, {"kmalloc-512k",   524288},
-   {"kmalloc-1M",1048576}, {"kmalloc-2M",2097152},
-   {"kmalloc-4M",4194304}, {"kmalloc-8M",8388608},
-   {"kmalloc-16M",  16777216}, {"kmalloc-32M",  33554432},
-   {"kmalloc-64M",  67108864}
+   INIT_KMALLOC_INFO(0, 0),
+   INIT_KMALLOC_INFO(96, 96),
+   INIT_KMALLOC_INFO(192, 192),
+   INIT_KMALLOC_INFO(8, 8),
+   INIT_KMALLOC_INFO(16, 16),
+   INIT_KMALLOC_INFO(32, 32),
+   INIT_KMALLOC_INFO(64, 64),
+   INIT_KMALLOC_INFO(128, 128),
+   INIT_KMALLOC_INFO(256, 256),
+   INIT_KMALLOC_INFO(512, 512),
+   INIT_KMALLOC_INFO(1024, 1k),
+   INIT_KMALLOC_INFO(2048, 2k),
+   INIT_KMALLOC_INFO(4096, 4k),
+   INIT_KMALLOC_INFO(8192, 8k),
+   INIT_KMALLOC_INFO(16384, 16k),
+   

[PATCH v5 0/7] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-16 Thread Pengfei Li
tcp_sendmsg_locked  31293112 -17
tcp_md5_do_add   783 766 -17
tcp_fastopen_defer_connect   279 262 -17
sr_read_tochdr.isra  260 243 -17
sr_read_tocentry.isra337 320 -17
sr_is_xa 385 368 -17
sr_get_mcn   269 252 -17
scsi_probe_and_add_lun  29472930 -17
ring_buffer_read_prepare 103  86 -17
request_firmware_nowait  405 388 -17
ohci_urb_enqueue31853168 -17
nfs_alloc_seqid   96  79 -17
nfs4_get_state_owner10491032 -17
nfs4_do_close587 570 -17
mempool_create_node  173 156 -17
ip6_setup_cork  10301013 -17
ida_alloc_range  951 934 -17
gss_import_sec_context   187 170 -17
dma_pool_alloc   419 402 -17
devres_open_group223 206 -17
cfg80211_parse_mbssid_data  24062389 -17
ip_setup_cork374 354 -20
kmalloc_caches   336 312 -24
__i915_sw_fence_await_sw_fence   429 405 -24
kmalloc_cache_name57   - -57
create_kmalloc_caches270 195 -75
new_kmalloc_cache112   --112
kmalloc_info 432   8-424
Total: Before=14789209, After=14788507, chg -0.00%

Pengfei Li (7):
  mm, slab: Make kmalloc_info[] contain all types of names
  mm, slab: Remove unused kmalloc_size()
  mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc
caches
  mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests
  mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE
  mm, slab_common: Initialize the same size of kmalloc_caches[]
  mm, slab_common: Modify kmalloc_caches[type][idx] to
kmalloc_caches[idx][type]

 include/linux/slab.h | 136 ++
 mm/slab.c|  11 ++-
 mm/slab.h|  10 +-
 mm/slab_common.c | 227 ++-
 mm/slub.c|  12 +--
 5 files changed, 187 insertions(+), 209 deletions(-)

-- 
2.21.0



[RESEND v4 5/7] mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE

2019-09-15 Thread Pengfei Li
96  -9
ohci_urb_enqueue  3185 3176 -9
nfs_alloc_seqid  96 87 -9
nfs4_get_state_owner  1049 1040 -9
nfs4_do_close  587 578 -9
mempool_create_node  173 164 -9
ip6_setup_cork  1030 1021 -9
dma_pool_alloc  419 410 -9
devres_open_group  223 214 -9
cfg80211_parse_mbssid_data  2406 2397 -9
__igmp_group_dropped 629 619 -10
gss_import_sec_context   187 176 -11
ip_setup_cork  374 362 -12
__i915_sw_fence_await_sw_fence   429 417 -12
kmalloc_caches   336 312 -24
create_kmalloc_caches  270 214 -56
kmalloc_cache_name  57 - -57
new_kmalloc_cache  112 - -112
kmalloc_info  432 8 -424
Total: Before=14874616, After=14874375, chg -0.00%

Signed-off-by: Pengfei Li 
---
 include/linux/slab.h |  96 --
 mm/slab.h|  10 +--
 mm/slab_common.c | 158 ---
 mm/slub.c|  12 ++--
 4 files changed, 133 insertions(+), 143 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 1f05f68f2c3e..f53bb6980110 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -297,6 +297,23 @@ static inline void __check_heap_object(const void *ptr, 
unsigned long n,
 #define KMALLOC_MIN_SIZE (1 << KMALLOC_SHIFT_LOW)
 #endif
 
+#define KMALLOC_CACHE_MIN_NUM  (KMALLOC_SHIFT_HIGH - KMALLOC_SHIFT_LOW + 1)
+
+#if KMALLOC_MIN_SIZE > 64
+   #define KMALLOC_SIZE_96_EXIST   (0)
+   #define KMALLOC_SIZE_192_EXIST  (0)
+#elif KMALLOC_MIN_SIZE > 32
+   #define KMALLOC_SIZE_96_EXIST   (0)
+   #define KMALLOC_SIZE_192_EXIST  (1)
+#else
+   #define KMALLOC_SIZE_96_EXIST   (1)
+   #define KMALLOC_SIZE_192_EXIST  (1)
+#endif
+
+#define KMALLOC_CACHE_NUM  (KMALLOC_CACHE_MIN_NUM  \
+   + KMALLOC_SIZE_96_EXIST \
+   + KMALLOC_SIZE_192_EXIST)
+
 /*
  * This restriction comes from byte sized index implementation.
  * Page size is normally 2^12 bytes and, in this case, if we want to use
@@ -323,7 +340,7 @@ enum kmalloc_cache_type {
 
 #ifndef CONFIG_SLOB
 extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
+kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
@@ -345,13 +362,18 @@ static __always_inline enum kmalloc_cache_type 
kmalloc_type(gfp_t flags)
 #endif
 }
 
+/* kmalloc_index adjust size: (0, 96] */
+#define KMALLOC_IDX_ADJ_0  (KMALLOC_SHIFT_LOW)
+
+/* kmalloc_index adjust size: (96, 192] */
+#define KMALLOC_IDX_ADJ_1  (KMALLOC_IDX_ADJ_0 - KMALLOC_SIZE_96_EXIST)
+
+/* kmalloc_index adjust size: (192, N] */
+#define KMALLOC_IDX_ADJ_2  (KMALLOC_IDX_ADJ_1 - KMALLOC_SIZE_192_EXIST)
+
 /*
  * Figure out which kmalloc slab an allocation of a certain size
  * belongs to.
- * 0 = zero alloc
- * 1 =  65 .. 96 bytes
- * 2 = 129 .. 192 bytes
- * n = 2^(n-1)+1 .. 2^n
  */
 static __always_inline unsigned int kmalloc_index(size_t size)
 {
@@ -359,36 +381,40 @@ static __always_inline unsigned int kmalloc_index(size_t 
size)
return ZERO_SIZE_ALLOC;
 
if (size <= KMALLOC_MIN_SIZE)
-   return KMALLOC_SHIFT_LOW;
-
-   if (KMALLOC_MIN_SIZE <= 32 && size > 64 && size <= 96)
-   return 1;
-   if (KMALLOC_MIN_SIZE <= 64 && size > 128 && size <= 192)
-   return 2;
-   if (size <=  8) return 3;
-   if (size <= 16) return 4;
-   if (size <= 32) return 5;
-   if (size <= 64) return 6;
-   if (size <=128) return 7;
-   if (size <=256) return 8;
-   if (size <=512) return 9;
-   if (size <=   1024) return 10;
-   if (size <=   2 * 1024) return 11;
-   if (size <=   4 * 1024) return 12;
-   if (size <=   8 * 1024) return 13;
-   if (size <=  16 * 1024) return 14;
-   if (size <=  32 * 1024) return 15;
-   if (size <=  64 * 1024) return 16;
-   if (size <= 128 * 1024) return 17;
-   if (size <= 256 * 1024) return 18;
-   if (size <= 512 * 1024) return 19;
-   if (size <= 1024 * 1024) return 20;
-   if (size <=  2 * 1024 * 1024) return 21;
-   if (size <=  4 * 1024 * 1024) return 22;
-   if (size <=  8 * 1024 * 1024) ret

[RESEND v4 7/7] mm, slab_common: Modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]

2019-09-15 Thread Pengfei Li
KMALLOC_NORMAL is the most frequently accessed type, and kmalloc_caches[]
is initialized size by size, with the different types of one size created
together.

So changing kmalloc_caches[type][idx] to kmalloc_caches[idx][type]
will benefit performance.
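
To make the locality argument concrete, here is a minimal userspace sketch
(not kernel code; the array dimensions and the size-class index are only
assumptions for illustration) that prints how far apart the KMALLOC_NORMAL
and KMALLOC_RECLAIM entries of one size end up in the two layouts:

  #include <stdio.h>

  enum kmalloc_cache_type {
          KMALLOC_NORMAL,
          KMALLOC_RECLAIM,
          KMALLOC_DMA,
          NR_KMALLOC_TYPES
  };
  #define KMALLOC_CACHE_NUM 26            /* assumed number of size classes */

  struct kmem_cache;                      /* opaque stand-in */

  /* Old layout: one type's entries are adjacent, the types are far apart. */
  static struct kmem_cache *old_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];
  /* New layout: the entries for one size sit next to each other. */
  static struct kmem_cache *new_caches[KMALLOC_CACHE_NUM][NR_KMALLOC_TYPES];

  int main(void)
  {
          int idx = 6;                    /* some size class, e.g. the kmalloc-64 slot */

          printf("old layout: %td bytes apart\n",
                 (char *)&old_caches[KMALLOC_RECLAIM][idx] -
                 (char *)&old_caches[KMALLOC_NORMAL][idx]);
          printf("new layout: %td bytes apart\n",
                 (char *)&new_caches[idx][KMALLOC_RECLAIM] -
                 (char *)&new_caches[idx][KMALLOC_NORMAL]);
          return 0;
  }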

$ ./scripts/bloat-o-meter vmlinux.patch_1-6 vmlinux.patch_1-7
add/remove: 0/0 grow/shrink: 2/57 up/down: 8/-457 (-449)
Function old new   delta
tg3_self_test  4255 4259  +4
nf_queue 666 670  +4
kmalloc_slab  97  93  -4
i915_sw_fence_await_dma_fence  441 437  -4
__igmp_group_dropped 619 615  -4
gss_import_sec_context   176 170  -6
xhci_alloc_command   212 205  -7
create_kmalloc_caches  155 148  -7
xprt_switch_alloc  136 128  -8
xhci_segment_alloc   297 289  -8
xhci_ring_alloc  369 361  -8
xhci_mem_init  3664 3656  -8
xhci_alloc_virt_device   496 488  -8
xhci_alloc_tt_info   346 338  -8
xhci_alloc_stream_info   718 710  -8
xhci_alloc_container_ctx 215 207  -8
xfrm_policy_alloc  271 263  -8
tcp_sendmsg_locked  3120 3112  -8
tcp_md5_do_add   774 766  -8
tcp_fastopen_defer_connect   270 262  -8
sr_read_tochdr.isra  251 243  -8
sr_read_tocentry.isra  328 320  -8
sr_is_xa 376 368  -8
sr_get_mcn   260 252  -8
selinux_sk_alloc_security  113 105  -8
sdev_evt_send_simple  118 110  -8
sdev_evt_alloc  79 71  -8
scsi_probe_and_add_lun  2938 2930  -8
sbitmap_queue_init_node  418 410  -8
ring_buffer_read_prepare  94  86  -8
request_firmware_nowait  396 388  -8
regulatory_hint_found_beacon 394 386  -8
ohci_urb_enqueue  3176 3168  -8
nla_strdup   142 134  -8
nfs_alloc_seqid   87  79  -8
nfs4_get_state_owner  1040 1032  -8
nfs4_do_close  578 570  -8
nf_ct_tmpl_alloc  85  77  -8
mempool_create_node  164 156  -8
ip_setup_cork  362 354  -8
ip6_setup_cork  1021 1013  -8
gss_create_cred  140 132  -8
drm_flip_work_allocate_task   70  62  -8
dma_pool_alloc   410 402  -8
devres_open_group  214 206  -8
cfg80211_stop_iface  260 252  -8
cfg80211_sinfo_alloc_tid_stats  77 69  -8
cfg80211_port_authorized  212 204  -8
cfg80211_parse_mbssid_data  2397 2389  -8
cfg80211_ibss_joined 335 327  -8
call_usermodehelper_setup  149 141  -8
bpf_prog_alloc_no_stats  182 174  -8
blk_alloc_flush_queue  191 183  -8
bdi_alloc_node   195 187  -8
audit_log_d_path 196 188  -8
_netlbl_catmap_getnode   247 239  -8
ip_mc_inc_group  475 467  -8
__i915_sw_fence_await_sw_fence   417 405 -12
ida_alloc_range  955 934 -21
Total: Before=14874316, After=14873867, chg -0.00%

Signed-off-by: Pengfei Li 
---
 include/linux/slab.h |  6 +++---
 mm/slab.c|  4 ++--
 mm/slab_common.c |  8 
 mm/slub.c| 12 ++--
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index f53bb6980110..0842db5f7053 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -340,7 +340,7 @@ enum kmalloc_cache_type {
 
 #ifndef CONFIG_SLOB
 extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];
+kmalloc_caches[KMALLOC_CACHE_NUM][NR_KMALLOC_TYPES];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
@@ -582,7 +582,7 @@ static __always_inline void

[RESEND v4 3/7] mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc caches

2019-09-15 Thread Pengfei Li
The local variable *type* in new_kmalloc_cache() should be of type
enum kmalloc_cache_type instead of int, so correct it.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
---
 mm/slab_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8b542cfcc4f2..af45b5278fdc 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
 }
 
 static void __init
-new_kmalloc_cache(int idx, int type, slab_flags_t flags)
+new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
@@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t flags)
  */
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
-   int i, type;
+   int i;
+   enum kmalloc_cache_type type;
 
for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
-- 
2.21.0



[RESEND v4 4/7] mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests

2019-09-15 Thread Pengfei Li
This is a preparation patch; it simply replaces 0 with ZERO_SIZE_ALLOC
as the return value for zero-sized requests.
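
The caller-visible contract is unchanged by this preparation; roughly (a
sketch, not taken from the patch):

  #include <linux/printk.h>
  #include <linux/slab.h>

  /* Sketch only: what a zero-sized request still looks like to callers. */
  static void zero_size_demo(void)
  {
          void *p = kmalloc(0, GFP_KERNEL);

          /*
           * A zero-sized request yields ZERO_SIZE_PTR, not NULL.  It must
           * not be dereferenced, but passing it to kfree() is fine and is
           * a no-op.
           */
          if (p == ZERO_SIZE_PTR)
                  pr_info("zero-sized kmalloc returned ZERO_SIZE_PTR\n");

          kfree(p);
  }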

Signed-off-by: Pengfei Li 
---
 include/linux/slab.h | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index e773e5764b7b..1f05f68f2c3e 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -121,14 +121,20 @@
 #define SLAB_DEACTIVATED   ((slab_flags_t __force)0x1000U)
 
 /*
- * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
+ * ZERO_SIZE_ALLOC will be returned by kmalloc_index() if it was zero sized
+ * requests.
  *
+ * After that, ZERO_SIZE_PTR will be returned by the function that called
+ * kmalloc_index().
+
  * Dereferencing ZERO_SIZE_PTR will lead to a distinct access fault.
  *
  * ZERO_SIZE_PTR can be passed to kfree though in the same way that NULL can.
  * Both make kfree a no-op.
  */
-#define ZERO_SIZE_PTR ((void *)16)
+#define ZERO_SIZE_ALLOC(UINT_MAX)
+
+#define ZERO_SIZE_PTR  ((void *)16)
 
 #define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
(unsigned long)ZERO_SIZE_PTR)
@@ -350,7 +356,7 @@ static __always_inline enum kmalloc_cache_type 
kmalloc_type(gfp_t flags)
 static __always_inline unsigned int kmalloc_index(size_t size)
 {
if (!size)
-   return 0;
+   return ZERO_SIZE_ALLOC;
 
if (size <= KMALLOC_MIN_SIZE)
return KMALLOC_SHIFT_LOW;
@@ -546,7 +552,7 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
 #ifndef CONFIG_SLOB
index = kmalloc_index(size);
 
-   if (!index)
+   if (index == ZERO_SIZE_ALLOC)
return ZERO_SIZE_PTR;
 
return kmem_cache_alloc_trace(
@@ -564,7 +570,7 @@ static __always_inline void *kmalloc_node(size_t size, 
gfp_t flags, int node)
size <= KMALLOC_MAX_CACHE_SIZE) {
unsigned int i = kmalloc_index(size);
 
-   if (!i)
+   if (i == ZERO_SIZE_ALLOC)
return ZERO_SIZE_PTR;
 
return kmem_cache_alloc_node_trace(
-- 
2.21.0



[RESEND v4 6/7] mm, slab_common: Initialize the same size of kmalloc_caches[]

2019-09-15 Thread Pengfei Li
In the current implementation, KMALLOC_RECLAIM is not initialized
until all the KMALLOC_NORMAL sizes have been initialized.

But create_kmalloc_caches() can run faster if, for each particular size,
the different kmalloc types are initialized in order.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux.patch_1-5
add/remove: 1/2 grow/shrink: 6/64 up/down: 872/-1113 (-241)
Function old new   delta
create_kmalloc_caches  270 214 -56

$ ./scripts/bloat-o-meter vmlinux.old vmlinux.patch_1-6
add/remove: 1/2 grow/shrink: 6/64 up/down: 872/-1172 (-300)
Function old new   delta
create_kmalloc_caches  270 155 -115

The numbers above show that this really pays off.

Besides, KMALLOC_DMA is currently initialized after "slab_state = UP",
which does not seem to be necessary.

Commit f97d5f634d3b ("slab: Common function to create the kmalloc
array") introduces create_kmalloc_caches().

And I found that for SLAB, KMALLOC_DMA is initialized before
"slab_state = UP". But for SLUB, KMALLOC_DMA is initialized after
"slab_state = UP".

Based on this fact, I think it is okay to initialize KMALLOC_DMA
before "slab_state = UP".

Signed-off-by: Pengfei Li 
---
 mm/slab_common.c | 35 ---
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 2aed30deb071..e7903bd28b1f 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1165,12 +1165,9 @@ void __init setup_kmalloc_cache_index_table(void)
size_index[size_index_elem(i)] = 0;
 }
 
-static void __init
+static __always_inline void __init
 new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
-   if (type == KMALLOC_RECLAIM)
-   flags |= SLAB_RECLAIM_ACCOUNT;
-
kmalloc_caches[type][idx] = create_kmalloc_cache(
kmalloc_info[idx].name[type],
kmalloc_info[idx].size, flags, 0,
@@ -1185,30 +1182,22 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type 
type, slab_flags_t flags)
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
int i;
-   enum kmalloc_cache_type type;
 
-   for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
-   for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
-   if (!kmalloc_caches[type][i])
-   new_kmalloc_cache(i, type, flags);
-   }
-   }
+   for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
+   if (!kmalloc_caches[KMALLOC_NORMAL][i])
+   new_kmalloc_cache(i, KMALLOC_NORMAL, flags);
 
-   /* Kmalloc array is now usable */
-   slab_state = UP;
+   new_kmalloc_cache(i, KMALLOC_RECLAIM,
+   flags | SLAB_RECLAIM_ACCOUNT);
 
 #ifdef CONFIG_ZONE_DMA
-   for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
-   struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
-
-   if (s) {
-   kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
-   kmalloc_info[i].name[KMALLOC_DMA],
-   kmalloc_info[i].size,
-   SLAB_CACHE_DMA | flags, 0, 0);
-   }
-   }
+   new_kmalloc_cache(i, KMALLOC_DMA,
+   flags | SLAB_CACHE_DMA);
 #endif
+   }
+
+   /* Kmalloc array is now usable */
+   slab_state = UP;
 }
 #endif /* !CONFIG_SLOB */
 
-- 
2.21.0



[RESEND v4 2/7] mm, slab: Remove unused kmalloc_size()

2019-09-15 Thread Pengfei Li
The size of a kmalloc cache can be obtained from kmalloc_info[],
so remove kmalloc_size(), which will no longer be used.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
---
 include/linux/slab.h | 20 
 mm/slab.c|  5 +++--
 mm/slab_common.c |  5 ++---
 3 files changed, 5 insertions(+), 25 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 56c9c7eed34e..e773e5764b7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -557,26 +557,6 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
return __kmalloc(size, flags);
 }
 
-/*
- * Determine size used for the nth kmalloc cache.
- * return size or 0 if a kmalloc cache for that
- * size does not exist
- */
-static __always_inline unsigned int kmalloc_size(unsigned int n)
-{
-#ifndef CONFIG_SLOB
-   if (n > 2)
-   return 1U << n;
-
-   if (n == 1 && KMALLOC_MIN_SIZE <= 32)
-   return 96;
-
-   if (n == 2 && KMALLOC_MIN_SIZE <= 64)
-   return 192;
-#endif
-   return 0;
-}
-
 static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 #ifndef CONFIG_SLOB
diff --git a/mm/slab.c b/mm/slab.c
index c42b6211f42e..7bc4e90e1147 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,8 +1248,9 @@ void __init kmem_cache_init(void)
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
-   kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
-   0, kmalloc_size(INDEX_NODE));
+   kmalloc_info[INDEX_NODE].size,
+   ARCH_KMALLOC_FLAGS, 0,
+   kmalloc_info[INDEX_NODE].size);
slab_state = PARTIAL_NODE;
setup_kmalloc_cache_index_table();
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 002e16673581..8b542cfcc4f2 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1239,11 +1239,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-   unsigned int size = kmalloc_size(i);
-
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
kmalloc_info[i].name[KMALLOC_DMA],
-   size, SLAB_CACHE_DMA | flags, 0, 0);
+   kmalloc_info[i].size,
+   SLAB_CACHE_DMA | flags, 0, 0);
}
}
 #endif
-- 
2.21.0



[RESEND v4 1/7] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-15 Thread Pengfei Li
There are three types of kmalloc: KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

This patch predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

Besides, remove kmalloc_cache_name(), which is no longer used.
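
For example, with CONFIG_ZONE_DMA enabled, the table entry for the 64-byte
caches expands roughly as follows (an illustration of the SET_KMALLOC_SIZE()
macro defined further down in this patch, not additional code):

  /* SET_KMALLOC_SIZE(64, 64) becomes: */
  {
          .name[KMALLOC_NORMAL]  = "kmalloc-64",
          .name[KMALLOC_RECLAIM] = "kmalloc-rcl-64",
          .name[KMALLOC_DMA]     = "dma-kmalloc-64",
          .size = 64,
  }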

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
---
 mm/slab.c|  2 +-
 mm/slab.h|  2 +-
 mm/slab_common.c | 91 ++--
 3 files changed, 51 insertions(+), 44 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9df370558e5d..c42b6211f42e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1247,7 +1247,7 @@ void __init kmem_cache_init(void)
 * structures first.  Without this, further allocations will bug.
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
-   kmalloc_info[INDEX_NODE].name,
+   kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
0, kmalloc_size(INDEX_NODE));
slab_state = PARTIAL_NODE;
diff --git a/mm/slab.h b/mm/slab.h
index 9057b8056b07..2fc8f956906a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -76,7 +76,7 @@ extern struct kmem_cache *kmem_cache;
 
 /* A table of kmalloc cache names and sizes */
 extern const struct kmalloc_info_struct {
-   const char *name;
+   const char *name[NR_KMALLOC_TYPES];
unsigned int size;
 } kmalloc_info[];
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 807490fe217a..002e16673581 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1092,26 +1092,56 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t 
flags)
return kmalloc_caches[kmalloc_type(flags)][index];
 }
 
+#ifdef CONFIG_ZONE_DMA
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .name[KMALLOC_DMA] = "dma-kmalloc-" #__short_size,  \
+   .size = __size, \
+}
+#else
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .size = __size, \
+}
+#endif
+
 /*
  * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
  * kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
  * kmalloc-67108864.
  */
 const struct kmalloc_info_struct kmalloc_info[] __initconst = {
-   {NULL,  0}, {"kmalloc-96", 96},
-   {"kmalloc-192",   192}, {"kmalloc-8",   8},
-   {"kmalloc-16", 16}, {"kmalloc-32", 32},
-   {"kmalloc-64", 64}, {"kmalloc-128",   128},
-   {"kmalloc-256",   256}, {"kmalloc-512",   512},
-   {"kmalloc-1k",   1024}, {"kmalloc-2k",   2048},
-   {"kmalloc-4k",   4096}, {"kmalloc-8k",   8192},
-   {"kmalloc-16k", 16384}, {"kmalloc-32k", 32768},
-   {"kmalloc-64k", 65536}, {"kmalloc-128k",   131072},
-   {"kmalloc-256k",   262144}, {"kmalloc-512k",   524288},
-   {"kmalloc-1M",1048576}, {"kmalloc-2M",2097152},
-   {"kmalloc-4M",4194304}, {"kmalloc-8M",8388608},
-   {"kmalloc-16M",  16777216}, {"kmalloc-32M",  33554432},
-   {"kmalloc-64M",  67108864}
+   SET_KMALLOC_SIZE(0, 0),
+   SET_KMALLOC_SIZE(96, 96),
+   SET_KMALLOC_SIZE(192, 192),
+   SET_KMALLOC_SIZE(8, 8),
+   SET_KMALLOC_SIZE(16, 16),
+   SET_KMALLOC_SIZE(32, 32),
+   SET_KMALLOC_SIZE(64, 64),
+   SET_KMALLOC_SIZE(128, 128),
+   SET_KMALLOC_SIZE(256, 256),
+   SET_KMALLOC_SIZE(512, 512),
+   SET_KMALLOC_SIZE(1024, 1k),
+   SET_KMALLOC_SIZE(2048, 2k),
+   SET_KMALLOC_SIZE(4096, 4k),
+   SET_KMALLOC_SIZE(8192, 8k),
+   SET_KMALLOC_SIZE(16384, 16k),
+   SET_KMALLOC_SIZE(32768, 32k),
+   SET_KMALLOC_SIZE(655

[RESEND v4 0/7] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-15 Thread Pengfei Li
sr_get_mcn   269 252 -17
scsi_probe_and_add_lun  2947 2930 -17
ring_buffer_read_prepare 103  86 -17
request_firmware_nowait  405 388 -17
ohci_urb_enqueue  3185 3168 -17
nfs_alloc_seqid  96 79 -17
nfs4_get_state_owner  1049 1032 -17
nfs4_do_close  587 570 -17
mempool_create_node  173 156 -17
ip6_setup_cork  1030 1013 -17
ida_alloc_range  951 934 -17
gss_import_sec_context   187 170 -17
dma_pool_alloc   419 402 -17
devres_open_group  223 206 -17
cfg80211_parse_mbssid_data  2406 2389 -17
ip_setup_cork  374 354 -20
kmalloc_caches   336 312 -24
__i915_sw_fence_await_sw_fence   429 405 -24
kmalloc_cache_name  57 - -57
new_kmalloc_cache  112 - -112
create_kmalloc_caches  270 148 -122
kmalloc_info  432 8 -424
Total: Before=14874616, After=14873867, chg -0.01%

Pengfei Li (7):
  mm, slab: Make kmalloc_info[] contain all types of names
  mm, slab: Remove unused kmalloc_size()
  mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc
caches
  mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests
  mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE
  mm, slab_common: Initialize the same size of kmalloc_caches[]
  mm, slab_common: Modify kmalloc_caches[type][idx] to
kmalloc_caches[idx][type]

 include/linux/slab.h | 136 ++
 mm/slab.c|  11 ++-
 mm/slab.h|  10 +-
 mm/slab_common.c | 224 ++-
 mm/slub.c|  12 +--
 5 files changed, 183 insertions(+), 210 deletions(-)

-- 
2.21.0



[PATCH v4 7/7] mm, slab_common: Modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]

2019-09-15 Thread Pengfei Li
KMALLOC_NORMAL is the most frequently accessed type, and kmalloc_caches[]
is initialized size by size, with the different types of one size created
together.

So changing kmalloc_caches[type][idx] to kmalloc_caches[idx][type]
will benefit performance.

$ ./scripts/bloat-o-meter vmlinux.patch_1-6 vmlinux.patch_1-7
add/remove: 0/0 grow/shrink: 2/57 up/down: 8/-457 (-449)
Function old new   delta
tg3_self_test  4255 4259  +4
nf_queue 666 670  +4
kmalloc_slab  97  93  -4
i915_sw_fence_await_dma_fence  441 437  -4
__igmp_group_dropped 619 615  -4
gss_import_sec_context   176 170  -6
xhci_alloc_command   212 205  -7
create_kmalloc_caches  155 148  -7
xprt_switch_alloc  136 128  -8
xhci_segment_alloc   297 289  -8
xhci_ring_alloc  369 361  -8
xhci_mem_init  3664 3656  -8
xhci_alloc_virt_device   496 488  -8
xhci_alloc_tt_info   346 338  -8
xhci_alloc_stream_info   718 710  -8
xhci_alloc_container_ctx 215 207  -8
xfrm_policy_alloc  271 263  -8
tcp_sendmsg_locked  3120 3112  -8
tcp_md5_do_add   774 766  -8
tcp_fastopen_defer_connect   270 262  -8
sr_read_tochdr.isra  251 243  -8
sr_read_tocentry.isra  328 320  -8
sr_is_xa 376 368  -8
sr_get_mcn   260 252  -8
selinux_sk_alloc_security  113 105  -8
sdev_evt_send_simple  118 110  -8
sdev_evt_alloc  79 71  -8
scsi_probe_and_add_lun  2938 2930  -8
sbitmap_queue_init_node  418 410  -8
ring_buffer_read_prepare  94  86  -8
request_firmware_nowait  396 388  -8
regulatory_hint_found_beacon 394 386  -8
ohci_urb_enqueue  3176 3168  -8
nla_strdup   142 134  -8
nfs_alloc_seqid   87  79  -8
nfs4_get_state_owner  1040 1032  -8
nfs4_do_close  578 570  -8
nf_ct_tmpl_alloc  85  77  -8
mempool_create_node  164 156  -8
ip_setup_cork  362 354  -8
ip6_setup_cork  1021 1013  -8
gss_create_cred  140 132  -8
drm_flip_work_allocate_task   70  62  -8
dma_pool_alloc   410 402  -8
devres_open_group  214 206  -8
cfg80211_stop_iface  260 252  -8
cfg80211_sinfo_alloc_tid_stats  77 69  -8
cfg80211_port_authorized  212 204  -8
cfg80211_parse_mbssid_data  2397 2389  -8
cfg80211_ibss_joined 335 327  -8
call_usermodehelper_setup  149 141  -8
bpf_prog_alloc_no_stats  182 174  -8
blk_alloc_flush_queue  191 183  -8
bdi_alloc_node   195 187  -8
audit_log_d_path 196 188  -8
_netlbl_catmap_getnode   247 239  -8
ip_mc_inc_group  475 467  -8
__i915_sw_fence_await_sw_fence   417 405 -12
ida_alloc_range  955 934 -21
Total: Before=14874316, After=14873867, chg -0.00%

Signed-off-by: Pengfei Li 
---
 include/linux/slab.h |  6 +++---
 mm/slab.c|  4 ++--
 mm/slab_common.c |  8 
 mm/slub.c| 12 ++--
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index f53bb6980110..0842db5f7053 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -340,7 +340,7 @@ enum kmalloc_cache_type {
 
 #ifndef CONFIG_SLOB
 extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];
+kmalloc_caches[KMALLOC_CACHE_NUM][NR_KMALLOC_TYPES];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
@@ -582,7 +582,7 @@ static __always_inline void

[PATCH v4 6/7] mm, slab_common: Initialize the same size of kmalloc_caches[]

2019-09-15 Thread Pengfei Li
In the current implementation, KMALLOC_RECLAIM is not initialized
until all the KMALLOC_NORMAL sizes have been initialized.

But create_kmalloc_caches() can run faster if, for each particular size,
the different kmalloc types are initialized in order.

$ ./scripts/bloat-o-meter vmlinux.old vmlinux.patch_1-5
add/remove: 1/2 grow/shrink: 6/64 up/down: 872/-1113 (-241)
Function old new   delta
create_kmalloc_caches  270 214 -56

$ ./scripts/bloat-o-meter vmlinux.old vmlinux.patch_1-6
add/remove: 1/2 grow/shrink: 6/64 up/down: 872/-1172 (-300)
Function old new   delta
create_kmalloc_caches  270 155 -115

The numbers above show that this really pays off.

Besides, KMALLOC_DMA is currently initialized after "slab_state = UP",
which does not seem to be necessary.

Commit f97d5f634d3b ("slab: Common function to create the kmalloc
array") introduces create_kmalloc_caches().

And I found that for SLAB, KMALLOC_DMA is initialized before
"slab_state = UP". But for SLUB, KMALLOC_DMA is initialized after
"slab_state = UP".

Based on this fact, I think it is okay to initialize KMALLOC_DMA
before "slab_state = UP".

Signed-off-by: Pengfei Li 
---
 mm/slab_common.c | 35 ---
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 2aed30deb071..e7903bd28b1f 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1165,12 +1165,9 @@ void __init setup_kmalloc_cache_index_table(void)
size_index[size_index_elem(i)] = 0;
 }
 
-static void __init
+static __always_inline void __init
 new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
-   if (type == KMALLOC_RECLAIM)
-   flags |= SLAB_RECLAIM_ACCOUNT;
-
kmalloc_caches[type][idx] = create_kmalloc_cache(
kmalloc_info[idx].name[type],
kmalloc_info[idx].size, flags, 0,
@@ -1185,30 +1182,22 @@ new_kmalloc_cache(int idx, enum kmalloc_cache_type 
type, slab_flags_t flags)
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
int i;
-   enum kmalloc_cache_type type;
 
-   for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
-   for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
-   if (!kmalloc_caches[type][i])
-   new_kmalloc_cache(i, type, flags);
-   }
-   }
+   for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
+   if (!kmalloc_caches[KMALLOC_NORMAL][i])
+   new_kmalloc_cache(i, KMALLOC_NORMAL, flags);
 
-   /* Kmalloc array is now usable */
-   slab_state = UP;
+   new_kmalloc_cache(i, KMALLOC_RECLAIM,
+   flags | SLAB_RECLAIM_ACCOUNT);
 
 #ifdef CONFIG_ZONE_DMA
-   for (i = 0; i < KMALLOC_CACHE_NUM; i++) {
-   struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
-
-   if (s) {
-   kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
-   kmalloc_info[i].name[KMALLOC_DMA],
-   kmalloc_info[i].size,
-   SLAB_CACHE_DMA | flags, 0, 0);
-   }
-   }
+   new_kmalloc_cache(i, KMALLOC_DMA,
+   flags | SLAB_CACHE_DMA);
 #endif
+   }
+
+   /* Kmalloc array is now usable */
+   slab_state = UP;
 }
 #endif /* !CONFIG_SLOB */
 
-- 
2.21.0



[PATCH v4 4/7] mm, slab: return ZERO_SIZE_ALLOC for zero sized kmalloc requests

2019-09-15 Thread Pengfei Li
This is a preparation patch; it simply replaces 0 with ZERO_SIZE_ALLOC
as the return value for zero-sized requests.

Signed-off-by: Pengfei Li 
---
 include/linux/slab.h | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index e773e5764b7b..1f05f68f2c3e 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -121,14 +121,20 @@
 #define SLAB_DEACTIVATED   ((slab_flags_t __force)0x1000U)
 
 /*
- * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
+ * ZERO_SIZE_ALLOC will be returned by kmalloc_index() if it was zero sized
+ * requests.
  *
+ * After that, ZERO_SIZE_PTR will be returned by the function that called
+ * kmalloc_index().
+
  * Dereferencing ZERO_SIZE_PTR will lead to a distinct access fault.
  *
  * ZERO_SIZE_PTR can be passed to kfree though in the same way that NULL can.
  * Both make kfree a no-op.
  */
-#define ZERO_SIZE_PTR ((void *)16)
+#define ZERO_SIZE_ALLOC(UINT_MAX)
+
+#define ZERO_SIZE_PTR  ((void *)16)
 
 #define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
(unsigned long)ZERO_SIZE_PTR)
@@ -350,7 +356,7 @@ static __always_inline enum kmalloc_cache_type 
kmalloc_type(gfp_t flags)
 static __always_inline unsigned int kmalloc_index(size_t size)
 {
if (!size)
-   return 0;
+   return ZERO_SIZE_ALLOC;
 
if (size <= KMALLOC_MIN_SIZE)
return KMALLOC_SHIFT_LOW;
@@ -546,7 +552,7 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
 #ifndef CONFIG_SLOB
index = kmalloc_index(size);
 
-   if (!index)
+   if (index == ZERO_SIZE_ALLOC)
return ZERO_SIZE_PTR;
 
return kmem_cache_alloc_trace(
@@ -564,7 +570,7 @@ static __always_inline void *kmalloc_node(size_t size, 
gfp_t flags, int node)
size <= KMALLOC_MAX_CACHE_SIZE) {
unsigned int i = kmalloc_index(size);
 
-   if (!i)
+   if (i == ZERO_SIZE_ALLOC)
return ZERO_SIZE_PTR;
 
return kmem_cache_alloc_node_trace(
-- 
2.21.0



[PATCH v4 5/7] mm, slab_common: make kmalloc_caches[] start at size KMALLOC_MIN_SIZE

2019-09-15 Thread Pengfei Li
96  -9
ohci_urb_enqueue  3185 3176 -9
nfs_alloc_seqid  96 87 -9
nfs4_get_state_owner  1049 1040 -9
nfs4_do_close  587 578 -9
mempool_create_node  173 164 -9
ip6_setup_cork  1030 1021 -9
dma_pool_alloc  419 410 -9
devres_open_group  223 214 -9
cfg80211_parse_mbssid_data  2406 2397 -9
__igmp_group_dropped 629 619 -10
gss_import_sec_context   187 176 -11
ip_setup_cork  374 362 -12
__i915_sw_fence_await_sw_fence   429 417 -12
kmalloc_caches   336 312 -24
create_kmalloc_caches  270 214 -56
kmalloc_cache_name  57 - -57
new_kmalloc_cache  112 - -112
kmalloc_info  432 8 -424
Total: Before=14874616, After=14874375, chg -0.00%

Signed-off-by: Pengfei Li 
---
 include/linux/slab.h |  96 --
 mm/slab.h|  10 +--
 mm/slab_common.c | 158 ---
 mm/slub.c|  12 ++--
 4 files changed, 133 insertions(+), 143 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 1f05f68f2c3e..f53bb6980110 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -297,6 +297,23 @@ static inline void __check_heap_object(const void *ptr, 
unsigned long n,
 #define KMALLOC_MIN_SIZE (1 << KMALLOC_SHIFT_LOW)
 #endif
 
+#define KMALLOC_CACHE_MIN_NUM  (KMALLOC_SHIFT_HIGH - KMALLOC_SHIFT_LOW + 1)
+
+#if KMALLOC_MIN_SIZE > 64
+   #define KMALLOC_SIZE_96_EXIST   (0)
+   #define KMALLOC_SIZE_192_EXIST  (0)
+#elif KMALLOC_MIN_SIZE > 32
+   #define KMALLOC_SIZE_96_EXIST   (0)
+   #define KMALLOC_SIZE_192_EXIST  (1)
+#else
+   #define KMALLOC_SIZE_96_EXIST   (1)
+   #define KMALLOC_SIZE_192_EXIST  (1)
+#endif
+
+#define KMALLOC_CACHE_NUM  (KMALLOC_CACHE_MIN_NUM  \
+   + KMALLOC_SIZE_96_EXIST \
+   + KMALLOC_SIZE_192_EXIST)
+
 /*
  * This restriction comes from byte sized index implementation.
  * Page size is normally 2^12 bytes and, in this case, if we want to use
@@ -323,7 +340,7 @@ enum kmalloc_cache_type {
 
 #ifndef CONFIG_SLOB
 extern struct kmem_cache *
-kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
+kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_CACHE_NUM];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
@@ -345,13 +362,18 @@ static __always_inline enum kmalloc_cache_type 
kmalloc_type(gfp_t flags)
 #endif
 }
 
+/* kmalloc_index adjust size: (0, 96] */
+#define KMALLOC_IDX_ADJ_0  (KMALLOC_SHIFT_LOW)
+
+/* kmalloc_index adjust size: (96, 192] */
+#define KMALLOC_IDX_ADJ_1  (KMALLOC_IDX_ADJ_0 - KMALLOC_SIZE_96_EXIST)
+
+/* kmalloc_index adjust size: (192, N] */
+#define KMALLOC_IDX_ADJ_2  (KMALLOC_IDX_ADJ_1 - KMALLOC_SIZE_192_EXIST)
+
 /*
  * Figure out which kmalloc slab an allocation of a certain size
  * belongs to.
- * 0 = zero alloc
- * 1 =  65 .. 96 bytes
- * 2 = 129 .. 192 bytes
- * n = 2^(n-1)+1 .. 2^n
  */
 static __always_inline unsigned int kmalloc_index(size_t size)
 {
@@ -359,36 +381,40 @@ static __always_inline unsigned int kmalloc_index(size_t 
size)
return ZERO_SIZE_ALLOC;
 
if (size <= KMALLOC_MIN_SIZE)
-   return KMALLOC_SHIFT_LOW;
-
-   if (KMALLOC_MIN_SIZE <= 32 && size > 64 && size <= 96)
-   return 1;
-   if (KMALLOC_MIN_SIZE <= 64 && size > 128 && size <= 192)
-   return 2;
-   if (size <=  8) return 3;
-   if (size <= 16) return 4;
-   if (size <= 32) return 5;
-   if (size <= 64) return 6;
-   if (size <=128) return 7;
-   if (size <=256) return 8;
-   if (size <=512) return 9;
-   if (size <=   1024) return 10;
-   if (size <=   2 * 1024) return 11;
-   if (size <=   4 * 1024) return 12;
-   if (size <=   8 * 1024) return 13;
-   if (size <=  16 * 1024) return 14;
-   if (size <=  32 * 1024) return 15;
-   if (size <=  64 * 1024) return 16;
-   if (size <= 128 * 1024) return 17;
-   if (size <= 256 * 1024) return 18;
-   if (size <= 512 * 1024) return 19;
-   if (size <= 1024 * 1024) return 20;
-   if (size <=  2 * 1024 * 1024) return 21;
-   if (size <=  4 * 1024 * 1024) return 22;
-   if (size <=  8 * 1024 * 1024) ret

[PATCH v4 3/7] mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc caches

2019-09-15 Thread Pengfei Li
The local variable *type* in new_kmalloc_cache() should be of type
enum kmalloc_cache_type instead of int, so correct it.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
---
 mm/slab_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8b542cfcc4f2..af45b5278fdc 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
 }
 
 static void __init
-new_kmalloc_cache(int idx, int type, slab_flags_t flags)
+new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
@@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t flags)
  */
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
-   int i, type;
+   int i;
+   enum kmalloc_cache_type type;
 
for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
-- 
2.21.0



[PATCH v4 2/7] mm, slab: Remove unused kmalloc_size()

2019-09-15 Thread Pengfei Li
The size of a kmalloc cache can be obtained from kmalloc_info[],
so remove kmalloc_size(), which will no longer be used.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
---
 include/linux/slab.h | 20 
 mm/slab.c|  5 +++--
 mm/slab_common.c |  5 ++---
 3 files changed, 5 insertions(+), 25 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 56c9c7eed34e..e773e5764b7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -557,26 +557,6 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
return __kmalloc(size, flags);
 }
 
-/*
- * Determine size used for the nth kmalloc cache.
- * return size or 0 if a kmalloc cache for that
- * size does not exist
- */
-static __always_inline unsigned int kmalloc_size(unsigned int n)
-{
-#ifndef CONFIG_SLOB
-   if (n > 2)
-   return 1U << n;
-
-   if (n == 1 && KMALLOC_MIN_SIZE <= 32)
-   return 96;
-
-   if (n == 2 && KMALLOC_MIN_SIZE <= 64)
-   return 192;
-#endif
-   return 0;
-}
-
 static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 #ifndef CONFIG_SLOB
diff --git a/mm/slab.c b/mm/slab.c
index c42b6211f42e..7bc4e90e1147 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,8 +1248,9 @@ void __init kmem_cache_init(void)
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
-   kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
-   0, kmalloc_size(INDEX_NODE));
+   kmalloc_info[INDEX_NODE].size,
+   ARCH_KMALLOC_FLAGS, 0,
+   kmalloc_info[INDEX_NODE].size);
slab_state = PARTIAL_NODE;
setup_kmalloc_cache_index_table();
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 002e16673581..8b542cfcc4f2 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1239,11 +1239,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-   unsigned int size = kmalloc_size(i);
-
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
kmalloc_info[i].name[KMALLOC_DMA],
-   size, SLAB_CACHE_DMA | flags, 0, 0);
+   kmalloc_info[i].size,
+   SLAB_CACHE_DMA | flags, 0, 0);
}
}
 #endif
-- 
2.21.0



[PATCH v4 0/7] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-15 Thread Pengfei Li
sr_get_mcn   269 252 -17
scsi_probe_and_add_lun  2947 2930 -17
ring_buffer_read_prepare 103  86 -17
request_firmware_nowait  405 388 -17
ohci_urb_enqueue  3185 3168 -17
nfs_alloc_seqid  96 79 -17
nfs4_get_state_owner  1049 1032 -17
nfs4_do_close  587 570 -17
mempool_create_node  173 156 -17
ip6_setup_cork  1030 1013 -17
ida_alloc_range  951 934 -17
gss_import_sec_context   187 170 -17
dma_pool_alloc   419 402 -17
devres_open_group  223 206 -17
cfg80211_parse_mbssid_data  2406 2389 -17
ip_setup_cork  374 354 -20
kmalloc_caches   336 312 -24
__i915_sw_fence_await_sw_fence   429 405 -24
kmalloc_cache_name  57 - -57
new_kmalloc_cache  112 - -112
create_kmalloc_caches  270 148 -122
kmalloc_info  432 8 -424
Total: Before=14874616, After=14873867, chg -0.01%


Pengfei Li (7):
  mm, slab: Make kmalloc_info[] contain all types of names
  mm, slab: Remove unused kmalloc_size()
  mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc
caches
  mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests
  mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE
  mm, slab_common: Initialize the same size of kmalloc_caches[]
  mm, slab_common: Modify kmalloc_caches[type][idx] to
kmalloc_caches[idx][type]

 include/linux/slab.h | 136 ++
 mm/slab.c|  11 ++-
 mm/slab.h|  10 +-
 mm/slab_common.c | 224 ++-
 mm/slub.c|  12 +--
 5 files changed, 183 insertions(+), 210 deletions(-)

-- 
2.21.0



[PATCH v4 1/7] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-15 Thread Pengfei Li
There are three types of kmalloc: KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

This patch predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

Besides, remove kmalloc_cache_name(), which is no longer used.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
---
 mm/slab.c|  2 +-
 mm/slab.h|  2 +-
 mm/slab_common.c | 91 ++--
 3 files changed, 51 insertions(+), 44 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9df370558e5d..c42b6211f42e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1247,7 +1247,7 @@ void __init kmem_cache_init(void)
 * structures first.  Without this, further allocations will bug.
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
-   kmalloc_info[INDEX_NODE].name,
+   kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
0, kmalloc_size(INDEX_NODE));
slab_state = PARTIAL_NODE;
diff --git a/mm/slab.h b/mm/slab.h
index 9057b8056b07..2fc8f956906a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -76,7 +76,7 @@ extern struct kmem_cache *kmem_cache;
 
 /* A table of kmalloc cache names and sizes */
 extern const struct kmalloc_info_struct {
-   const char *name;
+   const char *name[NR_KMALLOC_TYPES];
unsigned int size;
 } kmalloc_info[];
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 807490fe217a..002e16673581 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1092,26 +1092,56 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t 
flags)
return kmalloc_caches[kmalloc_type(flags)][index];
 }
 
+#ifdef CONFIG_ZONE_DMA
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .name[KMALLOC_DMA] = "dma-kmalloc-" #__short_size,  \
+   .size = __size, \
+}
+#else
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .size = __size, \
+}
+#endif
+
 /*
  * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
  * kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
  * kmalloc-67108864.
  */
 const struct kmalloc_info_struct kmalloc_info[] __initconst = {
-   {NULL,  0}, {"kmalloc-96", 96},
-   {"kmalloc-192",   192}, {"kmalloc-8",   8},
-   {"kmalloc-16", 16}, {"kmalloc-32", 32},
-   {"kmalloc-64", 64}, {"kmalloc-128",   128},
-   {"kmalloc-256",   256}, {"kmalloc-512",   512},
-   {"kmalloc-1k",   1024}, {"kmalloc-2k",   2048},
-   {"kmalloc-4k",   4096}, {"kmalloc-8k",   8192},
-   {"kmalloc-16k", 16384}, {"kmalloc-32k", 32768},
-   {"kmalloc-64k", 65536}, {"kmalloc-128k",   131072},
-   {"kmalloc-256k",   262144}, {"kmalloc-512k",   524288},
-   {"kmalloc-1M",1048576}, {"kmalloc-2M",2097152},
-   {"kmalloc-4M",4194304}, {"kmalloc-8M",8388608},
-   {"kmalloc-16M",  16777216}, {"kmalloc-32M",  33554432},
-   {"kmalloc-64M",  67108864}
+   SET_KMALLOC_SIZE(0, 0),
+   SET_KMALLOC_SIZE(96, 96),
+   SET_KMALLOC_SIZE(192, 192),
+   SET_KMALLOC_SIZE(8, 8),
+   SET_KMALLOC_SIZE(16, 16),
+   SET_KMALLOC_SIZE(32, 32),
+   SET_KMALLOC_SIZE(64, 64),
+   SET_KMALLOC_SIZE(128, 128),
+   SET_KMALLOC_SIZE(256, 256),
+   SET_KMALLOC_SIZE(512, 512),
+   SET_KMALLOC_SIZE(1024, 1k),
+   SET_KMALLOC_SIZE(2048, 2k),
+   SET_KMALLOC_SIZE(4096, 4k),
+   SET_KMALLOC_SIZE(8192, 8k),
+   SET_KMALLOC_SIZE(16384, 16k),
+   SET_KMALLOC_SIZE(32768, 32k),
+   SET_KMALLOC_SIZE(655

Re: [PATCH v3 4/4] mm, slab_common: Make the loop for initializing KMALLOC_DMA start from 1

2019-09-11 Thread Pengfei Li
On Tue, Sep 10, 2019 at 6:26 PM Vlastimil Babka  wrote:
>
> On 9/10/19 3:26 AM, Pengfei Li wrote:
> > KMALLOC_DMA will be initialized only if KMALLOC_NORMAL with
> > the same index exists.
> >
> > And kmalloc_caches[KMALLOC_NORMAL][0] is always NULL.
> >
> > Therefore, the loop that initializes KMALLOC_DMA should start
> > at 1 instead of 0, which will reduce 1 meaningless attempt.
>
> IMHO the saving of one iteration isn't worth making the code more
> subtle. KMALLOC_SHIFT_LOW would be nice, but that would skip 1 + 2 which
> are special.
>

Yes, I agree with you.
This really makes the code more subtle.

> Since you're doing these cleanups, have you considered reordering
> kmalloc_info, size_index, kmalloc_index() etc so that sizes 96 and 192
> are ordered naturally between 64, 128 and 256? That should remove
> various special casing such as in create_kmalloc_caches(). I can't
> guarantee it will be possible without breaking e.g. constant folding
> optimizations etc., but seems to me it should be feasible. (There are
> definitely more places to change than those I listed.)
>

For the past two days, I have been working on what you suggested.

So far, I have completed the coding work, but I need some time to make
sure there are no bugs and verify the impact on performance.

I will send v4 soon.

Thank you for your review and suggestions.

--
Pengfei

> > Signed-off-by: Pengfei Li 
> > ---
> >   mm/slab_common.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index af45b5278fdc..c81fc7dc2946 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -1236,7 +1236,7 @@ void __init create_kmalloc_caches(slab_flags_t flags)
> >   slab_state = UP;
> >
> >   #ifdef CONFIG_ZONE_DMA
> > - for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
> > + for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
> >   struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
> >
> >   if (s) {
> >
>


[PATCH v3 4/4] mm, slab_common: Make the loop for initializing KMALLOC_DMA start from 1

2019-09-09 Thread Pengfei Li
KMALLOC_DMA will be initialized only if KMALLOC_NORMAL with
the same index exists.

And kmalloc_caches[KMALLOC_NORMAL][0] is always NULL.

Therefore, the loop that initializes KMALLOC_DMA should start
at 1 instead of 0, which avoids one meaningless iteration.
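
For context, the KMALLOC_NORMAL caches are created by a loop that starts
at KMALLOC_SHIFT_LOW, with the 96/192 caches slipped in at indexes 1 and
2, so index 0 is never filled. Roughly (a simplified sketch, from memory
of the 5.3-era create_kmalloc_caches()):

for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
        if (!kmalloc_caches[type][i])
                new_kmalloc_cache(i, type, flags);

        /*
         * The non-power-of-two caches (96 and 192) are created right
         * after the preceding power-of-two cache, at indexes 1 and 2.
         */
        if (KMALLOC_MIN_SIZE <= 32 && i == 6 && !kmalloc_caches[type][1])
                new_kmalloc_cache(1, type, flags);
        if (KMALLOC_MIN_SIZE <= 64 && i == 7 && !kmalloc_caches[type][2])
                new_kmalloc_cache(2, type, flags);
}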

Signed-off-by: Pengfei Li 
---
 mm/slab_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index af45b5278fdc..c81fc7dc2946 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1236,7 +1236,7 @@ void __init create_kmalloc_caches(slab_flags_t flags)
slab_state = UP;
 
 #ifdef CONFIG_ZONE_DMA
-   for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
+   for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-- 
2.21.0



[PATCH v3 1/4] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-09 Thread Pengfei Li
There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

This patch predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

Besides, remove the kmalloc_cache_name() that is no longer used.
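
With one name per type in the table, creating a cache becomes a plain
table lookup. Roughly what new_kmalloc_cache() ends up doing once the
whole series is applied (illustrative sketch, not the exact diff):

static void __init
new_kmalloc_cache(int idx, int type, slab_flags_t flags)
{
        if (type == KMALLOC_RECLAIM)
                flags |= SLAB_RECLAIM_ACCOUNT;

        kmalloc_caches[type][idx] = create_kmalloc_cache(
                                        kmalloc_info[idx].name[type],
                                        kmalloc_info[idx].size, flags, 0,
                                        kmalloc_info[idx].size);
}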

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
---
 mm/slab.c|  2 +-
 mm/slab.h|  2 +-
 mm/slab_common.c | 91 ++--
 3 files changed, 51 insertions(+), 44 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9df370558e5d..c42b6211f42e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1247,7 +1247,7 @@ void __init kmem_cache_init(void)
 * structures first.  Without this, further allocations will bug.
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
-   kmalloc_info[INDEX_NODE].name,
+   kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
0, kmalloc_size(INDEX_NODE));
slab_state = PARTIAL_NODE;
diff --git a/mm/slab.h b/mm/slab.h
index 9057b8056b07..2fc8f956906a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -76,7 +76,7 @@ extern struct kmem_cache *kmem_cache;
 
 /* A table of kmalloc cache names and sizes */
 extern const struct kmalloc_info_struct {
-   const char *name;
+   const char *name[NR_KMALLOC_TYPES];
unsigned int size;
 } kmalloc_info[];
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 807490fe217a..002e16673581 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1092,26 +1092,56 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t 
flags)
return kmalloc_caches[kmalloc_type(flags)][index];
 }
 
+#ifdef CONFIG_ZONE_DMA
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .name[KMALLOC_DMA] = "dma-kmalloc-" #__short_size,  \
+   .size = __size, \
+}
+#else
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .size = __size, \
+}
+#endif
+
 /*
  * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
  * kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
  * kmalloc-67108864.
  */
 const struct kmalloc_info_struct kmalloc_info[] __initconst = {
-   {NULL,  0}, {"kmalloc-96", 96},
-   {"kmalloc-192",   192}, {"kmalloc-8",   8},
-   {"kmalloc-16", 16}, {"kmalloc-32", 32},
-   {"kmalloc-64", 64}, {"kmalloc-128",   128},
-   {"kmalloc-256",   256}, {"kmalloc-512",   512},
-   {"kmalloc-1k",   1024}, {"kmalloc-2k",   2048},
-   {"kmalloc-4k",   4096}, {"kmalloc-8k",   8192},
-   {"kmalloc-16k", 16384}, {"kmalloc-32k", 32768},
-   {"kmalloc-64k", 65536}, {"kmalloc-128k",   131072},
-   {"kmalloc-256k",   262144}, {"kmalloc-512k",   524288},
-   {"kmalloc-1M",1048576}, {"kmalloc-2M",2097152},
-   {"kmalloc-4M",4194304}, {"kmalloc-8M",8388608},
-   {"kmalloc-16M",  16777216}, {"kmalloc-32M",  33554432},
-   {"kmalloc-64M",  67108864}
+   SET_KMALLOC_SIZE(0, 0),
+   SET_KMALLOC_SIZE(96, 96),
+   SET_KMALLOC_SIZE(192, 192),
+   SET_KMALLOC_SIZE(8, 8),
+   SET_KMALLOC_SIZE(16, 16),
+   SET_KMALLOC_SIZE(32, 32),
+   SET_KMALLOC_SIZE(64, 64),
+   SET_KMALLOC_SIZE(128, 128),
+   SET_KMALLOC_SIZE(256, 256),
+   SET_KMALLOC_SIZE(512, 512),
+   SET_KMALLOC_SIZE(1024, 1k),
+   SET_KMALLOC_SIZE(2048, 2k),
+   SET_KMALLOC_SIZE(4096, 4k),
+   SET_KMALLOC_SIZE(8192, 8k),
+   SET_KMALLOC_SIZE(16384, 16k),
+   SET_KMALLOC_SIZE(32768, 32k),
+   SET_KMALLOC_SIZE(655

[PATCH v3 0/4] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-09 Thread Pengfei Li
Changes in v3
--
1. restore __initconst (patch 1/4)
2. rename patch 3/4
3. add more clarification for patch 4/4

Changes in v2
--
1. remove __initconst (patch 1/5)
2. squash patch 2/5
3. add ack tag from Vlastimil Babka


There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().
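
(Background for readers: which of the three types a given allocation uses
is decided from the gfp flags. A simplified sketch of the 5.3-era
kmalloc_type() helper:)

static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
#ifdef CONFIG_ZONE_DMA
        /* the common case: neither __GFP_DMA nor __GFP_RECLAIMABLE is set */
        if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
                return KMALLOC_NORMAL;

        /* if both are set, __GFP_DMA takes precedence */
        return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
#else
        return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
#endif
}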

Patch1 predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

The other 4 patches did some cleanup work.

These changes make sense, and the time spent by new_kmalloc_cache()
has been reduced by approximately 36.3%.

                 Time spent by new_kmalloc_cache() (CPU cycles)
5.3-rc7                66264
5.3-rc7+patch          42188


Pengfei Li (4):
  mm, slab: Make kmalloc_info[] contain all types of names
  mm, slab: Remove unused kmalloc_size()
  mm, slab_common: use enum kmalloc_cache_type to iterate over kmalloc
caches
  mm, slab_common: Make the loop for initializing KMALLOC_DMA start from
1

 include/linux/slab.h |  20 -
 mm/slab.c|   7 +--
 mm/slab.h|   2 +-
 mm/slab_common.c | 101 +++
 4 files changed, 59 insertions(+), 71 deletions(-)

-- 
2.21.0



[PATCH v3 2/4] mm, slab: Remove unused kmalloc_size()

2019-09-09 Thread Pengfei Li
The size of kmalloc can be obtained from kmalloc_info[],
so remove kmalloc_size() that will not be used anymore.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
---
 include/linux/slab.h | 20 
 mm/slab.c|  5 +++--
 mm/slab_common.c |  5 ++---
 3 files changed, 5 insertions(+), 25 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 56c9c7eed34e..e773e5764b7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -557,26 +557,6 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
return __kmalloc(size, flags);
 }
 
-/*
- * Determine size used for the nth kmalloc cache.
- * return size or 0 if a kmalloc cache for that
- * size does not exist
- */
-static __always_inline unsigned int kmalloc_size(unsigned int n)
-{
-#ifndef CONFIG_SLOB
-   if (n > 2)
-   return 1U << n;
-
-   if (n == 1 && KMALLOC_MIN_SIZE <= 32)
-   return 96;
-
-   if (n == 2 && KMALLOC_MIN_SIZE <= 64)
-   return 192;
-#endif
-   return 0;
-}
-
 static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 #ifndef CONFIG_SLOB
diff --git a/mm/slab.c b/mm/slab.c
index c42b6211f42e..7bc4e90e1147 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,8 +1248,9 @@ void __init kmem_cache_init(void)
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
-   kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
-   0, kmalloc_size(INDEX_NODE));
+   kmalloc_info[INDEX_NODE].size,
+   ARCH_KMALLOC_FLAGS, 0,
+   kmalloc_info[INDEX_NODE].size);
slab_state = PARTIAL_NODE;
setup_kmalloc_cache_index_table();
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 002e16673581..8b542cfcc4f2 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1239,11 +1239,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-   unsigned int size = kmalloc_size(i);
-
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
kmalloc_info[i].name[KMALLOC_DMA],
-   size, SLAB_CACHE_DMA | flags, 0, 0);
+   kmalloc_info[i].size,
+   SLAB_CACHE_DMA | flags, 0, 0);
}
}
 #endif
-- 
2.21.0



[PATCH v3 3/4] mm, slab_common: use enum kmalloc_cache_type to iterate over kmalloc caches

2019-09-09 Thread Pengfei Li
The type of local variable *type* of new_kmalloc_cache() should
be enum kmalloc_cache_type instead of int, so correct it.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
Acked-by: Roman Gushchin 
---
 mm/slab_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8b542cfcc4f2..af45b5278fdc 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
 }
 
 static void __init
-new_kmalloc_cache(int idx, int type, slab_flags_t flags)
+new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
@@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t flags)
  */
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
-   int i, type;
+   int i;
+   enum kmalloc_cache_type type;
 
for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
-- 
2.21.0



Re: [PATCH v2 3/4] mm, slab_common: Make 'type' is enum kmalloc_cache_type

2019-09-09 Thread Pengfei Li
On Tue, Sep 10, 2019 at 4:00 AM Roman Gushchin  wrote:
>
> On Tue, Sep 10, 2019 at 01:07:14AM +0800, Pengfei Li wrote:
>
> Hi Pengfei!
>
> > The 'type' of the function new_kmalloc_cache should be
> > enum kmalloc_cache_type instead of int, so correct it.
>
> I think you mean type of the 'i' variable, not the type of
> new_kmalloc_cache() function. Also the name of the patch is
> misleading. How about
> mm, slab_common: use enum kmalloc_cache_type to iterate over kmalloc caches ?
> Or something like this.
>

Ok, this name is really better :)

> The rest of the series looks good to me.
>
> Please, feel free to use
> Acked-by: Roman Gushchin 
> for patches [1-3] in the series after fixing this commit message and
> restoring __initconst.
>

Thanks!

> Patch [4] needs some additional clarifications, IMO.
>

I will add more clarification in v3.

> Thank you!
>
> >
> > Signed-off-by: Pengfei Li 
> > Acked-by: Vlastimil Babka 
>
>
>
> > ---
> >  mm/slab_common.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index cae27210e4c3..d64a64660f86 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
> >  }
> >
> >  static void __init
> > -new_kmalloc_cache(int idx, int type, slab_flags_t flags)
> > +new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t 
> > flags)
> >  {
> >   if (type == KMALLOC_RECLAIM)
> >   flags |= SLAB_RECLAIM_ACCOUNT;
> > @@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t 
> > flags)
> >   */
> >  void __init create_kmalloc_caches(slab_flags_t flags)
> >  {
> > - int i, type;
> > + int i;
> > + enum kmalloc_cache_type type;
> >
> >   for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
> >   for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
> > --
> > 2.21.0
> >
> >


Re: [PATCH 1/5] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-09 Thread Pengfei Li
On Tue, Sep 10, 2019 at 2:30 AM Rasmus Villemoes
 wrote:
>
> On 09/09/2019 18.53, Pengfei Li wrote:
> > On Mon, Sep 9, 2019 at 10:59 PM Vlastimil Babka  wrote:
>
> >>>   /*
> >>>* kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot 
> >>> time.
> >>>* kmalloc_index() supports up to 2^26=64MB, so the final entry of the 
> >>> table is
> >>>* kmalloc-67108864.
> >>>*/
> >>>   const struct kmalloc_info_struct kmalloc_info[] __initconst = {
> >>
> >> BTW should it really be an __initconst, when references to the names
> >> keep on living in kmem_cache structs? Isn't this for data that's
> >> discarded after init?
> >
> > You are right, I will remove __initconst in v2.
>
> No, __initconst is correct, and should be kept. The string literals
> which the .name pointers point to live in .rodata, and we're copying the
> values of these .name pointers. Nothing refers to something inside
> kmalloc_info[] after init. (It would be a whole different matter if
> struct kmalloc_info_struct consisted of { char name[NN]; unsigned int
> size; }).
>

Thank you for your comment. I will keep it in v3.

I did learn :)
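
To restate the point with a reduced example (hypothetical struct names,
just for illustration):

/* Fine as __initconst: the table stores only pointers, and the string
 * literals they point to live in .rodata, not in the table itself. */
struct info_by_pointer {
        const char *name;
        unsigned int size;
};

/* Would NOT be fine: here the characters live inside the table, so any
 * name kept after init would point into memory that has been discarded. */
struct info_embedded {
        char name[32];
        unsigned int size;
};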


> Rasmus


[PATCH v2 4/4] mm, slab_common: Make initializing KMALLOC_DMA start from 1

2019-09-09 Thread Pengfei Li
kmalloc_caches[KMALLOC_NORMAL][0] will never be initialized,
so the loop should start at 1 instead of 0

Signed-off-by: Pengfei Li 
---
 mm/slab_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index d64a64660f86..6b3e526934d9 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1236,7 +1236,7 @@ void __init create_kmalloc_caches(slab_flags_t flags)
slab_state = UP;
 
 #ifdef CONFIG_ZONE_DMA
-   for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
+   for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-- 
2.21.0



[PATCH v2 2/4] mm, slab: Remove unused kmalloc_size()

2019-09-09 Thread Pengfei Li
The size of kmalloc can be obtained from kmalloc_info[],
so remove kmalloc_size() that will not be used anymore.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
---
 include/linux/slab.h | 20 
 mm/slab.c|  5 +++--
 mm/slab_common.c |  5 ++---
 3 files changed, 5 insertions(+), 25 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 56c9c7eed34e..e773e5764b7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -557,26 +557,6 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
return __kmalloc(size, flags);
 }
 
-/*
- * Determine size used for the nth kmalloc cache.
- * return size or 0 if a kmalloc cache for that
- * size does not exist
- */
-static __always_inline unsigned int kmalloc_size(unsigned int n)
-{
-#ifndef CONFIG_SLOB
-   if (n > 2)
-   return 1U << n;
-
-   if (n == 1 && KMALLOC_MIN_SIZE <= 32)
-   return 96;
-
-   if (n == 2 && KMALLOC_MIN_SIZE <= 64)
-   return 192;
-#endif
-   return 0;
-}
-
 static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 #ifndef CONFIG_SLOB
diff --git a/mm/slab.c b/mm/slab.c
index c42b6211f42e..7bc4e90e1147 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,8 +1248,9 @@ void __init kmem_cache_init(void)
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
-   kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
-   0, kmalloc_size(INDEX_NODE));
+   kmalloc_info[INDEX_NODE].size,
+   ARCH_KMALLOC_FLAGS, 0,
+   kmalloc_info[INDEX_NODE].size);
slab_state = PARTIAL_NODE;
setup_kmalloc_cache_index_table();
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 61c1e2e54263..cae27210e4c3 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1239,11 +1239,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-   unsigned int size = kmalloc_size(i);
-
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
kmalloc_info[i].name[KMALLOC_DMA],
-   size, SLAB_CACHE_DMA | flags, 0, 0);
+   kmalloc_info[i].size,
+   SLAB_CACHE_DMA | flags, 0, 0);
}
}
 #endif
-- 
2.21.0



[PATCH v2 1/4] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-09 Thread Pengfei Li
There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

This patch predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

Besides, remove the kmalloc_cache_name() that is no longer used.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
---
 mm/slab.c|  2 +-
 mm/slab.h|  2 +-
 mm/slab_common.c | 93 ++--
 3 files changed, 52 insertions(+), 45 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9df370558e5d..c42b6211f42e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1247,7 +1247,7 @@ void __init kmem_cache_init(void)
 * structures first.  Without this, further allocations will bug.
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
-   kmalloc_info[INDEX_NODE].name,
+   kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
0, kmalloc_size(INDEX_NODE));
slab_state = PARTIAL_NODE;
diff --git a/mm/slab.h b/mm/slab.h
index 9057b8056b07..2fc8f956906a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -76,7 +76,7 @@ extern struct kmem_cache *kmem_cache;
 
 /* A table of kmalloc cache names and sizes */
 extern const struct kmalloc_info_struct {
-   const char *name;
+   const char *name[NR_KMALLOC_TYPES];
unsigned int size;
 } kmalloc_info[];
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 807490fe217a..61c1e2e54263 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1092,26 +1092,56 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t 
flags)
return kmalloc_caches[kmalloc_type(flags)][index];
 }
 
+#ifdef CONFIG_ZONE_DMA
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .name[KMALLOC_DMA] = "dma-kmalloc-" #__short_size,  \
+   .size = __size, \
+}
+#else
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .size = __size, \
+}
+#endif
+
 /*
  * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
  * kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
  * kmalloc-67108864.
  */
-const struct kmalloc_info_struct kmalloc_info[] __initconst = {
-   {NULL,  0}, {"kmalloc-96", 96},
-   {"kmalloc-192",   192}, {"kmalloc-8",   8},
-   {"kmalloc-16", 16}, {"kmalloc-32", 32},
-   {"kmalloc-64", 64}, {"kmalloc-128",   128},
-   {"kmalloc-256",   256}, {"kmalloc-512",   512},
-   {"kmalloc-1k",   1024}, {"kmalloc-2k",   2048},
-   {"kmalloc-4k",   4096}, {"kmalloc-8k",   8192},
-   {"kmalloc-16k", 16384}, {"kmalloc-32k", 32768},
-   {"kmalloc-64k", 65536}, {"kmalloc-128k",   131072},
-   {"kmalloc-256k",   262144}, {"kmalloc-512k",   524288},
-   {"kmalloc-1M",1048576}, {"kmalloc-2M",2097152},
-   {"kmalloc-4M",4194304}, {"kmalloc-8M",8388608},
-   {"kmalloc-16M",  16777216}, {"kmalloc-32M",  33554432},
-   {"kmalloc-64M",  67108864}
+const struct kmalloc_info_struct kmalloc_info[] = {
+   SET_KMALLOC_SIZE(0, 0),
+   SET_KMALLOC_SIZE(96, 96),
+   SET_KMALLOC_SIZE(192, 192),
+   SET_KMALLOC_SIZE(8, 8),
+   SET_KMALLOC_SIZE(16, 16),
+   SET_KMALLOC_SIZE(32, 32),
+   SET_KMALLOC_SIZE(64, 64),
+   SET_KMALLOC_SIZE(128, 128),
+   SET_KMALLOC_SIZE(256, 256),
+   SET_KMALLOC_SIZE(512, 512),
+   SET_KMALLOC_SIZE(1024, 1k),
+   SET_KMALLOC_SIZE(2048, 2k),
+   SET_KMALLOC_SIZE(4096, 4k),
+   SET_KMALLOC_SIZE(8192, 8k),
+   SET_KMALLOC_SIZE(16384, 16k),
+   SET_KMALLOC_SIZE(32768, 32k),
+

[PATCH v2 3/4] mm, slab_common: Make 'type' is enum kmalloc_cache_type

2019-09-09 Thread Pengfei Li
The 'type' of the function new_kmalloc_cache should be
enum kmalloc_cache_type instead of int, so correct it.

Signed-off-by: Pengfei Li 
Acked-by: Vlastimil Babka 
---
 mm/slab_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index cae27210e4c3..d64a64660f86 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
 }
 
 static void __init
-new_kmalloc_cache(int idx, int type, slab_flags_t flags)
+new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
@@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t flags)
  */
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
-   int i, type;
+   int i;
+   enum kmalloc_cache_type type;
 
for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
-- 
2.21.0



[PATCH v2 0/4] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-09 Thread Pengfei Li


Changes in v2
--
1. remove __initconst (patch 1/5)
2. squash patch 2/5
3. add ack tag from Vlastimil Babka


There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

Patch1 predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

The other 4 patches did some cleanup work.

These changes make sense, and the time spent by new_kmalloc_cache()
has been reduced by approximately 36.3%.

                 Time spent by new_kmalloc_cache() (CPU cycles)
5.3-rc7                66264
5.3-rc7+patch          42188

Pengfei Li (4):
  mm, slab: Make kmalloc_info[] contain all types of names
  mm, slab: Remove unused kmalloc_size()
  mm, slab_common: Make 'type' is enum kmalloc_cache_type
  mm, slab_common: Make initializing KMALLOC_DMA start from 1

 include/linux/slab.h |  20 -
 mm/slab.c|   7 +--
 mm/slab.h|   2 +-
 mm/slab_common.c | 103 +++
 4 files changed, 60 insertions(+), 72 deletions(-)

-- 
2.21.0



Re: [PATCH 2/5] mm, slab_common: Remove unused kmalloc_cache_name()

2019-09-09 Thread Pengfei Li
On Mon, Sep 9, 2019 at 10:59 PM Vlastimil Babka  wrote:
>
> On 9/3/19 6:04 PM, Pengfei Li wrote:
> > Since the name of kmalloc can be obtained from kmalloc_info[],
> > remove the kmalloc_cache_name() that is no longer used.
>
> That could simply be part of patch 1/5 really.
>

Ok, thanks.

> > Signed-off-by: Pengfei Li 
>
> Ack
>
> > ---
> >   mm/slab_common.c | 15 ---
> >   1 file changed, 15 deletions(-)


Re: [PATCH 1/5] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-09 Thread Pengfei Li
On Mon, Sep 9, 2019 at 10:59 PM Vlastimil Babka  wrote:
>
> On 9/3/19 6:04 PM, Pengfei Li wrote:
> > There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
> > and KMALLOC_DMA.
> >
> > The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
> > but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
> > generated by kmalloc_cache_name().
> >
> > This patch predefines the names of all types of kmalloc to save
> > the time spent dynamically generating names.
>
> As I said, IMHO it's more useful that we don't need to allocate the
> names dynamically anymore, and it's simpler overall.
>

Thank you very much for your review.

> > Signed-off-by: Pengfei Li 
>
> Acked-by: Vlastimil Babka 
>
> >   /*
> >* kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot 
> > time.
> >* kmalloc_index() supports up to 2^26=64MB, so the final entry of the 
> > table is
> >* kmalloc-67108864.
> >*/
> >   const struct kmalloc_info_struct kmalloc_info[] __initconst = {
>
> BTW should it really be an __initconst, when references to the names
> keep on living in kmem_cache structs? Isn't this for data that's
> discarded after init?

You are right, I will remove __initconst in v2.


Re: [PATCH 0/5] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-05 Thread Pengfei Li
On Thu, Sep 5, 2019 at 8:25 PM Vlastimil Babka  wrote:
>
> On 9/3/19 6:04 PM, Pengfei Li wrote:
> > There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
> > and KMALLOC_DMA.
> >
> > The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
> > but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
> > generated by kmalloc_cache_name().
> >
> > Patch1 predefines the names of all types of kmalloc to save
> > the time spent dynamically generating names.
> >
> > The other 4 patches did some cleanup work.
> >
> > These changes make sense, and the time spent by new_kmalloc_cache()
> > has been reduced by approximately 36.3%.
> >
> >  Time spent by
> >  new_kmalloc_cache()
> > 5.3-rc7   66264
> > 5.3-rc7+patch 42188
>
> Note that the caches are created only once upon boot, so I doubt that

Thank you for your comments.
Yes, kmalloc-xxx are only created at boot time.

> these time savings (is it in CPU cycles?) will be noticeable at all.

Yes, it is CPU cycles.

> But diffstat looks ok, and it avoids using kmalloc() (via kasprintf()) to
> allocate names for kmalloc(), so in that sense I think it's worthwhile
> to consider. Thanks.
>

Thanks.

> > Pengfei Li (5):
> >   mm, slab: Make kmalloc_info[] contain all types of names
> >   mm, slab_common: Remove unused kmalloc_cache_name()
> >   mm, slab: Remove unused kmalloc_size()
> >   mm, slab_common: Make 'type' is enum kmalloc_cache_type
> >   mm, slab_common: Make initializing KMALLOC_DMA start from 1
> >
> >  include/linux/slab.h |  20 -
> >  mm/slab.c|   7 +--
> >  mm/slab.h|   2 +-
> >  mm/slab_common.c | 101 +++
> >  4 files changed, 59 insertions(+), 71 deletions(-)
> >
>


Re: [PATCH 0/5] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-04 Thread Pengfei Li
On Thu, Sep 5, 2019 at 3:27 AM Christopher Lameter  wrote:
>
> On Wed, 4 Sep 2019, Pengfei Li wrote:
>
> > There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
> > and KMALLOC_DMA.
>
> I only got a few patches of this set. Can I see the complete patchset
> somewhere?

Yes, you can get the patches from
https://patchwork.kernel.org/cover/11128325/ or
https://lore.kernel.org/patchwork/cover/1123412/

Hope you like it :)


[PATCH 5/5] mm, slab_common: Make initializing KMALLOC_DMA start from 1

2019-09-03 Thread Pengfei Li
kmalloc_caches[KMALLOC_NORMAL][0] will never be initialized,
so the loop should start at 1 instead of 0

Signed-off-by: Pengfei Li 
---
 mm/slab_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index af45b5278fdc..c81fc7dc2946 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1236,7 +1236,7 @@ void __init create_kmalloc_caches(slab_flags_t flags)
slab_state = UP;
 
 #ifdef CONFIG_ZONE_DMA
-   for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
+   for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-- 
2.21.0



[PATCH 4/5] mm, slab_common: Make 'type' is enum kmalloc_cache_type

2019-09-03 Thread Pengfei Li
The 'type' of the function new_kmalloc_cache should be
enum kmalloc_cache_type instead of int, so correct it.

Signed-off-by: Pengfei Li 
---
 mm/slab_common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8b542cfcc4f2..af45b5278fdc 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1192,7 +1192,7 @@ void __init setup_kmalloc_cache_index_table(void)
 }
 
 static void __init
-new_kmalloc_cache(int idx, int type, slab_flags_t flags)
+new_kmalloc_cache(int idx, enum kmalloc_cache_type type, slab_flags_t flags)
 {
if (type == KMALLOC_RECLAIM)
flags |= SLAB_RECLAIM_ACCOUNT;
@@ -1210,7 +1210,8 @@ new_kmalloc_cache(int idx, int type, slab_flags_t flags)
  */
 void __init create_kmalloc_caches(slab_flags_t flags)
 {
-   int i, type;
+   int i;
+   enum kmalloc_cache_type type;
 
for (type = KMALLOC_NORMAL; type <= KMALLOC_RECLAIM; type++) {
for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
-- 
2.21.0



[PATCH 3/5] mm, slab: Remove unused kmalloc_size()

2019-09-03 Thread Pengfei Li
The size of kmalloc can be obtained from kmalloc_info[],
so remove kmalloc_size() that will not be used anymore.

Signed-off-by: Pengfei Li 
---
 include/linux/slab.h | 20 
 mm/slab.c|  5 +++--
 mm/slab_common.c |  5 ++---
 3 files changed, 5 insertions(+), 25 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 56c9c7eed34e..e773e5764b7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -557,26 +557,6 @@ static __always_inline void *kmalloc(size_t size, gfp_t 
flags)
return __kmalloc(size, flags);
 }
 
-/*
- * Determine size used for the nth kmalloc cache.
- * return size or 0 if a kmalloc cache for that
- * size does not exist
- */
-static __always_inline unsigned int kmalloc_size(unsigned int n)
-{
-#ifndef CONFIG_SLOB
-   if (n > 2)
-   return 1U << n;
-
-   if (n == 1 && KMALLOC_MIN_SIZE <= 32)
-   return 96;
-
-   if (n == 2 && KMALLOC_MIN_SIZE <= 64)
-   return 192;
-#endif
-   return 0;
-}
-
 static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 #ifndef CONFIG_SLOB
diff --git a/mm/slab.c b/mm/slab.c
index c42b6211f42e..7bc4e90e1147 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1248,8 +1248,9 @@ void __init kmem_cache_init(void)
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
-   kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
-   0, kmalloc_size(INDEX_NODE));
+   kmalloc_info[INDEX_NODE].size,
+   ARCH_KMALLOC_FLAGS, 0,
+   kmalloc_info[INDEX_NODE].size);
slab_state = PARTIAL_NODE;
setup_kmalloc_cache_index_table();
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 002e16673581..8b542cfcc4f2 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1239,11 +1239,10 @@ void __init create_kmalloc_caches(slab_flags_t flags)
struct kmem_cache *s = kmalloc_caches[KMALLOC_NORMAL][i];
 
if (s) {
-   unsigned int size = kmalloc_size(i);
-
kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache(
kmalloc_info[i].name[KMALLOC_DMA],
-   size, SLAB_CACHE_DMA | flags, 0, 0);
+   kmalloc_info[i].size,
+   SLAB_CACHE_DMA | flags, 0, 0);
}
}
 #endif
-- 
2.21.0



[PATCH 2/5] mm, slab_common: Remove unused kmalloc_cache_name()

2019-09-03 Thread Pengfei Li
Since the name of kmalloc can be obtained from kmalloc_info[],
remove the kmalloc_cache_name() that is no longer used.

Signed-off-by: Pengfei Li 
---
 mm/slab_common.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 7bd88cc09987..002e16673581 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1191,21 +1191,6 @@ void __init setup_kmalloc_cache_index_table(void)
}
 }
 
-static const char *
-kmalloc_cache_name(const char *prefix, unsigned int size)
-{
-
-   static const char units[3] = "\0kM";
-   int idx = 0;
-
-   while (size >= 1024 && (size % 1024 == 0)) {
-   size /= 1024;
-   idx++;
-   }
-
-   return kasprintf(GFP_NOWAIT, "%s-%u%c", prefix, size, units[idx]);
-}
-
 static void __init
 new_kmalloc_cache(int idx, int type, slab_flags_t flags)
 {
-- 
2.21.0



[PATCH 1/5] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-03 Thread Pengfei Li
There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

This patch predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

Signed-off-by: Pengfei Li 
---
 mm/slab.c|  2 +-
 mm/slab.h|  2 +-
 mm/slab_common.c | 76 +++-
 3 files changed, 51 insertions(+), 29 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9df370558e5d..c42b6211f42e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1247,7 +1247,7 @@ void __init kmem_cache_init(void)
 * structures first.  Without this, further allocations will bug.
 */
kmalloc_caches[KMALLOC_NORMAL][INDEX_NODE] = create_kmalloc_cache(
-   kmalloc_info[INDEX_NODE].name,
+   kmalloc_info[INDEX_NODE].name[KMALLOC_NORMAL],
kmalloc_size(INDEX_NODE), ARCH_KMALLOC_FLAGS,
0, kmalloc_size(INDEX_NODE));
slab_state = PARTIAL_NODE;
diff --git a/mm/slab.h b/mm/slab.h
index 9057b8056b07..2fc8f956906a 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -76,7 +76,7 @@ extern struct kmem_cache *kmem_cache;
 
 /* A table of kmalloc cache names and sizes */
 extern const struct kmalloc_info_struct {
-   const char *name;
+   const char *name[NR_KMALLOC_TYPES];
unsigned int size;
 } kmalloc_info[];
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 807490fe217a..7bd88cc09987 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1092,26 +1092,56 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t 
flags)
return kmalloc_caches[kmalloc_type(flags)][index];
 }
 
+#ifdef CONFIG_ZONE_DMA
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .name[KMALLOC_DMA] = "dma-kmalloc-" #__short_size,  \
+   .size = __size, \
+}
+#else
+#define SET_KMALLOC_SIZE(__size, __short_size) \
+{  \
+   .name[KMALLOC_NORMAL]  = "kmalloc-" #__short_size,  \
+   .name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #__short_size,  \
+   .size = __size, \
+}
+#endif
+
 /*
  * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
  * kmalloc_index() supports up to 2^26=64MB, so the final entry of the table is
  * kmalloc-67108864.
  */
 const struct kmalloc_info_struct kmalloc_info[] __initconst = {
-   {NULL,  0}, {"kmalloc-96", 96},
-   {"kmalloc-192",   192}, {"kmalloc-8",   8},
-   {"kmalloc-16", 16}, {"kmalloc-32", 32},
-   {"kmalloc-64", 64}, {"kmalloc-128",   128},
-   {"kmalloc-256",   256}, {"kmalloc-512",   512},
-   {"kmalloc-1k",   1024}, {"kmalloc-2k",   2048},
-   {"kmalloc-4k",   4096}, {"kmalloc-8k",   8192},
-   {"kmalloc-16k", 16384}, {"kmalloc-32k", 32768},
-   {"kmalloc-64k", 65536}, {"kmalloc-128k",   131072},
-   {"kmalloc-256k",   262144}, {"kmalloc-512k",   524288},
-   {"kmalloc-1M",1048576}, {"kmalloc-2M",2097152},
-   {"kmalloc-4M",4194304}, {"kmalloc-8M",8388608},
-   {"kmalloc-16M",  16777216}, {"kmalloc-32M",  33554432},
-   {"kmalloc-64M",  67108864}
+   SET_KMALLOC_SIZE(0, 0),
+   SET_KMALLOC_SIZE(96, 96),
+   SET_KMALLOC_SIZE(192, 192),
+   SET_KMALLOC_SIZE(8, 8),
+   SET_KMALLOC_SIZE(16, 16),
+   SET_KMALLOC_SIZE(32, 32),
+   SET_KMALLOC_SIZE(64, 64),
+   SET_KMALLOC_SIZE(128, 128),
+   SET_KMALLOC_SIZE(256, 256),
+   SET_KMALLOC_SIZE(512, 512),
+   SET_KMALLOC_SIZE(1024, 1k),
+   SET_KMALLOC_SIZE(2048, 2k),
+   SET_KMALLOC_SIZE(4096, 4k),
+   SET_KMALLOC_SIZE(8192, 8k),
+   SET_KMALLOC_SIZE(16384, 16k),
+   SET_KMALLOC_SIZE(32768, 32k),
+   SET_KMALLOC_SIZE(65536, 64k),
+   SET_KMALLOC_SIZE(131072, 128k),
+   SET_KMALLOC_SIZE(262144, 256k),
+   SET_KMALLOC_SIZE(5242

[PATCH 0/5] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-03 Thread Pengfei Li
There are three types of kmalloc, KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

Patch1 predefines the names of all types of kmalloc to save
the time spent dynamically generating names.

The other 4 patches did some cleanup work.

These changes make sense, and the time spent by new_kmalloc_cache()
has been reduced by approximately 36.3%.

                 Time spent by new_kmalloc_cache() (CPU cycles)
5.3-rc7                66264
5.3-rc7+patch          42188

Pengfei Li (5):
  mm, slab: Make kmalloc_info[] contain all types of names
  mm, slab_common: Remove unused kmalloc_cache_name()
  mm, slab: Remove unused kmalloc_size()
  mm, slab_common: Make 'type' is enum kmalloc_cache_type
  mm, slab_common: Make initializing KMALLOC_DMA start from 1

 include/linux/slab.h |  20 -
 mm/slab.c|   7 +--
 mm/slab.h|   2 +-
 mm/slab_common.c | 101 +++
 4 files changed, 59 insertions(+), 71 deletions(-)

-- 
2.21.0



Re: [PATCH] mm/page_alloc: cleanup __alloc_pages_direct_compact()

2019-08-19 Thread Pengfei Li
On Mon, Aug 19, 2019 at 9:50 PM Vlastimil Babka  wrote:
>
> On 8/17/19 12:51 PM, Pengfei Li wrote:
> > This patch cleans up the if(page).
> >
> > No functional change.
> >
> > Signed-off-by: Pengfei Li 
>
> I don't see much benefit here. The indentation wasn't that bad that it
> had to be reduced using goto. But the patch is not incorrect so I'm not
> NACKing.
>

Thanks for your review and comments.

This patch reduces the number of times if (page) is checked
(much as the compiler would), and the downside is that it introduces a goto.

If this improves readability, accept it. Otherwise, leave it as it is.

Thanks again.

---
Pengfei


> > ---
> >  mm/page_alloc.c | 28 
> >  1 file changed, 16 insertions(+), 12 deletions(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 272c6de1bf4e..51f056ac09f5 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3890,6 +3890,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned 
> > int order,
> >   enum compact_priority prio, enum compact_result 
> > *compact_result)
> >  {
> >   struct page *page = NULL;
> > + struct zone *zone;
> >   unsigned long pflags;
> >   unsigned int noreclaim_flag;
> >
> > @@ -3911,23 +3912,26 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, 
> > unsigned int order,
> >*/
> >   count_vm_event(COMPACTSTALL);
> >
> > - /* Prep a captured page if available */
> > - if (page)
> > + if (page) {
> > + /* Prep a captured page if available */
> >   prep_new_page(page, order, gfp_mask, alloc_flags);
> > -
> > - /* Try get a page from the freelist if available */
> > - if (!page)
> > + } else {
> > + /* Try get a page from the freelist if available */
> >   page = get_page_from_freelist(gfp_mask, order, alloc_flags, 
> > ac);
> >
> > - if (page) {
> > - struct zone *zone = page_zone(page);
> > -
> > - zone->compact_blockskip_flush = false;
> > - compaction_defer_reset(zone, order, true);
> > - count_vm_event(COMPACTSUCCESS);
> > - return page;
> > + if (!page)
> > + goto failed;
> >   }
> >
> > + zone = page_zone(page);
> > + zone->compact_blockskip_flush = false;
> > + compaction_defer_reset(zone, order, true);
> > +
> > + count_vm_event(COMPACTSUCCESS);
> > +
> > + return page;
> > +
> > +failed:
> >   /*
> >* It's bad if compaction run occurs and fails. The most likely reason
> >* is that pages exist, but not enough to satisfy watermarks.
> >
>


[PATCH] mm/page_alloc: cleanup __alloc_pages_direct_compact()

2019-08-17 Thread Pengfei Li
This patch cleans up the if(page).

No functional change.

Signed-off-by: Pengfei Li 
---
 mm/page_alloc.c | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 272c6de1bf4e..51f056ac09f5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3890,6 +3890,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int 
order,
enum compact_priority prio, enum compact_result *compact_result)
 {
struct page *page = NULL;
+   struct zone *zone;
unsigned long pflags;
unsigned int noreclaim_flag;
 
@@ -3911,23 +3912,26 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned 
int order,
 */
count_vm_event(COMPACTSTALL);
 
-   /* Prep a captured page if available */
-   if (page)
+   if (page) {
+   /* Prep a captured page if available */
prep_new_page(page, order, gfp_mask, alloc_flags);
-
-   /* Try get a page from the freelist if available */
-   if (!page)
+   } else {
+   /* Try get a page from the freelist if available */
page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
 
-   if (page) {
-   struct zone *zone = page_zone(page);
-
-   zone->compact_blockskip_flush = false;
-   compaction_defer_reset(zone, order, true);
-   count_vm_event(COMPACTSUCCESS);
-   return page;
+   if (!page)
+   goto failed;
}
 
+   zone = page_zone(page);
+   zone->compact_blockskip_flush = false;
+   compaction_defer_reset(zone, order, true);
+
+   count_vm_event(COMPACTSUCCESS);
+
+   return page;
+
+failed:
/*
 * It's bad if compaction run occurs and fails. The most likely reason
 * is that pages exist, but not enough to satisfy watermarks.
-- 
2.21.0



[PATCH] mm/compaction: remove unnecessary zone parameter in isolate_migratepages()

2019-08-06 Thread Pengfei Li
Like commit 40cacbcb3240 ("mm, compaction: remove unnecessary zone
parameter in some instances"), remove unnecessary zone parameter.

No functional change.

Signed-off-by: Pengfei Li 
---
 mm/compaction.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 952dc2fb24e5..685c3e3d0a0f 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1737,8 +1737,7 @@ static unsigned long fast_find_migrateblock(struct 
compact_control *cc)
  * starting at the block pointed to by the migrate scanner pfn within
  * compact_control.
  */
-static isolate_migrate_t isolate_migratepages(struct zone *zone,
-   struct compact_control *cc)
+static isolate_migrate_t isolate_migratepages(struct compact_control *cc)
 {
unsigned long block_start_pfn;
unsigned long block_end_pfn;
@@ -1756,8 +1755,8 @@ static isolate_migrate_t isolate_migratepages(struct zone 
*zone,
 */
low_pfn = fast_find_migrateblock(cc);
block_start_pfn = pageblock_start_pfn(low_pfn);
-   if (block_start_pfn < zone->zone_start_pfn)
-   block_start_pfn = zone->zone_start_pfn;
+   if (block_start_pfn < cc->zone->zone_start_pfn)
+   block_start_pfn = cc->zone->zone_start_pfn;
 
/*
 * fast_find_migrateblock marks a pageblock skipped so to avoid
@@ -1787,8 +1786,8 @@ static isolate_migrate_t isolate_migratepages(struct zone 
*zone,
if (!(low_pfn % (SWAP_CLUSTER_MAX * pageblock_nr_pages)))
cond_resched();
 
-   page = pageblock_pfn_to_page(block_start_pfn, block_end_pfn,
-   zone);
+   page = pageblock_pfn_to_page(block_start_pfn,
+   block_end_pfn, cc->zone);
if (!page)
continue;
 
@@ -2158,7 +2157,7 @@ compact_zone(struct compact_control *cc, struct 
capture_control *capc)
cc->rescan = true;
}
 
-   switch (isolate_migratepages(cc->zone, cc)) {
+   switch (isolate_migratepages(cc)) {
case ISOLATE_ABORT:
ret = COMPACT_CONTENDED;
putback_movable_pages(&cc->migratepages);
-- 
2.21.0



Re: [PATCH 00/10] make "order" unsigned int

2019-07-29 Thread Pengfei Li
On Mon, Jul 29, 2019 at 4:34 PM Mel Gorman  wrote:
>
> On Sun, Jul 28, 2019 at 12:44:36AM +0800, Pengfei Li wrote:
> > On Fri, Jul 26, 2019 at 3:26 PM Mel Gorman  
> > wrote:
> > >
> >
> > Thank you for your comments.
> >
> > > On Fri, Jul 26, 2019 at 02:42:43AM +0800, Pengfei Li wrote:
> > > > Objective
> > > > 
> > > > The motivation for this series of patches is use unsigned int for
> > > > "order" in compaction.c, just like in other memory subsystems.
> > > >
> > >
> > > Why? The series is relatively subtle in parts, particularly patch 5.
> >
> > Before I sent this series of patches, I took a close look at the
> > git log for compact.c.
> >
> > Here is a short history, trouble you to look patiently.
> >
> > 1) At first, "order" is _unsigned int_
> >
> > The commit 56de7263fcf3 ("mm: compaction: direct compact when a
> > high-order allocation fails") introduced the "order" in
> > compact_control and its type is unsigned int.
> >
> > Besides, you specify that order == -1 is the flag that triggers
> > compaction via proc.
> >
>
> Yes, specifying that compaction in that context is for the entire zone
> without any specific allocation context or request.

Yes

>
> > 2) Next, because order equals -1 is special, it causes an error.
> >
> > The commit 7be62de99adc ("vmscan: kswapd carefully call compaction")
> > determines if "order" is less than 0.
> >
> > That comparison can never be true, because at that point the type of
> > "order" was _unsigned int_.
> >
> > -   compact_zone(zone, &cc);
> > +   if (cc->order < 0 || !compaction_deferred(zone))
> >
> > 3) Finally, in order to fix the above error, the type of the order
> > is modified to _int_
> >
> > It is done by commit: aad6ec3777bf ("mm: compaction: make
> > compact_control order signed").
> >
> > The reason I mention this is because I want to express that the
> > type of "order" is originally _unsigned int_. And "order" is
> > modified to _int_ because of the special value of -1.
> >
>
> And in itself, why does that matter?

The -1 is what forces "order" to be int, which breaks the consistency of its type across the code.

>
> > If the special value of "order" is not a negative number (for
> > example, -1), but a number greater than MAX_ORDER - 1 (for example,
> > MAX_ORDER), then the "order" may still be _unsigned int_ now.
> >
>
> Sure, but then you have to check that every check on order understands
> the new special value.
>

Since this check is done by is_via_compact_memory(), it is easy to modify the
special value being checked.

I have gone over every such check many times, and now I need some review
from the community.

> > > There have been places where by it was important for order to be able to
> > > go negative due to loop exit conditions.
> >
> > I think that even if "cc->order" is _unsigned int_, it can be done
> > with a local temporary variable easily.
> >
> > Like this,
> >
> > function(...)
> > {
> > for(int tmp_order = cc->order; tmp_order >= 0; tmp_order--) {
> > ...
> > }
> > }
> >
>
> Yes, it can be expressed as unsigned but in itself why does that justify
> the review of a large series?

At first glance this series looks large, but at least half of it just
updates the corresponding trace functions. Apart from patches 4 and 5,
the patches only change the type of "order".

Even patch 5, which does modify logic, only changes about 50 lines of
code.

> There is limited to no performance gain
> and functionally it's equivalent.

This is just cleanup, and others have done similar work before.

commit d00181b96eb8 ("mm: use 'unsigned int' for page order")
commit 7aeb09f9104b ("mm: page_alloc: use unsigned int for order in
more places")

>
> > > If there was a gain from this
> > > or it was a cleanup in the context of another major body of work, I
> > > could understand the justification but that does not appear to be the
> > > case here.
> > >
> >
> > My final conclusion:
> >
> > Why "order" is _int_ instead of unsigned int?
> >   => Because order == -1 is used as the flag.
> > => So what about making "order" greater than MAX_ORDER - 1?
> >   => The "order&qu

Re: [PATCH 00/10] make "order" unsigned int

2019-07-27 Thread Pengfei Li
On Fri, Jul 26, 2019 at 3:12 PM Michal Hocko  wrote:
>

Thank you for your comments.

> On Fri 26-07-19 07:48:36, Pengfei Li wrote:
> [...]
> > For the benefit, "order" may be negative, which is confusing and weird.
>
> order = -1 has a special meaning.
>

Yes. But I mean that -1 could be replaced by any number greater than
MAX_ORDER - 1; there is no reason for it to be negative.

> > There is no good reason not to do this since it can be avoided.
>
> "This is good because we can do it" doesn't really sound like a
> convincing argument to me. I would understand if this reduced a
> generated code, made an overall code readability much better or
> something along those lines. Also we only use MAX_ORDER range of values
> so I could argue that a smaller data type (e.g. short) should be
> sufficient for this data type.
>

I have resent an email explaining the motivation behind this series, and
I would be very grateful if you could post some comments on it.

> Please note that _any_ change, alebit seemingly small, can introduce a
> subtle bug. Also each patch requires a man power to review so you have
> to understand that "just because we can" is not a strong motivation for
> people to spend their time on such a patch.

Sincerely thank you, I will keep these in mind.

> --
> Michal Hocko
> SUSE Labs

--
Pengfei


Re: [PATCH 00/10] make "order" unsigned int

2019-07-27 Thread Pengfei Li
On Fri, Jul 26, 2019 at 3:26 PM Mel Gorman  wrote:
>

Thank you for your comments.

> On Fri, Jul 26, 2019 at 02:42:43AM +0800, Pengfei Li wrote:
> > Objective
> > 
> > The motivation for this series of patches is use unsigned int for
> > "order" in compaction.c, just like in other memory subsystems.
> >
>
> Why? The series is relatively subtle in parts, particularly patch 5.

Before I sent this series of patches, I took a close look at the
git log for compaction.c.

Here is a short history; please bear with me.

1) At first, "order" is _unsigned int_

The commit 56de7263fcf3 ("mm: compaction: direct compact when a
high-order allocation fails") introduced the "order" in
compact_control and its type is unsigned int.

Besides, you specify that order == -1 is the flag that triggers
compaction via proc.

2) Next, because order == -1 is special, it caused an error.

Commit 7be62de99adc ("vmscan: kswapd carefully call compaction") added a
check for "order" being less than 0.

That comparison can never be true, because at that point the type of
"order" was _unsigned int_.

-   compact_zone(zone, &cc);
+   if (cc->order < 0 || !compaction_deferred(zone))
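
(Illustrative aside: the pitfall is simply that an unsigned value can
never be negative, so the compiler silently turns the check into a no-op.
A minimal user-space example, which gcc warns about with
-Wextra/-Wtype-limits:)

#include <stdio.h>

int main(void)
{
        unsigned int order = -1;        /* wraps around to UINT_MAX */

        if (order < 0)                  /* always false for an unsigned type */
                printf("manual compaction requested\n");
        else
                printf("order = %u, so the '< 0' check never fires\n", order);

        return 0;
}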

3) Finally, in order to fix the above error, the type of the order
is modified to _int_

It is done by commit: aad6ec3777bf ("mm: compaction: make
compact_control order signed").

The reason I mention this is that the type of "order" was originally
_unsigned int_, and it was changed to _int_ only because of the special
value -1.

If the special value had not been a negative number (-1) but instead a
number greater than MAX_ORDER - 1 (for example, MAX_ORDER), then "order"
could still be _unsigned int_ today.

> There have been places where by it was important for order to be able to
> go negative due to loop exit conditions.

I think that even if "cc->order" is _unsigned int_, it can be done
with a local temporary variable easily.

Like this,

function(...)
{
for(int tmp_order = cc->order; tmp_order >= 0; tmp_order--) {
...
}
}

> If there was a gain from this
> or it was a cleanup in the context of another major body of work, I
> could understand the justification but that does not appear to be the
> case here.
>

My final conclusion:

Why "order" is _int_ instead of unsigned int?
  => Because order == -1 is used as the flag.
=> So what about making "order" greater than MAX_ORDER - 1?
  => The "order" can be _unsigned int_ just like in most places.

(Can we only pick -1 as this special value?)

This series of patches makes sense because:

1) It guarantees that "order" remains the same type.

No one likes to see this

__alloc_pages_slowpath(unsigned int order, ...)
  => should_compact_retry(int order, ...)              /* the type changed */
    => compaction_zonelist_suitable(int order, ...)
      => __compaction_suitable(int order, ...)
        => zone_watermark_ok(unsigned int order, ...)  /* the type changed again! */

2) It eliminates the evil "order == -1".

If "order" is specified as any positive number greater than
MAX_ORDER - 1 in commit 56de7263fcf3, perhaps no int order will
appear in compact.c until now.
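
Concretely, the idea is to change the sentinel check rather than the type
everywhere else. A sketch (the first helper reflects today's
mm/compaction.c, the second is the proposed shape, not merged code):

/* today: "order" has to be a signed int so that -1 is representable */
static inline bool is_via_compact_memory(int order)
{
        return order == -1;
}

/* proposed: use a value above any real allocation order as the
 * "compact the whole zone" sentinel, so "order" can stay unsigned */
static inline bool is_via_compact_memory(unsigned int order)
{
        return order > MAX_ORDER - 1;
}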

> --
> Mel Gorman

Thank you again for your comments, and sincerely thank you for
your patience in reading such a long email.

> SUSE Labs


Re: [PATCH 02/10] mm/page_alloc: use unsigned int for "order" in __rmqueue_fallback()

2019-07-26 Thread Pengfei Li
On Fri, Jul 26, 2019 at 5:36 PM Rasmus Villemoes
 wrote:
>
> On 25/07/2019 20.42, Pengfei Li wrote:
> > Because "order" will never be negative in __rmqueue_fallback(),
> > so just make "order" unsigned int.
> > And modify trace_mm_page_alloc_extfrag() accordingly.
> >
>
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 75c18f4fd66a..1432cbcd87cd 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2631,8 +2631,8 @@ static bool unreserve_highatomic_pageblock(const 
> > struct alloc_context *ac,
> >   * condition simpler.
> >   */
> >  static __always_inline bool
> > -__rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
> > - unsigned int alloc_flags)
> > +__rmqueue_fallback(struct zone *zone, unsigned int order,
> > + int start_migratetype, unsigned int alloc_flags)
> >  {
>
> Please read the last paragraph of the comment above this function, run
> git blame to figure out when that was introduced, and then read the full
> commit description.

Thanks for your comments.

I have read the commit info of commit b002529d2563 ("mm/page_alloc.c:
eliminate unsigned confusion in __rmqueue_fallback").

And I looked at the discussion at https://lkml.org/lkml/2017/6/21/684 in detail.

> Here be dragons. At the very least, this patch is
> wrong in that it makes that comment inaccurate.

I wonder if you noticed the commit 6bb154504f8b ("mm, page_alloc: spread
allocations across zones before introducing fragmentation").

Commit 6bb154504f8b introduces a local variable min_order in
__rmqueue_fallback().

And you can see

for (current_order = MAX_ORDER - 1; current_order >= min_order;
--current_order) {

Both "current_order" and "min_order" are int, so that loop is fine.

Since __rmqueue_fallback() is only called by __rmqueue(), where "order" is
already unsigned int, I think making "order" unsigned int here as well is
reasonable.
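
(The comment you point at explains that the signed ints were chosen
deliberately to keep the loop exit condition simple; the hazard with an
unsigned counter is the classic count-down problem, e.g.:)

unsigned int current_order;

/*
 * This loop never exits: when current_order reaches 0, "--current_order"
 * wraps around to UINT_MAX and "current_order >= 0" is still true
 * (gcc warns that the comparison is always true).
 */
for (current_order = MAX_ORDER - 1; current_order >= 0; --current_order) {
        /* ... */
}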

Maybe I should also modify the comments here?

>
> Rasmus

Thank you again for your review.

--
Pengfei


Re: [PATCH 00/10] make "order" unsigned int

2019-07-25 Thread Pengfei Li
On Fri, Jul 26, 2019 at 2:52 AM Qian Cai  wrote:
>
> On Fri, 2019-07-26 at 02:42 +0800, Pengfei Li wrote:
> > Objective
> > 
> > The motivation for this series of patches is use unsigned int for
> > "order" in compaction.c, just like in other memory subsystems.
>
> I suppose you will need more justification for this change. Right now, I don't

Thanks for your comments.

> see much real benefit apart from possibly introducing more regressions in 
> those

As you can see, except for patch [05/10], the commits only change the type
of "order", so the change is not big.

As for the benefit: right now "order" may be negative, which is confusing
and weird, and there is no good reason to keep that when it can be avoided.

> tricky areas of the code. Also, your testing seems quite lightweight.
>

Yes, you are right.
I used "stress" for stress testing and did some light code-coverage testing.

As you said, I need more ideas and comments about testing.
Any suggestions for testing?

Thanks again.

--
Pengfei

> >
> > In addition, did some cleanup about "order" in page_alloc
> > and vmscan.
> >
> >
> > Description
> > 
> > Directly modifying the type of "order" to unsigned int is ok in most
> > places, because "order" is always non-negative.
> >
> > But there are two places that are special, one is next_search_order()
> > and the other is compact_node().
> >
> > For next_search_order(), order may be negative. It can be avoided by
> > some modifications.
> >
> > For compact_node(), order = -1 means performing manual compaction.
> > It can be avoided by specifying order = MAX_ORDER.
> >
> > Key changes in [PATCH 05/10] mm/compaction: make "order" and
> > "search_order" unsigned.
> >
> > More information can be obtained from commit messages.
> >
> >
> > Test
> > 
> > I have done some stress testing locally and have not found any problems.
> >
> > In addition, local tests indicate no performance impact.
> >
> >
> > Pengfei Li (10):
> >   mm/page_alloc: use unsigned int for "order" in should_compact_retry()
> >   mm/page_alloc: use unsigned int for "order" in __rmqueue_fallback()
> >   mm/page_alloc: use unsigned int for "order" in should_compact_retry()
> >   mm/page_alloc: remove never used "order" in alloc_contig_range()
> >   mm/compaction: make "order" and "search_order" unsigned int in struct
> > compact_control
> >   mm/compaction: make "order" unsigned int in compaction.c
> >   trace/events/compaction: make "order" unsigned int
> >   mm/compaction: use unsigned int for "compact_order_failed" in struct
> > zone
> >   mm/compaction: use unsigned int for "kcompactd_max_order" in struct
> > pglist_data
> >   mm/vmscan: use unsigned int for "kswapd_order" in struct pglist_data
> >
> >  include/linux/compaction.h|  30 +++
> >  include/linux/mmzone.h|   8 +-
> >  include/trace/events/compaction.h |  40 +-
> >  include/trace/events/kmem.h   |   6 +-
> >  include/trace/events/oom.h|   6 +-
> >  include/trace/events/vmscan.h |   4 +-
> >  mm/compaction.c   | 127 +++---
> >  mm/internal.h |   6 +-
> >  mm/page_alloc.c   |  16 ++--
> >  mm/vmscan.c   |   6 +-
> >  10 files changed, 126 insertions(+), 123 deletions(-)
> >


Re: [PATCH 01/10] mm/page_alloc: use unsigned int for "order" in should_compact_retry()

2019-07-25 Thread Pengfei Li
On Fri, Jul 26, 2019 at 2:58 AM Matthew Wilcox  wrote:
>
> On Fri, Jul 26, 2019 at 02:42:44AM +0800, Pengfei Li wrote:
> >  static inline bool
> > -should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
> > -  enum compact_result compact_result,
> > -  enum compact_priority *compact_priority,
> > -  int *compaction_retries)
> > +should_compact_retry(struct alloc_context *ac, unsigned int order,
> > + int alloc_flags, enum compact_result compact_result,
> > + enum compact_priority *compact_priority, int *compaction_retries)
> >  {
> >   int max_retries = MAX_COMPACT_RETRIES;
>
> One tab here is insufficient indentation.  It should be at least two.

Thanks for your comments.

> Some parts of the kernel insist on lining up arguments with the opening
> parenthesis of the function; I don't know if mm really obeys this rule,
> but you're indenting function arguments to the same level as the opening
> variables of the function, which is confusing.

I will use two tabs in the next version.

--
Pengfei


[PATCH 09/10] mm/compaction: use unsigned int for "kcompactd_max_order" in struct pglist_data

2019-07-25 Thread Pengfei Li
Because "kcompactd_max_order" will never be negative, so just
make it unsigned int.

Signed-off-by: Pengfei Li 
---
 include/linux/compaction.h | 6 --
 include/linux/mmzone.h | 2 +-
 mm/compaction.c| 2 +-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index a8049d582265..1b296de6efef 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -175,7 +175,8 @@ bool compaction_zonelist_suitable(struct alloc_context *ac,
 
 extern int kcompactd_run(int nid);
 extern void kcompactd_stop(int nid);
-extern void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_idx);
+extern void wakeup_kcompactd(pg_data_t *pgdat, unsigned int order,
+   int classzone_idx);
 
 #else
 static inline void reset_isolation_suitable(pg_data_t *pgdat)
@@ -220,7 +221,8 @@ static inline void kcompactd_stop(int nid)
 {
 }
 
-static inline void wakeup_kcompactd(pg_data_t *pgdat, int order, int 
classzone_idx)
+static inline void wakeup_kcompactd(pg_data_t *pgdat, unsigned int order,
+   int classzone_idx)
 {
 }
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0947e7cb4214..60bebdf47661 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -723,7 +723,7 @@ typedef struct pglist_data {
int kswapd_failures;/* Number of 'reclaimed == 0' runs */
 
 #ifdef CONFIG_COMPACTION
-   int kcompactd_max_order;
+   unsigned int kcompactd_max_order;
enum zone_type kcompactd_classzone_idx;
wait_queue_head_t kcompactd_wait;
struct task_struct *kcompactd;
diff --git a/mm/compaction.c b/mm/compaction.c
index aad638ad2cc6..909ead244cff 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2607,7 +2607,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
pgdat->kcompactd_classzone_idx = pgdat->nr_zones - 1;
 }
 
-void wakeup_kcompactd(pg_data_t *pgdat, int order, int classzone_idx)
+void wakeup_kcompactd(pg_data_t *pgdat, unsigned int order, int classzone_idx)
 {
if (!order)
return;
-- 
2.21.0



[PATCH 10/10] mm/vmscan: use unsigned int for "kswapd_order" in struct pglist_data

2019-07-25 Thread Pengfei Li
Because "kswapd_order" will never be negative, so just make it
unsigned int. And modify wakeup_kswapd(), kswapd_try_to_sleep()
and trace_mm_vmscan_kswapd_wake() accordingly.

Besides, make "order" unsigned int in two related trace functions.

Signed-off-by: Pengfei Li 
---
 include/linux/mmzone.h|  4 ++--
 include/trace/events/compaction.h | 10 +-
 include/trace/events/vmscan.h |  4 ++--
 mm/vmscan.c   |  6 +++---
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 60bebdf47661..1196ed0cee67 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -717,7 +717,7 @@ typedef struct pglist_data {
wait_queue_head_t pfmemalloc_wait;
struct task_struct *kswapd; /* Protected by
   mem_hotplug_begin/end() */
-   int kswapd_order;
+   unsigned int kswapd_order;
enum zone_type kswapd_classzone_idx;
 
int kswapd_failures;/* Number of 'reclaimed == 0' runs */
@@ -802,7 +802,7 @@ static inline bool pgdat_is_empty(pg_data_t *pgdat)
 #include 
 
 void build_all_zonelists(pg_data_t *pgdat);
-void wakeup_kswapd(struct zone *zone, gfp_t gfp_mask, int order,
+void wakeup_kswapd(struct zone *zone, gfp_t gfp_mask, unsigned int order,
   enum zone_type classzone_idx);
 bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long 
mark,
 int classzone_idx, unsigned int alloc_flags,
diff --git a/include/trace/events/compaction.h 
b/include/trace/events/compaction.h
index f83ba40f9614..34a9fac3b4d6 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -314,13 +314,13 @@ TRACE_EVENT(mm_compaction_kcompactd_sleep,
 
 DECLARE_EVENT_CLASS(kcompactd_wake_template,
 
-   TP_PROTO(int nid, int order, enum zone_type classzone_idx),
+   TP_PROTO(int nid, unsigned int order, enum zone_type classzone_idx),
 
TP_ARGS(nid, order, classzone_idx),
 
TP_STRUCT__entry(
__field(int, nid)
-   __field(int, order)
+   __field(unsigned int, order)
__field(enum zone_type, classzone_idx)
),
 
@@ -330,7 +330,7 @@ DECLARE_EVENT_CLASS(kcompactd_wake_template,
__entry->classzone_idx = classzone_idx;
),
 
-   TP_printk("nid=%d order=%d classzone_idx=%-8s",
+   TP_printk("nid=%d order=%u classzone_idx=%-8s",
__entry->nid,
__entry->order,
__print_symbolic(__entry->classzone_idx, ZONE_TYPE))
@@ -338,14 +338,14 @@ DECLARE_EVENT_CLASS(kcompactd_wake_template,
 
 DEFINE_EVENT(kcompactd_wake_template, mm_compaction_wakeup_kcompactd,
 
-   TP_PROTO(int nid, int order, enum zone_type classzone_idx),
+   TP_PROTO(int nid, unsigned int order, enum zone_type classzone_idx),
 
TP_ARGS(nid, order, classzone_idx)
 );
 
 DEFINE_EVENT(kcompactd_wake_template, mm_compaction_kcompactd_wake,
 
-   TP_PROTO(int nid, int order, enum zone_type classzone_idx),
+   TP_PROTO(int nid, unsigned int order, enum zone_type classzone_idx),
 
TP_ARGS(nid, order, classzone_idx)
 );
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index c37e2280e6dd..13c214f3750b 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -74,7 +74,7 @@ TRACE_EVENT(mm_vmscan_kswapd_wake,
 
 TRACE_EVENT(mm_vmscan_wakeup_kswapd,
 
-   TP_PROTO(int nid, int zid, int order, gfp_t gfp_flags),
+   TP_PROTO(int nid, int zid, unsigned int order, gfp_t gfp_flags),
 
TP_ARGS(nid, zid, order, gfp_flags),
 
@@ -92,7 +92,7 @@ TRACE_EVENT(mm_vmscan_wakeup_kswapd,
__entry->gfp_flags  = gfp_flags;
),
 
-   TP_printk("nid=%d order=%d gfp_flags=%s",
+   TP_printk("nid=%d order=%u gfp_flags=%s",
__entry->nid,
__entry->order,
show_gfp_flags(__entry->gfp_flags))
diff --git a/mm/vmscan.c b/mm/vmscan.c
index f4fd02ae233e..9d98a2e5f736 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3781,8 +3781,8 @@ static enum zone_type kswapd_classzone_idx(pg_data_t 
*pgdat,
return pgdat->kswapd_classzone_idx;
 }
 
-static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int 
reclaim_order,
-   unsigned int classzone_idx)
+static void kswapd_try_to_sleep(pg_data_t *pgdat, unsigned int alloc_order,
+   unsigned int reclaim_order, unsigned int classzone_idx)
 {
long remaining = 0;
DEFINE_WAIT(wait);
@@ -3956,7 +3956,7 @@ static int kswapd(void *p)
  * has failed or is not needed, still wake up kcompactd if only compaction is
  * needed.
  */
-void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, int order,
+void wakeup_kswapd(st

[PATCH 08/10] mm/compaction: use unsigned int for "compact_order_failed" in struct zone

2019-07-25 Thread Pengfei Li
Because "compact_order_failed" will never be negative, so just
make it unsigned int. And modify three related trace functions
accordingly.

Signed-off-by: Pengfei Li 
---
 include/linux/compaction.h| 12 ++--
 include/linux/mmzone.h|  2 +-
 include/trace/events/compaction.h | 14 +++---
 mm/compaction.c   |  8 
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 0201dfa57d44..a8049d582265 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -99,11 +99,11 @@ extern void reset_isolation_suitable(pg_data_t *pgdat);
 extern enum compact_result compaction_suitable(struct zone *zone,
unsigned int order, unsigned int alloc_flags, int classzone_idx);
 
-extern void defer_compaction(struct zone *zone, int order);
-extern bool compaction_deferred(struct zone *zone, int order);
-extern void compaction_defer_reset(struct zone *zone, int order,
+extern void defer_compaction(struct zone *zone, unsigned int order);
+extern bool compaction_deferred(struct zone *zone, unsigned int order);
+extern void compaction_defer_reset(struct zone *zone, unsigned int order,
bool alloc_success);
-extern bool compaction_restarting(struct zone *zone, int order);
+extern bool compaction_restarting(struct zone *zone, unsigned int order);
 
 /* Compaction has made some progress and retrying makes sense */
 static inline bool compaction_made_progress(enum compact_result result)
@@ -188,11 +188,11 @@ static inline enum compact_result 
compaction_suitable(struct zone *zone,
return COMPACT_SKIPPED;
 }
 
-static inline void defer_compaction(struct zone *zone, int order)
+static inline void defer_compaction(struct zone *zone, unsigned int order)
 {
 }
 
-static inline bool compaction_deferred(struct zone *zone, int order)
+static inline bool compaction_deferred(struct zone *zone, unsigned int order)
 {
return true;
 }
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d77d717c620c..0947e7cb4214 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -545,7 +545,7 @@ struct zone {
 */
unsigned intcompact_considered;
unsigned intcompact_defer_shift;
-   int compact_order_failed;
+   unsigned intcompact_order_failed;
 #endif
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
diff --git a/include/trace/events/compaction.h 
b/include/trace/events/compaction.h
index 1e1e74f6d128..f83ba40f9614 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -243,17 +243,17 @@ DEFINE_EVENT(mm_compaction_suitable_template, 
mm_compaction_suitable,
 
 DECLARE_EVENT_CLASS(mm_compaction_defer_template,
 
-   TP_PROTO(struct zone *zone, int order),
+   TP_PROTO(struct zone *zone, unsigned int order),
 
TP_ARGS(zone, order),
 
TP_STRUCT__entry(
__field(int, nid)
__field(enum zone_type, idx)
-   __field(int, order)
+   __field(unsigned int, order)
__field(unsigned int, considered)
__field(unsigned int, defer_shift)
-   __field(int, order_failed)
+   __field(unsigned int, order_failed)
),
 
TP_fast_assign(
@@ -265,7 +265,7 @@ DECLARE_EVENT_CLASS(mm_compaction_defer_template,
__entry->order_failed = zone->compact_order_failed;
),
 
-   TP_printk("node=%d zone=%-8s order=%d order_failed=%d consider=%u 
limit=%lu",
+   TP_printk("node=%d zone=%-8s order=%u order_failed=%u consider=%u 
limit=%lu",
__entry->nid,
__print_symbolic(__entry->idx, ZONE_TYPE),
__entry->order,
@@ -276,21 +276,21 @@ DECLARE_EVENT_CLASS(mm_compaction_defer_template,
 
 DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_deferred,
 
-   TP_PROTO(struct zone *zone, int order),
+   TP_PROTO(struct zone *zone, unsigned int order),
 
TP_ARGS(zone, order)
 );
 
 DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_defer_compaction,
 
-   TP_PROTO(struct zone *zone, int order),
+   TP_PROTO(struct zone *zone, unsigned int order),
 
TP_ARGS(zone, order)
 );
 
 DEFINE_EVENT(mm_compaction_defer_template, mm_compaction_defer_reset,
 
-   TP_PROTO(struct zone *zone, int order),
+   TP_PROTO(struct zone *zone, unsigned int order),
 
TP_ARGS(zone, order)
 );
diff --git a/mm/compaction.c b/mm/compaction.c
index ac5df82d46e0..aad638ad2cc6 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -139,7 +139,7 @@ EXPORT_SYMBOL(__ClearPageMovable);
  * allocation success. 1 << compact_defer_limit compactions are skipped up
  * to a limit of 1 << COMPACT_MAX_DEFER_SHIFT
  */
-void defer_compaction(struct zone *zone, int 

[PATCH 06/10] mm/compaction: make "order" unsigned int in compaction.c

2019-07-25 Thread Pengfei Li
Since compact_control->order and compact_control->search_order
have been modified to unsigned int in the previous commit, then
some of the functions in compaction.c are modified accordingly.

Signed-off-by: Pengfei Li 
---
 include/linux/compaction.h | 12 ++--
 mm/compaction.c| 21 ++---
 2 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 9569e7c786d3..0201dfa57d44 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -96,8 +96,8 @@ extern enum compact_result try_to_compact_pages(gfp_t 
gfp_mask,
const struct alloc_context *ac, enum compact_priority prio,
struct page **page);
 extern void reset_isolation_suitable(pg_data_t *pgdat);
-extern enum compact_result compaction_suitable(struct zone *zone, int order,
-   unsigned int alloc_flags, int classzone_idx);
+extern enum compact_result compaction_suitable(struct zone *zone,
+   unsigned int order, unsigned int alloc_flags, int classzone_idx);
 
 extern void defer_compaction(struct zone *zone, int order);
 extern bool compaction_deferred(struct zone *zone, int order);
@@ -170,8 +170,8 @@ static inline bool compaction_withdrawn(enum compact_result 
result)
 }
 
 
-bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
-   int alloc_flags);
+bool compaction_zonelist_suitable(struct alloc_context *ac,
+   unsigned int order, int alloc_flags);
 
 extern int kcompactd_run(int nid);
 extern void kcompactd_stop(int nid);
@@ -182,8 +182,8 @@ static inline void reset_isolation_suitable(pg_data_t 
*pgdat)
 {
 }
 
-static inline enum compact_result compaction_suitable(struct zone *zone, int 
order,
-   int alloc_flags, int classzone_idx)
+static inline enum compact_result compaction_suitable(struct zone *zone,
+   unsigned int order, int alloc_flags, int classzone_idx)
 {
return COMPACT_SKIPPED;
 }
diff --git a/mm/compaction.c b/mm/compaction.c
index e47d8fa943a6..ac5df82d46e0 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1639,7 +1639,7 @@ static unsigned long fast_find_migrateblock(struct 
compact_control *cc)
unsigned long distance;
unsigned long pfn = cc->migrate_pfn;
unsigned long high_pfn;
-   int order;
+   unsigned int order;
 
/* Skip hints are relied on to avoid repeats on the fast search */
if (cc->ignore_skip_hint)
@@ -1958,10 +1958,9 @@ static enum compact_result compact_finished(struct 
compact_control *cc)
  *   COMPACT_SUCCESS  - If the allocation would succeed without compaction
  *   COMPACT_CONTINUE - If compaction should run now
  */
-static enum compact_result __compaction_suitable(struct zone *zone, int order,
-   unsigned int alloc_flags,
-   int classzone_idx,
-   unsigned long wmark_target)
+static enum compact_result __compaction_suitable(struct zone *zone,
+   unsigned int order, unsigned int alloc_flags,
+   int classzone_idx, unsigned long wmark_target)
 {
unsigned long watermark;
 
@@ -1998,7 +1997,7 @@ static enum compact_result __compaction_suitable(struct 
zone *zone, int order,
return COMPACT_CONTINUE;
 }
 
-enum compact_result compaction_suitable(struct zone *zone, int order,
+enum compact_result compaction_suitable(struct zone *zone, unsigned int order,
unsigned int alloc_flags,
int classzone_idx)
 {
@@ -2036,7 +2035,7 @@ enum compact_result compaction_suitable(struct zone 
*zone, int order,
return ret;
 }
 
-bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
+bool compaction_zonelist_suitable(struct alloc_context *ac, unsigned int order,
int alloc_flags)
 {
struct zone *zone;
@@ -2278,10 +2277,10 @@ compact_zone(struct compact_control *cc, struct 
capture_control *capc)
return ret;
 }
 
-static enum compact_result compact_zone_order(struct zone *zone, int order,
-   gfp_t gfp_mask, enum compact_priority prio,
-   unsigned int alloc_flags, int classzone_idx,
-   struct page **capture)
+static enum compact_result compact_zone_order(struct zone *zone,
+   unsigned int order, gfp_t gfp_mask,
+   enum compact_priority prio, unsigned int alloc_flags,
+   int classzone_idx, struct page **capture)
 {
enum compact_result ret;
struct compact_control cc = {
-- 
2.21.0



[PATCH 07/10] trace/events/compaction: make "order" unsigned int

2019-07-25 Thread Pengfei Li
Make "order" the same type as "compact_control->order".

Signed-off-by: Pengfei Li 
---
 include/trace/events/compaction.h | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/trace/events/compaction.h 
b/include/trace/events/compaction.h
index e5bf6ee4e814..1e1e74f6d128 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -170,14 +170,14 @@ TRACE_EVENT(mm_compaction_end,
 TRACE_EVENT(mm_compaction_try_to_compact_pages,
 
TP_PROTO(
-   int order,
+   unsigned int order,
gfp_t gfp_mask,
int prio),
 
TP_ARGS(order, gfp_mask, prio),
 
TP_STRUCT__entry(
-   __field(int, order)
+   __field(unsigned int, order)
__field(gfp_t, gfp_mask)
__field(int, prio)
),
@@ -188,7 +188,7 @@ TRACE_EVENT(mm_compaction_try_to_compact_pages,
__entry->prio = prio;
),
 
-   TP_printk("order=%d gfp_mask=%s priority=%d",
+   TP_printk("order=%u gfp_mask=%s priority=%d",
__entry->order,
show_gfp_flags(__entry->gfp_mask),
__entry->prio)
@@ -197,7 +197,7 @@ TRACE_EVENT(mm_compaction_try_to_compact_pages,
 DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
 
TP_PROTO(struct zone *zone,
-   int order,
+   unsigned int order,
int ret),
 
TP_ARGS(zone, order, ret),
@@ -205,7 +205,7 @@ DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
TP_STRUCT__entry(
__field(int, nid)
__field(enum zone_type, idx)
-   __field(int, order)
+   __field(unsigned int, order)
__field(int, ret)
),
 
@@ -216,7 +216,7 @@ DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
__entry->ret = ret;
),
 
-   TP_printk("node=%d zone=%-8s order=%d ret=%s",
+   TP_printk("node=%d zone=%-8s order=%u ret=%s",
__entry->nid,
__print_symbolic(__entry->idx, ZONE_TYPE),
__entry->order,
@@ -226,7 +226,7 @@ DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
 DEFINE_EVENT(mm_compaction_suitable_template, mm_compaction_finished,
 
TP_PROTO(struct zone *zone,
-   int order,
+   unsigned int order,
int ret),
 
TP_ARGS(zone, order, ret)
@@ -235,7 +235,7 @@ DEFINE_EVENT(mm_compaction_suitable_template, 
mm_compaction_finished,
 DEFINE_EVENT(mm_compaction_suitable_template, mm_compaction_suitable,
 
TP_PROTO(struct zone *zone,
-   int order,
+   unsigned int order,
int ret),
 
TP_ARGS(zone, order, ret)
-- 
2.21.0



[PATCH 03/10] mm/page_alloc: use unsigned int for "order" in compaction_capture()

2019-07-25 Thread Pengfei Li
Because "order" will never be negative in should_compact_retry(),
so just make "order" unsigned int.

Signed-off-by: Pengfei Li 
---
 mm/page_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1432cbcd87cd..7d47af09461f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -839,7 +839,7 @@ static inline struct capture_control *task_capc(struct zone 
*zone)
 
 static inline bool
 compaction_capture(struct capture_control *capc, struct page *page,
-  int order, int migratetype)
+  unsigned int order, int migratetype)
 {
if (!capc || order != capc->cc->order)
return false;
@@ -870,7 +870,7 @@ static inline struct capture_control *task_capc(struct zone 
*zone)
 
 static inline bool
 compaction_capture(struct capture_control *capc, struct page *page,
-  int order, int migratetype)
+  unsigned int order, int migratetype)
 {
return false;
 }
-- 
2.21.0



[PATCH 05/10] mm/compaction: make "order" and "search_order" unsigned int in struct compact_control

2019-07-25 Thread Pengfei Li
Objective

The "order"and "search_order" is int in struct compact_control. This
commit is aim to make "order" is unsigned int like other mm subsystem.

Change

1) Change "order" and "search_order" to unsigned int

2) Make is_via_compact_memory() return true when "order" is equal to
MAX_ORDER instead of -1, and rename it to is_manual_compaction() for
a clearer meaning.

3) Modify next_search_order() to fit unsigned order. For example, with
cc->order == 4 and cc->search_order == 2, the scan now visits orders
2, 1, 0, 3 and stops when next_search_order() returns UINT_MAX, instead
of relying on "order" going negative.

4) Restore fast_search_fail to unsigned int.
This is ok because search_order is already unsigned int, and after
reverting fast_search_fail to unsigned int, compact_control is still
within two cache lines.

Signed-off-by: Pengfei Li 
---
 mm/compaction.c | 96 +
 mm/internal.h   |  6 ++--
 2 files changed, 53 insertions(+), 49 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 952dc2fb24e5..e47d8fa943a6 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -75,7 +75,7 @@ static void split_map_pages(struct list_head *list)
list_for_each_entry_safe(page, next, list, lru) {
list_del(&page->lru);
 
-   order = page_private(page);
+   order = page_order(page);
nr_pages = 1 << order;
 
post_alloc_hook(page, order, __GFP_MOVABLE);
@@ -879,7 +879,7 @@ isolate_migratepages_block(struct compact_control *cc, 
unsigned long low_pfn,
 * potential isolation targets.
 */
if (PageBuddy(page)) {
-   unsigned long freepage_order = page_order_unsafe(page);
+   unsigned int freepage_order = page_order_unsafe(page);
 
/*
 * Without lock, we cannot be sure that what we got is
@@ -1119,6 +1119,15 @@ isolate_migratepages_range(struct compact_control *cc, 
unsigned long start_pfn,
 #endif /* CONFIG_COMPACTION || CONFIG_CMA */
 #ifdef CONFIG_COMPACTION
 
+/*
+ * order == MAX_ORDER is expected when compacting via
+ * /proc/sys/vm/compact_memory
+ */
+static inline bool is_manual_compaction(struct compact_control *cc)
+{
+   return cc->order == MAX_ORDER;
+}
+
 static bool suitable_migration_source(struct compact_control *cc,
struct page *page)
 {
@@ -1167,7 +1176,7 @@ static bool suitable_migration_target(struct 
compact_control *cc,
 static inline unsigned int
 freelist_scan_limit(struct compact_control *cc)
 {
-   unsigned short shift = BITS_PER_LONG - 1;
+   unsigned int shift = BITS_PER_LONG - 1;
 
return (COMPACT_CLUSTER_MAX >> min(shift, cc->fast_search_fail)) + 1;
 }
@@ -1253,21 +1262,24 @@ fast_isolate_around(struct compact_control *cc, 
unsigned long pfn, unsigned long
 }
 
 /* Search orders in round-robin fashion */
-static int next_search_order(struct compact_control *cc, int order)
+static unsigned int
+next_search_order(struct compact_control *cc, unsigned int order)
 {
-   order--;
-   if (order < 0)
-   order = cc->order - 1;
+   unsigned int next_order = order - 1;
 
-   /* Search wrapped around? */
-   if (order == cc->search_order) {
-   cc->search_order--;
-   if (cc->search_order < 0)
+   if (order == 0)
+   next_order = cc->order - 1;
+
+   if (next_order == cc->search_order) {
+   next_order = UINT_MAX;
+
+   order = cc->search_order;
+   cc->search_order -= 1;
+   if (order == 0)
cc->search_order = cc->order - 1;
-   return -1;
}
 
-   return order;
+   return next_order;
 }
 
 static unsigned long
@@ -1280,10 +1292,10 @@ fast_isolate_freepages(struct compact_control *cc)
unsigned long distance;
struct page *page = NULL;
bool scan_start = false;
-   int order;
+   unsigned int order;
 
-   /* Full compaction passes in a negative order */
-   if (cc->order <= 0)
+   /* Full compaction when manual compaction */
+   if (is_manual_compaction(cc))
return cc->free_pfn;
 
/*
@@ -1310,10 +1322,10 @@ fast_isolate_freepages(struct compact_control *cc)
 * Search starts from the last successful isolation order or the next
 * order to search after a previous failure
 */
-   cc->search_order = min_t(unsigned int, cc->order - 1, cc->search_order);
+   cc->search_order = min(cc->order - 1, cc->search_order);
 
for (order = cc->search_order;
-!page && order >= 0;
+!page && order < MAX_ORDER;
 order = next_search_order(cc, order)) {
struct free_area *area = &cc->zone->free_area[order];
struct list_head *freelist;
@@ -1837,15 +1849,6 @@ s

[PATCH 04/10] mm/page_alloc: remove never used "order" in alloc_contig_range()

2019-07-25 Thread Pengfei Li
The "order" will never be used in alloc_contig_range(), and "order"
is a negative number is very strange. So just remove it.

Signed-off-by: Pengfei Li 
---
 mm/page_alloc.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7d47af09461f..6208ebfac980 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8347,7 +8347,6 @@ int alloc_contig_range(unsigned long start, unsigned long 
end,
 
struct compact_control cc = {
.nr_migratepages = 0,
-   .order = -1,
.zone = page_zone(pfn_to_page(start)),
.mode = MIGRATE_SYNC,
.ignore_skip_hint = true,
-- 
2.21.0



[PATCH 02/10] mm/page_alloc: use unsigned int for "order" in __rmqueue_fallback()

2019-07-25 Thread Pengfei Li
Because "order" will never be negative in __rmqueue_fallback(),
so just make "order" unsigned int.
And modify trace_mm_page_alloc_extfrag() accordingly.

Signed-off-by: Pengfei Li 
---
 include/trace/events/kmem.h | 6 +++---
 mm/page_alloc.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index eb57e3037deb..31f4d09aa31f 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -277,7 +277,7 @@ TRACE_EVENT(mm_page_pcpu_drain,
 TRACE_EVENT(mm_page_alloc_extfrag,
 
TP_PROTO(struct page *page,
-   int alloc_order, int fallback_order,
+   unsigned int alloc_order, int fallback_order,
int alloc_migratetype, int fallback_migratetype),
 
TP_ARGS(page,
@@ -286,7 +286,7 @@ TRACE_EVENT(mm_page_alloc_extfrag,
 
TP_STRUCT__entry(
__field(unsigned long,  pfn )
-   __field(int,alloc_order )
+   __field(unsigned int,   alloc_order )
__field(int,fallback_order  )
__field(int,alloc_migratetype   )
__field(int,fallback_migratetype)
@@ -303,7 +303,7 @@ TRACE_EVENT(mm_page_alloc_extfrag,
get_pageblock_migratetype(page));
),
 
-   TP_printk("page=%p pfn=%lu alloc_order=%d fallback_order=%d 
pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d 
change_ownership=%d",
+   TP_printk("page=%p pfn=%lu alloc_order=%u fallback_order=%d 
pageblock_order=%d alloc_migratetype=%d fallback_migratetype=%d fragmenting=%d 
change_ownership=%d",
pfn_to_page(__entry->pfn),
__entry->pfn,
__entry->alloc_order,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 75c18f4fd66a..1432cbcd87cd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2631,8 +2631,8 @@ static bool unreserve_highatomic_pageblock(const struct 
alloc_context *ac,
  * condition simpler.
  */
 static __always_inline bool
-__rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
-   unsigned int alloc_flags)
+__rmqueue_fallback(struct zone *zone, unsigned int order,
+   int start_migratetype, unsigned int alloc_flags)
 {
struct free_area *area;
int current_order;
-- 
2.21.0



[PATCH 00/10] make "order" unsigned int

2019-07-25 Thread Pengfei Li
Objective

The motivation for this series of patches is to use unsigned int for
"order" in compaction.c, just like in other memory subsystems.

In addition, this series does some cleanup of "order" in page_alloc
and vmscan.


Description

Directly modifying the type of "order" to unsigned int is ok in most
places, because "order" is always non-negative.

But there are two places that are special, one is next_search_order()
and the other is compact_node().

For next_search_order(), order may be negative. It can be avoided by
some modifications.

For compact_node(), order = -1 means performing manual compaction.
It can be avoided by specifying order = MAX_ORDER.

The key changes are in [PATCH 05/10] mm/compaction: make "order" and
"search_order" unsigned.

More information can be obtained from commit messages.


Test

I have done some stress testing locally and have not found any problems.

In addition, local tests indicate no performance impact.


Pengfei Li (10):
  mm/page_alloc: use unsigned int for "order" in should_compact_retry()
  mm/page_alloc: use unsigned int for "order" in __rmqueue_fallback()
  mm/page_alloc: use unsigned int for "order" in should_compact_retry()
  mm/page_alloc: remove never used "order" in alloc_contig_range()
  mm/compaction: make "order" and "search_order" unsigned int in struct
compact_control
  mm/compaction: make "order" unsigned int in compaction.c
  trace/events/compaction: make "order" unsigned int
  mm/compaction: use unsigned int for "compact_order_failed" in struct
zone
  mm/compaction: use unsigned int for "kcompactd_max_order" in struct
pglist_data
  mm/vmscan: use unsigned int for "kswapd_order" in struct pglist_data

 include/linux/compaction.h|  30 +++
 include/linux/mmzone.h|   8 +-
 include/trace/events/compaction.h |  40 +-
 include/trace/events/kmem.h   |   6 +-
 include/trace/events/oom.h|   6 +-
 include/trace/events/vmscan.h |   4 +-
 mm/compaction.c   | 127 +++---
 mm/internal.h |   6 +-
 mm/page_alloc.c   |  16 ++--
 mm/vmscan.c   |   6 +-
 10 files changed, 126 insertions(+), 123 deletions(-)

-- 
2.21.0



[PATCH 01/10] mm/page_alloc: use unsigned int for "order" in should_compact_retry()

2019-07-25 Thread Pengfei Li
Like the other should_compact_retry(), use unsigned int for "order",
and modify trace_compact_retry() accordingly.

Signed-off-by: Pengfei Li 
---
 include/trace/events/oom.h | 6 +++---
 mm/page_alloc.c| 7 +++
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/include/trace/events/oom.h b/include/trace/events/oom.h
index 26a11e4a2c36..b7fa989349c7 100644
--- a/include/trace/events/oom.h
+++ b/include/trace/events/oom.h
@@ -154,7 +154,7 @@ TRACE_EVENT(skip_task_reaping,
 #ifdef CONFIG_COMPACTION
 TRACE_EVENT(compact_retry,
 
-   TP_PROTO(int order,
+   TP_PROTO(unsigned int order,
enum compact_priority priority,
enum compact_result result,
int retries,
@@ -164,7 +164,7 @@ TRACE_EVENT(compact_retry,
TP_ARGS(order, priority, result, retries, max_retries, ret),
 
TP_STRUCT__entry(
-   __field(int, order)
+   __field(unsigned int, order)
__field(int, priority)
__field(int, result)
__field(int, retries)
@@ -181,7 +181,7 @@ TRACE_EVENT(compact_retry,
__entry->ret = ret;
),
 
-   TP_printk("order=%d priority=%s compaction_result=%s retries=%d 
max_retries=%d should_retry=%d",
+   TP_printk("order=%u priority=%s compaction_result=%s retries=%d 
max_retries=%d should_retry=%d",
__entry->order,
__print_symbolic(__entry->priority, 
COMPACTION_PRIORITY),
__print_symbolic(__entry->result, COMPACTION_FEEDBACK),
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 272c6de1bf4e..75c18f4fd66a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3940,10 +3940,9 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned 
int order,
 }
 
 static inline bool
-should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
-enum compact_result compact_result,
-enum compact_priority *compact_priority,
-int *compaction_retries)
+should_compact_retry(struct alloc_context *ac, unsigned int order,
+   int alloc_flags, enum compact_result compact_result,
+   enum compact_priority *compact_priority, int *compaction_retries)
 {
int max_retries = MAX_COMPACT_RETRIES;
int min_priority;
-- 
2.21.0



Re: [PATCH v6 1/2] mm/vmalloc: do not keep unpurged areas in the busy tree

2019-07-25 Thread Pengfei Li
Thanks.

Signed-off-by: Pengfei Li 

On Thu, Jul 25, 2019 at 10:36 AM Andrew Morton
 wrote:
>
> On Tue, 16 Jul 2019 23:26:55 +0800 Pengfei Li  wrote:
>
> > From: "Uladzislau Rezki (Sony)" 
> >
> > The busy tree can be quite big, even though the area is freed
> > or unmapped it still stays there until "purge" logic removes
> > it.
> >
> > 1) Optimize and reduce the size of "busy" tree by removing a
> > node from it right away as soon as user triggers free paths.
> > It is possible to do so, because the allocation is done using
> > another augmented tree.
> >
> > The vmalloc test driver shows the difference, for example the
> > "fix_size_alloc_test" is ~11% better comparing with default
> > configuration:
> >
> > sudo ./test_vmalloc.sh performance
> >
> > 
> > Summary: fix_size_alloc_test loops: 100 avg: 993985 usec
> > Summary: full_fit_alloc_test loops: 100 avg: 973554 usec
> > Summary: long_busy_list_alloc_test loops: 100 avg: 12617652 usec
> > 
> >
> > 
> > Summary: fix_size_alloc_test loops: 100 avg: 882263 usec
> > Summary: full_fit_alloc_test loops: 100 avg: 973407 usec
> > Summary: long_busy_list_alloc_test loops: 100 avg: 12593929 usec
> > 
> >
> > 2) Since the busy tree now contains allocated areas only and does
> > not interfere with lazily free nodes, introduce the new function
> > show_purge_info() that dumps "unpurged" areas that is propagated
> > through "/proc/vmallocinfo".
> >
> > 3) Eliminate VM_LAZY_FREE flag.
> >
> > Signed-off-by: Uladzislau Rezki (Sony) 
>
> This should have included your signed-off-by, since you were on the
> patch delivery path.  (Documentation/process/submitting-patches.rst,
> section 11).
>
> Please send along your signed-off-by?


[PATCH v6 2/2] mm/vmalloc: modify struct vmap_area to reduce its size

2019-07-16 Thread Pengfei Li
Objective
-
The current implementation of struct vmap_area wastes space.

After applying this commit, sizeof(struct vmap_area) has been
reduced from 11 words to 8 words.

Description
---
1) Pack "subtree_max_size", "vm" and "purge_list".
This is no problem because
A) "subtree_max_size" is only used when vmap_area is in
   "free" tree
B) "vm" is only used when vmap_area is in "busy" tree
C) "purge_list" is only used when vmap_area is in
   vmap_purge_list

2) Eliminate "flags".
Since only one flag VM_VM_AREA is being used, and the same
thing can be done by judging whether "vm" is NULL, then the
"flags" can be eliminated.

Signed-off-by: Pengfei Li 
Suggested-by: Uladzislau Rezki (Sony) 
---
 include/linux/vmalloc.h | 20 +---
 mm/vmalloc.c| 24 ++--
 2 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 9b21d0047710..a1334bd18ef1 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -51,15 +51,21 @@ struct vmap_area {
unsigned long va_start;
unsigned long va_end;
 
-   /*
-* Largest available free size in subtree.
-*/
-   unsigned long subtree_max_size;
-   unsigned long flags;
struct rb_node rb_node; /* address sorted rbtree */
struct list_head list;  /* address sorted list */
-   struct llist_node purge_list;/* "lazy purge" list */
-   struct vm_struct *vm;
+
+   /*
+* The following three variables can be packed, because
+* a vmap_area object is always one of the three states:
+*1) in "free" tree (root is vmap_area_root)
+*2) in "busy" tree (root is free_vmap_area_root)
+*3) in purge list  (head is vmap_purge_list)
+*/
+   union {
+   unsigned long subtree_max_size; /* in "free" tree */
+   struct vm_struct *vm;   /* in "busy" tree */
+   struct llist_node purge_list;   /* in purge list */
+   };
 };
 
 /*
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 71d8040a8a0b..2f7edc0466e7 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -329,7 +329,6 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define DEBUG_AUGMENT_PROPAGATE_CHECK 0
 #define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0
 
-#define VM_VM_AREA 0x04
 
 static DEFINE_SPINLOCK(vmap_area_lock);
 /* Export for kexec only */
@@ -1115,7 +1114,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long 
size,
 
va->va_start = addr;
va->va_end = addr + size;
-   va->flags = 0;
+   va->vm = NULL;
insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
 
spin_unlock(&vmap_area_lock);
@@ -1922,7 +1921,6 @@ void __init vmalloc_init(void)
if (WARN_ON_ONCE(!va))
continue;
 
-   va->flags = VM_VM_AREA;
va->va_start = (unsigned long)tmp->addr;
va->va_end = va->va_start + tmp->size;
va->vm = tmp;
@@ -2020,7 +2018,6 @@ static void setup_vmalloc_vm(struct vm_struct *vm, struct 
vmap_area *va,
vm->size = va->va_end - va->va_start;
vm->caller = caller;
va->vm = vm;
-   va->flags |= VM_VM_AREA;
spin_unlock(&vmap_area_lock);
 }
 
@@ -2125,10 +2122,10 @@ struct vm_struct *find_vm_area(const void *addr)
struct vmap_area *va;
 
va = find_vmap_area((unsigned long)addr);
-   if (va && va->flags & VM_VM_AREA)
-   return va->vm;
+   if (!va)
+   return NULL;
 
-   return NULL;
+   return va->vm;
 }
 
 /**
@@ -2149,11 +2146,10 @@ struct vm_struct *remove_vm_area(const void *addr)
 
spin_lock(&vmap_area_lock);
va = __find_vmap_area((unsigned long)addr);
-   if (va && va->flags & VM_VM_AREA) {
+   if (va && va->vm) {
struct vm_struct *vm = va->vm;
 
va->vm = NULL;
-   va->flags &= ~VM_VM_AREA;
spin_unlock(&vmap_area_lock);
 
kasan_free_shadow(vm);
@@ -2856,7 +2852,7 @@ long vread(char *buf, char *addr, unsigned long count)
if (!count)
break;
 
-   if (!(va->flags & VM_VM_AREA))
+   if (!va->vm)
continue;
 
vm = va->vm;
@@ -2936,7 +2932,7 @@ long vwrite(char *buf, char *addr, unsigned long count)
if (!count)
break;
 
-   if (!(va->flags & VM_VM_AREA))
+   if (!va->vm)
continue;
 
vm = va->vm;
@@ -3466,10 +3462,10 @@ static int s_show(struct 

[PATCH v6 1/2] mm/vmalloc: do not keep unpurged areas in the busy tree

2019-07-16 Thread Pengfei Li
From: "Uladzislau Rezki (Sony)" 

The busy tree can be quite big, even though the area is freed
or unmapped it still stays there until "purge" logic removes
it.

1) Optimize and reduce the size of "busy" tree by removing a
node from it right away as soon as user triggers free paths.
It is possible to do so, because the allocation is done using
another augmented tree.

The vmalloc test driver shows the difference, for example the
"fix_size_alloc_test" is ~11% better comparing with default
configuration:

sudo ./test_vmalloc.sh performance


Summary: fix_size_alloc_test loops: 100 avg: 993985 usec
Summary: full_fit_alloc_test loops: 100 avg: 973554 usec
Summary: long_busy_list_alloc_test loops: 100 avg: 12617652 usec



Summary: fix_size_alloc_test loops: 100 avg: 882263 usec
Summary: full_fit_alloc_test loops: 100 avg: 973407 usec
Summary: long_busy_list_alloc_test loops: 100 avg: 12593929 usec


2) Since the busy tree now contains allocated areas only and does
not interfere with lazily free nodes, introduce the new function
show_purge_info() that dumps "unpurged" areas that is propagated
through "/proc/vmallocinfo".

3) Eliminate VM_LAZY_FREE flag.

Signed-off-by: Uladzislau Rezki (Sony) 
---
 mm/vmalloc.c | 52 
 1 file changed, 44 insertions(+), 8 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4fa8d84599b0..71d8040a8a0b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -329,7 +329,6 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define DEBUG_AUGMENT_PROPAGATE_CHECK 0
 #define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0
 
-#define VM_LAZY_FREE   0x02
 #define VM_VM_AREA 0x04
 
 static DEFINE_SPINLOCK(vmap_area_lock);
@@ -1276,7 +1275,14 @@ static bool __purge_vmap_area_lazy(unsigned long start, 
unsigned long end)
llist_for_each_entry_safe(va, n_va, valist, purge_list) {
unsigned long nr = (va->va_end - va->va_start) >> PAGE_SHIFT;
 
-   __free_vmap_area(va);
+   /*
+* Finally insert or merge lazily-freed area. It is
+* detached and there is no need to "unlink" it from
+* anything.
+*/
+   merge_or_add_vmap_area(va,
+   &free_vmap_area_root, &free_vmap_area_list);
+
atomic_long_sub(nr, &vmap_lazy_nr);
 
if (atomic_long_read(&vmap_lazy_nr) < resched_threshold)
@@ -1318,6 +1324,10 @@ static void free_vmap_area_noflush(struct vmap_area *va)
 {
unsigned long nr_lazy;
 
+   spin_lock(&vmap_area_lock);
+   unlink_va(va, &vmap_area_root);
+   spin_unlock(&vmap_area_lock);
+
nr_lazy = atomic_long_add_return((va->va_end - va->va_start) >>
PAGE_SHIFT, &vmap_lazy_nr);
 
@@ -2137,14 +2147,13 @@ struct vm_struct *remove_vm_area(const void *addr)
 
might_sleep();
 
-   va = find_vmap_area((unsigned long)addr);
+   spin_lock(&vmap_area_lock);
+   va = __find_vmap_area((unsigned long)addr);
if (va && va->flags & VM_VM_AREA) {
struct vm_struct *vm = va->vm;
 
-   spin_lock(&vmap_area_lock);
va->vm = NULL;
va->flags &= ~VM_VM_AREA;
-   va->flags |= VM_LAZY_FREE;
spin_unlock(&vmap_area_lock);
 
kasan_free_shadow(vm);
@@ -2152,6 +2161,8 @@ struct vm_struct *remove_vm_area(const void *addr)
 
return vm;
}
+
+   spin_unlock(&vmap_area_lock);
return NULL;
 }
 
@@ -3431,6 +3442,22 @@ static void show_numa_info(struct seq_file *m, struct 
vm_struct *v)
}
 }
 
+static void show_purge_info(struct seq_file *m)
+{
+   struct llist_node *head;
+   struct vmap_area *va;
+
+   head = READ_ONCE(vmap_purge_list.first);
+   if (head == NULL)
+   return;
+
+   llist_for_each_entry(va, head, purge_list) {
+   seq_printf(m, "0x%pK-0x%pK %7ld unpurged vm_area\n",
+   (void *)va->va_start, (void *)va->va_end,
+   va->va_end - va->va_start);
+   }
+}
+
 static int s_show(struct seq_file *m, void *p)
 {
struct vmap_area *va;
@@ -3443,10 +3470,9 @@ static int s_show(struct seq_file *m, void *p)
 * behalf of vmap area is being tear down or vm_map_ram allocation.
 */
if (!(va->flags & VM_VM_AREA)) {
-   seq_printf(m, "0x%pK-0x%pK %7ld %s\n",
+   seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
(void *)va->va_start, (void *)va->va_end,
-   va->va_end - va->va_start,
-   va->flags & VM_LAZY_FREE ? "unpurged vm_area" : 
"vm_map_ram");
+   va->va_end - va->va_start);
 
return 0;
}
@@ -3482,6 +3508,16 @@ static int s_show(struct seq_file *m, void *p)
 
show_numa_info(m, v);
seq_putc(m, '\n');
+
+   /*
+* As a final step, dump "unpurged" areas. Note,
+* that 

[PATCH v6 0/2] mm/vmalloc.c: improve readability and rewrite vmap_area

2019-07-16 Thread Pengfei Li
v5 -> v6
* patch 2: keep the comment in s_show()

v4 -> v5
* Base on next-20190716
* patch 1: From Uladzislau Rezki (Sony)  (author)
  - https://lkml.org/lkml/2019/7/16/276
* patch 2: Use v3

v3 -> v4:
* Base on next-20190711
* patch 1: From: Uladzislau Rezki (Sony)  (author)
  - https://lkml.org/lkml/2019/7/3/661
* patch 2: Modify the layout of struct vmap_area for readability

v2 -> v3:
* patch 1-4: Abandoned
* patch 5:
  - Eliminate "flags" (suggested by Uladzislau Rezki)
  - Base on https://lkml.org/lkml/2019/6/6/455
and https://lkml.org/lkml/2019/7/3/661

v1 -> v2:
* patch 3: Rename __find_vmap_area to __search_va_in_busy_tree
   instead of __search_va_from_busy_tree.
* patch 5: Add motivation and necessary test data to the commit
   message.
* patch 5: Let va->flags use only some low bits of va_start
   instead of completely overwriting va_start.

The current implementation of struct vmap_area wastes space.

After applying this commit, sizeof(struct vmap_area) has been
reduced from 11 words to 8 words.
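
If it helps review, the size claim can also be checked at build time
with something like the following (an illustrative sketch only, not part
of this series; it assumes a 64-bit kernel where one word is
sizeof(unsigned long)):

/* Illustrative compile-time check of the new vmap_area layout. */
#include <linux/build_bug.h>
#include <linux/vmalloc.h>

/*
 * 2 words for va_start/va_end + 3 for rb_node + 2 for list_head
 * + 1 for the subtree_max_size/vm/purge_list union = 8 words.
 */
static_assert(sizeof(struct vmap_area) == 8 * sizeof(unsigned long));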

Pengfei Li (1):
  mm/vmalloc: modify struct vmap_area to reduce its size

Uladzislau Rezki (Sony) (1):
  mm/vmalloc: do not keep unpurged areas in the busy tree

 include/linux/vmalloc.h | 20 +++
 mm/vmalloc.c| 76 +
 2 files changed, 67 insertions(+), 29 deletions(-)

-- 
2.21.0



Re: [PATCH v5 2/2] mm/vmalloc: modify struct vmap_area to reduce its size

2019-07-16 Thread Pengfei Li
On Tue, Jul 16, 2019 at 10:35 PM Uladzislau Rezki  wrote:
>
> On Tue, Jul 16, 2019 at 09:26:04PM +0800, Pengfei Li wrote:
> > Objective
> > -
> > The current implementation of struct vmap_area wasted space.
> >
> > After applying this commit, sizeof(struct vmap_area) has been
> > reduced from 11 words to 8 words.
> >
> > Description
> > ---
> > 1) Pack "subtree_max_size", "vm" and "purge_list".
> > This is no problem because
> > A) "subtree_max_size" is only used when vmap_area is in
> >"free" tree
> > B) "vm" is only used when vmap_area is in "busy" tree
> > C) "purge_list" is only used when vmap_area is in
> >vmap_purge_list
> >
> > 2) Eliminate "flags".
> > Since only one flag VM_VM_AREA is being used, and the same
> > thing can be done by judging whether "vm" is NULL, then the
> > "flags" can be eliminated.
> >
> > Signed-off-by: Pengfei Li 
> > Suggested-by: Uladzislau Rezki (Sony) 
> > ---
> >  include/linux/vmalloc.h | 20 +---
> >  mm/vmalloc.c| 24 ++--
> >  2 files changed, 23 insertions(+), 21 deletions(-)
> >
> > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> > index 9b21d0047710..a1334bd18ef1 100644
> > --- a/include/linux/vmalloc.h
> > +++ b/include/linux/vmalloc.h
> > @@ -51,15 +51,21 @@ struct vmap_area {
> >   unsigned long va_start;
> >   unsigned long va_end;
> >
> > - /*
> > -  * Largest available free size in subtree.
> > -  */
> > - unsigned long subtree_max_size;
> > - unsigned long flags;
> >   struct rb_node rb_node; /* address sorted rbtree */
> >   struct list_head list;  /* address sorted list */
> > - struct llist_node purge_list;/* "lazy purge" list */
> > - struct vm_struct *vm;
> > +
> > + /*
> > +  * The following three variables can be packed, because
> > +  * a vmap_area object is always one of the three states:
> > +  *1) in "free" tree (root is free_vmap_area_root)
> > +  *2) in "busy" tree (root is vmap_area_root)
> > +  *3) in purge list  (head is vmap_purge_list)
> > +  */
> > + union {
> > + unsigned long subtree_max_size; /* in "free" tree */
> > + struct vm_struct *vm;   /* in "busy" tree */
> > + struct llist_node purge_list;   /* in purge list */
> > + };
> >  };
> >
> >  /*
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 71d8040a8a0b..39bf9cf4175a 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -329,7 +329,6 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
> >  #define DEBUG_AUGMENT_PROPAGATE_CHECK 0
> >  #define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0
> >
> > -#define VM_VM_AREA   0x04
> >
> >  static DEFINE_SPINLOCK(vmap_area_lock);
> >  /* Export for kexec only */
> > @@ -1115,7 +1114,7 @@ static struct vmap_area *alloc_vmap_area(unsigned 
> > long size,
> >
> >   va->va_start = addr;
> >   va->va_end = addr + size;
> > - va->flags = 0;
> > + va->vm = NULL;
> >   insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
> >
> >   spin_unlock(&vmap_area_lock);
> > @@ -1922,7 +1921,6 @@ void __init vmalloc_init(void)
> >   if (WARN_ON_ONCE(!va))
> >   continue;
> >
> > - va->flags = VM_VM_AREA;
> >   va->va_start = (unsigned long)tmp->addr;
> >   va->va_end = va->va_start + tmp->size;
> >   va->vm = tmp;
> > @@ -2020,7 +2018,6 @@ static void setup_vmalloc_vm(struct vm_struct *vm, 
> > struct vmap_area *va,
> >   vm->size = va->va_end - va->va_start;
> >   vm->caller = caller;
> >   va->vm = vm;
> > - va->flags |= VM_VM_AREA;
> >   spin_unlock(&vmap_area_lock);
> >  }
> >
> > @@ -2125,10 +2122,10 @@ struct vm_struct *find_vm_area(const void *addr)
> >   struct vmap_area *va;
> >
> >   va = find_vmap_area((unsigned long)addr);
> > - if (va && va->flags & VM_VM_AREA)
> > - return va->vm;
> > + if (!va)
> > + return NULL;
> >
> > - return NULL;
> > + return

[PATCH v5 1/2] mm/vmalloc: do not keep unpurged areas in the busy tree

2019-07-16 Thread Pengfei Li
From: "Uladzislau Rezki (Sony)" 

The busy tree can be quite big, even though the area is freed
or unmapped it still stays there until "purge" logic removes
it.

1) Optimize and reduce the size of "busy" tree by removing a
node from it right away as soon as user triggers free paths.
It is possible to do so, because the allocation is done using
another augmented tree.

The vmalloc test driver shows the difference, for example the
"fix_size_alloc_test" is ~11% better comparing with default
configuration:

sudo ./test_vmalloc.sh performance


Summary: fix_size_alloc_test loops: 100 avg: 993985 usec
Summary: full_fit_alloc_test loops: 100 avg: 973554 usec
Summary: long_busy_list_alloc_test loops: 100 avg: 12617652 usec



Summary: fix_size_alloc_test loops: 100 avg: 882263 usec
Summary: full_fit_alloc_test loops: 100 avg: 973407 usec
Summary: long_busy_list_alloc_test loops: 100 avg: 12593929 usec


2) Since the busy tree now contains allocated areas only and does
not interfere with lazily free nodes, introduce the new function
show_purge_info() that dumps "unpurged" areas that is propagated
through "/proc/vmallocinfo".

3) Eliminate VM_LAZY_FREE flag.

Signed-off-by: Uladzislau Rezki (Sony) 
---
 mm/vmalloc.c | 52 
 1 file changed, 44 insertions(+), 8 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 4fa8d84599b0..71d8040a8a0b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -329,7 +329,6 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define DEBUG_AUGMENT_PROPAGATE_CHECK 0
 #define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0
 
-#define VM_LAZY_FREE   0x02
 #define VM_VM_AREA 0x04
 
 static DEFINE_SPINLOCK(vmap_area_lock);
@@ -1276,7 +1275,14 @@ static bool __purge_vmap_area_lazy(unsigned long start, 
unsigned long end)
llist_for_each_entry_safe(va, n_va, valist, purge_list) {
unsigned long nr = (va->va_end - va->va_start) >> PAGE_SHIFT;
 
-   __free_vmap_area(va);
+   /*
+* Finally insert or merge lazily-freed area. It is
+* detached and there is no need to "unlink" it from
+* anything.
+*/
+   merge_or_add_vmap_area(va,
+   &free_vmap_area_root, &free_vmap_area_list);
+
atomic_long_sub(nr, &vmap_lazy_nr);
 
if (atomic_long_read(&vmap_lazy_nr) < resched_threshold)
@@ -1318,6 +1324,10 @@ static void free_vmap_area_noflush(struct vmap_area *va)
 {
unsigned long nr_lazy;
 
+   spin_lock(&vmap_area_lock);
+   unlink_va(va, &vmap_area_root);
+   spin_unlock(&vmap_area_lock);
+
nr_lazy = atomic_long_add_return((va->va_end - va->va_start) >>
PAGE_SHIFT, &vmap_lazy_nr);
 
@@ -2137,14 +2147,13 @@ struct vm_struct *remove_vm_area(const void *addr)
 
might_sleep();
 
-   va = find_vmap_area((unsigned long)addr);
+   spin_lock(&vmap_area_lock);
+   va = __find_vmap_area((unsigned long)addr);
if (va && va->flags & VM_VM_AREA) {
struct vm_struct *vm = va->vm;
 
-   spin_lock(&vmap_area_lock);
va->vm = NULL;
va->flags &= ~VM_VM_AREA;
-   va->flags |= VM_LAZY_FREE;
spin_unlock(&vmap_area_lock);
 
kasan_free_shadow(vm);
@@ -2152,6 +2161,8 @@ struct vm_struct *remove_vm_area(const void *addr)
 
return vm;
}
+
+   spin_unlock(&vmap_area_lock);
return NULL;
 }
 
@@ -3431,6 +3442,22 @@ static void show_numa_info(struct seq_file *m, struct 
vm_struct *v)
}
 }
 
+static void show_purge_info(struct seq_file *m)
+{
+   struct llist_node *head;
+   struct vmap_area *va;
+
+   head = READ_ONCE(vmap_purge_list.first);
+   if (head == NULL)
+   return;
+
+   llist_for_each_entry(va, head, purge_list) {
+   seq_printf(m, "0x%pK-0x%pK %7ld unpurged vm_area\n",
+   (void *)va->va_start, (void *)va->va_end,
+   va->va_end - va->va_start);
+   }
+}
+
 static int s_show(struct seq_file *m, void *p)
 {
struct vmap_area *va;
@@ -3443,10 +3470,9 @@ static int s_show(struct seq_file *m, void *p)
 * behalf of vmap area is being tear down or vm_map_ram allocation.
 */
if (!(va->flags & VM_VM_AREA)) {
-   seq_printf(m, "0x%pK-0x%pK %7ld %s\n",
+   seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
(void *)va->va_start, (void *)va->va_end,
-   va->va_end - va->va_start,
-   va->flags & VM_LAZY_FREE ? "unpurged vm_area" : 
"vm_map_ram");
+   va->va_end - va->va_start);
 
return 0;
}
@@ -3482,6 +3508,16 @@ static int s_show(struct seq_file *m, void *p)
 
show_numa_info(m, v);
seq_putc(m, '\n');
+
+   /*
+* As a final step, dump "unpurged" areas. Note,
+* that 

[PATCH v5 2/2] mm/vmalloc: modify struct vmap_area to reduce its size

2019-07-16 Thread Pengfei Li
Objective
-
The current implementation of struct vmap_area wastes space.

After applying this commit, sizeof(struct vmap_area) has been
reduced from 11 words to 8 words.

Description
---
1) Pack "subtree_max_size", "vm" and "purge_list".
This is no problem because
A) "subtree_max_size" is only used when vmap_area is in
   "free" tree
B) "vm" is only used when vmap_area is in "busy" tree
C) "purge_list" is only used when vmap_area is in
   vmap_purge_list

2) Eliminate "flags".
Since only one flag VM_VM_AREA is being used, and the same
thing can be done by judging whether "vm" is NULL, then the
"flags" can be eliminated.

Signed-off-by: Pengfei Li 
Suggested-by: Uladzislau Rezki (Sony) 
---
 include/linux/vmalloc.h | 20 +---
 mm/vmalloc.c| 24 ++--
 2 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 9b21d0047710..a1334bd18ef1 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -51,15 +51,21 @@ struct vmap_area {
unsigned long va_start;
unsigned long va_end;
 
-   /*
-* Largest available free size in subtree.
-*/
-   unsigned long subtree_max_size;
-   unsigned long flags;
struct rb_node rb_node; /* address sorted rbtree */
struct list_head list;  /* address sorted list */
-   struct llist_node purge_list;/* "lazy purge" list */
-   struct vm_struct *vm;
+
+   /*
+* The following three variables can be packed, because
+* a vmap_area object is always one of the three states:
+*1) in "free" tree (root is vmap_area_root)
+*2) in "busy" tree (root is free_vmap_area_root)
+*3) in purge list  (head is vmap_purge_list)
+*/
+   union {
+   unsigned long subtree_max_size; /* in "free" tree */
+   struct vm_struct *vm;   /* in "busy" tree */
+   struct llist_node purge_list;   /* in purge list */
+   };
 };
 
 /*
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 71d8040a8a0b..39bf9cf4175a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -329,7 +329,6 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define DEBUG_AUGMENT_PROPAGATE_CHECK 0
 #define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0
 
-#define VM_VM_AREA 0x04
 
 static DEFINE_SPINLOCK(vmap_area_lock);
 /* Export for kexec only */
@@ -1115,7 +1114,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long 
size,
 
va->va_start = addr;
va->va_end = addr + size;
-   va->flags = 0;
+   va->vm = NULL;
insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
 
spin_unlock(&vmap_area_lock);
@@ -1922,7 +1921,6 @@ void __init vmalloc_init(void)
if (WARN_ON_ONCE(!va))
continue;
 
-   va->flags = VM_VM_AREA;
va->va_start = (unsigned long)tmp->addr;
va->va_end = va->va_start + tmp->size;
va->vm = tmp;
@@ -2020,7 +2018,6 @@ static void setup_vmalloc_vm(struct vm_struct *vm, struct 
vmap_area *va,
vm->size = va->va_end - va->va_start;
vm->caller = caller;
va->vm = vm;
-   va->flags |= VM_VM_AREA;
spin_unlock(&vmap_area_lock);
 }
 
@@ -2125,10 +2122,10 @@ struct vm_struct *find_vm_area(const void *addr)
struct vmap_area *va;
 
va = find_vmap_area((unsigned long)addr);
-   if (va && va->flags & VM_VM_AREA)
-   return va->vm;
+   if (!va)
+   return NULL;
 
-   return NULL;
+   return va->vm;
 }
 
 /**
@@ -2149,11 +2146,10 @@ struct vm_struct *remove_vm_area(const void *addr)
 
spin_lock(&vmap_area_lock);
va = __find_vmap_area((unsigned long)addr);
-   if (va && va->flags & VM_VM_AREA) {
+   if (va && va->vm) {
struct vm_struct *vm = va->vm;
 
va->vm = NULL;
-   va->flags &= ~VM_VM_AREA;
spin_unlock(&vmap_area_lock);
 
kasan_free_shadow(vm);
@@ -2856,7 +2852,7 @@ long vread(char *buf, char *addr, unsigned long count)
if (!count)
break;
 
-   if (!(va->flags & VM_VM_AREA))
+   if (!va->vm)
continue;
 
vm = va->vm;
@@ -2936,7 +2932,7 @@ long vwrite(char *buf, char *addr, unsigned long count)
if (!count)
break;
 
-   if (!(va->flags & VM_VM_AREA))
+   if (!va->vm)
continue;
 
vm = va->vm;
@@ -3466,10 +3462,10 @@ static int s_show(struct 

[PATCH v5 0/2] mm/vmalloc.c: improve readability and rewrite vmap_area

2019-07-16 Thread Pengfei Li
v4 -> v5
* Base on next-20190716
* patch 1: From Uladzislau Rezki (Sony)  (author)
  - https://lkml.org/lkml/2019/7/16/276
* patch 2: Use v3

v3 -> v4:
* Base on next-20190711
* patch 1: From: Uladzislau Rezki (Sony)  (author)
  - https://lkml.org/lkml/2019/7/3/661
* patch 2: Modify the layout of struct vmap_area for readability

v2 -> v3:
* patch 1-4: Abandoned
* patch 5:
  - Eliminate "flags" (suggested by Uladzislau Rezki)
  - Base on https://lkml.org/lkml/2019/6/6/455
and https://lkml.org/lkml/2019/7/3/661

v1 -> v2:
* patch 3: Rename __find_vmap_area to __search_va_in_busy_tree
   instead of __search_va_from_busy_tree.
* patch 5: Add motivation and necessary test data to the commit
   message.
* patch 5: Let va->flags use only some low bits of va_start
   instead of completely overwriting va_start.

The current implementation of struct vmap_area wastes space.

After applying this commit, sizeof(struct vmap_area) has been
reduced from 11 words to 8 words.

Pengfei Li (1):
  mm/vmalloc: modify struct vmap_area to reduce its size

Uladzislau Rezki (Sony) (1):
  mm/vmalloc: do not keep unpurged areas in the busy tree

 include/linux/vmalloc.h | 20 +++
 mm/vmalloc.c| 76 +
 2 files changed, 67 insertions(+), 29 deletions(-)

-- 
2.21.0



Re: [PATCH v4 2/2] mm/vmalloc.c: Modify struct vmap_area to reduce its size

2019-07-15 Thread Pengfei Li
Hi, Vlad

Thanks for the comments from you and Matthew; now I am sure
v3 is enough.

I will follow the next version of your "mm/vmalloc: do not
keep unpurged areas in the busy tree".

Thanks again for your patience with me!

--
Pengfei


Re: [PATCH v4 2/2] mm/vmalloc.c: Modify struct vmap_area to reduce its size

2019-07-12 Thread Pengfei Li
On Fri, Jul 12, 2019 at 9:49 PM Matthew Wilcox  wrote:
>
> On Fri, Jul 12, 2019 at 08:02:13PM +0800, Pengfei Li wrote:
>
> I don't think you need struct union struct union.  Because llist_node
> is just a pointer, you can get the same savings with just:
>
> union {
> struct llist_node purge_list;
> struct vm_struct *vm;
> unsigned long subtree_max_size;
> };
>

Thanks for your comments.

As you said, I did this in v3.
https://patchwork.kernel.org/patch/11031507/

The reason I use struct-union-struct in v4 is that I want to
express that "in the tree" and "in the purge list" are two
completely isolated cases.

struct vmap_area {
union {
struct {/* Case A: In the tree */
...
};

struct {/* Case B: In the purge list */
...
};
};
};

The "rb_node" and "list" should also not be used when va is in
the purge list

what do you think of this idea?
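
For what it is worth, a quick standalone check (again my own
sketch, assuming LP64 and dummy structs of 3, 2 and 1 pointers in
place of rb_node, list_head and llist_node) suggests both layouts
come to the same 8 words, so the nesting is purely about
documenting which members are valid together:

/* Standalone comparison of the v3 and v4 layouts, not kernel code. */
#include <assert.h>

struct rb3   { void *p[3]; };   /* stand-in for struct rb_node */
struct list2 { void *p[2]; };   /* stand-in for struct list_head */
struct ll1   { void *p[1]; };   /* stand-in for struct llist_node */

struct va_flat {                /* v3: one flat union */
        unsigned long va_start, va_end;
        union {
                struct ll1 purge_list;
                void *vm;
                unsigned long subtree_max_size;
        };
        struct rb3 rb_node;
        struct list2 list;
};

struct va_nested {              /* v4: nested struct/union/struct */
        unsigned long va_start, va_end;
        union {
                struct {        /* Case A: in the tree */
                        union { void *vm; unsigned long subtree_max_size; };
                        struct rb3 rb_node;
                        struct list2 list;
                };
                struct {        /* Case B: in the purge list */
                        struct ll1 purge_list;
                };
        };
};

static_assert(sizeof(struct va_flat) == sizeof(struct va_nested),
              "both layouts are 8 words on LP64");

int main(void) { return 0; }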


[PATCH v4 2/2] mm/vmalloc.c: Modify struct vmap_area to reduce its size

2019-07-12 Thread Pengfei Li
Objective
---------
The current implementation of struct vmap_area wastes space.

After applying this commit, sizeof(struct vmap_area) has been
reduced from 11 words to 8 words.

Description
-----------
1) Pack "vm" and "subtree_max_size"
This is safe because
  A) "vm" is only used when vmap_area is in the "busy" tree
  B) "subtree_max_size" is only used when vmap_area is in the
     "free" tree

2) Pack "purge_list"
The member "purge_list" is only used when vmap_area is in the
"lazy purge" list, so it can be packed with the members that are
only used while the area sits in the address-sorted rbtree and
list.

3) Eliminate "flags"
Only one flag, VM_VM_AREA, is still in use, and the same check
can be made by testing whether "vm" is NULL, so "flags" can be
eliminated.

Signed-off-by: Pengfei Li 
Suggested-by: Uladzislau Rezki (Sony) 
---
 include/linux/vmalloc.h | 40 +++-
 mm/vmalloc.c            | 28 +---
 2 files changed, 44 insertions(+), 24 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 9b21d0047710..6fb377ca9e7a 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -51,15 +51,37 @@ struct vmap_area {
unsigned long va_start;
unsigned long va_end;
 
-   /*
-* Largest available free size in subtree.
-*/
-   unsigned long subtree_max_size;
-   unsigned long flags;
-   struct rb_node rb_node; /* address sorted rbtree */
-   struct list_head list;  /* address sorted list */
-   struct llist_node purge_list;/* "lazy purge" list */
-   struct vm_struct *vm;
+   union {
+   /* In rbtree and list sorted by address */
+   struct {
+   union {
+   /*
+* In "busy" rbtree and list.
+* rbtree root: vmap_area_root
+* list head:   vmap_area_list
+*/
+   struct vm_struct *vm;
+
+   /*
+* In "free" rbtree and list.
+* rbtree root: free_vmap_area_root
+* list head:   free_vmap_area_list
+*/
+   unsigned long subtree_max_size;
+   };
+
+   struct rb_node rb_node;
+   struct list_head list;
+   };
+
+   /*
+* In "lazy purge" list.
+* llist head: vmap_purge_list
+*/
+   struct {
+   struct llist_node purge_list;
+   };
+   };
 };
 
 /*
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 9eb700a2087b..1245d3285a32 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -329,7 +329,6 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define DEBUG_AUGMENT_PROPAGATE_CHECK 0
 #define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0
 
-#define VM_VM_AREA 0x04
 
 static DEFINE_SPINLOCK(vmap_area_lock);
 /* Export for kexec only */
@@ -1115,7 +1114,7 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 
va->va_start = addr;
va->va_end = addr + size;
-   va->flags = 0;
+   va->vm = NULL;
	insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
 
	spin_unlock(&vmap_area_lock);
@@ -1279,7 +1278,9 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
llist_for_each_entry_safe(va, n_va, valist, purge_list) {
unsigned long nr = (va->va_end - va->va_start) >> PAGE_SHIFT;
 
-   __free_vmap_area(va);
+   merge_or_add_vmap_area(va,
+			&free_vmap_area_root, &free_vmap_area_list);
+
		atomic_long_sub(nr, &vmap_lazy_nr);
 
		if (atomic_long_read(&vmap_lazy_nr) < resched_threshold)
@@ -1919,7 +1920,6 @@ void __init vmalloc_init(void)
if (WARN_ON_ONCE(!va))
continue;
 
-   va->flags = VM_VM_AREA;
va->va_start = (unsigned long)tmp->addr;
va->va_end = va->va_start + tmp->size;
va->vm = tmp;
@@ -2017,7 +2017,6 @@ static void setup_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va,
vm->size = va->va_end - va->va_start;
vm->caller = caller;
va->vm = vm;
-   va->flags |= VM_VM_AREA;
	spin_unlock(&vmap_area_lock);
 }
 
@@ -2122,10 +2121,10 @@ struct vm_struct *find_vm_area(const void *addr)
struct vmap_area *va;
 
va = find_vmap_area((unsigned long)addr);
-   if (va && va->flags & VM_VM_AREA)
- 

[PATCH v4 0/2] mm/vmalloc.c: improve readability and rewrite vmap_area

2019-07-12 Thread Pengfei Li
v3 -> v4:
* Base on next-20190711
* patch 1: From: Uladzislau Rezki (Sony)  (author)
  - https://lkml.org/lkml/2019/7/3/661
* patch 2: Modify the layout of struct vmap_area for readability

v2 -> v3:
* patch 1-4: Abandoned
* patch 5:
  - Eliminate "flags" (suggested by Uladzislau Rezki)
  - Base on https://lkml.org/lkml/2019/6/6/455
and https://lkml.org/lkml/2019/7/3/661

v1 -> v2:
* patch 3: Rename __find_vmap_area to __search_va_in_busy_tree
   instead of __search_va_from_busy_tree.
* patch 5: Add motivation and necessary test data to the commit
   message.
* patch 5: Let va->flags use only some low bits of va_start
   instead of completely overwriting va_start.

The current implementation of struct vmap_area wastes space.

After applying this commit, sizeof(struct vmap_area) has been
reduced from 11 words to 8 words.

Pengfei Li (1):
  mm/vmalloc.c: Modify struct vmap_area to reduce its size

Uladzislau Rezki (Sony) (1):
  mm/vmalloc: do not keep unpurged areas in the busy tree

 include/linux/vmalloc.h | 40 -
 mm/vmalloc.c            | 79 -
 2 files changed, 86 insertions(+), 33 deletions(-)

-- 
2.21.0


