Re: [Devel] [PATCH rh7 2/2] mm/memcg: reclaim only kmem if kmem limit reached.

2017-08-28 Thread Stanislav Kinsburskiy


On 25.08.2017 18:38, Andrey Ryabinin wrote:
> If kmem limit on memcg reached, we go into memory reclaim,
> and reclaim everything we can, including page cache and anon.
> Reclaiming page cache or anon won't help since we need to lower
> only kmem usage. This patch fixes the problem by avoiding
> non-kmem reclaim on hitting the kmem limit.
> 

Can't there be a situation where some object in anon memory or the page cache
holds some object in kmem (indirectly)?

> https://jira.sw.ru/browse/PSBM-69226
> Signed-off-by: Andrey Ryabinin 
> ---
>  include/linux/memcontrol.h | 10 ++
>  include/linux/swap.h   |  2 +-
>  mm/memcontrol.c| 30 --
>  mm/vmscan.c| 31 ---
>  4 files changed, 51 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 1a52e58ab7de..1d6bc80c4c90 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -45,6 +45,16 @@ struct mem_cgroup_reclaim_cookie {
>   unsigned int generation;
>  };
>  
> +/*
> + * Reclaim flags for mem_cgroup_hierarchical_reclaim
> + */
> +#define MEM_CGROUP_RECLAIM_NOSWAP_BIT0x0
> +#define MEM_CGROUP_RECLAIM_NOSWAP(1 << MEM_CGROUP_RECLAIM_NOSWAP_BIT)
> +#define MEM_CGROUP_RECLAIM_SHRINK_BIT0x1
> +#define MEM_CGROUP_RECLAIM_SHRINK(1 << MEM_CGROUP_RECLAIM_SHRINK_BIT)
> +#define MEM_CGROUP_RECLAIM_KMEM_BIT  0x2
> +#define MEM_CGROUP_RECLAIM_KMEM  (1 << MEM_CGROUP_RECLAIM_KMEM_BIT)
> +
>  #ifdef CONFIG_MEMCG
>  int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
> gfp_t gfp_mask, struct mem_cgroup **memcgp);
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index bd162f9bef0d..bd47451ec95a 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -324,7 +324,7 @@ extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
>  extern int __isolate_lru_page(struct page *page, isolate_mode_t mode);
>  extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem,
> unsigned long nr_pages,
> -   gfp_t gfp_mask, bool noswap);
> +   gfp_t gfp_mask, int flags);
>  extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
>   gfp_t gfp_mask, bool noswap,
>   struct zone *zone,
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 97824e281d7a..f9a5f3819a31 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -511,16 +511,6 @@ enum res_type {
>  #define OOM_CONTROL  (0)
>  
>  /*
> - * Reclaim flags for mem_cgroup_hierarchical_reclaim
> - */
> -#define MEM_CGROUP_RECLAIM_NOSWAP_BIT0x0
> -#define MEM_CGROUP_RECLAIM_NOSWAP(1 << MEM_CGROUP_RECLAIM_NOSWAP_BIT)
> -#define MEM_CGROUP_RECLAIM_SHRINK_BIT0x1
> -#define MEM_CGROUP_RECLAIM_SHRINK(1 << MEM_CGROUP_RECLAIM_SHRINK_BIT)
> -#define MEM_CGROUP_RECLAIM_KMEM_BIT  0x2
> -#define MEM_CGROUP_RECLAIM_KMEM  (1 << MEM_CGROUP_RECLAIM_KMEM_BIT)
> -
> -/*
>   * The memcg_create_mutex will be held whenever a new cgroup is created.
>   * As a consequence, any change that needs to protect against new child cgroups
>   * appearing has to hold it as well.
> @@ -2137,7 +2127,7 @@ static unsigned long mem_cgroup_reclaim(struct mem_cgroup *memcg,
>   if (loop)
>   drain_all_stock_async(memcg);
>   total += try_to_free_mem_cgroup_pages(memcg, SWAP_CLUSTER_MAX,
> -   gfp_mask, noswap);
> +   gfp_mask, flags);
>   if (test_thread_flag(TIF_MEMDIE) ||
>   fatal_signal_pending(current))
>   return 1;
> @@ -2150,6 +2140,16 @@ static unsigned long mem_cgroup_reclaim(struct mem_cgroup *memcg,
>   break;
>   if (mem_cgroup_margin(memcg, flags & MEM_CGROUP_RECLAIM_KMEM))
>   break;
> +
> + /*
> +  * Try harder to reclaim dcache. dcache reclaim may
> +  * temporarily fail due to dcache->dlock being held
> +  * by someone else. We must try harder to avoid premature
> +  * slab allocation failures.
> +  */
> + if (flags & MEM_CGROUP_RECLAIM_KMEM &&
> + page_counter_read(&memcg->dcache))
> + continue;
>   /*
>* If nothing was reclaimed after two attempts, there
>* may be no reclaimable pages in this hierarchy.
> @@ -2778,11 +2778,13 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, bool kmem_charge
>   struct mem_cgroup *mem_over_limit;
>   struc

Re: [Devel] [PATCH rh7 2/2] mm/memcg: reclaim only kmem if kmem limit reached.

2017-08-28 Thread Andrey Ryabinin
On 08/28/2017 12:02 PM, Stanislav Kinsburskiy wrote:
> 
> 
> On 25.08.2017 18:38, Andrey Ryabinin wrote:
>> If kmem limit on memcg reached, we go into memory reclaim,
>> and reclaim everything we can, including page cache and anon.
>> Reclaiming page cache or anon won't help since we need to lower
>> only kmem usage. This patch fixes the problem by avoiding
>> non-kmem reclaim on hitting the kmem limit.
>>
> 
> Can't there be a situation where some object in anon memory or the page cache
> holds some object in kmem (indirectly)?
> 

None that I know of.
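To make the intended behaviour easier to follow, here is a minimal userspace
model of the retry loop in the quoted mem_cgroup_reclaim() hunk. It is only a
sketch: the structure and the fake_reclaim() helper below are stand-ins
invented for illustration (the kernel uses page_counter_read(),
mem_cgroup_margin() and try_to_free_mem_cgroup_pages()), and only the control
flow mirrors the patch, namely that with MEM_CGROUP_RECLAIM_KMEM set the loop
keeps retrying while the memcg's dcache counter is non-zero instead of giving
up after an idle round.

#include <stdio.h>

/* Same flag encoding as in the patch. */
#define MEM_CGROUP_RECLAIM_NOSWAP	(1 << 0)
#define MEM_CGROUP_RECLAIM_SHRINK	(1 << 1)
#define MEM_CGROUP_RECLAIM_KMEM	(1 << 2)

#define MAX_RECLAIM_LOOPS	100	/* arbitrary cap for the model */

struct memcg_model {
	unsigned long dcache;		/* stand-in for page_counter_read(&memcg->dcache) */
	unsigned long reclaimable;	/* pages the fake reclaimer can still free */
};

/* Stand-in for try_to_free_mem_cgroup_pages(): frees up to 32 "pages". */
static unsigned long fake_reclaim(struct memcg_model *m)
{
	unsigned long freed = m->reclaimable < 32 ? m->reclaimable : 32;

	m->reclaimable -= freed;
	if (m->dcache)
		m->dcache--;	/* dcache drains slowly, like a contended lock */
	return freed;
}

static unsigned long model_reclaim(struct memcg_model *m, int flags)
{
	unsigned long total = 0;
	int loop;

	for (loop = 0; loop < MAX_RECLAIM_LOOPS; loop++) {
		unsigned long freed = fake_reclaim(m);

		total += freed;
		/*
		 * The key hunk: for kmem-only reclaim, try again as long as
		 * the dcache counter is non-zero, even after an idle round.
		 */
		if ((flags & MEM_CGROUP_RECLAIM_KMEM) && m->dcache)
			continue;
		if (!freed)	/* nothing freed and no dcache left: give up */
			break;
	}
	return total;
}

int main(void)
{
	struct memcg_model m = { .dcache = 10, .reclaimable = 100 };

	printf("freed %lu pages in kmem-only mode\n",
	       model_reclaim(&m, MEM_CGROUP_RECLAIM_KMEM));
	return 0;
}

Built with a plain cc, the model keeps looping until the dcache counter
drains, which is the behaviour the comment about premature slab allocation
failures asks for.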


[Devel] [PATCH rh7] mm/hmm: Restore removed hunk in copy_one_pte()

2017-08-28 Thread Andrey Ryabinin
Rebased "ms/mm: remove rest usage of VM_NONLINEAR and pte_file()"
removed huge hunk from copy_one_pte for no reason. Bring it back

https://jira.sw.ru/browse/PSBM-70740

Fixes: be8e22c9c444 ("ms/mm: remove rest usage of VM_NONLINEAR and pte_file()")
Signed-off-by: Andrey Ryabinin 
---
 mm/memory.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index e53e8dd288eb..c30a042cebf5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -879,13 +879,32 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
pte = pte_swp_mksoft_dirty(pte);
set_pte_at(src_mm, addr, src_pte, pte);
}
-   } else {
+   } else if (is_hmm_entry(entry)) {
+   page = hmm_entry_to_page(entry);
+
+   /*
+* Update rss count even for un-addressable
+* pages, as they should be considered just like
+* any other page.
+*/
+   get_page(page);
+   rss[mm_counter(page)]++;
+   page_dup_rmap(page);
+
+   if (is_write_hmm_entry(entry) &&
+   is_cow_mapping(vm_flags)) {
+   make_hmm_entry_read(&entry);
+   pte = swp_entry_to_pte(entry);
+   if (pte_swp_soft_dirty(*src_pte))
+   pte = pte_swp_mksoft_dirty(pte);
+   set_pte_at(src_mm, addr, src_pte, pte);
+   }
+   } else
/*
 * This can not happen because HMM migration holds
 * mmap_sem in read mode.
 */
-   VM_BUG_ON(is_hmm_entry(entry));
-   }
+   VM_BUG_ON(is_hmm_entry(entry));
goto out_set_pte;
}
 
-- 
2.13.5

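For readers who do not have the HMM code in front of them, the toy model below
shows what the restored hunk does on fork for a writable device entry in a COW
mapping: the entry is downgraded to read-only in the source page table before
being copied, so both parent and child take a fault before their next write,
and the soft-dirty bit is carried along. Every type and helper in the sketch
is invented for illustration; it only models the
is_write_hmm_entry()/is_cow_mapping() decision, not the
get_page()/rss/page_dup_rmap() bookkeeping.

#include <stdbool.h>
#include <stdio.h>

/* Invented stand-in for an HMM swap entry. */
struct hmm_entry_model {
	unsigned long pfn;
	bool write;		/* models is_write_hmm_entry() */
	bool soft_dirty;	/* models pte_swp_soft_dirty()  */
};

struct fork_copy {
	struct hmm_entry_model parent;	/* what set_pte_at(src_mm, ...) leaves behind */
	struct hmm_entry_model child;	/* what out_set_pte installs in the child     */
};

static struct fork_copy copy_hmm_entry(struct hmm_entry_model src, bool cow_mapping)
{
	struct fork_copy out;

	/* The restored decision: downgrade writable entries in COW mappings. */
	if (src.write && cow_mapping)
		src.write = false;	/* models make_hmm_entry_read() */

	out.parent = src;
	out.child = src;
	return out;
}

int main(void)
{
	struct hmm_entry_model e = { .pfn = 0x1234, .write = true, .soft_dirty = true };
	struct fork_copy c = copy_hmm_entry(e, true /* COW mapping */);

	printf("parent writable: %d, child writable: %d, soft-dirty kept: %d\n",
	       c.parent.write, c.child.write, c.child.soft_dirty);
	return 0;
}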


[Devel] [PATCH rh7 2/2] ms/x86, efi, kasan: #undef memset/memcpy/memmove per arch

2017-08-28 Thread Andrey Ryabinin
From: Andrey Ryabinin 

commit 769a8089c1fd2fe94c13e66fe6e03d7820953ee3 upstream.

In not-instrumented code KASAN replaces instrumented memset/memcpy/memmove
with not-instrumented analogues __memset/__memcpy/__memmove.

However, on x86 the EFI stub is not linked with the kernel.  It uses
not-instrumented mem*() functions from arch/x86/boot/compressed/string.c

So we don't replace them with __mem*() variants in EFI stub.

On ARM64 the EFI stub is linked with the kernel, so we should replace
mem*() functions with __mem*(), because the EFI stub runs before KASAN
sets up early shadow.

So let's move these #undef mem* into arch's asm/efi.h which is also
included by the EFI stub.

Also, this will fix the warning in 32-bit build reported by kbuild test
robot:

    efi-stub-helper.c:599:2: warning: implicit declaration of function 'memcpy'

[a...@linux-foundation.org: use 80 cols in comment]
Signed-off-by: Andrey Ryabinin 
Reported-by: Fengguang Wu 
Cc: Will Deacon 
Cc: Catalin Marinas 
Cc: Matt Fleming 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Andrey Ryabinin 
---
 arch/x86/include/asm/efi.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 2215cd26512d..8e70dd8bde6c 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -68,6 +68,16 @@ extern u64 asmlinkage efi_call(void *fp, ...);
 extern void __iomem *efi_ioremap(unsigned long addr, unsigned long size,
 u32 type, u64 attribute);
 
+/*
+ * CONFIG_KASAN may redefine memset to __memset.  __memset function is present
+ * only in the kernel binary.  Since the EFI stub is linked into a separate
+ * binary, it doesn't have __memset().  So we should use the standard memset from
+ * arch/x86/boot/compressed/string.c.  The same applies to memcpy and memmove.
+ */
+#undef memcpy
+#undef memset
+#undef memmove
+
 #endif /* CONFIG_X86_32 */
 
 extern int add_efi_memmap;
-- 
2.13.5

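The macro game described in the commit message can be reproduced outside the
kernel. The standalone sketch below is not kernel code: __memset_demo() stands
in for the kernel's non-instrumented __memset(), and the two functions stand
in for code built into the kernel image versus code built into the separately
linked EFI stub. It shows why the #undef is needed: while the macro is in
effect every memset() call is redirected to the replacement, which only exists
inside the kernel binary, and after the #undef plain memset() is used again.

#include <stdio.h>
#include <string.h>

/* Stand-in for the kernel's non-instrumented __memset(). */
static void *__memset_demo(void *s, int c, size_t n)
{
	printf("redirected to __memset replacement\n");
	return memset(s, c, n);
}

/* Roughly what asm/string_64.h does under CONFIG_KASAN: */
#define memset(s, c, n) __memset_demo(s, c, n)

static void built_into_kernel_image(char *buf, size_t len)
{
	memset(buf, 0, len);		/* expands to __memset_demo() */
}

/* Roughly what the patch adds to asm/efi.h for the EFI stub: */
#undef memset

static void built_into_efi_stub(char *buf, size_t len)
{
	memset(buf, 0, len);		/* plain memset again */
	printf("plain memset used\n");
}

int main(void)
{
	char buf[16];

	built_into_kernel_image(buf, sizeof(buf));
	built_into_efi_stub(buf, sizeof(buf));
	return 0;
}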


[Devel] [PATCH rh7 1/2] fixup for - ms/x86_64: kasan: add interceptors for memset/memmove/memcpy functions

2017-08-28 Thread Andrey Ryabinin
This hunk should have been in the following patch:

796752003a42b70d1f32eb5771885f59febada0d
Author: Andrey Ryabinin 
Date:   Thu Sep 3 19:27:42 2015 +0400

ms/x86_64: kasan: add interceptors for memset/memmove/memcpy functions

This fixes:
In file included from ./arch/x86/include/asm/string.h:4:0,
 from include/linux/string.h:18,
 from include/linux/efi.h:15,
 from arch/x86/boot/compressed/eboot.c:12:
./arch/x86/include/asm/string_64.h:79:0: warning: "memset" redefined
 #define memset(s, c, n) __memset(s, c, n)

In file included from arch/x86/boot/compressed/eboot.c:11:0:
arch/x86/boot/compressed/../string.h:18:0: note: this is the location of the previous definition
 #define memset(d,c,l) __builtin_memset(d,c,l)

Signed-off-by: Andrey Ryabinin 
---
 arch/x86/boot/compressed/eboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index ec76e778d101..b4a6e12a2800 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -8,7 +8,6 @@
  * --- */
 
 #include 
-#include "../string.h"
 #include 
 #include 
 #include 
@@ -16,6 +15,7 @@
 #include 
 #include 
 
+#include "../string.h"
 #include "eboot.h"
 
 static efi_system_table_t *sys_table;
-- 
2.13.5

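The warning this fixup removes is an ordinary macro redefinition: both the
boot-local ../string.h and, under CONFIG_KASAN, asm/string_64.h define memset
as a macro, and whichever definition comes second triggers the "memset"
redefined warning quoted above. The snippet below is a minimal reproduction,
not kernel code, and the include chain it models is my reading of the two
patches: with the KASAN-style definition first, the asm/efi.h #undef in
between, and the boot-style definition last (the new include order in
eboot.c), the warning goes away.

#include <stdio.h>

/* Stand-in for asm/string_64.h under CONFIG_KASAN: */
#define memset(s, c, n) __memset(s, c, n)

/* Stand-in for the #undef that patch 2/2 adds to asm/efi.h: */
#undef memset

/* Stand-in for arch/x86/boot/compressed/../string.h, now included last: */
#define memset(d, c, l) __builtin_memset(d, c, l)

/*
 * No "memset" redefined warning here: the earlier definition was removed
 * before the second one appeared.  Move the last #define above the first one
 * (the old include order in eboot.c) and the compiler prints exactly the
 * warning from the commit message.
 */

int main(void)
{
	char buf[8];

	memset(buf, 0, sizeof(buf));	/* expands to __builtin_memset() */
	printf("buf[0] = %d\n", buf[0]);
	return 0;
}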


[Devel] [PATCH RHEL7 COMMIT] fixup for - ms/x86_64: kasan: add interceptors for memset/memmove/memcpy functions

2017-08-28 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-693.1.1.vz7.37.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-693.1.1.el7
-->
commit 2b22264b4e695b658fc71b472796c8728b119cc8
Author: Andrey Ryabinin 
Date:   Mon Aug 28 15:57:43 2017 +0300

fixup for - ms/x86_64: kasan: add interceptors for memset/memmove/memcpy functions

This hunk should have been in the following patch:

796752003a42b70d1f32eb5771885f59febada0d
Author: Andrey Ryabinin 
Date:   Thu Sep 3 19:27:42 2015 +0400

ms/x86_64: kasan: add interceptors for memset/memmove/memcpy functions

This fixes:
In file included from ./arch/x86/include/asm/string.h:4:0,
 from include/linux/string.h:18,
 from include/linux/efi.h:15,
 from arch/x86/boot/compressed/eboot.c:12:
./arch/x86/include/asm/string_64.h:79:0: warning: "memset" redefined
 #define memset(s, c, n) __memset(s, c, n)

In file included from arch/x86/boot/compressed/eboot.c:11:0:
arch/x86/boot/compressed/../string.h:18:0: note: this is the location of the previous definition
 #define memset(d,c,l) __builtin_memset(d,c,l)

Signed-off-by: Andrey Ryabinin 
---
 arch/x86/boot/compressed/eboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index ec76e77..b4a6e12 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -8,7 +8,6 @@
  * --- */
 
 #include 
-#include "../string.h"
 #include 
 #include 
 #include 
@@ -16,6 +15,7 @@
 #include 
 #include 
 
+#include "../string.h"
 #include "eboot.h"
 
 static efi_system_table_t *sys_table;


[Devel] [PATCH RHEL7 COMMIT] ms/x86, efi, kasan: #undef memset/memcpy/memmove per arch

2017-08-28 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-693.1.1.vz7.37.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-693.1.1.el7
-->
commit f4b5ab2e530bf8bc46c79d4effa14e51a6643e69
Author: Andrey Ryabinin 
Date:   Mon Aug 28 15:57:45 2017 +0300

ms/x86, efi, kasan: #undef memset/memcpy/memmove per arch

commit 769a8089c1fd2fe94c13e66fe6e03d7820953ee3 upstream.

In not-instrumented code KASAN replaces instrumented memset/memcpy/memmove
with not-instrumented analogues __memset/__memcpy/__memmove.

However, on x86 the EFI stub is not linked with the kernel.  It uses
not-instrumented mem*() functions from arch/x86/boot/compressed/string.c

So we don't replace them with __mem*() variants in EFI stub.

On ARM64 the EFI stub is linked with the kernel, so we should replace
mem*() functions with __mem*(), because the EFI stub runs before KASAN
sets up early shadow.

So let's move these #undef mem* into arch's asm/efi.h which is also
included by the EFI stub.

Also, this will fix the warning in 32-bit build reported by kbuild test
robot:

    efi-stub-helper.c:599:2: warning: implicit declaration of function 'memcpy'

[a...@linux-foundation.org: use 80 cols in comment]
Signed-off-by: Andrey Ryabinin 

Reported-by: Fengguang Wu 
Cc: Will Deacon 
Cc: Catalin Marinas 
Cc: Matt Fleming 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Andrey Ryabinin 
---
 arch/x86/include/asm/efi.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 2215cd2..8e70dd8 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -68,6 +68,16 @@ extern u64 asmlinkage efi_call(void *fp, ...);
 extern void __iomem *efi_ioremap(unsigned long addr, unsigned long size,
 u32 type, u64 attribute);
 
+/*
+ * CONFIG_KASAN may redefine memset to __memset.  __memset function is present
+ * only in the kernel binary.  Since the EFI stub is linked into a separate
+ * binary, it doesn't have __memset().  So we should use the standard memset from
+ * arch/x86/boot/compressed/string.c.  The same applies to memcpy and memmove.
+ */
+#undef memcpy
+#undef memset
+#undef memmove
+
 #endif /* CONFIG_X86_32 */
 
 extern int add_efi_memmap;