Re: [Devel] [PATCH rh7 2/2] mm/memcg: reclaim only kmem if kmem limit reached.

2017-08-31 Thread Andrey Ryabinin


On 08/31/2017 12:58 PM, Konstantin Khorenko wrote:
> Do we want to push it to mainstream as well?
> 

I don't think so. Distributions are slowly moving towards cgroup v2, where
the kmem limit simply doesn't exist. And for legacy cgroup v1, the lack of
reclaim on hitting the kmem limit wasn't a mistake but a deliberate choice.
There is no clear use case for this, yet it adds a lot of complexity to the
reclaim code and just looks a bit ugly.


> -- 
> Best regards,
> 
> Konstantin Khorenko,
> Virtuozzo Linux Kernel Team
> 
___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


Re: [Devel] [PATCH rh7 2/2] mm/memcg: reclaim only kmem if kmem limit reached.

2017-08-31 Thread Konstantin Khorenko

Do we want to push it to mainstream as well?

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 08/25/2017 06:38 PM, Andrey Ryabinin wrote:

If kmem limit on memcg reached, we go into memory reclaim,
and reclaim everything we can, including page cache and anon.
Reclaiming page cache or anon won't help since we need to lower
only kmem usage. This patch fixes the problem by avoiding
non-kmem reclaim on hitting the kmem limit.

https://jira.sw.ru/browse/PSBM-69226
Signed-off-by: Andrey Ryabinin 
---
 include/linux/memcontrol.h | 10 ++
 include/linux/swap.h   |  2 +-
 mm/memcontrol.c| 30 --
 mm/vmscan.c| 31 ---
 4 files changed, 51 insertions(+), 22 deletions(-)
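
Illustration only, not part of the patch: a minimal userspace sketch of the
idea described above, assuming a toy model in which a cgroup holds LRU pages
(page cache and anon) and reclaimable slab pages. The flag values mirror the
MEM_CGROUP_RECLAIM_* definitions from the diff; struct toy_memcg, take() and
toy_reclaim() are hypothetical names invented for this sketch and do not
correspond to the actual mm/vmscan.c changes.

```c
#include <assert.h>
#include <stddef.h>

/* Same bit layout as the MEM_CGROUP_RECLAIM_* flags in the patch. */
#define MEM_CGROUP_RECLAIM_NOSWAP	(1 << 0)
#define MEM_CGROUP_RECLAIM_SHRINK	(1 << 1)
#define MEM_CGROUP_RECLAIM_KMEM		(1 << 2)

/* Hypothetical toy model of a memcg: page counts per memory class. */
struct toy_memcg {
	unsigned long lru_pages;	/* page cache + anon */
	unsigned long slab_pages;	/* reclaimable kernel memory */
};

/* Take up to 'want' pages out of a pool and return how many were taken. */
static unsigned long take(unsigned long *pool, unsigned long want)
{
	unsigned long got = *pool < want ? *pool : want;

	*pool -= got;
	return got;
}

/* Hypothetical reclaim pass: returns the number of pages reclaimed. */
static unsigned long toy_reclaim(struct toy_memcg *cg, unsigned long nr,
				 int flags)
{
	unsigned long freed = 0;

	assert(cg != NULL);

	/*
	 * When the kmem limit was hit, freeing LRU pages does not lower
	 * kmem usage, so skip them entirely.
	 */
	if (!(flags & MEM_CGROUP_RECLAIM_KMEM))
		freed += take(&cg->lru_pages, nr);

	/* Slab is shrunk for whatever is still missing. */
	if (freed < nr)
		freed += take(&cg->slab_pages, nr - freed);

	return freed;
}
```

With MEM_CGROUP_RECLAIM_KMEM set, toy_reclaim() leaves lru_pages untouched
and only shrinks slab_pages; without the flag it takes LRU pages first.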

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 1a52e58ab7de..1d6bc80c4c90 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -45,6 +45,16 @@ struct mem_cgroup_reclaim_cookie {
unsigned int generation;
 };

+/*
+ * Reclaim flags for mem_cgroup_hierarchical_reclaim
+ */
+#define MEM_CGROUP_RECLAIM_NOSWAP_BIT	0x0
+#define MEM_CGROUP_RECLAIM_NOSWAP	(1 << MEM_CGROUP_RECLAIM_NOSWAP_BIT)
+#define MEM_CGROUP_RECLAIM_SHRINK_BIT	0x1
+#define MEM_CGROUP_RECLAIM_SHRINK	(1 << MEM_CGROUP_RECLAIM_SHRINK_BIT)
+#define MEM_CGROUP_RECLAIM_KMEM_BIT	0x2
+#define MEM_CGROUP_RECLAIM_KMEM		(1 << MEM_CGROUP_RECLAIM_KMEM_BIT)
+
 #ifdef CONFIG_MEMCG
 int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
  gfp_t gfp_mask, struct mem_cgroup **memcgp);
diff --git a/include/linux/swap.h b/include/linux/swap.h
index bd162f9bef0d..bd47451ec95a 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -324,7 +324,7 @@ extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 extern int __isolate_lru_page(struct page *page, isolate_mode_t mode);
 extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem,
 						  unsigned long nr_pages,
-						  gfp_t gfp_mask, bool noswap);
+						  gfp_t gfp_mask, int flags);
 extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
 						gfp_t gfp_mask, bool noswap,
 						struct zone *zone,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 97824e281d7a..f9a5f3819a31 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -511,16 +511,6 @@ enum res_type {
 #define OOM_CONTROL		(0)

 /*
- * Reclaim flags for mem_cgroup_hierarchical_reclaim
- */
-#define MEM_CGROUP_RECLAIM_NOSWAP_BIT	0x0
-#define MEM_CGROUP_RECLAIM_NOSWAP	(1 << MEM_CGROUP_RECLAIM_NOSWAP_BIT)
-#define MEM_CGROUP_RECLAIM_SHRINK_BIT	0x1
-#define MEM_CGROUP_RECLAIM_SHRINK	(1 << MEM_CGROUP_RECLAIM_SHRINK_BIT)
-#define MEM_CGROUP_RECLAIM_KMEM_BIT	0x2
-#define MEM_CGROUP_RECLAIM_KMEM		(1 << MEM_CGROUP_RECLAIM_KMEM_BIT)
-
-/*
  * The memcg_create_mutex will be held whenever a new cgroup is created.
  * As a consequence, any change that needs to protect against new child cgroups
  * appearing has to hold it as well.
@@ -2137,7 +2127,7 @@ static unsigned long mem_cgroup_reclaim(struct mem_cgroup *memcg,
 		if (loop)
 			drain_all_stock_async(memcg);
 		total += try_to_free_mem_cgroup_pages(memcg, SWAP_CLUSTER_MAX,
-						      gfp_mask, noswap);
+						      gfp_mask, flags);
 		if (test_thread_flag(TIF_MEMDIE) ||
 		    fatal_signal_pending(current))
 			return 1;
@@ -2150,6 +2140,16 @@ static unsigned long mem_cgroup_reclaim(struct mem_cgroup *memcg,
 			break;
 		if (mem_cgroup_margin(memcg, flags & MEM_CGROUP_RECLAIM_KMEM))
 			break;
+
+		/*
+		 * Try harder to reclaim dcache. dcache reclaim may
+		 * temporarily fail due to dcache->dlock being held
+		 * by someone else. We must try harder to avoid premature
+		 * slab allocation failures.
+		 */
+		if (flags & MEM_CGROUP_RECLAIM_KMEM &&
+		    page_counter_read(&memcg->dcache))
+			continue;
 		/*
 		 * If nothing was reclaimed after two attempts, there
 		 * may be no reclaimable pages in this hierarchy.
@@ -2778,11 +2778,13 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, bool kmem_charge
 	struct mem_cgroup *mem_over_limit;
 	struct page_counter *counter;
 	unsigned long nr_reclaimed;
-	unsigned long flags = 0;
+	unsigned long flags;
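
Illustration only, not part of the patch: a minimal userspace sketch of the
"try harder to reclaim dcache" retry rule added to mem_cgroup_reclaim() in
the hunk above. It assumes a toy cgroup whose shrinker frees nothing for a
few rounds because a lock is held elsewhere. struct toy_cg,
toy_try_reclaim(), toy_reclaim_loop() and the loop cap of 100 are
hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RECLAIM_KMEM	(1 << 2)	/* same bit as MEM_CGROUP_RECLAIM_KMEM */
#define TOY_MAX_RECLAIM_LOOPS	100		/* assumed loop cap for this sketch */

/* Hypothetical toy cgroup state. */
struct toy_cg {
	unsigned long dcache_pages;	/* stands in for the dcache page counter */
	unsigned long busy_rounds;	/* rounds in which the lock is held elsewhere */
	unsigned long attempts;		/* reclaim attempts made so far */
};

/* One reclaim attempt: frees nothing while the simulated lock is contended. */
static unsigned long toy_try_reclaim(struct toy_cg *cg, unsigned long nr)
{
	unsigned long got;

	cg->attempts++;
	if (cg->busy_rounds) {
		cg->busy_rounds--;
		return 0;
	}
	got = cg->dcache_pages < nr ? cg->dcache_pages : nr;
	cg->dcache_pages -= got;
	return got;
}

/* Retry loop modelled on the hunk above; returns total pages reclaimed. */
static unsigned long toy_reclaim_loop(struct toy_cg *cg, unsigned long nr,
				      int flags)
{
	unsigned long total = 0;
	int loop;

	assert(cg != NULL);

	for (loop = 0; loop < TOY_MAX_RECLAIM_LOOPS; loop++) {
		total += toy_try_reclaim(cg, nr);
		if (total >= nr)
			break;		/* enough room was made */

		/*
		 * Try harder: dcache is still charged, so a fruitless
		 * pass may only be a transient lock failure.
		 */
		if ((flags & TOY_RECLAIM_KMEM) && cg->dcache_pages)
			continue;

		/* Otherwise give up after two attempts that freed nothing. */
		if (loop && !total)
			break;
	}
	return total;
}
```

With TOY_RECLAIM_KMEM set the loop keeps retrying through the contended
rounds until the shrinker makes progress or the cap is reached; without it
the loop stops after the second attempt that reclaims nothing.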


Re: [Devel] [PATCH rh7 2/2] mm/memcg: reclaim only kmem if kmem limit reached.

2017-08-28 Thread Andrey Ryabinin
On 08/28/2017 12:02 PM, Stanislav Kinsburskiy wrote:
> 
> 
> 25.08.2017 18:38, Andrey Ryabinin wrote:
>> If kmem limit on memcg reached, we go into memory reclaim,
>> and reclaim everything we can, including page cache and anon.
>> Reclaiming page cache or anon won't help since we need to lower
>> only kmem usage. This patch fixes the problem by avoiding
>> non-kmem reclaim on hitting the kmem limit.
>>
> 
> Can't there be a situation where some object in anon memory or page cache
> (indirectly) holds some object in kmem?
> 

None that I know of.


Re: [Devel] [PATCH rh7 2/2] mm/memcg: reclaim only kmem if kmem limit reached.

2017-08-28 Thread Stanislav Kinsburskiy


25.08.2017 18:38, Andrey Ryabinin wrote:
> If kmem limit on memcg reached, we go into memory reclaim,
> and reclaim everything we can, including page cache and anon.
> Reclaiming page cache or anon won't help since we need to lower
> only kmem usage. This patch fixes the problem by avoiding
> non-kmem reclaim on hitting the kmem limit.
> 

Can't there be a situation where some object in anon memory or page cache
(indirectly) holds some object in kmem?

> https://jira.sw.ru/browse/PSBM-69226
> Signed-off-by: Andrey Ryabinin 
> ---
>  include/linux/memcontrol.h | 10 ++
>  include/linux/swap.h   |  2 +-
>  mm/memcontrol.c| 30 --
>  mm/vmscan.c| 31 ---
>  4 files changed, 51 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 1a52e58ab7de..1d6bc80c4c90 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -45,6 +45,16 @@ struct mem_cgroup_reclaim_cookie {
>   unsigned int generation;
>  };
>  
> +/*
> + * Reclaim flags for mem_cgroup_hierarchical_reclaim
> + */
> +#define MEM_CGROUP_RECLAIM_NOSWAP_BIT	0x0
> +#define MEM_CGROUP_RECLAIM_NOSWAP	(1 << MEM_CGROUP_RECLAIM_NOSWAP_BIT)
> +#define MEM_CGROUP_RECLAIM_SHRINK_BIT	0x1
> +#define MEM_CGROUP_RECLAIM_SHRINK	(1 << MEM_CGROUP_RECLAIM_SHRINK_BIT)
> +#define MEM_CGROUP_RECLAIM_KMEM_BIT	0x2
> +#define MEM_CGROUP_RECLAIM_KMEM		(1 << MEM_CGROUP_RECLAIM_KMEM_BIT)
> +
>  #ifdef CONFIG_MEMCG
>  int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
> gfp_t gfp_mask, struct mem_cgroup **memcgp);
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index bd162f9bef0d..bd47451ec95a 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -324,7 +324,7 @@ extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
>  extern int __isolate_lru_page(struct page *page, isolate_mode_t mode);
>  extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem,
> unsigned long nr_pages,
> -   gfp_t gfp_mask, bool noswap);
> +   gfp_t gfp_mask, int flags);
>  extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
>   gfp_t gfp_mask, bool noswap,
>   struct zone *zone,
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 97824e281d7a..f9a5f3819a31 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -511,16 +511,6 @@ enum res_type {
>  #define OOM_CONTROL  (0)
>  
>  /*
> - * Reclaim flags for mem_cgroup_hierarchical_reclaim
> - */
> -#define MEM_CGROUP_RECLAIM_NOSWAP_BIT	0x0
> -#define MEM_CGROUP_RECLAIM_NOSWAP	(1 << MEM_CGROUP_RECLAIM_NOSWAP_BIT)
> -#define MEM_CGROUP_RECLAIM_SHRINK_BIT	0x1
> -#define MEM_CGROUP_RECLAIM_SHRINK	(1 << MEM_CGROUP_RECLAIM_SHRINK_BIT)
> -#define MEM_CGROUP_RECLAIM_KMEM_BIT	0x2
> -#define MEM_CGROUP_RECLAIM_KMEM		(1 << MEM_CGROUP_RECLAIM_KMEM_BIT)
> -
> -/*
>   * The memcg_create_mutex will be held whenever a new cgroup is created.
>   * As a consequence, any change that needs to protect against new child cgroups
>   * appearing has to hold it as well.
> @@ -2137,7 +2127,7 @@ static unsigned long mem_cgroup_reclaim(struct mem_cgroup *memcg,
>   if (loop)
>   drain_all_stock_async(memcg);
>   total += try_to_free_mem_cgroup_pages(memcg, SWAP_CLUSTER_MAX,
> -   gfp_mask, noswap);
> +   gfp_mask, flags);
>   if (test_thread_flag(TIF_MEMDIE) ||
>   fatal_signal_pending(current))
>   return 1;
> @@ -2150,6 +2140,16 @@ static unsigned long mem_cgroup_reclaim(struct mem_cgroup *memcg,
>   			break;
>   		if (mem_cgroup_margin(memcg, flags & MEM_CGROUP_RECLAIM_KMEM))
>   			break;
> +
> +		/*
> +		 * Try harder to reclaim dcache. dcache reclaim may
> +		 * temporarily fail due to dcache->dlock being held
> +		 * by someone else. We must try harder to avoid premature
> +		 * slab allocation failures.
> +		 */
> +		if (flags & MEM_CGROUP_RECLAIM_KMEM &&
> +		    page_counter_read(&memcg->dcache))
> +			continue;
>   /*
>* If nothing was reclaimed after two attempts, there
>* may be no reclaimable pages in this hierarchy.
> @@ -2778,11 +2778,13 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, bool kmem_charge
>   struct mem_cgroup