Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Thu, Sep 7, 2017 at 11:47 AM, Roman Gushchin wrote:
> On Thu, Sep 07, 2017 at 11:44:12AM -0700, Shakeel Butt wrote:
>> >> As far as other types of pages go: page cache and anon are already
>> >> batched pretty well, but I think kmem might benefit from this
>> >> too. Have you considered using the stock in memcg_kmem_uncharge()?
>> >
>> > Good idea!
>> > I'll try to find an appropriate testcase and check if it really
>> > brings any benefits. If so, I'll master a patch.
>> >
>> Hi Roman, did you get the chance to try this on memcg_kmem_uncharge()?
>
> Hi Shakeel!
>
> Not yet, I'll try it asap.
> Do you have an example where it's costly?
>
Nah, I was also curious about such examples.
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Thu, Sep 07, 2017 at 11:44:12AM -0700, Shakeel Butt wrote:
> >> As far as other types of pages go: page cache and anon are already
> >> batched pretty well, but I think kmem might benefit from this
> >> too. Have you considered using the stock in memcg_kmem_uncharge()?
> >
> > Good idea!
> > I'll try to find an appropriate testcase and check if it really
> > brings any benefits. If so, I'll master a patch.
> >
> Hi Roman, did you get the chance to try this on memcg_kmem_uncharge()?

Hi Shakeel!

Not yet, I'll try it asap.
Do you have an example where it's costly?

Thanks!
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
>> As far as other types of pages go: page cache and anon are already
>> batched pretty well, but I think kmem might benefit from this
>> too. Have you considered using the stock in memcg_kmem_uncharge()?
>
> Good idea!
> I'll try to find an appropriate testcase and check if it really
> brings any benefits. If so, I'll master a patch.
>
Hi Roman, did you get the chance to try this on memcg_kmem_uncharge()?
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> We've noticed a quite noticeable performance overhead on some hosts
> with significant network traffic when socket memory accounting
> is enabled.
>
> Perf top shows that the socket memory uncharging path is hot:
>   2.13%  [kernel]      [k] page_counter_cancel
>   1.14%  [kernel]      [k] __sk_mem_reduce_allocated
>   1.14%  [kernel]      [k] _raw_spin_lock
>   0.87%  [kernel]      [k] _raw_spin_lock_irqsave
>   0.84%  [kernel]      [k] tcp_ack
>   0.84%  [kernel]      [k] ixgbe_poll
>   0.83%  < workload >
>   0.82%  [kernel]      [k] enqueue_entity
>   0.68%  [kernel]      [k] __fget
>   0.68%  [kernel]      [k] tcp_delack_timer_handler
>   0.67%  [kernel]      [k] __schedule
>   0.60%  < workload >
>   0.59%  [kernel]      [k] __inet6_lookup_established
>   0.55%  [kernel]      [k] __switch_to
>   0.55%  [kernel]      [k] menu_select
>   0.54%  libc-2.20.so  [.] __memcpy_avx_unaligned
>
> To address this issue, the existing per-cpu stock infrastructure
> can be used.
>
> refill_stock() can be called from mem_cgroup_uncharge_skmem()
> to move the charge to a per-cpu stock instead of calling the atomic
> page_counter_uncharge().
>
> To prevent the uncontrolled growth of per-cpu stocks,
> refill_stock() will explicitly drain the cached charge
> if the cached value exceeds CHARGE_BATCH.
>
> This allows us to significantly optimize the load:
>   1.21%  [kernel]        [k] _raw_spin_lock
>   1.01%  [kernel]        [k] ixgbe_poll
>   0.92%  [kernel]        [k] _raw_spin_lock_irqsave
>   0.90%  [kernel]        [k] enqueue_entity
>   0.86%  [kernel]        [k] tcp_ack
>   0.85%  < workload >
>   0.74%  perf-11120.map  [.] 0x0061bf24
>   0.73%  [kernel]        [k] __schedule
>   0.67%  [kernel]        [k] __fget
>   0.63%  [kernel]        [k] __inet6_lookup_established
>   0.62%  [kernel]        [k] menu_select
>   0.59%  < workload >
>   0.59%  [kernel]        [k] __switch_to
>   0.57%  libc-2.20.so    [.] __memcpy_avx_unaligned
>
> Signed-off-by: Roman Gushchin
> Cc: Johannes Weiner
> Cc: Michal Hocko
> Cc: Vladimir Davydov
> Cc: cgro...@vger.kernel.org
> Cc: kernel-t...@fb.com
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org

Acked-by: Michal Hocko

> ---
>  mm/memcontrol.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index b9cf3cf4a3d0..a69d23082abf 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  	}
>  	stock->nr_pages += nr_pages;
>
> +	if (stock->nr_pages > CHARGE_BATCH)
> +		drain_stock(stock);
> +
>  	local_irq_restore(flags);
>  }
>
> @@ -5886,8 +5889,7 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
>
>  	this_cpu_sub(memcg->stat->count[MEMCG_SOCK], nr_pages);
>
> -	page_counter_uncharge(&memcg->memory, nr_pages);
> -	css_put_many(&memcg->css, nr_pages);
> +	refill_stock(memcg, nr_pages);
>  }
>
>  static int __init cgroup_memory(char *s)
> --
> 2.13.5

--
Michal Hocko
SUSE Labs
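For readers following along, the batching idea in the patch can be modeled outside the kernel. Below is a hypothetical, minimal user-space sketch: plain variables stand in for the atomic page counter and the per-cpu stock, and the names CHARGE_BATCH, refill_stock() and drain_stock() mirror mm/memcontrol.c, but none of this is the actual kernel code.

```c
#include <assert.h>

#define CHARGE_BATCH 32UL   /* the kernel batches charges in 32-page units */

static unsigned long page_counter = 1024; /* models the shared memcg->memory counter */
static unsigned long stock_nr_pages;      /* models this CPU's cached charge */

/* Flush the cached charge back to the shared counter (the expensive path). */
static void drain_stock(void)
{
	page_counter -= stock_nr_pages;
	stock_nr_pages = 0;
}

/* Uncharge via the stock: a cheap local add, with an explicit drain once
 * the cache exceeds CHARGE_BATCH, as the patch adds to refill_stock(). */
static void refill_stock(unsigned long nr_pages)
{
	stock_nr_pages += nr_pages;
	if (stock_nr_pages > CHARGE_BATCH)
		drain_stock();
}
```

In this model a burst of small uncharges only touches the local stock; the shared counter is updated once, when the cache crosses the batch size, which is where the page_counter_cancel contention in the perf profile would disappear.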
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Wed 30-08-17 13:57:29, Roman Gushchin wrote:
> On Wed, Aug 30, 2017 at 02:55:43PM +0200, Michal Hocko wrote:
> > On Wed 30-08-17 13:44:59, Roman Gushchin wrote:
> > > On Wed, Aug 30, 2017 at 02:36:55PM +0200, Michal Hocko wrote:
> > > > On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> > > > [...]
> > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > > index b9cf3cf4a3d0..a69d23082abf 100644
> > > > > --- a/mm/memcontrol.c
> > > > > +++ b/mm/memcontrol.c
> > > > > @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> > > > >  	}
> > > > >  	stock->nr_pages += nr_pages;
> > > > >
> > > > > +	if (stock->nr_pages > CHARGE_BATCH)
> > > > > +		drain_stock(stock);
> > > > > +
> > > > >  	local_irq_restore(flags);
> > > > >  }
> > > >
> > > > Why do we need this? In other words, why cannot we rely on the draining we
> > > > already do?
> > >
> > > The existing draining depends on memory pressure, so to keep
> > > the accounting (which we expose to a user) reasonably accurate
> > > even without memory pressure, we need to limit the size of per-cpu stocks.
> >
> > Why don't we need this for regular page charges? Or maybe we do, but that
> > sounds like a separate and unrelated fix to me.
>
> Because we never refill more than CHARGE_BATCH.

You are right that a single process will not, but try_charge is a preemptible
context, so multiple processes might pass consume_stock and then charge
N*CHARGE_BATCH. But I agree that this is quite unlikely, so a separate patch
is probably not worth it.

--
Michal Hocko
SUSE Labs
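Michal's N*CHARGE_BATCH point can be illustrated with a hypothetical toy model (not kernel code): if several tasks each park a full batch on the same CPU's stock and nothing drains it, the cached charge grows without bound, while the explicit cap the patch adds to refill_stock() keeps it near CHARGE_BATCH.

```c
#include <assert.h>

#define CHARGE_BATCH 32UL

static unsigned long stock_uncapped;  /* stock without the explicit drain */
static unsigned long stock_capped;    /* stock with the patch's cap */

/* Refill as refill_stock() behaved before the patch: just accumulate. */
static void refill_uncapped(unsigned long nr)
{
	stock_uncapped += nr;
}

/* Refill with the patch's explicit cap: drain once over CHARGE_BATCH. */
static void refill_capped(unsigned long nr)
{
	stock_capped += nr;
	if (stock_capped > CHARGE_BATCH)
		stock_capped = 0;  /* stands in for drain_stock() flushing to the counter */
}
```

With four tasks each refilling a full batch, the uncapped stock reaches 4*CHARGE_BATCH pages of unaccounted charge, whereas the capped stock is repeatedly flushed, which is exactly the accounting-accuracy argument in the thread.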
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Wed, Aug 30, 2017 at 02:55:43PM +0200, Michal Hocko wrote:
> On Wed 30-08-17 13:44:59, Roman Gushchin wrote:
> > On Wed, Aug 30, 2017 at 02:36:55PM +0200, Michal Hocko wrote:
> > > On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> > > [...]
> > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > > index b9cf3cf4a3d0..a69d23082abf 100644
> > > > --- a/mm/memcontrol.c
> > > > +++ b/mm/memcontrol.c
> > > > @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> > > >  	}
> > > >  	stock->nr_pages += nr_pages;
> > > >
> > > > +	if (stock->nr_pages > CHARGE_BATCH)
> > > > +		drain_stock(stock);
> > > > +
> > > >  	local_irq_restore(flags);
> > > >  }
> > >
> > > Why do we need this? In other words, why cannot we rely on the draining we
> > > already do?
> >
> > The existing draining depends on memory pressure, so to keep
> > the accounting (which we expose to a user) reasonably accurate
> > even without memory pressure, we need to limit the size of per-cpu stocks.
>
> Why don't we need this for regular page charges? Or maybe we do, but that
> sounds like a separate and unrelated fix to me.

Because we never refill more than CHARGE_BATCH.
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Wed 30-08-17 13:44:59, Roman Gushchin wrote:
> On Wed, Aug 30, 2017 at 02:36:55PM +0200, Michal Hocko wrote:
> > On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> > [...]
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index b9cf3cf4a3d0..a69d23082abf 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> > >  	}
> > >  	stock->nr_pages += nr_pages;
> > >
> > > +	if (stock->nr_pages > CHARGE_BATCH)
> > > +		drain_stock(stock);
> > > +
> > >  	local_irq_restore(flags);
> > >  }
> >
> > Why do we need this? In other words, why cannot we rely on the draining we
> > already do?
>
> The existing draining depends on memory pressure, so to keep
> the accounting (which we expose to a user) reasonably accurate
> even without memory pressure, we need to limit the size of per-cpu stocks.

Why don't we need this for regular page charges? Or maybe we do, but that
sounds like a separate and unrelated fix to me.

--
Michal Hocko
SUSE Labs
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Wed, Aug 30, 2017 at 02:36:55PM +0200, Michal Hocko wrote:
> On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
> [...]
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index b9cf3cf4a3d0..a69d23082abf 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
> >  	}
> >  	stock->nr_pages += nr_pages;
> >
> > +	if (stock->nr_pages > CHARGE_BATCH)
> > +		drain_stock(stock);
> > +
> >  	local_irq_restore(flags);
> >  }
>
> Why do we need this? In other words, why cannot we rely on the draining we
> already do?

The existing draining depends on memory pressure, so to keep
the accounting (which we expose to a user) reasonably accurate
even without memory pressure, we need to limit the size of per-cpu stocks.

Thanks!
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Tue 29-08-17 11:01:50, Roman Gushchin wrote:
[...]
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index b9cf3cf4a3d0..a69d23082abf 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1792,6 +1792,9 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  	}
>  	stock->nr_pages += nr_pages;
>
> +	if (stock->nr_pages > CHARGE_BATCH)
> +		drain_stock(stock);
> +
>  	local_irq_restore(flags);
>  }

Why do we need this? In other words, why cannot we rely on the draining we
already do?

> @@ -5886,8 +5889,7 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
>
>  	this_cpu_sub(memcg->stat->count[MEMCG_SOCK], nr_pages);
>
> -	page_counter_uncharge(&memcg->memory, nr_pages);
> -	css_put_many(&memcg->css, nr_pages);
> +	refill_stock(memcg, nr_pages);
>  }

This makes sense to me.

--
Michal Hocko
SUSE Labs
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Tue, Aug 29, 2017 at 03:26:21PM -0400, Johannes Weiner wrote:
> On Tue, Aug 29, 2017 at 11:01:50AM +0100, Roman Gushchin wrote:
> > We've noticed a quite noticeable performance overhead on some hosts
> > with significant network traffic when socket memory accounting
> > is enabled.
> >
> > Perf top shows that the socket memory uncharging path is hot:
> >   2.13%  [kernel]      [k] page_counter_cancel
> >   1.14%  [kernel]      [k] __sk_mem_reduce_allocated
> >   1.14%  [kernel]      [k] _raw_spin_lock
> >   0.87%  [kernel]      [k] _raw_spin_lock_irqsave
> >   0.84%  [kernel]      [k] tcp_ack
> >   0.84%  [kernel]      [k] ixgbe_poll
> >   0.83%  < workload >
> >   0.82%  [kernel]      [k] enqueue_entity
> >   0.68%  [kernel]      [k] __fget
> >   0.68%  [kernel]      [k] tcp_delack_timer_handler
> >   0.67%  [kernel]      [k] __schedule
> >   0.60%  < workload >
> >   0.59%  [kernel]      [k] __inet6_lookup_established
> >   0.55%  [kernel]      [k] __switch_to
> >   0.55%  [kernel]      [k] menu_select
> >   0.54%  libc-2.20.so  [.] __memcpy_avx_unaligned
> >
> > To address this issue, the existing per-cpu stock infrastructure
> > can be used.
> >
> > refill_stock() can be called from mem_cgroup_uncharge_skmem()
> > to move the charge to a per-cpu stock instead of calling the atomic
> > page_counter_uncharge().
> >
> > To prevent the uncontrolled growth of per-cpu stocks,
> > refill_stock() will explicitly drain the cached charge
> > if the cached value exceeds CHARGE_BATCH.
> >
> > This allows us to significantly optimize the load:
> >   1.21%  [kernel]        [k] _raw_spin_lock
> >   1.01%  [kernel]        [k] ixgbe_poll
> >   0.92%  [kernel]        [k] _raw_spin_lock_irqsave
> >   0.90%  [kernel]        [k] enqueue_entity
> >   0.86%  [kernel]        [k] tcp_ack
> >   0.85%  < workload >
> >   0.74%  perf-11120.map  [.] 0x0061bf24
> >   0.73%  [kernel]        [k] __schedule
> >   0.67%  [kernel]        [k] __fget
> >   0.63%  [kernel]        [k] __inet6_lookup_established
> >   0.62%  [kernel]        [k] menu_select
> >   0.59%  < workload >
> >   0.59%  [kernel]        [k] __switch_to
> >   0.57%  libc-2.20.so    [.] __memcpy_avx_unaligned
> >
> > Signed-off-by: Roman Gushchin
> > Cc: Johannes Weiner
> > Cc: Michal Hocko
> > Cc: Vladimir Davydov
> > Cc: cgro...@vger.kernel.org
> > Cc: kernel-t...@fb.com
> > Cc: linux...@kvack.org
> > Cc: linux-kernel@vger.kernel.org
>
> Acked-by: Johannes Weiner
>
> Neat!

Hi, Andrew!

Can you, please, pull this patch?

Thank you!

Roman
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Tue, Aug 29, 2017 at 03:26:21PM -0400, Johannes Weiner wrote:
> On Tue, Aug 29, 2017 at 11:01:50AM +0100, Roman Gushchin wrote:
> > We've noticed a quite noticeable performance overhead on some hosts
> > with significant network traffic when socket memory accounting
> > is enabled.
> >
> > Perf top shows that the socket memory uncharging path is hot:
> >   2.13%  [kernel]      [k] page_counter_cancel
> >   1.14%  [kernel]      [k] __sk_mem_reduce_allocated
> >   1.14%  [kernel]      [k] _raw_spin_lock
> >   0.87%  [kernel]      [k] _raw_spin_lock_irqsave
> >   0.84%  [kernel]      [k] tcp_ack
> >   0.84%  [kernel]      [k] ixgbe_poll
> >   0.83%  < workload >
> >   0.82%  [kernel]      [k] enqueue_entity
> >   0.68%  [kernel]      [k] __fget
> >   0.68%  [kernel]      [k] tcp_delack_timer_handler
> >   0.67%  [kernel]      [k] __schedule
> >   0.60%  < workload >
> >   0.59%  [kernel]      [k] __inet6_lookup_established
> >   0.55%  [kernel]      [k] __switch_to
> >   0.55%  [kernel]      [k] menu_select
> >   0.54%  libc-2.20.so  [.] __memcpy_avx_unaligned
> >
> > To address this issue, the existing per-cpu stock infrastructure
> > can be used.
> >
> > refill_stock() can be called from mem_cgroup_uncharge_skmem()
> > to move the charge to a per-cpu stock instead of calling the atomic
> > page_counter_uncharge().
> >
> > To prevent the uncontrolled growth of per-cpu stocks,
> > refill_stock() will explicitly drain the cached charge
> > if the cached value exceeds CHARGE_BATCH.
> >
> > This allows us to significantly optimize the load:
> >   1.21%  [kernel]        [k] _raw_spin_lock
> >   1.01%  [kernel]        [k] ixgbe_poll
> >   0.92%  [kernel]        [k] _raw_spin_lock_irqsave
> >   0.90%  [kernel]        [k] enqueue_entity
> >   0.86%  [kernel]        [k] tcp_ack
> >   0.85%  < workload >
> >   0.74%  perf-11120.map  [.] 0x0061bf24
> >   0.73%  [kernel]        [k] __schedule
> >   0.67%  [kernel]        [k] __fget
> >   0.63%  [kernel]        [k] __inet6_lookup_established
> >   0.62%  [kernel]        [k] menu_select
> >   0.59%  < workload >
> >   0.59%  [kernel]        [k] __switch_to
> >   0.57%  libc-2.20.so    [.] __memcpy_avx_unaligned
> >
> > Signed-off-by: Roman Gushchin
> > Cc: Johannes Weiner
> > Cc: Michal Hocko
> > Cc: Vladimir Davydov
> > Cc: cgro...@vger.kernel.org
> > Cc: kernel-t...@fb.com
> > Cc: linux...@kvack.org
> > Cc: linux-kernel@vger.kernel.org
>
> Acked-by: Johannes Weiner
>
> Neat!
>
> As far as other types of pages go: page cache and anon are already
> batched pretty well, but I think kmem might benefit from this too.
> Have you considered using the stock in memcg_kmem_uncharge()?

Good idea!

I'll try to find an appropriate testcase and check if it really
brings any benefits. If so, I'll master a patch.

Thanks!
Re: [PATCH] mm: memcontrol: use per-cpu stocks for socket memory uncharging
On Tue, Aug 29, 2017 at 11:01:50AM +0100, Roman Gushchin wrote:
> We've noticed a quite noticeable performance overhead on some hosts
> with significant network traffic when socket memory accounting
> is enabled.
>
> Perf top shows that the socket memory uncharging path is hot:
>   2.13%  [kernel]      [k] page_counter_cancel
>   1.14%  [kernel]      [k] __sk_mem_reduce_allocated
>   1.14%  [kernel]      [k] _raw_spin_lock
>   0.87%  [kernel]      [k] _raw_spin_lock_irqsave
>   0.84%  [kernel]      [k] tcp_ack
>   0.84%  [kernel]      [k] ixgbe_poll
>   0.83%  < workload >
>   0.82%  [kernel]      [k] enqueue_entity
>   0.68%  [kernel]      [k] __fget
>   0.68%  [kernel]      [k] tcp_delack_timer_handler
>   0.67%  [kernel]      [k] __schedule
>   0.60%  < workload >
>   0.59%  [kernel]      [k] __inet6_lookup_established
>   0.55%  [kernel]      [k] __switch_to
>   0.55%  [kernel]      [k] menu_select
>   0.54%  libc-2.20.so  [.] __memcpy_avx_unaligned
>
> To address this issue, the existing per-cpu stock infrastructure
> can be used.
>
> refill_stock() can be called from mem_cgroup_uncharge_skmem()
> to move the charge to a per-cpu stock instead of calling the atomic
> page_counter_uncharge().
>
> To prevent the uncontrolled growth of per-cpu stocks,
> refill_stock() will explicitly drain the cached charge
> if the cached value exceeds CHARGE_BATCH.
>
> This allows us to significantly optimize the load:
>   1.21%  [kernel]        [k] _raw_spin_lock
>   1.01%  [kernel]        [k] ixgbe_poll
>   0.92%  [kernel]        [k] _raw_spin_lock_irqsave
>   0.90%  [kernel]        [k] enqueue_entity
>   0.86%  [kernel]        [k] tcp_ack
>   0.85%  < workload >
>   0.74%  perf-11120.map  [.] 0x0061bf24
>   0.73%  [kernel]        [k] __schedule
>   0.67%  [kernel]        [k] __fget
>   0.63%  [kernel]        [k] __inet6_lookup_established
>   0.62%  [kernel]        [k] menu_select
>   0.59%  < workload >
>   0.59%  [kernel]        [k] __switch_to
>   0.57%  libc-2.20.so    [.] __memcpy_avx_unaligned
>
> Signed-off-by: Roman Gushchin
> Cc: Johannes Weiner
> Cc: Michal Hocko
> Cc: Vladimir Davydov
> Cc: cgro...@vger.kernel.org
> Cc: kernel-t...@fb.com
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org

Acked-by: Johannes Weiner

Neat!

As far as other types of pages go: page cache and anon are already
batched pretty well, but I think kmem might benefit from this too.
Have you considered using the stock in memcg_kmem_uncharge()?
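To make the potential benefit for kmem concrete, here is a hypothetical user-space model (not the actual memcg_kmem_uncharge() code) that counts contended-counter updates: uncharging directly costs one shared atomic update per call, while going through a CHARGE_BATCH-sized stock amortizes that to roughly one update per batch.

```c
#include <assert.h>

#define CHARGE_BATCH 32UL

static unsigned long atomic_ops;  /* models contended page_counter updates */
static unsigned long stock;       /* models a per-cpu cached charge */

/* Direct uncharge: every call hits the shared counter. */
static void uncharge_direct(unsigned long nr)
{
	(void)nr;
	atomic_ops++;
}

/* Stocked uncharge: the shared counter is only touched on drain,
 * once the cached charge exceeds CHARGE_BATCH. */
static void uncharge_stocked(unsigned long nr)
{
	stock += nr;
	if (stock > CHARGE_BATCH) {
		stock = 0;      /* drain: one update covers the whole batch */
		atomic_ops++;
	}
}
```

In this model, 1000 one-page uncharges cost 1000 shared updates directly but only ~30 via the stock, which is the kind of win the thread speculates kmem uncharging might also see.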