Re: High kmalloc-32 slab cache consumption with 10k containers

2021-04-07 Thread Kirill Tkhai
On 07.04.2021 14:47, Bharata B Rao wrote: > On Wed, Apr 07, 2021 at 01:07:27PM +0300, Kirill Tkhai wrote: >>> Here is how the calculation turns out to be in my setup: >>> >>> Number of possible NUMA nodes = 2 >>> Number of mounts per container = 7 (Check b

Re: High kmalloc-32 slab cache consumption with 10k containers

2021-04-07 Thread Kirill Tkhai
On 07.04.2021 08:05, Bharata B Rao wrote: > On Wed, Apr 07, 2021 at 08:28:07AM +1000, Dave Chinner wrote: >> On Mon, Apr 05, 2021 at 11:18:48AM +0530, Bharata B Rao wrote: >>> Hi, >>> >>> When running 10000 (more-or-less-empty-)containers on a bare-metal Power9 >>> server(160 CPUs, 2 NUMA nodes,

Re: [v8 PATCH 10/13] mm: vmscan: use per memcg nr_deferred of shrinker

2021-02-16 Thread Kirill Tkhai
Signed-off-by: Yang Shi Acked-by: Kirill Tkhai > --- > mm/vmscan.c | 78 - > 1 file changed, 66 insertions(+), 12 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index fcb399e18fc3..57cbc6bc8a49 100644 > ---

Re: [v8 PATCH 09/13] mm: vmscan: add per memcg shrinker nr_deferred

2021-02-16 Thread Kirill Tkhai
nkers would solve the > unfairness and bring > better isolation. > > When memcg is not enabled (!CONFIG_MEMCG or memcg disabled), the shrinker's > nr_deferred > would be used. And non memcg aware shrinkers use shrinker's nr_deferred all > the time. > > Signed-off-by: Yang Shi

Re: [v8 PATCH 05/13] mm: vmscan: use kvfree_rcu instead of call_rcu

2021-02-16 Thread Kirill Tkhai
On 17.02.2021 03:13, Yang Shi wrote: > Using kvfree_rcu() to free the old shrinker_maps instead of call_rcu(). > We don't have to define a dedicated callback for call_rcu() anymore. > > Signed-off-by: Yang Shi Acked-by: Kirill Tkhai > --- > mm/vmscan.c | 7 +-- >

Re: [v7 PATCH 05/12] mm: memcontrol: rename shrinker_map to shrinker_info

2021-02-11 Thread Kirill Tkhai
> make shrinker_map not only include map anymore, so rename it to >>> "memcg_shrinker_info". >>> And this should make the patch adding nr_deferred cleaner and readable and >>> make >>> review easier. Also remove the "memcg_"

Re: [v7 PATCH 09/12] mm: vmscan: use per memcg nr_deferred of shrinker

2021-02-10 Thread Kirill Tkhai
On 10.02.2021 04:52, Yang Shi wrote: > On Tue, Feb 9, 2021 at 5:27 PM Roman Gushchin wrote: >> >> On Tue, Feb 09, 2021 at 09:46:43AM -0800, Yang Shi wrote: >>> Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's >>> nr_deferred >>> will be used in the following cases: >>>

Re: [v7 PATCH 06/12] mm: vmscan: add shrinker_info_protected() helper

2021-02-10 Thread Kirill Tkhai
And the later patch > will add more dereference places. > > So extract the dereference into a helper to make the code more readable. No > functional change. > > Signed-off-by: Yang Shi Acked-by: Kirill Tkhai > --- > mm/vmscan.c | 15 ++- > 1 file changed,

Re: [PATCH] mm/list_lru.c: remove kvfree_rcu_local()

2021-02-08 Thread Kirill Tkhai
), so remove the local kvfree_rcu_local() and just > use the global one. > > Signed-off-by: Shakeel Butt Reviewed-by: Kirill Tkhai > --- > mm/list_lru.c | 12 ++-- > 1 file changed, 2 insertions(+), 10 deletions(-) > > diff --git a/mm/list_lru.c b/mm/list_l

Re: [v6 PATCH 08/11] mm: vmscan: use per memcg nr_deferred of shrinker

2021-02-05 Thread Kirill Tkhai
On 04.02.2021 20:23, Yang Shi wrote: > On Thu, Feb 4, 2021 at 12:42 AM Kirill Tkhai wrote: >> >> On 03.02.2021 20:20, Yang Shi wrote: >>> Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's >>> nr_deferred >>> will be used in the f

Re: [v6 PATCH 07/11] mm: vmscan: add per memcg shrinker nr_deferred

2021-02-05 Thread Kirill Tkhai
On 04.02.2021 20:17, Yang Shi wrote: > On Thu, Feb 4, 2021 at 12:31 AM Kirill Tkhai wrote: >> >> On 03.02.2021 20:20, Yang Shi wrote: >>> Currently the number of deferred objects are per shrinker, but some slabs, >>> for example, >>> vfs inode/den

Re: [v6 PATCH 09/11] mm: vmscan: don't need allocate shrinker->nr_deferred for memcg aware shrinkers

2021-02-05 Thread Kirill Tkhai
On 04.02.2021 20:32, Yang Shi wrote: > On Thu, Feb 4, 2021 at 2:14 AM Kirill Tkhai wrote: >> >> On 04.02.2021 12:29, Kirill Tkhai wrote: >>> On 03.02.2021 20:20, Yang Shi wrote: >>>> Now nr_deferred is available on per memcg level for memcg aware shrinker

Re: [v6 PATCH 11/11] mm: vmscan: shrink deferred objects proportional to priority

2021-02-04 Thread Kirill Tkhai
On 03.02.2021 20:20, Yang Shi wrote: > The number of deferred objects might get windup to an absurd number, and it > results in clamp of slab objects. It is undesirable for sustaining > workingset. > > So shrink deferred objects proportional to priority and cap nr_deferred to > twice > of

Re: [v6 PATCH 10/11] mm: memcontrol: reparent nr_deferred when memcg offline

2021-02-04 Thread Kirill Tkhai
On 03.02.2021 20:20, Yang Shi wrote: > Now shrinker's nr_deferred is per memcg for memcg aware shrinkers, add to > parent's > corresponding nr_deferred when memcg offline. > > Acked-by: Vlastimil Babka > Signed-off-by: Yang Shi Acked-by: Kirill Tkhai > --- > include

Re: [v6 PATCH 09/11] mm: vmscan: don't need allocate shrinker->nr_deferred for memcg aware shrinkers

2021-02-04 Thread Kirill Tkhai
On 04.02.2021 12:29, Kirill Tkhai wrote: > On 03.02.2021 20:20, Yang Shi wrote: >> Now nr_deferred is available on per memcg level for memcg aware shrinkers, >> so don't need >> allocate shrinker->nr_deferred for such shrinkers anymore. >> >> The prealloc_m

Re: [v6 PATCH 09/11] mm: vmscan: don't need allocate shrinker->nr_deferred for memcg aware shrinkers

2021-02-04 Thread Kirill Tkhai
On 03.02.2021 20:20, Yang Shi wrote: > Now nr_deferred is available on per memcg level for memcg aware shrinkers, so > don't need > allocate shrinker->nr_deferred for such shrinkers anymore. > > The prealloc_memcg_shrinker() would return -ENOSYS if !CONFIG_MEMCG or memcg > is disabled > by

Re: [v6 PATCH 08/11] mm: vmscan: use per memcg nr_deferred of shrinker

2021-02-04 Thread Kirill Tkhai
On 03.02.2021 20:20, Yang Shi wrote: > Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's > nr_deferred > will be used in the following cases: > 1. Non memcg aware shrinkers > 2. !CONFIG_MEMCG > 3. memcg is disabled by boot parameter > > Signed-off-by: Yang Shi >

Re: [v6 PATCH 07/11] mm: vmscan: add per memcg shrinker nr_deferred

2021-02-04 Thread Kirill Tkhai
On 03.02.2021 20:20, Yang Shi wrote: > Currently the number of deferred objects are per shrinker, but some slabs, > for example, > vfs inode/dentry cache are per memcg, this would result in poor isolation > among memcgs. > > The deferred objects typically are generated by __GFP_NOFS

Re: [v6 PATCH 06/11] mm: vmscan: use a new flag to indicate shrinker is registered

2021-02-04 Thread Kirill Tkhai
heir > shrinker->nr_deferred would always be NULL. This would prevent the shrinkers > from unregistering correctly. > > Remove SHRINKER_REGISTERING since we could check if shrinker is registered > successfully by the new flag. > > Signed-off-by: Yang Shi Acked-by: Kirill Tk

Re: [v6 PATCH 05/11] mm: memcontrol: rename shrinker_map to shrinker_info

2021-02-04 Thread Kirill Tkhai
rred cleaner and readable and > make > review easier. Also remove the "memcg_" prefix. > > Acked-by: Vlastimil Babka > Signed-off-by: Yang Shi Acked-by: Kirill Tkhai > --- > include/linux/memcontrol.h | 8 ++--- > mm/memcontro

Re: [v6 PATCH 04/11] mm: vmscan: remove memcg_shrinker_map_size

2021-02-04 Thread Kirill Tkhai
by > iterating the > bit map. > > Signed-off-by: Yang Shi Acked-by: Kirill Tkhai > --- > mm/vmscan.c | 18 +- > 1 file changed, 9 insertions(+), 9 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index e4ddaaaeffe2..641077b09e5d 1

Re: [v6 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-02-03 Thread Kirill Tkhai
On 03.02.2021 20:20, Yang Shi wrote: > Since memcg_shrinker_map_size just can be changed under holding shrinker_rwsem > exclusively, the read side can be protected by holding read lock, so it sounds > superfluous to have a dedicated mutex. > > Kirill Tkhai suggested use w

Re: [v6 PATCH 02/11] mm: vmscan: consolidate shrinker_maps handling code

2021-02-03 Thread Kirill Tkhai
structure. So > move the > shrinker_maps handling code into vmscan.c for tighter integration with > shrinker code, > and remove the "memcg_" prefix. There is no functional change. > > Acked-by: Vlastimil Babka > Signed-off-by: Yang Shi Acked-by: Kirill Tkh

Re: [v6 PATCH 01/11] mm: vmscan: use nid from shrink_control for tracepoint

2021-02-03 Thread Kirill Tkhai
shrink happens > on one > node but end up on the other node. It seems confusing. And the following > patch > will remove using nid directly in do_shrink_slab(), this patch also helps > cleanup > the code. > > Acked-by: Vlastimil Babka > Signed-off-by: Yang Shi Ack

Re: [v4 PATCH 04/11] mm: vmscan: remove memcg_shrinker_map_size

2021-01-26 Thread Kirill Tkhai
On 22.01.2021 02:06, Yang Shi wrote: > Both memcg_shrinker_map_size and shrinker_nr_max is maintained, but actually > the > map size can be calculated via shrinker_nr_max, so it seems unnecessary to > keep both. > Remove memcg_shrinker_map_size since shrinker_nr_max is also used by > iterating

Re: [v4 PATCH 07/11] mm: vmscan: add per memcg shrinker nr_deferred

2021-01-25 Thread Kirill Tkhai
On 22.01.2021 02:06, Yang Shi wrote: > Currently the number of deferred objects are per shrinker, but some slabs, > for example, > vfs inode/dentry cache are per memcg, this would result in poor isolation > among memcgs. > > The deferred objects typically are generated by __GFP_NOFS

Re: [v4 PATCH 08/11] mm: vmscan: use per memcg nr_deferred of shrinker

2021-01-25 Thread Kirill Tkhai
On 22.01.2021 02:06, Yang Shi wrote: > Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's > nr_deferred > will be used in the following cases: > 1. Non memcg aware shrinkers > 2. !CONFIG_MEMCG > 3. memcg is disabled by boot parameter > > Signed-off-by: Yang Shi >

Re: [PATCH] prctl: allow to setup brk for et_dyn executables

2021-01-22 Thread Kirill Tkhai
call. > > Reported-by: Keno Fischer > Signed-off-by: Cyrill Gorcunov > CC: Andrew Morton > CC: Dmitry Safonov <0x7f454...@gmail.com> > CC: Andrey Vagin > CC: Kirill Tkhai > CC: Eric W. Biederman > --- > Guys, take a look please once time permit. Hopefully I

Re: [v3 PATCH 09/11] mm: vmscan: don't need allocate shrinker->nr_deferred for memcg aware shrinkers

2021-01-11 Thread Kirill Tkhai
On 11.01.2021 21:40, Yang Shi wrote: > On Wed, Jan 6, 2021 at 3:16 AM Kirill Tkhai wrote: >> >> On 06.01.2021 01:58, Yang Shi wrote: >>> Now nr_deferred is available on per memcg level for memcg aware shrinkers, >>> so don't need >>> allocate shrink

Re: [v3 PATCH 05/11] mm: vmscan: use a new flag to indicate shrinker is registered

2021-01-11 Thread Kirill Tkhai
On 11.01.2021 21:17, Yang Shi wrote: > On Wed, Jan 6, 2021 at 2:22 AM Kirill Tkhai wrote: >> >> On 06.01.2021 01:58, Yang Shi wrote: >>> Currently registered shrinker is indicated by non-NULL >>> shrinker->nr_deferred. >>> This approach i

Re: [v3 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-01-11 Thread Kirill Tkhai
On 11.01.2021 21:57, Yang Shi wrote: > On Mon, Jan 11, 2021 at 9:34 AM Kirill Tkhai wrote: >> >> On 11.01.2021 20:08, Yang Shi wrote: >>> On Wed, Jan 6, 2021 at 1:55 AM Kirill Tkhai wrote: >>>> >>>> On 06.01.2021 01:58, Yang Shi wrote: >>>

Re: [v3 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-01-11 Thread Kirill Tkhai
On 11.01.2021 20:08, Yang Shi wrote: > On Wed, Jan 6, 2021 at 1:55 AM Kirill Tkhai wrote: >> >> On 06.01.2021 01:58, Yang Shi wrote: >>> Since memcg_shrinker_map_size just can be changd under holding >>> shrinker_rwsem >>> exclusively, the read side

Re: [v3 PATCH 06/11] mm: memcontrol: rename shrinker_map to shrinker_info

2021-01-06 Thread Kirill Tkhai
On 06.01.2021 01:58, Yang Shi wrote: > The following patch is going to add nr_deferred into shrinker_map, the change > will > make shrinker_map not only include map anymore, so rename it to a more general > name. And this should make the patch adding nr_deferred cleaner and readable > and make

Re: [v3 PATCH 10/11] mm: memcontrol: reparent nr_deferred when memcg offline

2021-01-06 Thread Kirill Tkhai
On 06.01.2021 01:58, Yang Shi wrote: > Now shrinker's nr_deferred is per memcg for memcg aware shrinkers, add to > parent's > corresponding nr_deferred when memcg offline. > > Signed-off-by: Yang Shi > --- > include/linux/memcontrol.h | 1 + > mm/memcontrol.c| 1 + > mm/vmscan.c

Re: [v3 PATCH 09/11] mm: vmscan: don't need allocate shrinker->nr_deferred for memcg aware shrinkers

2021-01-06 Thread Kirill Tkhai
On 06.01.2021 01:58, Yang Shi wrote: > Now nr_deferred is available on per memcg level for memcg aware shrinkers, so > don't need > allocate shrinker->nr_deferred for such shrinkers anymore. > > The prealloc_memcg_shrinker() would return -ENOSYS if !CONFIG_MEMCG or memcg > is disabled > by

Re: [v3 PATCH 07/11] mm: vmscan: add per memcg shrinker nr_deferred

2021-01-06 Thread Kirill Tkhai
On 06.01.2021 01:58, Yang Shi wrote: > Currently the number of deferred objects are per shrinker, but some slabs, > for example, > vfs inode/dentry cache are per memcg, this would result in poor isolation > among memcgs. > > The deferred objects typically are generated by __GFP_NOFS

Re: [v3 PATCH 05/11] mm: vmscan: use a new flag to indicate shrinker is registered

2021-01-06 Thread Kirill Tkhai
On 06.01.2021 01:58, Yang Shi wrote: > Currently registered shrinker is indicated by non-NULL shrinker->nr_deferred. > This approach is fine with nr_deferred at the shrinker level, but the > following > patches will move MEMCG_AWARE shrinkers' nr_deferred to memcg level, so their >

Re: [v3 PATCH 04/11] mm: vmscan: remove memcg_shrinker_map_size

2021-01-06 Thread Kirill Tkhai
On 06.01.2021 01:58, Yang Shi wrote: > Both memcg_shrinker_map_size and shrinker_nr_max is maintained, but actually > the > map size can be calculated via shrinker_nr_max, so it seems unnecessary to > keep both. > Remove memcg_shrinker_map_size since shrinker_nr_max is also used by > iterating

Re: [v3 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation

2021-01-06 Thread Kirill Tkhai
On 06.01.2021 01:58, Yang Shi wrote: > Since memcg_shrinker_map_size just can be changd under holding shrinker_rwsem > exclusively, the read side can be protected by holding read lock, so it sounds > superfluous to have a dedicated mutex. This should not exacerbate the > contention > to

[PATCH v2] crypto: Fix divide error in do_xor_speed()

2020-12-30 Thread Kirill Tkhai
crypto: Fix divide error in do_xor_speed() From: Kirill Tkhai Latest (but not only latest) linux-next panics with divide error on my QEMU setup. The patch at the bottom of this message fixes the problem. xor: measuring software checksum speed divide error: [#1] PREEMPT SMP KASAN PREEMPT

[PATCH] crypto: Fix divide error in do_xor_speed()

2020-12-30 Thread Kirill Tkhai
do_one_initcall+0xc1/0x1b7 ? start_kernel+0x373/0x373 ? unpoison_range+0x3a/0x60 kernel_init_freeable+0x1dd/0x238 ? rest_init+0xc6/0xc6 kernel_init+0x8/0x10a ret_from_fork+0x1f/0x30 ---[ end trace 5bd3c1d0b2da ]--- Signed-off-by: Kirill Tkhai --- crypto/xor.c |2 ++ 1 file changed, 2 insertions

Re: regression: 9a56493f6942 "uts: Use generic ns_common::count" broke makedumpfile 1.6.7

2020-12-16 Thread Kirill Tkhai
On 16.12.2020 17:49, Mike Galbraith wrote: > On Wed, 2020-12-16 at 15:31 +0100, Mike Galbraith wrote: >> On Wed, 2020-12-16 at 17:23 +0300, Kirill Tkhai wrote: >>> >>> Does this regression only cause that one error message "check_release: >>> Can't get t

Re: regression: 9a56493f6942 "uts: Use generic ns_common::count" broke makedumpfile 1.6.7

2020-12-16 Thread Kirill Tkhai
On 16.12.2020 16:32, Mike Galbraith wrote: > On Wed, 2020-12-16 at 15:35 +0300, Kirill Tkhai wrote: >> Hi, Alexander, >> >> On 16.12.2020 14:02, Mike Galbraith wrote: >>> Greetings, >>> >>> With this commit, bisected and confirmed, kdump stops work

Re: [v2 PATCH 2/9] mm: memcontrol: use shrinker_rwsem to protect shrinker_maps allocation

2020-12-16 Thread Kirill Tkhai
16.12.2020, 00:59, "Dave Chinner" : > On Tue, Dec 15, 2020 at 02:53:48PM +0100, Johannes Weiner wrote: >>  On Tue, Dec 15, 2020 at 01:09:57PM +1100, Dave Chinner wrote: >>  > On Mon, Dec 14, 2020 at 02:37:15PM -0800, Yang Shi wrote: >>  > > Since memcg_shrinker_map_size just can be changd under

Re: regression: 9a56493f6942 "uts: Use generic ns_common::count" broke makedumpfile 1.6.7

2020-12-16 Thread Kirill Tkhai
Hi, Alexander, On 16.12.2020 14:02, Mike Galbraith wrote: > Greetings, > > With this commit, bisected and confirmed, kdump stops working here, > makedumpfile saying "check_release: Can't get the kernel version". hasn't your commit 55d9e11398a4 "kdump: append uts_namespace.name offset to

Re: [v2 PATCH 3/9] mm: vmscan: guarantee shrinker_slab_memcg() sees valid shrinker_maps for online memcg

2020-12-15 Thread Kirill Tkhai
15.12.2020, 15:40, "Johannes Weiner" : > On Mon, Dec 14, 2020 at 02:37:16PM -0800, Yang Shi wrote: >>  The shrink_slab_memcg() races with mem_cgroup_css_online(). A visibility of >> CSS_ONLINE flag >>  in shrink_slab_memcg()->mem_cgroup_online() does not guarantee that we will >> see >>  

Re: [PATCH 6/9] mm: vmscan: use per memcg nr_deferred of shrinker

2020-12-10 Thread Kirill Tkhai
On 10.12.2020 18:13, Johannes Weiner wrote: > On Wed, Dec 09, 2020 at 09:32:37AM -0800, Yang Shi wrote: >> On Wed, Dec 9, 2020 at 7:42 AM Kirill Tkhai wrote: >>> >>> On 08.12.2020 20:13, Yang Shi wrote: >>>> On Thu, Dec 3, 2020 at 3:40 AM Kirill Tkhai wrote

Re: [PATCH 6/9] mm: vmscan: use per memcg nr_deferred of shrinker

2020-12-09 Thread Kirill Tkhai
On 08.12.2020 20:13, Yang Shi wrote: > On Thu, Dec 3, 2020 at 3:40 AM Kirill Tkhai wrote: >> >> On 02.12.2020 21:27, Yang Shi wrote: >>> Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's >>> nr_deferred >>> will be used in the f

Re: [PATCH 6/9] mm: vmscan: use per memcg nr_deferred of shrinker

2020-12-03 Thread Kirill Tkhai
On 02.12.2020 21:27, Yang Shi wrote: > Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's > nr_deferred > will be used in the following cases: > 1. Non memcg aware shrinkers > 2. !CONFIG_MEMCG > 3. memcg is disabled by boot parameter > > Signed-off-by: Yang Shi >

Re: [v3 PATCH] mm: list_lru: set shrinker map bit when child nr_items is not zero

2020-12-02 Thread Kirill Tkhai
in reparenting is the simplest and avoids lock > contention. > > Fixes: fae91d6d8be5 ("mm/list_lru.c: set bit in memcg shrinker bitmap on > first list_lru item appearance") > Suggested-by: Roman Gushchin > Reviewed-by: Roman Gushchin > Cc: Vladimir Davydov >

Re: [PATCH] mm: list_lru: hold nlru lock to avoid reading transient negative nr_items

2020-12-01 Thread Kirill Tkhai
On 01.12.2020 20:15, Yang Shi wrote: > On Tue, Dec 1, 2020 at 2:25 AM Kirill Tkhai wrote: >> >> On 30.11.2020 23:09, Roman Gushchin wrote: >>> On Mon, Nov 30, 2020 at 10:45:14AM -0800, Yang Shi wrote: >>>> When investigating a slab cache bloat problem, signif

Re: [PATCH] mm: list_lru: hold nlru lock to avoid reading transient negative nr_items

2020-12-01 Thread Kirill Tkhai
On 30.11.2020 23:09, Roman Gushchin wrote: > On Mon, Nov 30, 2020 at 10:45:14AM -0800, Yang Shi wrote: >> When investigating a slab cache bloat problem, significant amount of >> negative dentry cache was seen, but confusingly they neither got shrunk >> by reclaimer (the host has very tight memory)

Re: [PATCH] mm: list_lru: hold nlru lock to avoid reading transient negative nr_items

2020-12-01 Thread Kirill Tkhai
acceptable to synchronize reading nr_items to avoid seeing > intermediate negative nr_items given the simplicity and it is typically > just called by shrinkers when counting the freeable objects. > > The patch is tested with some shrinker intensive workloads, no > noticeable regres

Re: [PATCH v4 02/27] net: datagram: fix some kernel-doc markups

2020-11-16 Thread Kirill Tkhai
On 16.11.2020 13:17, Mauro Carvalho Chehab wrote: > Some identifiers have different names between their prototypes > and the kernel-doc markup. > > Signed-off-by: Mauro Carvalho Chehab Reviewed-by: Kirill Tkhai > --- > net/core/datagram.c | 2 +- > net/core/dev.c

Re: Inconsistent capability requirements for prctl_set_mm_exe_file()

2020-10-27 Thread Kirill Tkhai
On 27.10.2020 15:11, Michael Kerrisk (man-pages) wrote: > Hello Nicolas, Cyrill, and others, > > @Nicolas, your commit ebd6de6812387a changed the capability > requirements for the prctl_set_mm_exe_file() operation from > > ns_capable(CAP_SYS_ADMIN) > > to > > ns_capable(CAP_SYS_ADMIN)

Re: [PATCH 4/5] mm: Do early cow for pinned pages during fork() for ptes

2020-09-24 Thread Kirill Tkhai
On 22.09.2020 00:20, Peter Xu wrote: > This patch is greatly inspired by the discussions on the list from Linus, > Jason > Gunthorpe and others [1]. > > It allows copy_pte_range() to do early cow if the pages were pinned on the > source mm. Currently we don't have an accurate way to know

Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

2020-08-24 Thread Kirill Tkhai
On 24.08.2020 17:30, Jan Kara wrote: > On Mon 24-08-20 11:36:22, Kirill Tkhai wrote: >> On 22.08.2020 02:49, Peter Xu wrote: >>> From: Linus Torvalds >>> >>> How about we just make sure we're the only possible valid user fo the >>> page before we b

Re: [PATCH 0/4] mm: Simplfy cow handling

2020-08-24 Thread Kirill Tkhai
On 22.08.2020 02:49, Peter Xu wrote: > This is a small series that I picked up from Linus's suggestion [0] to > simplify > cow handling (and also more strict) by checking against page refcounts rather > than mapcounts. > > I'm CCing the author and reviewer of commit 52d1e606ee73 on ksm ("mm:

Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

2020-08-24 Thread Kirill Tkhai
On 22.08.2020 02:49, Peter Xu wrote: > From: Linus Torvalds > > How about we just make sure we're the only possible valid user fo the > page before we bother to reuse it? > > Simplify, simplify, simplify. > > And get rid of the nasty serialization on the page lock at the same time. > >

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-08-17 Thread Kirill Tkhai
On 14.08.2020 22:21, Andrei Vagin wrote: > On Fri, Aug 14, 2020 at 06:11:58PM +0300, Kirill Tkhai wrote: >> On 14.08.2020 04:16, Andrei Vagin wrote: >>> On Thu, Aug 13, 2020 at 11:12:45AM +0300, Kirill Tkhai wrote: >>>> On 12.08.2020 20:53, Andrei Vagin wrote: >>

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-08-14 Thread Kirill Tkhai
On 14.08.2020 04:16, Andrei Vagin wrote: > On Thu, Aug 13, 2020 at 11:12:45AM +0300, Kirill Tkhai wrote: >> On 12.08.2020 20:53, Andrei Vagin wrote: >>> On Tue, Aug 11, 2020 at 01:23:35PM +0300, Kirill Tkhai wrote: >>>> On 10.08.2020 20:34, Andrei Vagin wrote: >>

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-08-13 Thread Kirill Tkhai
On 12.08.2020 20:53, Andrei Vagin wrote: > On Tue, Aug 11, 2020 at 01:23:35PM +0300, Kirill Tkhai wrote: >> On 10.08.2020 20:34, Andrei Vagin wrote: >>> On Fri, Aug 07, 2020 at 11:47:57AM +0300, Kirill Tkhai wrote: >>>> On 06.08.2020 11:05, Andrei Vagin wrote: >>

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-08-11 Thread Kirill Tkhai
On 10.08.2020 20:34, Andrei Vagin wrote: > On Fri, Aug 07, 2020 at 11:47:57AM +0300, Kirill Tkhai wrote: >> On 06.08.2020 11:05, Andrei Vagin wrote: >>> On Mon, Aug 03, 2020 at 01:03:17PM +0300, Kirill Tkhai wrote: >>>> On 31.07.2020 01:13, Eric W. Biederman wro

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-08-07 Thread Kirill Tkhai
On 06.08.2020 11:05, Andrei Vagin wrote: > On Mon, Aug 03, 2020 at 01:03:17PM +0300, Kirill Tkhai wrote: >> On 31.07.2020 01:13, Eric W. Biederman wrote: >>> Kirill Tkhai writes: >>> >>>> On 30.07.2020 17:34, Eric W. Biederman wrote: >>>

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-08-04 Thread Kirill Tkhai
On 04.08.2020 08:43, Andrei Vagin wrote: > On Thu, Jul 30, 2020 at 06:01:20PM +0300, Kirill Tkhai wrote: >> On 30.07.2020 17:34, Eric W. Biederman wrote: >>> Kirill Tkhai writes: >>> >>>> Currently, there is no a way to list or iterate all or subset

Re: [PATCH 1/8] ns: Add common refcount into ns_common add use it as counter for net_ns

2020-08-04 Thread Kirill Tkhai
On 04.08.2020 16:52, Eric W. Biederman wrote: > Kirill Tkhai writes: > >> On 04.08.2020 15:21, Eric W. Biederman wrote: >>> Kirill Tkhai writes: >>> >>>> Currently, every type of namespaces has its own counter, >>>> which is stored in ns-spe

Re: [PATCH 1/8] ns: Add common refcount into ns_common add use it as counter for net_ns

2020-08-04 Thread Kirill Tkhai
On 04.08.2020 15:21, Eric W. Biederman wrote: > Kirill Tkhai writes: > >> Currently, every type of namespaces has its own counter, >> which is stored in ns-specific part. Say, @net has >> struct net::count, @pid has struct pid_namespace::kref, etc. >> >> This

[PATCH 8/8] time: Use generic ns_common::count

2020-08-03 Thread Kirill Tkhai
Convert time namespace to use generic counter. Signed-off-by: Kirill Tkhai Acked-by: Christian Brauner --- include/linux/time_namespace.h |9 - kernel/time/namespace.c|9 +++-- 2 files changed, 7 insertions(+), 11 deletions(-) diff --git a/include/linux

[PATCH 3/8] ipc: Use generic ns_common::count

2020-08-03 Thread Kirill Tkhai
Convert uts namespace to use generic counter. Signed-off-by: Kirill Tkhai Acked-by: Christian Brauner --- include/linux/ipc_namespace.h |3 +-- ipc/msgutil.c |2 +- ipc/namespace.c |4 ++-- 3 files changed, 4 insertions(+), 5 deletions(-) diff --git

[PATCH 0/8] namespaces: Introduce generic refcount

2020-08-03 Thread Kirill Tkhai
Every namespace type has its own counter. Some of them are of refcount_t, some of them are of kref. This patchset introduces generic ns_common::count for any type of namespaces instead of them. --- Kirill Tkhai (8): ns: Add common refcount into ns_common add use it as counter for net_ns

[PATCH 2/8] uts: Use generic ns_common::count

2020-08-03 Thread Kirill Tkhai
Convert uts namespace to use generic counter instead of kref. Signed-off-by: Kirill Tkhai Acked-by: Christian Brauner --- include/linux/utsname.h |9 - init/version.c |2 +- kernel/utsname.c|7 ++- 3 files changed, 7 insertions(+), 11 deletions(-) diff

[PATCH 4/8] pid: Use generic ns_common::count

2020-08-03 Thread Kirill Tkhai
Convert pid namespace to use generic counter. Signed-off-by: Kirill Tkhai Acked-by: Christian Brauner --- include/linux/pid_namespace.h |4 +--- kernel/pid.c |2 +- kernel/pid_namespace.c| 13 +++-- 3 files changed, 5 insertions(+), 14 deletions

[PATCH 6/8] mnt: Use generic ns_common::count

2020-08-03 Thread Kirill Tkhai
Convert mount namespace to use generic counter. Signed-off-by: Kirill Tkhai Acked-by: Christian Brauner --- fs/mount.h |3 +-- fs/namespace.c |4 ++-- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/fs/mount.h b/fs/mount.h index c3e0bb6e5782..f296862032ec 100644

[PATCH 1/8] ns: Add common refcount into ns_common add use it as counter for net_ns

2020-08-03 Thread Kirill Tkhai
-by: Kirill Tkhai Acked-by: Christian Brauner --- include/linux/ns_common.h |3 +++ include/net/net_namespace.h | 11 --- net/core/net-sysfs.c |6 +++--- net/core/net_namespace.c |6 +++--- net/ipv4/inet_timewait_sock.c |4 ++-- net/ipv4/tcp_metrics.c

[PATCH 7/8] cgroup: Use generic ns_common::count

2020-08-03 Thread Kirill Tkhai
Convert cgroup namespace to use generic counter. Signed-off-by: Kirill Tkhai Acked-by: Christian Brauner --- include/linux/cgroup.h|5 ++--- kernel/cgroup/cgroup.c|2 +- kernel/cgroup/namespace.c |2 +- 3 files changed, 4 insertions(+), 5 deletions(-) diff --git a/include

[PATCH 5/8] user: Use generic ns_common::count

2020-08-03 Thread Kirill Tkhai
Convert user namespace to use generic counter. Signed-off-by: Kirill Tkhai Acked-by: Christian Brauner --- include/linux/user_namespace.h |5 ++--- kernel/user.c |2 +- kernel/user_namespace.c|4 ++-- 3 files changed, 5 insertions(+), 6 deletions(-) diff

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-08-03 Thread Kirill Tkhai
On 31.07.2020 01:13, Eric W. Biederman wrote: > Kirill Tkhai writes: > >> On 30.07.2020 17:34, Eric W. Biederman wrote: >>> Kirill Tkhai writes: >>> >>>> Currently, there is no a way to list or iterate all or subset of namespaces >>>> in the

Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary

2020-07-30 Thread Kirill Tkhai
On 30.07.2020 17:34, Eric W. Biederman wrote: > Kirill Tkhai writes: > >> Currently, there is no a way to list or iterate all or subset of namespaces >> in the system. Some namespaces are exposed in /proc/[pid]/ns/ directories, >> but some also may be as open files

Re: [PATCH 01/23] ns: Add common refcount into ns_common add use it as counter for net_ns

2020-07-30 Thread Kirill Tkhai
On 30.07.2020 17:30, Christian Brauner wrote: > On Thu, Jul 30, 2020 at 02:59:25PM +0300, Kirill Tkhai wrote: >> Currently, every type of namespaces has its own counter, >> which is stored in ns-specific part. Say, @net has >> struct net::count, @pid has struct pid

Re: [PATCH 11/23] fs: Add /proc/namespaces/ directory

2020-07-30 Thread Kirill Tkhai
On 30.07.2020 16:26, Christian Brauner wrote: > On Thu, Jul 30, 2020 at 03:00:19PM +0300, Kirill Tkhai wrote: >> This is a new directory to show all namespaces, which can be >> accessed from this /proc tasks credentials. >> >> Every /proc is related to a pid_names

Re: [PATCH 09/23] ns: Introduce ns_idr to be able to iterate all allocated namespaces in the system

2020-07-30 Thread Kirill Tkhai
On 30.07.2020 17:15, Matthew Wilcox wrote: > On Thu, Jul 30, 2020 at 05:12:09PM +0300, Kirill Tkhai wrote: >> On 30.07.2020 16:56, Matthew Wilcox wrote: >>> On Thu, Jul 30, 2020 at 04:32:22PM +0300, Kirill Tkhai wrote: >>>> On 30.07.2020 15:23, Matthew Wilcox

Re: [PATCH 09/23] ns: Introduce ns_idr to be able to iterate all allocated namespaces in the system

2020-07-30 Thread Kirill Tkhai
On 30.07.2020 16:56, Matthew Wilcox wrote: > On Thu, Jul 30, 2020 at 04:32:22PM +0300, Kirill Tkhai wrote: >> On 30.07.2020 15:23, Matthew Wilcox wrote: >>> xa_erase_irqsave(); >> >> static inline void *xa_erase_irqsave(struct xarray *xa, unsigned long index) >

Re: [PATCH 01/23] ns: Add common refcount into ns_common add use it as counter for net_ns

2020-07-30 Thread Kirill Tkhai
On 30.07.2020 16:35, Christian Brauner wrote: > On Thu, Jul 30, 2020 at 02:59:25PM +0300, Kirill Tkhai wrote: >> Currently, every type of namespaces has its own counter, >> which is stored in ns-specific part. Say, @net has >> struct net::count, @pid has struct pid

Re: [PATCH 09/23] ns: Introduce ns_idr to be able to iterate all allocated namespaces in the system

2020-07-30 Thread Kirill Tkhai
On 30.07.2020 15:23, Matthew Wilcox wrote: > On Thu, Jul 30, 2020 at 03:00:08PM +0300, Kirill Tkhai wrote: >> This patch introduces a new IDR and functions to add/remove and iterate >> registered namespaces in the system. It will be used to list namespaces >> in /proc/n

Re: [PATCH 11/23] fs: Add /proc/namespaces/ directory

2020-07-30 Thread Kirill Tkhai
On 30.07.2020 15:18, Alexey Dobriyan wrote: > On Thu, Jul 30, 2020 at 03:00:19PM +0300, Kirill Tkhai wrote: > >> # ls /proc/namespaces/ -l >> lrwxrwxrwx 1 root root 0 Jul 29 16:50 'cgroup:[4026531835]' -> >> 'cgroup:[4026531835]' >> lrwxrwxrwx 1 root roo

[PATCH 19/23] uts: Add uts namespaces into ns_idr

2020-07-30 Thread Kirill Tkhai
Now they are exposed in /proc/namespace/ directory. Signed-off-by: Kirill Tkhai --- kernel/utsname.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/kernel/utsname.c b/kernel/utsname.c index aebf4df5f592..883855ca16cd 100644 --- a/kernel/utsname.c +++ b/kernel/utsname.c

[PATCH 23/23] time: Add time namespaces into ns_idr

2020-07-30 Thread Kirill Tkhai
Now they are exposed in the /proc/namespaces/ directory. Signed-off-by: Kirill Tkhai --- include/linux/time_namespace.h |1 + kernel/time/namespace.c| 11 ++- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/include/linux/time_namespace.h b/include/linux

[PATCH 20/23] ipc: Add ipc namespaces into ns_idr

2020-07-30 Thread Kirill Tkhai
Now they are exposed in the /proc/namespaces/ directory. Signed-off-by: Kirill Tkhai --- ipc/namespace.c | 13 - ipc/shm.c |1 + 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/ipc/namespace.c b/ipc/namespace.c index 7bd0766ddc3b..ce6f87dd6d08 100644 --- a/ipc

[PATCH 21/23] mnt: Add mount namespaces into ns_idr

2020-07-30 Thread Kirill Tkhai
Now they are exposed in the /proc/namespaces/ directory. Signed-off-by: Kirill Tkhai --- fs/mount.h |1 + fs/namespace.c | 10 +- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/mount.h b/fs/mount.h index f296862032ec..cde7f7bed8ec 100644 --- a/fs/mount.h +++ b/fs

[PATCH 22/23] cgroup: Add cgroup namespaces into ns_idr

2020-07-30 Thread Kirill Tkhai
Now they are exposed in the /proc/namespaces/ directory. Signed-off-by: Kirill Tkhai --- include/linux/cgroup.h|1 + kernel/cgroup/namespace.c | 23 +++ 2 files changed, 20 insertions(+), 4 deletions(-) diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index

[PATCH 17/23] pid: Add pid namespaces into ns_idr

2020-07-30 Thread Kirill Tkhai
Now they are exposed in the /proc/namespaces/ directory. Note that we already wait for an RCU grace period before the pid namespace's memory is freed. Signed-off-by: Kirill Tkhai --- kernel/pid_namespace.c |7 +++ 1 file changed, 7 insertions(+) diff --git a/kernel/pid_namespace.c b/kernel

[PATCH 16/23] proc_ns_operations: Add can_get method

2020-07-30 Thread Kirill Tkhai
This is a new method to prohibit access to namespaces that are in an intermediate state. Currently, it's used to prohibit a pid namespace whose child reaper is not created yet (similar to what we have in /proc/[pid]/ns/pid_for_children). Signed-off-by: Kirill Tkhai --- fs/proc/namespaces.c|5 + include/linux

[PATCH 18/23] uts: Free uts namespace one RCU grace period after final counter put

2020-07-30 Thread Kirill Tkhai
This is needed to link uts_ns into ns_idr in the next patch. Signed-off-by: Kirill Tkhai --- include/linux/utsname.h |1 + kernel/utsname.c| 10 +- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/include/linux/utsname.h b/include/linux/utsname.h index

[PATCH 15/23] pid: Extract child_reaper check from pidns_for_children_get()

2020-07-30 Thread Kirill Tkhai
This check is for prohibiting access to /proc/[pid]/ns/pid_for_children before the first task of the pid namespace is created. /proc/namespaces/ code will use this check too, so we move it into a separate function. Signed-off-by: Kirill Tkhai --- kernel/pid_namespace.c | 25

[PATCH 12/23] user: Free user_ns one RCU grace period after final counter put

2020-07-30 Thread Kirill Tkhai
This is needed to link user_ns into ns_idr in the next patch. Signed-off-by: Kirill Tkhai --- include/linux/user_namespace.h |5 - kernel/user_namespace.c|9 - 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/include/linux/user_namespace.h b/include/linux

[PATCH 13/23] user: Add user namespaces into ns_idr

2020-07-30 Thread Kirill Tkhai
Now they are exposed in the /proc/namespaces/ directory. Signed-off-by: Kirill Tkhai --- kernel/user_namespace.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index 367a942bb484..bbfd7f0f9e7c 100644 --- a/kernel

[PATCH 14/23] net: Add net namespaces into ns_idr

2020-07-30 Thread Kirill Tkhai
Now they are exposed in the /proc/namespaces/ directory. We already wait for an RCU grace period in cleanup_net() before pernet_operations exit, so ns_idr_unregister() works as expected. Signed-off-by: Kirill Tkhai --- net/core/net_namespace.c | 12 +++- 1 file changed, 11 insertions(+), 1

[PATCH 08/23] time: Use generic ns_common::count

2020-07-30 Thread Kirill Tkhai
Convert the time namespace to use the generic counter. Signed-off-by: Kirill Tkhai --- include/linux/time_namespace.h |9 - kernel/time/namespace.c|9 +++-- 2 files changed, 7 insertions(+), 11 deletions(-) diff --git a/include/linux/time_namespace.h b/include/linux

[PATCH 10/23] fs: Rename fs/proc/namespaces.c into fs/proc/task_namespaces.c

2020-07-30 Thread Kirill Tkhai
This file is about task namespaces, so we rename it. No functional changes. Signed-off-by: Kirill Tkhai --- fs/proc/Makefile |2 fs/proc/internal.h|2 fs/proc/namespaces.c | 183 - fs/proc/task_namespaces.c | 183

[PATCH 11/23] fs: Add /proc/namespaces/ directory

2020-07-30 Thread Kirill Tkhai
Jul 29 16:50 'user:[4026531837]' -> 'user:[4026531837]' lrwxrwxrwx 1 root root 0 Jul 29 16:50 'uts:[4026531838]' -> 'uts:[4026531838]' Every namespace may be opened like an ordinary file in /proc/[pid]/ns. Signed-off-by: Kirill Tkhai --- fs/nsfs.c |2 fs/proc/Makefile|
