On 3/27/18 3:32 AM, Cyrill Gorcunov wrote:
On Mon, Mar 26, 2018 at 05:59:49PM -0400, Yang Shi wrote:
Say we have two syscalls running prctl_set_mm_map in parallel, and imagine
one has @start_brk = 20, @brk = 10 and the second caller has @start_brk = 30
and @brk = 20. Since now the call is gu
Hi folks,
I did a quick test with mlock2 + the VM_LOCKONFAULT flag. The test just does
a 1MB anonymous map and a 1MB file map with VM_LOCKONFAULT, respectively.
Then it tries to access one page of each mapping.
From /proc/meminfo, I can see 1 page marked mlocked from the anonymous
mapping. But, the
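For reference, a minimal sketch of such a test, assuming a glibc that exposes the
mlock2() wrapper and MLOCK_ONFAULT (otherwise syscall(__NR_mlock2, ...) would be
needed); the file name and its pre-sizing are placeholders, and error handling is
mostly omitted:

#define _GNU_SOURCE
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	size_t len = 1UL << 20;			/* 1MB */
	int fd = open("testfile", O_RDWR);	/* assumed pre-sized to 1MB */

	char *anon = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *file = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_SHARED, fd, 0);

	/* Lock on fault: pages get mlocked only when they are touched. */
	if (mlock2(anon, len, MLOCK_ONFAULT) || mlock2(file, len, MLOCK_ONFAULT))
		perror("mlock2");

	anon[0] = 1;			/* fault in one anonymous page */
	volatile char c = file[0];	/* fault in one file-backed page */
	(void)c;

	pause();	/* keep mappings alive; inspect Mlocked in /proc/meminfo */
	return 0;
}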
SIZE.
Signed-off-by: Yang Shi
Cc: "Kirill A. Shutemov"
Cc: Hugh Dickins
Cc: Michal Hocko
Cc: Alexander Viro
Suggested-by: Christoph Hellwig
---
v4 --> v5:
* Adopted suggestion from Kirill to use IS_ENABLED and check 'force' and
'deny'. Extracted the condition
On 4/11/19 9:06 AM, Dave Hansen wrote:
On 4/10/19 8:56 PM, Yang Shi wrote:
When demoting to a PMEM node, the target node may be under memory pressure,
and that pressure may cause migrate_pages() to fail.
If the failure is caused by memory pressure (i.e. returning -ENOMEM),
tag the node with
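A minimal sketch of that tagging, using the PGDAT_CONTENDED flag this series adds
(the demotion call site, err and target_nid are assumed):

	/* Demotion hit memory pressure on the target node: remember it so
	 * subsequent reclaim can skip demoting there for a while. */
	if (err == -ENOMEM)
		set_bit(PGDAT_CONTENDED, &NODE_DATA(target_nid)->flags);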
On 4/11/19 7:31 AM, Dave Hansen wrote:
On 4/10/19 8:56 PM, Yang Shi wrote:
include/linux/gfp.h| 12
include/linux/migrate.h| 1 +
include/trace/events/migrate.h | 3 +-
mm/debug.c | 1 +
mm/internal.h | 13 +
mm
On 4/15/19 3:13 PM, Dave Hansen wrote:
On 4/15/19 3:06 PM, Yang Shi wrote:
This seems like an actively bad idea to me.
Why do we need an *active* note to say the node is contended? Why isn't
just getting a failure back from migrate_pages() enough? Have you
observed this in practice?
On 4/15/19 3:14 PM, Dave Hansen wrote:
On 4/15/19 3:10 PM, Yang Shi wrote:
Also, I don't see anything in the code tying this to strictly demote
from DRAM to PMEM. Is that the end effect, or is it really implemented
that way and I missed it?
No, it is not restricted to PMEM. It just tries to d
On 4/12/19 1:47 AM, Michal Hocko wrote:
On Thu 11-04-19 11:56:50, Yang Shi wrote:
[...]
Design
==
Basically, the approach aims to spread data from DRAM (closest to the local
CPU) down further to PMEM and disk (typically assuming the lower tier storage
is slower, larger and cheaper than the
On 4/16/19 12:47 AM, Michal Hocko wrote:
On Mon 15-04-19 17:09:07, Yang Shi wrote:
On 4/12/19 1:47 AM, Michal Hocko wrote:
On Thu 11-04-19 11:56:50, Yang Shi wrote:
[...]
Design
==
Basically, the approach aims to spread data from DRAM (closest to the local
CPU) down further to PMEM
On 4/16/19 2:22 PM, Dave Hansen wrote:
On 4/16/19 12:19 PM, Yang Shi wrote:
would we prefer to try all the nodes in the fallback order to find the
first less contended one (i.e. DRAM0 -> PMEM0 -> DRAM1 -> PMEM1 -> Swap)?
Once a page went to DRAM1, how would we tell that it o
On 4/16/19 4:04 PM, Dave Hansen wrote:
On 4/16/19 2:59 PM, Yang Shi wrote:
On 4/16/19 2:22 PM, Dave Hansen wrote:
Keith Busch had a set of patches to let you specify the demotion order
via sysfs for fun. The rules we came up with were:
1. Pages keep no history of where they have been
2
Why cannot we start simple and build from there? In other words I
do not
think we really need anything like N_CPU_MEM at all.
In this patchset N_CPU_MEM is used to tell us what nodes are cpuless
nodes.
They would be the preferred demotion target. Of course, we could
rely on
firmware to just
On 4/17/19 9:39 AM, Michal Hocko wrote:
On Wed 17-04-19 09:37:39, Keith Busch wrote:
On Wed, Apr 17, 2019 at 05:39:23PM +0200, Michal Hocko wrote:
On Wed 17-04-19 09:23:46, Keith Busch wrote:
On Wed, Apr 17, 2019 at 11:23:18AM +0200, Michal Hocko wrote:
On Tue 16-04-19 14:22:33, Dave Hanse
I would also not touch the numa balancing logic at this stage and
rather
see how the current implementation behaves.
I agree we would prefer to start from something simpler and see how it
works.
The "twice access" optimization aims to reduce the PMEM bandwidth burden
since the bandwidth
Hi folks,
I noticed that there might be new THP allocation in the NUMA fault migration
path (migrate_misplaced_transhuge_page()) even when THP is disabled (set
to "never"). When THP is set to "never", there should not be any new THP
allocation, but the migration path is kind of special. So I'm no
On 4/17/19 11:32 PM, Michal Hocko wrote:
On Wed 17-04-19 21:15:41, Yang Shi wrote:
Hi folks,
I noticed that there might be new THP allocation in NUMA fault migration
path (migrate_misplaced_transhuge_page()) even when THP is disabled (set to
"never"). When THP is set to &quo
On 4/17/19 10:51 AM, Michal Hocko wrote:
On Wed 17-04-19 10:26:05, Yang Shi wrote:
On 4/17/19 9:39 AM, Michal Hocko wrote:
On Wed 17-04-19 09:37:39, Keith Busch wrote:
On Wed, Apr 17, 2019 at 05:39:23PM +0200, Michal Hocko wrote:
On Wed 17-04-19 09:23:46, Keith Busch wrote:
On Wed, Apr
re.
While reading the code, I found that this new spinlock was not used in
get_cmdline() to protect access to these fields.
Fix this even though no issue has been reported for it yet.
Fixes: 88aa7cc688d4 ("mm: introduce arg_lock to protect
arg_start|end and env_start|end in mm_struct")
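A minimal sketch of the fix described above (not the exact patch): snapshot the
fields under mm->arg_lock in get_cmdline() so a concurrent prctl_set_mm() update
cannot be observed halfway through:

	unsigned long arg_start, arg_end, env_start, env_end;

	spin_lock(&mm->arg_lock);
	arg_start = mm->arg_start;
	arg_end   = mm->arg_end;
	env_start = mm->env_start;
	env_end   = mm->env_end;
	spin_unlock(&mm->arg_lock);

	/* ... the rest of get_cmdline() then works on the snapshot ... */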
On 4/23/19 11:34 AM, Yang Shi wrote:
On 4/23/19 10:52 AM, Michal Hocko wrote:
On Wed 24-04-19 00:43:01, Yang Shi wrote:
The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
vma") introduced the THPeligible bit for processes' smaps. But, when checking
the
The commit 6b4c9f446981 ("filemap: drop the mmap_sem for all blocking
operations") changed when mmap_sem is dropped during filemap page fault
and when returning VM_FAULT_RETRY.
Correct the comment to reflect the change.
Cc: Josef Bacik
Signed-off-by: Yang Shi
---
mm/filemap.c | 6
On 1/24/19 12:43 AM, Michal Hocko wrote:
On Wed 23-01-19 12:24:38, Yang Shi wrote:
On 1/23/19 1:59 AM, Michal Hocko wrote:
On Wed 23-01-19 04:09:42, Yang Shi wrote:
In the current implementation, both kswapd and direct reclaim have to iterate
all mem cgroups. It is not a problem before
On 3/26/19 6:58 AM, Michal Hocko wrote:
On Sat 23-03-19 12:44:25, Yang Shi wrote:
With Dave Hansen's patches merged into Linus's tree
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c221c0b0308fd01d9fb33a16f64d2fd95f8830a4
PMEM could be hot plugg
On 3/26/19 11:37 AM, Michal Hocko wrote:
On Tue 26-03-19 11:33:17, Yang Shi wrote:
On 3/26/19 6:58 AM, Michal Hocko wrote:
On Sat 23-03-19 12:44:25, Yang Shi wrote:
With Dave Hansen's patches merged into Linus's tree
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds
On 3/26/19 5:35 PM, Keith Busch wrote:
On Mon, Mar 25, 2019 at 12:49:21PM -0700, Yang Shi wrote:
On 3/24/19 3:20 PM, Keith Busch wrote:
How do these pages eventually get to swap when migration fails? Looks
like that's skipped.
Yes, they will be just put back to LRU. Actually, I
On 3/27/19 10:34 AM, Dan Williams wrote:
On Wed, Mar 27, 2019 at 2:01 AM Michal Hocko wrote:
On Tue 26-03-19 19:58:56, Yang Shi wrote:
On 3/26/19 11:37 AM, Michal Hocko wrote:
On Tue 26-03-19 11:33:17, Yang Shi wrote:
On 3/26/19 6:58 AM, Michal Hocko wrote:
On Sat 23-03-19 12:44:25
On 3/27/19 1:09 PM, Michal Hocko wrote:
On Wed 27-03-19 11:59:28, Yang Shi wrote:
On 3/27/19 10:34 AM, Dan Williams wrote:
On Wed, Mar 27, 2019 at 2:01 AM Michal Hocko wrote:
On Tue 26-03-19 19:58:56, Yang Shi wrote:
[...]
It is still NUMA, users still can see all the NUMA nodes.
No
accurately.
Signed-off-by: Yang Shi
---
mm/huge_memory.c | 11 ++
mm/internal.h| 80 ++
mm/memory.c | 21 ++
mm/vmscan.c | 116 ---
4 files changed, 146 insertions(+), 82 deletions
pages are reclaimed (demoted) since page
reclaim behavior depends on this. Add a *nr_succeeded parameter to make
migrate_pages() return how many pages are demoted successfully in all
cases.
Signed-off-by: Yang Shi
---
include/linux/migrate.h | 5 +++--
mm/compaction.c | 3 ++-
mm/gup.c
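A sketch of the interface change; the prototype is assumed from the changelog,
and the demotion caller names below are illustrative:

	int migrate_pages(struct list_head *from, new_page_t get_new_page,
			  free_page_t put_new_page, unsigned long private,
			  enum migrate_mode mode, int reason,
			  unsigned int *nr_succeeded);

	/* e.g. a demotion caller can account partial success even on error: */
	unsigned int nr_succeeded = 0;
	int err = migrate_pages(&demote_pages, alloc_demote_page, NULL,
				(unsigned long)pgdat, MIGRATE_ASYNC, MR_DEMOTE,
				&nr_succeeded);
	/* nr_succeeded is valid regardless of err */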
We need to find the closest cpuless node to demote DRAM pages to. Add a
"cpuless" parameter to find_next_best_node() to skip DRAM nodes on
demand.
Signed-off-by: Yang Shi
---
mm/internal.h | 11 +++
mm/page_alloc.c | 14 ++
2 files changed, 21 insertions(+), 4 deletions(-)
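A rough sketch of the idea (the actual patch threads a "cpuless" flag through
find_next_best_node(); the helper below is illustrative): pick the nearest node
that has memory but no CPUs as the demotion target.

static int closest_cpuless_node(int node)
{
	int n, best = NUMA_NO_NODE;
	int best_dist = INT_MAX;

	for_each_node_state(n, N_MEMORY) {
		/* Skip self and DRAM nodes that have CPUs. */
		if (n == node || node_state(n, N_CPU))
			continue;
		if (node_distance(node, n) < best_dist) {
			best_dist = node_distance(node, n);
			best = n;
		}
	}
	return best;
}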
N_CPU_MEMORY node states. The nodes with both CPUs
and memory are called "primary" nodes. /sys/devices/system/node/primary
would show the current online "primary" nodes.
Signed-off-by: Yang Shi
---
drivers/base/node.c | 2 ++
include/linux/nodemask.h | 3 ++-
mm/memory
Account the number of demoted pages into reclaim_state->nr_demoted.
Add pgdemote_kswapd and pgdemote_direct VM counters shown in
/proc/vmstat.
Signed-off-by: Yang Shi
---
include/linux/vm_event_item.h | 2 ++
include/linux/vmstat.h| 1 +
mm/internal.h | 1 +
And, define a new migration reason for demotion, called MR_DEMOTE.
Demote pages via async migration to avoid blocking.
Signed-off-by: Yang Shi
---
include/linux/gfp.h| 12
include/linux/migrate.h| 1 +
include/trace/events/migrate.h | 3 +-
mm/debug.c
Add counter for page promotion for NUMA balancing.
Signed-off-by: Yang Shi
---
include/linux/vm_event_item.h | 1 +
mm/huge_memory.c | 4
mm/memory.c | 4
mm/vmstat.c | 1 +
4 files changed, 10 insertions(+)
diff --git a/include/linux
.
Check whether the target node is PGDAT_CONTENDED or not; if it is, just skip
demotion.
Signed-off-by: Yang Shi
---
include/linux/mmzone.h | 3 +++
mm/vmscan.c| 28
2 files changed, 31 insertions(+)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
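A minimal sketch of the check (PGDAT_CONTENDED is the flag introduced by this
series; the surrounding demotion path is assumed):

	/* Bail out of demotion if the target node is tagged as contended. */
	static bool demote_target_contended(int target_nid)
	{
		pg_data_t *pgdat = NODE_DATA(target_nid);

		return test_bit(PGDAT_CONTENDED, &pgdat->flags);
	}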
Memcg reclaim happens when the limit is breached, but demotion just
migrates pages to another node instead of reclaiming them. This sounds
pointless for memcg reclaim since the usage is not reduced at all.
Signed-off-by: Yang Shi
---
mm/vmscan.c | 38
ace kernel pages (i.e. page table, slabs, etc) on DRAM only.
[1]: https://lore.kernel.org/linux-mm/20181226131446.330864...@intel.com/
[2]:
https://lore.kernel.org/linux-mm/20190321200157.29678-1-keith.bu...@intel.com/
[3]:
https://lore.kernel.org/linux-mm/20190404071312.gd12...@dhcp22.suse.cz
On Thu, Feb 11, 2021 at 8:47 AM Kirill Tkhai wrote:
>
> On 10.02.2021 02:33, Yang Shi wrote:
> > On Tue, Feb 9, 2021 at 12:50 PM Roman Gushchin wrote:
> >>
> >> On Tue, Feb 09, 2021 at 09:46:39AM -0800, Yang Shi wrote:
> >>> The following patch is going
On Thu, Feb 11, 2021 at 5:10 AM Vlastimil Babka wrote:
>
> On 2/9/21 6:46 PM, Yang Shi wrote:
> > The number of deferred objects might get wound up to an absurd number, and it
> > results in clamping of slab objects. It is undesirable for sustaining the
> > workingset.
> >
On Thu, Feb 11, 2021 at 10:52 AM Vlastimil Babka wrote:
>
> On 2/11/21 6:29 PM, Yang Shi wrote:
> > On Thu, Feb 11, 2021 at 5:10 AM Vlastimil Babka wrote:
> >> > trace_mm_shrink_slab_start(shrinker, shrinkctl, nr,
> >> >
On Tue, Feb 9, 2021 at 12:33 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 09:46:37AM -0800, Yang Shi wrote:
> > Since memcg_shrinker_map_size can only be changed while holding
> > shrinker_rwsem
> > exclusively, the read side can be protected by holding read
On Tue, Feb 9, 2021 at 12:43 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 09:46:38AM -0800, Yang Shi wrote:
> > Both memcg_shrinker_map_size and shrinker_nr_max are maintained, but
> > actually the
> > map size can be calculated via shrinker_nr_max, so it seems
On Tue, Feb 9, 2021 at 12:50 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 09:46:39AM -0800, Yang Shi wrote:
> > The following patch is going to add nr_deferred into shrinker_map, the
> > change will
> > make shrinker_map not only include map
On Tue, Feb 9, 2021 at 4:22 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 09:46:40AM -0800, Yang Shi wrote:
> > The shrinker_info is dereferenced in a couple of places via
> > rcu_dereference_protected
> > with different calling conventions, for example, u
On Tue, Feb 9, 2021 at 4:39 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 09:46:41AM -0800, Yang Shi wrote:
> > Currently registered shrinker is indicated by non-NULL
> > shrinker->nr_deferred.
> > This approach is fine with nr_deferred at the shrinker le
On Tue, Feb 9, 2021 at 5:10 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 09:46:42AM -0800, Yang Shi wrote:
> > Currently the number of deferred objects is per shrinker, but some slabs,
> > for example,
> > vfs inode/dentry cache are per memcg, this would
On Tue, Feb 9, 2021 at 5:27 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 09:46:43AM -0800, Yang Shi wrote:
> > Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's
> > nr_deferred
> > will be used in the following cases:
> >
On Tue, Feb 9, 2021 at 5:34 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 05:12:51PM -0800, Yang Shi wrote:
> > On Tue, Feb 9, 2021 at 4:39 PM Roman Gushchin wrote:
> > >
> > > On Tue, Feb 09, 2021 at 09:46:41AM -0800, Yang Shi wrote:
> > > > Cur
On Tue, Feb 9, 2021 at 5:40 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 05:25:16PM -0800, Yang Shi wrote:
> > On Tue, Feb 9, 2021 at 5:10 PM Roman Gushchin wrote:
> > >
> > > On Tue, Feb 09, 2021 at 09:46:42AM -0800, Yang Shi wrote:
> > > > Cu
On Wed, Feb 10, 2021 at 6:37 AM Kirill Tkhai wrote:
>
> On 10.02.2021 04:52, Yang Shi wrote:
> > On Tue, Feb 9, 2021 at 5:27 PM Roman Gushchin wrote:
> >>
> >> On Tue, Feb 09, 2021 at 09:46:43AM -0800, Yang Shi wrote:
> >>> Use per memcg's nr_defe
On Tue, Feb 9, 2021 at 11:14 AM Shakeel Butt wrote:
>
> On Tue, Feb 9, 2021 at 9:47 AM Yang Shi wrote:
> >
> > The tracepoint's nid should show what node the shrink happens on, the start
> > tracepoint
> > uses nid from shrinkctl, but the nid might be set to
On Tue, Feb 9, 2021 at 4:39 PM Roman Gushchin wrote:
>
> On Tue, Feb 09, 2021 at 09:46:41AM -0800, Yang Shi wrote:
> > Currently registered shrinker is indicated by non-NULL
> > shrinker->nr_deferred.
> > This approach is fine with nr_deferred at the shrinker le
On Mon, Feb 1, 2021 at 11:13 AM Dave Hansen wrote:
>
> On 1/29/21 12:46 PM, Yang Shi wrote:
> ...
> >> int next_demotion_node(int node)
> >> {
> >> - return node_demotion[node];
> >> + /*
> >> +* node_demotion[] is update
On Mon, Jan 25, 2021 at 4:41 PM Dave Hansen wrote:
>
>
> From: Dave Hansen
>
> This is mostly derived from a patch from Yang Shi:
>
>
> https://lore.kernel.org/linux-mm/1560468577-101178-10-git-send-email-yang@linux.alibaba.com/
>
> Add code to the
ibility of future reclaim.
>
> #Signed-off-by: Keith Busch
> Cc: Keith Busch
> [vishal: fixup the migration->demotion rename]
> Signed-off-by: Vishal Verma
> Signed-off-by: Dave Hansen
> Cc: Yang Shi
> Cc: David Rientjes
> Cc: Huang Ying
> Cc: Dan Williams
On Tue, Feb 2, 2021 at 1:35 PM Dave Hansen wrote:
>
> On 2/2/21 10:56 AM, Yang Shi wrote:
> >>
> >> /* If we have no swap space, do not bother scanning anon pages. */
> >> - if (!sc->may_swap || mem_cgroup_get_nr_swap_pages(memcg) <
On Tue, Feb 2, 2021 at 3:55 AM Oscar Salvador wrote:
>
> On Mon, Jan 25, 2021 at 04:34:27PM -0800, Dave Hansen wrote:
> >
> > From: Dave Hansen
> >
> > This is mostly derived from a patch from Yang Shi:
> >
> >
> > https://lore.kernel.org/l
ytes. 10K memcgs would need ~3.2MB memory. It seems fine.
We have been running the patched kernel on some hosts of our fleet (test and
production) for months, and it works very well. The monitoring data shows the
working set is sustained as expected.
Yang Shi (11):
mm: vmscan: use nid from shrink
. It seems confusing. And the following patch
will remove using nid directly in do_shrink_slab(); this patch also helps
clean up the code.
Acked-by: Vlastimil Babka
Signed-off-by: Yang Shi
---
mm/vmscan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vms
he "memcg_" prefix.
Acked-by: Vlastimil Babka
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 8 ++---
mm/memcontrol.c| 6 ++--
mm/vmscan.c| 62 +++---
3 files changed, 38 insertions(+), 38 deletions(-)
diff -
Both memcg_shrinker_map_size and shrinker_nr_max are maintained, but actually the
map size can be calculated via shrinker_nr_max, so it seems unnecessary to keep
both.
Remove memcg_shrinker_map_size since shrinker_nr_max is also used for iterating
the bit map.
Signed-off-by: Yang Shi
---
mm
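A sketch of the replacement calculation (helper name illustrative): the map size
in bytes is derived from shrinker_nr_max on demand instead of being cached:

	static inline int shrinker_map_size(int nr_items)
	{
		return DIV_ROUND_UP(nr_items, BITS_PER_LONG) * sizeof(unsigned long);
	}

	/* callers use shrinker_map_size(shrinker_nr_max) where
	   memcg_shrinker_map_size was read before */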
This would prevent the shrinkers
from unregistering correctly.
Remove SHRINKER_REGISTERING since we can check whether a shrinker was registered
successfully via the new flag.
Signed-off-by: Yang Shi
---
include/linux/shrinker.h | 7 ---
mm/vmscan.c | 31 +--
2 fi
can.c for tighter integration with shrinker
code,
and remove the "memcg_" prefix. There is no functional change.
Acked-by: Vlastimil Babka
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 11 ++--
mm/huge_memory.c | 4 +-
mm/list_lru.c | 6 +-
larity.
And a test with a heavy paging workload didn't show that the write lock makes things worse.
Acked-by: Vlastimil Babka
Signed-off-by: Yang Shi
---
mm/vmscan.c | 16 ++--
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 96b08c79f18d..e
rinker's SHRINKER_MEMCG_AWARE flag would be
cleared.
This makes the implementation of this patch simpler.
Acked-by: Vlastimil Babka
Signed-off-by: Yang Shi
---
mm/vmscan.c | 31 ---
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index
Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's
nr_deferred
will be used in the following cases:
1. Non memcg aware shrinkers
2. !CONFIG_MEMCG
3. memcg is disabled by boot parameter
Signed-off-by: Yang Shi
---
mm/vms
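A sketch of how the counter could be selected for the cases above; helper and
field names loosely follow the series and are assumptions:

	static long xchg_nr_deferred(struct shrinker *shrinker,
				     struct shrink_control *sc)
	{
		int nid = sc->nid;

		if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
			nid = 0;

		/* Per-memcg nr_deferred only for memcg aware shrinkers with memcg on. */
		if (sc->memcg && (shrinker->flags & SHRINKER_MEMCG_AWARE))
			return xchg_nr_deferred_memcg(nid, shrinker, sc->memcg);

		/* Cases 1-3 above fall back to the shrinker-wide counter. */
		return atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
	}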
's patch:
https://lore.kernel.org/linux-xfs/20191031234618.15403-13-da...@fromorbit.com/
Tested with a kernel build and a vfs-metadata-heavy workload in our production
environment; no regression has been spotted so far.
Signed-off-by: Yang Shi
---
mm/vmscan.c | 40 +-
ed all
the time.
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 7 +++---
mm/vmscan.c| 45 --
2 files changed, 33 insertions(+), 19 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 4c92538
Now the shrinker's nr_deferred is per memcg for memcg aware shrinkers, so add it
to the parent's
corresponding nr_deferred when the memcg goes offline.
Acked-by: Vlastimil Babka
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 1 +
mm/memcontrol.c| 1 +
mm/vmscan.c
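A sketch of the reparenting, loosely following the series (shrinker_info_protected()
is the dereference helper introduced elsewhere in the series; other names are
assumptions):

	void reparent_shrinker_deferred(struct mem_cgroup *memcg)
	{
		int i, nid;
		struct shrinker_info *child_info, *parent_info;
		struct mem_cgroup *parent = parent_mem_cgroup(memcg);

		if (!parent)
			parent = root_mem_cgroup;

		/* Prevent shrinker_nr_max from changing underneath us. */
		down_read(&shrinker_rwsem);
		for_each_node(nid) {
			child_info = shrinker_info_protected(memcg, nid);
			parent_info = shrinker_info_protected(parent, nid);
			for (i = 0; i < shrinker_nr_max; i++)
				atomic_long_add(atomic_long_read(&child_info->nr_deferred[i]),
						&parent_info->nr_deferred[i]);
		}
		up_read(&shrinker_rwsem);
	}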
On Tue, Feb 2, 2021 at 4:43 PM Dave Hansen wrote:
>
> On 2/2/21 9:46 AM, Yang Shi wrote:
> > On Mon, Feb 1, 2021 at 11:13 AM Dave Hansen wrote:
> >> On 1/29/21 12:46 PM, Yang Shi wrote:
> >> ...
> >>>> int next_demotion_node(int node)
>
On Thu, Jan 28, 2021 at 8:10 AM Vlastimil Babka wrote:
>
> On 1/28/21 12:33 AM, Yang Shi wrote:
> > The shrinker map management is not purely memcg specific, it is at the
> > intersection
> > between memory cgroup and shrinkers. It's allocation and assignment of
On Thu, Jan 28, 2021 at 8:53 AM Vlastimil Babka wrote:
>
> On 1/28/21 12:33 AM, Yang Shi wrote:
> > Both memcg_shrinker_map_size and shrinker_nr_max are maintained, but
> > actually the
> > map size can be calculated via shrinker_nr_max, so it seems unnecessary to
On Thu, Jan 28, 2021 at 9:38 AM Vlastimil Babka wrote:
>
> On 1/28/21 12:33 AM, Yang Shi wrote:
> > The following patch is going to add nr_deferred into shrinker_map, the
> > change will
> > make shrinker_map not only include map anymore, so rename it to a more
> &g
On Thu, Jan 28, 2021 at 9:56 AM Vlastimil Babka wrote:
>
> On 1/28/21 12:33 AM, Yang Shi wrote:
> > Currently registered shrinker is indicated by non-NULL
> > shrinker->nr_deferred.
> > This approach is fine with nr_deferred at the shrinker level, but the
> >
On Fri, Jan 29, 2021 at 3:22 AM Vlastimil Babka wrote:
>
> On 1/28/21 10:22 PM, Yang Shi wrote:
> >> > @@ -266,12 +265,13 @@ int alloc_shrinker_maps(struct mem_cgroup *memcg)
> >> > static int expand_shrinker_maps(int new_id)
> >> > {
> >> >
On Fri, Jan 29, 2021 at 6:34 AM Kirill Tkhai wrote:
>
> On 28.01.2021 02:33, Yang Shi wrote:
> > The shrinker map management is not purely memcg specific, it is at the
> > intersection
> > between memory cgroup and shrinkers. It's allocation and assignment of a
On Fri, Jan 29, 2021 at 6:59 AM Kirill Tkhai wrote:
>
> On 29.01.2021 17:55, Kirill Tkhai wrote:
> > On 28.01.2021 02:33, Yang Shi wrote:
> >> Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's
> >> nr_deferred
> >> will be u
On Fri, Jan 29, 2021 at 5:00 AM Vlastimil Babka wrote:
>
> On 1/28/21 12:33 AM, Yang Shi wrote:
> > Currently the number of deferred objects is per shrinker, but some slabs,
> > for example,
> > vfs inode/dentry cache are per memcg, this would result in poor iso
On Fri, Jan 29, 2021 at 7:13 AM Vlastimil Babka wrote:
>
> On 1/28/21 12:33 AM, Yang Shi wrote:
> > Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's
> > nr_deferred
> > will be used in the following cases:
> > 1. Non memcg
On Fri, Jan 29, 2021 at 7:40 AM Vlastimil Babka wrote:
>
> On 1/28/21 12:33 AM, Yang Shi wrote:
> > Now nr_deferred is available on per memcg level for memcg aware shrinkers,
> > so don't need
> > allocate shrinker->nr_deferred for such shrinkers anymore.
>
On Fri, Jan 29, 2021 at 7:52 AM Vlastimil Babka wrote:
>
> On 1/28/21 12:33 AM, Yang Shi wrote:
> > Now shrinker's nr_deferred is per memcg for memcg aware shrinkers, add to
> > parent's
> > corresponding nr_deferred when memcg offline.
> >
> > Si
On Fri, Jan 29, 2021 at 9:20 AM Yang Shi wrote:
>
> On Fri, Jan 29, 2021 at 5:00 AM Vlastimil Babka wrote:
> >
> > On 1/28/21 12:33 AM, Yang Shi wrote:
> > > Currently the number of deferred objects are per shrinker, but some
> > > slabs, for example,
>
hat node_demotion[]
> locking has no chance of becoming a bottleneck on large systems
> with lots of CPUs in direct reclaim.
>
> This code is unused for now. It will be called later in the
> series.
>
> Signed-off-by: Dave Hansen
> Cc: Yang Shi
> Cc: David Rientjes
>
> This recalculation is far from optimal, most glaringly that it does
> not even attempt to figure out if nodes are actually coming or going.
> But, given the expected paucity of hotplug events, this should be
> fine.
>
> Signed-off-by: Dave Hansen
> Cc: Yang Shi
> Cc:
On Mon, Jan 25, 2021 at 4:41 PM Dave Hansen wrote:
>
>
> From: Yang Shi
>
> The migrate_pages() returns the number of pages that were not migrated,
> or an error code. When returning an error code, there is no way to know
> how many pages were migrated or not migrated.
On Mon, Feb 1, 2021 at 7:17 AM Vlastimil Babka wrote:
>
> On 1/29/21 7:04 PM, Yang Shi wrote:
>
> >> > > @@ -209,9 +214,15 @@ static int expand_one_shrinker_info(struct
> >> > > mem_cgroup *memcg,
> >> > > i
On Thu, Feb 4, 2021 at 12:31 AM Kirill Tkhai wrote:
>
> On 03.02.2021 20:20, Yang Shi wrote:
> > Currently the number of deferred objects is per shrinker, but some slabs,
> > for example,
> > vfs inode/dentry cache are per memcg, this would result in poor iso
On Thu, Feb 4, 2021 at 2:23 AM Kirill Tkhai wrote:
>
> On 03.02.2021 20:20, Yang Shi wrote:
> > The number of deferred objects might get wound up to an absurd number, and it
> > results in clamping of slab objects. It is undesirable for sustaining the
> > workingset.
> >
On Thu, Feb 4, 2021 at 2:14 AM Kirill Tkhai wrote:
>
> On 04.02.2021 12:29, Kirill Tkhai wrote:
> > On 03.02.2021 20:20, Yang Shi wrote:
> >> Now nr_deferred is available on per memcg level for memcg aware shrinkers,
> >> so don't need
> >> alloc
On Thu, Feb 4, 2021 at 12:42 AM Kirill Tkhai wrote:
>
> On 03.02.2021 20:20, Yang Shi wrote:
> > Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's
> > nr_deferred
> > will be used in the following cases:
> > 1. Non memcg aware shri
On Fri, Feb 5, 2021 at 6:38 AM Kirill Tkhai wrote:
>
> On 04.02.2021 20:17, Yang Shi wrote:
> > On Thu, Feb 4, 2021 at 12:31 AM Kirill Tkhai wrote:
> >>
> >> On 03.02.2021 20:20, Yang Shi wrote:
> >>> Currently the number of deferred objects are per sh
On Fri, Feb 5, 2021 at 6:42 AM Kirill Tkhai wrote:
>
> On 04.02.2021 20:23, Yang Shi wrote:
> > On Thu, Feb 4, 2021 at 12:42 AM Kirill Tkhai wrote:
> >>
> >> On 03.02.2021 20:20, Yang Shi wrote:
> >>> Use per memcg's nr_deferred for memcg awar
Now the shrinker's nr_deferred is per memcg for memcg aware shrinkers, so add it
to the parent's
corresponding nr_deferred when the memcg goes offline.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 1 +
mm/memcontrol.c| 1 +
m
rinker's SHRINKER_MEMCG_AWARE flag would be
cleared.
This makes the implementation of this patch simpler.
Acked-by: Vlastimil Babka
Reviewed-by: Kirill Tkhai
Signed-off-by: Yang Shi
---
mm/vmscan.c | 33 ++---
1 file changed, 18 insertions(+), 15 deletions(-)
diff --git a/mm/
's patch:
https://lore.kernel.org/linux-xfs/20191031234618.15403-13-da...@fromorbit.com/
Tested with a kernel build and a vfs-metadata-heavy workload in our production
environment; no regression has been spotted so far.
Signed-off-by: Yang Shi
---
mm/vmscan.c | 40 +-
Use per memcg's nr_deferred for memcg aware shrinkers. The shrinker's
nr_deferred
will be used in the following cases:
1. Non memcg aware shrinkers
2. !CONFIG_MEMCG
3. memcg is disabled by boot parameter
Signed-off-by: Yang Shi
---
mm/vms
ed all
the time.
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 7 +++---
mm/vmscan.c| 49 +-
2 files changed, 37 insertions(+), 19 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 4c92538
ct the dereference into a helper to make the code more readable. No
functional change.
Signed-off-by: Yang Shi
---
mm/vmscan.c | 15 ++-
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9436f9246d32..273efbf4d53c 100644
--- a/mm/vmscan.c
++
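A sketch of such a helper (field names assumed), wrapping the
rcu_dereference_protected() call behind one calling convention:

	static struct shrinker_info *shrinker_info_protected(struct mem_cgroup *memcg,
							     int nid)
	{
		return rcu_dereference_protected(memcg->nodeinfo[nid]->shrinker_info,
						 lockdep_is_held(&shrinker_rwsem));
	}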
he "memcg_" prefix.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Signed-off-by: Yang Shi
---
include/linux/memcontrol.h | 8 ++---
mm/memcontrol.c| 6 ++--
mm/vmscan.c| 62 +++---
3 files changed, 38 insertions(+), 38 deleti
-by: Yang Shi
---
mm/vmscan.c | 18 +-
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e4ddaaaeffe2..641077b09e5d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -185,8 +185,10 @@ static LIST_HEAD(shrinker_list);
static DECLARE_RWSEM
larity.
And a test with a heavy paging workload didn't show that the write lock makes things worse.
Acked-by: Vlastimil Babka
Acked-by: Kirill Tkhai
Signed-off-by: Yang Shi
---
mm/vmscan.c | 16 ++--
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/mm/vmscan.c b/mm/vms