track the time a process has been preempted by other means, no? We
have context switching tracepoints in place. Have you considered that
option?
--
Michal Hocko
SUSE Labs
On Wed 21-02-24 13:30:51, Carlos Galo wrote:
> On Tue, Feb 20, 2024 at 11:55 PM Michal Hocko wrote:
> >
> > Hi,
> > sorry I have missed this before.
> >
> > On Thu 11-01-24 21:05:30, Carlos Galo wrote:
> > > The current implementation of the mark_victim
pid_nr(victim), victim->comm, K(mm->total_vm),
K(get_mm_counter(mm, MM_ANONPAGES)),
K(get_mm_counter(mm, MM_FILEPAGES)),
K(get_mm_counter(mm, MM_SHMEMPAGES)),
from_kuid(&init_user_ns, task_uid(victim)),
mm_pgtables_bytes(mm) >> 10, victim->signal->oom_score_adj);
--
Michal Hocko
SUSE Labs
an ad-hoc dynamic tracepoint or BPF for a very
special situation is not sufficient?
In other words, tell us more about the usecases and why this is
generally useful.
Thanks!
--
Michal Hocko
SUSE Labs
ually the interesting case at all.
--
Michal Hocko
SUSE Labs
On Tue 20-04-21 15:57:08, Michal Hocko wrote:
[...]
> Usual memory consumption is usually something like LRU pages + Slab
> memory + kernel stack + vmalloc used + pcp.
>
> > But I know that KernelStack is allocated through vmalloc these days,
> > and I don't know whether
Mlocked a subset of Unevictable?
>
> There is some attempt at explaining how these numbers fit together, but
> it's outdated, and doesn't include Mlocked, Unevictable or KernelStack
Agreed there is a lot of tribal knowledge or even misconceptions flying
around and it will take much more work to put everything into shape.
This is only one tiny step forward.
--
Michal Hocko
SUSE Labs
usage.
> >
> > Signed-off-by: Mike Rapoport
>
> Ooops, forgot to add Michal's Ack, sorry.
Let's make it more explicit
Acked-by: Michal Hocko
Thanks!
--
Michal Hocko
SUSE Labs
On Tue 20-04-21 09:25:51, peter.enderb...@sony.com wrote:
> On 4/20/21 11:12 AM, Michal Hocko wrote:
> > On Tue 20-04-21 09:02:57, peter.enderb...@sony.com wrote:
> >>>> But that isn't really system memory at all, it's just allocated device
> >>>> memory.
On Fri 16-04-21 13:24:10, Oscar Salvador wrote:
> Enable x86_64 platform to use the MHP_MEMMAP_ON_MEMORY feature.
>
> Signed-off-by: Oscar Salvador
> Reviewed-by: David Hildenbrand
Acked-by: Michal Hocko
> ---
> arch/x86/Kconfig | 3 +++
> 1 file changed, 3 insertio
return -EINVAL;
> + }
> +
> + /*
> + * Let remove_pmd_table->free_hugepage_table do the
> + * right thing if we used vmem_altmap when hot-adding
> + * the range.
> + */
> + mhp_altmap.alloc = nr_vmemmap_pages;
> + altmap = &mhp_altmap;
> + }
> + }
> +
> /* remove memmap entry */
> firmware_map_remove(start, start + size, "System RAM");
I have to say I still dislike this; I would just turn it inside out
and do the operation from within walk_memory_blocks, but I will not
insist.
--
Michal Hocko
SUSE Labs
modify the number of present pages.
>
> Signed-off-by: David Hildenbrand
> Signed-off-by: Oscar Salvador
> Reviewed-by: Oscar Salvador
Not sure self review counts ;)
Acked-by: Michal Hocko
Btw. I strongly suspect the resize lock is quite pointless here.
Something for a follow up patch.
is a special
case we want to allow."
> Signed-off-by: Oscar Salvador
> Reviewed-by: David Hildenbrand
With the changelog extended and the comment clarification (see below)
feel free to add
Acked-by: Michal Hocko
> ---
> mm/memory_hotplug.c | 18 ++
> 1 file c
Reviewed-by: David Hildenbrand
Acked-by: Michal Hocko
> ---
> drivers/base/memory.c | 33 +
> 1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index f35298425575..f209925a5d4e
a single counter without a wider context cannot be interpreted in any
reasonable way. There is no notion of the total amount of device
memory usable for dma-buf. As Christian explained, some of it can be RAM
based. So a single number is rather pointless on its own in many cases.
Or let me just ask. What can
a step in the right
direction.
Acked-by: Michal Hocko
one nit below
> ---
> Documentation/filesystems/proc.rst | 11 +--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.rst
> b/Documentation/filesystems/proc.rst
> index
now replaced with dma-buf. ION had some overview metrics that were
> similar.
The discussion around the previous version is still not over and as it
seems your proposed approach is not really viable. So please do not send
new versions until that is sorted out.
Thanks!
--
Michal Hocko
SUSE Labs
On Tue 20-04-21 10:00:07, Christian König wrote:
> Am 20.04.21 um 09:46 schrieb Michal Hocko:
> > On Tue 20-04-21 09:32:14, Christian König wrote:
> > > Am 20.04.21 um 09:04 schrieb Michal Hocko:
> > > > On Mon 19-04-21 18:37:13, Christian König wrote:
> > >
s own file makes sense to me as well. If the code is
not conditional (e.g. like swap accounting and some others) then moving
it would make memcontrol.c easier to navigate through.
--
Michal Hocko
SUSE Labs
On Tue 20-04-21 10:20:43, Mike Rapoport wrote:
> On Tue, Apr 20, 2021 at 09:04:51AM +0200, Michal Hocko wrote:
> > On Mon 19-04-21 18:37:13, Christian König wrote:
> > > Am 19.04.21 um 18:11 schrieb Michal Hocko:
> > [...]
> > > > The question is not
On Tue 20-04-21 09:32:14, Christian König wrote:
> Am 20.04.21 um 09:04 schrieb Michal Hocko:
> > On Mon 19-04-21 18:37:13, Christian König wrote:
> > > Am 19.04.21 um 18:11 schrieb Michal Hocko:
[...]
> > What I am trying to bring up with NUMA side is that the same probl
On Mon 19-04-21 18:37:13, Christian König wrote:
> Am 19.04.21 um 18:11 schrieb Michal Hocko:
[...]
> > The question is not whether it is NUMA aware but whether it is useful to
> > know per-numa data for the purpose the counter is supposed to serve.
>
> No, not at all. The
itrary metrics and if that can be done without any
> allocations.
A kernel module or eBPF to implement oom decisions has already been
discussed a few years back. But I am afraid this would be hard to wire in
for anything except the victim selection. I am not sure it is
maintainable to also control when the OOM handling should trigger.
--
Michal Hocko
SUSE Labs
On Mon 19-04-21 17:44:13, Christian König wrote:
> Am 19.04.21 um 17:19 schrieb peter.enderb...@sony.com:
> > On 4/19/21 5:00 PM, Michal Hocko wrote:
> > > On Mon 19-04-21 12:41:58, peter.enderb...@sony.com wrote:
> > > > On 4/19/21 2:16 PM, Michal Hocko wrote:
>
On Mon 19-04-21 12:41:58, peter.enderb...@sony.com wrote:
> On 4/19/21 2:16 PM, Michal Hocko wrote:
> > On Sat 17-04-21 12:40:32, Peter Enderborg wrote:
> >> This adds a total used dma-buf memory. Details
> >> can be found in debugfs, however it is not for everyone
ter
explanation and secondly is this information useful for OOM situation
analysis? If yes then show_mem should dump the value as well.
From the implementation point of view, is there any reason why this
hasn't used the existing global_node_page_state infrastructure?
--
Michal Hocko
SUSE Labs
On Fri 16-04-21 07:26:43, Dave Hansen wrote:
> On 4/16/21 5:35 AM, Michal Hocko wrote:
> > I have to confess that I haven't grasped the initialization
> > completely. There is a nice comment explaining a 2 socket system with
> > 3 different NUMA nodes attached to i
details
but they do not seem that important.
I am still trying to digest the whole thing but at least jamming
node_reclaim logic into kswapd seems strange to me. Need to think more
about that though.
Btw. do you have any numbers from running this with some real world
workload?
--
Michal Hocko
SUSE Labs
On Thu 15-04-21 15:31:46, Tim Chen wrote:
>
>
> On 4/9/21 12:24 AM, Michal Hocko wrote:
> > On Thu 08-04-21 13:29:08, Shakeel Butt wrote:
> >> On Thu, Apr 8, 2021 at 11:01 AM Yang Shi wrote:
> > [...]
> >>> The low priority jobs should be able to be res
On Fri 16-04-21 13:14:04, Muchun Song wrote:
> lruvec_holds_page_lru_lock() doesn't check anything about locking and is
> used to check whether the page belongs to the lruvec. So rename it to
> page_matches_lruvec().
>
> Signed-off-by: Muchun Song
Acked-by: Michal
enbrand
> Acked-by: Mike Kravetz
Acked-by: Michal Hocko
> ---
> mm/page_alloc.c | 6 --
> 1 file changed, 6 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b5a94de3cdde..c5338e912ace 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @
from 6c0371490140
("hugetlb: convert PageHugeFreed to HPageFreed flag"). Previously the
explicit clearing was necessary because compound allocations do not get
this initialization (see prep_compound_page).
> Signed-off-by: Oscar Salvador
with that
Acked-by: Michal Hocko
> ---
> mm/hug
On Thu 15-04-21 11:13:16, Muchun Song wrote:
> On Wed, Apr 14, 2021 at 6:15 PM Michal Hocko wrote:
> >
> > On Wed 14-04-21 18:04:35, Muchun Song wrote:
> > > On Wed, Apr 14, 2021 at 5:24 PM Michal Hocko wrote:
> > > >
> > > > On Tue 13-04-21 14:51:
On Wed 14-04-21 13:49:56, Johannes Weiner wrote:
> On Wed, Apr 14, 2021 at 06:00:42PM +0800, Muchun Song wrote:
> > On Wed, Apr 14, 2021 at 5:44 PM Michal Hocko wrote:
> > >
> > > On Tue 13-04-21 14:51:50, Muchun Song wrote:
> > > > We already have a help
> fs/super.c | 27 +++
> include/linux/fs_context.h | 2 ++
> mm/shmem.c | 1 +
> 5 files changed, 24 insertions(+), 8 deletions(-)
[...]
--
Michal Hocko
SUSE Labs
gt;mode == MPOL_PREFERRED_MANY) {
page = alloc_surplus_huge_page(h, (gfp_mask | __GFP_NOWARN) &
~(__GFP_DIRECT_RECLAIM), nodemask);
if (page)
goto got_page;
/* fallback to all nodes */
nodemask = NULL;
And alloc_pages_policy doesn't really help I have to
say. I would have expected that a dedicated alloc_pages_preferred and a
general fallback to __alloc_pages_nodemask would have been much easier
to follow.
--
Michal Hocko
SUSE Labs
RRED_MANY's semantic is more like MPOL_PREFERRED
> that it will first try the preferred node/nodes, and fallback to all
> other nodes when first try fails. Thanks to Michal Hocko for suggestions
> on this.
>
> For now, only interleaved policy will be used so there should be no
> functional
; reuses BIND.
No, this is a big step back. I think we really want to treat this as
PREFERRED. It doesn't have much to do with the BIND semantic at all.
At this stage there should be 2 things remaining - syscalls plumbing and
2 pass allocation request (optimistic preferred nodes restricted and
fa
to grep for preferred_nodes than nodes.
--
Michal Hocko
SUSE Labs
ixed typos in commit message. (Ben)
> Merged bits from other patches. (Ben)
> annotate mpol_rebind_preferred_many as unused (Ben)
I am giving up on the rebinding code for now until we clarify that in my
earlier email.
--
Michal Hocko
SUSE Labs
> + if (nodes_empty(*nodes))
> + return -EINVAL;
> +
> + tmp = nodemask_of_node(first_node(*nodes));
> + return mpol_new_preferred_many(pol, &tmp);
> + }
> +
> + return mpol_new_preferred_many(pol, NULL);
> +}
> +
> static int mpol_new_bind(struct mempolicy *pol, const nodemask_t *nodes)
> {
> if (nodes_empty(*nodes))
> --
> 2.7.4
--
Michal Hocko
SUSE Labs
hing in it should really be as simple as node_isset
check.
> default:
> BUG();
Besides that, this should really go!
> @@ -3035,6 +3066,9 @@ void mpol_to_str(char *buffer, int maxlen, struct
> mempolicy *pol)
> switch (mode) {
> case MPOL_DEFAULT:
> break;
> + case MPOL_PREFERRED_MANY:
> + WARN_ON(flags & MPOL_F_LOCAL);
Why WARN_ON here?
> + fallthrough;
> case MPOL_PREFERRED:
> if (flags & MPOL_F_LOCAL)
> mode = MPOL_LOCAL;
> --
> 2.7.4
--
Michal Hocko
SUSE Labs
= tmp;
> pol->w.cpuset_mems_allowed = *nodes;
> }
I have to say that I really disliked the original code (because it
fiddles with user-provided input behind the back), but here I got lost
completely. What the heck is going on?
a) Why do we even care about remapping a hint which is overridden by the
cpuset at the page allocator level, and b) why do we need to allocate
_two_ potentially large temporary bitmaps for that here?
I haven't spotted anything unexpected in the rest.
--
Michal Hocko
SUSE Labs
ode to full nodemask
> mm/mempolicy: Add MPOL_PREFERRED_MANY for multiple preferred nodes
> mm/mempolicy: allow preferred code to take a nodemask
> mm/mempolicy: refactor rebind code for PREFERRED_MANY
>
> Feng Tang (1):
> mem/mempolicy: unify mpol_new_preferred() and
> mpol_new_preferred_many()
>
> .../admin-guide/mm/numa_memory_policy.rst | 22 +-
> include/linux/mempolicy.h | 6 +-
> include/uapi/linux/mempolicy.h | 6 +-
> mm/hugetlb.c | 26 +-
> mm/mempolicy.c | 272
> ++---
> 5 files changed, 225 insertions(+), 107 deletions(-)
>
> --
> 2.7.4
--
Michal Hocko
SUSE Labs
On Wed 14-04-21 12:49:53, Oscar Salvador wrote:
> On Wed, Apr 14, 2021 at 12:32:58PM +0200, Michal Hocko wrote:
[...]
> > > I checked, and when we get there in __alloc_bootmem_huge_page,
> > > page->private is
> > > still zeroed, so I guess it should be safe to as
On Wed 14-04-21 12:01:47, Oscar Salvador wrote:
> On Wed, Apr 14, 2021 at 10:28:33AM +0200, Michal Hocko wrote:
> > You are right it doesn't do it there. But all struct pages, even those
> > that are allocated by the bootmem allocator should initialize its struct
> > pages. T
On Wed 14-04-21 18:04:35, Muchun Song wrote:
> On Wed, Apr 14, 2021 at 5:24 PM Michal Hocko wrote:
> >
> > On Tue 13-04-21 14:51:48, Muchun Song wrote:
> > > When mm is NULL, we do not need to hold rcu lock and call css_tryget for
> > > the root memcg. And
nes Weiner
> Acked-by: Roman Gushchin
> Reviewed-by: Shakeel Butt
Acked-by: Michal Hocko
> ---
> mm/vmscan.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 64bf07cc20f2..e40b21298d77 100644
nd
> CONFIG_MEMCG.
Neat. While you are at it, wouldn't it make sense to rename the function
as well? I do not want to bikeshed but this is really a misnomer. It
doesn't check anything about locking. page_belongs_lruvec?
> Signed-off-by: Muchun Song
> Acked-by: Johannes Weiner
Acked-b
his case it doesn't even
give any advantage for most callers.
Acked-by: Michal Hocko
> ---
> include/linux/memcontrol.h | 10 +-
> mm/compaction.c| 2 +-
> mm/memcontrol.c| 9 +++--
> mm/swap.c | 2 +-
> mm/workingset.c
dereference(mm->owner));
> - if (unlikely(!memcg))
> - memcg = root_mem_cgroup;
> - }
> } while (!css_tryget(&memcg->css));
> rcu_read_unlock();
> return memcg;
> --
> 2.11.0
--
Michal Hocko
SUSE Labs
WARN_ON_ONCE in the page_counter_cancel(). Who knows if it
> will trigger? So it is better to fix it.
>
> Signed-off-by: Muchun Song
> Acked-by: Johannes Weiner
> Reviewed-by: Shakeel Butt
Acked-by: Michal Hocko
> ---
> mm/memcontrol.c | 8 +---
> 1 file changed, 5 ins
On Wed 14-04-21 09:41:32, Oscar Salvador wrote:
> On Wed, Apr 14, 2021 at 08:04:21AM +0200, Michal Hocko wrote:
> > On Tue 13-04-21 14:19:03, Mike Kravetz wrote:
> > > On 4/13/21 6:23 AM, Michal Hocko wrote:
> > > The only place where page->private may no
On Tue 13-04-21 14:19:03, Mike Kravetz wrote:
> On 4/13/21 6:23 AM, Michal Hocko wrote:
> > On Tue 13-04-21 12:47:43, Oscar Salvador wrote:
[...]
> > Or do we need it for giga pages which are not allocated by the page
> > allocator? If yes then moving it to prep_compoun
case above we retry as the race window is quite small and we have a high
> chance of succeeding next time.
>
> With regard to the allocation, we restrict it to the node the page belongs
> to with __GFP_THISNODE, meaning we do not fallback on other node's zones.
>
> Note that gigant
t;nr_huge_pages_node[nid]++;
> + __prep_account_new_huge_page(h, nid);
> spin_unlock_irq(&hugetlb_lock);
> }
Any reason to decouple the locking from the accounting?
>
> --
> 2.16.3
--
Michal Hocko
SUSE Labs
On Tue 13-04-21 15:24:32, Michal Hocko wrote:
> On Tue 13-04-21 12:47:44, Oscar Salvador wrote:
> [...]
> > +static void prep_new_huge_page(struct hstate *h, struct page *page, int
> > nid)
> > +{
> > + __prep_new_huge_page(page);
> > spin_lock_
d(page);
> > spin_lock_irq(&hugetlb_lock);
> h->nr_huge_pages++;
> h->nr_huge_pages_node[nid]++;
> - ClearHPageFreed(page);
> spin_unlock_irq(&hugetlb_lock);
> }
>
> --
> 2.16.3
--
Michal Hocko
SUSE Labs
Not sure this is worth it TBH. Even the idea of any pcp access
synchronization with memory hotplug makes for a decent headache.
--
Michal Hocko
SUSE Labs
On Fri 09-04-21 16:26:53, Tim Chen wrote:
>
> On 4/8/21 4:52 AM, Michal Hocko wrote:
>
> >> The top tier memory used is reported in
> >>
> >> memory.toptier_usage_in_bytes
> >>
> >> The amount of top tier memory usable by each cgr
ly dislike kmem and LRU pages to be handled
differently so for that reason
Nacked-by: Michal Hocko
If the optimization can really be proven then the patch would need
to be much more invasive.
> Signed-off-by: Chen Xiaoguang
> Signed-off-by: Chen He
> ---
> incl
:, then
> gets stalled/scheduled out while hotremove rebuilds the zonelist and destroys
> the pcplists, then the first task is resumed and proceeds with
> rmqueue_pcplist().
>
> So that's very rare thus not urgent, and this patch doesn't make it less rare
> so
> not a reason to block it.
Completely agreed here. Not an urgent thing to work on but something to
look into long term.
--
Michal Hocko
SUSE Labs
h that an existing
race was likely never observed.
Acked-by: Michal Hocko
Thanks!
> Signed-off-by: Mel Gorman
> ---
> Resending for email address correction and adding lists
>
> Changelog since v1
> o Minimal fix
>
> mm/page_alloc.c | 4
> 1 file changed, 4 delet
OK. Let's do that for now and I will put a follow up on my todo list.
Thanks!
--
Michal Hocko
SUSE Labs
On Fri 09-04-21 14:42:21, Mel Gorman wrote:
> On Fri, Apr 09, 2021 at 02:48:12PM +0200, Michal Hocko wrote:
> > On Fri 09-04-21 14:42:58, Michal Hocko wrote:
> > > On Fri 09-04-21 13:09:57, Mel Gorman wrote:
> > > > zone_pcp_reset allegedly protects against a race
On Fri 09-04-21 14:42:58, Michal Hocko wrote:
> On Fri 09-04-21 13:09:57, Mel Gorman wrote:
> > zone_pcp_reset allegedly protects against a race with drain_pages
> > using local_irq_save but this is bogus. local_irq_save only operates
> > on the local CPU. If memory hotp
reset pcp of an empty
zone at all? The whole point of this exercise seems to be described in
340175b7d14d5. setup_zone_pageset can check for an already allocated pcp
and simply reinitialize it.
--
Michal Hocko
SUSE Labs
memcg.
The behavior of those limits would be quite tricky for OOM situations
as well due to the lack of a NUMA-aware oom killer.
--
Michal Hocko
SUSE Labs
is already botched and counters
cannot be trusted this is definitely better than a potentially
completely unusable memcg. It would be nice to mention that in the above
paragraph as a caveat.
> Signed-off-by: Johannes Weiner
Acked-by: Michal Hocko
> ---
> mm/page_counter.c | 8 ++--
On Wed 07-04-21 15:33:26, Tim Chen wrote:
>
>
> On 4/6/21 2:08 AM, Michal Hocko wrote:
> > On Mon 05-04-21 10:08:24, Tim Chen wrote:
> > [...]
> >> To make fine grain cgroup based management of the precious top tier
> >> DRAM memory possible, this p
On Wed 07-04-21 19:13:42, Bharata B Rao wrote:
> On Wed, Apr 07, 2021 at 01:54:48PM +0200, Michal Hocko wrote:
> > On Mon 05-04-21 11:18:48, Bharata B Rao wrote:
> > > Hi,
> > >
> > > When running 1 (more-or-less-empty-)containers on a bare-metal Power9
global memory reclaim iterating over
10k memcgs will likely be very visible. I do remember playing with
similar setups a few years back and the overhead was very high.
--
Michal Hocko
SUSE Labs
nging spin_*lock calls to spin_*lock_irq* calls.
> > - Make subpool lock irq safe in a similar manner.
> > - Revert the !in_task check and workqueue handoff.
> >
> > [1]
> > https://lore.kernel.org/linux-mm/f1c03b05bc43a...@google.com/
> >
> >
) {
> > + update_and_free_page(h, page);
> > + cond_resched();
> > + }
> > + spin_lock(&hugetlb_lock);
>
> Can we get here with an empty list?
An empty page_list? If yes then sure, this can happen, but
list_for_each_entry_safe will simply not iterate. Or what do you mean?
--
Michal Hocko
SUSE Labs
On Tue 06-04-21 09:49:13, Mike Kravetz wrote:
> On 4/6/21 2:56 AM, Michal Hocko wrote:
> > On Mon 05-04-21 16:00:39, Mike Kravetz wrote:
[...]
> >> @@ -2298,6 +2312,7 @@ static int alloc_and_dissolve_huge_page(struct
> >> hstat
On Tue 06-04-21 23:12:34, Neil Sun wrote:
>
>
> On 2021/4/6 22:39, Michal Hocko wrote:
> >
> > Have you considered using high limit for the pro-active memory reclaim?
>
> Thanks, Michal, do you mean the procfs interfaces?
> We have set vm.vfs_cache_pressure=1000
On Tue 06-04-21 22:34:02, Neil Sun wrote:
>
>
> On 2021/4/6 19:39, Michal Hocko wrote:
> > On Tue 06-04-21 19:30:22, Neil Sun wrote:
> > > On 2021/4/6 15:21, Michal Hocko wrote:
> > > >
> > > > You are changing semantic of the existing user i
On Tue 06-04-21 19:30:22, Neil Sun wrote:
> On 2021/4/6 15:21, Michal Hocko wrote:
> >
> > You are changing semantic of the existing user interface. This knob has
> > never been memcg aware and it is supposed to have a global impact. I do
> > not think we can simply cha
continue;
> remove_hugetlb_page(h, page, false);
> - update_and_free_page(h, page);
> + list_add(&page->lru, &page_list);
> }
> }
> +
> +out:
> + spin_unlock(&hugetlb_lock);
> + list_for_each_entry_safe(page, next, &page_list, lru) {
> + update_and_free_page(h, page);
> + cond_resched();
> + }
> + spin_lock(&hugetlb_lock);
> }
> #else
> static inline void try_to_free_low(struct hstate *h, unsigned long count,
> --
> 2.30.2
>
--
Michal Hocko
SUSE Labs
te_and_free_page(h, new_page);
> spin_unlock(&hugetlb_lock);
> if (!isolate_huge_page(old_page, list))
the page is not enqueued anywhere here so remove_hugetlb_page would blow
up when linked list debugging is enabled.
--
Michal Hocko
SUSE Labs
> Signed-off-by: Mike Kravetz
I believe I have acked the previous version already. Anyway
Acked-by: Michal Hocko
> ---
> mm/cma.c | 18 +-
> mm/cma.h | 2 +-
> mm/cma_debug.c | 8
> 3 files changed, 14 insertions(+), 14 deletions(-)
>
rather alien concept to the existing memcg
infrastructure IMO. It looks like it is fusing the borders between the
memcg and cpuset controllers.
You also seem to be basing the interface on the very specific usecase.
Can we expect that there will be many different tiers requiring their
own balancing?
--
Michal Hocko
SUSE Labs
g = mem_cgroup_iter(NULL, NULL, NULL);
> + memcg = mem_cgroup_from_task(current);
> do {
> freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
> } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
> --
> 2.7.4
--
Michal Hocko
SUSE Labs
On Thu 01-04-21 21:59:13, Muchun Song wrote:
> On Thu, Apr 1, 2021 at 6:26 PM Michal Hocko wrote:
[...]
> > Even if the css ref count is not really necessary it shouldn't cause any
> > harm and it makes the code easier to understand. At least a comment
> > explaining why
into its own patch, with more explanation of why NOIO is
required.
> Fixes: 682aa8e1a6a1 ("writeback: implement unlocked_inode_to_wb transaction
> and use it for stat updates")
> Signed-off-by: Muchun Song
For the css part feel free to add
Acked-by: Michal Hocko
Even if the css ref
rt = sysfs_kf_seq_start,
> .seq_show = sysfs_kf_seq_show,
> };
>
> static const struct kernfs_ops sysfs_file_kfops_wo = {
> + .seq_start = sysfs_kf_seq_start,
> .write = sysfs_kf_write,
> };
>
> static const struct kernfs_ops sysfs_file_kfops_rw = {
> + .seq_start = sysfs_kf_seq_start,
> .seq_show = sysfs_kf_seq_show,
> .write = sysfs_kf_write,
> };
> --
> 2.25.1
--
Michal Hocko
SUSE Labs
firm that they don't enable it.
> > >
> > > I can confirm that it's certainly not enabled on any of the machines I
> > > have, but..
> >
> > Debian has CONFIG_DEVKMEM disabled since 2.6.31.
>
> SLES, too. (but no idea since when exactly)
15-SP2 IIRC
--
Michal Hocko
SUSE Labs
On Mon 22-03-21 14:49:35, Michal Hocko wrote:
> On Mon 22-03-21 15:00:37, Mike Rapoport wrote:
> > On Mon, Mar 22, 2021 at 11:14:37AM +0100, Michal Hocko wrote:
> > > Let's Andrea and Mike
> > >
> > > On Fri 19-03-21 22:24:28, Bui Quang Minh wrote:
&g
> Signed-off-by: Mike Kravetz
Acked-by: Michal Hocko
> ---
> mm/cma.c | 18 +-
> mm/cma.h | 2 +-
> mm/cma_debug.c | 8
> 3 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/mm/cma.c b/mm/cma.c
> index b2393b8
MCG so the below one
is not needed though. It would be great if the changelog mentioned
that.
> Signed-off-by: Wan Jiabing
Acked-by: Michal Hocko
> ---
> include/linux/memcontrol.h | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/in
On Tue 30-03-21 16:08:36, Muchun Song wrote:
> On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko wrote:
> >
> > On Mon 29-03-21 16:23:55, Mike Kravetz wrote:
> > > Ideally, cma_release could be called from any context. However, that is
> > > not possible because a
accounting effectively
> reverting the commit.
>
> Signed-off-by: Mike Kravetz
Please drop INIT_LIST_HEAD which seems to be a leftover from rebasing
to use LIST_HEAD.
Acked-by: Michal Hocko
> ---
> mm/hugetlb.c | 95 +---
>
On Mon 29-03-21 16:23:56, Mike Kravetz wrote:
> Now that cma_release is non-blocking and irq safe, there is no need to
> drop hugetlb_lock before calling.
>
> Signed-off-by: Mike Kravetz
Acked-by: Michal Hocko
> ---
> mm/hugetlb.c | 6 --
> 1 file changed, 6 deletio
ma_bitmap_maxno(cma);
> + unsigned long flags;
>
> - mutex_lock(&cma->lock);
> + spin_lock_irqsave(&cma->lock, flags);
> for (;;) {
> start = find_next_zero_bit(cma->bitmap, bitmap_maxno, end);
> if (start >= bitmap_maxno)
> @@ -61,7 +63,7 @@ static int cma_maxchunk_get(void *data, u64 *val)
> end = find_next_bit(cma->bitmap, bitmap_maxno, start);
> maxchunk = max(end - start, maxchunk);
> }
> - mutex_unlock(&cma->lock);
> + spin_unlock_irqrestore(&cma->lock, flags);
> *val = (u64)maxchunk << cma->order_per_bit;
>
> return 0;
and here.
--
Michal Hocko
SUSE Labs
know since when it is possible to use hugetlb in the networking
context? Maybe this has been possible forever, but I am wondering why
lockdep started complaining only now. Maybe fuzzing just finally started
using this setup, which nobody does normally.
--
Michal Hocko
SUSE Labs
could have missed something.
If this is really a practical concern then we can try a more complex
solution based on some data.
--
Michal Hocko
SUSE Labs
ld be focusing on the compaction retry logic and
see whether we can have some "run away" scenarios there. Seeing so many
retries without compaction bailing out sounds like a bug in that retry
logic. Vlastimil is much more familiar with that.
--
Michal Hocko
SUSE Labs
On Fri 26-03-21 15:53:41, David Hildenbrand wrote:
> On 26.03.21 15:38, Michal Hocko wrote:
> > On Fri 26-03-21 09:52:58, David Hildenbrand wrote:
[...]
> > > 2. We won't allocate kasan shadow memory. We most probably have to do it
> > > explicitl
lock, but not the memory hotplug lock. E.g., for get_online_mems(). Might
> have to move that out online_pages.
Could you be more explicit why this locking is needed? What it would
protect from for vmemmap pages?
--
Michal Hocko
SUSE Labs