Re: [PATCH v2] tracing: Have saved_cmdlines arrays all in one allocation

2024-02-13 Thread Tim Chen
need to be allocated. > > Link: > https://lore.kernel.org/linux-trace-kernel/20240212174011.06821...@gandalf.local.home/ > > Signed-off-by: Steven Rostedt (Google) Patch looks good to me. Reviewed-by: Tim Chen > --- > Changes since v1: > https://lore.kernel.org/linu

Re: [PATCH] tracing: Have saved_cmdlines arrays all in one allocation

2024-02-13 Thread Tim Chen
On Mon, 2024-02-12 at 19:13 -0500, Steven Rostedt wrote: > On Mon, 12 Feb 2024 15:39:03 -0800 > Tim Chen wrote: > > > > diff --git a/kernel/trace/trace_sched_switch.c > > > b/kernel/trace/trace_sched_switch.c > > > index e4fbcc3bede5..210c74dc

Re: [PATCH] tracing: Fix wasted memory in saved_cmdlines logic

2024-02-12 Thread Tim Chen
locating the other array. > > Cc: sta...@vger.kernel.org > Fixes: 939c7a4f04fcd ("tracing: Introduce saved_cmdlines_size file") > Signed-off-by: Steven Rostedt (Google) Reviewed-by: Tim Chen > --- > kernel/trace/trace.c | 73 +---

Re: [PATCH] tracing: Have saved_cmdlines arrays all in one allocation

2024-02-12 Thread Tim Chen
sn't need to be allocated. This patch does make better use of the extra space and make the previous change better. Reviewed-by: Tim Chen > > Link: > https://lore.kernel.org/linux-trace-kernel/20240212174011.06821...@gandalf.local.home/ > > Signed-off-by: Steven Rostedt (Google)

Re: [PATCH] tracing: Fix wasted memory in saved_cmdlines logic

2024-02-12 Thread Tim Chen
On Thu, 2024-02-08 at 10:53 -0500, Steven Rostedt wrote: > From: "Steven Rostedt (Google)" > > While looking at improving the saved_cmdlines cache I found a huge amount > of wasted memory that should be used for the cmdlines. > > The tracing data saves pids during the trace. At sched switch, if
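The cmdline cache maps recorded pids back to their comms; the waste comes from the map and the cmdline strings being carved out of allocations that each round up on their own. A rough sketch of the single-allocation technique the follow-up patches apply, in plain C with invented names (not the kernel's savedcmd code):

#include <stdlib.h>

#define TASK_COMM_LEN 16

/* Hypothetical container: one allocation backs the struct and both
 * trailing arrays, so allocator rounding is paid only once. */
struct cmdline_map {
	unsigned int	num;		/* number of cached cmdlines */
	int		*map_to_pid;	/* num entries               */
	char		*cmdlines;	/* num * TASK_COMM_LEN bytes */
};

static struct cmdline_map *cmdline_map_alloc(unsigned int num)
{
	size_t sz = sizeof(struct cmdline_map) +
		    num * sizeof(int) + (size_t)num * TASK_COMM_LEN;
	struct cmdline_map *m = calloc(1, sz);

	if (!m)
		return NULL;
	m->num = num;
	/* Carve both arrays out of the tail of the same block. */
	m->map_to_pid = (int *)(m + 1);
	m->cmdlines = (char *)(m->map_to_pid + num);
	return m;
}

Freeing is then a single free(m), and none of the space left over from rounding is stranded in a separate allocation.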

Re: [PATCH v7 09/15] x86/sgx: Charge mem_cgroup for per-cgroup reclamation

2024-02-02 Thread Tim Chen
On Mon, 2024-01-22 at 09:20 -0800, Haitao Huang wrote: > > @@ -1047,29 +1037,38 @@ static struct mem_cgroup > *sgx_encl_get_mem_cgroup(struct sgx_encl *encl) > * @encl:an enclave pointer > * @page_index: enclave page index > * @backing: data for accessing backing storage for the

Re: [PATCH 1/4] cpufreq: Add a cpufreq pressure feedback for the scheduler

2023-12-13 Thread Tim Chen
On Tue, 2023-12-12 at 15:27 +0100, Vincent Guittot wrote: > Provide to the scheduler a feedback about the temporary max available > capacity. Unlike arch_update_thermal_pressure, this doesn't need to be > filtered as the pressure will happen for dozens ms or more. > > Signed-off-by: Vincent

Re: [RFC PATCH v5 4/4] scheduler: Add cluster scheduler level for x86

2021-04-20 Thread Tim Chen
On 3/23/21 4:21 PM, Song Bao Hua (Barry Song) wrote: >> >> On 3/18/21 9:16 PM, Barry Song wrote: >>> From: Tim Chen >>> >>> There are x86 CPU architectures (e.g. Jacobsville) where L2 cache >>> is shared among a cluster of cores i

Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-15 Thread Tim Chen
On 4/9/21 12:24 AM, Michal Hocko wrote: > On Thu 08-04-21 13:29:08, Shakeel Butt wrote: >> On Thu, Apr 8, 2021 at 11:01 AM Yang Shi wrote: > [...] >>> The low priority jobs should be able to be restricted by cpuset, for >>> example, just keep them on second tier memory nodes. Then all the >>>

Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-15 Thread Tim Chen
On 4/8/21 10:18 AM, Shakeel Butt wrote: > > Using v1's soft limit like behavior can potentially cause high > priority jobs to stall to make enough space on top tier memory on > their allocation path and I think this patchset is aiming to reduce > that impact by making kswapd do that work.

Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-14 Thread Tim Chen
On 4/12/21 12:20 PM, Shakeel Butt wrote: >> >> memory_t0.current Current usage of tier 0 memory by the cgroup. >> >> memory_t0.min If tier 0 memory used by the cgroup falls below this >> low >> boundary, the memory will not be subjected to >> demotion

Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-14 Thread Tim Chen
On 4/8/21 1:29 PM, Shakeel Butt wrote: > On Thu, Apr 8, 2021 at 11:01 AM Yang Shi wrote: > > The low and min limits have semantics similar to the v1's soft limit > for this situation i.e. letting the low priority job occupy top tier > memory and depending on reclaim to take back the excess

Re: [PATCH 2/5] swap: fix do_swap_page() race with swapoff

2021-04-14 Thread Tim Chen
On 4/13/21 6:04 PM, Huang, Ying wrote: > Tim Chen writes: > >> On 4/12/21 6:27 PM, Huang, Ying wrote: >> >>> >>> This isn't the commit that introduces the race. You can use `git blame` >>> find out the correct commit. For this it's comm

Re: [PATCH 2/5] swap: fix do_swap_page() race with swapoff

2021-04-13 Thread Tim Chen
On 4/12/21 6:27 PM, Huang, Ying wrote: > > This isn't the commit that introduces the race. You can use `git blame` > find out the correct commit. For this it's commit 0bcac06f27d7 "mm, > swap: skip swapcache for swapin of synchronous device". > > And I suggest to merge 1/5 and 2/5 to make

Re: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters and add cluster scheduler

2021-04-13 Thread Tim Chen
On 4/13/21 3:45 AM, Song Bao Hua (Barry Song) wrote: > > > > Right now in the main cases of using wake_affine to achieve > better performance, processes are actually bound within one > numa which is also a LLC in kunpeng920. > > Probably LLC=NUMA is also true for X86 Jacobsville, Tim? In

Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-09 Thread Tim Chen
On 4/8/21 4:52 AM, Michal Hocko wrote: >> The top tier memory used is reported in >> >> memory.toptier_usage_in_bytes >> >> The amount of top tier memory usable by each cgroup without >> triggering page reclaim is controlled by the >> >> memory.toptier_soft_limit_in_bytes > Michal, Thanks

Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages() for NOHZ

2021-04-09 Thread Tim Chen
On 4/9/21 8:26 AM, Vincent Guittot wrote: I was expecting idle load balancer to be rate limited to 60 Hz, which >>> >>> Why 60Hz ? >>> >> >> My thinking is we will trigger load balance only after rq->next_balance. >> >> void trigger_load_balance(struct rq *rq) >> { >> /*
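The rate limit being discussed boils down to a jiffies comparison before doing the expensive update. A hedged sketch of that shape (the field name and the roughly-60 Hz interval are placeholders, not lifted from Vincent's patch):

#include <linux/jiffies.h>

static bool blocked_average_update_due(struct rq *rq)
{
	/* Skip the update unless roughly 1/60 s has passed. */
	if (time_before(jiffies, rq->last_blocked_load_update_tick + HZ / 60))
		return false;

	rq->last_blocked_load_update_tick = jiffies;
	return true;
}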

Re: [PATCH 2/5] swap: fix do_swap_page() race with swapoff

2021-04-09 Thread Tim Chen
On 4/9/21 1:42 AM, Miaohe Lin wrote: > On 2021/4/9 5:34, Tim Chen wrote: >> >> >> On 4/8/21 6:08 AM, Miaohe Lin wrote: >>> When I was investigating the swap code, I found the below possible race >>> window: >>> >&

Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages() for NOHZ

2021-04-08 Thread Tim Chen
On 4/8/21 7:51 AM, Vincent Guittot wrote: >> I was suprised to find the overall cpu% consumption of >> update_blocked_averages >> and throughput of the benchmark still didn't change much. So I took a >> peek into the profile and found the update_blocked_averages calls shifted to >> the

Re: [PATCH 2/5] swap: fix do_swap_page() race with swapoff

2021-04-08 Thread Tim Chen
On 4/8/21 6:08 AM, Miaohe Lin wrote: > When I was investigating the swap code, I found the below possible race > window: > > CPU 1 CPU 2 > - - > do_swap_page > synchronous swap_readpage > alloc_page_vma >
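One way to close a window like this is to pin the swap device across the lookup with the existing get_swap_device()/put_swap_device() helpers, so a concurrent swapoff() cannot free the metadata underneath the fault path. A sketch of that guard (not the exact hunk from the series):

#include <linux/swap.h>
#include <linux/swapops.h>

static bool swap_entry_usable(swp_entry_t entry)
{
	struct swap_info_struct *si;

	/* Fails if swapoff has already released this device. */
	si = get_swap_device(entry);
	if (!si)
		return false;

	/* ... safe to read the swap cache / swap_info here ... */

	put_swap_device(si);
	return true;
}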

Re: [RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-07 Thread Tim Chen
On 4/6/21 2:08 AM, Michal Hocko wrote: > On Mon 05-04-21 10:08:24, Tim Chen wrote: > [...] >> To make fine grain cgroup based management of the precious top tier >> DRAM memory possible, this patchset adds a few new features: >> 1. Provides memory monitors on the amount

Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages() for NOHZ

2021-04-07 Thread Tim Chen
On 4/7/21 7:02 AM, Vincent Guittot wrote: > Hi Tim, > > On Wed, 24 Mar 2021 at 17:05, Tim Chen wrote: >> >> >> >> On 3/24/21 6:44 AM, Vincent Guittot wrote: >>> Hi Tim, >> >>> >>> IIUC your problem, we call update_blocked_aver

[RFC PATCH v1 03/11] mm: Account the top tier memory usage per cgroup

2021-04-05 Thread Tim Chen
For each memory cgroup, account its usage of the top tier memory at the time a top tier page is assigned and uncharged from the cgroup. Signed-off-by: Tim Chen --- include/linux/memcontrol.h | 1 + mm/memcontrol.c| 39 +- 2 files changed, 39
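A minimal sketch of what per-cgroup accounting of top-tier usage can look like, assuming a dedicated page_counter on the memcg and a node_is_toptier()-style helper (both are illustrative here, not the patch's exact interface):

#include <linux/page_counter.h>

static void memcg_charge_toptier(struct mem_cgroup *memcg, int nid,
				 unsigned long nr_pages)
{
	/* Count only pages that landed on a top-tier node. */
	if (memcg && node_is_toptier(nid))
		page_counter_charge(&memcg->toptier, nr_pages);
}

static void memcg_uncharge_toptier(struct mem_cgroup *memcg, int nid,
				   unsigned long nr_pages)
{
	if (memcg && node_is_toptier(nid))
		page_counter_uncharge(&memcg->toptier, nr_pages);
}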

[RFC PATCH v1 11/11] mm: Wakeup kswapd if toptier memory need soft reclaim

2021-04-05 Thread Tim Chen
Detect during page allocation whether free toptier memory is low. If so, wake up kswapd to reclaim memory from those mem cgroups that have exceeded their limit. Signed-off-by: Tim Chen --- include/linux/mmzone.h | 3 +++ mm/page_alloc.c| 2 ++ mm/vmscan.c| 2 +- 3 files

[RFC PATCH v1 10/11] mm: Set toptier_scale_factor via sysctl

2021-04-05 Thread Tim Chen
Update the toptier_scale_factor via sysctl. This variable determines when kswapd wakes up to reclaim toptier memory from those mem cgroups exceeding their toptier memory limit. Signed-off-by: Tim Chen --- include/linux/mm.h | 4 include/linux/mmzone.h | 2 ++ kernel/sysctl.c

[RFC PATCH v1 09/11] mm: Use kswapd to demote pages when toptier memory is tight

2021-04-05 Thread Tim Chen
t exceeded their toptier memory soft limit by demoting the top tier pages to lower memory tier. Signed-off-by: Tim Chen --- Documentation/admin-guide/sysctl/vm.rst | 12 + include/linux/mmzone.h | 2 + mm/page_alloc.c | 14 + m

[RFC PATCH v1 08/11] mm: Add toptier option for mem_cgroup_soft_limit_reclaim()

2021-04-05 Thread Tim Chen
Add toptier reclaim type in mem_cgroup_soft_limit_reclaim(). This option reclaims top tier memory from cgroups in the order of their excess usage of top tier memory. Signed-off-by: Tim Chen --- include/linux/memcontrol.h | 9 --- mm/memcontrol.c| 48

[RFC PATCH v1 07/11] mm: Account the total top tier memory in use

2021-04-05 Thread Tim Chen
Track the global top tier memory usage stats. They are used as the basis of deciding when to start demoting pages from memory cgroups that have exceeded their soft limit. We start reclaiming top tier memory when the total top tier memory is low. Signed-off-by: Tim Chen --- include/linux

[RFC PATCH v1 06/11] mm: Handle top tier memory in cgroup soft limit memory tree utilities

2021-04-05 Thread Tim Chen
(), to allow returning the cgroup that has the largest excess usage of toptier memory. Signed-off-by: Tim Chen --- include/linux/memcontrol.h | 9 +++ mm/memcontrol.c| 152 +++-- 2 files changed, 122 insertions(+), 39 deletions(-) diff --git a/include

[RFC PATCH v1 05/11] mm: Add soft_limit_top_tier tree for mem cgroup

2021-04-05 Thread Tim Chen
Define a per node soft_limit_top_tier red-black tree that sorts and tracks the cgroups by each group's excess over its toptier soft limit. A cgroup is added to the tree if it has exceeded its top tier soft limit and it has used pages on the node. Signed-off-by: Tim Chen --- mm/memcontrol.c | 68
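The tree keeps the cgroup with the largest excess cheap to find when kswapd needs a demotion victim. A sketch of an insertion keyed by excess, using the kernel rbtree API (the entry type and names are illustrative, not the patch's code):

#include <linux/rbtree.h>

struct toptier_excess_entry {
	struct rb_node	node;
	unsigned long	excess;	/* usage above the top-tier soft limit */
};

static void toptier_tree_insert(struct rb_root *root,
				struct toptier_excess_entry *new)
{
	struct rb_node **link = &root->rb_node;
	struct rb_node *parent = NULL;
	struct toptier_excess_entry *entry;

	while (*link) {
		parent = *link;
		entry = rb_entry(parent, struct toptier_excess_entry, node);
		/* Larger excess sorts toward the right of the tree. */
		if (new->excess < entry->excess)
			link = &parent->rb_left;
		else
			link = &parent->rb_right;
	}
	rb_link_node(&new->node, parent, link);
	rb_insert_color(&new->node, root);
}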

[RFC PATCH v1 04/11] mm: Report top tier memory usage in sysfs

2021-04-05 Thread Tim Chen
In memory cgroup's sysfs, report the memory cgroup's usage of top tier memory in a new field: "toptier_usage_in_bytes". Signed-off-by: Tim Chen --- mm/memcontrol.c | 8 1 file changed, 8 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index fe7bb8613f5a..68

[RFC PATCH v1 02/11] mm: Add soft memory limit for mem cgroup

2021-04-05 Thread Tim Chen
For each memory cgroup, define a soft memory limit on its top tier memory consumption. Memory cgroups exceeding their top tier limit will be selected for demotion of their top tier memory to lower tier under memory pressure. Signed-off-by: Tim Chen --- include/linux/memcontrol.h | 1 + mm

[RFC PATCH v1 01/11] mm: Define top tier memory node mask

2021-04-05 Thread Tim Chen
/expensive memory lives in the top tier of the memory hierarchy and it is a precious resource that needs to be accounted and managed on a memory cgroup basis. Define the top tier memory as the memory nodes that don't have demotion paths into them from higher tier memory. Signed-off-by: Tim Chen

[RFC PATCH v1 00/11] Manage the top tier memory in a tiered memory

2021-04-05 Thread Tim Chen
in lieu of discard and [PATCH 0/6] [RFC v6] NUMA balancing: optimize memory placement for memory tiering system It is part of a larger patchset. You can play with the complete set of patches using the tree: https://git.kernel.org/pub/scm/linux/kernel/git/vishal/tiering.git/log/?h=tiering-0.71 Tim

Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages() for NOHZ

2021-03-24 Thread Tim Chen
On 3/24/21 6:44 AM, Vincent Guittot wrote: > Hi Tim, > > IIUC your problem, we call update_blocked_averages() but because of: > > if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) { > update_next_balance(sd, &next_balance); >

Re: [RFC PATCH v5 4/4] scheduler: Add cluster scheduler level for x86

2021-03-23 Thread Tim Chen
On 3/18/21 9:16 PM, Barry Song wrote: > From: Tim Chen > > There are x86 CPU architectures (e.g. Jacobsville) where L2 cache > is shared among a cluster of cores instead of being exclusive > to one single core. > > To prevent oversubscription of L2 cache, load should
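Roughly how a cluster level slots into the x86 sched-domain topology table, between SMT and MC; this is a sketch patterned on the existing SMT/MC entries, and the exact mask and flags helpers in the RFC may differ:

static struct sched_domain_topology_level x86_topology[] = {
#ifdef CONFIG_SCHED_SMT
	{ cpu_smt_mask, x86_smt_flags, SD_INIT_NAME(SMT) },
#endif
#ifdef CONFIG_SCHED_CLUSTER
	/* CPUs sharing an L2 form a cluster (e.g. Jacobsville). */
	{ cpu_clustergroup_mask, x86_cluster_flags, SD_INIT_NAME(CLS) },
#endif
#ifdef CONFIG_SCHED_MC
	{ cpu_coregroup_mask, x86_core_flags, SD_INIT_NAME(MC) },
#endif
	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
	{ NULL, },
};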

Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages() for NOHZ

2021-03-23 Thread Tim Chen
balance in the code path, we can let the idle load balancer update the blocked averages lazily. Something like the following perhaps on top of Vincent's patch? We haven't really tested this change yet but want to see if this change makes sense to you. Tim Signed-off-by: Tim Chen --- diff --g

Re: [Linuxarm] Re: [RFC PATCH v4 3/3] scheduler: Add cluster scheduler level for x86

2021-03-15 Thread Tim Chen
> It seems sensible the more CPU we get in the cluster, the more > we need the kernel to be aware of its existence. > > Tim, it is possible for you to bring up the cpu_cluster_mask and > cluster_sibling for x86 so that the topology can be represented > in sysfs and be used by scheduler? It

Re: [PATCH v2 1/3] mm: Fix dropped memcg from mem cgroup soft limit tree

2021-03-05 Thread Tim Chen
On 3/5/21 1:11 AM, Michal Hocko wrote: > On Thu 04-03-21 09:35:08, Tim Chen wrote: >> >> >> On 2/18/21 11:13 AM, Michal Hocko wrote: >> >>> >>> Fixes: 4e41695356fb ("memory controller: soft limit reclaim on contention") >>> Acked-

Re: [PATCH v2 1/3] mm: Fix dropped memcg from mem cgroup soft limit tree

2021-03-04 Thread Tim Chen
there is a chance that the removed next_mz could be inserted back into the tree from a memcg_check_events that happens in between. So we need to make sure that the next_mz is indeed off the tree and update the excess value before adding it back. Updated the patch to the version below. Thanks. Tim --- >F

Re: [RFC PATCH v4 3/3] scheduler: Add cluster scheduler level for x86

2021-03-03 Thread Tim Chen
On 3/2/21 2:30 AM, Peter Zijlstra wrote: > On Tue, Mar 02, 2021 at 11:59:40AM +1300, Barry Song wrote: >> From: Tim Chen >> >> There are x86 CPU architectures (e.g. Jacobsville) where L2 cache >> is shared among a cluster of cores instead of being exclusive >&g

Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-26 Thread Tim Chen
On 2/26/21 12:52 AM, Michal Hocko wrote: >> >> Michal, >> >> Let's take an extreme case where memcg 1 always generate the >> first event and memcg 2 generates the rest of 128*8-1 events >> and the pattern repeat. > > I do not follow. Events are per-memcg, aren't they? >

Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-25 Thread Tim Chen
On 2/24/21 3:53 AM, Michal Hocko wrote: > On Mon 22-02-21 11:48:37, Tim Chen wrote: >> >> >> On 2/22/21 11:09 AM, Michal Hocko wrote: >> >>>> >>>> I actually have tried adjusting the threshold but found that it doesn't >>>> w

Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-25 Thread Tim Chen
On 2/22/21 9:41 AM, Tim Chen wrote: > > > On 2/22/21 12:40 AM, Michal Hocko wrote: >> On Fri 19-02-21 10:59:05, Tim Chen wrote: > occurrence. >>>> >>>> Soft limit is evaluated every THRESHOLDS_EVENTS_TARGET * >>>> SOFTLIMIT_EVENTS_TARG

Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-22 Thread Tim Chen
On 2/22/21 11:09 AM, Michal Hocko wrote: >> >> I actually have tried adjusting the threshold but found that it doesn't work >> well for >> the case with uneven memory access frequency between cgroups. The soft >> limit for the low memory event cgroup could creep up quite a lot, exceeding >>

Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-22 Thread Tim Chen
On 2/22/21 11:09 AM, Michal Hocko wrote: > On Mon 22-02-21 09:41:00, Tim Chen wrote: >> >> >> On 2/22/21 12:40 AM, Michal Hocko wrote: >>> On Fri 19-02-21 10:59:05, Tim Chen wrote: >> occurrence. >>>>> >>&

Re: [PATCH v2 3/3] mm: Fix missing mem cgroup soft limit tree updates

2021-02-22 Thread Tim Chen
On 2/17/21 9:56 PM, Johannes Weiner wrote: >> static inline void uncharge_gather_clear(struct uncharge_gather *ug) >> @@ -6849,7 +6850,13 @@ static void uncharge_page(struct page *page, struct >> uncharge_gather *ug) >> * exclusive access to the page. >> */ >> >> -if

Re: [PATCH v2 3/3] mm: Fix missing mem cgroup soft limit tree updates

2021-02-22 Thread Tim Chen
On 2/22/21 12:41 AM, Michal Hocko wrote: >> >> >> Ah, that's true. The added check for soft_limit_excess is not needed. >> >> Do you think it is still a good idea to add patch 3 to >> restrict the uncharge update in page batch of the same node and cgroup? > > I would rather drop it. The less

Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-22 Thread Tim Chen
On 2/22/21 12:40 AM, Michal Hocko wrote: > On Fri 19-02-21 10:59:05, Tim Chen wrote: occurrence. >>> >>> Soft limit is evaluated every THRESHOLDS_EVENTS_TARGET * >>> SOFTLIMIT_EVENTS_TARGET. >>> If all events correspond with a newly charged memor

Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-20 Thread Tim Chen
On 2/19/21 10:59 AM, Tim Chen wrote: > > > On 2/19/21 1:11 AM, Michal Hocko wrote: >> >> Soft limit is evaluated every THRESHOLDS_EVENTS_TARGET * >> SOFTLIMIT_EVENTS_TARGET. >> If all events correspond with a newly charged memory and the last event >>

Re: [PATCH v2 3/3] mm: Fix missing mem cgroup soft limit tree updates

2021-02-19 Thread Tim Chen
On 2/19/21 1:16 AM, Michal Hocko wrote: >> >> Something like this? >> >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >> index 8bddee75f5cb..b50cae3b2a1a 100644 >> --- a/mm/memcontrol.c >> +++ b/mm/memcontrol.c >> @@ -3472,6 +3472,14 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t

Re: [PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-19 Thread Tim Chen
On 2/19/21 1:11 AM, Michal Hocko wrote: > On Wed 17-02-21 12:41:35, Tim Chen wrote: >> Memory is accessed at a much lower frequency >> for the second cgroup. The memcg event update was not triggered for the >> second cgroup as the memcg event update didn't happened

Re: [PATCH v2 1/3] mm: Fix dropped memcg from mem cgroup soft limit tree

2021-02-18 Thread Tim Chen
On 2/18/21 11:13 AM, Michal Hocko wrote: > On Thu 18-02-21 10:30:20, Tim Chen wrote: >> >> >> On 2/18/21 12:24 AM, Michal Hocko wrote: >> >>> >>> I have already acked this patch in the previous version along with Fixes >>> tag. It seems th

Re: [PATCH v2 1/3] mm: Fix dropped memcg from mem cgroup soft limit tree

2021-02-18 Thread Tim Chen
On 2/18/21 12:24 AM, Michal Hocko wrote: > > I have already acked this patch in the previous version along with Fixes > tag. It seems that my review feedback has been completely ignored also > for other patches in this series. Michal, My apology. Our mail system screwed up and there are

[PATCH v2 3/3] mm: Fix missing mem cgroup soft limit tree updates

2021-02-17 Thread Tim Chen
a memory soft limit. Reviewed-by: Ying Huang Signed-off-by: Tim Chen --- mm/memcontrol.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index d72449eeb85a..8bddee75f5cb 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6804,6 +6804,7

[PATCH v2 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-17 Thread Tim Chen
-by: Tim Chen --- mm/memcontrol.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index a51bf90732cb..d72449eeb85a 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -985,15 +985,22 @@ static bool mem_cgroup_event_ratelimit(struct

[PATCH v2 0/3] Soft limit memory management bug fixes

2021-02-17 Thread Tim Chen
of this patch. Thanks. Tim Changelog: v2 1. Do soft limit tree uncharge update in batch of the same node only for v1 cgroups that have a soft limit. Batching in nodes is only relevant for cgroup v1 that has per node soft limit tree. Tim Chen (3): mm: Fix dropped memcg from mem cgroup soft limit tree

[PATCH v2 1/3] mm: Fix dropped memcg from mem cgroup soft limit tree

2021-02-17 Thread Tim Chen
cgroup exceeded its soft limit. Fix the logic and put the mem cgroup back on the tree when page reclaim failed for the mem cgroup. Reviewed-by: Ying Huang Signed-off-by: Tim Chen --- mm/memcontrol.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm

Re: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters and add cluster scheduler

2021-02-16 Thread Tim Chen
or x86. Thanks. Tim >8-- >From 9189e489b019e110ee6e9d4183e243e48f44ff25 Mon Sep 17 00:00:00 2001 From: Tim Chen Date: Tue, 16 Feb 2021 08:24:39 -0800 Subject: [RFC PATCH] scheduler: Add cluster scheduler level for x86 To: , , , , , , , , , , , , , , , , , Cc: , , , , , , , Jonathan Cameron , B

Re: [PATCH 3/3] mm: Fix missing mem cgroup soft limit tree updates

2021-02-09 Thread Tim Chen
On 2/9/21 2:22 PM, Johannes Weiner wrote: > Hello Tim, > > On Tue, Feb 09, 2021 at 12:29:47PM -0800, Tim Chen wrote: >> @@ -6849,7 +6850,9 @@ static void uncharge_page(struct page *page, struct >> uncharge_gather *ug) >> * exclusive access to the page. >

[PATCH 3/3] mm: Fix missing mem cgroup soft limit tree updates

2021-02-09 Thread Tim Chen
, with each batch of pages all in the same mem cgroup and memory node. An update is issued for the batch of pages of a node collected till now whenever we encounter a page belonging to a different node. Reviewed-by: Ying Huang Signed-off-by: Tim Chen --- mm/memcontrol.c | 6 +- 1 file
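The batching described above amounts to accumulating pages for one (memcg, node) pair and flushing the soft-limit tree update whenever either changes. A simplified sketch; the gather struct and flush callback are stand-ins, not the patch's code:

struct mem_cgroup;

struct uncharge_batch {
	struct mem_cgroup	*memcg;
	int			nid;
	unsigned long		nr_pages;
};

static void batch_page(struct uncharge_batch *b, struct mem_cgroup *memcg,
		       int nid, unsigned int nr_pages,
		       void (*flush)(struct uncharge_batch *b))
{
	if (b->memcg != memcg || b->nid != nid) {
		/* A new cgroup or node closes the current batch. */
		if (b->nr_pages)
			flush(b);
		b->memcg = memcg;
		b->nid = nid;
		b->nr_pages = 0;
	}
	b->nr_pages += nr_pages;
}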

[PATCH 2/3] mm: Force update of mem cgroup soft limit tree on usage excess

2021-02-09 Thread Tim Chen
-by: Tim Chen --- mm/memcontrol.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index a51bf90732cb..d72449eeb85a 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -985,15 +985,22 @@ static bool mem_cgroup_event_ratelimit(struct

[PATCH 1/3] mm: Fix dropped memcg from mem cgroup soft limit tree

2021-02-09 Thread Tim Chen
cgroup exceeded its soft limit. Fix the logic and put the mem cgroup back on the tree when page reclaim failed for the mem cgroup. Reviewed-by: Ying Huang Signed-off-by: Tim Chen --- mm/memcontrol.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm

[PATCH 0/3] Soft limit memory management bug fixes

2021-02-09 Thread Tim Chen
During testing of tiered memory management based on memory soft limit, I found three issues with memory management using cgroup based soft limit in the mainline code. Fix the issues with the three patches in this series. Tim Chen (3): mm: Fix dropped memcg from mem cgroup soft limit tree

Re: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters and add cluster scheduler

2021-01-08 Thread Tim Chen
On 1/8/21 7:12 AM, Morten Rasmussen wrote: > On Thu, Jan 07, 2021 at 03:16:47PM -0800, Tim Chen wrote: >> On 1/6/21 12:30 AM, Barry Song wrote: >>> ARM64 server chip Kunpeng 920 has 6 clusters in each NUMA node, and each >>> cluster has 4 cpus. All clusters sha

Re: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters and add cluster scheduler

2021-01-07 Thread Tim Chen
L2 cache sched domain, sans the idle cpu selection on wake up code. It is similar enough in concept to Barry's patch that we should have a single patchset that accommodates both use cases. Thanks. Tim >From e0e7e42e1a033c9634723ff1dc80b426deeec1e9 Mon Sep 17 00:00:00 2001 Message-Id: In-Reply

Re: [RFC PATCH v2] sched/fair: select idle cpu from idle cpumask in sched domain

2020-09-24 Thread Tim Chen
On 9/24/20 10:13 AM, Phil Auld wrote: > On Thu, Sep 24, 2020 at 09:37:33AM -0700 Tim Chen wrote: >> >> >> On 9/22/20 12:14 AM, Vincent Guittot wrote: >> >>>> >>>>>> >>>>>> And a quick test with hackbench on my octo

Re: [RFC PATCH v2] sched/fair: select idle cpu from idle cpumask in sched domain

2020-09-24 Thread Tim Chen
On 9/22/20 12:14 AM, Vincent Guittot wrote: >> And a quick test with hackbench on my octo cores arm64 gives for 12 Vincent, Is it octo (=10) or octa (=8) cores on a single socket for your system? The L2 is per core or there are multiple L2s shared among groups of cores? Wonder if

Re: [RFC PATCH 06/16] sched: Add core wide task selection and scheduling.

2020-07-05 Thread Tim Chen
On 7/2/20 5:57 AM, Joel Fernandes wrote: > On Wed, Jul 01, 2020 at 05:54:11PM -0700, Tim Chen wrote: >> >> >> On 7/1/20 4:28 PM, Joel Fernandes wrote: >>> On Tue, Jun 30, 2020 at 09:32:27PM +, Vineeth Remanan Pillai wrote: >>>> From: Peter Zij

Re: [RFC PATCH 06/16] sched: Add core wide task selection and scheduling.

2020-07-01 Thread Tim Chen
Intel) >> Signed-off-by: Julien Desfossez >> Signed-off-by: Vineeth Remanan Pillai >> Signed-off-by: Aaron Lu >> Signed-off-by: Tim Chen > > Hi Peter, Tim, all, the below patch fixes the hotplug issue described in the > below patch's Link tag. Patch descriptio

Re: [Patch] mm: Increase pagevec size on large system

2020-06-29 Thread Tim Chen
On 6/26/20 8:47 PM, Andrew Morton wrote: > On Sat, 27 Jun 2020 04:13:04 +0100 Matthew Wilcox wrote: > >> On Fri, Jun 26, 2020 at 02:23:03PM -0700, Tim Chen wrote: >>> Enlarge the pagevec size to 31 to reduce LRU lock contention for >>> large systems. >>>

[Patch] mm: Increase pagevec size on large system

2020-06-26 Thread Tim Chen
from 88.8 Mpages/sec to 95.1 Mpages/sec. Signed-off-by: Tim Chen --- include/linux/pagevec.h | 8 1 file changed, 8 insertions(+) diff --git a/include/linux/pagevec.h b/include/linux/pagevec.h index 081d934eda64..466ebcdd190d 100644 --- a/include/linux/pagevec.h +++ b/include/linux
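The change itself is essentially a compile-time bump of the batch size, so the LRU lock is taken roughly half as often per page drained on big machines. The shape of it, with the struct shown for context and the CPU-count cutoff left as a placeholder rather than the value from the patch:

#include <linux/types.h>

#if CONFIG_NR_CPUS > 256		/* placeholder "large system" cutoff */
#define PAGEVEC_SIZE	31		/* header + 31 pointers: 256 bytes on 64-bit */
#else
#define PAGEVEC_SIZE	15		/* header + 15 pointers: 128 bytes on 64-bit */
#endif

struct pagevec {
	unsigned char nr;
	bool percpu_pvec_drained;
	struct page *pages[PAGEVEC_SIZE];
};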

Re: [PATCH 3/3] mm/swap: remove redundant check for swap_slot_cache_initialized

2020-05-01 Thread Tim Chen
swap_slot_cache_initialized) > + if (!swap_slot_cache_enabled) This simplification is okay. !swap_slot_cache_initialized implies !swap_slot_cache_enabled. So only !swap_slot_cache_enabled needs to be checked. > return false; > > pages = get_nr_swap_pages(); > Acked-by: Tim Chen

Re: [PATCH 2/3] mm/swap: simplify enable_swap_slots_cache()

2020-05-01 Thread Tim Chen
be done when swap_slot_cache_initialized > is false. > > No functional change. > > Signed-off-by: Zhen Lei Acked-by: Tim Chen > --- > mm/swap_slots.c | 22 ++ > 1 file changed, 10 insertions(+), 12 deletions(-) > > diff --git a/mm/swap_slot

Re: [PATCH 1/3] mm/swap: simplify alloc_swap_slot_cache()

2020-05-01 Thread Tim Chen
nit(&cache->free_lock); > @@ -155,15 +162,8 @@ static int alloc_swap_slot_cache(unsigned int cpu) >*/ > mb(); > cache->slots = slots; > - slots = NULL; > cache->slots_ret = slots_ret; > - slots_ret = NULL; > -out: > mutex_unlock(&swap_slots_cache_mutex); > - if (slots) > - kvfree(slots); > - if (slots_ret) > - kvfree(slots_ret); > return 0; > } > > Acked-by: Tim Chen

Re: [Discussion v2] Usecases for the per-task latency-nice attribute

2019-10-07 Thread Tim Chen
On 10/2/19 9:11 AM, David Laight wrote: > From: Parth Shah >> Sent: 30 September 2019 11:44 > ... >> 5> Separating AVX512 tasks and latency sensitive tasks on separate cores >> ( -Tim Chen ) >>

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-25 Thread Tim Chen
On 9/24/19 7:40 PM, Aubrey Li wrote: > On Sat, Sep 7, 2019 at 2:30 AM Tim Chen wrote: >> +static inline s64 core_sched_imbalance_delta(int src_cpu, int dst_cpu, >> + int src_sibling, int dst_sibling, >> + struct task_gro

Re: Usecases for the per-task latency-nice attribute

2019-09-19 Thread Tim Chen
On 9/19/19 2:06 AM, David Laight wrote: > From: Tim Chen >> Sent: 18 September 2019 18:16 > ... >> Some users are running machine learning batch tasks with AVX512, and have >> observed >> that these tasks affect the tasks needing a fast response. They have to

Re: Usecases for the per-task latency-nice attribute

2019-09-19 Thread Tim Chen
On 9/19/19 1:37 AM, Parth Shah wrote: > >> >> $> Separating AVX512 tasks and latency sensitive tasks on separate cores >> - >> Another usecase we are considering is to segregate those workload that will >> pull down >> core

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-18 Thread Tim Chen
On 9/4/19 6:44 PM, Julien Desfossez wrote: > + > +static void coresched_idle_worker_fini(struct rq *rq) > +{ > + if (rq->core_idle_task) { > + kthread_stop(rq->core_idle_task); > + rq->core_idle_task = NULL; > + } During testing, I have found access of

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-18 Thread Tim Chen
On 9/10/19 7:27 AM, Julien Desfossez wrote: > On 29-Aug-2019 04:38:21 PM, Peter Zijlstra wrote: >> On Thu, Aug 29, 2019 at 10:30:51AM -0400, Phil Auld wrote: >>> I think, though, that you were basically agreeing with me that the current >>> core scheduler does not close the holes, or am I reading

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-18 Thread Tim Chen
On 9/17/19 6:33 PM, Aubrey Li wrote: > On Sun, Sep 15, 2019 at 10:14 PM Aaron Lu wrote: >> >> And I have pushed Tim's branch to: >> https://github.com/aaronlu/linux coresched-v3-v5.1.5-test-tim >> >> Mine: >> https://github.com/aaronlu/linux coresched-v3-v5.1.5-test-core_vruntime Aubrey,

Re: Usecases for the per-task latency-nice attribute

2019-09-18 Thread Tim Chen
On 9/18/19 5:41 AM, Parth Shah wrote: > Hello everyone, > > As per the discussion in LPC2019, new per-task property like latency-nice > can be useful in certain scenarios. The scheduler can take proper decision > by knowing latency requirement of a task from the end-user itself. > > There has

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-13 Thread Tim Chen
On 9/13/19 7:15 AM, Aaron Lu wrote: > On Thu, Sep 12, 2019 at 10:29:13AM -0700, Tim Chen wrote: > >> The better thing to do is to move one task from cgroupA to another core, >> that has only one cgroupA task so it can be paired up >> with that lonely cgroupA task. This w

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-12 Thread Tim Chen
On 9/12/19 5:35 AM, Aaron Lu wrote: > On Wed, Sep 11, 2019 at 12:47:34PM -0400, Vineeth Remanan Pillai wrote: > > core wide vruntime makes sense when there are multiple tasks of > different cgroups queued on the same core. e.g. when there are two > tasks of cgroupA and one task of cgroupB are

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-12 Thread Tim Chen
On 9/12/19 5:04 AM, Aaron Lu wrote: > Well, I have done following tests: > 1 Julien's test script: https://paste.debian.net/plainh/834cf45c > 2 start two tagged will-it-scale/page_fault1, see how each performs; > 3 Aubrey's mysql test: https://github.com/aubreyli/coresched_bench.git > > They all

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-11 Thread Tim Chen
On 9/11/19 7:02 AM, Aaron Lu wrote: > Hi Tim & Julien, > > On Fri, Sep 06, 2019 at 11:30:20AM -0700, Tim Chen wrote: >> On 8/7/19 10:10 AM, Tim Chen wrote: >> >>> 3) Load balancing between CPU cores >>> --- >>> Sa

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-06 Thread Tim Chen
On 9/4/19 6:44 PM, Julien Desfossez wrote: >@@ -3853,7 +3880,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, >struct rq_flags *rf) > goto done; > } > >- if (!is_idle_task(p)) >+ if

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-09-06 Thread Tim Chen
On 8/7/19 10:10 AM, Tim Chen wrote: > 3) Load balancing between CPU cores > --- > Say if one CPU core's sibling threads get forced idled > a lot as it has mostly incompatible tasks between the siblings, > moving the incompatible load to other co

Re: [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice

2019-09-04 Thread Tim Chen
On 8/30/19 10:49 AM, subhra mazumdar wrote: > Add Cgroup interface for latency-nice. Each CPU Cgroup adds a new file > "latency-nice" which is shared by all the threads in that Cgroup. Subhra, Thanks for posting the patchset. Having a latency nice hint is useful beyond idle load balancing. I

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-28 Thread Tim Chen
On 8/28/19 9:01 AM, Peter Zijlstra wrote: > On Wed, Aug 28, 2019 at 11:30:34AM -0400, Phil Auld wrote: >> On Tue, Aug 27, 2019 at 11:50:35PM +0200 Peter Zijlstra wrote: > >> The current core scheduler implementation, I believe, still has >> (theoretical?) >> holes involving interrupts, once/if

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-28 Thread Tim Chen
On 8/27/19 2:50 PM, Peter Zijlstra wrote: > On Tue, Aug 27, 2019 at 10:14:17PM +0100, Matthew Garrett wrote: >> Apple have provided a sysctl that allows applications to indicate that >> specific threads should make use of core isolation while allowing >> the rest of the system to make use of

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-08 Thread Tim Chen
On 8/8/19 10:27 AM, Tim Chen wrote: > On 8/7/19 11:47 PM, Aaron Lu wrote: >> On Tue, Aug 06, 2019 at 02:19:57PM -0700, Tim Chen wrote: >>> +void account_core_idletime(struct task_struct *p, u64 exec) >>> +{ >>> + const struct cpumask *smt_mask; >>&g

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-08 Thread Tim Chen
On 8/7/19 11:47 PM, Aaron Lu wrote: > On Tue, Aug 06, 2019 at 02:19:57PM -0700, Tim Chen wrote: >> +void account_core_idletime(struct task_struct *p, u64 exec) >> +{ >> +const struct cpumask *smt_mask; >> +struct rq *rq; >> +bool force_idle, refill; &g

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-08 Thread Tim Chen
On 8/8/19 5:55 AM, Aaron Lu wrote: > On Mon, Aug 05, 2019 at 08:55:28AM -0700, Tim Chen wrote: >> On 8/2/19 8:37 AM, Julien Desfossez wrote: >>> We tested both Aaron's and Tim's patches and here are our results. > > diff --git a/kernel/sched/core.c b/kernel/sched/cor

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-07 Thread Tim Chen
On 8/7/19 1:58 AM, Dario Faggioli wrote: > So, here comes my question: I've done a benchmarking campaign (yes, > I'll post numbers soon) using this branch: > > https://github.com/digitalocean/linux-coresched.git > vpillai/coresched-v3-v5.1.5-test >

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-06 Thread Tim Chen
ede10309986a6b1bcc82d317f86a5b06459d76bd Mon Sep 17 00:00:00 2001 From: Tim Chen Date: Wed, 24 Jul 2019 13:58:18 -0700 Subject: [PATCH 1/2] sched: Move sched fair prio comparison to fair.c Consolidate the task priority comparison of the fair class to fair.c. A simple code reorganization a

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-06 Thread Tim Chen
On 8/5/19 8:24 PM, Aaron Lu wrote: > I've been thinking if we should consider core wide tenent fairness? > > Let's say there are 3 tasks on 2 threads' rq of the same core, 2 tasks > (e.g. A1, A2) belong to tenent A and the 3rd B1 belong to another tenent > B. Assume A1 and B1 are queued on the

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-08-05 Thread Tim Chen
On 8/2/19 8:37 AM, Julien Desfossez wrote: > We tested both Aaron's and Tim's patches and here are our results. > > Test setup: > - 2 1-thread sysbench, one running the cpu benchmark, the other one the > mem benchmark > - both started at the same time > - both are pinned on the same core (2

Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-07-26 Thread Tim Chen
need to use one of the sibling's cfs_rq min_vruntime as a time base. In really limited testing, it seems to have balanced fairness between two tagged cgroups. Tim ---patch 1-- From: Tim Chen Date: Wed, 24 Jul 2019 13:58:18 -0700 Subject: [PATCH 1/2] sched: move sched fair prio comparison
