On Thu, Dec 03, 2020 at 02:46:23PM +0800, Yunfeng Ye wrote:
> A little clean-up: move the '\n' into the prior seq_printf and
> remove the separate seq_printf that exists only for line breaks.
>
> No functional changes.
>
> Signed-off-by: Yunfeng Ye
Acked-by: Mel Gorman
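The pattern being cleaned up, as a minimal sketch (the format string and
variable names are illustrative, not the actual hunk):

	/* Before: a separate seq_printf() used only for the newline. */
	seq_printf(m, "%-25s %d", name, val);
	seq_printf(m, "\n");

	/* After: fold the '\n' into the preceding format string. */
	seq_printf(m, "%-25s %d\n", name, val);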
--
Mel Gorman
SUSE Labs
I could understand if there was a follow-up patch that adjusted some
subset or there was a difference in checking for schedstat_enabled,
locking or inserting new schedstat information. This can happen in the
general case when the end result is easier to review, but here it seems
to be just moving code around.
go in this direction, MPOL_F_NUMA_BALANCING
should be checked against MPOL_INTERLEAVE and explicitly fail now so
support can be detected at runtime.
> So, I prefer to make MPOL_F_NUMA_BALANCING mean
>
> Optimizing with NUMA balancing if possible, and we may add more
> optimization in the future.
>
Maybe, but I think it's best that the actual behaviour of the kernel is
documented instead of desired behaviour or future planning.
--
Mel Gorman
SUSE Labs
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 82b738de57d571cd366d89e75b5fd60f3060852b
Gitweb:
https://git.kernel.org/tip/82b738de57d571cd366d89e75b5fd60f3060852b
Author: Mel Gorman
AuthorDate: Mon, 30 Nov 2020 14:40:20
Committer
person to talk about HWP; even though I work for
> Intel I know remarkably little of it. I don't even think I've got a
> machine that has it on.
>
> Latest version below... I'll probably send it as a patch soon and get it
> merged. We can always muck with it more later.
>
True. At least any confusion can then be driven by specific questions :)
FWIW, after reading the new version I'll ack the patch when it shows up.
Thanks!
--
Mel Gorman
SUSE Labs
> > easy enough to ask for the feature to be disabled to see if it fixes
> > it. Over time, that might give an idea of exactly what sort of workloads
> > benefit and what suffers.
>
> Okay, I'll add a sched_feat() for this feature.
>
If the series I'm preparing works out ok and your patch can be integrated,
the sched_feat() may not be necessary because your patch would further
reduce time complexity without worrying about when the information
gets reset.
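For reference, wiring up such a switch is a one-liner in
kernel/sched/features.h plus a guard at the call site; a sketch with a
hypothetical feature name:

	/* kernel/sched/features.h: default-enabled, toggled at runtime
	 * via /sys/kernel/debug/sched_features (CONFIG_SCHED_DEBUG). */
	SCHED_FEAT(SCHEDSTAT_RESET, true)

	/* call site */
	if (!sched_feat(SCHEDSTAT_RESET))
		return;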
--
Mel Gorman
SUSE Labs
information see: kernel/sched/cpufreq_schedutil.c
>
>
> NOTES
> -----
>
> - On low-load scenarios, where DVFS is most relevant, the 'running'
>   numbers will closely reflect utilization.
>
> - In saturated scenarios task movement will cause some transient dips,
>   suppose we have a CPU saturated with 4 tasks, then when we migrate a task
>   to an idle CPU, the old CPU will have a 'running' value of 0.75 while the
>   new CPU will gain 0.25. This is inevitable and time progression will
>   correct this. XXX do we still guarantee f_max due to no idle-time?
>
> - Much of the above is about avoiding DVFS dips, and independent DVFS
>   domains having to re-learn / ramp-up when load shifts.
>
--
Mel Gorman
SUSE Labs
Maybe --balance-bind? The intent is to hint
that it's specific to MPOL_BIND at this time.
--
Mel Gorman
SUSE Labs
including a small test program in the manual page for a
sequence that tests whether MPOL_F_NUMA_BALANCING works. They might have
a better recommendation on how it should be handled.
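A sketch of what such a test sequence might look like, assuming the flag
is rejected with EINVAL by kernels that lack support (the helper name is
made up; link with -lnuma for the set_mempolicy() wrapper):

	#include <errno.h>
	#include <numaif.h>

	#ifndef MPOL_F_NUMA_BALANCING
	#define MPOL_F_NUMA_BALANCING	(1 << 13)	/* linux/mempolicy.h */
	#endif

	/* Returns 1 if bind+balancing was accepted, 0 if it fell back to
	 * a strict bind on an older kernel, -1 on unrelated failure. */
	static int bind_with_balancing(unsigned long *mask, unsigned long maxnode)
	{
		if (!set_mempolicy(MPOL_BIND | MPOL_F_NUMA_BALANCING, mask, maxnode))
			return 1;
		if (errno != EINVAL)
			return -1;
		return set_mempolicy(MPOL_BIND, mask, maxnode) ? -1 : 0;
	}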
--
Mel Gorman
SUSE Labs
policy.
>
Ok, I think this part is ok and while the test case is somewhat
superficial, it at least demonstrated that the NUMA balancing overhead
did not offset any potential benefit.
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
nges again resulting
> > in bugs because the driver.poll parameter means something new.
>
> Right.
>
> > Using min_cstate was definitely a hazard because it showed up in both
> > microbenchmarks and real workloads but you were right, lets only
> > introduce a tunable when and if there is no other choice in the matter.
> >
> > So, informally the following patch is the next candidate. I'm happy to
> > resend it as a separate mail if you prefer and think the patch is ok.
>
> I actually can apply it right away, so no need to resend.
>
Thanks very much.
--
Mel Gorman
SUSE Labs
e helpers are in sched/stats.c,
>
> __update_stats_wait_*(struct rq *rq, struct task_struct *p,
>                       struct sched_statistics *stats)
>
> Cc: Mel Gorman
> Signed-off-by: Yafang Shao
Think it's ok, it's mostly code shuffling. I'd have been happier if
there was
There is increased overhead now when schedstats are *enabled* because
_schedstat_from_sched_entity() has to be called but it appears that it is
protected by a schedstat_enabled() check. So ok, schedstats when enabled
are now a bit more expensive but they were expensive in the first place
so does it matter?
I'd have been happier if there was a comparison with schedstats enabled
just in case the overhead is too high but it could also do with a second
set of eyeballs.
It's somewhat tentative but
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
On Tue, Dec 01, 2020 at 07:54:12PM +0800, Yafang Shao wrote:
> schedstat_enabled() has been already checked, so we can use
> __schedstat_set() directly.
>
> Signed-off-by: Yafang Shao
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
Hmean 2      975.88 ( 0.00%)  1059.73 *  8.59%*
Hmean 4     1953.97 ( 0.00%)  2081.37 *  6.52%*
Hmean 8     3645.76 ( 0.00%)  4052.95 * 11.17%*
Hmean 16    6882.21 ( 0.00%)  6995.93 *  1.65%*
Hmean 32   10752.20 ( 0.00%) 10731.53 * -0.19%*
-off-by: Mel Gorman
---
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0d54d69ba1a5..d9acd55d309b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6087,10 +6087,11 @@ static int select_idle_core
20580.64 * -1.12%*
Signed-off-by: Mel Gorman
---
Documentation/admin-guide/kernel-parameters.txt | 18 +
drivers/cpuidle/cpuidle.c | 49 -
2 files changed, 65 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/kern
etal. If that is not the case, then it
would make more sense that the haltpoll adaptive logic would simply be
part of the core and not a per-governor decision.
--
Mel Gorman
SUSE Labs
and increases exit latency so much that it
would be unpopular on bare metal.
I'm prototyping a v2 that simply picks different balance point to see if
that gets a better reception.
--
Mel Gorman
SUSE Labs
On Thu, Nov 26, 2020 at 08:31:51PM +0000, Mel Gorman wrote:
> > > and it is reasonable behaviour but it should be tunable.
> >
> > Only if there is no way to cover all of the relevant use cases in a
> > generally acceptable way without adding more module params etc.
On Thu, Nov 26, 2020 at 07:24:41PM +0100, Rafael J. Wysocki wrote:
> On Thu, Nov 26, 2020 at 6:25 PM Mel Gorman
> wrote:
> > It was noted that a few workloads that idle rapidly regressed when commit
> > 36fcb4292473 ("cpuidle: use first valid target residency as pol
0%) 12608.15 * -0.51%*
Hmean 128   20744.83 ( 0.00%) 21147.02 *  1.94%*
Hmean 256   20646.60 ( 0.00%) 20608.48 * -0.18%*
Hmean 320   20892.89 ( 0.00%) 20831.99 * -0.29%*
Signed-off-by: Mel Gorman
---
Documentation/admin-guide/kernel-parameters.txt
easy enough to ask for the feature to be disabled to see if it fixes
it. Over time, that might give an idea of exactly what sort of workloads
benefit and what suffers.
Note that the cost of select_idle_cpu() can also be reduced by enabling
SIS_AVG_CPU so it would be interesting to know if the idle mask is superior
or inferior to SIS_AVG_CPU for workloads that show regressions.
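For anyone comparing, sched features can be flipped at runtime when
CONFIG_SCHED_DEBUG is enabled; a minimal sketch (needs root):

	#include <stdio.h>

	/* Write a feature name to debugfs; the NO_ prefix disables it,
	 * e.g. set_sched_feat("SIS_AVG_CPU") or "NO_SIS_AVG_CPU". */
	static int set_sched_feat(const char *name)
	{
		FILE *f = fopen("/sys/kernel/debug/sched_features", "w");

		if (!f)
			return -1;
		fputs(name, f);
		return fclose(f);
	}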
--
Mel Gorman
SUSE Labs
.
>
> As far as I understood Mel, rounding these ranges up/down to cover full
> MAX_ORDER blocks/pageblocks might work.
>
Yes, round down the lower end of the hole and round up the higher end to
the MAX_ORDER boundary for the purposes of having valid zone/node
linkages even if the underlying
On Wed, Nov 25, 2020 at 12:59:58PM -0500, Andrea Arcangeli wrote:
> On Wed, Nov 25, 2020 at 10:30:53AM +0000, Mel Gorman wrote:
> > On Tue, Nov 24, 2020 at 03:56:22PM -0500, Andrea Arcangeli wrote:
> > > Hello,
> > >
> > > On Tue, Nov 24, 2020 at 01:32:05P
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 7d2b5dd0bcc48095651f1b85f751eef610b3e034
Gitweb:
https://git.kernel.org/tip/7d2b5dd0bcc48095651f1b85f751eef610b3e034
Author: Mel Gorman
AuthorDate: Fri, 20 Nov 2020 09:06:29
Committer
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 23e6082a522e32232f7377540b4d42d8304253b8
Gitweb:
https://git.kernel.org/tip/23e6082a522e32232f7377540b4d42d8304253b8
Author: Mel Gorman
AuthorDate: Fri, 20 Nov 2020 09:06:30
Committer
The following commit has been merged into the sched/core branch of tip:
Commit-ID: abeae76a47005aa3f07c9be12d8076365622e25c
Gitweb:
https://git.kernel.org/tip/abeae76a47005aa3f07c9be12d8076365622e25c
Author: Mel Gorman
AuthorDate: Fri, 20 Nov 2020 09:06:27
Committer
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 5c339005f854fa75aa46078ad640919425658b3e
Gitweb:
https://git.kernel.org/tip/5c339005f854fa75aa46078ad640919425658b3e
Author: Mel Gorman
AuthorDate: Fri, 20 Nov 2020 09:06:28
Committer
On Wed, Nov 25, 2020 at 12:04:15PM +0100, David Hildenbrand wrote:
> On 25.11.20 11:39, Mel Gorman wrote:
> > On Wed, Nov 25, 2020 at 07:45:30AM +0100, David Hildenbrand wrote:
> >>> Something must have changed more recently than v5.1 that caused the
> >>> zon
ed page. I agree that compaction should not be returning
pfns that are outside of the zone range because that is buggy in itself
but valid struct pages should have valid information. I don't think we
want to paper over that with unnecessary PageReserved checks.
--
Mel Gorman
SUSE Labs
On Tue, Nov 24, 2020 at 03:56:22PM -0500, Andrea Arcangeli wrote:
> Hello,
>
> On Tue, Nov 24, 2020 at 01:32:05PM +0000, Mel Gorman wrote:
> > I would hope that is not the case because they are not meant to overlap.
> > However, if the beginning of the pageblock was no
s-zone which would
be surprising.
Maybe this untested patch?
diff --git a/mm/compaction.c b/mm/compaction.c
index 13cb7a961b31..ef1b5dacc289 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1330,6 +1330,10 @@ fast_isolate_freepages(struct compact_control *cc)
 	low_pfn = pageblock_start_pfn(cc->free_pfn - (distance >> 2));
 	min_pfn = pageblock_start_pfn(cc->free_pfn - (distance >> 1));
+	/* Ensure the PFN is within the zone */
+	low_pfn = max(cc->zone->zone_start_pfn, low_pfn);
+	min_pfn = max(cc->zone->zone_start_pfn, min_pfn);
+
 	if (WARN_ON_ONCE(min_pfn > low_pfn))
 		low_pfn = min_pfn;
--
Mel Gorman
SUSE Labs
it_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	if (!schedstat_enabled())
		return;
	__update_stats_wait_start(cfs_rq, se);
}
where __update_stats_wait_start then lives in kernel/sched/stats.c
--
Mel Gorman
SUSE Labs
On Fri, Nov 20, 2020 at 01:58:11PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 20, 2020 at 09:06:26AM +0000, Mel Gorman wrote:
>
> > Mel Gorman (4):
> > sched/numa: Rename nr_running and break out the magic number
> > sched: Avoid unnecessary calculation of load
-off-by: Mel Gorman
---
kernel/sched/fair.c | 21 +++--
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9aded12aaa90..e17e6c5da1d5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1550,7 +1550,8 @@ struct
ed as long as there are idle CPUs until
the load balancer gets involved. This caused serious problems
with a real workload that unfortunately I cannot share many
details about but there is a proxy reproducer.
--
2.26.2
Mel Gorman (4):
sched/numa: Rename nr_running and break out
confusing one type of imbalance with another depending on the group_type
in the next patch.
No functional change.
Signed-off-by: Mel Gorman
---
kernel/sched/fair.c | 8 +---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index
This is simply a preparation patch to make the following patches easier
to read. No functional change.
Signed-off-by: Mel Gorman
---
kernel/sched/fair.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6d78b68847f9
Generally performance was either neutral or better in the tests conducted.
The main consideration with this patch is the point where fork stops
spreading a task so some workloads may benefit from different balance
points but it would be a risky tuning parameter.
Signed-off-by: Mel Gorman
---
kern
On Thu, Nov 19, 2020 at 12:14:11PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 19, 2020 at 09:38:34AM +0000, Mel Gorman wrote:
> > On Wed, Nov 18, 2020 at 08:48:42PM +0100, Thomas Gleixner wrote:
> > > From: Thomas Gleixner
> > >
> > > Now that the scheduler ca
ds sharing the same address space.
I know there are other examples that are rt-specific and some tricks on
percpu page alloc draining that rely on a combination of migrate_disable
and interrupt disabling to protect the structures but the above example
might be understandable to a non-RT audience.
> > because this only works on architectures which do not have cache aliasing
> > problems.
>
> Very good. And you made sure to have a comment to not enable it for
> production systems.
>
> Hopefully people will even read it ;)
>
And not start thinking it as a security hardening option.
--
Mel Gorman
SUSE Labs
Changelog since v1
o Split out patch that moves imbalance calculation
o Strongly connect fork imbalance considerations with adjust_numa_imbalance
When NUMA and CPU balancing were reconciled, there was an attempt to allow
a degree of imbalance but it caused more problems than it solved. Instead,
-off-by: Mel Gorman
---
kernel/sched/fair.c | 21 +++--
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9aded12aaa90..e17e6c5da1d5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1550,7 +1550,8 @@ struct
This is simply a preparation patch to make the following patches easier
to read. No functional change.
Signed-off-by: Mel Gorman
---
kernel/sched/fair.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6d78b68847f9
Generally performance was either neutral or better in the tests conducted.
The main consideration with this patch is the point where fork stops
spreading a task so some workloads may benefit from different balance
points but it would be a risky tuning parameter.
Signed-off-by: Mel Gorman
---
kern
confusing one type of imbalance with another depending on the group_type
in the next patch.
No functional change.
Signed-off-by: Mel Gorman
---
kernel/sched/fair.c | 8 +---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index
> Yes.
>
Needs to be documented so applications know they can recover. Also needs
to be determined how numactl should behave if the flag does not exist.
Likely it will simply fail in which case the error should be clear.
> > In that case, a manual page is definitely needed to
> > explain that an error can be returned if the flag is used and the kernel
> > does not support it so the application can cover by falling back to a
> > strict binding. If it fails silently, then that also needs to be documented
> > because it'll lead to different behaviour depending on the running
> > kernel.
>
> Sure. Will describe this in the manual page.
>
Thanks.
--
Mel Gorman
SUSE Labs
hit regardless of interest
in schedstat.
Regardless of the merit of adding schedstats for RT, the overhead of
schedstats when stats are disabled should remain the same with the
static branch check done in an inline function.
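Concretely, the disabled case stays cheap because schedstat_enabled() is
a static key test, so the wrappers compile to a patched no-op branch;
paraphrased sketch of what kernel/sched/stats.h does:

	extern struct static_key_false sched_schedstats;

	#define schedstat_enabled() \
		static_branch_unlikely(&sched_schedstats)

	#define schedstat_set(var, val)				\
	do {							\
		if (schedstat_enabled())			\
			var = (val);				\
	} while (0)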
--
Mel Gorman
SUSE Labs
previously, but this change seems unrelated to
> the rest of this patch and could deserve a dedicated patch
>
I can split it out as a preparation patch. It can be treated as a
trivial micro-optimisation to avoid an unnecessary calculation of the
imbalance for group_overloaded/group_fully_busy. The real motivation is
to avoid confusing the group_overloaded/group_fully_busy imbalance with
allow_numa_imbalance and thinking they are somehow directly related to
each other.
--
Mel Gorman
SUSE Labs
is supported by the
current running kernel? It looks like they might receive -EINVAL (didn't
check for sure). In that case, a manual page is definitely needed to
explain that an error can be returned if the flag is used and the kernel
does not support it so the application can cover by falling back to
On Tue, Nov 17, 2020 at 04:53:10PM +0100, Vincent Guittot wrote:
> On Tue, 17 Nov 2020 at 16:17, Mel Gorman wrote:
> >
> > On Tue, Nov 17, 2020 at 03:31:19PM +0100, Vincent Guittot wrote:
> > > On Tue, 17 Nov 2020 at 15:18, Peter Zijlstra wrote:
> > > >
> &
On Tue, Nov 17, 2020 at 03:31:19PM +0100, Vincent Guittot wrote:
> On Tue, 17 Nov 2020 at 15:18, Peter Zijlstra wrote:
> >
> > On Tue, Nov 17, 2020 at 01:42:22PM +0000, Mel Gorman wrote:
> > > - if (local_sgs.idle_cpus)
> > > +
On Tue, Nov 17, 2020 at 03:24:56PM +0100, Vincent Guittot wrote:
> On Tue, 17 Nov 2020 at 14:42, Mel Gorman wrote:
> >
> > Currently, an imbalance is only allowed when a destination node
> > is almost completely idle. This solved one basic class of problems
> > an
On Tue, Nov 17, 2020 at 03:16:03PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 17, 2020 at 01:42:21PM +0000, Mel Gorman wrote:
> > This patch revisits the possibility that NUMA nodes can be imbalanced
> > until 25% of the CPUs are occupied. The reasoning behind 25% is somewhat
This is simply a preparation patch to make the following patches easier
to read. No functional change.
Signed-off-by: Mel Gorman
---
kernel/sched/fair.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6d78b68847f9
ted.
Note that the main consideration with this patch is the point where fork
stops spreading a task. Some workloads may benefit from different balance
points but it would be a risky tuning parameter.
Signed-off-by: Mel Gorman
---
kernel/sched/fair.c | 17 ++---
1 file changed, 10 insert
-off-by: Mel Gorman
---
kernel/sched/fair.c | 22 --
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5fbed29e4001..33709dfac24d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1550,7 +1550,8 @@ struct
When NUMA and CPU balancing were reconciled, there was an attempt to allow
a degree of imbalance but it caused more problems than it solved. Instead,
imbalance was only allowed with an almost idle NUMA domain. A lot of the
problems have since been addressed so it's time for a revisit. There is
"sched/core: Optimize ttwu() spinning on p->on_cpu")
> Reported-by: Mel Gorman
> Signed-off-by: Peter Zijlstra (Intel)
Thanks, testing completed successfully! With the suggested alternative
comment above sched_remote_wakeup;
Reviewed-by: Mel Gorman
--
Mel Gorman
SUSE Labs
> Reported-by: Tejun Heo
> Signed-off-by: Peter Zijlstra (Intel)
s/Fixes: Fixes:/Fixes:/
Ok, very minor hazard that the same logic gets duplicated that someone
might try to "fix" but git blame should help. Otherwise, it makes sense as
I've received more than one "bug"
k in the "anti-guarantees" section
of memory-barriers.txt :(
sched_psi_wake_requeue can probably stay with the other three fields
given they are under the rq lock but sched_remote_wakeup needs to move
out.
--
Mel Gorman
SUSE Labs
So I'm now both confused and wondering why smp_wmb
made a difference at all.
--
Mel Gorman
SUSE Labs
865 3653
2.33 292.74 672.24 1/861 3709
0.85 239.46 630.17 1/859 3711
0.31 195.87 590.73 1/857 3713
0.11 160.22 553.76 1/853 3715
Without the patch, the load average stabilised at 244 on an otherwise
idle machine.
Fixes: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
On Mon, Nov 16, 2020 at 03:29:46PM +0000, Mel Gorman wrote:
> I did, it was the on_cpu ordering for the blocking case that had me
> looking at the smp_store_release and smp_cond_load_acquire in arm64 in
> the first place thinking that something in there must be breaking the
> on_cpu o
sched_migrated:1;
> +	unsigned			:0;
> 	unsigned			sched_remote_wakeup:1;
> +	unsigned			:0;
> #ifdef CONFIG_PSI
> 	unsigned			sched_psi_wake_requeue:1;
> #endif
I'll test this after the smp_wmb() test completes. While a clobbering may
be the issue, I also think the gap between the rq->nr_uninterruptible++
and smp_store_release(prev->on_cpu, 0) is relevant and a better candidate.
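For anyone unfamiliar with the quoted hunk, a zero-width bitfield forces
the next bitfield into a new storage unit, so a read-modify-write of one
group can no longer clobber flags in the other. A standalone illustration
(the struct is made up, not task_struct):

	#include <stdio.h>

	struct flags {
		unsigned f1:1;
		unsigned f2:1;
		unsigned :0;	/* close the current unit */
		unsigned f3:1;	/* fresh word, separate RMW target */
	};

	int main(void)
	{
		/* Grows from 4 to 8 bytes once the :0 splits the units. */
		printf("%zu\n", sizeof(struct flags));
		return 0;
	}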
--
Mel Gorman
SUSE Labs
On Mon, Nov 16, 2020 at 01:58:03PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 16, 2020 at 01:53:55PM +0100, Peter Zijlstra wrote:
> > On Mon, Nov 16, 2020 at 11:49:38AM +0000, Mel Gorman wrote:
> > > On Mon, Nov 16, 2020 at 09:10:54AM +0000, Mel Gorman wrote:
> > > &g
On Mon, Nov 16, 2020 at 01:11:03PM +0000, Will Deacon wrote:
> On Mon, Nov 16, 2020 at 09:10:54AM +0000, Mel Gorman wrote:
> > I got cc'd an internal bug report filed against a 5.8 and 5.9 kernel
> > that loadavg was "exploding" on aarch64 on machines acting as a build
On Mon, Nov 16, 2020 at 01:46:57PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 16, 2020 at 09:10:54AM +0000, Mel Gorman wrote:
> > Similarly, it's not clear why the arm64 implementation
> > does not call smp_acquire__after_ctrl_dep in the smp_load_acquire
> > impl
On Mon, Nov 16, 2020 at 11:49:38AM +0000, Mel Gorman wrote:
> On Mon, Nov 16, 2020 at 09:10:54AM +0000, Mel Gorman wrote:
> > I'll be looking again today to see can I find a mistake in the ordering for
> > how sched_contributes_to_load is handled but again, the lack of knowledge
&
On Mon, Nov 16, 2020 at 09:10:54AM +0000, Mel Gorman wrote:
> I'll be looking again today to see can I find a mistake in the ordering for
> how sched_contributes_to_load is handled but again, the lack of knowledge
> on the arm64 memory model means I'm a bit stuck and a second set of eye
n for it.
Protections on the filesystem level are for the inode, I can't imagine what
a filesystem would do with a protection change on the page table level
but maybe I'm not particularly imaginative today.
--
Mel Gorman
SUSE Labs
mistake in the ordering for
how sched_contributes_to_load is handled but again, the lack of knowledge
on the arm64 memory model means I'm a bit stuck and a second set of eyes
would be nice :(
--
Mel Gorman
SUSE Labs
rm this
> permission comparison at mmap() time. However, there is no corresponding
> ->mprotect() hook.
>
> Solution
>
>
> Add a vm_ops->mprotect() hook so that mprotect() operations which are
> inconsistent with any page's stashed intent can be rejected by t
> 99.5th: 94
> 99.9th: 94
> min=0, max=94
>
> Fixes: 0b0695f2b34a ("sched/fair: Rework load_balance()")
> Reported-by: Chris Mason
> Suggested-by: Rik van Riel
> Signed-off-by: Vincent Guittot
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
dst=5,val=20
>
> Link: https://lore.kernel.org/lkml/jhjtux5edo2.mog...@arm.com
> Signed-off-by: Valentin Schneider
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
ely. Assuming schedutil gets the default, it may be necessary
to either have a tunable or a kconfig that affects the initial PELT
signal as to whether it should start low and ramp up, pick a midpoint or
start high and scale down.
--
Mel Gorman
SUSE Labs
On Mon, Nov 09, 2020 at 10:24:11AM -0500, Phil Auld wrote:
> Hi,
>
> > On Fri, Nov 06, 2020 at 04:00:10PM +0000, Mel Gorman wrote:
> > On Fri, Nov 06, 2020 at 02:33:56PM +0100, Vincent Guittot wrote:
> > > On Fri, 6 Nov 2020 at 13:03, Mel Gorman
> > > wrote:
>
I'm guessing "[PATCH v2] sched/fair: check for idle core"
>
> Yes, Sorry I sent my answer before adding the link
>
Grand, that's added to the mix on top to see how both patches measure up
versus a revert. No guarantee I'll have full results by Monday. As usual,
the test grid is loaded up to the eyeballs.
--
Mel Gorman
SUSE Labs
On Fri, Nov 06, 2020 at 02:33:56PM +0100, Vincent Guittot wrote:
> On Fri, 6 Nov 2020 at 13:03, Mel Gorman wrote:
> >
> > On Wed, Nov 04, 2020 at 09:42:05AM +0000, Mel Gorman wrote:
> > > While it's possible that some other factor masked the impact of the patch,
>
On Wed, Nov 04, 2020 at 09:42:05AM +0000, Mel Gorman wrote:
> While it's possible that some other factor masked the impact of the patch,
> the fact it's neutral for two workloads in 5.10-rc2 is suspicious as it
> indicates that if the patch was implemented against 5.10-rc2, it would
-specific permission
 * checks need to be made before the mprotect is finalised.
 * No modifications should be done to the VMA, returns 0
 * if the mprotect is permitted.
 */
int (*mprotect)(struct vm_area_struct *vma,
		unsigned long start, unsigned long end,
		unsigned long newflags);
If a future driver *does* need to poke deeper into the VM for mprotect
then at least they'll have to explain why that's a good idea.
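As an illustration of the intended shape (foo_mprotect and its policy are
hypothetical, not from the SGX series), a driver would veto only the
transitions it cares about and otherwise return 0:

	/* Refuse to let a mapping gain execute permission after mmap(). */
	static int foo_mprotect(struct vm_area_struct *vma, unsigned long start,
				unsigned long end, unsigned long newflags)
	{
		if ((newflags & VM_EXEC) && !(vma->vm_flags & VM_EXEC))
			return -EACCES;

		return 0;
	}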
--
Mel Gorman
SUSE Labs
able performance as it'll have variable
performance with limited control of the "important" applications that
should use DRAM over PMEM. That's a long road but the step is not
incompatible with the long-term goal.
--
Mel Gorman
SUSE Labs
so visible from page fault microbenchmarks that scale the number of
threads. It's a vaguely similar class of problem but the patches are
taking very different approaches.
It'd been in my mind to consider reconciling that chunk with the
adjust_numa_imbalance but had not gotten around to seeing how
ly not have been merged. I've queued the tests on the remaining
machines to see if something more conclusive falls out.
--
Mel Gorman
SUSE Labs
sts, masking from
cpufreq changes, phase of the moon and just general plain old bad luck.
> I'll try to see if we can run some direct 5.8 - 5.9 tests like these.
>
That would be nice. While I often see false positive bisections for
performance bugs, the number of identical reports and different machines
made this more suspicious.
--
Mel Gorman
SUSE Labs
binding to one local node.
If it shows up in regressions, it'll be interesting to get a detailed
description of the workload. Pay particular attention to if THP is
disabled as I learned relatively recently that NUMA balancing with THP
disabled has higher overhead (which is hardly surprising).
Lacking data or a specific objection
Acked-by: Mel Gorman
--
Mel Gorman
SUSE Labs
the patch is a loss. I also had run reaim but not specifically for one
sub-test like this does and the generation of machines used was much
older than Gold 5318H. Grid or no grid, complete coverage is a challenge
--
Mel Gorman
SUSE Labs
ches so the sched domain groups it looks at are
smaller than happens on other machines.
Given the number of machines and workloads affected, can we revert and
retry? I have not tested against current mainline as scheduler details
have changed again but I can do so if desired.
--
Mel Gorman
SUSE Labs
on the
same machine were fine so overall;
Acked-by: Mel Gorman
Thanks.
--
Mel Gorman
SUSE Labs
On Fri, Oct 23, 2020 at 11:21:50AM +0200, Julia Lawall wrote:
>
>
> On Fri, 23 Oct 2020, Mel Gorman wrote:
>
> > On Thu, Oct 22, 2020 at 03:15:50PM +0200, Julia Lawall wrote:
> > > In the case of a thread wakeup, wake_affine determines whether a core
> >
ent Guittot
>
In principle, I think the patch is ok after the recent discussion. I'm
holding off an ack until a battery of tests on loads with different
levels of utilisation and wakeup patterns makes its way through a test
grid. It's based on Linus's tree mid-merge window that includes what is
in the scheduler pipeline
--
Mel Gorman
SUSE Labs
ms had been nominated at that time.
We have one, possibly two if Phil agrees. That's better than zero or
unfairly placing the full responsibility on the Intel guys that have been
testing it out.
--
Mel Gorman
SUSE Labs
n, but it's a useful safety net and a reasonable
way to deprecate a feature. It's also useful for bug creation -- User X
running whatever found that schedutil is worse than the old governor and
had to temporarily switch back. Repeat until complaining stops and then
tear out the old stuff.
When/if there is a patch setting schedutil as the default, cc suitable
distro people (Giovanni and myself for openSUSE). Other distros assuming
they're watching can nominate their own victim.
--
Mel Gorman
SUSE Labs
On Thu, Oct 22, 2020 at 05:25:14PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 22, 2020 at 03:52:50PM +0100, Mel Gorman wrote:
>
> > There are some questions
> > currently on whether schedutil is good enough when HWP is not available.
>
> Srinivas and Rafael will know be
0170803085115.r2jfz2lofy5sp...@techsingularity.net/)
It's schedutil's turn :P
--
Mel Gorman
SUSE Labs
On Wed, Oct 21, 2020 at 05:19:53PM +0200, Vincent Guittot wrote:
> On Wed, 21 Oct 2020 at 17:08, Mel Gorman wrote:
> >
> > On Wed, Oct 21, 2020 at 03:24:48PM +0200, Julia Lawall wrote:
> > > > I worry it's overkill because prev is always used if it is idle even
> &
The hit and miss rates are both higher but
the miss rate is acceptable.
As the hunk means that recent_used_cpu could have been set based
on a previous wakeup, it makes it unreliable for making cross-node
decisions. p->recent_used_cpu's primary purpose at the moment is to
avoid select_idle_sibling searches.
--
Mel Gorman
SUSE Labs
Looking at prev_cpu and this_cpu is a
crude approximation and the path is heavily limited in terms of how
clever it can be.
--
Mel Gorman
SUSE Labs