Re: [RFC PATCH v3 00/16] Core scheduling v3

2019-06-11 Thread Li, Aubrey
On 2019/6/6 23:26, Julien Desfossez wrote: > As mentioned above, we have come up with a fix for the long starvation > of untagged interactive threads competing for the same core with tagged > threads at the same priority. The idea is to detect the stall and boost > the stalling threads priority so

Re: [PATCH v18 1/3] proc: add /proc/<pid>/arch_status

2019-05-27 Thread Li, Aubrey
On 2019/5/24 11:18, Andrew Morton wrote: > On Thu, 25 Apr 2019 22:32:17 +0800 Aubrey Li > wrote: > >> The architecture specific information of the running processes >> could be useful to the userland. Add /proc/<pid>/arch_status >> interface support to examine process architecture specific >>

Re: [RFC PATCH v2 00/17] Core scheduling v2

2019-05-17 Thread Li, Aubrey
On 2019/5/18 8:58, Li, Aubrey wrote: > On 2019/4/30 12:42, Ingo Molnar wrote: >> >>>> What's interesting is how in the over-saturated case (the last three >>>> rows: 128, 256 and 512 total threads) coresched-SMT leaves 20-30% CPU >>>> performan

Re: [RFC PATCH v2 00/17] Core scheduling v2

2019-05-17 Thread Li, Aubrey
On 2019/4/30 12:42, Ingo Molnar wrote: > >>> What's interesting is how in the over-saturated case (the last three >>> rows: 128, 256 and 512 total threads) coresched-SMT leaves 20-30% CPU >>> performance on the floor according to the load figures. >> Sorry for a delay, I got a chance to obtain

Re: [RFC PATCH v2 00/17] Core scheduling v2

2019-04-29 Thread Li, Aubrey
On 2019/4/29 14:14, Ingo Molnar wrote: > > * Li, Aubrey wrote: > >>> I suspect it's pretty low, below 1% for all rows? >> >> Hope my this mail box works for this... >> >> .---

Re: [RFC PATCH v2 00/17] Core scheduling v2

2019-04-28 Thread Li, Aubrey
On 2019/4/28 20:17, Ingo Molnar wrote: > > * Aubrey Li wrote: > >> On Sun, Apr 28, 2019 at 5:33 PM Ingo Molnar wrote: >>> So because I'm a big fan of presenting data in a readable fashion, here >>> are your results, tabulated: >> >> I thought I tried my best to make it readable, but this one

Re: [PATCH v17 1/3] proc: add /proc/<pid>/arch_status

2019-04-25 Thread Li, Aubrey
On 2019/4/25 18:11, Enrico Weigelt, metux IT consult wrote: > On 25.04.19 03:50, Li, Aubrey wrote: > >>>> +>>> +config PROC_PID_ARCH_STATUS>>> + bool "Enable > /proc/<pid>/arch_status file">>>> Why is this switchable? x86 sel

Re: [PATCH v17 1/3] proc: add /proc/<pid>/arch_status

2019-04-25 Thread Li, Aubrey
On 2019/4/25 16:20, Thomas Gleixner wrote: > On Thu, 25 Apr 2019, Li, Aubrey wrote: >> On 2019/4/25 15:20, Thomas Gleixner wrote: >>> Let the arch select CONFIG_PROC_PID_ARCH_STATUS >> >> Sorry, I didn't get the point here, above you mentioned not mixing arch and >

Re: [PATCH v17 1/3] proc: add /proc/<pid>/arch_status

2019-04-25 Thread Li, Aubrey
On 2019/4/25 15:20, Thomas Gleixner wrote: > On Thu, 25 Apr 2019, Li, Aubrey wrote: > >> On 2019/4/25 5:18, Thomas Gleixner wrote: >>> On Mon, 22 Apr 2019, Aubrey Li wrote: >>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig >>>> index 5ad92419

Re: [PATCH v17 1/3] proc: add /proc/<pid>/arch_status

2019-04-24 Thread Li, Aubrey
On 2019/4/25 5:18, Thomas Gleixner wrote: > On Mon, 22 Apr 2019, Aubrey Li wrote: >> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig >> index 5ad92419be19..d5a9c5ddd453 100644 >> --- a/arch/x86/Kconfig >> +++ b/arch/x86/Kconfig >> @@ -208,6 +208,7 @@ config X86 >> select

Re: [PATCH v15 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-19 Thread Li, Aubrey
On 2019/4/18 21:00, Thomas Gleixner wrote: > On Wed, 17 Apr 2019, Andy Lutomirski wrote: > >> On Tue, Apr 16, 2019 at 4:01 PM Andrew Morton >> wrote: >>> >>> On Tue, 16 Apr 2019 14:32:48 +0800 Aubrey Li >>> wrote: >>> The architecture specific information of the running processes could

Re: [PATCH v15 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-17 Thread Li, Aubrey
On 2019/4/17 7:01, Andrew Morton wrote: > On Tue, 16 Apr 2019 14:32:48 +0800 Aubrey Li > wrote: > >> The architecture specific information of the running processes could >> be useful to the userland. Add support to examine process architecture >> specific information externally. > > The

Re: [PATCH v14 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-11 Thread Li, Aubrey
On 2019/4/11 9:02, Li, Aubrey wrote: > On 2019/4/10 22:54, Andy Lutomirski wrote: >> On Tue, Apr 9, 2019 at 8:40 PM Li, Aubrey wrote: >>> >>> On 2019/4/10 10:36, Li, Aubrey wrote: >>>> On 2019/4/10 10:25, Andy Lutomirski wrote: >>>>> O

Re: [PATCH v14 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-10 Thread Li, Aubrey
On 2019/4/10 22:54, Andy Lutomirski wrote: > On Tue, Apr 9, 2019 at 8:40 PM Li, Aubrey wrote: >> >> On 2019/4/10 10:36, Li, Aubrey wrote: >>> On 2019/4/10 10:25, Andy Lutomirski wrote: >>>> On Tue, Apr 9, 2019 at 7:20 PM Li, Aubrey >>>> wrote

Re: [PATCH v14 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-09 Thread Li, Aubrey
On 2019/4/10 10:36, Li, Aubrey wrote: > On 2019/4/10 10:25, Andy Lutomirski wrote: >> On Tue, Apr 9, 2019 at 7:20 PM Li, Aubrey wrote: >>> >>> On 2019/4/10 9:58, Andy Lutomirski wrote: >>>> On Tue, Apr 9, 2019 at 6:55 PM Aubrey Li wrote: >>>

Re: [PATCH v14 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-09 Thread Li, Aubrey
On 2019/4/10 10:25, Andy Lutomirski wrote: > On Tue, Apr 9, 2019 at 7:20 PM Li, Aubrey wrote: >> >> On 2019/4/10 9:58, Andy Lutomirski wrote: >>> On Tue, Apr 9, 2019 at 6:55 PM Aubrey Li wrote: >>>> >>>> The architecture specific informati

Re: [PATCH v14 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-09 Thread Li, Aubrey
On 2019/4/10 9:58, Andy Lutomirski wrote: > On Tue, Apr 9, 2019 at 6:55 PM Aubrey Li wrote: >> >> The architecture specific information of the running processes could >> be useful to the userland. Add support to examine process architecture >> specific information externally. >> >> Signed-off-by:

Re: [PATCH v13 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-07 Thread Li, Aubrey
On 2019/4/8 9:52, Andy Lutomirski wrote: > On Sun, Apr 7, 2019 at 5:38 PM Li, Aubrey wrote: >> >> On 2019/4/8 1:34, Andy Lutomirski wrote: >>> On Fri, Apr 5, 2019 at 12:32 PM Thomas Gleixner wrote: >>>> >>>> On Sun, 24 Feb 2019, Aubrey Li wrote:

Re: [PATCH v13 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-07 Thread Li, Aubrey
On 2019/4/7 23:46, Alexey Dobriyan wrote: > On Sun, Apr 07, 2019 at 09:02:38PM +0800, Li, Aubrey wrote: >> On 2019/4/7 5:41, Alexey Dobriyan wrote: >>> On Fri, Apr 05, 2019 at 09:32:35PM +0200, Thomas Gleixner wrote: >>>>> +/* Add support for architecture sp

Re: [PATCH v13 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-07 Thread Li, Aubrey
On 2019/4/8 1:34, Andy Lutomirski wrote: > On Fri, Apr 5, 2019 at 12:32 PM Thomas Gleixner wrote: >> >> On Sun, 24 Feb 2019, Aubrey Li wrote: >> >>> The architecture specific information of the running processes could >>> be useful to the userland. Add support to examine process architecture >>>

Re: [PATCH v13 1/3] /proc/pid/status: Add support for architecture specific output

2019-04-07 Thread Li, Aubrey
On 2019/4/7 5:41, Alexey Dobriyan wrote: > On Fri, Apr 05, 2019 at 09:32:35PM +0200, Thomas Gleixner wrote: >>> +/* Add support for architecture specific output in /proc/pid/status */ >>> +extern void arch_proc_pid_status(struct seq_file *m, struct task_struct >>> *task); > ^^ > >

Re: [PATCH v13 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-04-06 Thread Li, Aubrey
On 2019/4/6 4:27, Jann Horn wrote: > On Fri, Apr 5, 2019 at 10:02 PM Aubrey Li wrote: >> AVX-512 components use could cause core turbo frequency drop. So >> it's useful to expose AVX-512 usage elapsed time as a heuristic hint >> for the user space job scheduler to cluster the AVX-512 using tasks
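
A user-space job scheduler could consume this hint roughly as sketched below. This is only an illustration under stated assumptions: the field name AVX512_elapsed_ms comes from the patch titles in this index, but the line format (the name, a colon, whitespace, then a signed decimal value) and the helper parse_avx512_elapsed_ms() are assumptions, and the caller is assumed to have already read the proc file into a buffer.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find "AVX512_elapsed_ms:" in a NUL-terminated
 * buffer and store its value in *ms. Returns 0 on success, -1 if the
 * field is missing or malformed (*ms is left untouched on failure).
 */
static inline int parse_avx512_elapsed_ms(const char *buf, long *ms)
{
	static const char key[] = "AVX512_elapsed_ms:";
	const char *p = strstr(buf, key);
	char *end;
	long val;

	if (!p)
		return -1;

	p += sizeof(key) - 1;	/* skip past the field name and colon */
	errno = 0;
	val = strtol(p, &end, 10);	/* strtol skips leading whitespace */
	if (end == p || errno == ERANGE)
		return -1;

	*ms = val;
	return 0;
}
```

A scheduler would then compare the parsed value against a threshold of its own choosing; the snippets in this index do not specify one.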

Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-03-14 Thread Li, Aubrey
The original patch seems to be missing the following change for 32-bit. Thanks, -Aubrey diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index 9fbb10383434..78de28ebc45d 100644 --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -111,7 +111,7 @@ static u64

Re: [PATCH v12 3/3] Documentation/filesystems/proc.txt: add AVX512_elapsed_ms

2019-02-23 Thread Li, Aubrey
On 2019/2/24 2:16, Thomas Gleixner wrote: > On Thu, 21 Feb 2019, Aubrey Li wrote: >> @@ -45,6 +45,7 @@ Table of Contents >>3.9 /proc/<pid>/map_files - Information about memory mapped files >>3.10 /proc/<pid>/timerslack_ns - Task timerslack value >>3.11 /proc/<pid>/patch_state - Livepatch

Re: [PATCH v11 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-16 Thread Li, Aubrey
On 2019/2/16 20:55, Thomas Gleixner wrote: > On Fri, 15 Feb 2019, Li, Aubrey wrote: >> On 2019/2/14 19:29, Thomas Gleixner wrote: >> Under this scenario, the elapsed time becomes longer than normal indeed, see >> below: >> >> $ while [ 1 ]; do cat /proc/6985

Re: [PATCH v11 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-14 Thread Li, Aubrey
On 2019/2/14 19:29, Thomas Gleixner wrote: > On Wed, 13 Feb 2019, Aubrey Li wrote: > >> AVX-512 components use could cause core turbo frequency drop. So >> it's useful to expose AVX-512 usage elapsed time as a heuristic hint >> for the user space job scheduler to cluster the AVX-512 using tasks

Re: [PATCH v10 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-12 Thread Li, Aubrey
On 2019/2/12 21:24, Thomas Gleixner wrote: > On Tue, 12 Feb 2019, Li, Aubrey wrote: > >> On 2019/2/12 21:03, Thomas Gleixner wrote: >> I didn't include the first patch, because I saw it's already in tip >> tree. Did you use tip tree? > > Yes, that's my bad, forgot

Re: [PATCH v10 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-12 Thread Li, Aubrey
On 2019/2/12 21:03, Thomas Gleixner wrote: > On Tue, 12 Feb 2019, Aubrey Li wrote: > > arch/x86/kernel/fpu/xstate.c:1252:6: warning: no previous prototype for > ‘avx512_status’ [-Wmissing-prototypes] > void avx512_status(struct seq_file *m, struct task_struct *task) > ^ Sorry

Re: [PATCH v9 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-12 Thread Li, Aubrey
On 2019/2/12 19:55, Thomas Gleixner wrote: > On Tue, 12 Feb 2019, Li, Aubrey wrote: >> On 2019/2/12 19:19, Thomas Gleixner wrote: >>> On Tue, 12 Feb 2019, Li, Aubrey wrote: >>>> On 2019/2/12 16:22, Thomas Gleixner wrote: >>>>> On Tue, 12 Feb 2019,

Re: [PATCH v9 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-12 Thread Li, Aubrey
On 2019/2/12 19:27, Thomas Gleixner wrote: > On Tue, 12 Feb 2019, Li, Aubrey wrote: >> $ find . -name *.h | xargs grep arch_irq_stat >> ./arch/arm64/include/asm/hardirq.h:#define arch_irq_stat_cpu smp_irq_stat_cpu >> ./arch/arm/include/asm/hardirq.h:#define arch_irq_stat_c

Re: [PATCH v9 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-12 Thread Li, Aubrey
On 2019/2/12 19:19, Thomas Gleixner wrote: > On Tue, 12 Feb 2019, Li, Aubrey wrote: >> On 2019/2/12 16:22, Thomas Gleixner wrote: >>> On Tue, 12 Feb 2019, Aubrey Li wrote: >>>> diff --git a/arch/x86/include/asm/processor.h >>>> b/arch/x86/inclu

Re: [PATCH v9 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-12 Thread Li, Aubrey
On 2019/2/12 17:14, Li, Aubrey wrote: > On 2019/2/12 16:22, Thomas Gleixner wrote: >> On Tue, 12 Feb 2019, Aubrey Li wrote: >>> diff --git a/arch/x86/include/asm/processor.h >>> b/arch/x86/include/asm/processor.h >>> index d53c54b842da..60ee932070fe 1

Re: [PATCH v9 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-12 Thread Li, Aubrey
On 2019/2/12 16:22, Thomas Gleixner wrote: > On Tue, 12 Feb 2019, Aubrey Li wrote: >> diff --git a/arch/x86/include/asm/processor.h >> b/arch/x86/include/asm/processor.h >> index d53c54b842da..60ee932070fe 100644 >> --- a/arch/x86/include/asm/processor.h >> +++ b/arch/x86/include/asm/processor.h

Re: [PATCH v8 1/3] x86/fpu: track AVX-512 usage of tasks

2019-01-31 Thread Li, Aubrey
Hi Thomas, Just a gentle reminder in case you didn't get a chance to look at this version. To address your concern about jiffies_64 on a 32-bit kernel, I use jiffies here instead. To address the jiffies wrap-around issue, I use the trick from the kernel macros time_before/after, that is, as long as the
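
The wrap-around trick mentioned in this message can be sketched in user space as follows. This is an illustration only: ticks_after() and ticks_elapsed() are hypothetical helpers that mirror the signed-subtraction idea behind the kernel's time_after(), and a fixed 32-bit counter stands in for jiffies on a 32-bit kernel. The comparison is meaningful only while the two timestamps are less than half the counter range apart.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for time_after(a, b): true if tick count a is
 * later than tick count b, even if the 32-bit counter wrapped between
 * them. Valid only while the values are less than 2^31 ticks apart.
 */
static inline bool ticks_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Ticks elapsed from 'then' to 'now'; unsigned subtraction absorbs the wrap. */
static inline uint32_t ticks_elapsed(uint32_t now, uint32_t then)
{
	return now - then;
}
```

The design point is that the difference, not the raw values, is compared, so a counter that has wrapped from 0xFFFFFFF0 to 3 still reads as 19 ticks later.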

Re: [PATCH v6 1/3] x86/fpu: track AVX-512 usage of tasks

2018-12-18 Thread Li, Aubrey
On 2018/12/19 5:38, Andi Kleen wrote: >> I misunderstood, you mean 32bit kernel, not 32bit machine. Theoretically >> 32bit >> kernel can use AVX512, but not sure if anyone use it like this. >> get_jiffies_64() >> includes jiffies_lock ops so not good in context switch. So I want to use raw >>

Re: [PATCH v6 1/3] x86/fpu: track AVX-512 usage of tasks

2018-12-18 Thread Li, Aubrey
On 2018/12/19 1:14, Dave Hansen wrote: > On 12/18/18 7:32 AM, Thomas Gleixner wrote: >> What exactly prevents a 32bit kernel from having the AVX512 feature bit >> set? And if it cannot be set on 32bit, then why are you compiling that code >> in at all? > > There are three different AVX-512 states

Re: [PATCH v6 1/3] x86/fpu: track AVX-512 usage of tasks

2018-12-18 Thread Li, Aubrey
On 2018/12/18 23:32, Thomas Gleixner wrote: > On Tue, 18 Dec 2018, Li, Aubrey wrote: > >> On 2018/12/18 22:14, Thomas Gleixner wrote: >>> On Tue, 18 Dec 2018, Aubrey Li wrote: >>>> diff --git a/arch/x86/include/asm/fpu/internal.h >>>> b/arch/x86/inclu

Re: [PATCH v6 1/3] x86/fpu: track AVX-512 usage of tasks

2018-12-18 Thread Li, Aubrey
On 2018/12/18 22:14, Thomas Gleixner wrote: > On Tue, 18 Dec 2018, Aubrey Li wrote: >> diff --git a/arch/x86/include/asm/fpu/internal.h >> b/arch/x86/include/asm/fpu/internal.h >> index a38bf5a1e37a..8778ac172255 100644 >> --- a/arch/x86/include/asm/fpu/internal.h >> +++

Re: [RESEND PATCH v5 1/3] x86/fpu: track AVX-512 usage of tasks

2018-12-18 Thread Li, Aubrey
On 2018/12/18 16:33, Thomas Gleixner wrote: > Aubrey, > > On Tue, 18 Dec 2018, Aubrey Li wrote: > > RESEND > > Please don't do that. This is not a resend because you changed something, > so it's new version. Usually I ignore resends when I have the original > submission already lined up for

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Li, Aubrey
On 2018/12/12 8:14, Arjan van de Ven wrote: > On 12/11/2018 3:46 PM, Li, Aubrey wrote: >> On 2018/12/12 1:18, Dave Hansen wrote: >>> On 12/10/18 4:24 PM, Aubrey Li wrote: >>>> The tracking turns on the usage flag at the next context switch of >>>> th

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Li, Aubrey
On 2018/12/12 1:20, Dave Hansen wrote: > to update AVX512 state >> + */ >> +static inline void update_avx512_state(struct fpu *fpu) >> +{ >> +/* >> + * AVX512 state is tracked here because its use is known to slow >> + * the max clock speed of the core. >> + * >> + * However,

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Li, Aubrey
On 2018/12/12 1:18, Dave Hansen wrote: > On 12/10/18 4:24 PM, Aubrey Li wrote: >> The tracking turns on the usage flag at the next context switch of >> the task, but requires 3 consecutive context switches with no usage >> to clear it. This decay is required because well-written AVX-512 >>
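
The decay described in the quoted changelog can be sketched as below. This is a user-space illustration, not the patch's code: struct avx512_track, avx512_track_update() and AVX512_DECAY_SWITCHES are hypothetical names, and the caller is assumed to pass in whether AVX-512 state was seen in use at the current context switch.

```c
#include <stdbool.h>

/* Consecutive context switches with no usage needed to clear the flag. */
#define AVX512_DECAY_SWITCHES 3

/* Hypothetical per-task tracking state (not the kernel's struct fpu). */
struct avx512_track {
	bool in_use;		/* usage flag reported for the task */
	unsigned int idle;	/* consecutive switches with no usage seen */
};

/*
 * Called once per context switch of the task. 'seen' says whether
 * AVX-512 state was observed in use at this switch.
 */
static inline void avx512_track_update(struct avx512_track *t, bool seen)
{
	if (seen) {
		/* Any observed usage sets the flag and restarts the decay. */
		t->in_use = true;
		t->idle = 0;
		return;
	}

	/* Clear the flag only after enough consecutive idle switches. */
	if (t->in_use && ++t->idle >= AVX512_DECAY_SWITCHES) {
		t->in_use = false;
		t->idle = 0;
	}
}
```

With this shape, a single context switch that happens to observe no usage does not clear the flag; only an uninterrupted run of three does.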

Re: [PATCH v3 1/2] x86/fpu: track AVX-512 usage of tasks

2018-11-25 Thread Li, Aubrey
On 2018/11/18 22:03, Samuel Neves wrote: > On 11/17/18 12:36 AM, Li, Aubrey wrote: >> On 2018/11/17 7:10, Dave Hansen wrote: >>> Just to be clear: there are 3 AVX-512 XSAVE states: >>> >>> XFEATURE_OPMASK, >>> XFEATURE_ZMM_Hi256, >>

Re: [PATCH v3 2/2] proc: add /proc/<pid>/arch_state

2018-11-21 Thread Li, Aubrey
On 2018/11/21 17:53, Peter Zijlstra wrote: > On Wed, Nov 21, 2018 at 09:19:36AM +0100, Peter Zijlstra wrote: >> On Wed, Nov 21, 2018 at 09:39:00AM +0800, Li, Aubrey wrote: >>>> Also; you were going to shop around with the other architectures to see >>>> what the

Re: [PATCH v3 2/2] proc: add /proc/<pid>/arch_state

2018-11-20 Thread Li, Aubrey
On 2018/11/20 1:39, Peter Zijlstra wrote: > On Thu, Nov 15, 2018 at 07:00:07AM +0800, Aubrey Li wrote: >> Add a /proc/<pid>/arch_state interface to expose per-task cpu specific >> state values. >> >> Exposing AVX-512 Hi16_ZMM registers usage is for the user space job >> scheduler to cluster AVX-512

Re: [PATCH v3 1/2] x86/fpu: track AVX-512 usage of tasks

2018-11-20 Thread Li, Aubrey
On 2018/11/18 22:03, Samuel Neves wrote: > On 11/17/18 12:36 AM, Li, Aubrey wrote: >> On 2018/11/17 7:10, Dave Hansen wrote: >>> Just to be clear: there are 3 AVX-512 XSAVE states: >>> >>> XFEATURE_OPMASK, >>> XFEATURE_ZMM_Hi256, >>

Re: [PATCH v3 1/2] x86/fpu: track AVX-512 usage of tasks

2018-11-16 Thread Li, Aubrey
On 2018/11/17 7:10, Dave Hansen wrote: > On 11/15/18 4:21 PM, Li, Aubrey wrote: >> "Core cycles where the core was running with power delivery for license >> level 2 (introduced in Skylake Server microarchitecture). This includes >> high current AVX 512-bit instruc

Re: [PATCH v3 2/2] proc: add /proc/<pid>/arch_state

2018-11-15 Thread Li, Aubrey
On 2018/11/15 23:18, Dave Hansen wrote: > On 11/14/18 3:00 PM, Aubrey Li wrote: >> +void arch_thread_state(struct seq_file *m, struct task_struct *task) >> +{ >> +/* >> + * Report AVX-512 Hi16_ZMM registers usage >> + */ >> +if (task->thread.fpu.hi16zmm_usage) >> +

Re: [PATCH v3 1/2] x86/fpu: track AVX-512 usage of tasks

2018-11-15 Thread Li, Aubrey
On 2018/11/15 23:40, Dave Hansen wrote: > On 11/14/18 3:00 PM, Aubrey Li wrote: >> AVX-512 component has 3 states, only Hi16_ZMM state causes notable >> frequency drop. Add per task Hi16_ZMM state tracking to context switch. > > Just curious, but is there any public documentation of this? It
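
The distinction between the three AVX-512 states and the Hi16_ZMM state singled out here can be sketched as a simple mask test. This is an illustration under stated assumptions: the helper names are hypothetical, the in-use mask is passed in as a plain integer so the code runs on any machine, and the macros are defined locally, with bit positions 5, 6 and 7 following the XSAVE state-component numbering.

```c
#include <stdbool.h>
#include <stdint.h>

/* XSAVE state-component bits for the three AVX-512 states (local defines). */
#define XFEATURE_MASK_OPMASK	(1ULL << 5)	/* k0-k7 mask registers */
#define XFEATURE_MASK_ZMM_Hi256	(1ULL << 6)	/* upper halves of ZMM0-ZMM15 */
#define XFEATURE_MASK_Hi16_ZMM	(1ULL << 7)	/* ZMM16-ZMM31 */
#define XFEATURE_MASK_AVX512	(XFEATURE_MASK_OPMASK | \
				 XFEATURE_MASK_ZMM_Hi256 | \
				 XFEATURE_MASK_Hi16_ZMM)

/* Hypothetical helper: true if any AVX-512 state component is in use. */
static inline bool xinuse_has_avx512(uint64_t xinuse)
{
	return (xinuse & XFEATURE_MASK_AVX512) != 0;
}

/* Hypothetical helper: true only if the Hi16_ZMM component is in use. */
static inline bool xinuse_has_hi16_zmm(uint64_t xinuse)
{
	return (xinuse & XFEATURE_MASK_Hi16_ZMM) != 0;
}
```

For example, a mask with only bits 5 and 6 set reports AVX-512 state in use but not Hi16_ZMM, which is the case this version of the patch chooses not to track.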

Re: [RFC PATCH v2 1/2] x86/fpu: detect AVX task

2018-11-13 Thread Li, Aubrey
On 2018/11/13 18:25, David Laight wrote: > From: Li, Aubrey >> Sent: 12 November 2018 01:41 > ... >> VZEROUPPER instruction resets the init state. If context switch happens >> to occur exactly after VZEROUPPER instruction, XINUSE bitmap is empty(all >> zeros), which i

Re: [RFC PATCH v2 1/2] x86/fpu: detect AVX task

2018-11-12 Thread Li, Aubrey
On 2018/11/12 23:46, Dave Hansen wrote: > On 11/11/18 9:38 PM, Li, Aubrey wrote: > >>> Do we want this, or do we want something more time-based? >>> >> This counter is introduced here to solve the race of context switch and >> VZEROUPPER. 3 context switches

Re: [RFC PATCH v1 2/2] proc: add /proc/<pid>/thread_state

2018-11-11 Thread Li, Aubrey
On 2018/11/12 13:31, Ingo Molnar wrote: > > * Peter Zijlstra wrote: > >> On Thu, Nov 08, 2018 at 07:32:46AM +0100, Ingo Molnar wrote: >>> >>> * Aubrey Li wrote: >>> Expose the per-task cpu specific thread state value, it's helpful for userland to classify and schedule the tasks by

Re: [RFC PATCH v2 1/2] x86/fpu: detect AVX task

2018-11-11 Thread Li, Aubrey
Hi Dave, Thanks for your comments! On 2018/11/12 10:32, Dave Hansen wrote: > On 11/7/18 9:16 AM, Aubrey Li wrote: >> XSAVES and its variants use init optimization to reduce the amount of >> data that they save to memory during context switch. Init optimization >> uses the state component bitmap

Re: [RFC PATCH v2 1/2] x86/fpu: detect AVX task

2018-11-11 Thread Li, Aubrey
On 2018/11/9 19:21, Thomas Gleixner wrote: > Aubrey, > > On Thu, 8 Nov 2018, Aubrey Li wrote: > >> Subject: x86/fpu: detect AVX task > > What is an AVX task? I know what you mean, but for the casual reader this > is not very informative. So something like: > > x86/fpu: Track AVX usage

Re: [RFC PATCH v1 2/2] proc: add /proc/<pid>/thread_state

2018-11-08 Thread Li, Aubrey
On 2018/11/8 18:17, Peter Zijlstra wrote: > On Thu, Nov 08, 2018 at 07:32:46AM +0100, Ingo Molnar wrote: >> >> * Aubrey Li wrote: >> >>> Expose the per-task cpu specific thread state value, it's helpful >>> for userland to classify and schedule the tasks by different policies >> >> That's pretty

Re: [RFC PATCH v1 1/2] x86/fpu: detect AVX task

2018-11-07 Thread Li, Aubrey
On 2018/11/8 1:41, Tim Chen wrote: > On 11/06/2018 10:23 AM, Aubrey Li wrote: > >> +static inline void update_avx_state(struct avx_state *avx) >> +{ >> +/* >> + * Check if XGETBV with ECX = 1 supported. XGETBV with ECX = 1 >> + * returns the logical-AND of XCR0 and XINUSE. XINUSE is a

Re: [RFC/RFT][PATCH 4/7] cpuidle: menu: Split idle duration prediction from state selection

2018-03-06 Thread Li, Aubrey
On 2018/3/6 16:45, Peter Zijlstra wrote: > On Tue, Mar 06, 2018 at 10:15:10AM +0800, Li, Aubrey wrote: >> On 2018/3/5 21:53, Peter Zijlstra wrote: >>> On Mon, Mar 05, 2018 at 02:05:10PM +0100, Rafael J. Wysocki wrote: >>>> On Mon, Mar 5, 2018 at 1:50 PM, Peter Zijls

Re: [RFC/RFT][PATCH 4/7] cpuidle: menu: Split idle duration prediction from state selection

2018-03-05 Thread Li, Aubrey
On 2018/3/5 21:53, Peter Zijlstra wrote: > On Mon, Mar 05, 2018 at 02:05:10PM +0100, Rafael J. Wysocki wrote: >> On Mon, Mar 5, 2018 at 1:50 PM, Peter Zijlstra wrote: >>> On Mon, Mar 05, 2018 at 12:47:23PM +0100, Rafael J. Wysocki wrote: > IOW, the target residency of the selected state

Re: [RFC PATCH v2 0/8] Introduct cpu idle prediction functionality

2017-11-29 Thread Li, Aubrey
Hi, Thanks for Rafael's comments on V2. I'd like to ping here to see which direction this proposal should take and what fundamental changes it should make. I'm also open to any suggestions if my proposal is not on the right track. Thanks, -Aubrey On 2017/9/30 15:20, Aubrey Li wrote: >

Re: [RFC PATCH v2 0/8] Introduct cpu idle prediction functionality

2017-10-17 Thread Li, Aubrey
On 2017/10/17 8:07, Rafael J. Wysocki wrote: > On Monday, October 16, 2017 9:44:41 AM CEST Li, Aubrey wrote: >> >> Or you concern why the threshold can't simply be tick interval? > > That I guess. > >> For the latter, if the threshold is close/equal to the

Re: [RFC PATCH v2 2/8] cpuidle: record the overhead of idle entry

2017-10-17 Thread Li, Aubrey
On 2017/10/17 8:05, Rafael J. Wysocki wrote: > On Monday, October 16, 2017 5:11:57 AM CEST Li, Aubrey wrote: >> On 2017/10/14 8:35, Rafael J. Wysocki wrote: >>> On Saturday, September 30, 2017 9:20:28 AM CEST Aubrey Li wrote: >>>> Record the overhead of idle entry in

Re: [RFC PATCH v2 6/8] cpuidle: make fast idle threshold tunable

2017-10-17 Thread Li, Aubrey
On 2017/10/17 8:01, Rafael J. Wysocki wrote: > On Monday, October 16, 2017 8:00:45 AM CEST Li, Aubrey wrote: >> On 2017/10/14 8:59, Rafael J. Wysocki wrote: >>> On Saturday, September 30, 2017 9:20:32 AM CEST Aubrey Li wrote: >>>> Add a knob to make fast idle threshol

Re: [RFC PATCH v2 5/8] timers: keep sleep length updated as needed

2017-10-17 Thread Li, Aubrey
On 2017/10/17 7:58, Rafael J. Wysocki wrote: > On Monday, October 16, 2017 8:46:41 AM CEST Li, Aubrey wrote: >> On 2017/10/14 8:56, Rafael J. Wysocki wrote: >>> On Saturday, September 30, 2017 9:20:31 AM CEST Aubrey Li wrote: >>>> sleep length indicates how long

Re: [RFC PATCH v2 3/8] cpuidle: add a new predict interface

2017-10-16 Thread Li, Aubrey
On 2017/10/14 9:27, Rafael J. Wysocki wrote: >> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c >> index 0951dac..8704f3c 100644 >> --- a/kernel/sched/idle.c >> +++ b/kernel/sched/idle.c >> @@ -225,6 +225,7 @@ static void do_idle(void) >> */ >> __current_set_polling(); >>

Re: [RFC PATCH v2 3/8] cpuidle: add a new predict interface

2017-10-16 Thread Li, Aubrey
On 2017/10/14 8:45, Rafael J. Wysocki wrote: > On Saturday, September 30, 2017 9:20:29 AM CEST Aubrey Li wrote: >> For the governor has predict functionality, add a new predict >> interface in cpuidle framework to call and use it. > > Care to describe how it is intended to work? > > Also this

Re: [RFC PATCH v2 0/8] Introduct cpu idle prediction functionality

2017-10-16 Thread Li, Aubrey
On 2017/10/14 9:14, Rafael J. Wysocki wrote: > On Saturday, September 30, 2017 9:20:26 AM CEST Aubrey Li wrote: >> We found under some latency intensive workloads, short idle periods occurs >> very common, then idle entry and exit path starts to dominate, so it's >> important to optimize them. To

Re: [RFC PATCH v2 5/8] timers: keep sleep length updated as needed

2017-10-16 Thread Li, Aubrey
On 2017/10/14 8:56, Rafael J. Wysocki wrote: > On Saturday, September 30, 2017 9:20:31 AM CEST Aubrey Li wrote: >> sleep length indicates how long we'll be idle. Currently, it's updated >> only when tick nohz enters. These patch series make a new requirement >> with tick, so we should keep sleep

Re: [RFC PATCH v2 4/8] tick/nohz: keep tick on for a fast idle

2017-10-16 Thread Li, Aubrey
On 2017/10/16 14:25, Mike Galbraith wrote: > On Mon, 2017-10-16 at 13:34 +0800, Li, Aubrey wrote: >> On 2017/10/16 12:45, Mike Galbraith wrote: >>> On Mon, 2017-10-16 at 11:26 +0800, Li, Aubrey wrote: >>>> >>>> I'll try to move quiet_vmstat() into the nor

Re: [RFC PATCH v2 7/8] cpuidle: introduce irq timing to make idle prediction

2017-10-16 Thread Li, Aubrey
On 2017/10/14 9:01, Rafael J. Wysocki wrote: > On Saturday, September 30, 2017 9:20:33 AM CEST Aubrey Li wrote: >> Introduce irq timings output as a factor to predict the duration >> of the coming idle >> >> @@ -342,13 +343,27 @@ void cpuidle_entry_end(void) >> void cpuidle_predict(void) >> { >>

Re: [RFC PATCH v2 6/8] cpuidle: make fast idle threshold tunable

2017-10-16 Thread Li, Aubrey
On 2017/10/14 8:59, Rafael J. Wysocki wrote: > On Saturday, September 30, 2017 9:20:32 AM CEST Aubrey Li wrote: >> Add a knob to make fast idle threshold tunable >> >> Signed-off-by: Aubrey Li <aubrey...@linux.intel.com> > > I first of all am not sure about the need to add a tunable for this at all > in the first place.

Re: [RFC PATCH v2 4/8] tick/nohz: keep tick on for a fast idle

2017-10-15 Thread Li, Aubrey
On 2017/10/16 12:45, Mike Galbraith wrote: > On Mon, 2017-10-16 at 11:26 +0800, Li, Aubrey wrote: >> >> I'll try to move quiet_vmstat() into the normal idle branch if this patch >> series >> are reasonable. Is fast_idle a good indication for it? > > see x86_
