Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-12 Thread Andi Kleen
> Isn't a thread likely to clear the AVX registers at the end of a function
> that uses them.
> In particular this removes the massive overhead on certain cpus of
> switching between two AVX modes.
> So it is actually unlikely that XSAVE will need to save them at all?

Only if context switches only

RE: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-12 Thread David Laight
From: Aubrey Li
> Sent: 11 December 2018 00:25
>
> User space tools which do automated task placement need information
> about AVX-512 usage of tasks, because AVX-512 usage could cause core
> turbo frequency drop and impact the running task on the sibling CPU.
>
> The XSAVE hardware structure has

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Dave Hansen
On 12/11/18 4:59 PM, Li, Aubrey wrote:
>> maybe instead of a 1/0 bit, it's useful to store the timestamp of the last
>> time we found the task to use avx? (need to find a good time unit)
>>
>>
> Are you suggesting kernel does not do any translation, just provide a fact
> to the user space tool and

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Li, Aubrey
On 2018/12/12 8:14, Arjan van de Ven wrote:
> On 12/11/2018 3:46 PM, Li, Aubrey wrote:
>> On 2018/12/12 1:18, Dave Hansen wrote:
>>> On 12/10/18 4:24 PM, Aubrey Li wrote:
>>>> The tracking turns on the usage flag at the next context switch of
>>>> the task, but requires 3 consecutive context switch

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Dave Hansen
On 12/11/18 4:34 PM, Li, Aubrey wrote:
>> Is there a reason we shouldn't do:
>>
>> if (!cpu_feature_enabled(X86_FEATURE_AVX512F))
>> 	update_avx512_state(fpu);
>>
>> ?
>>
> Why _!_ ?

Sorry, got it backwards. I think I was considering having you do a if (!cpu_feature_enab

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Li, Aubrey
On 2018/12/12 1:20, Dave Hansen wrote:
> to update AVX512 state
>> + */
>> +static inline void update_avx512_state(struct fpu *fpu)
>> +{
>> +/*
>> + * AVX512 state is tracked here because its use is known to slow
>> + * the max clock speed of the core.
>> + *
>> + * However, A

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Arjan van de Ven
On 12/11/2018 3:46 PM, Li, Aubrey wrote:
> On 2018/12/12 1:18, Dave Hansen wrote:
>> On 12/10/18 4:24 PM, Aubrey Li wrote:
>>> The tracking turns on the usage flag at the next context switch of
>>> the task, but requires 3 consecutive context switches with no usage
>>> to clear it. This decay is required becaus

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Li, Aubrey
On 2018/12/12 1:18, Dave Hansen wrote:
> On 12/10/18 4:24 PM, Aubrey Li wrote:
>> The tracking turns on the usage flag at the next context switch of
>> the task, but requires 3 consecutive context switches with no usage
>> to clear it. This decay is required because well-written AVX-512
>> applicat

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Andi Kleen
On Tue, Dec 11, 2018 at 09:18:25AM -0800, Dave Hansen wrote:
> On 12/10/18 4:24 PM, Aubrey Li wrote:
> > The tracking turns on the usage flag at the next context switch of
> > the task, but requires 3 consecutive context switches with no usage
> > to clear it. This decay is required because well-wr

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Tim Chen
On 12/11/18 9:18 AM, Dave Hansen wrote:
> On 12/10/18 4:24 PM, Aubrey Li wrote:
>> The tracking turns on the usage flag at the next context switch of
>> the task, but requires 3 consecutive context switches with no usage
>> to clear it. This decay is required because well-written AVX-512
>> appl

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Dave Hansen
to update AVX512 state
> + */
> +static inline void update_avx512_state(struct fpu *fpu)
> +{
> + /*
> + * AVX512 state is tracked here because its use is known to slow
> + * the max clock speed of the core.
> + *
> + * However, AVX512-using tasks are expected to clear this

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Dave Hansen
On 12/10/18 4:24 PM, Aubrey Li wrote:
> The tracking turns on the usage flag at the next context switch of
> the task, but requires 3 consecutive context switches with no usage
> to clear it. This decay is required because well-written AVX-512
> applications are expected to clear this state when no

[PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-10 Thread Aubrey Li
User space tools which do automated task placement need information
about AVX-512 usage of tasks, because AVX-512 usage could cause core
turbo frequency drop and impact the running task on the sibling CPU.

The XSAVE hardware structure has bits that indicate when valid state is present in registers
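
For readers skimming the thread: the decay rule quoted in the replies (set the usage flag at the next context switch, clear it only after 3 consecutive context switches with no usage) can be sketched in a few lines of C. This is an illustration only, not the code from the patch. The struct avx512_track, its fields, AVX512_DECAY_SWITCHES and avx512_track_update() are all hypothetical names, and the state_live argument stands in for whatever the kernel reads from the saved XSAVE state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the names or layout used by the patch. */
#define AVX512_DECAY_SWITCHES 3	/* idle context switches before clearing */

struct avx512_track {
	bool in_use;		/* usage flag as reported to user space */
	uint8_t idle_switches;	/* consecutive switches with no AVX-512 state */
};

/*
 * Call once per context switch of the task. state_live is assumed to say
 * whether the saved XSAVE state showed valid AVX-512 register state.
 */
static inline void avx512_track_update(struct avx512_track *t, bool state_live)
{
	if (state_live) {
		/* Any observed usage sets the flag and restarts the decay. */
		t->in_use = true;
		t->idle_switches = 0;
		return;
	}

	if (!t->in_use)
		return;

	/* Clear the flag only after enough consecutive idle switches. */
	if (++t->idle_switches >= AVX512_DECAY_SWITCHES) {
		t->in_use = false;
		t->idle_switches = 0;
	}
}
```

With this sketch, a task that uses AVX-512, clears its registers for two context switches and then uses AVX-512 again keeps the flag set throughout; only three idle switches in a row drop it. The timestamp alternative raised elsewhere in the thread would replace the counter with a "last seen" time and leave the interpretation to user space.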