Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-30 Thread Len Brown
On Tue, Mar 30, 2021 at 4:28 AM Thomas Gleixner wrote: > > Len, > > On Mon, Mar 29 2021 at 18:16, Len Brown wrote: > > On Mon, Mar 29, 2021 at 2:49 PM Thomas Gleixner wrote: > > Let me know if this problem description is fair: > > > > Many-core Xeon servers will support AMX, and when I run an

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-30 Thread Thomas Gleixner
Len, On Mon, Mar 29 2021 at 18:16, Len Brown wrote: > On Mon, Mar 29, 2021 at 2:49 PM Thomas Gleixner wrote: > Let me know if this problem description is fair: > > Many-core Xeon servers will support AMX, and when I run an AMX application > on one, when I take an interrupt with AMX INIT=0, Linux

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-29 Thread Len Brown
On Mon, Mar 29, 2021 at 2:49 PM Thomas Gleixner wrote: > According to documentation it is irrelevant whether AMX usage is > disabled via XCR0, CR4.OSXSAVE or XFD[18]. In any case the effect of > AMX INIT=0 will prevent C6. > > As I explained in great length there are enough ways to get into a >

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-29 Thread Len Brown
On Mon, Mar 29, 2021 at 1:43 PM Andy Lutomirski wrote: > > *switching* XCR0 on context switch is slow, but perfectly legal. > > How slow is it? And how slow is switching XFD? XFD is definitely > serializing? XCR0 writes in a VM guest cause a VMEXIT.. XFD writes in a VM guest do not. I will

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-29 Thread Thomas Gleixner
On Mon, Mar 29 2021 at 11:43, Len Brown wrote: > On Mon, Mar 29, 2021 at 9:33 AM Thomas Gleixner wrote: > But yes, if a bare metal OS doesn't support any threading libraries > that query XCR0 with xgetbv, and they don't care about the performance > impact of switching XCR0, they could choose to

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-29 Thread Bae, Chang Seok
On Mar 26, 2021, at 09:34, Jann Horn wrote: > On Sun, Feb 21, 2021 at 7:56 PM Chang S. Bae wrote: >> >> + if (handle_xfirstuse_event(&current->thread.fpu)) >> + return; > > What happens if handle_xfirstuse_event() fails because vmalloc() > failed in alloc_xstate_buffer()? I think

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-29 Thread Andy Lutomirski
> On Mar 29, 2021, at 9:06 AM, Len Brown wrote: > > On Mon, Mar 29, 2021 at 11:43 AM Len Brown wrote: >> >> On Mon, Mar 29, 2021 at 9:33 AM Thomas Gleixner wrote: >> I found the author of this passage, and he agreed to revise it to say this was targeted primarily at VMMs. >>>

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-29 Thread Len Brown
On Mon, Mar 29, 2021 at 11:43 AM Len Brown wrote: > > On Mon, Mar 29, 2021 at 9:33 AM Thomas Gleixner wrote: > > > > I found the author of this passage, and he agreed to revise it to say this > > > was targeted primarily at VMMs. > > > > Why would this only be a problem for VMMs? > > VMMs may have

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-29 Thread Len Brown
On Mon, Mar 29, 2021 at 9:33 AM Thomas Gleixner wrote: > > I found the author of this passage, and he agreed to revise it to say this > > was targeted primarily at VMMs. > > Why would this only be a problem for VMMs? VMMs may have to emulate different hardware for different guest OS's, and they

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-29 Thread Thomas Gleixner
On Mon, Mar 29 2021 at 09:14, Len Brown wrote: > On Sat, Mar 20, 2021 at 6:14 PM Thomas Gleixner wrote: >> >> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote: >> > + >> > +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */ >> > +static inline void xdisable_switch(struct fpu

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-29 Thread Len Brown
On Sat, Mar 20, 2021 at 6:14 PM Thomas Gleixner wrote: > > On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote: > > + > > +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */ > > +static inline void xdisable_switch(struct fpu *prev, struct fpu *next) > > +{ > > + if

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-26 Thread Jann Horn
On Sun, Feb 21, 2021 at 7:56 PM Chang S. Bae wrote: > Intel's Extended Feature Disable (XFD) feature is an extension of the XSAVE > architecture. XFD allows the kernel to enable a feature state in XCR0 and > to receive a #NM trap when a task uses instructions accessing that state. > In this way,

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-25 Thread Liu, Jing2
For AMX, we must still reserve the space, but we are not going to write zeros for clean state. We do this in software by checking XINUSE=0, and clearing the xstate_bf for the XSAVE. As a result, for XINUSE=0, we can skip writing the zeros, even though we can't compress the space. So my

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-25 Thread Bae, Chang Seok
On Mar 24, 2021, at 22:12, Liu, Jing2 wrote: > On 3/25/2021 5:09 AM, Len Brown wrote: >> >> For AMX, we must still reserve the space, but we are not going to write zeros >> for clean state. We do this in software by checking XINUSE=0, and clearing >> the xstate_bf for the XSAVE. As a result,

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-24 Thread Liu, Jing2
On 3/25/2021 5:09 AM, Len Brown wrote: On Tue, Mar 23, 2021 at 11:15 PM Liu, Jing2 wrote: IMO, the problem with AVX512 state is that we guaranteed it will be zero for XINUSE=0. That means we have to write 0's on saves. why "we have to write 0's on saves" when XINUSE=0. Since due to SDM,

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-24 Thread Andy Lutomirski
> On Mar 24, 2021, at 2:58 PM, Dave Hansen wrote: > > On 3/24/21 2:42 PM, Andy Lutomirski wrote: > 3. user space always uses fully uncompacted XSAVE buffers. > There is no reason we have to do this for new states. Arguably we shouldn’t for AMX to avoid yet another altstack

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-24 Thread Dave Hansen
On 3/24/21 2:42 PM, Andy Lutomirski wrote: 3. user space always uses fully uncompacted XSAVE buffers. >>> There is no reason we have to do this for new states. Arguably we >>> shouldn’t for AMX to avoid yet another altstack explosion. >> The thing that's worried me is that the list of

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-24 Thread Andy Lutomirski
> On Mar 24, 2021, at 2:30 PM, Dave Hansen wrote: > > On 3/24/21 2:26 PM, Andy Lutomirski wrote: >>> 3. user space always uses fully uncompacted XSAVE buffers. >>> >> There is no reason we have to do this for new states. Arguably we >> shouldn’t for AMX to avoid yet another altstack

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-24 Thread Dave Hansen
On 3/24/21 2:26 PM, Andy Lutomirski wrote: >> 3. user space always uses fully uncompacted XSAVE buffers. >> > There is no reason we have to do this for new states. Arguably we > shouldn’t for AMX to avoid yet another altstack explosion. The thing that's worried me is that the list of OS-enabled

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-24 Thread Andy Lutomirski
> On Mar 24, 2021, at 2:09 PM, Len Brown wrote: > > On Tue, Mar 23, 2021 at 11:15 PM Liu, Jing2 > wrote: > >>> IMO, the problem with AVX512 state >>> is that we guaranteed it will be zero for XINUSE=0. >>> That means we have to write 0's on saves. > >> why "we have to write 0's on

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-24 Thread Len Brown
On Tue, Mar 23, 2021 at 11:15 PM Liu, Jing2 wrote: > > IMO, the problem with AVX512 state > > is that we guaranteed it will be zero for XINUSE=0. > > That means we have to write 0's on saves. > why "we have to write 0's on saves" when XINUSE=0. > > Since due to SDM, if XINUSE=0, the XSAVES will

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-24 Thread Dave Hansen
On 3/23/21 2:52 PM, Bae, Chang Seok wrote: >> "System software may disable use of Intel AMX by clearing XCR0[18:17], by >> clearing CR4.OSXSAVE, or by setting IA32_XFD[18]. It is recommended that >> system software initialize AMX state (e.g., by executing TILERELEASE) >> before doing so.
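For readers following the quoted passage, the three disable paths it names can be modeled with plain bit masks. This is only a user-space sketch, not kernel code: struct demo_regs and demo_amx_usable() are hypothetical names, and the register values are ordinary integers rather than real XCR0/CR4/MSR accesses. The bit positions are the ones given in the quote (XCR0[18:17], IA32_XFD[18]) plus CR4.OSXSAVE at bit 18.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XCR0_AMX_MASK     (3ull << 17)  /* XCR0[18:17]: tile config + tile data */
#define CR4_OSXSAVE       (1ull << 18)  /* CR4.OSXSAVE */
#define XFD_AMX_TILEDATA  (1ull << 18)  /* IA32_XFD[18] */

static_assert(XCR0_AMX_MASK == 0x60000ull, "XCR0[18:17] covers bits 17 and 18");

/* Hypothetical snapshot of the three registers; plain integers only. */
struct demo_regs {
	uint64_t xcr0;
	uint64_t cr4;
	uint64_t xfd;
};

/*
 * AMX counts as usable in this model only if none of the three
 * disable paths from the quoted text is in effect.
 */
static bool demo_amx_usable(const struct demo_regs *r)
{
	return (r->cr4 & CR4_OSXSAVE) &&
	       (r->xcr0 & XCR0_AMX_MASK) == XCR0_AMX_MASK &&
	       !(r->xfd & XFD_AMX_TILEDATA);
}
```

Note that the model says nothing about whether AMX state is in INIT; per the thread, that is a separate question from whether use of AMX is disabled.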

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-23 Thread Liu, Jing2
On 3/24/2021 5:01 AM, Len Brown wrote: I have an obnoxious question: do we really want to use the XFD mechanism? Obnoxious questions are often the most valuable! :-) [...] cheers, Len Brown, Intel Open Source Technology Center ps. I agree that un-necessary XINUSE=1 is possible.

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-23 Thread Bae, Chang Seok
On Mar 20, 2021, at 15:13, Thomas Gleixner wrote: > On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote: >> + >> +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */ >> +static inline void xdisable_switch(struct fpu *prev, struct fpu *next) >> +{ >> +if

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-23 Thread Len Brown
> I have an obnoxious question: do we really want to use the XFD mechanism? Obnoxious questions are often the most valuable! :-) > Right now, glibc, and hence most user space code, blindly uses > whatever random CPU features are present for no particularly good > reason, which means that all

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-20 Thread Andy Lutomirski
On Sat, Mar 20, 2021 at 3:13 PM Thomas Gleixner wrote: > > On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote: > > + > > +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */ > > +static inline void xdisable_switch(struct fpu *prev, struct fpu *next) > > +{ > > + if

Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-20 Thread Thomas Gleixner
On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote: > + > +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */ > +static inline void xdisable_switch(struct fpu *prev, struct fpu *next) > +{ > + if (!static_cpu_has(X86_FEATURE_XFD) || !xfirstuse_enabled()) > + return; >

[PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-02-21 Thread Chang S. Bae
Intel's Extended Feature Disable (XFD) feature is an extension of the XSAVE architecture. XFD allows the kernel to enable a feature state in XCR0 and to receive a #NM trap when a task uses instructions accessing that state. In this way, Linux can defer allocating the large XSAVE buffer until tasks
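The deferred-allocation idea in the description above can be sketched in user space as follows. This is not the patch's code: struct demo_task, demo_first_use() and the buffer sizes are hypothetical, the xfd field is just a per-task shadow value standing in for the IA32_XFD MSR, and demo_first_use() stands in for whatever the #NM handler would call. The sketch also shows the failure path raised in the review, where the larger allocation cannot be obtained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sizes, chosen only for illustration. */
#define DEMO_BASE_SIZE   2048u          /* buffer without the dynamic state */
#define DEMO_FULL_SIZE  10240u          /* buffer including the dynamic state */
#define DEMO_XFD_AMX    (1ull << 18)    /* XFD bit 18, AMX tile data */

struct demo_task {
	uint8_t  *xstate_buf;   /* per-task save area */
	size_t    xstate_size;  /* current size of the save area */
	uint64_t  xfd;          /* shadow of IA32_XFD: bit set = first use still traps */
};

/* Start every task with the small buffer and the feature armed to trap. */
static int demo_task_init(struct demo_task *t)
{
	t->xstate_buf = calloc(1, DEMO_BASE_SIZE);
	if (!t->xstate_buf)
		return -1;
	t->xstate_size = DEMO_BASE_SIZE;
	t->xfd = DEMO_XFD_AMX;
	return 0;
}

/*
 * Stand-in for the first-use trap path: grow the buffer, then disarm
 * the trap. Returns 0 on success, -1 if the larger buffer cannot be
 * allocated, in which case the task is left unchanged and the caller
 * has to decide how to fail it.
 */
static int demo_first_use(struct demo_task *t, uint64_t feature)
{
	uint8_t *bigger;

	assert(feature == DEMO_XFD_AMX);
	if (!(t->xfd & feature))
		return 0;               /* already expanded earlier */

	bigger = calloc(1, DEMO_FULL_SIZE);
	if (!bigger)
		return -1;

	memcpy(bigger, t->xstate_buf, t->xstate_size);
	free(t->xstate_buf);
	t->xstate_buf = bigger;
	t->xstate_size = DEMO_FULL_SIZE;
	t->xfd &= ~feature;             /* later uses no longer trap */
	return 0;
}

static void demo_task_free(struct demo_task *t)
{
	free(t->xstate_buf);
	t->xstate_buf = NULL;
	t->xstate_size = 0;
}
```

A task that never touches the dynamic state keeps the small buffer for its whole lifetime; only the first call to demo_first_use() pays for the larger one.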