On Tue, Mar 30, 2021 at 4:28 AM Thomas Gleixner wrote:
>
> Len,
>
> On Mon, Mar 29 2021 at 18:16, Len Brown wrote:
> > On Mon, Mar 29, 2021 at 2:49 PM Thomas Gleixner wrote:
> > Let me know if this problem description is fair:
> >
> > Many-core Xeon servers will support AMX, and when I run an AMX
Len,
On Mon, Mar 29 2021 at 18:16, Len Brown wrote:
> On Mon, Mar 29, 2021 at 2:49 PM Thomas Gleixner wrote:
> Let me know if this problem description is fair:
>
> Many-core Xeon servers will support AMX, and when I run an AMX application
> on one, when I take an interrupt with AMX INIT=0, Linux
On Mon, Mar 29, 2021 at 2:49 PM Thomas Gleixner wrote:
> According to documentation it is irrelevant whether AMX usage is
> disabled via XCR0, CR4.OSXSAVE or XFD[18]. In any case the effect of
> AMX INIT=0 will prevent C6.
>
> As I explained in great length there are enough ways to get into a
> s
On Mon, Mar 29, 2021 at 1:43 PM Andy Lutomirski wrote:
> > *switching* XCR0 on context switch is slow, but perfectly legal.
>
> How slow is it? And how slow is switching XFD? XFD is definitely
> serializing?
XCR0 writes in a VM guest cause a VMEXIT.
XFD writes in a VM guest do not.
I will f
On Mon, Mar 29 2021 at 11:43, Len Brown wrote:
> On Mon, Mar 29, 2021 at 9:33 AM Thomas Gleixner wrote:
> But yes, if a bare metal OS doesn't support any threading libraries
> that query XCR0 with xgetbv, and they don't care about the performance
> impact of switching XCR0, they could choose to sw
On Mar 26, 2021, at 09:34, Jann Horn wrote:
> On Sun, Feb 21, 2021 at 7:56 PM Chang S. Bae wrote:
>>
>> + if (handle_xfirstuse_event(&current->thread.fpu))
>> + return;
>
> What happens if handle_xfirstuse_event() fails because vmalloc()
> failed in alloc_xstate_buffer()? I think
> On Mar 29, 2021, at 9:06 AM, Len Brown wrote:
>
> On Mon, Mar 29, 2021 at 11:43 AM Len Brown wrote:
>>
>> On Mon, Mar 29, 2021 at 9:33 AM Thomas Gleixner wrote:
>>
I found the author of this passage, and he agreed to revise it to say this
was targeted primarily at VMMs.
>>>
>>
On Mon, Mar 29, 2021 at 11:43 AM Len Brown wrote:
>
> On Mon, Mar 29, 2021 at 9:33 AM Thomas Gleixner wrote:
>
> > > I found the author of this passage, and he agreed to revise it to say this
> > > was targeted primarily at VMMs.
> >
> > Why would this only be a problem for VMMs?
>
> VMMs may have t
On Mon, Mar 29, 2021 at 9:33 AM Thomas Gleixner wrote:
> > I found the author of this passage, and he agreed to revise it to say this
> > was targeted primarily at VMMs.
>
> Why would this only be a problem for VMMs?
VMMs may have to emulate different hardware for different guest OS's,
and they wou
On Mon, Mar 29 2021 at 09:14, Len Brown wrote:
> On Sat, Mar 20, 2021 at 6:14 PM Thomas Gleixner wrote:
>>
>> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
>> > +
>> > +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */
>> > +static inline void xdisable_switch(struct fpu *prev,
On Sat, Mar 20, 2021 at 6:14 PM Thomas Gleixner wrote:
>
> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
> > +
> > +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */
> > +static inline void xdisable_switch(struct fpu *prev, struct fpu *next)
> > +{
> > + if (!static_cpu_ha
On Sun, Feb 21, 2021 at 7:56 PM Chang S. Bae wrote:
> Intel's Extended Feature Disable (XFD) feature is an extension of the XSAVE
> architecture. XFD allows the kernel to enable a feature state in XCR0 and
> to receive a #NM trap when a task uses instructions accessing that state.
> In this way, L
For AMX, we must still reserve the space, but we are not going to write zeros
for clean state. We do this in software by checking XINUSE=0, and clearing
the xstate_bf for the XSAVE. As a result, for XINUSE=0, we can skip
writing the zeros, even though we can't compress the space.
So my unde
On Mar 24, 2021, at 22:12, Liu, Jing2 wrote:
> On 3/25/2021 5:09 AM, Len Brown wrote:
>>
>> For AMX, we must still reserve the space, but we are not going to write zeros
>> for clean state. We do this in software by checking XINUSE=0, and clearing
>> the xstate_bf for the XSAVE. As a result, fo
On 3/25/2021 5:09 AM, Len Brown wrote:
On Tue, Mar 23, 2021 at 11:15 PM Liu, Jing2 wrote:
IMO, the problem with AVX512 state
is that we guaranteed it will be zero for XINUSE=0.
That means we have to write 0's on saves.
why "we have to write 0's on saves" when XINUSE=0.
Since due to SDM, i
> On Mar 24, 2021, at 2:58 PM, Dave Hansen wrote:
>
> On 3/24/21 2:42 PM, Andy Lutomirski wrote:
> 3. user space always uses fully uncompacted XSAVE buffers.
>
There is no reason we have to do this for new states. Arguably we
shouldn’t for AMX to avoid yet another altstack e
On 3/24/21 2:42 PM, Andy Lutomirski wrote:
3. user space always uses fully uncompacted XSAVE buffers.
>>> There is no reason we have to do this for new states. Arguably we
>>> shouldn’t for AMX to avoid yet another altstack explosion.
>> The thing that's worried me is that the list of OS-
> On Mar 24, 2021, at 2:30 PM, Dave Hansen wrote:
>
> On 3/24/21 2:26 PM, Andy Lutomirski wrote:
>>> 3. user space always uses fully uncompacted XSAVE buffers.
>>>
>> There is no reason we have to do this for new states. Arguably we
>> shouldn’t for AMX to avoid yet another altstack explosion
On 3/24/21 2:26 PM, Andy Lutomirski wrote:
>> 3. user space always uses fully uncompacted XSAVE buffers.
>>
> There is no reason we have to do this for new states. Arguably we
> shouldn’t for AMX to avoid yet another altstack explosion.
The thing that's worried me is that the list of OS-enabled s
> On Mar 24, 2021, at 2:09 PM, Len Brown wrote:
>
> On Tue, Mar 23, 2021 at 11:15 PM Liu, Jing2 wrote:
>
>>> IMO, the problem with AVX512 state
>>> is that we guaranteed it will be zero for XINUSE=0.
>>> That means we have to write 0's on saves.
>
>> why "we have to write 0's on saves"
On Tue, Mar 23, 2021 at 11:15 PM Liu, Jing2 wrote:
> > IMO, the problem with AVX512 state
> > is that we guaranteed it will be zero for XINUSE=0.
> > That means we have to write 0's on saves.
> why "we have to write 0's on saves" when XINUSE=0.
>
> Since due to SDM, if XINUSE=0, the XSAVES will
On 3/23/21 2:52 PM, Bae, Chang Seok wrote:
>> "System software may disable use of Intel AMX by clearing XCR0[18:17], by
>> clearing CR4.OSXSAVE, or by setting IA32_XFD[18]. It is recommended that
>> system software initialize AMX state (e.g., by executing TILERELEASE)
>> before doing so. Thi
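
The three disable paths named in the quoted documentation (XCR0[18:17], CR4.OSXSAVE, IA32_XFD[18]) can be sketched as a single predicate. This is only an illustration of the rule as quoted, not kernel code: the function name amx_usable() and the macro names are hypothetical, and the register values are passed in as plain integers rather than read from hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the quoted text: XCR0[18:17] and IA32_XFD[18]. */
#define SKETCH_XCR0_XTILECFG   (1ULL << 17)
#define SKETCH_XCR0_XTILEDATA  (1ULL << 18)
#define SKETCH_XCR0_AMX_MASK   (SKETCH_XCR0_XTILECFG | SKETCH_XCR0_XTILEDATA)
#define SKETCH_XFD_XTILEDATA   (1ULL << 18)

/*
 * Hypothetical helper: returns true only if none of the three
 * disable mechanisms from the quoted passage is in effect.
 */
static bool amx_usable(uint64_t xcr0, bool cr4_osxsave, uint64_t xfd)
{
    if (!cr4_osxsave)
        return false;                 /* CR4.OSXSAVE cleared */
    if ((xcr0 & SKETCH_XCR0_AMX_MASK) != SKETCH_XCR0_AMX_MASK)
        return false;                 /* XCR0[18:17] cleared */
    if (xfd & SKETCH_XFD_XTILEDATA)
        return false;                 /* IA32_XFD[18] set */
    return true;
}
```

Note that the predicate says nothing about whether AMX state is in its INIT configuration; that is the separate C6 concern raised elsewhere in the thread.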
On 3/24/2021 5:01 AM, Len Brown wrote:
I have an obnoxious question: do we really want to use the XFD mechanism?
Obnoxious questions are often the most valuable! :-)
[...]
cheers,
Len Brown, Intel Open Source Technology Center
ps. I agree that un-necessary XINUSE=1 is possible.
Notwithstand
On Mar 20, 2021, at 15:13, Thomas Gleixner wrote:
> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
>> +
>> +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */
>> +static inline void xdisable_switch(struct fpu *prev, struct fpu *next)
>> +{
>> +if (!static_cpu_has(X86_FEATURE
> I have an obnoxious question: do we really want to use the XFD mechanism?
Obnoxious questions are often the most valuable! :-)
> Right now, glibc, and hence most user space code, blindly uses
> whatever random CPU features are present for no particularly good
> reason, which means that all thes
On Sat, Mar 20, 2021 at 3:13 PM Thomas Gleixner wrote:
>
> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
> > +
> > +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */
> > +static inline void xdisable_switch(struct fpu *prev, struct fpu *next)
> > +{
> > + if (!static_cpu_ha
On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
> +
> +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */
> +static inline void xdisable_switch(struct fpu *prev, struct fpu *next)
> +{
> + if (!static_cpu_has(X86_FEATURE_XFD) || !xfirstuse_enabled())
> + return;
>
Intel's Extended Feature Disable (XFD) feature is an extension of the XSAVE
architecture. XFD allows the kernel to enable a feature state in XCR0 and
to receive a #NM trap when a task uses instructions accessing that state.
In this way, Linux can defer allocating the large XSAVE buffer until tasks
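
As a rough userspace model of the deferral described in this changelog excerpt: a task starts with the XFD bit armed and no large buffer, and the buffer is only allocated when the (simulated) first-use trap fires. Everything here is a sketch under stated assumptions: struct task_sim, task_init(), handle_first_use() and use_amx() are hypothetical names, calloc() stands in for the kernel allocator, and the 8 KiB figure is the size of the AMX tile data state. It does not reproduce the patch's actual handle_xfirstuse_event().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SIM_XFD_XTILEDATA  (1ULL << 18)  /* IA32_XFD[18], per the thread */
#define SIM_TILEDATA_SIZE  8192          /* AMX tile data: 8 tiles x 1 KiB */

/* Hypothetical per-task state for the simulation. */
struct task_sim {
    uint64_t xfd;        /* simulated per-task IA32_XFD value */
    void    *xstate_buf; /* large buffer, allocated lazily */
};

/* New tasks start with the feature armed to trap and no buffer. */
static void task_init(struct task_sim *t)
{
    t->xfd = SIM_XFD_XTILEDATA;
    t->xstate_buf = NULL;
}

/* Simulated #NM handler: allocate the buffer, then disarm the trap. */
static bool handle_first_use(struct task_sim *t)
{
    void *buf = calloc(1, SIM_TILEDATA_SIZE);

    if (!buf)
        return false;    /* allocation failure must be reported to the caller */
    t->xstate_buf = buf;
    t->xfd &= ~SIM_XFD_XTILEDATA;
    return true;
}

/* Simulated AMX instruction: traps only while the XFD bit is set. */
static bool use_amx(struct task_sim *t)
{
    if (t->xfd & SIM_XFD_XTILEDATA)
        return handle_first_use(t);
    return true;
}
```

The failure return in handle_first_use() is where the question raised earlier in the thread applies: the caller has to decide what happens to the task when the allocation cannot be satisfied.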