Re: [EXTERNAL] Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx

2020-07-31 Thread Robert Henry
On 7/31/20 1:34 PM, Eduardo Habkost wrote:
> On Mon, Jun 01, 2020 at 08:19:51AM +0200, Philippe Mathieu-Daudé wrote:
>> Hi Robert.
>>
>> Top-posting is difficult to read on technical lists,
>> it's better to reply inline.
>>
>> Cc'ing the X86 FPU maintainers:
>>
>> ./scripts/get_maintainer.pl -f target/i386/fpu_helper.c
>> Paolo Bonzini  (maintainer:X86 TCG CPUs)
>> Richard Henderson  (maintainer:X86 TCG CPUs)
>> Eduardo Habkost  (maintainer:X86 TCG CPUs)
>>
>> On 6/1/20 1:22 AM, Robert Henry wrote:
>>> Here's additional information.
>>>
>>> All of the remill tests of the legacy MMX instructions fail. These
>>> instructions work on 64-bit registers aliased with the lower 64-bits of
>>> the x87 fp80 registers.  The tests fail because remill expects the
>>> fxsave64 instruction to deliver 16 bits of 1's (infinity or nan prefix)
>>> in the fp80 exponent, eg bits 79:64.  Metal does this, but QEMU does not.
>> Metal is what matters, QEMU should emulate it when possible.
>>
>>> Reading of Intel Software development manual, table 3.44
>>> (https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) says these 16
>>> bits are reserved, but another version of the manual
>>> (http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) section
>>> 9.6.2 "Transitions between x87 fpu and mmx code" says a write to an MMX
>>> register sets those 16 bits to all 1s.
>> You are [1] here answering [2] you asked below.
>>
>>> In digging through the code for the implementation of the SSE/mmx
>>> instruction pavgb I see a nice clean implementation in the SSE_HELPER_B
>>> macro which takes a MMXREG which is an MMREG_UNION which does not
>>> provide, to the extent that I can figure this out, a handle to bits
>>> 79:64 of the aliased-with x87 register.
>>>
>>> I find it hard to believe that an apparent bug like this has been here
>>> "forever". Am I missing something?
>> Likely the developer who implemented this code didn't have all the
>> information you found, nor the test-suite, and eventually not even the
>> hardware to compare.
>>
>> Since you have a good understanding of Intel FPU and hardware to
>> compare, do you mind sending a patch to have QEMU emulate the correct
>> hardware behavior?
>>
>> If possible add a test case to tests/tcg/i386/test-i386.c (see
>> test_fxsave there).
> Was this issue addressed, or does it remain unfixed?  I remember
> seeing x86 FPU patches merged recently, but I don't know if they
> were related to this.
>
I haven't done anything to address this issue.
>>> Robert Henry
>>> 
>>> *From:* Robert Henry
>>> *Sent:* Friday, May 29, 2020 10:38 AM
>>> *To:* qemu-devel@nongnu.org 
>>> *Subject:* ia-32/ia-64 fxsave64 instruction behavior when saving mmx
>>>
>>> Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy
>>> SSE mmx registers. The mmx registers are saved as if they were fp80
>>> values. The lower 64 bits of the constructed fp80 value is the mmx
>>> register.  The upper 16 bits of the constructed fp80 value are reserved;
>>> see the last row of table 3-44
>>> of https://www.felixcloutier.com/x86/fxsave#tbl-3-44
>>>
>>> The Intel core i9-9980XE Skylake metal I have puts 0x into these
>>> reserved 16 bits when saving MMX.
>>>
>>> QEMU appears to put 0's there.
>>>
>>> Does anybody have insight as to what "reserved" really means, or must
>>> be, in this case?
>> You self-answered to this [2] in [1] earlier.
>>
>>> I take the verb "reserved" to mean something other
>>> than "undefined".
>>>
>>> I came across this issue when running the remill instruction test
>>> engine.  See my
>>> issue https://github.com/lifting-bits/remill/issues/423 For better or
>>> worse, remill assumes that those bits are 0x, not 0x
>>>
>> Regards,
>>
>> Phil.
>>



Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx

2020-07-31 Thread Eduardo Habkost
On Mon, Jun 01, 2020 at 08:19:51AM +0200, Philippe Mathieu-Daudé wrote:
> Hi Robert.
> 
> Top-posting is difficult to read on technical lists,
> it's better to reply inline.
> 
> Cc'ing the X86 FPU maintainers:
> 
> ./scripts/get_maintainer.pl -f target/i386/fpu_helper.c
> Paolo Bonzini  (maintainer:X86 TCG CPUs)
> Richard Henderson  (maintainer:X86 TCG CPUs)
> Eduardo Habkost  (maintainer:X86 TCG CPUs)
> 
> On 6/1/20 1:22 AM, Robert Henry wrote:
> > Here's additional information.
> > 
> > All of the remill tests of the legacy MMX instructions fail. These
> > instructions work on 64-bit registers aliased with the lower 64-bits of
> > the x87 fp80 registers.  The tests fail because remill expects the
> > fxsave64 instruction to deliver 16 bits of 1's (infinity or nan prefix)
> > in the fp80 exponent, eg bits 79:64.  Metal does this, but QEMU does not.
> 
> Metal is what matters, QEMU should emulate it when possible.
> 
> > 
> > Reading of Intel Software development manual, table 3.44
> > (https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) says these 16
> > bits are reserved, but another version of the manual
> > (http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) section
> > 9.6.2 "Transitions between x87 fpu and mmx code" says a write to an MMX
> > register sets those 16 bits to all 1s.
> 
> You are [1] here answering [2] you asked below.
> 
> > 
> > In digging through the code for the implementation of the SSE/mmx
> > instruction pavgb I see a nice clean implementation in the SSE_HELPER_B
> > macro which takes a MMXREG which is an MMREG_UNION which does not
> > provide, to the extent that I can figure this out, a handle to bits
> > 79:64 of the aliased-with x87 register.
> > 
> > I find it hard to believe that an apparent bug like this has been here
> > "forever". Am I missing something?
> 
> Likely the developer who implemented this code didn't have all the
> information you found, nor the test-suite, and eventually not even the
> hardware to compare.
> 
> Since you have a good understanding of Intel FPU and hardware to
> compare, do you mind sending a patch to have QEMU emulate the correct
> hardware behavior?
> 
> If possible add a test case to tests/tcg/i386/test-i386.c (see
> test_fxsave there).

Was this issue addressed, or does it remain unfixed?  I remember
seeing x86 FPU patches merged recently, but I don't know if they
were related to this.

> 
> > 
> > Robert Henry
> > 
> > *From:* Robert Henry
> > *Sent:* Friday, May 29, 2020 10:38 AM
> > *To:* qemu-devel@nongnu.org 
> > *Subject:* ia-32/ia-64 fxsave64 instruction behavior when saving mmx
> >  
> > Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy
> > SSE mmx registers. The mmx registers are saved as if they were fp80
> > values. The lower 64 bits of the constructed fp80 value is the mmx
> > register.  The upper 16 bits of the constructed fp80 value are reserved;
> > see the last row of table 3-44
> > of https://www.felixcloutier.com/x86/fxsave#tbl-3-44
> > 
> > The Intel core i9-9980XE Skylake metal I have puts 0x into these
> > reserved 16 bits when saving MMX.
> > 
> > QEMU appears to put 0's there.
> > 
> > Does anybody have insight as to what "reserved" really means, or must
> > be, in this case?
> 
> You self-answered to this [2] in [1] earlier.
> 
> > I take the verb "reserved" to mean something other
> > than "undefined".
> > 
> > I came across this issue when running the remill instruction test
> > engine.  See my
> > issue https://github.com/lifting-bits/remill/issues/423 For better or
> > worse, remill assumes that those bits are 0x, not 0x
> > 
> 
> Regards,
> 
> Phil.
> 

-- 
Eduardo




Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx

2020-05-31 Thread Philippe Mathieu-Daudé
Hi Robert.

Top-posting is difficult to read on technical lists,
it's better to reply inline.

Cc'ing the X86 FPU maintainers:

./scripts/get_maintainer.pl -f target/i386/fpu_helper.c
Paolo Bonzini  (maintainer:X86 TCG CPUs)
Richard Henderson  (maintainer:X86 TCG CPUs)
Eduardo Habkost  (maintainer:X86 TCG CPUs)

On 6/1/20 1:22 AM, Robert Henry wrote:
> Here's additional information.
> 
> All of the remill tests of the legacy MMX instructions fail. These
> instructions work on 64-bit registers aliased with the lower 64-bits of
> the x87 fp80 registers.  The tests fail because remill expects the
> fxsave64 instruction to deliver 16 bits of 1's (infinity or nan prefix)
> in the fp80 exponent, eg bits 79:64.  Metal does this, but QEMU does not.

Metal is what matters, QEMU should emulate it when possible.

> 
> Reading of Intel Software development manual, table 3.44
> (https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) says these 16
> bits are reserved, but another version of the manual
> (http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) section
> 9.6.2 "Transitions between x87 fpu and mmx code" says a write to an MMX
> register sets those 16 bits to all 1s.

You are [1] here answering [2] you asked below.

> 
> In digging through the code for the implementation of the SSE/mmx
> instruction pavgb I see a nice clean implementation in the SSE_HELPER_B
> macro which takes a MMXREG which is an MMREG_UNION which does not
> provide, to the extent that I can figure this out, a handle to bits
> 79:64 of the aliased-with x87 register.
> 
> I find it hard to believe that an apparent bug like this has been here
> "forever". Am I missing something?

Likely the developer who implemented this code didn't have all the
information you found, nor the test-suite, and eventually not even the
hardware to compare.

Since you have a good understanding of Intel FPU and hardware to
compare, do you mind sending a patch to have QEMU emulate the correct
hardware behavior?

If possible add a test case to tests/tcg/i386/test-i386.c (see
test_fxsave there).

> 
> Robert Henry
> ----------------
> *From:* Robert Henry
> *Sent:* Friday, May 29, 2020 10:38 AM
> *To:* qemu-devel@nongnu.org 
> *Subject:* ia-32/ia-64 fxsave64 instruction behavior when saving mmx
>  
> Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy
> SSE mmx registers. The mmx registers are saved as if they were fp80
> values. The lower 64 bits of the constructed fp80 value is the mmx
> register.  The upper 16 bits of the constructed fp80 value are reserved;
> see the last row of table 3-44
> of https://www.felixcloutier.com/x86/fxsave#tbl-3-44
> 
> The Intel core i9-9980XE Skylake metal I have puts 0x into these
> reserved 16 bits when saving MMX.
> 
> QEMU appears to put 0's there.
> 
> Does anybody have insight as to what "reserved" really means, or must
> be, in this case?

You self-answered to this [2] in [1] earlier.

> I take the verb "reserved" to mean something other
> than "undefined".
> 
> I came across this issue when running the remill instruction test
> engine.  See my
> issue https://github.com/lifting-bits/remill/issues/423 For better or
> worse, remill assumes that those bits are 0x, not 0x
> 

Regards,

Phil.




Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx

2020-05-31 Thread Robert Henry
Here's additional information.

All of the remill tests of the legacy MMX instructions fail. These instructions 
work on 64-bit registers aliased with the lower 64-bits of the x87 fp80 
registers.  The tests fail because remill expects the fxsave64 instruction to 
deliver 16 bits of 1's (infinity or nan prefix) in the fp80 exponent, eg bits 
79:64.  Metal does this, but QEMU does not.

Reading of Intel Software development manual, table 3.44 
(https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) says these 16 bits are 
reserved, but another version of the manual 
(http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) section 9.6.2 
"Transitions between x87 fpu and mmx code" says a write to an MMX register sets 
those 16 bits to all 1s.

In digging through the code for the implementation of the SSE/mmx instruction 
pavgb I see a nice clean implementation in the SSE_HELPER_B macro which takes a 
MMXREG which is an MMREG_UNION which does not provide, to the extent that I can 
figure this out, a handle to bits 79:64 of the aliased-with x87 register.

I find it hard to believe that an apparent bug like this has been here 
"forever". Am I missing something?

Robert Henry

From: Robert Henry
Sent: Friday, May 29, 2020 10:38 AM
To: qemu-devel@nongnu.org 
Subject: ia-32/ia-64 fxsave64 instruction behavior when saving mmx

Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy SSE mmx 
registers. The mmx registers are saved as if they were fp80 values. The lower 
64 bits of the constructed fp80 value is the mmx register.  The upper 16 bits 
of the constructed fp80 value are reserved; see the last row of table 3-44 of 
https://www.felixcloutier.com/x86/fxsave#tbl-3-44

The Intel core i9-9980XE Skylake metal I have puts 0x into these reserved 
16 bits when saving MMX.

QEMU appears to put 0's there.

Does anybody have insight as to what "reserved" really means, or must be, in 
this case?  I take the verb "reserved" to mean something other than "undefined".

I came across this issue when running the remill instruction test engine.  See 
my issue https://github.com/lifting-bits/remill/issues/423 For better or worse, 
remill assumes that those bits are 0x, not 0x



ia-32/ia-64 fxsave64 instruction behavior when saving mmx

2020-05-29 Thread Robert Henry
Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy SSE mmx 
registers. The mmx registers are saved as if they were fp80 values. The lower 
64 bits of the constructed fp80 value is the mmx register.  The upper 16 bits 
of the constructed fp80 value are reserved; see the last row of table 3-44 of 
https://www.felixcloutier.com/x86/fxsave#tbl-3-44

The Intel core i9-9980XE Skylake metal I have puts 0x into these reserved 
16 bits when saving MMX.

QEMU appears to put 0's there.

Does anybody have insight as to what "reserved" really means, or must be, in 
this case?  I take the verb "reserved" to mean something other than "undefined".

I came across this issue when running the remill instruction test engine.  See 
my issue https://github.com/lifting-bits/remill/issues/423 For better or worse, 
remill assumes that those bits are 0x, not 0x