Re: [EXTERNAL] Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx
On 7/31/20 1:34 PM, Eduardo Habkost wrote:
> On Mon, Jun 01, 2020 at 08:19:51AM +0200, Philippe Mathieu-Daudé wrote:
>> Hi Robert.
>>
>> Top-posting is difficult to read on technical lists,
>> it's better to reply inline.
>>
>> Cc'ing the X86 FPU maintainers:
>>
>> ./scripts/get_maintainer.pl -f target/i386/fpu_helper.c
>> Paolo Bonzini (maintainer:X86 TCG CPUs)
>> Richard Henderson (maintainer:X86 TCG CPUs)
>> Eduardo Habkost (maintainer:X86 TCG CPUs)
>>
>> On 6/1/20 1:22 AM, Robert Henry wrote:
>>> Here's additional information.
>>>
>>> All of the remill tests of the legacy MMX instructions fail. These
>>> instructions work on 64-bit registers aliased with the lower 64-bits of
>>> the x87 fp80 registers. The tests fail because remill expects the
>>> fxsave64 instruction to deliver 16 bits of 1's (infinity or nan prefix)
>>> in the fp80 exponent, eg bits 79:64. Metal does this, but QEMU does not.
>> Metal is what matters, QEMU should emulate it when possible.
>>
>>> Reading of Intel Software development manual, table 3.44
>>> (https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) says these 16
>>> bits are reserved, but another version of the manual
>>> (http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) section
>>> 9.6.2 "Transitions between x87 fpu and mmx code" says a write to an MMX
>>> register sets those 16 bits to all 1s.
>> You are [1] here answering [2] you asked below.
>>
>>> In digging through the code for the implementation of the SSE/mmx
>>> instruction pavgb I see a nice clean implementation in the SSE_HELPER_B
>>> macro which takes a MMXREG which is an MMREG_UNION which does not
>>> provide, to the extent that I can figure this out, a handle to bits
>>> 79:64 of the aliased-with x87 register.
>>>
>>> I find it hard to believe that an apparent bug like this has been here
>>> "forever". Am I missing something?
>> Likely the developer who implemented this code didn't have all the
>> information you found, nor the test-suite, and eventually not even the
>> hardware to compare.
>>
>> Since you have a good understanding of Intel FPU and hardware to
>> compare, do you mind sending a patch to have QEMU emulate the correct
>> hardware behavior?
>>
>> If possible add a test case to tests/tcg/i386/test-i386.c (see
>> test_fxsave there).
> Was this issue addressed, or does it remain unfixed? I remember
> seeing x86 FPU patches merged recently, but I don't know if they
> were related to this.
>

I haven't done anything to address this issue.

>>> Robert Henry
>>>
>>> *From:* Robert Henry
>>> *Sent:* Friday, May 29, 2020 10:38 AM
>>> *To:* qemu-devel@nongnu.org
>>> *Subject:* ia-32/ia-64 fxsave64 instruction behavior when saving mmx
>>>
>>> Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy
>>> SSE mmx registers. The mmx registers are saved as if they were fp80
>>> values. The lower 64 bits of the constructed fp80 value is the mmx
>>> register. The upper 16 bits of the constructed fp80 value are reserved;
>>> see the last row of table 3-44
>>> of https://www.felixcloutier.com/x86/fxsave#tbl-3-44
>>>
>>> The Intel core i9-9980XE Skylake metal I have puts 0x into these
>>> reserved 16 bits when saving MMX.
>>>
>>> QEMU appears to put 0's there.
>>>
>>> Does anybody have insight as to what "reserved" really means, or must
>>> be, in this case?
>> You self-answered to this [2] in [1] earlier.
>>
>>> I take the verb "reserved" to mean something other
>>> than "undefined".
>>>
>>> I came across this issue when running the remill instruction test
>>> engine. See my
>>> issue https://github.com/lifting-bits/remill/issues/423
>>> For better or
>>> worse, remill assumes that those bits are 0x, not 0x
>>>
>> Regards,
>>
>> Phil.
>>
Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx
On Mon, Jun 01, 2020 at 08:19:51AM +0200, Philippe Mathieu-Daudé wrote:
> Hi Robert.
>
> Top-posting is difficult to read on technical lists,
> it's better to reply inline.
>
> Cc'ing the X86 FPU maintainers:
>
> ./scripts/get_maintainer.pl -f target/i386/fpu_helper.c
> Paolo Bonzini (maintainer:X86 TCG CPUs)
> Richard Henderson (maintainer:X86 TCG CPUs)
> Eduardo Habkost (maintainer:X86 TCG CPUs)
>
> On 6/1/20 1:22 AM, Robert Henry wrote:
> > Here's additional information.
> >
> > All of the remill tests of the legacy MMX instructions fail. These
> > instructions work on 64-bit registers aliased with the lower 64-bits of
> > the x87 fp80 registers. The tests fail because remill expects the
> > fxsave64 instruction to deliver 16 bits of 1's (infinity or nan prefix)
> > in the fp80 exponent, eg bits 79:64. Metal does this, but QEMU does not.
>
> Metal is what matters, QEMU should emulate it when possible.
>
> >
> > Reading of Intel Software development manual, table 3.44
> > (https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) says these 16
> > bits are reserved, but another version of the manual
> > (http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) section
> > 9.6.2 "Transitions between x87 fpu and mmx code" says a write to an MMX
> > register sets those 16 bits to all 1s.
>
> You are [1] here answering [2] you asked below.
>
> >
> > In digging through the code for the implementation of the SSE/mmx
> > instruction pavgb I see a nice clean implementation in the SSE_HELPER_B
> > macro which takes a MMXREG which is an MMREG_UNION which does not
> > provide, to the extent that I can figure this out, a handle to bits
> > 79:64 of the aliased-with x87 register.
> >
> > I find it hard to believe that an apparent bug like this has been here
> > "forever". Am I missing something?
>
> Likely the developer who implemented this code didn't have all the
> information you found, nor the test-suite, and eventually not even the
> hardware to compare.
>
> Since you have a good understanding of Intel FPU and hardware to
> compare, do you mind sending a patch to have QEMU emulate the correct
> hardware behavior?
>
> If possible add a test case to tests/tcg/i386/test-i386.c (see
> test_fxsave there).

Was this issue addressed, or does it remain unfixed? I remember
seeing x86 FPU patches merged recently, but I don't know if they
were related to this.

> >
> > Robert Henry
> >
> > *From:* Robert Henry
> > *Sent:* Friday, May 29, 2020 10:38 AM
> > *To:* qemu-devel@nongnu.org
> > *Subject:* ia-32/ia-64 fxsave64 instruction behavior when saving mmx
> >
> > Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy
> > SSE mmx registers. The mmx registers are saved as if they were fp80
> > values. The lower 64 bits of the constructed fp80 value is the mmx
> > register. The upper 16 bits of the constructed fp80 value are reserved;
> > see the last row of table 3-44
> > of https://www.felixcloutier.com/x86/fxsave#tbl-3-44
> >
> > The Intel core i9-9980XE Skylake metal I have puts 0x into these
> > reserved 16 bits when saving MMX.
> >
> > QEMU appears to put 0's there.
> >
> > Does anybody have insight as to what "reserved" really means, or must
> > be, in this case?
>
> You self-answered to this [2] in [1] earlier.
>
> > I take the verb "reserved" to mean something other
> > than "undefined".
> >
> > I came across this issue when running the remill instruction test
> > engine. See my
> > issue https://github.com/lifting-bits/remill/issues/423
> > For better or
> > worse, remill assumes that those bits are 0x, not 0x
> >
>
> Regards,
>
> Phil.
>

--
Eduardo
Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx
Hi Robert.

Top-posting is difficult to read on technical lists,
it's better to reply inline.

Cc'ing the X86 FPU maintainers:

./scripts/get_maintainer.pl -f target/i386/fpu_helper.c
Paolo Bonzini (maintainer:X86 TCG CPUs)
Richard Henderson (maintainer:X86 TCG CPUs)
Eduardo Habkost (maintainer:X86 TCG CPUs)

On 6/1/20 1:22 AM, Robert Henry wrote:
> Here's additional information.
>
> All of the remill tests of the legacy MMX instructions fail. These
> instructions work on 64-bit registers aliased with the lower 64-bits of
> the x87 fp80 registers. The tests fail because remill expects the
> fxsave64 instruction to deliver 16 bits of 1's (infinity or nan prefix)
> in the fp80 exponent, eg bits 79:64. Metal does this, but QEMU does not.

Metal is what matters, QEMU should emulate it when possible.

>
> Reading of Intel Software development manual, table 3.44
> (https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) says these 16
> bits are reserved, but another version of the manual
> (http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) section
> 9.6.2 "Transitions between x87 fpu and mmx code" says a write to an MMX
> register sets those 16 bits to all 1s.

You are [1] here answering [2] you asked below.

>
> In digging through the code for the implementation of the SSE/mmx
> instruction pavgb I see a nice clean implementation in the SSE_HELPER_B
> macro which takes a MMXREG which is an MMREG_UNION which does not
> provide, to the extent that I can figure this out, a handle to bits
> 79:64 of the aliased-with x87 register.
>
> I find it hard to believe that an apparent bug like this has been here
> "forever". Am I missing something?

Likely the developer who implemented this code didn't have all the
information you found, nor the test-suite, and eventually not even the
hardware to compare.

Since you have a good understanding of Intel FPU and hardware to
compare, do you mind sending a patch to have QEMU emulate the correct
hardware behavior?

If possible add a test case to tests/tcg/i386/test-i386.c (see
test_fxsave there).

>
> Robert Henry
> ----------------
> *From:* Robert Henry
> *Sent:* Friday, May 29, 2020 10:38 AM
> *To:* qemu-devel@nongnu.org
> *Subject:* ia-32/ia-64 fxsave64 instruction behavior when saving mmx
>
> Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy
> SSE mmx registers. The mmx registers are saved as if they were fp80
> values. The lower 64 bits of the constructed fp80 value is the mmx
> register. The upper 16 bits of the constructed fp80 value are reserved;
> see the last row of table 3-44
> of https://www.felixcloutier.com/x86/fxsave#tbl-3-44
>
> The Intel core i9-9980XE Skylake metal I have puts 0x into these
> reserved 16 bits when saving MMX.
>
> QEMU appears to put 0's there.
>
> Does anybody have insight as to what "reserved" really means, or must
> be, in this case?

You self-answered to this [2] in [1] earlier.

> I take the verb "reserved" to mean something other
> than "undefined".
>
> I came across this issue when running the remill instruction test
> engine. See my
> issue https://github.com/lifting-bits/remill/issues/423
> For better or
> worse, remill assumes that those bits are 0x, not 0x
>

Regards,

Phil.
Re: ia-32/ia-64 fxsave64 instruction behavior when saving mmx
Here's additional information.

All of the remill tests of the legacy MMX instructions fail. These
instructions work on 64-bit registers aliased with the lower 64-bits of
the x87 fp80 registers. The tests fail because remill expects the
fxsave64 instruction to deliver 16 bits of 1's (infinity or nan prefix)
in the fp80 exponent, eg bits 79:64. Metal does this, but QEMU does not.

Reading of Intel Software development manual, table 3.44
(https://www.felixcloutier.com/x86/FXSAVE.html#tbl-3-44) says these 16
bits are reserved, but another version of the manual
(http://math-atlas.sourceforge.net/devel/arch/ia32_arch.pdf) section
9.6.2 "Transitions between x87 fpu and mmx code" says a write to an MMX
register sets those 16 bits to all 1s.

In digging through the code for the implementation of the SSE/mmx
instruction pavgb I see a nice clean implementation in the SSE_HELPER_B
macro which takes a MMXREG which is an MMREG_UNION which does not
provide, to the extent that I can figure this out, a handle to bits
79:64 of the aliased-with x87 register.

I find it hard to believe that an apparent bug like this has been here
"forever". Am I missing something?

Robert Henry

From: Robert Henry
Sent: Friday, May 29, 2020 10:38 AM
To: qemu-devel@nongnu.org
Subject: ia-32/ia-64 fxsave64 instruction behavior when saving mmx

Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy
SSE mmx registers. The mmx registers are saved as if they were fp80
values. The lower 64 bits of the constructed fp80 value is the mmx
register. The upper 16 bits of the constructed fp80 value are reserved;
see the last row of table 3-44
of https://www.felixcloutier.com/x86/fxsave#tbl-3-44

The Intel core i9-9980XE Skylake metal I have puts 0x into these
reserved 16 bits when saving MMX.

QEMU appears to put 0's there.

Does anybody have insight as to what "reserved" really means, or must
be, in this case? I take the verb "reserved" to mean something other
than "undefined".

I came across this issue when running the remill instruction test
engine. See my
issue https://github.com/lifting-bits/remill/issues/423
For better or worse, remill assumes that those bits are 0x, not 0x
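
For readers following the aliasing argument in this message, here is a minimal sketch in C of the two behaviours being contrasted: an MMX write that also sets bits 79:64 of the aliased x87 register to all 1s (the behaviour attributed above to section 9.6.2), and a write that touches only the low 64 bits. The type Fp80Model and both function names are hypothetical and chosen for illustration only; this is not QEMU's register layout or helper API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 80-bit x87 register (NOT QEMU's layout):
 * the 64-bit significand is the part an MMX register aliases, and the
 * 16-bit word above it holds bits 79:64 (sign and exponent). */
typedef struct {
    uint64_t significand;   /* bits 63:0 */
    uint16_t sign_exponent; /* bits 79:64 */
} Fp80Model;

/* MMX write as the thread says section 9.6.2 describes it: the low
 * 64 bits take the new value and bits 79:64 become all 1s. */
static void mmx_write(Fp80Model *reg, uint64_t value)
{
    assert(reg != NULL);
    reg->significand = value;
    reg->sign_exponent = 0xFFFF;
}

/* MMX write that only has a handle on the low 64 bits, so whatever
 * was in bits 79:64 before the write is left in place. */
static void mmx_write_low_only(Fp80Model *reg, uint64_t value)
{
    assert(reg != NULL);
    reg->significand = value;
}
```

Starting from a zeroed register, mmx_write leaves sign_exponent at 0xFFFF while mmx_write_low_only leaves it at 0, which is the kind of difference a later fxsave64 of the same register would expose.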
ia-32/ia-64 fxsave64 instruction behavior when saving mmx
Background: The ia-32/ia-64 fxsave64 instruction saves fp80 or legacy
SSE mmx registers. The mmx registers are saved as if they were fp80
values. The lower 64 bits of the constructed fp80 value is the mmx
register. The upper 16 bits of the constructed fp80 value are reserved;
see the last row of table 3-44
of https://www.felixcloutier.com/x86/fxsave#tbl-3-44

The Intel core i9-9980XE Skylake metal I have puts 0x into these
reserved 16 bits when saving MMX.

QEMU appears to put 0's there.

Does anybody have insight as to what "reserved" really means, or must
be, in this case? I take the verb "reserved" to mean something other
than "undefined".

I came across this issue when running the remill instruction test
engine. See my
issue https://github.com/lifting-bits/remill/issues/423
For better or worse, remill assumes that those bits are 0x, not 0x
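
To compare what metal and QEMU store, one way is to pull the relevant bytes out of a saved FXSAVE image. The sketch below is in C and assumes the documented 512-byte FXSAVE layout, with the ST0/MM0 slot starting at byte 32 and each STi/MMi slot occupying 16 bytes (10 bytes of value followed by 6 reserved bytes). The helper names are hypothetical; the code only inspects a buffer that some other code has already filled with fxsave64, and does not execute the instruction itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    FXSAVE_AREA_SIZE = 512, /* size of the FXSAVE memory image in bytes */
    FXSAVE_ST_BASE   = 32,  /* byte offset of the ST0/MM0 slot */
    FXSAVE_ST_STRIDE = 16   /* each STi/MMi slot occupies 16 bytes */
};

/* Hypothetical helper: pointer to the 16-byte slot for STi/MMi. */
static const uint8_t *fxsave_slot(const uint8_t *area, int i)
{
    assert(area != NULL && i >= 0 && i < 8);
    return area + FXSAVE_ST_BASE + i * FXSAVE_ST_STRIDE;
}

/* Low 64 bits of slot i, i.e. the MMX value, read as little-endian. */
static uint64_t fxsave_mm_value(const uint8_t *area, int i)
{
    const uint8_t *slot = fxsave_slot(area, i);
    uint64_t v = 0;
    for (int b = 7; b >= 0; b--) {
        v = (v << 8) | slot[b];
    }
    return v;
}

/* Bits 79:64 of slot i (bytes 8 and 9), the 16 bits under discussion. */
static uint16_t fxsave_mm_top16(const uint8_t *area, int i)
{
    const uint8_t *slot = fxsave_slot(area, i);
    return (uint16_t)(slot[8] | (slot[9] << 8));
}
```

Calling fxsave_mm_top16(area, i) on images captured on hardware and under QEMU after the same MMX write would show directly whether the two disagree on bits 79:64 for register i.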