https://sourceware.org/bugzilla/show_bug.cgi?id=18700
--- Comment #1 from Michael Rolle <m at rolle dot name> ---

First off, I might be misinterpreting the Intel doc (319433-022 OCTOBER 2014), in which case gas might be doing the right thing after all.

I wrote some EVEX instructions, both scalar and vector, and both with and without the {sae} option. Here's the disassembly of the result:

   0:  62 f1 f5 18 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
   7:  62 f1 f5 48 c2 ca 00    vcmpeqpd %zmm2,%zmm1,%k1
   e:  62 f1 f7 18 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1
  15:  62 f1 f7 08 c2 ca 00    vcmpeqsd %xmm2,%xmm1,%k1
  1c:  62 f1 f5 18 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
  23:  62 f1 f5 38 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
  2a:  62 f1 f5 58 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
  31:  62 f1 f5 78 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
  38:  62 f1 f7 18 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1
  3f:  62 f1 f7 38 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1
  46:  62 f1 f7 58 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1
  4d:  62 f1 f7 78 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1

The first four were assembled from the instructions shown in the disassembly. The last eight were assembled with .byte directives, just to see how the disassembler would treat them. objdump basically ignores L'L, as you can see. This is appropriate for vcmpsd, but perhaps not for vcmppd.

The error, I believe, is with the L'L bits in the 4th EVEX byte. For the first instruction at address 0, the byte is 0 00 1 1 000 in binary, so L'L = 00b. My reading of the doc is that L'L = 10b is required, and the instruction will #UD otherwise.

The Intel doc says the following about this:

(1) Page 7. 4.6.3 SAE Support in EVEX

    The EVEX encoding system allows arithmetic floating-point instructions without rounding semantic to be encoded with the SAE attribute. This capability applies to scalar and all vector lengths, by setting EVEX.b.

    Table 4-7. EVEX Embedded Broadcast/Rounding/SAE and Vector Length on Vector Instructions

    Position                         P2[4]         P2[6:5]               P2[6:5]
    Broadcast/Rounding/SAE Context   EVEX.b        EVEX.L'L
    FP Instructions w/o rounding     SAE control   00b: 128-bit
    semantic, can cause #XF                        01b: 256-bit
                                                   10b: 512-bit
                                                   11b: Reserved (#UD)

(2) Page 15. 4.10.2 Exceptions Type E2 of EVEX-Encoded Instructions (This includes vcmppd)

    Invalid Opcode, #UD (in certain modes), ... If EVEX.L'L != 10b (VL=512).

(3) Page 60. The details for the VCMPPD instruction show {sae} ONLY in the 512-bit version, not in the 128- or 256-bit EVEX-encoded versions.

The reason I am a bit doubtful of the doc is that there are some inconsistencies, indicating that Intel may have rushed the document out to press.

First of all, it's strange that only the 512-bit vector instructions allow {sae}, even though the doc says that all vector lengths are supported and that L'L encodes the vector length.

Another thing is that in Exceptions Type E3, which includes vcmpsd, it says that there is a #UD if EVEX.b = 1. This is clearly wrong. The doc lists the vcmpsd encoding as LIG, meaning L'L is ignored; that's fine, and consistent with the lack of Exceptions Type E3 conditions for L'L.

----------------------------

I think the only real way to resolve this is to get hold of one of the new Intel processors that supports AVX-512, and try running these instructions to see if you get a #UD. And if vcmppd runs with L'L = 00b or 01b, see if it actually compares the entire zmm registers or only the xmm/ymm registers, respectively.

If you do indeed get a #UD with L'L = 00b, as produced by gas, then this is a critical bug.
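For reference, here is a minimal Python sketch that splits the 4th EVEX byte (P2) of the encodings in the listing into its bit fields, so the L'L and b values under discussion can be read off directly. It assumes the usual P2 layout z | L'L | b | V' | aaa (bit 7 down to bit 0), which matches the P2[4] and P2[6:5] positions quoted from Table 4-7. The function name and dictionary keys are illustrative labels only, not part of gas or objdump.

```python
def decode_evex_p2(p2: int) -> dict:
    """Split the 4th EVEX byte (P2) into its fields: z | L'L | b | V' | aaa."""
    return {
        "z":   (p2 >> 7) & 0x1,   # P2[7]:   zeroing/merging
        "LL":  (p2 >> 5) & 0x3,   # P2[6:5]: EVEX.L'L
        "b":   (p2 >> 4) & 0x1,   # P2[4]:   EVEX.b (broadcast / rounding / SAE)
        "V":   (p2 >> 3) & 0x1,   # P2[3]:   EVEX.V'
        "aaa": p2 & 0x7,          # P2[2:0]: opmask register
    }

# The distinct P2 bytes that appear in the disassembly listing.
for p2 in (0x18, 0x38, 0x58, 0x78, 0x48, 0x08):
    f = decode_evex_p2(p2)
    print(f"P2={p2:#04x}  z={f['z']}  L'L={f['LL']:02b}b  "
          f"b={f['b']}  V'={f['V']}  aaa={f['aaa']:03b}b")
```

With this decoding, 0x18 (what gas emits for {sae}) has L'L = 00b and b = 1, while 0x58 is the byte with L'L = 10b and b = 1 that the reading of the doc described here would require.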