https://issues.dlang.org/show_bug.cgi?id=23641
Iain Buclaw <ibuc...@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ibuc...@gdcproject.org

--- Comment #2 from Iain Buclaw <ibuc...@gdcproject.org> ---
(In reply to ponce from comment #0)
> LDC, GDC and DMD implement int4 differently when it comes to multiplication.
> With DMD, you need to explicitly pass -mcpu=avx when compiling.

DMD uses a strict gate at compile-time to determine whether or not the expression would map to a single opcode in the dmd backend for the given type mode.

GDC and LDC ignore this gate, even though the information is there and can be queried from GCC or LLVM respectively, and just permissively allow the operation. This means that when the expression is passed down to the backend, the vector op may be split up into narrower modes if the target being compiled for doesn't have an available opcode.

This behaviour is justified because, strictly speaking, we don't know whether the optimizer might rewrite the expression in such a way that there *is* a supported opcode. For example: `a / b` has no vector op, but `a >> b` does.

https://d.godbolt.org/z/vrn77GG9f

(FYI, in gdc-13, `-Wvector-operation-performance` will be turned on by default, so you'll at least get a non-blocking warning about expressions that have been expanded at narrower modes).
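
For code that has to build with all three compilers, the gate can be probed with `__traits(compiles, ...)`. The following is only a minimal sketch, assuming a target where `core.simd` defines `int4` (e.g. x86-64); the function name `mul` is made up for illustration. With DMD the fallback branch should be selected whenever the gate rejects `a * b`, while GDC and LDC are expected to always take the first branch and leave any splitting to the backend.

```d
import core.simd;

// Only declared when the target has a 128-bit int vector type at all.
static if (is(int4))
{
    // Hypothetical helper: lane-wise multiplication of two int4 vectors.
    int4 mul(int4 a, int4 b)
    {
        // DMD rejects a * b at compile time when its backend has no single
        // opcode for it under the selected -mcpu; GDC and LDC accept it.
        static if (__traits(compiles, a * b))
        {
            return a * b;
        }
        else
        {
            // Scalar fallback: multiply each of the four lanes by hand.
            int4 r;
            foreach (i; 0 .. 4)
                r.ptr[i] = a.array[i] * b.array[i];
            return r;
        }
    }
}
```

Either branch gives the same result, so callers don't need to know which one was compiled in; the cost is that the fallback is chosen at compile time per build, not per CPU at run time.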