https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438
Bug ID: 110438
Summary: generating all-ones zmm needs dep-breaking pxor before ternlog
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: amonakov at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*-*

VPTERNLOG is never a dependency-breaking instruction on existing x86
implementations, so generating a vector of all-ones via bare ternlog can stall
waiting on destination register. GCC should emit a dependency-breaking PXOR,
otherwise it will be a false-dependency-on-popcnt-lzcnt debacle all over again.

#include <immintrin.h>

__m512i g(void)
{
    return (__m512i){ 0 } - 1;
}

g:
        # waits until previous computation
        # of zmm0 has completed
        vpternlogd zmm0, zmm0, zmm0, 0xFF
        ret
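For readers unfamiliar with the expression in the test case, the following is a minimal sketch of why `(__m512i){ 0 } - 1` is an all-ones constant. It is not part of the original report. It assumes the GCC/Clang vector extension, and the type `v8di` and both helper functions are hypothetical names chosen for illustration. The type stands in for `__m512i` so that the sketch builds without AVX-512 support, and the vectors are passed by pointer to stay clear of vector ABI warnings.

```c
#include <assert.h>

/* Hypothetical stand-in for __m512i: eight 64-bit lanes, 64 bytes in total.
   Assumption: the compiler supports the GCC vector extension. */
typedef long long v8di __attribute__((vector_size(64)));

/* Same expression shape as the reported test case: a zero vector minus the
   scalar 1. The scalar is broadcast to every lane, so each lane becomes -1,
   which is all bits set. */
void make_all_ones(v8di *out)
{
    *out = (v8di){ 0 } - 1;
}

/* Count how many of the eight lanes have every bit set. */
int count_all_ones_lanes(const v8di *v)
{
    int n = 0;
    for (int i = 0; i < 8; i++) {
        if ((*v)[i] == -1LL)
            n++;
    }
    return n;
}

/* Self-check: all eight lanes are expected to be all-ones. */
void check_all_ones(void)
{
    v8di v;
    make_all_ones(&v);
    assert(count_all_ones_lanes(&v) == 8);
}
```

Because the result is a compile-time constant that does not depend on any input, the compiler is free to materialize it in whatever way it likes. The report is about that choice: the bare vpternlogd reads its destination register, while a preceding PXOR of the register with itself would cut the dependency.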