https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735
Bug ID: 82735
Summary: _mm256_zeroupper does not invalidate previously computed registers
Product: gcc
Version: 7.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: marcin.slusarz at intel dot com
Target Milestone: ---

$ cat main.c
#include <stdio.h>
#include <string.h>

void test(char *dest);

int main()
{
    char buf[32];
    memset(buf, 0x2, 32);
    test(buf);
    for (int i = 0; i < 32; ++i)
        printf("%d ", buf[i]);
    printf("\n");
}

$ cat zeroupper.c
#include <immintrin.h>

void test(char *dest)
{
    __m256i ymm1 = _mm256_set1_epi8((char)0x1);
    _mm256_storeu_si256((__m256i *)dest + 32, ymm1);
    _mm256_zeroupper();
    __m256i ymm2 = _mm256_set1_epi8((char)0x1);
    _mm256_storeu_si256((__m256i *)dest, ymm2);
}

$ gcc -c -std=c99 -O2 main.c
$ gcc -c -std=c99 -O2 zeroupper.c -mavx
$ gcc zeroupper.o main.o -o zeroupper
$ ./zeroupper
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Expected output is:
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

I can reproduce it on gcc 4.9, 5.4.1, 6.3.0 and 7.2.1. gcc 4.6.4 is not
affected. clang 3.5, 3.8, 3.9 and 4.0 produce the expected output.
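
The summary and the observed output suggest one possible reading: the value stored the second time comes from a register that was filled before _mm256_zeroupper() rather than being recomputed afterwards, so its upper 128 bits arrive zeroed. The sketch below models that reading in plain C, without AVX, so it runs on any machine. All names in it (model_ymm, model_set1, model_zeroupper, model_store, test_reused, test_recomputed, count_mismatches) are hypothetical and only mimic the intrinsics. It is an assumption about the mechanism, not what GCC actually emits, and it performs only the store to dest so that it stays inside a 32-byte buffer.

```c
#include <stddef.h>
#include <string.h>

#define YMM_BYTES 32  /* width of a 256-bit register in bytes */
#define XMM_BYTES 16  /* width of its lower 128-bit half in bytes */

/* Hypothetical model of one 256-bit register as 32 plain bytes. */
typedef struct { unsigned char b[YMM_BYTES]; } model_ymm;

/* Mimics _mm256_set1_epi8: every byte gets the same value. */
model_ymm model_set1(unsigned char v)
{
    model_ymm r;
    memset(r.b, v, YMM_BYTES);
    return r;
}

/* Mimics the effect of a zeroupper on one register: bits 128..255 become 0. */
void model_zeroupper(model_ymm *r)
{
    memset(r->b + XMM_BYTES, 0, YMM_BYTES - XMM_BYTES);
}

/* Mimics _mm256_storeu_si256: copy all 32 bytes to dest. */
void model_store(unsigned char *dest, const model_ymm *r)
{
    memcpy(dest, r->b, YMM_BYTES);
}

/* Assumed faulty behaviour: the register computed before the zeroupper
   is stored again without being recomputed. */
void test_reused(unsigned char *dest)
{
    model_ymm reg = model_set1(0x1);
    model_zeroupper(&reg);
    model_store(dest, &reg);
}

/* Behaviour the source code asks for: the value is computed again
   after the zeroupper, then stored. */
void test_recomputed(unsigned char *dest)
{
    model_ymm reg = model_set1(0x1);
    model_zeroupper(&reg);
    reg = model_set1(0x1);
    model_store(dest, &reg);
}

/* Counts the bytes in buf[0..n) that differ from want. */
size_t count_mismatches(const unsigned char *buf, size_t n, unsigned char want)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i)
        if (buf[i] != want)
            ++bad;
    return bad;
}
```

Under this model, test_reused on a 32-byte buffer leaves bytes 16..31 as 0, which matches the output reported above, while test_recomputed leaves all 32 bytes as 1, which matches the expected output.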