On 07/07/2018 02:15 AM, Jakub Jelinek wrote: > Hi! > > On Fri, Jul 06, 2018 at 12:47:07PM +0200, Jakub Jelinek wrote: >> On Thu, Jul 05, 2018 at 11:57:26PM +0300, Grazvydas Ignotas wrote: >>> I think it would be more efficient if you took care of it. I won't >>> have time for at least a few days anyway. > > Here is the complete patch, I found two further issues where > the __mmask mismatch was in between the return type and what was used > in the rest of the intrinsic, so not caught by my earlier greps. > > I've added (except for the avx512bitalg which seems to have no runtime > test coverage whatsoever) tests that cover the real bugs and further > fixed the avx512*-vpcmp{,u}b-2.c test because (rel) << i triggered UB > if i could go up to 63. > > I don't have AVX512* hw, so I've just bootstrapped/regtested the patch > normally on i686-linux and x86_64-linux AVX2 hw and tried the affected > tests without the config/i386/ changes and with them under SDE. > The patch should fix these FAILs: > > FAIL: gcc.target/i386/avx512bw-vpcmpb-2.c execution test > FAIL: gcc.target/i386/avx512bw-vpcmpub-2.c execution test > FAIL: gcc.target/i386/avx512f-vinsertf32x4-3.c execution test > FAIL: gcc.target/i386/avx512f-vinserti32x4-3.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpb-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpgeb-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpgeub-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpgeuw-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpgew-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpleb-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpleub-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpleuw-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmplew-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpltb-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpltub-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpltuw-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpltw-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpneqb-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpnequb-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpnequw-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpneqw-2.c execution test > FAIL: gcc.target/i386/avx512vl-vpcmpub-2.c execution test > > Ok for trunk? > > I guess we want to backport it soon, but would appreciate somebody testing > it on real AVX512-{BW,VL} hw before doing the backports. > > Another thing to consider is whether we shouldn't add those grep/sed checks > I've been doing (at least the easy ones that don't cross-check the > i386-builtins.def against the uses in the intrin headers) to config/i386/t-* > some way. > > 2018-07-07 Jakub Jelinek <ja...@redhat.com> > > * config/i386/avx512bitalgintrin.h (_mm512_mask_bitshuffle_epi64_mask): > Use __mmask64 type instead of __mmask8 for __M argument. > * config/i386/avx512fintrin.h (_mm512_mask_xor_epi64, > _mm512_maskz_xor_epi64): Use __mmask8 type instead of __mmask16 for > __U argument. > (_mm512_mask_cmpneq_epi64_mask): Use __mmask8 type instead of > __mmask16 for __M argument. > (_mm512_maskz_insertf32x4, _mm512_maskz_inserti32x4, > _mm512_mask_insertf32x4, _mm512_mask_inserti32x4): Cast last argument > to __mmask16 instead of __mmask8. > * config/i386/avx512vlintrin.h (_mm_mask_add_ps, _mm_maskz_add_ps, > _mm256_mask_add_ps, _mm256_maskz_add_ps, _mm_mask_sub_ps, > _mm_maskz_sub_ps, _mm256_mask_sub_ps, _mm256_maskz_sub_ps, > _mm256_maskz_cvtepi32_ps, _mm_maskz_cvtepi32_ps): Use __mmask8 type > instead of __mmask16 for __U argument. > * config/i386/avx512vlbwintrin.h (_mm_mask_cmp_epi8_mask): Use > __mmask16 instead of __mmask8 for __U argument. > (_mm256_mask_cmp_epi8_mask): Use __mmask32 instead of __mmask16 for > __U argument. > (_mm256_cmp_epi8_mask): Use __mmask32 return type instead of > __mmask16. > (_mm_mask_cmp_epu8_mask): Use __mmask16 instead of __mmask8 for __U > argument. > (_mm256_mask_cmp_epu8_mask): Use __mmask32 instead of __mmask16 for > __U argument. > (_mm256_cmp_epu8_mask): Use __mmask32 return type instead of > __mmask16. > (_mm_mask_cmp_epi16_mask): Cast last argument to __mmask8 instead > of __mmask16. > (_mm256_mask_cvtepi8_epi16): Use __mmask16 instead of __mmask32 for > __U argument. > (_mm_mask_cvtepi8_epi16): Use __mmask8 instead of __mmask32 for > __U argument. > (_mm256_mask_cvtepu8_epi16): Use __mmask16 instead of __mmask32 for > __U argument. > (_mm_mask_cvtepu8_epi16): Use __mmask8 instead of __mmask32 for > __U argument. > (_mm256_mask_cmpneq_epu8_mask, _mm256_mask_cmplt_epu8_mask, > _mm256_mask_cmpge_epu8_mask, _mm256_mask_cmple_epu8_mask): Change > return type as well as __M argument type and all casts from __mmask8 > to __mmask32. > (_mm256_mask_cmpneq_epu16_mask, _mm256_mask_cmplt_epu16_mask, > _mm256_mask_cmpge_epu16_mask, _mm256_mask_cmple_epu16_mask): Change > return type as well as __M argument type and all casts from __mmask8 > to __mmask16. > (_mm256_mask_cmpneq_epi8_mask, _mm256_mask_cmplt_epi8_mask, > _mm256_mask_cmpge_epi8_mask, _mm256_mask_cmple_epi8_mask): Change > return type as well as __M argument type and all casts from __mmask8 > to __mmask32. > (_mm256_mask_cmpneq_epi16_mask, _mm256_mask_cmplt_epi16_mask, > _mm256_mask_cmpge_epi16_mask, _mm256_mask_cmple_epi16_mask): Change > return type as well as __M argument type and all casts from __mmask8 > to __mmask16. > * config/i386/avx512vbmi2vlintrin.h (_mm_mask_shrdi_epi32, > _mm_mask_shldi_epi32): Cast last argument to __mmask8 instead of > __mmask16. > > * gcc.target/i386/avx512bw-vpcmpb-2.c (CMP): Use SIZE macro instead > of hardcoding size. Cast (rel) to MASK_TYPE. > * gcc.target/i386/avx512bw-vpcmpub-2.c (CMP): Likewise. > * gcc.target/i386/avx512f-vinserti32x4-3.c: New test. > * gcc.target/i386/avx512f-vinsertf32x4-3.c: New test. > * gcc.target/i386/avx512vl-vpcmpnequb-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpgeub-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpleb-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpgeb-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpltb-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpltub-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpleub-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpneqb-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpnequw-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpgeuw-2.c: New test. > * gcc.target/i386/avx512vl-vpcmplew-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpgew-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpltw-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpltuw-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpleuw-2.c: New test. > * gcc.target/i386/avx512vl-vpcmpneqw-2.c: New test. > > 2018-07-07 Grazvydas Ignotas <nota...@gmail.com> > > * config/i386/avx512bwintrin.h: (_mm512_mask_cmp_epi8_mask, > _mm512_mask_cmp_epu8_mask): Use __mmask64 type instead of __mmask32 > for __U argument. > > * gcc.target/i386/avx512bw-vpcmpb-2.c (SIZE): Define to > (AVX512F_LEN / 8) instead of (AVX512F_LEN / 16). > * gcc.target/i386/avx512bw-vpcmpub-2.c (SIZE): Likewise. OK.
FWIW, we have plenty of avx512 machines available in beaker. You can do queries based on the cpuflags. Select "Key/Value" for the table. "CPUFLAGS" for the Keyvalue "contains" for Operation and "avx512" for Value. Jeff Do a search on Key/Value for CPUFLAGS contains avx512. jeff