PING^1 [PATCH] rs6000: Fix vector_set_var_p9 by considering BE [PR108807]
Hi,

I'd like to gently ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612213.html

It fixes a regression, so I think it qualifies as stage 4 content.

BR,
Kewen

on 2023/2/17 17:55, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> As PR108807 exposes, the current handling in function
> rs6000_expand_vector_set_var_p9 doesn't take care of big
> endianness.  Currently the function is to rotate the
> target vector by moving element to-be-set to element 0,
> set element 0 with the given val, then rotate back.  To
> get the permutation control vector for the rotation, it
> makes use of lvsr and lvsl, but the element ordering is
> different for BE and LE (like element 0 is the most
> significant one on BE while the least significant one on
> LE), this patch is to add consideration for BE and make
> sure permutation control vectors for rotations are expected.
>
> As tested, it helped to fix the below failures:
>
> FAIL: gcc.target/powerpc/pr79251-run.p9.c execution test
> FAIL: gcc.target/powerpc/pr89765-mc.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-10d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-11d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-14d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-16d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-18d.c execution test
> FAIL: gcc.target/powerpc/vsx-builtin-9d.c execution test
>
> Bootstrapped and regtested on powerpc64-linux-gnu P{8,9}
> and powerpc64le-linux-gnu P10.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -----
> 	PR target/108807
>
> gcc/ChangeLog:
>
> 	* config/rs6000/rs6000.cc (rs6000_expand_vector_set_var_p9): Fix gen
> 	function for permutation control vector by considering big endianness.
> --- > gcc/config/rs6000/rs6000.cc | 48 + > 1 file changed, 28 insertions(+), 20 deletions(-) > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 16ca3a31757..774eb2963d9 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -7235,22 +7235,26 @@ rs6000_expand_vector_set_var_p9 (rtx target, rtx val, > rtx idx) > >machine_mode shift_mode; >rtx (*gen_ashl)(rtx, rtx, rtx); > - rtx (*gen_lvsl)(rtx, rtx); > - rtx (*gen_lvsr)(rtx, rtx); > + rtx (*gen_pcvr1)(rtx, rtx); > + rtx (*gen_pcvr2)(rtx, rtx); > >if (TARGET_POWERPC64) > { >shift_mode = DImode; >gen_ashl = gen_ashldi3; > - gen_lvsl = gen_altivec_lvsl_reg_di; > - gen_lvsr = gen_altivec_lvsr_reg_di; > + gen_pcvr1 = BYTES_BIG_ENDIAN ? gen_altivec_lvsl_reg_di > +: gen_altivec_lvsr_reg_di; > + gen_pcvr2 = BYTES_BIG_ENDIAN ? gen_altivec_lvsr_reg_di > +: gen_altivec_lvsl_reg_di; > } >else > { >shift_mode = SImode; >gen_ashl = gen_ashlsi3; > - gen_lvsl = gen_altivec_lvsl_reg_si; > - gen_lvsr = gen_altivec_lvsr_reg_si; > + gen_pcvr1 = BYTES_BIG_ENDIAN ? gen_altivec_lvsl_reg_si > +: gen_altivec_lvsr_reg_si; > + gen_pcvr2 = BYTES_BIG_ENDIAN ? gen_altivec_lvsr_reg_si > +: gen_altivec_lvsl_reg_si; > } >/* Generate the IDX for permute shift, width is the vector element size. > idx = idx * width. */ > @@ -7259,25 +7263,29 @@ rs6000_expand_vector_set_var_p9 (rtx target, rtx val, > rtx idx) > >emit_insn (gen_ashl (tmp, idx, GEN_INT (shift))); > > - /* lvsrv1,0,idx. */ > - rtx pcvr = gen_reg_rtx (V16QImode); > - emit_insn (gen_lvsr (pcvr, tmp)); > - > - /* lvslv2,0,idx. */ > - rtx pcvl = gen_reg_rtx (V16QImode); > - emit_insn (gen_lvsl (pcvl, tmp)); > + /* Generate one permutation control vector used for rotating the element > + at to-insert position to element zero in target vector. lvsl is > + used for big endianness while lvsr is used for little endianness: > + lvs[lr]v1,0,idx. 
*/ > + rtx pcvr1 = gen_reg_rtx (V16QImode); > + emit_insn (gen_pcvr1 (pcvr1, tmp)); > >rtx sub_target = simplify_gen_subreg (V16QImode, target, mode, 0); > + rtx perm1 = gen_altivec_vperm_v8hiv16qi (sub_target, sub_target, > sub_target, > +pcvr1); > + emit_insn (perm1); > > - rtx permr > -= gen_altivec_vperm_v8hiv16qi (sub_target, sub_target, sub_target, pcvr); > - emit_insn (permr); > - > + /* Insert val into element 0 of target vector. */ >rs6000_expand_vector_set (target, val, const0_rtx); > > - rtx perml > -= gen_altivec_vperm_v8hiv16qi (sub_target, sub_target, sub_target, pcvl); > - emit_insn (perml); > + /* Rotate back with a reversed permutation control vector generated from: > + lvs[rl] v2,0,idx. */ > + rtx pcvr2 = gen_reg_rtx (V16QImode); > + emit_insn (gen_pcvr2 (pcvr2, tmp)); > + > + rtx perm2 = gen_altivec_vperm_v8hiv16qi (sub_target, sub_target, > sub_target, > +
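As a side note for readers of the thread, the rotate / set-element-0 / rotate-back scheme described above can be modelled on a plain C array.  This is only a sketch under assumptions: NELT, rotate_left and vector_set_var_model are made-up names, the array index stands in for the vector element number, and endianness (which in the real code only decides whether lvsl or lvsr generates each permutation control vector) is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NELT 4  /* hypothetical element count, e.g. a V4SI-like vector */

/* Rotate the NELT-element array V left by K positions, so that the
   element at index K ends up at index 0.  */
static void
rotate_left (int v[NELT], size_t k)
{
  int tmp[NELT];
  for (size_t i = 0; i < NELT; i++)
    tmp[i] = v[(i + k) % NELT];
  memcpy (v, tmp, sizeof tmp);
}

/* Set element IDX of V to VAL using rotate / insert-at-0 / rotate-back.
   Requiring IDX < NELT is an assumption of this model only.  */
static void
vector_set_var_model (int v[NELT], int val, size_t idx)
{
  assert (idx < NELT);
  rotate_left (v, idx);                   /* element idx -> element 0 */
  v[0] = val;                             /* insert at element 0 */
  rotate_left (v, (NELT - idx) % NELT);   /* undo the first rotation */
}
```

The second rotation amount is the complement of the first, which is the array-level analogue of using the "reversed" permutation control vector for the rotate-back step.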
[PATCH] libgcc: Use initarray section type for .init_stack
Hi,

One of my workmates found a warning like:

  libgcc/config/rs6000/morestack.S:402: Warning: ignoring incorrect section type for .init_array.0

when compiling libgcc/config/rs6000/morestack.S.

Since commit r13-6545 touched that file recently and was
suspected to be responsible for this warning, I did some
investigation and found that the warning has been there for
a long time.  For section .init_array*, it's preferred to
use section type SHT_INIT_ARRAY, so this patch uses
"@init_array" instead of "@progbits".

Although the warning is trivial, Segher suggested that I
post this fix, in order to avoid any possible
misunderstanding or confusion about the warning.

As Alan confirmed, this doesn't require a check on whether
the existing binutils supports "@init_array" or not,
"because if you want split-stack to work, you must link
with gold, any version of binutils that has gold has an
assembler that understands @init_array".  (Thanks Alan!)

Bootstrapped and regtested on x86_64-redhat-linux and
powerpc64{,le}-linux-gnu.

Is it ok for trunk when next stage 1 comes?

BR,
Kewen
-----
libgcc/ChangeLog:

	* config/i386/morestack.S: Use @init_array rather than
	@progbits for section type of section .init_array.
	* config/rs6000/morestack.S: Likewise.
	* config/s390/morestack.S: Likewise.
---
 libgcc/config/i386/morestack.S   | 2 +-
 libgcc/config/rs6000/morestack.S | 2 +-
 libgcc/config/s390/morestack.S   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/libgcc/config/i386/morestack.S b/libgcc/config/i386/morestack.S
index 7ae99b50cf5..c822b71e2dd 100644
--- a/libgcc/config/i386/morestack.S
+++ b/libgcc/config/i386/morestack.S
@@ -850,7 +850,7 @@ __morestack_make_guard:
 # This is ELF specific.
 #if HAVE_INITFINI_ARRAY_SUPPORT
-	.section	.init_array.0,"aw",@progbits
+	.section	.init_array.0,"aw",@init_array
 #else
 	.section	.ctors.65535,"aw",@progbits
 #endif
diff --git a/libgcc/config/rs6000/morestack.S b/libgcc/config/rs6000/morestack.S
index f2fea6abb10..dd1e27cd454 100644
--- a/libgcc/config/rs6000/morestack.S
+++ b/libgcc/config/rs6000/morestack.S
@@ -399,7 +399,7 @@ ENTRY0(__morestack_make_guard)
 # Make __stack_split_initialize a high priority constructor.
 #if HAVE_INITFINI_ARRAY_SUPPORT
-	.section .init_array.0,"aw",@progbits
+	.section .init_array.0,"aw",@init_array
 #else
 	.section .ctors.65535,"aw",@progbits
 #endif
diff --git a/libgcc/config/s390/morestack.S b/libgcc/config/s390/morestack.S
index 09a49bb8851..f52e7a6510c 100644
--- a/libgcc/config/s390/morestack.S
+++ b/libgcc/config/s390/morestack.S
@@ -597,7 +597,7 @@ __morestack_make_guard:
 # Make __stack_split_initialize a high priority constructor.
 #if HAVE_INITFINI_ARRAY_SUPPORT
-	.section .init_array.0,"aw",@progbits
+	.section .init_array.0,"aw",@init_array
 #else
 	.section .ctors.65535,"aw",@progbits
 #endif
-- 
2.31.1
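For reference, the C-level counterpart of a prioritized .init_array entry is a constructor with a priority.  The sketch below is not how morestack.S does it (that is hand-written assembly using priority 0, and priorities 0 to 100 are reserved for the implementation); early_init and late_init are hypothetical names, and the example only illustrates that a lower priority number runs earlier, before main.

```c
#include <assert.h>

/* Records the order in which the constructors ran.  */
static int init_order[2];
static int init_count;

/* Hypothetical early constructor: lower priority number, runs first.  */
__attribute__ ((constructor (101))) static void
early_init (void)
{
  assert (init_count < 2);
  init_order[init_count++] = 101;
}

/* Hypothetical later constructor: higher priority number, runs second.  */
__attribute__ ((constructor (200))) static void
late_init (void)
{
  assert (init_count < 2);
  init_order[init_count++] = 200;
}
```

On ELF targets where GCC emits constructors into .init_array, the section type discussed in this patch is what marks those sections for the linker and loader.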
[PATCH] rs6000: Make _mm_slli_si128 and _mm_bslli_si128 consistent [PR109167]
Hi,

As PR109167 shows, it's unexpected to have two different
implementations for _mm_slli_si128 and _mm_bslli_si128; as in
gcc/config/i386/emmintrin.h, they should be the same.  So this
patch fixes it accordingly.

Bootstrapped and regtested on powerpc64-linux-gnu P8 and
powerpc64le-linux-gnu P9 and P10.

I'm going to push this soon if no objections.

BR,
Kewen
-----
	PR target/109167

gcc/ChangeLog:

	* config/rs6000/emmintrin.h (_mm_bslli_si128): Move the implementation
	from ...
	(_mm_slli_si128): ... here.  Change to call _mm_bslli_si128 directly.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr109167.c: New test.
---
 gcc/config/rs6000/emmintrin.h               | 26
 gcc/testsuite/gcc.target/powerpc/pr109167.c | 47 +
 2 files changed, 56 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr109167.c

diff --git a/gcc/config/rs6000/emmintrin.h b/gcc/config/rs6000/emmintrin.h
index bfff7ff6fea..44d01a83d8d 100644
--- a/gcc/config/rs6000/emmintrin.h
+++ b/gcc/config/rs6000/emmintrin.h
@@ -1601,8 +1601,14 @@ _mm_bslli_si128 (__m128i __A, const int __N)
   __v16qu __result;
   const __v16qu __zeros = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };

-  if (__N >= 0 && __N < 16)
+  if (__N == 0)
+    return __A;
+  else if (__N > 0 && __N < 16)
+#ifdef __LITTLE_ENDIAN__
     __result = vec_sld ((__v16qu) __A, __zeros, __N);
+#else
+    __result = vec_sld (__zeros, (__v16qu) __A, (16 - __N));
+#endif
   else
     __result = __zeros;

@@ -1647,23 +1653,9 @@ _mm_srli_si128 (__m128i __A, const int __N)
 }

 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_mm_slli_si128 (__m128i __A, const int _imm5)
+_mm_slli_si128 (__m128i __A, const int __N)
 {
-  __v16qu __result;
-  const __v16qu __zeros = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
-
-  if (_imm5 == 0)
-    return __A;
-  else if (_imm5 > 0 && _imm5 < 16)
-#ifdef __LITTLE_ENDIAN__
-    __result = vec_sld ((__v16qu) __A, __zeros, _imm5);
-#else
-    __result = vec_sld (__zeros, (__v16qu) __A, (16 - _imm5));
-#endif
-  else
-    __result = __zeros;
-
-  return (__m128i) __result;
+  return _mm_bslli_si128 (__A, __N);
 }

 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
diff --git a/gcc/testsuite/gcc.target/powerpc/pr109167.c b/gcc/testsuite/gcc.target/powerpc/pr109167.c
new file mode 100644
index 000..d490c995b14
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr109167.c
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx" } */
+
+/* Verify there is no warning message.  */
+
+#define NO_WARN_X86_INTRINSICS 1
+
+#include <emmintrin.h>
+
+#define N 5
+
+__attribute__ ((noipa)) __m128i
+test1 (__m128i v)
+{
+  return _mm_bslli_si128 (v, N);
+}
+
+__attribute__ ((noipa)) __m128i
+test2 (__m128i v)
+{
+  return _mm_slli_si128 (v, N);
+}
+
+typedef union
+{
+  __m128i x;
+  unsigned char a[16];
+} union128i_ub;
+
+int main()
+{
+  union128i_ub v;
+  v.x
+    = _mm_set_epi8 (1, 2, 3, 4, 10, 20, 30, 90, 80, 40, 100, 15, 98, 25, 98, 7);
+
+  union128i_ub r1, r2;
+  r1.x = test1 (v.x);
+  r2.x = test2 (v.x);
+
+  for (int i = 0; i < 16; i++)
+    if (r1.a[i] != r2.a[i])
+      __builtin_abort();
+
+  return 0;
+}
+
-- 
2.31.1
[PATCH] rs6000: Ensure vec_sld shift count in allowable range [PR109082]
Hi,

As PR109082 shows, some uses of vec_sld in emmintrin.h don't
strictly guarantee that the given shift count is in the range 0-15
(inclusive).  This patch makes those uses honor the argument
range constraint.

Bootstrapped and regtested on powerpc64-linux-gnu P8 and
powerpc64le-linux-gnu P9 and P10.

I'm going to push this soon if no objections.

BR,
Kewen
-----
	PR target/109082

gcc/ChangeLog:

	* config/rs6000/emmintrin.h (_mm_bslli_si128): Check __N is not less
	than zero when calling vec_sld.
	(_mm_bsrli_si128): Return __A if __N is zero, check __N is bigger than
	zero when calling vec_sld.
	(_mm_slli_si128): Return __A if _imm5 is zero, check _imm5 is bigger
	than zero when calling vec_sld.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr109082.c: New test.
---
 gcc/config/rs6000/emmintrin.h               | 10 +++---
 gcc/testsuite/gcc.target/powerpc/pr109082.c | 14 ++
 2 files changed, 21 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr109082.c

diff --git a/gcc/config/rs6000/emmintrin.h b/gcc/config/rs6000/emmintrin.h
index f6a6dbf399a..bfff7ff6fea 100644
--- a/gcc/config/rs6000/emmintrin.h
+++ b/gcc/config/rs6000/emmintrin.h
@@ -1601,7 +1601,7 @@ _mm_bslli_si128 (__m128i __A, const int __N)
   __v16qu __result;
   const __v16qu __zeros = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };

-  if (__N < 16)
+  if (__N >= 0 && __N < 16)
     __result = vec_sld ((__v16qu) __A, __zeros, __N);
   else
     __result = __zeros;
@@ -1615,7 +1615,9 @@ _mm_bsrli_si128 (__m128i __A, const int __N)
   __v16qu __result;
   const __v16qu __zeros = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };

-  if (__N < 16)
+  if (__N == 0)
+    return __A;
+  else if (__N > 0 && __N < 16)
 #ifdef __LITTLE_ENDIAN__
     if (__builtin_constant_p(__N))
       /* Would like to use Vector Shift Left Double by Octet
@@ -1650,7 +1652,9 @@ _mm_slli_si128 (__m128i __A, const int _imm5)
   __v16qu __result;
   const __v16qu __zeros = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };

-  if (_imm5 < 16)
+  if (_imm5 == 0)
+    return __A;
+  else if (_imm5 > 0 && _imm5 < 16)
 #ifdef __LITTLE_ENDIAN__
     __result = vec_sld ((__v16qu) __A, __zeros, _imm5);
 #else
diff --git a/gcc/testsuite/gcc.target/powerpc/pr109082.c b/gcc/testsuite/gcc.target/powerpc/pr109082.c
new file mode 100644
index 000..98da22c386b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr109082.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+/* Verify there is no warning message.  */
+
+#define NO_WARN_X86_INTRINSICS 1
+#include <emmintrin.h>
+
+__m128i
+foo (__m128i A)
+{
+  return _mm_bsrli_si128 (A, 0);
+}
-- 
2.31.1
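To make the intended semantics concrete, here is a small scalar model of the two byte shifts, written against a 16-byte array whose index 0 is the least significant byte.  It is a sketch, not the header's implementation: bslli_model and bsrli_model are hypothetical names, the "no in-place use" restriction is an assumption of the model, and the handling of out-of-range counts (all zeros) follows the else branch in the patched code above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-wise shift left by N: zeros enter at the low (index 0) end.
   A count outside 0..15 yields all zeros; N == 0 is a plain copy.  */
static void
bslli_model (uint8_t out[16], const uint8_t in[16], int n)
{
  assert (out != in);           /* the model does not support aliasing */
  memset (out, 0, 16);
  if (n < 0 || n > 15)
    return;
  for (int i = n; i < 16; i++)
    out[i] = in[i - n];
}

/* Byte-wise shift right by N: zeros enter at the high (index 15) end.  */
static void
bsrli_model (uint8_t out[16], const uint8_t in[16], int n)
{
  assert (out != in);
  memset (out, 0, 16);
  if (n < 0 || n > 15)
    return;
  for (int i = 0; i + n < 16; i++)
    out[i] = in[i + n];
}
```

Only counts 1..15 correspond to a real vec_sld call in the patched header; 0 and the out-of-range counts are handled before the builtin is reached, which is what keeps its argument inside the allowed range.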
[PATCH v3] rs6000: Fix vector parity support [PR108699]
Hi,

The failures in the originally failing case builtin-bitops-1.c
and the associated test case pr108699.c show that the current
support for vector parity is wrong on Power.  The hardware insns
vprtyb[wdq] operate only on the least significant bit of each
byte per element, which does not match what the RTL opcode
parity needs, but the current implementation wrongly expands
parity with them alone.  This patch fixes the handling by using
one more insn, vpopcntb.

Comparing to v2 [1]:
  - Use rs6000_vprtyb2 rather than parityb2, and adjust several
    places with it accordingly.

Bootstrapped and regtested on powerpc64-linux-gnu P{8,9} and
powerpc64le-linux-gnu P10.

Is it ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612212.html

BR,
Kewen
-----
	PR target/108699

gcc/ChangeLog:

	* config/rs6000/altivec.md (*p9v_parity2): Rename to ...
	(rs6000_vprtyb2): ... this.
	* config/rs6000/rs6000-builtins.def (VPRTYBD): Replace parityv2di2 with
	rs6000_vprtybv2di2.
	(VPRTYBW): Replace parityv4si2 with rs6000_vprtybv4si2.
	(VPRTYBQ): Replace parityv1ti2 with rs6000_vprtybv1ti2.
	* config/rs6000/vector.md (parity2 with VEC_IP): Expand with
	popcountv16qi2 and the corresponding rs6000_vprtyb2.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/p9-vparity.c: Add scan-assembler-not for vpopcntb
	to distinguish parity byte from parity.
	* gcc.target/powerpc/pr108699.c: New test.
--- gcc/config/rs6000/altivec.md | 8 ++-- gcc/config/rs6000/rs6000-builtins.def | 6 +-- gcc/config/rs6000/vector.md | 11 - gcc/testsuite/gcc.target/powerpc/p9-vparity.c | 1 + gcc/testsuite/gcc.target/powerpc/pr108699.c | 42 +++ 5 files changed, 61 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108699.c diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 30606b8ab21..49b0c964f4d 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -4195,9 +4195,11 @@ (define_insn "*p8v_popcount2" [(set_attr "type" "vecsimple")]) ;; Vector parity -(define_insn "*p9v_parity2" - [(set (match_operand:VParity 0 "register_operand" "=v") -(parity:VParity (match_operand:VParity 1 "register_operand" "v")))] +(define_insn "rs6000_vprtyb2" + [(set (match_operand:VEC_IP 0 "register_operand" "=v") +(unspec:VEC_IP + [(match_operand:VEC_IP 1 "register_operand" "v")] + UNSPEC_PARITY))] "TARGET_P9_VECTOR" "vprtyb %0,%1" [(set_attr "type" "vecsimple")]) diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def index e0d9f5adc97..03fb194b151 100644 --- a/gcc/config/rs6000/rs6000-builtins.def +++ b/gcc/config/rs6000/rs6000-builtins.def @@ -2666,13 +2666,13 @@ VMSUMUDM altivec_vmsumudm {} const vsll __builtin_altivec_vprtybd (vsll); -VPRTYBD parityv2di2 {} +VPRTYBD rs6000_vprtybv2di2 {} const vsq __builtin_altivec_vprtybq (vsq); -VPRTYBQ parityv1ti2 {} +VPRTYBQ rs6000_vprtybv1ti2 {} const vsi __builtin_altivec_vprtybw (vsi); -VPRTYBW parityv4si2 {} +VPRTYBW rs6000_vprtybv4si2 {} const vsll __builtin_altivec_vrldmi (vsll, vsll, vsll); VRLDMI altivec_vrldmi {} diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md index 12fd5f976ed..1ae04c8e0a8 100644 --- a/gcc/config/rs6000/vector.md +++ b/gcc/config/rs6000/vector.md @@ -1226,7 +1226,16 @@ (define_expand "popcount2" (define_expand "parity2" [(set (match_operand:VEC_IP 0 "register_operand") (parity:VEC_IP 
(match_operand:VEC_IP 1 "register_operand")))] - "TARGET_P9_VECTOR") + "TARGET_P9_VECTOR" +{ + rtx op1 = gen_lowpart (V16QImode, operands[1]); + rtx res = gen_reg_rtx (V16QImode); + emit_insn (gen_popcountv16qi2 (res, op1)); + emit_insn (gen_rs6000_vprtyb2 (operands[0], + gen_lowpart (mode, res))); + + DONE; +}) ;; Same size conversions diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vparity.c b/gcc/testsuite/gcc.target/powerpc/p9-vparity.c index f4aba1567cd..8f6f1239f7a 100644 --- a/gcc/testsuite/gcc.target/powerpc/p9-vparity.c +++ b/gcc/testsuite/gcc.target/powerpc/p9-vparity.c @@ -105,3 +105,4 @@ parity_ti_4u (__uint128_t a) /* { dg-final { scan-assembler "vprtybd" } } */ /* { dg-final { scan-assembler "vprtybq" } } */ /* { dg-final { scan-assembler "vprtybw" } } */ +/* { dg-final { scan-assembler-not "vpopcntb" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr108699.c b/gcc/testsuite/gcc.target/powerpc/pr108699.c new file mode 100644 index 000..f02bac130cc --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr108699.c @@ -0,0 +1,42 @@ +/* { dg-run } */ +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model" } */ + +#define N 16 + +unsigned long long vals[N]; +unsigned int res[N]; +unsigned int expects[N] = {0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0
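The mismatch described in this patch can be checked with a scalar model.  The sketch below assumes a single 32-bit element and uses hypothetical helper names: lsb_of_bytes_xor mimics what a vprtybw-style operation computes for one element, and parity32_model applies a per-byte popcount first, which is the idea behind expanding parity as vpopcntb followed by vprtyb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-byte population count, a stand-in for vpopcntb on one byte.  */
static unsigned
popcount8 (uint8_t b)
{
  unsigned n = 0;
  for (; b; b >>= 1)
    n += b & 1u;
  return n;
}

/* XOR of the least significant bit of each byte of X.  */
static unsigned
lsb_of_bytes_xor (uint32_t x)
{
  unsigned r = 0;
  for (int i = 0; i < 4; i++)
    r ^= (x >> (8 * i)) & 1u;
  return r;
}

/* Full parity: replace each byte by its popcount, then XOR the least
   significant bits of those counts.  */
static unsigned
parity32_model (uint32_t x)
{
  uint32_t counts = 0;
  for (int i = 0; i < 4; i++)
    counts |= (uint32_t) popcount8 ((uint8_t) (x >> (8 * i))) << (8 * i);
  return lsb_of_bytes_xor (counts);
}

/* Bit-by-bit reference used only for checking the model.  */
static unsigned
parity32_reference (uint32_t x)
{
  unsigned r = 0;
  for (; x; x >>= 1)
    r ^= x & 1u;
  return r;
}

static void
check_parity_model (void)
{
  const uint32_t samples[] = { 0u, 1u, 2u, 3u, 0xffu, 0x01010101u,
			       0x80000000u, 0xdeadbeefu };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (parity32_model (samples[i]) == parity32_reference (samples[i]));
  /* The byte-LSB XOR alone gets 0x2 wrong: one bit is set, but it is
     not the least significant bit of its byte.  */
  assert (lsb_of_bytes_xor (0x2u) != parity32_reference (0x2u));
}
```

The model works because the XOR of the low bits of the per-byte counts equals the total bit count modulo 2.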
[RFC/PATCH] sched: Consider debug insn in no_real_insns_p [PR108273]
Hi,

As PR108273 shows, when a block has only NOTE_P and LABEL_P
insns in non-debug mode but has some extra DEBUG_INSN_P insns
in debug mode, the DFA states after scheduling it differ between
debug mode and non-debug mode.  In non-debug mode the block
meets no_real_insns_p and gets skipped, while in debug mode it
gets scheduled; even though it only has NOTE_P, LABEL_P and
DEBUG_INSN_P insns, the call to advance_one_cycle changes the
DFA state.  PR108519 also shows that this issue can be exposed
by some scheduler changes.

This patch takes debug insns into account in no_real_insns_p,
so that we don't try to schedule a block having only NOTE_P,
LABEL_P and DEBUG_INSN_P insns, resulting in consistent DFA
states between non-debug and debug mode.  Changing
no_real_insns_p caused an ICE in free_block_dependencies; the
root cause is that we create dependencies for debug insns, and
those dependencies are expected to be resolved while scheduling
the insns, which is skipped after the change in
no_real_insns_p.  From reading the code, it looks reasonable to
skip computing block dependencies for no_real_insns_p blocks.
With that, it can be bootstrapped and regtested, but it hit one
ICE when building SPEC2017 bmks with -O2 -g.  The root cause is
that initially there are no no_real_insns_p blocks in a region,
but during later scheduling one block has one insn scheduled
speculatively and then becomes no_real_insns_p.  So we compute
dependencies and rgn_n_insns for this special block before
scheduling, but later it gets skipped and is not scheduled, so
the following counts mismatch:

  /* Sanity check: verify that all region insns were scheduled.  */
  gcc_assert (sched_rgn_n_insns == rgn_n_insns);

and we fail to release the allocated dependencies.

To avoid these unexpected mismatches, this patch adds a bitmap
to track this kind of special block, which isn't no_real_insns_p
initially but becomes no_real_insns_p later, so that we can
adjust the count and free the deps for it.

This patch can be bootstrapped and regress-tested on
x86_64-redhat-linux, aarch64-linux-gnu and
powerpc64{,le}-linux-gnu.

I also verified that this patch can build the SPEC2017 intrate
and fprate bmks at -g -O2/-O3.

This is for next stage 1, but since I know little about the
scheduler, I'd like to post it early for more comments.

Is it on the right track?  Any thoughts?

BR,
Kewen
-----
	PR rtl-optimization/108273

gcc/ChangeLog:

	* haifa-sched.cc (no_real_insns_p): Consider DEBUG_INSN_P insn.
	* sched-rgn.cc (no_real_insns): New static bitmap variable.
	(compute_block_dependences): Skip for no_real_insns_p.
	(free_deps_for_bb_no_real_insns_p): New function.
	(free_block_dependencies): Call free_deps_for_bb_no_real_insns_p for
	no_real_insns_p bb.
	(schedule_region): Fix up sched_rgn_n_insns for some block for which
	rgn_n_insns is computed before, and move sched_rgn_local_finish after
	free_block_dependencies loop.
	(sched_rgn_local_init): Allocate and compute no_real_insns.
	(sched_rgn_local_free): Free no_real_insns.
---
 gcc/haifa-sched.cc |  8 -
 gcc/sched-rgn.cc   | 84 +++---
 2 files changed, 87 insertions(+), 5 deletions(-)

diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
index 48b53776fa9..378f3b34cc0 100644
--- a/gcc/haifa-sched.cc
+++ b/gcc/haifa-sched.cc
@@ -5040,7 +5040,13 @@ no_real_insns_p (const rtx_insn *head, const rtx_insn *tail)
 {
   while (head != NEXT_INSN (tail))
     {
-      if (!NOTE_P (head) && !LABEL_P (head))
+      /* Take debug insn into account here, otherwise we can have different
+	 DFA states after scheduling a block which only has NOTE_P, LABEL_P
+	 and DEBUG_P (debug mode) insns between non-debug and debug modes,
+	 it could cause -fcompare-debug failure.  */
+      if (!NOTE_P (head)
+	  && !LABEL_P (head)
+	  && !DEBUG_INSN_P (head))
 	return 0;
       head = NEXT_INSN (head);
     }
diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc
index f2751f62450..211b62e2b4a 100644
--- a/gcc/sched-rgn.cc
+++ b/gcc/sched-rgn.cc
@@ -213,6 +213,11 @@ static int rgn_nr_edges;
 /* Array of size rgn_nr_edges.  */
 static edge *rgn_edges;

+/* For basic block i, the corresponding set bit i in bitmap indicates this basic
+   block meets predicate no_real_insns_p before scheduling any basic blocks in
+   the region.  */
+static bitmap no_real_insns;
+
 /* Mapping from each edge in the graph to its number in the rgn.  */
 #define EDGE_TO_BIT(edge) ((int)(size_t)(edge)->aux)
 #define SET_EDGE_TO_BIT(edge,nr) ((edge)->aux = (void *)(size_t)(nr))
@@ -2730,6 +2735,15 @@ compute_block_dependences (int bb)
   gcc_assert (EBB_FIRST_BB (bb) == EBB_LAST_BB (bb));
   get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail);

+  /* Don't compute block dependencies if there are no real insns.  */
+  if (no_real_insns_p
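The effect of the predicate change can be shown with a toy model.  Everything below is an assumption made for illustration: enum insn_kind and both function names are hypothetical, and GCC's real rtx_insn chain is replaced by a plain array.  The sketch only shows why the same block was classified differently with and without debug insns before the change, and identically after it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified insn kinds.  */
enum insn_kind { INSN_NOTE, INSN_LABEL, INSN_DEBUG, INSN_REAL };

/* Old behaviour: only notes and labels are ignored.  */
static bool
no_real_insns_old (const enum insn_kind *insns, size_t n)
{
  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (insns[i] != INSN_NOTE && insns[i] != INSN_LABEL)
      return false;
  return true;
}

/* Proposed behaviour: debug insns are ignored as well, so a block gets
   the same answer whether or not debug info is being generated.  */
static bool
no_real_insns_new (const enum insn_kind *insns, size_t n)
{
  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (insns[i] != INSN_NOTE
	&& insns[i] != INSN_LABEL
	&& insns[i] != INSN_DEBUG)
      return false;
  return true;
}
```

With the old predicate, a block holding { note, label } is skipped while the same block compiled with -g, holding { note, debug, label }, is scheduled; with the new predicate both are skipped.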
[PATCH] fix for __sanitizer_struct_mallinfo with mallinfo2
Fix sanitizers with mallinfo2,
e.g. fedora already uses mallinfo2 with long v[10];

-- 
Reini Urban

From 074b5b5d073137762a3bbef3cece5646cea537b5 Mon Sep 17 00:00:00 2001
From: Reini Urban
Date: Sat, 12 Mar 2022 09:52:36 +0100
Subject: [PATCH 1/2] __sanitizer_struct_mallinfo vs mallinfo2 size assertion

libsanitizer/Changelog:
	* configure.ac: add mallinfo2 probe
	* sanitizer_common/sanitizer_platform_limits_posix.h (struct
	__sanitizer_struct_mallinfo): use mallinfo2 probe

Signed-off-by: Reini Urban
---
 libsanitizer/configure.ac                              | 3 ++-
 .../sanitizer_common/sanitizer_platform_limits_posix.h | 5 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git libsanitizer/configure.ac libsanitizer/configure.ac
index 04cd8910ed6..227c2644ecd 100644
--- libsanitizer/configure.ac
+++ libsanitizer/configure.ac
@@ -103,7 +103,8 @@ AM_CONDITIONAL(LSAN_SUPPORTED, [test "x$LSAN_SUPPORTED" = "xyes"])
 AM_CONDITIONAL(HWASAN_SUPPORTED, [test "x$HWASAN_SUPPORTED" = "xyes"])

 # Check for functions needed.
-AC_CHECK_FUNCS(clock_getres clock_gettime clock_settime lstat readlink)
+AC_CHECK_FUNCS(clock_getres clock_gettime clock_settime lstat readlink \
+  mallinfo mallinfo2)

 # Common libraries that we need to link against for all sanitizer libs.
 link_sanitizer_common='-lpthread -lm'
diff --git libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
index 44dd3d9e22d..918ee95ef82 100644
--- libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
+++ libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -213,7 +213,12 @@ struct __sanitizer_struct_mallinfo {

 #if SANITIZER_LINUX && !SANITIZER_ANDROID
 struct __sanitizer_struct_mallinfo {
+  // e.g ubuntu uses mallinfo, fedora mallinfo2
+#ifdef HAVE_MALLINFO2
+  long v[10];
+#else
   int v[10];
+#endif
 };

 extern unsigned struct_ustat_sz;
-- 
2.34.1

From 6e1ab452bcf2bae0be20faf65966c8ee2f755a2b Mon Sep 17 00:00:00 2001
From: Reini Urban
Date: Sun, 20 Feb 2022 18:27:15 +0100
Subject: [PATCH 2/2] gcc: fixup report_heap_memory_use() without mallinfo2 decl

gcc/ChangeLog:
	* gcc/ggc-common.cc (report_heap_memory_use): fix without mallinfo2 decl

Signed-off-by: Reini Urban
---
 gcc/ggc-common.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git gcc/ggc-common.cc gcc/ggc-common.cc
index db317f49993..9cd446995f8 100644
--- gcc/ggc-common.cc
+++ gcc/ggc-common.cc
@@ -1276,7 +1276,7 @@ void
 report_heap_memory_use ()
 {
 #if defined(HAVE_MALLINFO) || defined(HAVE_MALLINFO2)
-#ifdef HAVE_MALLINFO2
+#if defined HAVE_MALLINFO2 && HAVE_DECL_MALLINFO2
 #define MALLINFO_FN mallinfo2
 #else
 #define MALLINFO_FN mallinfo
-- 
2.34.1
[PATCH] RISC-V: Fix RVV ICE && runtime fail
From: Ju-Zhe Zhong gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (eliminate_insn): Fix bugs. (insert_vsetvl): Ditto. (pass_vsetvl::emit_local_forward_vsetvls): Ditto. * config/riscv/riscv-vsetvl.h (enum vsetvl_type): Ditto. * config/riscv/vector.md: Ditto. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/bug-16.C: New test. * g++.target/riscv/rvv/base/bug-17.C: New test. --- gcc/config/riscv/riscv-vsetvl.cc | 31 +- gcc/config/riscv/riscv-vsetvl.h | 1 + gcc/config/riscv/vector.md| 4 +- .../g++.target/riscv/rvv/base/bug-16.C| 443 ++ .../g++.target/riscv/rvv/base/bug-17.C| 406 5 files changed, 876 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/bug-16.C create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/bug-17.C diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc index f4c1773da0d..b5f5301ea43 100644 --- a/gcc/config/riscv/riscv-vsetvl.cc +++ b/gcc/config/riscv/riscv-vsetvl.cc @@ -686,7 +686,7 @@ eliminate_insn (rtx_insn *rinsn) delete_insn (rinsn); } -static void +static vsetvl_type insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn, const vector_insn_info &info, const vector_insn_info &prev_info) { @@ -697,14 +697,14 @@ insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn, { emit_vsetvl_insn (VSETVL_VTYPE_CHANGE_ONLY, emit_type, info, NULL_RTX, rinsn); - return; + return VSETVL_VTYPE_CHANGE_ONLY; } if (info.has_avl_imm ()) { emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, info, NULL_RTX, rinsn); - return; + return VSETVL_DISCARD_RESULT; } if (info.has_avl_no_reg ()) @@ -716,14 +716,14 @@ insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn, { emit_vsetvl_insn (VSETVL_VTYPE_CHANGE_ONLY, emit_type, info, NULL_RTX, rinsn); - return; + return VSETVL_VTYPE_CHANGE_ONLY; } /* Otherwise use an AVL of 0 to avoid depending on previous vl. 
*/ vl_vtype_info new_info = info; new_info.set_avl_info (avl_info (const0_rtx, nullptr)); emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, new_info, NULL_RTX, rinsn); - return; + return VSETVL_DISCARD_RESULT; } /* Use X0 as the DestReg unless AVLReg is X0. We also need to change the @@ -735,7 +735,7 @@ insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn, rtx vl_op = info.get_avl_reg_rtx (); gcc_assert (!vlmax_avl_p (vl_op)); emit_vsetvl_insn (VSETVL_NORMAL, emit_type, info, vl_op, rinsn); - return; + return VSETVL_NORMAL; } emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, info, NULL_RTX, rinsn); @@ -745,6 +745,7 @@ insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn, fprintf (dump_file, "Update VL/VTYPE info, previous info="); prev_info.dump (dump_file); } + return VSETVL_DISCARD_RESULT; } /* If X contains any LABEL_REF's, add REG_LABEL_OPERAND notes for them @@ -2760,6 +2761,7 @@ pass_vsetvl::emit_local_forward_vsetvls (const bb_info *bb) for (insn_info *insn : bb->real_nondebug_insns ()) { const vector_insn_info prev_info = curr_info; + enum vsetvl_type type = NUM_VSETVL_TYPE; transfer_before (curr_info, insn); if (has_vtype_op (insn->rtl ())) @@ -2771,10 +2773,25 @@ pass_vsetvl::emit_local_forward_vsetvls (const bb_info *bb) = m_vector_manager->vector_insn_infos[insn->uid ()]; if (!require.compatible_p ( static_cast (prev_info))) - insert_vsetvl (EMIT_BEFORE, insn->rtl (), require, prev_info); + type = insert_vsetvl (EMIT_BEFORE, insn->rtl (), require, + prev_info); } } + /* Fix the issue of following sequence: +vsetivli zero, 5 + +vsetvli zero, zero +vmv.x.s (demand AVL = 8). + +incorrect: vsetvli zero, zero ===> Since the curr_info is AVL = 8. +correct: vsetivli zero, 8 +vadd (demand AVL = 8). */ + if (type == VSETVL_VTYPE_CHANGE_ONLY) + { + /* Update the curr_info to be real correct AVL. 
*/ + curr_info.set_avl_info (prev_info.get_avl_info ()); + } transfer_after (curr_info, insn); } diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h index 887ff1bdec8..237381f7026 100644 --- a/gcc/config/riscv/riscv-vsetvl.h +++ b/gcc/config/riscv/riscv-vsetvl.h @@ -31,6 +31,7 @@ enum vsetvl_type VSETVL_NORMAL, VSETVL_VTYPE_CHANGE_ONLY, VSETVL_DISCARD_RESULT, + NUM_VSETVL_TYPE }; enum emit_type diff --git a/gcc/config/riscv/vector.md b/gc
Re: [PATCH] rs6000: Don't ICE when compiling the __builtin_vec_xst_trunc built-in [PR109178]
Hi Peter,

on 2023/3/18 10:30, Peter Bergner wrote:
> On 3/17/23 7:17 PM, Peter Bergner wrote:
>> On 3/17/23 5:35 PM, Peter Bergner wrote:
>>> When we expand the __builtin_vec_xst_trunc built-in, we use the wrong mode
>>> for the MEM operand which causes an unrecognizable insn ICE.  The solution
>>> is to use the correct TMODE mode.
>>>
>>> Is this ok for trunk and gcc12 assuming my bootstraps and regtests show
>>> no regressions?
>>
>> The trunk bootstrap and regtests were clean.  I'm still waiting on the
>> backport testing to finish.
>
> ...and the gcc12 backported bootstrap and regtest were clean too.
>

Nice, OK for trunk and gcc12 branch, thanks!

BR,
Kewen
[PATCH] libstdc++: use new built-in trait __remove_pointer
libstdc++-v3/ChangeLog:

	* include/std/type_traits (remove_pointer): Use __remove_pointer
	built-in trait.

---
diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 2bd607a8b8f..cba98091aad 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2025,17 +2025,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

   template<typename _Tp, typename>
     struct __remove_pointer_helper
-    { typedef _Tp type; };
+    { using type = _Tp; };

   template<typename _Tp, typename _Up>
     struct __remove_pointer_helper<_Tp, _Up*>
-    { typedef _Up type; };
+    { using type = _Up; };

   /// remove_pointer
+#if __has_builtin(__remove_pointer)
+  template<typename _Tp>
+    struct remove_pointer
+    { using type = __remove_pointer(_Tp); };
+#else
   template<typename _Tp>
     struct remove_pointer
     : public __remove_pointer_helper<_Tp, __remove_cv_t<_Tp>>
     { };
+#endif

   template<typename _Tp, typename = void>
     struct __add_pointer_helper
Ping [PATCHv3, gfortran] Escalate failure when Hollerith constant to real conversion fails [PR103628]
Hi,
  Gently ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613497.html

Thanks
Gui Haochen

On 2023/3/7 16:55, HAO CHEN GUI wrote:
> Hi,
>   The patch escalates the failure when Hollerith constant to real conversion
> fails in native_interpret_expr. It finally reports a "Cannot simplify
> expression" error in do_simplify method.
>
>   The patch of pr95450 added a verification for decoding/encoding checking
> in native_interpret_expr. native_interpret_expr may fail on real type
> conversion and returns a NULL tree then. But upper layer calls don't handle
> the failure so that an ICE is reported when the verification fails.
>
>   IBM long double is an example. It doesn't have a unique memory presentation
> for some real values. So it may not pass the verification. The new test
> case shows the problem.
>
>   errorcount is used to check if an error is already reported or not when
> getting a bad expr. Buffered errors need to be excluded as they don't
> increase error count either.
>
>   The patch passed regression test on Power and x86 linux platforms.
>
> Gui Haochen
> Thanks
>
> ChangeLog
> 2023-03-07  Haochen Gui
>
> gcc/
> 	PR target/103628
> 	* fortran/target-memory.cc (gfc_interpret_float): Return FAIL when
> 	native_interpret_expr gets a NULL tree.
> 	* fortran/arith.cc (gfc_hollerith2real): Return NULL when
> 	gfc_interpret_float fails.
> 	* fortran/error.cc (gfc_buffered_p): Define.
> 	* fortran/gfortran.h (gfc_buffered_p): Declare.
> 	* fortran/intrinsic.cc: Add diagnostic.h to include list.
> 	(do_simplify): Save errorcount and check it at finish.  Report a
> 	"Cannot simplify expression" error on a bad result if error count
> 	doesn't change and no other errors buffered.
>
> gcc/testsuite/
> 	PR target/103628
> 	* gfortran.dg/pr103628.f90: New.
> > Co-Authored-By: Tobias Burnus > > patch.diff > diff --git a/gcc/fortran/arith.cc b/gcc/fortran/arith.cc > index c0d12cfad9d..d3d38c7eb6a 100644 > --- a/gcc/fortran/arith.cc > +++ b/gcc/fortran/arith.cc > @@ -2752,10 +2752,12 @@ gfc_hollerith2real (gfc_expr *src, int kind) >result = gfc_get_constant_expr (BT_REAL, kind, &src->where); > >hollerith2representation (result, src); > - gfc_interpret_float (kind, (unsigned char *) result->representation.string, > -result->representation.length, result->value.real); > - > - return result; > + if (gfc_interpret_float (kind, > +(unsigned char *) result->representation.string, > +result->representation.length, result->value.real)) > +return result; > + else > +return NULL; > } > > /* Convert character to real. The constant will be padded or truncated. */ > diff --git a/gcc/fortran/error.cc b/gcc/fortran/error.cc > index 214fb78ba7b..872d42e731e 100644 > --- a/gcc/fortran/error.cc > +++ b/gcc/fortran/error.cc > @@ -49,6 +49,13 @@ static gfc_error_buffer error_buffer; > static output_buffer *pp_error_buffer, *pp_warning_buffer; > static int warningcount_buffered, werrorcount_buffered; > > +/* Return buffered_p. */ > +bool > +gfc_buffered_p (void) > +{ > + return buffered_p; > +} > + > /* Return true if there output_buffer is empty. */ > > static bool > diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h > index 219ef8c7612..edfe11796a6 100644 > --- a/gcc/fortran/gfortran.h > +++ b/gcc/fortran/gfortran.h > @@ -3328,6 +3328,7 @@ void gfc_internal_error (const char *, ...) > ATTRIBUTE_NORETURN ATTRIBUTE_GCC_GFC > void gfc_clear_error (void); > bool gfc_error_check (void); > bool gfc_error_flag_test (void); > +bool gfc_buffered_p (void); > > notification gfc_notification_std (int); > bool gfc_notify_std (int, const char *, ...) 
ATTRIBUTE_GCC_GFC(2,3); > diff --git a/gcc/fortran/intrinsic.cc b/gcc/fortran/intrinsic.cc > index e89131f5a71..9d049001a51 100644 > --- a/gcc/fortran/intrinsic.cc > +++ b/gcc/fortran/intrinsic.cc > @@ -25,6 +25,7 @@ along with GCC; see the file COPYING3. If not see > #include "options.h" > #include "gfortran.h" > #include "intrinsic.h" > +#include "diagnostic.h" /* For errorcount. */ > > /* Namespace to hold the resolved symbols for intrinsic subroutines. */ > static gfc_namespace *gfc_intrinsic_namespace; > @@ -4620,6 +4621,7 @@ do_simplify (gfc_intrinsic_sym *specific, gfc_expr *e) > { >gfc_expr *result, *a1, *a2, *a3, *a4, *a5, *a6; >gfc_actual_arglist *arg; > + int old_errorcount = errorcount; > >/* Max and min require special handling due to the variable number > of args. */ > @@ -4708,7 +4710,12 @@ do_simplify (gfc_intrinsic_sym *specific, gfc_expr *e) > > finish: >if (result == &gfc_bad_expr) > -return false; > +{ > + if (errorcount == old_errorcount > + && (gfc_buffered_p () && !gfc_error_flag_test ())) > + gfc_error ("Cannot simplify expression at %L", &e->where); > + return false; > +} > >if (result ==
Re: Re: [PATCH] RISC-V: Fix bugs of ternary integer and floating-point ternary intrinsics.
The last patch. Kito is still keep testing with pressure tests. juzhe.zh...@rivai.ai From: Jeff Law Date: 2023-03-20 01:03 To: juzhe.zhong; gcc-patches CC: kito.cheng Subject: Re: [PATCH] RISC-V: Fix bugs of ternary integer and floating-point ternary intrinsics. On 3/15/23 00:37, juzhe.zh...@rivai.ai wrote: > From: Ju-Zhe Zhong > > Fix bugs of ternary intrinsic pattern: > > interger: > vnmsac.vv vd, vs1, vs2, vm# vd[i] = -(vs1[i] * vs2[i]) + vd[i] (minus > op3 (mult op1 op2)) > vnmsac.vx vd, rs1, vs2, vm# vd[i] = -(x[rs1] * vs2[i]) + vd[i] (minus > op3 (mult op1 op2)) > > floating-point: > # FP multiply-accumulate, overwrites addend > vfmacc.vv vd, vs1, vs2, vm# vd[i] = +(vs1[i] * vs2[i]) + vd[i] (plus > (mult (op1 op2)) op3) > vfmacc.vf vd, rs1, vs2, vm# vd[i] = +(f[rs1] * vs2[i]) + vd[i] (plus > (mult (op1 op2)) op3) > > > # FP negate-(multiply-accumulate), overwrites subtrahend > vfnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i] (minus > (neg (mult (op1 op2))) op3)) > vfnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i] (minus > (neg (mult (op1 op2)) op3)) > # FP multiply-subtract-accumulator, overwrites subtrahend > vfmsac.vv vd, vs1, vs2, vm# vd[i] = +(vs1[i] * vs2[i]) - vd[i] (minus > (mult (op1 op2)) op3) > vfmsac.vf vd, rs1, vs2, vm# vd[i] = +(f[rs1] * vs2[i]) - vd[i] (minus > (mult (op1 op2)) op3) > > # FP negate-(multiply-subtract-accumulator), overwrites minuend > vfnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i] (plus > (neg:(mult (op1 op2))) op3) > vfnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i] (plus > (neg:(mult (op1 op2))) op3) > > gcc/ChangeLog: > > * config/riscv/riscv-vector-builtins-bases.cc: Fix ternary bug. > * config/riscv/vector-iterators.md (nmsac): Ditto. > (nmsub): Ditto. > (msac): Ditto. > (msub): Ditto. > (nmadd): Ditto. > (nmacc): Ditto. > * config/riscv/vector.md (@pred_mul_): Ditto. > (@pred_mul_plus): Ditto. > (*pred_madd): Ditto. > (*pred_macc): Ditto. 
> (*pred_mul_plus): Ditto. > (@pred_mul_plus_scalar): Ditto. > (*pred_madd_scalar): Ditto. > (*pred_macc_scalar): Ditto. > (*pred_mul_plus_scalar): Ditto. > (*pred_madd_extended_scalar): Ditto. > (*pred_macc_extended_scalar): Ditto. > (*pred_mul_plus_extended_scalar): Ditto. > (@pred_minus_mul): Ditto. > (*pred_): Ditto. > (*pred_nmsub): Ditto. > (*pred_): Ditto. > (*pred_nmsac): Ditto. > (*pred_mul_): Ditto. > (*pred_minus_mul): Ditto. > (@pred_mul__scalar): Ditto. > (@pred_minus_mul_scalar): Ditto. > (*pred__scalar): Ditto. > (*pred_nmsub_scalar): Ditto. > (*pred__scalar): Ditto. > (*pred_nmsac_scalar): Ditto. > (*pred_mul__scalar): Ditto. > (*pred_minus_mul_scalar): Ditto. > (*pred__extended_scalar): Ditto. > (*pred_nmsub_extended_scalar): Ditto. > (*pred__extended_scalar): Ditto. > (*pred_nmsac_extended_scalar): Ditto. > (*pred_mul__extended_scalar): Ditto. > (*pred_minus_mul_extended_scalar): Ditto. > (*pred_): Ditto. > (*pred_): Ditto. > (*pred__scalar): Ditto. > (*pred__scalar): Ditto. > (@pred_neg_mul_): Ditto. > (@pred_mul_neg_): Ditto. > (*pred_): Ditto. > (*pred_): Ditto. > (*pred_): Ditto. > (*pred_): Ditto. > (*pred_neg_mul_): Ditto. > (*pred_mul_neg_): Ditto. > (@pred_neg_mul__scalar): Ditto. > (@pred_mul_neg__scalar): Ditto. > (*pred__scalar): Ditto. > (*pred__scalar): Ditto. > (*pred__scalar): Ditto. > (*pred__scalar): Ditto. > (*pred_neg_mul__scalar): Ditto. > (*pred_mul_neg__scalar): Ditto. > (@pred_widen_neg_mul_): Ditto. > (@pred_widen_mul_neg_): Ditto. > (@pred_widen_neg_mul__scalar): Ditto. > (@pred_widen_mul_neg__scalar): Ditto. It looks like you've got two patches that are almost 100% identical except for a few bits in vector.md. Which is the correct version? One is dated 3/14/23 00:30 the other 3/15/23: 04:07. jeff
Re: Re: [PATCH] RISC-V: Fine tune gather load RA constraint
It's ok to defer them to GCC-14. I will keep testing and fixing bugs during these 2 months. I won't add any more features or optimizations until GCC-14 is open. juzhe.zh...@rivai.ai From: Jeff Law Date: 2023-03-20 00:55 To: juzhe.zh...@rivai.ai; gcc-patches CC: kito.cheng Subject: Re: [PATCH] RISC-V: Fine tune gather load RA constraint On 3/15/23 00:52, juzhe.zh...@rivai.ai wrote: > Hi, Jeff. I really hope the current "refine tune RA constraint" patches > can be merged into GCC-13. > These patches are just making RA constraint to be consistent with RVV > ISA after I double checked RVV ISA. > These RA constraint changes are very safe. They may be very safe, but we're *way* past the point where we should be making this kind of change. When I agreed to not object to including the RVV builtins in gcc-13, I never imagined we'd still be making changes to that code in March. My bad for not getting clarification on how much work remained to be done. Jeff
[PATCH] Fortran: simplification of NEAREST for large argument [PR109186]
Dear all, I intend to commit the attached obvious patch within 24h unless there are comments. The issue is an off-by-one error in setting up the maximum exponent of the real kind that is passed to mpfr, so that model numbers between huge(x)/2 and huge(x), when given as an argument to NEAREST(arg,-1.0), are rounded down to huge(x)/2 during compile-time simplification. As no such issue is observed at run-time, the testcase compares the compile-time and run-time results for corner cases. Regtested on x86_64-pc-linux-gnu. As this is sort of a wrong-code issue, I intend to backport to all open branches. (The issue was apparently introduced in r0-84566-gb6f63e898498e6 without noticing, so it is technically a regression.) Thanks, Harald From 9391bd0eeef8e069d9e49f9aa277160b43aaf4f3 Mon Sep 17 00:00:00 2001 From: Harald Anlauf Date: Sun, 19 Mar 2023 21:29:46 +0100 Subject: [PATCH] Fortran: simplification of NEAREST for large argument [PR109186] gcc/fortran/ChangeLog: PR fortran/109186 * simplify.cc (gfc_simplify_nearest): Fix off-by-one error in setting up real kind-specific maximum exponent for mpfr. gcc/testsuite/ChangeLog: PR fortran/109186 * gfortran.dg/nearest_6.f90: New test. 
--- gcc/fortran/simplify.cc | 2 +- gcc/testsuite/gfortran.dg/nearest_6.f90 | 26 + 2 files changed, 27 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gfortran.dg/nearest_6.f90 diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc index 20ea38e0007..ecf0e3558df 100644 --- a/gcc/fortran/simplify.cc +++ b/gcc/fortran/simplify.cc @@ -6114,7 +6114,7 @@ gfc_simplify_nearest (gfc_expr *x, gfc_expr *s) kind = gfc_validate_kind (BT_REAL, x->ts.kind, 0); mpfr_set_emin ((mpfr_exp_t) gfc_real_kinds[kind].min_exponent - mpfr_get_prec(result->value.real) + 1); - mpfr_set_emax ((mpfr_exp_t) gfc_real_kinds[kind].max_exponent - 1); + mpfr_set_emax ((mpfr_exp_t) gfc_real_kinds[kind].max_exponent); mpfr_check_range (result->value.real, 0, MPFR_RNDU); if (mpfr_sgn (s->value.real) > 0) diff --git a/gcc/testsuite/gfortran.dg/nearest_6.f90 b/gcc/testsuite/gfortran.dg/nearest_6.f90 new file mode 100644 index 000..00d1ebe618c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/nearest_6.f90 @@ -0,0 +1,26 @@ +! { dg-do run } +! PR fortran/109186 - Verify that NEAREST produces same results at +! compile-time and run-time for corner cases +! Reported by John Harper + +program p + implicit none + integer, parameter :: sp = selected_real_kind (6) + integer, parameter :: dp = selected_real_kind (13) + real(sp), parameter :: x1 = huge (1._sp), t1 = tiny (1._sp) + real(dp), parameter :: x2 = huge (1._dp), t2 = tiny (1._dp) + real(sp), volatile :: y1, z1 + real(dp), volatile :: y2, z2 + y1 = x1 + z1 = nearest (y1, -1._sp) + if (nearest (x1, -1._sp) /= z1) stop 1 + y2 = x2 + z2 = nearest (y2, -1._dp) + if (nearest (x2, -1._dp) /= z2) stop 2 + y1 = t1 + z1 = nearest (y1, 1._sp) + if (nearest (t1, 1._sp) /= z1) stop 3 + y2 = t2 + z2 = nearest (y2, 1._dp) + if (nearest (t2, 1._dp) /= z2) stop 4 +end -- 2.35.3
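For readers who want to see the corner case outside of Fortran, here is a rough C++ analogue of what the new test checks at run time. It is only a sketch, not part of the patch. It assumes IEEE single precision and that MPFR's exponent convention matches std::frexp (mantissa in [0.5, 1)); under that assumption huge(x) carries the exponent max_exponent itself, so capping emax at max_exponent - 1 would leave huge(x)/2 as the largest value in range. The helper names huge_sp, nearest_down and binary_exponent are made up for illustration.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <limits>

// Largest finite float: the C++ counterpart of Fortran's huge(1._sp).
inline float huge_sp() { return std::numeric_limits<float>::max(); }

// Run-time analogue of NEAREST(x, -1.0): the next representable value
// in the direction of negative infinity.
inline float nearest_down(float x) {
    return std::nextafter(x, -std::numeric_limits<float>::infinity());
}

// Binary exponent e such that x = m * 2^e with 0.5 <= m < 1
// (the convention used by std::frexp).
inline int binary_exponent(float x) {
    int e = 0;
    std::frexp(x, &e);
    return e;
}
```

With these helpers, nearest_down(huge_sp()) lies strictly between huge_sp() / 2 and huge_sp(), and binary_exponent(huge_sp()) equals FLT_MAX_EXP rather than FLT_MAX_EXP - 1.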
[PATCH] c++: implement __remove_pointer built-in trait
This patch implements built-in trait for std::remove_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __remove_pointer. * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_POINTER. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __remove_pointer. * g++.dg/ext/remove_pointer.C: New test. --- diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def index bac593c0094..985b43e0d97 100644 --- a/gcc/cp/cp-trait.def +++ b/gcc/cp/cp-trait.def @@ -90,6 +90,7 @@ DEFTRAIT_EXPR (IS_DEDUCIBLE, "__is_deducible ", 2) DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1) DEFTRAIT_TYPE (REMOVE_REFERENCE, "__remove_reference", 1) DEFTRAIT_TYPE (REMOVE_CVREF, "__remove_cvref", 1) +DEFTRAIT_TYPE (REMOVE_POINTER, "__remove_pointer", 1) DEFTRAIT_TYPE (UNDERLYING_TYPE, "__underlying_type", 1) /* These traits yield a type pack, not a type, and are represented by diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 87c2e8a7111..92db1f670ac 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -12273,6 +12273,10 @@ finish_trait_type (cp_trait_kind kind, tree type1, tree type2) if (TYPE_REF_P (type1)) type1 = TREE_TYPE (type1); return cv_unqualified (type1); +case CPTK_REMOVE_POINTER: + if (TYPE_PTR_P (type1)) +type1 = TREE_TYPE (type1); + return type1; #define DEFTRAIT_EXPR(CODE, NAME, ARITY) \ case CPTK_##CODE: diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C index f343e153e56..e21e0a95509 100644 --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C @@ -146,3 +146,6 @@ #if !__has_builtin (__remove_cvref) # error "__has_builtin (__remove_cvref) failed" #endif +#if !__has_builtin (__remove_pointer) +# error "__has_builtin (__remove_pointer) failed" +#endif diff --git a/gcc/testsuite/g++.dg/ext/remove_pointer.C b/gcc/testsuite/g++.dg/ext/remove_pointer.C new file mode 100644 index 000..7b13db93950 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/remove_pointer.C @@ -0,0 +1,51 
@@ +// { dg-do compile { target c++11 } } + +#define SA(X) static_assert((X),#X) + +SA(__is_same(__remove_pointer(int), int)); +SA(__is_same(__remove_pointer(int*), int)); +SA(__is_same(__remove_pointer(int**), int*)); + +SA(__is_same(__remove_pointer(const int*), const int)); +SA(__is_same(__remove_pointer(const int**), const int*)); +SA(__is_same(__remove_pointer(int* const), int)); +SA(__is_same(__remove_pointer(int** const), int*)); +SA(__is_same(__remove_pointer(int* const* const), int* const)); + +SA(__is_same(__remove_pointer(volatile int*), volatile int)); +SA(__is_same(__remove_pointer(volatile int**), volatile int*)); +SA(__is_same(__remove_pointer(int* volatile), int)); +SA(__is_same(__remove_pointer(int** volatile), int*)); +SA(__is_same(__remove_pointer(int* volatile* volatile), int* volatile)); + +SA(__is_same(__remove_pointer(const volatile int*), const volatile int)); +SA(__is_same(__remove_pointer(const volatile int**), const volatile int*)); +SA(__is_same(__remove_pointer(const int* volatile), const int)); +SA(__is_same(__remove_pointer(volatile int* const), volatile int)); +SA(__is_same(__remove_pointer(int* const volatile), int)); +SA(__is_same(__remove_pointer(const int** volatile), const int*)); +SA(__is_same(__remove_pointer(volatile int** const), volatile int*)); +SA(__is_same(__remove_pointer(int** const volatile), int*)); +SA(__is_same(__remove_pointer(int* const* const volatile), int* const)); +SA(__is_same(__remove_pointer(int* volatile* const volatile), int* volatile)); +SA(__is_same(__remove_pointer(int* const volatile* const volatile), int* const volatile)); + +SA(__is_same(__remove_pointer(int&), int&)); +SA(__is_same(__remove_pointer(const int&), const int&)); +SA(__is_same(__remove_pointer(volatile int&), volatile int&)); +SA(__is_same(__remove_pointer(const volatile int&), const volatile int&)); + +SA(__is_same(__remove_pointer(int&&), int&&)); +SA(__is_same(__remove_pointer(const int&&), const int&&)); 
+SA(__is_same(__remove_pointer(volatile int&&), volatile int&&)); +SA(__is_same(__remove_pointer(const volatile int&&), const volatile int&&)); + +SA(__is_same(__remove_pointer(int[3]), int[3])); +SA(__is_same(__remove_pointer(const int[3]), const int[3])); +SA(__is_same(__remove_pointer(volatile int[3]), volatile int[3])); +SA(__is_same(__remove_pointer(const volatile int[3]), const volatile int[3])); + +SA(__is_same(__remove_pointer(int(int)), int(int))); +SA(__is_same(__remove_pointer(int(*const)(int)), int(int))); +SA(__is_same(__remove_pointer(int(*volatile)(int)), int(int))); +SA(__is_same(__remove_pointer(int(*const volatile)(int)), int(int)));
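As a cross-check that does not need a patched compiler, a few of the cases above can be restated with the standard library trait. This is a sketch only: it uses std::remove_pointer and std::is_same instead of the __remove_pointer and __is_same built-ins, and the aliases rp, same and strips_to_int are hypothetical helpers, not names from the patch.

```cpp
#include <cassert>
#include <type_traits>

// Shorthand for the standard trait; the built-in is expected to agree with it.
template <typename T>
using rp = typename std::remove_pointer<T>::type;

template <typename T, typename U>
constexpr bool same() { return std::is_same<T, U>::value; }

// True when removing one pointer level from T yields plain int.
template <typename T>
constexpr bool strips_to_int() { return same<rp<T>, int>(); }

// Exactly one pointer level is removed.
static_assert(same<rp<int>, int>(), "non-pointer is unchanged");
static_assert(same<rp<int*>, int>(), "one level removed");
static_assert(same<rp<int**>, int*>(), "only one level removed");

// cv-qualifiers on the pointer itself are dropped, those on the pointee stay.
static_assert(same<rp<int* const volatile>, int>(), "pointer cv dropped");
static_assert(same<rp<const int*>, const int>(), "pointee cv kept");

// References, arrays and function types are not pointers.
static_assert(same<rp<int&>, int&>(), "reference unchanged");
static_assert(same<rp<int[3]>, int[3]>(), "array unchanged");
static_assert(same<rp<int(int)>, int(int)>(), "function type unchanged");
static_assert(same<rp<int (*const)(int)>, int(int)>(), "function pointer");
```

The snippet compiles as C++11, the same language level the new test targets.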
Re: [PATCH] c++: implement __is_reference built-in trait
* cp-trait.def (names_builtin_p): Define __is_reference. This changelog should be the following. * cp-trait.def: Define __is_reference. I am sorry for the confusion. On Sat, Mar 18, 2023 at 9:07 PM Ken Matsui wrote: > > This patch implements built-in trait for std::is_reference. > > gcc/cp/ChangeLog: > > * cp-trait.def (names_builtin_p): Define __is_reference. > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_REFERENCE. > * semantics.cc (trait_expr_value): Likewise. > (finish_trait_expr): Likewise. > > gcc/testsuite/ChangeLog: > > * g++.dg/ext/has-builtin-1.C: Test existence of __is_reference. > * g++.dg/ext/is_reference.C: New test. > > --- > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc > index 273d15ab097..23e5bc24dbb 100644 > --- a/gcc/cp/constraint.cc > +++ b/gcc/cp/constraint.cc > @@ -3701,6 +3701,9 @@ diagnose_trait_expr (tree expr, tree args) > case CPTK_HAS_VIRTUAL_DESTRUCTOR: >inform (loc, " %qT does not have a virtual destructor", t1); >break; > +case CPTK_IS_REFERENCE: > + inform (loc, " %qT is not a reference", t1); > + break; > case CPTK_IS_ABSTRACT: >inform (loc, " %qT is not an abstract class", t1); >break; > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def > index bac593c0094..63a64152ce6 100644 > --- a/gcc/cp/cp-trait.def > +++ b/gcc/cp/cp-trait.def > @@ -67,6 +67,7 @@ DEFTRAIT_EXPR (IS_CONVERTIBLE, "__is_convertible", 2) > DEFTRAIT_EXPR (IS_EMPTY, "__is_empty", 1) > DEFTRAIT_EXPR (IS_ENUM, "__is_enum", 1) > DEFTRAIT_EXPR (IS_FINAL, "__is_final", 1) > +DEFTRAIT_EXPR (IS_REFERENCE, "__is_reference", 1) > DEFTRAIT_EXPR (IS_LAYOUT_COMPATIBLE, "__is_layout_compatible", 2) > DEFTRAIT_EXPR (IS_LITERAL_TYPE, "__is_literal_type", 1) > DEFTRAIT_EXPR (IS_NOTHROW_ASSIGNABLE, "__is_nothrow_assignable", 2) > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc > index 87c2e8a7111..dce98af4f72 100644 > --- a/gcc/cp/semantics.cc > +++ b/gcc/cp/semantics.cc > @@ -11995,6 +11995,9 @@ trait_expr_value (cp_trait_kind kind, tree > type1, tree 
type2) > case CPTK_IS_FINAL: >return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1); > > +case CPTK_IS_REFERENCE: > + return type_code1 == REFERENCE_TYPE; > + > case CPTK_IS_LAYOUT_COMPATIBLE: >return layout_compatible_type_p (type1, type2); > > @@ -12139,6 +12142,7 @@ finish_trait_expr (location_t loc, > cp_trait_kind kind, tree type1, tree type2) > case CPTK_HAS_TRIVIAL_COPY: > case CPTK_HAS_TRIVIAL_DESTRUCTOR: > case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS: > +case CPTK_IS_REFERENCE: >if (!check_trait_type (type1)) > return error_mark_node; >break; > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > index f343e153e56..b697673790c 100644 > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > @@ -146,3 +146,6 @@ > #if !__has_builtin (__remove_cvref) > # error "__has_builtin (__remove_cvref) failed" > #endif > +#if !__has_builtin (__is_reference) > +# error "__has_builtin (__is_reference) failed" > +#endif > diff --git a/gcc/testsuite/g++.dg/ext/is_reference.C > b/gcc/testsuite/g++.dg/ext/is_reference.C > new file mode 100644 > index 000..b4f048538e5 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/ext/is_reference.C > @@ -0,0 +1,26 @@ > +// { dg-do compile { target c++11 } } > + > +#define SA(X) static_assert((X),#X) > + > +SA(!__is_reference(void)); > +SA(!__is_reference(int*)); > + > +SA(__is_reference(int&)); > +SA(__is_reference(const int&)); > +SA(__is_reference(volatile int&)); > +SA(__is_reference(const volatile int&)); > + > +SA(__is_reference(int&&)); > +SA(__is_reference(const int&&)); > +SA(__is_reference(volatile int&&)); > +SA(__is_reference(const volatile int&&)); > + > +SA(!__is_reference(int[3])); > +SA(!__is_reference(const int[3])); > +SA(!__is_reference(volatile int[3])); > +SA(!__is_reference(const volatile int[3])); > + > +SA(!__is_reference(int(int))); > +SA(!__is_reference(int(*const)(int))); > +SA(!__is_reference(int(*volatile)(int))); > 
+SA(!__is_reference(int(*const volatile)(int)));
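The expectations in the new test can also be checked against the standard library trait on any C++11 compiler. The following is a sketch under that assumption: it uses std::is_reference rather than the __is_reference built-in, and is_ref is a hypothetical helper name.

```cpp
#include <cassert>
#include <type_traits>

// Wrapper around the standard trait; the built-in is expected to agree with it.
template <typename T>
constexpr bool is_ref() { return std::is_reference<T>::value; }

// Lvalue and rvalue references, with any cv-qualification on the referent.
static_assert(is_ref<int&>(), "lvalue reference");
static_assert(is_ref<const volatile int&>(), "cv lvalue reference");
static_assert(is_ref<int&&>(), "rvalue reference");
static_assert(is_ref<const volatile int&&>(), "cv rvalue reference");

// Everything else is not a reference.
static_assert(!is_ref<void>(), "void");
static_assert(!is_ref<int*>(), "pointer");
static_assert(!is_ref<int[3]>(), "array");
static_assert(!is_ref<int(int)>(), "function type");
static_assert(!is_ref<int (*const)(int)>(), "const function pointer");
```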
Re: [PATCH] Fortran: procedures with BIND(C) attribute require explicit interface [PR85877]
Hi Thomas, On 19.03.23 at 08:34, Thomas Koenig via Gcc-patches wrote: Hi Harald, On 18.03.23 at 19:52, Thomas Koenig via Gcc-patches wrote: Hi Harald, the Fortran standard requires an explicit procedure interface in certain situations, such as when they have a BIND(C) attribute (F2018:15.4.2.2). The attached patch adds a check for this. Regtested on x86_64-pc-linux-gnu. OK for mainline? While this fixes the ICE, it misses function f() bind(c) f = 42. end subroutine p bind(c) f ! { dg-error "must be explicit" } x = f() end what do you mean by "it misses"? Sorry, that was caused by confusion on my part (and it is better to test an assumption of what the compiler actually does :-) Patch is OK, also for backport. Maybe you can also include the test above, just to make sure. I've added your suggestion to the testcase. Pushed as: https://gcc.gnu.org/g:5426ab34643d9e6502f3ee572891a03471fa33ed Best regards Thomas Thanks, Harald
Re: [patch, fortran, doc] Explicitly mention undefined overflow
Hi Paul, Yes, that's fine for trunk. I wonder if it is worth being explicit that linear congruential pseudo-random number generators can and do fail at -O3? I don't think we should put this into the docs, because that can change at any time. Maybe into porting_to.html, though (where I have only mentioned this as a general issue with linear congruential generators, without mentioning specific options. Current text can be seen at https://gcc.gnu.org/gcc-13/porting_to.html ). Hm Best regards Thomas
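To make the linear congruential generator concern concrete, here is an analogy in C++, where signed integer overflow is likewise undefined. It is a sketch and not text proposed for the gfortran documentation. The function name lcg_next is hypothetical, and the multiplier and increment are the widely quoted 32-bit constants, chosen only for illustration. The step does its arithmetic in an unsigned 64-bit type and masks the result, so the wrap-around is well defined and does not depend on the optimization level.

```cpp
#include <cassert>
#include <cstdint>

// One step of a 32-bit linear congruential generator:
//   next = (a * state + c) mod 2^32
// The product is formed in 64-bit unsigned arithmetic and then reduced,
// so no signed overflow (undefined behavior) can occur.
inline std::uint32_t lcg_next(std::uint32_t state) {
    const std::uint64_t a = 1664525u;     // illustrative multiplier
    const std::uint64_t c = 1013904223u;  // illustrative increment
    const std::uint64_t wide = a * state + c;
    return static_cast<std::uint32_t>(wide & 0xFFFFFFFFu);
}
```

A generator written with a signed type that relies on the product wrapping around has no such guarantee, which is the situation the thread is worried about.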
Re: [PATCH] Testsuite: Disable micromips for MSA tests
On 3/13/23 23:46, Xin Liu wrote: Thanks for your feedback. You're right that MicroMIPS doesn't support MSA, so disabling micromips for MSA tests is a reasonable change. I'll make sure to include a ChangeLog entry with a clear description of future patches. Thanks for the suggestions, and I'll strive to improve my work based on your feedback. Thanks. I've pushed your patch to the trunk. jeff
[PATCH testsuite] rs6000: suboptimal code for returning bool value on target ppc.
Hello All: This patch adds a new test to check that unnecessary zero extensions are removed. Regtested on powerpc64-linux-gnu. Thanks & Regards Ajit rs6000: suboptimal code for returning bool value on target ppc. Test to check removal of redundant zero extension. 2023-03-19 Ajit Kumar Agarwal gcc/ChangeLog: * testsuite/g++.target/powerpc/zext-elim.C: New test. --- gcc/testsuite/g++.target/powerpc/zext-elim.C | 30 1 file changed, 30 insertions(+) create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C diff --git a/gcc/testsuite/g++.target/powerpc/zext-elim.C b/gcc/testsuite/g++.target/powerpc/zext-elim.C new file mode 100644 index 000..56eabbe0c19 --- /dev/null +++ b/gcc/testsuite/g++.target/powerpc/zext-elim.C @@ -0,0 +1,30 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-require-effective-target lp64 } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-options "-mcpu=power9 -O2 -free" } */ + +#include + +bool foo (int a, int b) +{ + if (a > 2) +return false; + + if (b < 10) +return true; + + return true; +} + +int bar (int a, int b) +{ + if (a > 2) +return 0; + + if (b < 10) +return 1; + + return 0; +} + +/* { dg-final { scan-assembler-not "rldicl" } } */ -- 2.31.1
Re: [PATCH] RISC-V: Fix bugs of ternary integer and floating-point ternary intrinsics.
On 3/15/23 00:37, juzhe.zh...@rivai.ai wrote: From: Ju-Zhe Zhong Fix bugs of ternary intrinsic pattern: integer: vnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i] (minus op3 (mult op1 op2)) vnmsac.vx vd, rs1, vs2, vm # vd[i] = -(x[rs1] * vs2[i]) + vd[i] (minus op3 (mult op1 op2)) floating-point: # FP multiply-accumulate, overwrites addend vfmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i] (plus (mult (op1 op2)) op3) vfmacc.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) + vd[i] (plus (mult (op1 op2)) op3) # FP negate-(multiply-accumulate), overwrites subtrahend vfnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i] (minus (neg (mult (op1 op2))) op3)) vfnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i] (minus (neg (mult (op1 op2)) op3)) # FP multiply-subtract-accumulator, overwrites subtrahend vfmsac.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) - vd[i] (minus (mult (op1 op2)) op3) vfmsac.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) - vd[i] (minus (mult (op1 op2)) op3) # FP negate-(multiply-subtract-accumulator), overwrites minuend vfnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i] (plus (neg:(mult (op1 op2))) op3) vfnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i] (plus (neg:(mult (op1 op2))) op3) gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Fix ternary bug. * config/riscv/vector-iterators.md (nmsac): Ditto. (nmsub): Ditto. (msac): Ditto. (msub): Ditto. (nmadd): Ditto. (nmacc): Ditto. * config/riscv/vector.md (@pred_mul_): Ditto. (@pred_mul_plus): Ditto. (*pred_madd): Ditto. (*pred_macc): Ditto. (*pred_mul_plus): Ditto. (@pred_mul_plus_scalar): Ditto. (*pred_madd_scalar): Ditto. (*pred_macc_scalar): Ditto. (*pred_mul_plus_scalar): Ditto. (*pred_madd_extended_scalar): Ditto. (*pred_macc_extended_scalar): Ditto. (*pred_mul_plus_extended_scalar): Ditto. (@pred_minus_mul): Ditto. (*pred_): Ditto. (*pred_nmsub): Ditto. (*pred_): Ditto. (*pred_nmsac): Ditto. 
(*pred_mul_): Ditto. (*pred_minus_mul): Ditto. (@pred_mul__scalar): Ditto. (@pred_minus_mul_scalar): Ditto. (*pred__scalar): Ditto. (*pred_nmsub_scalar): Ditto. (*pred__scalar): Ditto. (*pred_nmsac_scalar): Ditto. (*pred_mul__scalar): Ditto. (*pred_minus_mul_scalar): Ditto. (*pred__extended_scalar): Ditto. (*pred_nmsub_extended_scalar): Ditto. (*pred__extended_scalar): Ditto. (*pred_nmsac_extended_scalar): Ditto. (*pred_mul__extended_scalar): Ditto. (*pred_minus_mul_extended_scalar): Ditto. (*pred_): Ditto. (*pred_): Ditto. (*pred__scalar): Ditto. (*pred__scalar): Ditto. (@pred_neg_mul_): Ditto. (@pred_mul_neg_): Ditto. (*pred_): Ditto. (*pred_): Ditto. (*pred_): Ditto. (*pred_): Ditto. (*pred_neg_mul_): Ditto. (*pred_mul_neg_): Ditto. (@pred_neg_mul__scalar): Ditto. (@pred_mul_neg__scalar): Ditto. (*pred__scalar): Ditto. (*pred__scalar): Ditto. (*pred__scalar): Ditto. (*pred__scalar): Ditto. (*pred_neg_mul__scalar): Ditto. (*pred_mul_neg__scalar): Ditto. (@pred_widen_neg_mul_): Ditto. (@pred_widen_mul_neg_): Ditto. (@pred_widen_neg_mul__scalar): Ditto. (@pred_widen_mul_neg__scalar): Ditto. It looks like you've got two patches that are almost 100% identical except for a few bits in vector.md. Which is the correct version? One is dated 3/14/23 00:30 the other 3/15/23: 04:07. jeff
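The per-element formulas quoted in the commit message can be written down as a small scalar reference model, which may help when comparing them against the RTL patterns. This is a sketch based only on the comments above, not on the patch itself. The function names mirror the mnemonics but are hypothetical helpers; the floating-point forms use std::fma on the assumption that the instructions are fused (single rounding), and the integer form uses a wide type and does not model element-width wrap-around.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Integer: vd[i] = -(vs1[i] * vs2[i]) + vd[i]
inline std::int64_t vnmsac(std::int64_t vd, std::int64_t vs1, std::int64_t vs2) {
    return -(vs1 * vs2) + vd;
}

// FP multiply-accumulate: vd[i] = +(vs1[i] * vs2[i]) + vd[i]
inline double vfmacc(double vd, double vs1, double vs2) {
    return std::fma(vs1, vs2, vd);
}

// FP negate-(multiply-accumulate): vd[i] = -(vs1[i] * vs2[i]) - vd[i]
inline double vfnmacc(double vd, double vs1, double vs2) {
    return std::fma(-vs1, vs2, -vd);
}

// FP multiply-subtract-accumulator: vd[i] = +(vs1[i] * vs2[i]) - vd[i]
inline double vfmsac(double vd, double vs1, double vs2) {
    return std::fma(vs1, vs2, -vd);
}

// FP negate-(multiply-subtract-accumulator): vd[i] = -(vs1[i] * vs2[i]) + vd[i]
inline double vfnmsac(double vd, double vs1, double vs2) {
    return std::fma(-vs1, vs2, vd);
}
```

For example, with vd = 1, vs1 = 2 and vs2 = 3 the model gives 7 for vfmacc, -7 for vfnmacc, 5 for vfmsac and -5 for both vfnmsac and vnmsac.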
Re: [PATCH] vect: Verify that GET_MODE_NUNITS is greater than one.
On 3/14/23 15:52, Michael Collison wrote: While working on autovectorizing for the RISCV port I encountered an issue where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode), where GET_MODE_NUNITS is equal to one. Tested on RISCV and x86_64-linux-gnu. Okay? 2023-03-09 Michael Collison * tree-vect-slp.cc (can_duplicate_and_interleave_p): Check that GET_MODE_NUNITS is greater than one. As far as I know this doesn't fix a regression so I would defer to gcc-14. As release managers, Richi, Jakub or Joseph can gate it in as an exception. jeff
Re: [PATCH] RISC-V: Fine tune gather load RA constraint
On 3/15/23 00:52, juzhe.zh...@rivai.ai wrote: Hi, Jeff. I really hope the current "refine tune RA constraint" patches can be merged into GCC-13. These patches are just making RA constraint to be consistent with RVV ISA after I double checked RVV ISA. These RA constraint changes are very safe. They may be very safe, but we're *way* past the point where we should be making this kind of change. When I agreed to not object to including the RVV builtins in gcc-13, I never imagined we'd still be making changes to that code in March. My bad for not getting clarification on how much work remained to be done. Jeff
Re: [PATCH v2] PR target/89828 Internal compiler error on -fno-omit-frame-pointer
On 3/15/23 01:51, Yoshinori Sato wrote: What about this? It no longer occurs for me. gcc/config/rx/ * rx.cc (add_pop_cfi_notes): Release the frame pointer if it is used. (rx_expand_prologue): Redesigned stack pointer and frame pointer update process. So I think the ChangeLog entry needs a bit of work. I don't see how the ChangeLog entry for add_pop_cfi_notes relates to the changes at all. This might be better: * config/rx/rx.cc (add_pop_cfi_notes): Attach CFA_RESTORE notes first, then the CFA_ADJUST_CFA note. If restoring the frame pointer, use (fp + offset) for the CFA_ADJUST_CFA note. @@ -1815,37 +1819,17 @@ rx_expand_prologue (void) } } - /* If needed, set up the frame pointer. */ - if (frame_pointer_needed) -gen_safe_add (frame_pointer_rtx, stack_pointer_rtx, - GEN_INT (- (HOST_WIDE_INT) frame_size), true); - - /* Allocate space for the outgoing args. - If the stack frame has not already been set up then handle this as well. */ - if (stack_size) + if (stack_size || frame_size) { - if (frame_size) - { - if (frame_pointer_needed) - gen_safe_add (stack_pointer_rtx, frame_pointer_rtx, - GEN_INT (- (HOST_WIDE_INT) stack_size), true); - else - gen_safe_add (stack_pointer_rtx, stack_pointer_rtx, - GEN_INT (- (HOST_WIDE_INT) (frame_size + stack_size)), - true); - } - else - gen_safe_add (stack_pointer_rtx, stack_pointer_rtx, - GEN_INT (- (HOST_WIDE_INT) stack_size), true); + gen_safe_add (stack_pointer_rtx, stack_pointer_rtx, + GEN_INT (- (HOST_WIDE_INT) (stack_size + frame_size)), + true); } - else if (frame_size) + if (frame_pointer_needed) { - if (! frame_pointer_needed) - gen_safe_add (stack_pointer_rtx, stack_pointer_rtx, - GEN_INT (- (HOST_WIDE_INT) frame_size), true); - else - gen_safe_add (stack_pointer_rtx, frame_pointer_rtx, NULL_RTX, - false /* False because the epilogue will use the FP not the SP. 
*/); + gen_safe_add (frame_pointer_rtx, stack_pointer_rtx, + GEN_INT ((HOST_WIDE_INT) stack_size), + true); It looks like we're emitting: (set (sp) (plus (sp) (stack_size + frame_size))) Then we emit (set (fp) (plus (sp) (stack_size))) Unless I'm missing something important, that seems wrong and results in stack_size being added to FP twice. jeff
Re: [patch, fortran, doc] Explicitly mention undefined overflow
On Mar 19 2023, Thomas Koenig via Gcc-patches wrote: > diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi > index c483e13686d..93c66b18938 100644 > --- a/gcc/fortran/gfortran.texi > +++ b/gcc/fortran/gfortran.texi > @@ -820,6 +820,7 @@ might in some way or another become visible to the > programmer. > * File operations on symbolic links:: > * File format of unformatted sequential files:: > * Asynchronous I/O:: > +* Behavior on integer overflow::o s/o$// -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
Re: [patch, fortran, doc] Explicitly mention undefined overflow
Hi Thomas, Yes, that's fine for trunk. I wonder if it is worth being explicit that linear congruential pseudo-random number generators can and do fail at -O3? Thanks for the doc patches! Paul On Sun, 19 Mar 2023 at 08:32, Thomas Koenig via Fortran wrote: > Here's also an update on the docs to explicitly mention behavior > on overflow. > > Maybe this will reach another 0.05% of users... > > OK for trunk? > > Best regards > > Thomas > > gcc/fortran/ChangeLog: > > * gfortran.texi: Mention behavior on overflow. > -- "If you can't explain it simply, you don't understand it well enough" - Albert Einstein
New Swedish PO file for 'gcc' (version 13.1-b20230212)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Swedish team of translators. The file is available at: https://translationproject.org/latest/gcc/sv.po (This file, 'gcc-13.1-b20230212.sv.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: https://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: https://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
[Patch, fortran] PR87127 - External function not recognised from within an associate block
Hi All,

I committed this to 8-branch on 2019-04-24 but not to 9-branch. I have no record of why I did this.

The patch now requires an additional line,

    && sym->ns->proc_name->attr.proc != PROC_MODULE

to prevent the error message in pr88376.f90 from changing to the less helpful

    Error: Specification function ‘n’ at (1) must be PURE

I propose to commit to mainline and backport to 12-branch unless there are objections in the next 24 hours.

Cheers

Paul

Fortran: Recognise external function from within an associate block that has not been declared as external [PR87127]

2023-03-19  Paul Thomas

gcc/fortran
    PR fortran/87127
    * resolve.cc (check_host_association): If an external function
    is typed but not declared explicitly to be external, change the
    old symbol from a variable to an external function.

gcc/testsuite/
    PR fortran/87127
    * gfortran.dg/external_procedures_4.f90: New test.

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index ba603b4c407..a947f908ece 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -6079,11 +6079,14 @@ resolve_procedure:
 
 
 /* Checks to see that the correct symbol has been host associated.
-   The only situation where this arises is that in which a twice
-   contained function is parsed after the host association is made.
-   Therefore, on detecting this, change the symbol in the expression
-   and convert the array reference into an actual arglist if the old
-   symbol is a variable.  */
+   The only situations where this arises are:
+   (i)  That in which a twice contained function is parsed after
+        the host association is made. On detecting this, change
+        the symbol in the expression and convert the array reference
+        into an actual arglist if the old symbol is a variable; or
+   (ii) That in which an external function is typed but not declared
+        explicitly to be external. Here, the old symbol is changed
+        from a variable to an external function.  */
 static bool
 check_host_association (gfc_expr *e)
 {
@@ -6185,6 +6188,27 @@ check_host_association (gfc_expr *e)
           gfc_resolve_expr (e);
           sym->refs++;
         }
+      /* This case corresponds to a call, from a block or a contained
+         procedure, to an external function, which has not been declared
+         as being external in the main program but has been typed.  */
+      else if (sym && old_sym != sym
+               && !e->ref
+               && sym->ts.type == BT_UNKNOWN
+               && old_sym->ts.type != BT_UNKNOWN
+               && sym->attr.flavor == FL_PROCEDURE
+               && old_sym->attr.flavor == FL_VARIABLE
+               && sym->ns->parent == old_sym->ns
+               && sym->ns->proc_name
+               && sym->ns->proc_name->attr.proc != PROC_MODULE
+               && (sym->ns->proc_name->attr.flavor == FL_LABEL
+                   || sym->ns->proc_name->attr.flavor == FL_PROCEDURE))
+        {
+          old_sym->attr.flavor = FL_PROCEDURE;
+          old_sym->attr.external = 1;
+          old_sym->attr.function = 1;
+          old_sym->result = old_sym;
+          gfc_resolve_expr (e);
+        }
     }
   /* This might have changed!  */
   return e->expr_type == EXPR_FUNCTION;

! { dg-do run }
!
! Test the fix for PR87127 in which the references to exfunc cause
! the error "exfunc at (1) is not a function".
!
! Contributed by Gerhard Steinmetz
!
function exfunc(i)
  implicit none
  integer :: exfunc,i
  exfunc = 2*i
end function

! contents of test.f90
program test
  implicit none
  integer :: exfunc,i
  integer,parameter :: array(2)=[6,7]
  associate(i=>array(2))! Original bug
    if (exfunc(i) .ne. 2*i) stop 1
  end associate
  i = 99
  call foo
contains
  subroutine foo() ! Comment #3
    if (exfunc(i) .ne. 2*i) stop 2
  end subroutine foo
end program
[PATCH v2] rs6000: suboptimal code for returning bool value on target ppc
Hello All:

This patch eliminates an unnecessary zero extension in the ree pass.

Bootstrapped and regtested on powerpc64-linux-gnu.

Thanks & Regards
Ajit

rs6000: suboptimal code for returning bool value on target ppc.

Eliminate unnecessary redundant zero extension.

2023-03-19  Ajit Kumar Agarwal

gcc/ChangeLog:

    * ree.cc: Add support of AND opcode to eliminate unnecessary
    zero extension.
---
 gcc/ree.cc | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/ree.cc b/gcc/ree.cc
index 413aec7c8eb..d09f55149b1 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -319,7 +319,7 @@ combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
 {
   rtx orig_src = SET_SRC (*orig_set);
   machine_mode orig_mode = GET_MODE (SET_DEST (*orig_set));
-  rtx new_set;
+  rtx new_set = NULL_RTX;
   rtx cand_pat = single_set (cand->insn);
 
   /* If the extension's source/destination registers are not the same
@@ -370,7 +370,7 @@ combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
     {
       /* Only IF_THEN_ELSE of phi-type copies are combined.  Otherwise,
          in general, IF_THEN_ELSE should not be combined.  */
-      return false;
+      return true;
     }
   else
     {
@@ -713,12 +713,15 @@ merge_def_and_ext (ext_cand *cand, rtx_insn *def_insn, ext_state *state)
   if (sub_rtx == NULL)
     return false;
 
-  if (GET_MODE (SET_DEST (*sub_rtx)) == ext_src_mode
+  bool copy_needed
+    = (REGNO (SET_DEST (cand->expr)) != REGNO (XEXP (SET_SRC (cand->expr), 0)));
+
+  if (!copy_needed || (GET_MODE (SET_DEST (*sub_rtx)) == ext_src_mode
       || ((state->modified[INSN_UID (def_insn)].kind
-           == (cand->code == ZERO_EXTEND
+           == (cand->code == ZERO_EXTEND || cand->code == AND
                ? EXT_MODIFIED_ZEXT : EXT_MODIFIED_SEXT))
           && state->modified[INSN_UID (def_insn)].mode
-             == ext_src_mode))
+             == ext_src_mode)))
     {
       if (GET_MODE_UNIT_SIZE (GET_MODE (SET_DEST (*sub_rtx)))
           >= GET_MODE_UNIT_SIZE (cand->mode))
@@ -744,7 +747,8 @@ merge_def_and_ext (ext_cand *cand, rtx_insn *def_insn, ext_state *state)
 static inline rtx
 get_extended_src_reg (rtx src)
 {
-  while (GET_CODE (src) == SIGN_EXTEND || GET_CODE (src) == ZERO_EXTEND)
+  while (GET_CODE (src) == SIGN_EXTEND || GET_CODE (src) == ZERO_EXTEND
+         || GET_CODE (src) == AND)
     src = XEXP (src, 0);
   gcc_assert (REG_P (src));
   return src;
@@ -993,7 +997,7 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
   machine_mode mode;
 
   if (state->modified[INSN_UID (cand->insn)].kind
-      != (cand->code == ZERO_EXTEND
+      != (cand->code == ZERO_EXTEND || cand->code == AND
           ? EXT_MODIFIED_ZEXT : EXT_MODIFIED_SEXT)
       || state->modified[INSN_UID (cand->insn)].mode != cand->mode
       || (set == NULL_RTX))
@@ -1052,7 +1056,7 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
     {
       ext_modified *modified = &state->modified[INSN_UID (def_insn)];
       if (modified->kind == EXT_MODIFIED_NONE)
-        modified->kind = (cand->code == ZERO_EXTEND ? EXT_MODIFIED_ZEXT
+        modified->kind = (cand->code == ZERO_EXTEND || cand->code == AND ? EXT_MODIFIED_ZEXT
                           : EXT_MODIFIED_SEXT);
 
       if (copy_needed)
@@ -1106,7 +1110,7 @@ add_removable_extension (const_rtx expr, rtx_insn *insn,
   mode = GET_MODE (dest);
 
   if (REG_P (dest)
-      && (code == SIGN_EXTEND || code == ZERO_EXTEND)
+      && (code == SIGN_EXTEND || code == ZERO_EXTEND || code == AND)
       && REG_P (XEXP (src, 0)))
     {
       rtx reg = XEXP (src, 0);
-- 
2.31.1
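The idea of handling AND alongside ZERO_EXTEND can be illustrated outside the compiler. The following is only a standalone C sketch, not code from ree.cc: it shows that masking with the low 8 bits set gives the same value as zero-extending the low byte. The helper names zext8_cast and zext8_and are hypothetical, and the equivalence holds only for masks of the form 2^n - 1; whether the pass verifies the mask is not visible in the hunks quoted above.

```c
#include <stdint.h>

/* Zero-extend the low 8 bits of x by narrowing to uint8_t and widening again. */
uint32_t zext8_cast(uint32_t x)
{
    return (uint32_t)(uint8_t)x;
}

/* Same value computed with an AND against the mask 0xFF (2^8 - 1). */
uint32_t zext8_and(uint32_t x)
{
    return x & UINT32_C(0xFF);
}
```

For any 32-bit input both helpers return the same value, which is why an AND with such a mask can stand in for a zero extension of the low byte.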
Re: [patch, wwwdocs] Mention finalization
Hi Thomas,

Thanks for that! I think that your one-liner says it all :-)

There are three PRs left open that PR37336 depends on:

PR65347: Is partially fixed. The F2003/8 feature of finalization of a structure constructor within an array constructor doesn't work. I wonder if a compile option -finalize-constructors might not be better than -std=f2003/8?

PR84472: I need to investigate if it is fixed or not. It behaves like one of the other brands, which complains about a double free. The other brand does not have this problem. At one stage, I nulled pointer components before finalization of a function result but removed it because it is not required by the standard. It might well be a good idea, just on the grounds that smart-pointers and resource managers seem to be the main real-life use of finalization and pointer components loom large with them.

PR91316: An impure final call is allowed within a pure procedure at the moment. Malcolm Cohen convinced me that this should be disallowed.

If the finalization patch has survived a few weeks on mainline without causing problems, I am inclined to backport to 12-branch. Would that be acceptable to one and all?

Cheers

Paul

On Sun, 19 Mar 2023 at 08:15, Thomas Koenig via Fortran wrote:

> Hi,
>
> the sentence below seems a bit short for such a huge undertaking,
> but I could not think of anything else to say.
>
> Tested with "tidy -e".
>
> OK for wwwdocs?
>
> Best regards
>
> Thomas
>
>
> diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
> index c8d757b6..a4b71ffa 100644
> --- a/htdocs/gcc-13/changes.html
> +++ b/htdocs/gcc-13/changes.html
> @@ -373,7 +373,12 @@ a work-in-progress.
>
>
>
> -
> +Fortran
> +
> +
> +Finalization is now fully supported.
> +
> +
>
>
>

-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein
[patch, fortran, doc] Explicitly mention undefined overflow
Here's also an update on the docs to explicitly mention behavior
on overflow.

Maybe this will reach another 0.05% of users...

OK for trunk?

Best regards

    Thomas

gcc/fortran/ChangeLog:

    * gfortran.texi: Mention behavior on overflow.

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index c483e13686d..93c66b18938 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -820,6 +820,7 @@ might in some way or another become visible to the programmer.
 * File operations on symbolic links::
 * File format of unformatted sequential files::
 * Asynchronous I/O::
+* Behavior on integer overflow::o
 @end menu
 
 
@@ -1160,6 +1161,23 @@ sytems, such as Linux, it is necessary to specify @option{-pthread},
 @c Maybe this chapter should be merged with the 'Standards' section,
 @c whenever that is written :-)
 
+@node Behavior on integer overflow
+@section Behavior on integer overflow
+@cindex integer overflow
+@cindex overflow handling
+
+Integer overflow is prohibited by the Fortran standard.  The behavior
+of gfortran on integer overflow is undefined by default.  Traditional
+code, like linear congruential pseudo-random number generators in old
+programs that rely on specific, non-standard behavior may generate
+unexpected results.  The @option{-fsanitize=undefined} option can be
+used to detect such code at runtime.
+
+It is recommended to use the intrinsic subroutine @code{RANDOM_NUMBER}
+for random number generators or, if the old behavior is desired, to
+use the @option{-fwrapv} option.  Note that this option can impact
+performance.
+
 @node Extensions
 @chapter Extensions
 @cindex extensions
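To make the linear congruential case concrete, here is a minimal sketch in C rather than Fortran. The function name lcg_next and the multiplier and increment (a common textbook choice) are assumptions for illustration and do not come from the patch. The step is written with unsigned 32-bit arithmetic, where wraparound modulo 2^32 is defined. The same expression on a signed type would overflow, which is undefined in C much as the documentation text describes for gfortran, unless something like -fwrapv is in effect.

```c
#include <stdint.h>

/* One step of a 32-bit linear congruential generator.
   The constants are illustrative only.  Unsigned arithmetic wraps
   modulo 2^32, so the result is well defined for every input.  */
uint32_t lcg_next(uint32_t state)
{
    return (uint32_t)(UINT32_C(1664525) * state + UINT32_C(1013904223));
}
```

Old code that writes the equivalent update on a default (signed) integer relies on the wrap happening silently, which is the non-standard behavior the new section warns about.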
[patch, wwwdocs] Mention finalization
Hi,

the sentence below seems a bit short for such a huge undertaking,
but I could not think of anything else to say.

Tested with "tidy -e".

OK for wwwdocs?

Best regards

    Thomas

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index c8d757b6..a4b71ffa 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -373,7 +373,12 @@ a work-in-progress.
 
 
 
-
+Fortran
+
+
+Finalization is now fully supported.
+
+
Re: [PATCH] Fortran: procedures with BIND(C) attribute require explicit interface [PR85877]
Hi Harald,

On 18.03.23 at 19:52, Thomas Koenig via Gcc-patches wrote:

Hi Harald,

the Fortran standard requires an explicit procedure interface in certain situations, such as when they have a BIND(C) attribute (F2018:15.4.2.2). The attached patch adds a check for this.

Regtested on x86_64-pc-linux-gnu. OK for mainline?

While this fixes the ICE, it misses

function f() bind(c)
  f = 42.
end

subroutine p
  bind(c) f ! { dg-error "must be explicit" }
  x = f()
end

what do you mean by "it misses"?

Sorry, that was caused by confusion on my part (and it is better to test an assumption of what the compiler actually does :-)

Patch is OK, also for backport. Maybe you can also include the test above, just to make sure.

Best regards

    Thomas