[committed] wwwdocs: gcc-4.1/changes.html: Rework/reduce Classpath links
Adjust one of two links to classpath.org and avoid the other, by removing the respective paragraph which is really not relevant any longer. --- htdocs/gcc-4.1/changes.html | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/htdocs/gcc-4.1/changes.html b/htdocs/gcc-4.1/changes.html index 07c76dfe..5c2708aa 100644 --- a/htdocs/gcc-4.1/changes.html +++ b/htdocs/gcc-4.1/changes.html @@ -324,11 +324,6 @@ rewrites. All image drawing operations should now work correctly (flipping requires gtk+ = 2.6) - Future Graphics2D, image and text work is - documented at: http://developer.classpath.org/mediation/ClasspathGraphicsImagesText;> - http://developer.classpath.org/mediation/ClasspathGraphicsImagesText - - When gtk+ 2.6 or higher is installed the default log handler will produce stack traces whenever a WARNING, CRITICAL or ERROR message is produced. @@ -543,8 +538,8 @@ likely contain bugs). Documentation fixes all over the place. See - http://developer.classpath.org/doc/;> - http://developer.classpath.org/doc/ + https://developer.classpath.org/doc/;> + https://developer.classpath.org/doc/ -- 2.33.0
Re: [PATCH] Adjust testcase for O2 vectorization.
On Fri, Oct 15, 2021 at 3:11 PM Kewen.Lin via Gcc-patches wrote:
>
> on 2021/10/14 6:56 PM, Kewen.Lin via Gcc-patches wrote:
> > Hi Hongtao,
> >
> > on 2021/10/14 3:11 PM, liuhongt wrote:
> >> Hi Kewen:
> >>   Could you help to verify if this patch fixes those regressions
> >> for rs6000 port.
> >>
> >
> > The ppc64le run just finished, there are still some regressions:
> >
> > NA->XPASS: c-c++-common/Wstringop-overflow-2.c -Wc++-compat (test for warnings, line 194)
> > NA->XPASS: c-c++-common/Wstringop-overflow-2.c -Wc++-compat (test for warnings, line 212)
> > NA->XPASS: c-c++-common/Wstringop-overflow-2.c -Wc++-compat (test for warnings, line 296)
> > NA->XPASS: c-c++-common/Wstringop-overflow-2.c -Wc++-compat (test for warnings, line 314)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c (test for excess errors)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c (test for warnings, line 18)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c (test for warnings, line 29)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c (test for warnings, line 45)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c (test for warnings, line 55)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, line 104)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, line 137)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, line 19)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, line 39)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, line 56)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, line 70)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for excess errors)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for warnings, line 116)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for warnings, line 131)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for warnings, line 146)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for warnings, line 33)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for warnings, line 50)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for warnings, line 64)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for warnings, line 78)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for warnings, line 97)
> > PASS->FAIL: c-c++-common/Wstringop-overflow-2.c -std=gnu++14 (test for excess errors)
> > NA->FAIL: c-c++-common/Wstringop-overflow-2.c -std=gnu++14 (test for warnings, line 229)
> > NA->FAIL: c-c++-common/Wstringop-overflow-2.c -std=gnu++14 (test for warnings, line 230)
> > NA->FAIL: c-c++-common/Wstringop-overflow-2.c -std=gnu++14 (test for warnings, line 331)
> > NA->FAIL: c-c++-common/Wstringop-overflow-2.c -std=gnu++14 (test for warnings, line 332)
> > // omitting -std=gnu++17, -std=gnu++2a, -std=gnu++98
> >
> > I'll have a look and get back to you tomorrow.
> >
>
> The failure of c-c++-common/Wstringop-overflow-2.c is because the
> current proc check_vect_slp_vnqihi_store_usage is cached, but its
> result can vary for different input patterns. For rs6000 the test
> for v2qi fails, and the cached result makes the v4qi check fail
> unexpectedly (it should pass). I moved the caching to the users,
> check_effective_target_vect_slp_v*_store, and also refactored a bit.
> One trivial change is to add a new macro argument so that we compile
> only the corresponding foo* function instead of all of them; I hope
> it helps to keep the debugging output compact.
>
> For the failure of Wstringop-overflow-76-novec.c, there is one typo
> compared to the original Wstringop-overflow-76.c. Guess it failed
> on x86 too? It would be surprising if it passes on x86.
> As to the failure of Wstringop-overflow-21-novec.c, I confirmed it's
> just noise; patching typos caused this failure.

Thanks for the explanation of those failures and the typo, I'll adjust the patch.

>
> One new round of ppc64le testing just finished with the diff below, and
> all previous regressions are fixed without any new regressions.
>
>
> diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-76-novec.c b/gcc/testsuite/gcc.dg/Wstringop-overflow-76-novec.c
> index d000b587a65..1132348c5f4 100644
> --- a/gcc/testsuite/gcc.dg/Wstringop-overflow-76-novec.c
> +++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-76-novec.c
> @@ -82,7 +82,7 @@ void max_d8_p (char *q, int i)
>  struct A3_5
>  {
>    char a3[3];  // { dg-message "at offset 3 into destination object 'a3' of size 3" "pr??" { xfail *-*-* } }
> -  char a5[5];
> +  char a5[5];  // { dg-message "at offset 5 into destination object 'a5' of size 5" "note" }
>  };
>
>  void max_A3_A5 (int i, struct A3_5 *pa3_5)
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 530c5769614..8736b908ec7 100644
> ---
Re: [PATCH] Adjust testcase for O2 vectorization.
On Fri, Oct 15, 2021 at 11:37 PM Martin Sebor wrote:
>
> On 10/14/21 1:11 AM, liuhongt wrote:
> > Hi Kewen:
> >   Could you help to verify if this patch fixes those regressions
> > for rs6000 port.
> >
> > As discussed in [1], this patch add xfail/target selector to those
> > testcases, also make a copy of them so that they can be tested w/o
> > vectorization.
>
> Just to make sure I understand what's happening with the tests:
> the new -N-novec.c tests consist of just the cases xfailed due
> to vectorization in the corresponding -N.c tests? Or are there

Wstringop-overflow-2-novec.c is the same as Wstringop-overflow-2.c before the O2 vectorization adjustment. Do you want me to reduce them to only contain the cases for the new xfail/target?

> some other differences (e.g., new cases in them, etc.)? I'd
> hope to eventually remove the -novec.c tests once all warnings
> behave as expected with vectorization as without it (maybe
> keeping just one case both ways as a sanity check).
>
> For the target-supports selectors, I confess I don't know enough
> about vectorization to find their names quite intuitive enough
> to know when to use each. For instance, for vect_slp_v4qi_store:

It's for 4-byte char stores with the address being 4-byte aligned, i.e.

>
> +# Return the true if target support vectorization of v4qi store.
> +proc check_effective_target_vect_slp_v4qi_store { } {
> +set pattern {add new stmt: MEM }
> +return [expr { [check_vect_slp_vnqihi_store_usage $pattern ] != 0 }]
> +}
>
> When should this selector be used? In cases involving 4-byte
> char stores? Only naturally aligned 4-byte stores (i.e., on
> a 4 byte boundary, as the check_vect_slp_vnqihi_store_usage
> suggests?) Or 4-byte stores of any types (e.g., four chars
> as well as two 16-bit shorts), etc.?
>
> Hopefully once all the warnings handle vectorization we won't
> need to use them, but until then it would be good to document
> this in more detail in the .exp file.
> > Finally, thank you for adding comments to the xfailed tests > referencing the corresponding bugs! Can you please mention > the PR in the comment in each of the new xfails? Like so: > > index 7d29b5f48c7..cb687c69324 100644 > --- a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c > +++ b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c > @@ -189,8 +189,9 @@ void ga1__ (void) > > struct A1 a = { 1 }; > a.a[0] = 0; > + // O2 vectorization regress Wstringop-overflow case (1), refer to > pr102462. > a.a[1] = 1;// { dg-warning > "\\\[-Wstringop-overflow" } > - a.a[2] = 2;// { dg-warning > "\\\[-Wstringop-overflow" "" { xfail { i?86-*-* x86_64-*-* } } } > + a.a[2] = 2;// { dg-warning > "\\\[-Wstringop-overflow" "pr102462" { xfail { vect_slp_v2qi_store } } } > > PR in dg-warning comment. > > This should make it easier to deal with the XFAILs once > the warnings have improved to handle vectorization. Will do. > > Martin -- BR, Hongtao
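To make the selector question above concrete, the kind of code it is about can be sketched as follows. This is a hypothetical illustration and is not taken from the testsuite: `struct A4`, `store_four` and `check_four` are made-up names. It shows four adjacent char stores into a 4-byte-aligned object, which an SLP vectorizer may merge into a single 4-byte (v4qi) store on targets that support it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical example: a 4-byte char array forced onto a 4-byte boundary.  */
struct A4
{
  _Alignas (4) char a[4];
};

/* Four adjacent 1-byte stores.  With SLP vectorization a compiler may
   combine them into one 4-byte vector store; the observable result is
   the same either way.  */
void
store_four (struct A4 *p)
{
  p->a[0] = 0;
  p->a[1] = 1;
  p->a[2] = 2;
  p->a[3] = 3;
}

/* Return 1 if the four bytes hold the values written by store_four.  */
int
check_four (const struct A4 *p)
{
  static const char expected[4] = { 0, 1, 2, 3 };
  return memcmp (p->a, expected, sizeof expected) == 0;
}

/* Small self-check tying the two functions together.  */
void
store_four_demo (void)
{
  struct A4 x;
  store_four (&x);
  assert (check_four (&x));
}
```

Whether the merge actually happens depends on the target and the optimization level, which is what the effective-target check is meant to probe.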
Re: [RFC] Don't move cold code out of loop by checking bb count
On 2021/10/15 16:11, Richard Biener wrote: > On Sat, Oct 9, 2021 at 5:45 AM Xionghu Luo wrote: >> >> Hi, >> >> On 2021/9/28 20:09, Richard Biener wrote: >>> On Fri, Sep 24, 2021 at 8:29 AM Xionghu Luo wrote: Update the patch to v3, not sure whether you prefer the paste style and continue to link the previous thread as Segher dislikes this... [PATCH v3] Don't move cold code out of loop by checking bb count Changes: 1. Handle max_loop in determine_max_movement instead of outermost_invariant_loop. 2. Remove unnecessary changes. 3. Add for_all_locs_in_loop (loop, ref, ref_in_loop_hot_body) in can_sm_ref_p. 4. "gsi_next ();" in move_computations_worker is kept since it caused infinite loop when implementing v1 and the iteration is missed to be updated actually. v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576488.html v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579086.html There was a patch trying to avoid move cold block out of loop: https://gcc.gnu.org/pipermail/gcc/2014-November/215551.html Richard suggested to "never hoist anything from a bb with lower execution frequency to a bb with higher one in LIM invariantness_dom_walker before_dom_children". In gimple LIM analysis, add find_coldest_out_loop to move invariants to expected target loop, if profile count of the loop bb is colder than target loop preheader, it won't be hoisted out of loop. Likely for store motion, if all locations of the REF in loop is cold, don't do store motion of it. SPEC2017 performance evaluation shows 1% performance improvement for intrate GEOMEAN and no obvious regression for others. Especially, 500.perlbench_r +7.52% (Perf shows function S_regtry of perlbench is largely improved.), and 548.exchange2_r+1.98%, 526.blender_r +1.00% on P8LE. gcc/ChangeLog: * loop-invariant.c (find_invariants_bb): Check profile count before motion. (find_invariants_body): Add argument. * tree-ssa-loop-im.c (find_coldest_out_loop): New function. 
(determine_max_movement): Use find_coldest_out_loop. (move_computations_worker): Adjust and fix iteration udpate. (execute_sm_exit): Check pointer validness. (class ref_in_loop_hot_body): New functor. (ref_in_loop_hot_body::operator): New. (can_sm_ref_p): Use for_all_locs_in_loop. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/recip-3.c: Adjust. * gcc.dg/tree-ssa/ssa-lim-18.c: New test. * gcc.dg/tree-ssa/ssa-lim-19.c: New test. * gcc.dg/tree-ssa/ssa-lim-20.c: New test. --- gcc/loop-invariant.c | 10 ++-- gcc/tree-ssa-loop-im.c | 61 -- gcc/testsuite/gcc.dg/tree-ssa/recip-3.c| 2 +- gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c | 20 +++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c | 27 ++ gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-20.c | 25 + gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c | 28 ++ 7 files changed, 165 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-20.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index fca0c2b24be..5c3be7bf0eb 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1183,9 +1183,14 @@ find_invariants_insn (rtx_insn *insn, bool always_reached, bool always_executed) call. 
*/ static void -find_invariants_bb (basic_block bb, bool always_reached, bool always_executed) +find_invariants_bb (class loop *loop, basic_block bb, bool always_reached, + bool always_executed) { rtx_insn *insn; + basic_block preheader = loop_preheader_edge (loop)->src; + + if (preheader->count > bb->count) +return; FOR_BB_INSNS (bb, insn) { @@ -1214,8 +1219,7 @@ find_invariants_body (class loop *loop, basic_block *body, unsigned i; for (i = 0; i < loop->num_nodes; i++) -find_invariants_bb (body[i], - bitmap_bit_p (always_reached, i), +find_invariants_bb (loop, body[i], bitmap_bit_p (always_reached, i), bitmap_bit_p (always_executed, i)); } diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c index 4b187c2cdaf..655fab03442 100644 --- a/gcc/tree-ssa-loop-im.c +++ b/gcc/tree-ssa-loop-im.c
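The count comparison added to find_invariants_bb above can be modelled in isolation. The following is only a sketch under simplifying assumptions: `block_profile` and `worth_hoisting` are hypothetical names, and a plain integer stands in for GCC's richer profile count type. It returns true exactly when the patch's early `return` would not trigger.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a basic block's profile data.  */
typedef struct
{
  uint64_t count;   /* estimated number of executions */
} block_profile;

/* Return 1 if BB is at least as hot as PREHEADER, so moving code from
   BB to the preheader is not expected to make it execute more often.
   This is the negation of the "preheader->count > bb->count" test.  */
int
worth_hoisting (const block_profile *preheader, const block_profile *bb)
{
  return preheader->count <= bb->count;
}

/* Small self-check: a hot loop body qualifies, a cold branch does not.  */
void
worth_hoisting_demo (void)
{
  block_profile preheader = { 100 };
  block_profile hot_body = { 1000 };
  block_profile cold_branch = { 1 };

  assert (worth_hoisting (&preheader, &hot_body));
  assert (!worth_hoisting (&preheader, &cold_branch));
}
```

In this model an invariant computed only on a rarely taken path inside the loop stays where it is, because hoisting it would make it run on every entry to the loop.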
[PATCH] tree-object-size: Avoid unnecessary processing of __builtin_object_size
This is a minor cleanup to bail out early if the result of __builtin_object_size is not assigned to anything and avoid initializing the object size arrays. gcc/ChangeLog: * tree-object-size (object_sizes_execute): Consolidate LHS null check and do it early. Signed-off-by: Siddhesh Poyarekar --- gcc/tree-object-size.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/gcc/tree-object-size.c b/gcc/tree-object-size.c index 6a4dc724f34..46a976dfe10 100644 --- a/gcc/tree-object-size.c +++ b/gcc/tree-object-size.c @@ -1298,6 +1298,10 @@ object_sizes_execute (function *fun, bool insert_min_max_p) if (!gimple_call_builtin_p (call, BUILT_IN_OBJECT_SIZE)) continue; + tree lhs = gimple_call_lhs (call); + if (!lhs) + continue; + init_object_sizes (); /* If insert_min_max_p, only attempt to fold @@ -1312,11 +1316,9 @@ object_sizes_execute (function *fun, bool insert_min_max_p) { unsigned HOST_WIDE_INT object_size_type = tree_to_uhwi (ost); tree ptr = gimple_call_arg (call, 0); - tree lhs = gimple_call_lhs (call); if ((object_size_type == 1 || object_size_type == 3) && (TREE_CODE (ptr) == ADDR_EXPR - || TREE_CODE (ptr) == SSA_NAME) - && lhs) + || TREE_CODE (ptr) == SSA_NAME)) { tree type = TREE_TYPE (lhs); unsigned HOST_WIDE_INT bytes; @@ -1339,10 +1341,6 @@ object_sizes_execute (function *fun, bool insert_min_max_p) continue; } - tree lhs = gimple_call_lhs (call); - if (!lhs) - continue; - result = gimple_fold_stmt_to_constant (call, do_valueize); if (!result) { -- 2.31.1
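For context, a minimal sketch of the builtin this patch touches, assuming a compiler that provides `__builtin_object_size` (GCC and Clang do). The names `struct pair`, `whole_object_size`, `subobject_size` and `discarded_result` are made up for illustration. The last function shows a call whose result is unused, which is the no-LHS case the patch skips early.

```c
#include <assert.h>
#include <stddef.h>

struct pair
{
  char a[4];
  char b[8];
};

static struct pair g;

/* Type 0: upper bound on the bytes from g.a to the end of the whole
   enclosing object.  */
size_t
whole_object_size (void)
{
  return __builtin_object_size (g.a, 0);
}

/* Type 1: upper bound on the bytes from g.a to the end of the closest
   surrounding subobject, here the array g.a itself.  */
size_t
subobject_size (void)
{
  return __builtin_object_size (g.a, 1);
}

/* The result is discarded, so the call has no left-hand side.  */
void
discarded_result (void)
{
  (void) __builtin_object_size (g.a, 0);
}

/* Small self-check: the subobject bound never exceeds the whole-object
   bound, whether or not the compiler can determine the sizes.  */
void
object_size_demo (void)
{
  assert (subobject_size () <= whole_object_size ());
}
```

With GCC the two functions are expected to evaluate to `sizeof (struct pair)` and `sizeof g.a` respectively, but the self-check deliberately relies only on the ordering, since a compiler that cannot determine a size returns `(size_t) -1` for both types.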
Re: [RFC PATCH 0/8] RISC-V: Bit-manipulation extension.
Hi Vineet:

I am not familiar with buildroot, so I am not sure which GCC version will work, but I think the patch set should apply to both gcc 11.1 and trunk without conflict.

Here is gcc 11.1 plus this patch set on my GitHub, hope this helps :)

https://github.com/kito-cheng/riscv-gcc/tree/riscv-gcc-11.1.0-zbabcs

On Thu, Oct 14, 2021 at 4:22 AM Vineet Gupta wrote:
>
> Hi Kito,
>
> On 9/23/21 12:57 AM, Kito Cheng wrote:
> > Bit manipulation extension[1] is finishing the public review and waiting for
> > the rest of the ratification process, I believe that will become a ratified
> > extension soon, so I think it's time to submit to upstream for review now :)
> >
> > As the title included RFC, it's not a rush to merge to trunk yet, I would
> > like to merge that until it is officially ratified.
> >
> > This patch set is the implementation of bit-manipulation extension, which
> > includes zba, zbb, zbc and zbs extension, but only included in instruction/md
> > pattern only, no intrinsic function implementation.
> >
> > Most work is done by Jim Wilson and many other contributors
> > on https://github.com/riscv-collab/riscv-gcc.
> >
> >
> > [1] https://github.com/riscv/riscv-bitmanip/releases/tag/1.0.0
>
> I wanted to give these a try. Is it reasonable to apply these to a gcc
> 11.1 baseline and give a spin in buildroot or do these absolutely have
> to be bleeding edge gcc.
>
> Thx,
> -Vineet
Re: [PATCH] AVX512FP16: Add *_set1_pch intrinsics.
On Fri, Oct 15, 2021 at 4:38 PM dianhong.xu--- via Gcc-patches wrote: > > From: dianhong xu > > Add *_set1_pch (_Float16 _Complex A) intrinsics. > > gcc/ChangeLog: > > * config/i386/avx512fp16intrin.h: > (_mm512_set1_pch): New intrinsic. > * config/i386/avx512fp16vlintrin.h: > (_mm256_set1_pch): New intrinsic. > (_mm_set1_pch): Ditto. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/avx512fp16-set1-pch-1a.c: New test. > * gcc.target/i386/avx512fp16-set1-pch-1b.c: New test. > * gcc.target/i386/avx512fp16vl-set1-pch-1a.c: New test. > * gcc.target/i386/avx512fp16vl-set1-pch-1b.c: New test. LGTM. > --- > gcc/config/i386/avx512fp16intrin.h| 13 + > gcc/config/i386/avx512fp16vlintrin.h | 26 + > .../gcc.target/i386/avx512fp16-set1-pch-1a.c | 13 + > .../gcc.target/i386/avx512fp16-set1-pch-1b.c | 42 ++ > .../i386/avx512fp16vl-set1-pch-1a.c | 20 +++ > .../i386/avx512fp16vl-set1-pch-1b.c | 57 +++ > 6 files changed, 171 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c > create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c > create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c > create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1b.c > > diff --git a/gcc/config/i386/avx512fp16intrin.h > b/gcc/config/i386/avx512fp16intrin.h > index 079ce321c01..17025d68b8e 100644 > --- a/gcc/config/i386/avx512fp16intrin.h > +++ b/gcc/config/i386/avx512fp16intrin.h > @@ -7237,6 +7237,19 @@ _mm512_permutexvar_ph (__m512i __A, __m512h __B) > (__mmask32)-1); > } > > +extern __inline __m512h > +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > +_mm512_set1_pch (_Float16 _Complex __A) > +{ > + union > + { > +_Float16 _Complex a; > +float b; > + } u = { .a = __A}; > + > + return (__m512h) _mm512_set1_ps (u.b); > +} > + > #ifdef __DISABLE_AVX512FP16__ > #undef __DISABLE_AVX512FP16__ > #pragma GCC pop_options > diff --git a/gcc/config/i386/avx512fp16vlintrin.h > 
b/gcc/config/i386/avx512fp16vlintrin.h > index f83a429ba43..1de4513d7f1 100644 > --- a/gcc/config/i386/avx512fp16vlintrin.h > +++ b/gcc/config/i386/avx512fp16vlintrin.h > @@ -3315,6 +3315,32 @@ _mm_permutexvar_ph (__m128i __A, __m128h __B) > (__mmask8)-1); > } > > +extern __inline __m256h > +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > +_mm256_set1_pch (_Float16 _Complex __A) > +{ > + union > + { > +_Float16 _Complex a; > +float b; > + } u = { .a = __A }; > + > + return (__m256h) _mm256_set1_ps (u.b); > +} > + > +extern __inline __m128h > +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) > +_mm_set1_pch (_Float16 _Complex __A) > +{ > + union > + { > +_Float16 _Complex a; > +float b; > + } u = { .a = __A }; > + > + return (__m128h) _mm_set1_ps (u.b); > +} > + > #ifdef __DISABLE_AVX512FP16VL__ > #undef __DISABLE_AVX512FP16VL__ > #pragma GCC pop_options > diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c > b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c > new file mode 100644 > index 000..0055193f243 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile} */ > +/* { dg-options "-O2 -mavx512fp16" } */ > + > +#include > + > +__m512h > +__attribute__ ((noinline, noclone)) > +test_mm512_set1_pch (_Float16 _Complex A) > +{ > + return _mm512_set1_pch(A); > +} > + > +/* { dg-final { scan-assembler "vbroadcastss\[ \\t\]+\[^\n\r\]*%zmm\[01\]" } > } */ > diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c > b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c > new file mode 100644 > index 000..450d7e37237 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c > @@ -0,0 +1,42 @@ > +/* { dg-do run { target avx512fp16 } } */ > +/* { dg-options "-O2 -mavx512fp16" } */ > + > +#include > +#include > +#include > + > +static void do_test (void); > + > +#define DO_TEST do_test > +#define AVX512FP16 > + > 
+#include > +#include "avx512-check.h" > + > +static void > +do_test (void) > +{ > + _Float16 _Complex fc = 1.0 + 1.0*I; > + union > + { > +_Float16 _Complex a; > +float b; > + } u = { .a = fc }; > + float ff= u.b; > + > + typedef union > + { > +float fp[16]; > +__m512h m512h; > + } u1; > + > + __m512h test512 = _mm512_set1_pch(fc); > + > + u1 test; > + test.m512h = test512; > + for (int i = 0; i<16; i++) > + { > +if (test.fp[i] != ff) abort(); > + } > + > +} > diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c > b/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c > new file mode 100644 > index
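The union trick used by the intrinsics above (view a pair of 16-bit halves as one 32-bit value, then broadcast that value) can be illustrated without `_Float16` or AVX-512 support. This is only a portable sketch under stated assumptions: `half_pair`, `broadcast_pair` and `lanes_hold_pair` are hypothetical names, plain `uint16_t` bit patterns stand in for the real and imaginary halves, and a loop stands in for `_mm512_set1_ps`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a half-precision complex value: two 16-bit parts.  */
typedef struct
{
  uint16_t re;
  uint16_t im;
} half_pair;

_Static_assert (sizeof (half_pair) == sizeof (uint32_t),
                "the pair must fill exactly one 32-bit lane");

/* Reinterpret the pair as one 32-bit lane through a union, then copy
   that lane into all 16 positions, as a 32-bit broadcast would.  */
void
broadcast_pair (half_pair v, uint32_t lanes[16])
{
  union
  {
    half_pair a;
    uint32_t b;
  } u = { .a = v };

  for (int i = 0; i < 16; i++)
    lanes[i] = u.b;
}

/* Return 1 if every lane decodes back to the original pair.  */
int
lanes_hold_pair (const uint32_t lanes[16], half_pair v)
{
  for (int i = 0; i < 16; i++)
    {
      half_pair got;
      memcpy (&got, &lanes[i], sizeof got);
      if (got.re != v.re || got.im != v.im)
        return 0;
    }
  return 1;
}

/* Small self-check using arbitrary bit patterns.  */
void
broadcast_pair_demo (void)
{
  half_pair v = { 0x1234, 0xabcd };
  uint32_t lanes[16];

  broadcast_pair (v, lanes);
  assert (lanes_hold_pair (lanes, v));
}
```

Because the pair is read back with the same layout it was written with, the check does not depend on byte order.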
Re: [PATCH] Convert strlen pass from evrp to ranger.
On 10/8/2021 9:12 AM, Aldy Hernandez via Gcc-patches wrote:
> The following patch converts the strlen pass from evrp to ranger,
> leaving DOM as the last remaining user.

So is there any reason why we can't convert DOM as well? DOM's use of EVRP is pretty limited. You've mentioned FP bits before, but my recollection is those are not part of the EVRP analysis DOM uses. Hell, give me a little guidance and I'll do the work...

> No additional cleanups have been done. For example, the strlen pass
> still has uses of VR_ANTI_RANGE, and the sprintf still passes around
> pairs of integers instead of using a proper range. Fixing this
> could further improve these passes.
>
> As a further enhancement, if the relevant maintainers deem useful,
> the domwalk could be removed from strlen. That is, unless the pass
> needs it for something else.

The dom walk was strictly for the benefit of EVRP when it was added. So I think it can get zapped once the pass is converted.

Jeff
Re: [PATCH] Convert strlen pass from evrp to ranger.
On 10/15/2021 4:39 AM, Aldy Hernandez wrote:
> On 10/15/21 2:47 AM, Andrew MacLeod wrote:
> > On 10/14/21 6:07 PM, Martin Sebor via Gcc-patches wrote:
> > > On 10/9/21 12:47 PM, Aldy Hernandez via Gcc-patches wrote:
> > > > We seem to be passing a lot of context around in the strlen code.
> > > > I certainly don't want to contribute to more.
> > > >
> > > > Most of the handle_* functions are passing the gsi as well as either
> > > > ptr_qry or rvals. That looks a bit messy. May I suggest putting all
> > > > of that in the strlen pass object (well, the dom walker object, but we
> > > > can rename it to be less dom centric)?
> > > >
> > > > Something like the attached (untested) patch could be the basis for
> > > > further cleanups.
> > > >
> > > > Jakub, would this line of work interest you?
> > >
> > > You didn't ask me but since no one spoke up against it let me add
> > > some encouragement: this is exactly what I was envisioning and in
> > > line with other such modernization we have been doing elsewhere.
> > > Could you please submit it for review?
> > >
> > > Martin
> >
> > I'm willing to bet he didn't submit it for review because he doesn't
> > have time this release to polish and track it... (I think the
> > threader has been quite consuming). Rather, it was offered as a
> > starting point for someone else who might be interested in continuing
> > to pursue this work... *everyone* is interested in cleanup work
> > others do :-)
>
> Exactly. There's a lot of work that could be done in this area, and I'm
> trying to avoid the situation with the threaders where what started as
> refactoring ended up with me basically owning them ;-).

I wouldn't go that far ;-) I'm still here, just focused on other stuff.

> That being said, I think there are enough cleanups that are useful on
> their own. I've removed all the passing around of GSIs, as well as
> ptr_qry, with the exception of anything dealing with the sprintf pass,
> since it has a slightly different interface.

You know, it's funny. The 0001 patch looks a lot like what I ended up doing here and there when I start cleaning things up. Pull state into a class, make functions which need the state member functions, repeat until it works.

> This is patch 0001, which I'm formally submitting for inclusion. No
> functional changes with this patch. OK for trunk?

I'll ACK this now :-)

> Also, I am PINGing patch 0002, which is the strlen pass conversion to
> the ranger. As mentioned, this is just a change from an evrp client to
> a ranger client. The APIs are exactly the same, and besides, the evrp
> analyzer is deprecated and slated for removal. OK for trunk?

I'll defer on this a bit. I've got to step away and may not be back online tonight. I worry about the unintended testsuite fallout here more than anything. Which argues it should go into the tester to see if there is any such fallout :-)

jeff
Re: [PATCH] d-demangle: properly skip anonymous symbols
On 10/5/2021 11:53 AM, Luís Ferreira wrote: On Tue, 2021-10-05 at 18:13 +0100, Luís Ferreira wrote: This patch fixes a bug in the D demangler by parsing and skipping anonymous symbols correctly, according to the ABI specification. Furthermore, it also includes tests to cover anonymous symbols. The spec specifies [1] that a symbol name can be anonymous and multiple anonymous symbols are allowed. [1]: https://dlang.org/spec/abi.html#SymbolName ChangeLog: libiberty/ * d-demangle.c (dlang_parse_qualified): Handle anonymous symbols correctly. * testsuite/d-demangle-expected: New tests to cover anonymous symbols. Thanks. I fixed a whitespace nit and installed this patch. Jeff
Re: [PING^3] Generalize 'gcc/input.h:struct location_hash' (was: [Committed] [PATCH 2/4] (v4) On-demand locations within string-literals)
On 9/30/2021 12:47 AM, Thomas Schwinge wrote: Hi! On 2021-09-17T13:16:14+0200, I wrote: On 2021-09-10T09:48:56+0200, I wrote: Ping. My patches again attached, for easy reference. Ping once again. Jeff had ACKed "Don't record string concatenation data for 'RESERVED_LOCATION_P'" (thanks!), but "Generalize 'gcc/input.h:struct location_hash'" is still awaiting review: On 2021-09-03T18:33:37+0200, I wrote: On 2021-09-02T21:09:54+0200, I wrote: On 2021-09-02T15:59:14+0200, I wrote: On 2016-08-05T14:16:58-0400, David Malcolm wrote: Committed to trunk as r239175; I'm attaching the final version of the patch for reference. David, you've added here 'gcc/input.h:struct location_hash' (see quoted below), which will be useful elsewhere, so: --- a/gcc/input.h +++ b/gcc/input.h +struct location_hash : int_hash { }; + +class GTY(()) string_concat_db +{ +[...] + hash_map *m_table; +}; OK to push the attached "Generalize 'gcc/input.h:struct location_hash'"? Attached again, for easy reference. Grüße Thomas - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 0002-Generalize-gcc-input.h-struct-location_hash.patch From 349a3172f64db93ee98ea39b36489b702b6596ab Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Tue, 31 Aug 2021 23:30:25 +0200 Subject: [PATCH 2/2] Generalize 'gcc/input.h:struct location_hash' This is currently only used here ('gcc/input.h:class string_concat_db'), but is actually generally useful, so advertize it as such. Per the rationale given, we may use 'BUILTINS_LOCATION' as spare value for 'Deleted', in addition to the existing use of 'UNKNOWN_LOCATION' as spare value for 'Empty'. gcc/ * input.h (location_hash): Use 'BUILTINS_LOCATION' as spare value for 'Deleted'. Turn into a '#define'. OK jeff
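The idea of reserving two key values as "Empty" and "Deleted" markers can be sketched with a tiny open-addressing set. This is not GCC's hash_map or int_hash implementation; `loc_set` and its functions are hypothetical, and the values 0 and 1 are assumptions standing in for UNKNOWN_LOCATION and BUILTINS_LOCATION. The point it shows is that a removed entry must leave a "Deleted" marker behind so that later probes still find keys stored past it.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_EMPTY   0u   /* stands in for UNKNOWN_LOCATION (assumption) */
#define SLOT_DELETED 1u   /* stands in for BUILTINS_LOCATION (assumption) */
#define NSLOTS       8u   /* kept tiny for the sketch */

/* A zero-initialized set has every slot marked empty.  */
typedef struct
{
  uint32_t slot[NSLOTS];
} loc_set;

/* Keys equal to a marker value cannot be stored.  */
static int
is_reserved (uint32_t key)
{
  return key == SLOT_EMPTY || key == SLOT_DELETED;
}

/* Return 1 if KEY is present.  Probing stops at an empty slot but
   continues past deleted ones.  */
int
loc_set_contains (const loc_set *s, uint32_t key)
{
  if (is_reserved (key))
    return 0;
  for (uint32_t i = 0; i < NSLOTS; i++)
    {
      uint32_t v = s->slot[(key + i) % NSLOTS];
      if (v == key)
        return 1;
      if (v == SLOT_EMPTY)
        return 0;
    }
  return 0;
}

/* Insert KEY.  Return 1 on success (or if already present), 0 if KEY
   is a reserved value or the table is full.  */
int
loc_set_insert (loc_set *s, uint32_t key)
{
  if (is_reserved (key))
    return 0;
  if (loc_set_contains (s, key))
    return 1;
  for (uint32_t i = 0; i < NSLOTS; i++)
    {
      uint32_t *p = &s->slot[(key + i) % NSLOTS];
      if (*p == SLOT_EMPTY || *p == SLOT_DELETED)
        {
          *p = key;
          return 1;
        }
    }
  return 0;
}

/* Remove KEY if present, leaving a deleted marker in its slot.  */
void
loc_set_remove (loc_set *s, uint32_t key)
{
  if (is_reserved (key))
    return;
  for (uint32_t i = 0; i < NSLOTS; i++)
    {
      uint32_t *p = &s->slot[(key + i) % NSLOTS];
      if (*p == key)
        {
          *p = SLOT_DELETED;
          return;
        }
      if (*p == SLOT_EMPTY)
        return;
    }
}

/* Small self-check: 10 and 18 collide; removing 10 must not hide 18.  */
void
loc_set_demo (void)
{
  loc_set s = { { 0 } };

  assert (loc_set_insert (&s, 10));
  assert (loc_set_insert (&s, 18));
  loc_set_remove (&s, 10);
  assert (loc_set_contains (&s, 18));
  assert (!loc_set_contains (&s, 10));
}
```

The cost of this design is that the two marker values can never be used as keys, which is why the choice of spare values matters for a generally reusable hash traits type.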
Re: [PATCH] Try placing RTL folded constants in constant pool
On 10/3/2021 8:26 AM, Roger Sayle wrote: My recent attempts to come up with a testcase for my patch to evaluate ss_plus in simplify-rtx.c, identified a missed optimization opportunity (that's potentially a long-time regression): The RTL optimizers no longer place constants in the constant pool. The motivating x86_64 example is the simple program: typedef char v8qi __attribute__ ((vector_size (8))); v8qi foo() { v8qi tx = { 1, 0, 0, 0, 0, 0, 0, 0 }; v8qi ty = { 2, 0, 0, 0, 0, 0, 0, 0 }; v8qi t = __builtin_ia32_paddsb(tx, ty); return t; } which (with my previous patch) currently results in: foo:movq.LC0(%rip), %xmm0 movq.LC1(%rip), %xmm1 paddsb %xmm1, %xmm0 ret even though the RTL contains the result in a REG_EQUAL note: (insn 7 6 12 2 (set (reg:V8QI 83) (ss_plus:V8QI (reg:V8QI 84) (reg:V8QI 85))) "ssaddqi3.c":7:12 1419 {*mmx_ssaddv8qi3} (expr_list:REG_DEAD (reg:V8QI 85) (expr_list:REG_DEAD (reg:V8QI 84) (expr_list:REG_EQUAL (const_vector:V8QI [ (const_int 3 [0x3]) (const_int 0 [0]) repeated x7 ]) (nil) Together with the patch below, GCC will now generate the much more sensible: foo:movq.LC2(%rip), %xmm0 ret My first approach was to look in cse.c (where the REG_EQUAL note gets added) and notice that the constant pool handling functionality has been unreachable for a while. A quick search for constant_pool_entries_cost shows that it's initialized to zero, but never set to a non-zero value, meaning that force_const_mem is never called. This functionality used to work way back in 2003, but has been lost over time: https://gcc.gnu.org/pipermail/gcc-patches/2003-October/116435.html The changes to cse.c below restore this functionality (placing suitable constants in the constant pool) with two significant refinements; (i) it only attempts to do this if the function already uses a constant pool (thanks to the availability of crtl->uses_constant_pool since 2003). (ii) it allows different constants (i.e. 
modes) to have different costs, so that floating point "doubles" and 64-bit, 128-bit, 256-bit and 512-bit vectors don't all have to share the same cost. Back in 2003, the assumption was that everything in a constant pool had the same cost, hence the global variable constant_pool_entries_cost.

Although this is a useful CSE fix, it turns out that it doesn't cure my motivating problem above. CSE only considers a single instruction, so it determines that it's cheaper to perform the ss_plus (COSTS_N_INSNS(1)) than to read the result from the constant pool (COSTS_N_INSNS(2)). It's only when the other reads from the constant pool are also eliminated that this transformation is a win. Hence a better place to perform this transformation is in combine, where after failing to "recog" the load of a suitable constant, it can retry after calling force_const_mem. This achieves the desired transformation and allows the backend insn_cost call-back to control whether or not using the constant pool is preferable.

Alas, it's rare to change code generation without affecting something in GCC's testsuite. On x86_64-pc-linux-gnu there were two families of new failures (and I'd predict similar benign fallout on other platforms). One failure was gcc.target/i386/387-12.c (aka PR target/26915), where the test is missing an explicit -m32 flag. On i686, it's very reasonable to materialize -1.0 using "fld1; fchs", but on x86_64-pc-linux-gnu we currently generate the awkward:

testm1: fld1
        fchs
        fstpl   -8(%rsp)
        movsd   -8(%rsp), %xmm0
        ret

which combine now very reasonably simplifies to just:

testm1: movsd   .LC3(%rip), %xmm0
        ret

The other class of x86_64-pc-linux-gnu failure was from materialization of vector constants using vpbroadcast (e.g. gcc.target/i386/pr90773-17.c) where the decision is finely balanced; the load of an integer register with an immediate constant, followed by a vpbroadcast is deemed to be COSTS_N_INSNS(2), whereas a load from the constant pool is also reported as COSTS_N_INSNS(2). My solution is to tweak the i386.c's rtx_costs so that all other things being equal, an instruction (sequence) that accesses memory is fractionally more expensive than one that doesn't.

Hopefully, this all makes sense. If someone could benchmark this for me that would be much appreciated. This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline?

2021-10-03  Roger Sayle

gcc/ChangeLog
	* combine.c (recog_for_combine): For an unrecognized move/set of a constant, try force_const_mem to place it in the constant pool.
	* cse.c (constant_pool_entries_cost, constant_pool_entries_regcost): Delete global variables (that are no longer assigned a cost value).
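To see where the folded constant in the motivating example comes from, the per-lane arithmetic of a signed saturating add can be written out in scalar C. This is an illustrative sketch, not code from the patch: `ss_plus_qi` and `ss_plus_v8qi` are made-up names modelling what ss_plus means for eight signed 8-bit lanes.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturating 8-bit addition: the sum is clamped to the range
   [INT8_MIN, INT8_MAX] instead of wrapping around.  */
int8_t
ss_plus_qi (int8_t a, int8_t b)
{
  int sum = (int) a + (int) b;

  if (sum > INT8_MAX)
    return INT8_MAX;
  if (sum < INT8_MIN)
    return INT8_MIN;
  return (int8_t) sum;
}

/* Apply the saturating add lane by lane to two 8-element vectors.  */
void
ss_plus_v8qi (const int8_t x[8], const int8_t y[8], int8_t out[8])
{
  for (int i = 0; i < 8; i++)
    out[i] = ss_plus_qi (x[i], y[i]);
}

/* Small self-check using the operands from the motivating example.  */
void
ss_plus_demo (void)
{
  const int8_t tx[8] = { 1, 0, 0, 0, 0, 0, 0, 0 };
  const int8_t ty[8] = { 2, 0, 0, 0, 0, 0, 0, 0 };
  int8_t t[8];

  ss_plus_v8qi (tx, ty, t);
  assert (t[0] == 3);
  for (int i = 1; i < 8; i++)
    assert (t[i] == 0);
}
```

With both operands known at compile time the whole result is a constant, which is why a single load of that constant can replace two loads plus the add.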
Re: [PATCH] libiberty: d-demangle: use appendc for single chars append
On 9/29/2021 9:32 AM, Luís Ferreira wrote:
> This may be optimized by some modern smart compilers inliner, but since strlen can be an external source, this can produce unoptimized code.

strlen has very well defined semantics by ISO and even if it's defined externally compilers know those semantics and can optimize appropriately. In fact, if you build a testcase, compile it with a modern compiler, you should see the call to strlen optimized away & the call to memcpy turned into a simple store.

So I just don't see the value in adding more code here when we can just let the optimizer do its job and get the same result. I won't object if Iain wants to go forward with this patch, but I'm not going forward with it independently.

jeff
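A testcase along the lines Jeff describes might look like the sketch below. The struct buf type and the append_str/append_char helpers are hypothetical stand-ins for the demangler's string routines, not the libiberty code. Both paths produce the same bytes; inspecting the output of "gcc -O2 -S" for a call such as append_str (b, "x") should show whether the strlen/memcpy pair was reduced to a single byte store on a given compiler.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size buffer standing in for the demangler's
   growable string.  No bounds checking: callers must leave room.  */
struct buf {
    char data[16];
    size_t len;
};

/* General append: length computed with strlen, bytes copied with
   memcpy.  With a one-character literal argument an optimizing
   compiler can fold strlen to 1 and memcpy to a byte store.  */
static void append_str(struct buf *b, const char *s)
{
    size_t n = strlen(s);
    memcpy(b->data + b->len, s, n);
    b->len += n;
}

/* Single-character append, written out by hand.  */
static void append_char(struct buf *b, char c)
{
    b->data[b->len++] = c;
}
```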
Re: [PATCH 0/8] __builtin_dynamic_object_size and more
On 10/7/2021 10:50 PM, Siddhesh Poyarekar wrote:
> On 10/8/21 03:44, Siddhesh Poyarekar wrote:
> > (from about 4% to 70% in bash), but that could well be due to the _chk
>
> I should also clarify that this is for memcpy. For all fortifiable functions, the coverage percentage went from 30.81% to 84.5% for bash. Below is the full table. Please note that this is only based on symbols emitted in the end as I didn't want to rebuild the _FORTIFY_SOURCE=2 binaries, so it does not take into account the fact that _chk could get folded to regular calls if we know at compile time that it's safe to do so.
>
> No more posting patches at 4am; it only leads to more clarification follow-ups :/

FWIW, that 30% number is roughly in line with the data we saw from a Red Hat partner a year or so ago. Bringing that up to 80%+ would be a notable win, even if folks have to explicitly opt in, as I expect some projects would without hesitation.

I'd really like it if Jakub could take the lead on this. While I'm a big proponent of the work, Jakub knows the relevant code far better than I do, and it'll affect the Red Hat team far more than it'll affect me these days :-)

Jeff
Re: [PATCH] bfin: Popcount-related improvements to machine description.
On 10/17/2021 7:08 AM, Roger Sayle wrote:
> Blackfin processors support a ONES instruction that implements a 32-bit popcount returning a 16-bit result. This instruction was previously described by GCC's bfin backend using an UNSPEC, but with this patch uses a POPCOUNT:SI rtx to capture the semantics, allowing it to be evaluated at compile-time. I've decided to keep the instruction name the same (avoiding any changes to the __builtin_bfin_ones machinery), but have provided popcountsi2 and popcounthi2 expanders so that the middle-end can use this instruction to implement __builtin_popcount (and __builtin_parity).
>
> The new testcase ones.c
>
> short foo ()
> {
>   int t = 5;
>   short r = __builtin_bfin_ones(t);
>   return r;
> }
>
> previously generated:
>
> _foo:   nop;
>         nop;
>         R0 = 5 (X);
>         R0.L = ONES R0;
>         rts;
>
> with this patch, now generates:
>
> _foo:   nop;
>         nop;
>         nop;
>         R0 = 2 (X);
>         rts;
>
> The new testcase popcount.c
>
> int foo(int x) { return __builtin_popcount(x); }
>
> previously generated:
>
> _foo:   [--SP] = RETS;
>         SP += -12;
>         call ___popcountsi2;
>         SP += 12;
>         RETS = [SP++];
>         rts;
>
> now generates:
>
> _foo:   nop;
>         nop;
>         R0.L = ONES R0;
>         R0 = R0.L (Z);
>         rts;
>
> And the new testcase parity.c
>
> int foo(int x) { return __builtin_parity(x); }
>
> previously generated:
>
> _foo:   [--SP] = RETS;
>         SP += -12;
>         call ___paritysi2;
>         SP += 12;
>         RETS = [SP++];
>         rts;
>
> now generates:
>
> _foo:   nop;
>         R1 = 1 (X);
>         R0.L = ONES R0;
>         R0 = R1 & R0;
>         rts;
>
> This patch has been tested on a cross-compiler to bfin-elf hosted on x86_64-pc-linux-gnu, but without a toolchain, and shows no regressions in the compile-only parts of the testsuite. Ok for mainline?
>
> 2021-10-17  Roger Sayle
>
> gcc/ChangeLog
> 	* config/bfin/bfin.md (define_constants): Remove UNSPEC_ONES.
> 	(define_insn "ones"): Replace UNSPEC_ONES with a truncate of a popcount, allowing compile-time evaluation/simplification.
> 	(popcountsi2, popcounthi2): New expanders using a "ones" insn.
>
> gcc/testsuite/ChangeLog
> 	* gcc.target/bfin/ones.c: New test case.
> 	* gcc.target/bfin/parity.c: New test case.
> 	* gcc.target/bfin/popcount.c: New test case.

OK

jeff
Re: [PATCH] Constant fold SS_NEG and SS_ABS in simplify-rtx.c
On 10/17/2021 3:12 AM, Roger Sayle wrote:
> This simple patch performs compile-time constant folding of signed saturating negation and signed saturating absolute value in the RTL optimizers. Normally in two's complement arithmetic the lowest representable signed value overflows on negation; with these saturating operators it "saturates" to the maximum representable signed value, so SS_NEG:QI -128 is 127, and SS_ABS:HI -32768 is 32767.
>
> On bfin-elf, the following two short functions:
>
> short foo()
> {
>   short t = -32768;
>   short r = __builtin_bfin_negate_fr1x16(t);
>   return r;
> }
>
> int bar()
> {
>   int t = -2147483648;
>   int r = __builtin_bfin_abs_fr1x32(t);
>   return r;
> }
>
> currently compile to:
>
> _foo:   nop;
>         nop;
>         R0 = -32768 (X);
>         R0 = -R0 (V);
>         rts;
>
> _bar:   nop;
>         R0 = -1 (X);
>         R0 <<= 31;
>         R0 = abs R0;
>         rts;
>
> but with this middle-end patch now compile to:
>
> _foo:   nop;
>         nop;
>         nop;
>         R0 = 32767 (X);
>         rts;
>
> _bar:   nop;
>         nop;
>         R0 = -1 (X);
>         R0.H = 32767;
>         rts;
>
> This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline?
>
> 2021-10-17  Roger Sayle
>
> gcc/ChangeLog
> 	* simplify-rtx.c (simplify_const_unary_operation) [SS_NEG, SS_ABS]: Evaluate SS_NEG and SS_ABS of a constant argument.
>
> gcc/testsuite/ChangeLog
> 	* gcc.target/bfin/ssabs.c: New test case.
> 	* gcc.target/bfin/ssneg.c: New test case.

OK.

Jeff
Re: [Patch] Fortran: Fix CLASS conversion check [PR102745]
Hi Tobias, This is OK for mainline and as far back in the branches as you feel inclined to go. Thanks for the patch. Paul On Fri, 15 Oct 2021 at 22:19, Tobias Burnus wrote: > This patch fixes two issues: > > First, to print 'CLASS(t2)' instead of: > Error: Type mismatch in argument ‘x’ at (1); passed > CLASS(__class_MAIN___T2_a) to TYPE(t) > > Additionally, > >class(t2) = class(t) ! 't2' extends 't' >class(t2) = class(any) > > was wrongly accepted. > > OK? > > Tobias > - > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, > 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: > Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; > Registergericht München, HRB 106955 > -- "If you can't explain it simply, you don't understand it well enough" - Albert Einstein
Re: [Patch] [v3] Fortran: Fix Bind(C) Array-Descriptor Conversion (Move to Front-End Code)
Hi Tobias, I can only echo Harald's comment that this is an impressive bit of work. I spent some time messing with fc-descriptor-7.f90/gc-descriptor-7-c.cc because it kept failing on me. This came about because I missed one of the chunks not applying in the C component of the test; namely: for (int j = 0; j < 5; ++j) for (int i = 0; i < 10; ++i) { subscripts[0] = j; subscripts[1] = i; if (*(int *) CFI_address (a, subscripts) != (i+1) + 100*(j+1)) abort (); } This set me to wondering whether or not the user should be aware that the result of the transpose intrinsic being passed in this way should not generate a warning that the CFI API must be used in this case and not to depend on the data being transposed? Apart from this I have no other comments, still less corrections :-) Many thanks for the patch - OK for mainline. Paul On Wed, 13 Oct 2021 at 21:11, Harald Anlauf wrote: > Hi Tobias, > > Am 13.10.21 um 18:01 schrieb Tobias Burnus: > > Dear all, > > > > a minor update [→ v3]. > > this has become an impressive work. > > > I searched for XFAIL in Sandra's c-interop/ and found > > two remaining true** xfails, now fixed: > > > > - gfortran.dg/c-interop/typecodes-scalar-basic.f90 > >The conversion of scalars of type(c_ptr) was mishandled; > >fixed now; the fix did run into issues converting a string_cst, > >which has also been fixed. > > > > - gfortran.dg/c-interop/fc-descriptor-7.f90 > >this one uses TRANSPOSE which did not work [now mostly* does] > >→ PR fortran/101309 now also fixed. > > > > I forgot what the exact issue for the latter was. However, when > > looking at the testcase and extending it, I did run into the > > following issue - and at the end the testcase does now pass. > > The issue I had was that when a contiguous check was requested > > (i.e. only copy in when needed) it failed to work, when > > parmse->expr was (a pointer to) a descriptor. I fixed that and > > now most* things work. > > > > OK for mainline? Comments? Suggestions? 
More PRs which > > this patch fixes? Regressions? Test results? > > Doesn't break my own codes so far. > > If nobody else responds within the next days, assume an OK > from my side. > > This will also provide Gerhard with a new playground. ;-) > > Thanks for the patch! > > Harald > > > Tobias > > > > PS: I intend to commit this patch to the OG11 (devel/omp/gcc-11) > > branch, in case someone wants to test it there. > > > > PPS: Nice to have an extensive testcase suite - kudos to Sandra > > in this case. I am sure Gerald will find more issues and once > > it is in, I think I/we have to check some PRs + José's patches > > for additional testcases + follow-up fixes. > > > > (*) I write "most" as passing a (potentially) noncontiguous > > assumed-rank array to a CONTIGUOUS assumed-rank array causes > > an ICE as the scalarizer does not handle dynamic ranks, i.e. > > expr->rank == -1 / ss->dimen = -1. > > I decided that that's a separate issue and filed: > > https://gcc.gnu.org/PR102729 > > BTW, my impression is that fixing that PR might also solve > > the trans*.c part of https://gcc.gnu.org/PR102641 - but I have > > not investigated. > > > > (**) There are still some 'xfail' in comments (outside dg-*) > > whose tests now pass. And those where for two bugs in the same > > statement, only one is reported - and the other only after fixing > > the first one, which is fine. > > > > On 09.10.21 23:48, Tobias Burnus wrote: > >> Hi all, > >> > >> attached is the updated version. Changes: > >> * Handle noncontiguous arrays – with BIND(C), (g)Fortran needs to make > it > >> contiguous in the caller but also handle noncontiguous in the callee. > >> * Fixes/handle 'character(len=*)' with BIND(C); those always use an > >> array descriptor - also with explicit-size and assumed-size arrays > >> * Fixed a bunch of bugs, found when writing extensive testcases. 
> >> * Fixed type(*) handling - those now pass properly type and elem_len > >> on when calling a new function (bind(C) or not). > >> > >> Besides adding the type itself (which is rather straight forward), > >> this patch only had minor modifications – and then the two big > >> conversion functions. > >> > >> While it looks intimidating, it should be comparably simple to > >> review as everything is on one place and hopefully sufficiently > >> well documented. > >> > >> OK – for mainline? Other comments? More PRs which are fixed? > >> Issues not yet fixed (which are inside the scope of this patch)? > >> > >> (If this patch is too long, I also have a nine-day old pending patch > >> at https://gcc.gnu.org/pipermail/gcc-patches/2021-October/580624.html ) > >> > >> Tobias > >> > >> PS: The following still applies. > >> > >> On 06.09.21 12:52, Tobias Burnus wrote: > >>> gfortran's internal array descriptor (xgfc descriptor) and > >>> the descriptor used with BIND(C) (CFI descriptor, ISO_Fortran_binding.h > >>> of TS29113 / Fortran 2018) are
[PATCH v4] Fix ICE when mixing VLAs and statement expressions [PR91038]
Here is the 4th version of the patch.

I tried to implement Jason's suggestion and this also fixes the problem. But I am not sure I understand the condition on the TREE_SIDE_EFFECTS ...

And there is now another problem: c_finish_omp_for in c-family/c-omp.c does not seem to understand the expressions anymore and I get a test failure in testsuite/c-c++-common/gomp/for-5.c where I now get an "invalid increment expression" instead of the expected error. (Bootstrapping and all other tests work fine.)

Martin


Fix ICE when mixing VLAs and statement expressions [PR91038]

When returning VM-types from statement expressions, this can lead to an ICE when declarations from the statement expression are referred to later. Most of these issues can be addressed by gimplifying the base expression earlier in gimplify_compound_lval. Another issue is fixed by adding SAVE_EXPRs in pointer_int_sum in the FE to force a correct order of evaluation. This fixes PR91038 and some of the test cases from PR29970 (structs with VLA members need further work).

2021-08-01  Martin Uecker

gcc/
	PR c/91038
	PR c/29970
	* gimplify.c (gimplify_var_or_parm_decl): Update comment.
	(gimplify_compound_lval): Gimplify base expression first.
	(gimplify_target_expr): Add comment.
	* c-family/c-common.c (pointer_int_sum): Wrap pointer operand in SAVE_EXPR and also add it to the integer argument.

gcc/testsuite/
	PR c/91038
	PR c/29970
	* gcc.dg/vla-stexp-3.c: New test.
	* gcc.dg/vla-stexp-4.c: New test.
	* gcc.dg/vla-stexp-5.c: New test.
	* gcc.dg/vla-stexp-6.c: New test.
	* gcc.dg/vla-stexp-7.c: New test.
	* gcc.dg/vla-stexp-8.c: New test.
	* gcc.dg/vla-stexp-9.c: New test.
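For readers less familiar with variably modified (VM) types, the small sketch below shows why pointer_int_sum needs a size expression at all: adding 1 to a pointer to char[n] advances by n bytes, and n is only known at run time. This is an illustration only, not one of the vla-stexp tests; the function name vm_stride is made up for the example and it deliberately avoids statement expressions.

```c
#include <stddef.h>

/* Returns the distance in bytes between p and p + 1, where p points
   to an array type whose size depends on the run-time value n.
   Requires n > 0 and a compiler with VLA support.  */
static ptrdiff_t vm_stride(int n)
{
    char storage[2][n];       /* variable-length array */
    char (*p)[n] = storage;   /* pointer to a variably modified type */

    /* The compiler scales the integer 1 by sizeof (char[n]), i.e. by
       a size expression that must be evaluated at run time.  */
    return (char *) (p + 1) - (char *) p;
}
```

When the pointer operand itself comes from a statement expression that declares n, the size expression can only be evaluated after the pointer expression, which is the ordering the SAVE_EXPR in the patch is meant to enforce.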
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index 9d19e352725..522085664f5 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -3348,6 +3348,16 @@ pointer_int_sum (location_t loc, enum tree_code resultcode, intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype), TYPE_UNSIGNED (sizetype)), intop); + /* Wrap the pointer expression in a SAVE_EXPR to make sure it + * is evaluated first because the size expression may depend on it + * for VM types. + */ + if (TREE_SIDE_EFFECTS (size_exp)) +{ +ptrop = build1_loc (loc, SAVE_EXPR, TREE_TYPE (ptrop), ptrop); +intop = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, intop); +} + /* Replace the integer argument with a suitable product by the object size. Do this multiplication as signed, then convert to the appropriate type for the pointer operation and disregard an overflow that occurred only diff --git a/gcc/gimplify.c b/gcc/gimplify.c index d8e4b139349..be5b00b6716 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -2958,7 +2958,10 @@ gimplify_var_or_parm_decl (tree *expr_p) declaration, for which we've already issued an error. It would be really nice if the front end wouldn't leak these at all. Currently the only known culprit is C++ destructors, as seen - in g++.old-deja/g++.jason/binding.C. */ + in g++.old-deja/g++.jason/binding.C. + Another possible culpit are size expressions for variably modified + types which are lost in the FE or not gimplified correctly. + */ if (VAR_P (decl) && !DECL_SEEN_IN_BIND_EXPR_P (decl) && !TREE_STATIC (decl) && !DECL_EXTERNAL (decl) @@ -3103,16 +3106,22 @@ gimplify_compound_lval (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, expression until we deal with any variable bounds, sizes, or positions in order to deal with PLACEHOLDER_EXPRs. - So we do this in three steps. First we deal with the annotations - for any variables in the components, then we gimplify the base, - then we gimplify any indices, from left to right. 
*/ + The base expression may contain a statement expression that + has declarations used in size expressions, so has to be + gimplified before gimplifying the size expressions. + + So we do this in three steps. First we deal with variable + bounds, sizes, and positions, then we gimplify the base, + then we deal with the annotations for any variables in the + components and any indices, from left to right. */ + for (i = expr_stack.length () - 1; i >= 0; i--) { tree t = expr_stack[i]; if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF) { - /* Gimplify the low bound and element type size and put them into + /* Deal with the low bound and element type size and put them into the ARRAY_REF. If these values are set, they have already been gimplified. */ if (TREE_OPERAND (t, 2) == NULL_TREE) @@ -3121,18 +3130,8 @@ gimplify_compound_lval (tree
Re: [PATCH v2 0/4] libffi: Sync with upstream
On Sat, Oct 16, 2021 at 1:07 PM David Edelsohn wrote: > > On Sat, Oct 16, 2021 at 3:59 PM H.J. Lu wrote: > > > > On Sat, Oct 16, 2021 at 12:53 PM David Edelsohn wrote: > > > > > > On Sat, Oct 16, 2021 at 1:13 PM H.J. Lu wrote: > > > > > > > > On Sat, Oct 16, 2021 at 10:04 AM David Edelsohn > > > > wrote: > > > > > > > > > > On Sat, Oct 16, 2021 at 7:48 AM H.J. Lu wrote: > > > > > > > > > > > > On Fri, Oct 15, 2021 at 5:22 PM David Edelsohn > > > > > > wrote: > > > > > > > > > > > > > > On Fri, Oct 15, 2021 at 8:06 PM H.J. Lu > > > > > > > wrote: > > > > > > > > > > > > > > > > On Wed, Oct 13, 2021 at 6:42 AM H.J. Lu > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > On Wed, Oct 13, 2021 at 6:03 AM Richard Biener > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > On Wed, Oct 13, 2021 at 2:56 PM H.J. Lu > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > On Wed, Oct 13, 2021 at 5:45 AM Richard Biener > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 2, 2021 at 5:50 PM H.J. Lu > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > Change in the v2 patch: > > > > > > > > > > > > > > > > > > > > > > > > > > 1. Disable static trampolines by default. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > GCC maintained a copy of libffi snapshot from 2009 > > > > > > > > > > > > > and cherry-picked fixes > > > > > > > > > > > > > from upstream over the last 10+ years. In the > > > > > > > > > > > > > meantime, libffi upstream > > > > > > > > > > > > > has been changed significantly with new features, bug > > > > > > > > > > > > > fixes and new target > > > > > > > > > > > > > support. Here is a set of patches to sync with > > > > > > > > > > > > > libffi 3.4.2 release and > > > > > > > > > > > > > make it easier to sync with libffi upstream: > > > > > > > > > > > > > > > > > > > > > > > > > > 1. Document how to sync with upstream. 
> > > > > > > > > > > > > 2. Add scripts to help sync with upstream. > > > > > > > > > > > > > 3. Sync with libffi 3.4.2. This patch is quite big. > > > > > > > > > > > > > It is availale at > > > > > > > > > > > > > > > > > > > > > > > > > > https://gitlab.com/x86-gcc/gcc/-/commit/15e80c879c571f79a0e57702848a9df5fba5be2f > > > > > > > > > > > > > 4. Integrate libffi build and testsuite with GCC. > > > > > > > > > > > > > > > > > > > > > > > > How did you test this? It looks like libgo is the only > > > > > > > > > > > > consumer of > > > > > > > > > > > > libffi these days. > > > > > > > > > > > > In particular go/libgo seems to be supported on almost > > > > > > > > > > > > all targets besides > > > > > > > > > > > > darwin/windows - did you test cross and canadian > > > > > > > > > > > > configurations? > > > > > > > > > > > > > > > > > > > > > > I only tested it on Linux/i686 and Linux/x86-64. My > > > > > > > > > > > understanding is that > > > > > > > > > > > the upstream libffi works on Darwin and Windows. > > > > > > > > > > > > > > > > > > > > > > > I applaud the attempt to sync to upsteam but I fear you > > > > > > > > > > > > won't get any "review" > > > > > > > > > > > > of this massive diff. > > > > > > > > > > > > > > > > > > > > > > I believe that it should just work. Our libffi is very > > > > > > > > > > > much out of date. > > > > > > > > > > > > > > > > > > > > Yes, you can hope. And yes, our libffi is out of date. > > > > > > > > > > > > > > > > > > > > Can you please do the extra step to test one weird > > > > > > > > > > architecture, namely > > > > > > > > > > powerpc64-aix which is available on the compile-farm? > > > > > > > > > > > > > > > > > > I will give it a try and report back. > > > > > > > > > > > > > > > > > > > If that goes well I think it's good to "hope" at this point > > > > > > > > > > (and plenty of > > > > > > > > > > time to fix fallout until the GCC 12 release). 
> > > > > > > > > > > > > > > > > > > > Thus OK after the extra testing dance and waiting until > > > > > > > > > > early next > > > > > > > > > > week so others can throw in a veto. > > > > > > > > > > > > > > > > I tried to bootstrap GCC master branch on gcc119.fsffrance.org: > > > > > > > > > > > > > > > > * MT/MODEL: 8284-22A > > > > > > > > * > > > > > > > > * Partition: gcc119 > > > > > > > > * > > > > > > > > *System: power8-aix.osuosl.org > > > > > > > > * > > > > > > > > * O/S: AIX V7.2 7200-04-03-2038 > > > > > > > > > > > > > > > > I configured GCC with > > > > > > > > > > > > > > > > --with-as=/usr/bin/as --with-ld=/usr/bin/ld > > > > > > > > --enable-version-specific-runtime-libs --disable-nls > > > > > > > > --enable-decimal-float=dpd --disable-libstdcxx-pch > > > > > > > > --disable-werror > > > > > > > > --enable-__cxa_atexit
[PATCH] bfin: Popcount-related improvements to machine description.
Blackfin processors support a ONES instruction that implements a 32-bit popcount returning a 16-bit result. This instruction was previously described by GCC's bfin backend using an UNSPEC, but with this patch uses a POPCOUNT:SI rtx to capture the semantics, allowing it to be evaluated at compile-time. I've decided to keep the instruction name the same (avoiding any changes to the __builtin_bfin_ones machinery), but have provided popcountsi2 and popcounthi2 expanders so that the middle-end can use this instruction to implement __builtin_popcount (and __builtin_parity).

The new testcase ones.c

short foo ()
{
  int t = 5;
  short r = __builtin_bfin_ones(t);
  return r;
}

previously generated:

_foo:   nop;
        nop;
        R0 = 5 (X);
        R0.L = ONES R0;
        rts;

with this patch, now generates:

_foo:   nop;
        nop;
        nop;
        R0 = 2 (X);
        rts;

The new testcase popcount.c

int foo(int x) { return __builtin_popcount(x); }

previously generated:

_foo:   [--SP] = RETS;
        SP += -12;
        call ___popcountsi2;
        SP += 12;
        RETS = [SP++];
        rts;

now generates:

_foo:   nop;
        nop;
        R0.L = ONES R0;
        R0 = R0.L (Z);
        rts;

And the new testcase parity.c

int foo(int x) { return __builtin_parity(x); }

previously generated:

_foo:   [--SP] = RETS;
        SP += -12;
        call ___paritysi2;
        SP += 12;
        RETS = [SP++];
        rts;

now generates:

_foo:   nop;
        R1 = 1 (X);
        R0.L = ONES R0;
        R0 = R1 & R0;
        rts;

This patch has been tested on a cross-compiler to bfin-elf hosted on x86_64-pc-linux-gnu, but without a toolchain, and shows no regressions in the compile-only parts of the testsuite. Ok for mainline?

2021-10-17  Roger Sayle

gcc/ChangeLog
	* config/bfin/bfin.md (define_constants): Remove UNSPEC_ONES.
	(define_insn "ones"): Replace UNSPEC_ONES with a truncate of a popcount, allowing compile-time evaluation/simplification.
	(popcountsi2, popcounthi2): New expanders using a "ones" insn.

gcc/testsuite/ChangeLog
	* gcc.target/bfin/ones.c: New test case.
	* gcc.target/bfin/parity.c: New test case.
	* gcc.target/bfin/popcount.c: New test case.
Thanks in advance, Roger -- diff --git a/gcc/config/bfin/bfin.md b/gcc/config/bfin/bfin.md index 1ec0bbb..8b311f3 100644 --- a/gcc/config/bfin/bfin.md +++ b/gcc/config/bfin/bfin.md @@ -138,8 +138,7 @@ ;; Distinguish a 32-bit version of an insn from a 16-bit version. (UNSPEC_32BIT 11) (UNSPEC_NOP 12) - (UNSPEC_ONES 13) - (UNSPEC_ATOMIC 14)]) + (UNSPEC_ATOMIC 13)]) (define_constants [(UNSPEC_VOLATILE_CSYNC 1) @@ -1398,12 +1397,32 @@ (define_insn "ones" [(set (match_operand:HI 0 "register_operand" "=d") - (unspec:HI [(match_operand:SI 1 "register_operand" "d")] - UNSPEC_ONES))] + (truncate:HI +(popcount:SI (match_operand:SI 1 "register_operand" "d"] "" "%h0 = ONES %1;" [(set_attr "type" "alu0")]) +(define_expand "popcountsi2" + [(set (match_dup 2) + (truncate:HI (popcount:SI (match_operand:SI 1 "register_operand" "" + (set (match_operand:SI 0 "register_operand") + (zero_extend:SI (match_dup 2)))] + "" +{ + operands[2] = gen_reg_rtx (HImode); +}) + +(define_expand "popcounthi2" + [(set (match_dup 2) + (zero_extend:SI (match_operand:HI 1 "register_operand" ""))) + (set (match_operand:HI 0 "register_operand") + (truncate:HI (popcount:SI (match_dup 2] + "" +{ + operands[2] = gen_reg_rtx (SImode); +}) + (define_insn "smaxsi3" [(set (match_operand:SI 0 "register_operand" "=d") (smax:SI (match_operand:SI 1 "register_operand" "d") /* { dg-do compile } */ /* { dg-options "-O2" } */ short foo () { int t = 5; short r = __builtin_bfin_ones(t); return r; } /* { dg-final { scan-assembler-not "ONES" } } */ /* { dg-do compile } */ /* { dg-options "-O2" } */ int foo(int x) { return __builtin_parity(x); } /* { dg-final { scan-assembler "ONES" } } */ /* { dg-do compile } */ /* { dg-options "-O2" } */ int foo(int x) { return __builtin_popcount(x); } /* { dg-final { scan-assembler "ONES" } } */
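The relationship between ONES, popcount and parity that the expanders rely on can be modelled in portable C. This is only a sketch of the semantics described above, not Blackfin code; ones32, popcount_via_ones and parity_via_ones are hypothetical names.

```c
#include <stdint.h>

/* Model of ONES: count the set bits of a 32-bit value and return the
   count as a 16-bit result (the count is at most 32).  */
static uint16_t ones32(uint32_t x)
{
    uint16_t count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* popcountsi2: zero-extend the 16-bit ONES result to 32 bits.  */
static int popcount_via_ones(uint32_t x)
{
    return (int) ones32(x);
}

/* Parity is the low bit of the population count, matching the
   "R0 = R1 & R0" with R1 = 1 in the generated code.  */
static int parity_via_ones(uint32_t x)
{
    return ones32(x) & 1;
}
```

With this model, ones32 (5) is 2, which is the constant the patched compiler materializes directly for the ones.c testcase.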
[PATCH] Constant fold SS_NEG and SS_ABS in simplify-rtx.c
This simple patch performs compile-time constant folding of signed saturating negation and signed saturating absolute value in the RTL optimizers. Normally in two's complement arithmetic the lowest representable signed value overflows on negation; with these saturating operators it "saturates" to the maximum representable signed value, so SS_NEG:QI -128 is 127, and SS_ABS:HI -32768 is 32767.

On bfin-elf, the following two short functions:

short foo()
{
  short t = -32768;
  short r = __builtin_bfin_negate_fr1x16(t);
  return r;
}

int bar()
{
  int t = -2147483648;
  int r = __builtin_bfin_abs_fr1x32(t);
  return r;
}

currently compile to:

_foo:   nop;
        nop;
        R0 = -32768 (X);
        R0 = -R0 (V);
        rts;

_bar:   nop;
        R0 = -1 (X);
        R0 <<= 31;
        R0 = abs R0;
        rts;

but with this middle-end patch now compile to:

_foo:   nop;
        nop;
        nop;
        R0 = 32767 (X);
        rts;

_bar:   nop;
        nop;
        R0 = -1 (X);
        R0.H = 32767;
        rts;

This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no new failures. Ok for mainline?

2021-10-17  Roger Sayle

gcc/ChangeLog
	* simplify-rtx.c (simplify_const_unary_operation) [SS_NEG, SS_ABS]: Evaluate SS_NEG and SS_ABS of a constant argument.

gcc/testsuite/ChangeLog
	* gcc.target/bfin/ssabs.c: New test case.
	* gcc.target/bfin/ssneg.c: New test case.
Thanks in advance, Roger -- diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index e4fae0b..2bb18fb 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -2026,6 +2026,20 @@ simplify_const_unary_operation (enum rtx_code code, machine_mode mode, result = wide_int::from (op0, width, SIGNED); break; + case SS_NEG: + if (wi::only_sign_bit_p (op0)) + result = wi::max_value (GET_MODE_PRECISION (imode), SIGNED); + else + result = wi::neg (op0); + break; + + case SS_ABS: + if (wi::only_sign_bit_p (op0)) + result = wi::max_value (GET_MODE_PRECISION (imode), SIGNED); + else + result = wi::abs (op0); + break; + case SQRT: default: return 0; /* { dg-do compile } */ /* { dg-options "-O2" } */ int foo() { int t = -2147483648; int r = __builtin_bfin_abs_fr1x32(t); return r; } /* { dg-final { scan-assembler "32767" } } */ /* { dg-do compile } */ /* { dg-options "-O2" } */ short foo() { short t = -32768; short r = __builtin_bfin_negate_fr1x16(t); return r; } /* { dg-final { scan-assembler "32767" } } */
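The folding rule in the patch (saturate when only the sign bit is set, otherwise negate or take the absolute value as usual) can be written out in plain C for fixed widths. The sketch below models the semantics only; ss_neg8, ss_neg16 and ss_abs32 are hypothetical names, not GCC or Blackfin functions.

```c
#include <stdint.h>

/* Signed saturating negation, 8-bit (models SS_NEG:QI).  */
static int8_t ss_neg8(int8_t x)
{
    return x == INT8_MIN ? INT8_MAX : (int8_t) -x;
}

/* Signed saturating negation, 16-bit (models SS_NEG:HI).  */
static int16_t ss_neg16(int16_t x)
{
    return x == INT16_MIN ? INT16_MAX : (int16_t) -x;
}

/* Signed saturating absolute value, 32-bit (models SS_ABS:SI).
   The minimum value is tested first so that -x never overflows.  */
static int32_t ss_abs32(int32_t x)
{
    if (x == INT32_MIN)
        return INT32_MAX;
    return x < 0 ? -x : x;
}
```

The checks against INT8_MIN, INT16_MIN and INT32_MIN play the role of wi::only_sign_bit_p in the patch: the minimum value is the only input whose two's complement representation has just the sign bit set.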
[committed] wwwdocs: nongnu.org wants to be known as www.nongnu.org
--- htdocs/git.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/htdocs/git.html b/htdocs/git.html index ac1f2eb9..881f1d38 100644 --- a/htdocs/git.html +++ b/htdocs/git.html @@ -334,7 +334,7 @@ in Git. modula-2 This branch is for the -http://nongnu.org/gm2/homepage.html;>GNU Modula-2 +http://www.nongnu.org/gm2/homepage.html;>GNU Modula-2 front end to GCC prior to its integration with the mainline. The branch will be regularly rebased against the mainline. It is maintained by -- 2.33.0
[committed] wwwdocs: Remove link to DWARF standard
We've got a number of links to the DWARF standard on our page, which requires some link maintenance. Remove this one for GCC 7 which is unlikely to be used (much). --- htdocs/gcc-7/changes.html | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/htdocs/gcc-7/changes.html b/htdocs/gcc-7/changes.html index a040a80a..5bcb59b6 100644 --- a/htdocs/gcc-7/changes.html +++ b/htdocs/gcc-7/changes.html @@ -155,8 +155,7 @@ main (int argc, char **argv) UndefinedBehavior Sanitizer now diagnoses arithmetic overflows even on arithmetic operations with generic vectors. - Version 5 of the http://www.dwarfstd.org/Download.php;>DWARF debugging + Version 5 of the DWARF debugging information standard is supported through the -gdwarf-5 option. The DWARF version 4 debugging information remains the default until consumers of debugging information are adjusted. -- 2.33.0