[committed] wwwdocs: gcc-4.1/changes.html: Rework/reduce Classpath links

2021-10-17 Thread Gerald Pfeifer
Adjust one of two links to classpath.org and avoid the other, by
removing the respective paragraph which is really not relevant any
longer.
---
 htdocs/gcc-4.1/changes.html | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/htdocs/gcc-4.1/changes.html b/htdocs/gcc-4.1/changes.html
index 07c76dfe..5c2708aa 100644
--- a/htdocs/gcc-4.1/changes.html
+++ b/htdocs/gcc-4.1/changes.html
@@ -324,11 +324,6 @@
rewrites. All image drawing operations should now work
correctly (flipping requires gtk+ = 2.6)
 
-   Future Graphics2D, image and text work is
-   documented at: http://developer.classpath.org/mediation/ClasspathGraphicsImagesText;>
-   http://developer.classpath.org/mediation/ClasspathGraphicsImagesText
-   
-
When gtk+ 2.6 or higher is installed the default log
handler will produce stack traces whenever a WARNING,
CRITICAL or ERROR message is produced.
@@ -543,8 +538,8 @@
likely contain bugs).
 
Documentation fixes all over the place.  See
-   http://developer.classpath.org/doc/;>
-   http://developer.classpath.org/doc/
+   https://developer.classpath.org/doc/;>
+   https://developer.classpath.org/doc/
  

   
-- 
2.33.0


Re: [PATCH] Adjust testcase for O2 vectorization.

2021-10-17 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 15, 2021 at 3:11 PM Kewen.Lin via Gcc-patches
 wrote:
>
> on 2021/10/14 6:56 PM, Kewen.Lin via Gcc-patches wrote:
> > Hi Hongtao,
> >
> > on 2021/10/14 3:11 PM, liuhongt wrote:
> >> Hi Kewen:
> >>   Could you help verify whether this patch fixes those regressions
> >> for the rs6000 port?
> >>
> >
> > The ppc64le run just finished; there are still some regressions:
> >
> > NA->XPASS: c-c++-common/Wstringop-overflow-2.c  -Wc++-compat   (test for 
> > warnings, line 194)
> > NA->XPASS: c-c++-common/Wstringop-overflow-2.c  -Wc++-compat   (test for 
> > warnings, line 212)
> > NA->XPASS: c-c++-common/Wstringop-overflow-2.c  -Wc++-compat   (test for 
> > warnings, line 296)
> > NA->XPASS: c-c++-common/Wstringop-overflow-2.c  -Wc++-compat   (test for 
> > warnings, line 314)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c (test for excess errors)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c  (test for warnings, line 18)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c  (test for warnings, line 29)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c  (test for warnings, line 45)
> > NA->FAIL: gcc.dg/Wstringop-overflow-21-novec.c  (test for warnings, line 55)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, 
> > line 104)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, 
> > line 137)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, 
> > line 19)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, 
> > line 39)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, 
> > line 56)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c note (test for warnings, 
> > line 70)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c (test for excess errors)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c  (test for warnings, line 
> > 116)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c  (test for warnings, line 
> > 131)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c  (test for warnings, line 
> > 146)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c  (test for warnings, line 33)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c  (test for warnings, line 50)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c  (test for warnings, line 64)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c  (test for warnings, line 78)
> > NA->FAIL: gcc.dg/Wstringop-overflow-76-novec.c  (test for warnings, line 97)
> > PASS->FAIL: c-c++-common/Wstringop-overflow-2.c  -std=gnu++14 (test for 
> > excess errors)
> > NA->FAIL: c-c++-common/Wstringop-overflow-2.c  -std=gnu++14  (test for 
> > warnings, line 229)
> > NA->FAIL: c-c++-common/Wstringop-overflow-2.c  -std=gnu++14  (test for 
> > warnings, line 230)
> > NA->FAIL: c-c++-common/Wstringop-overflow-2.c  -std=gnu++14  (test for 
> > warnings, line 331)
> > NA->FAIL: c-c++-common/Wstringop-overflow-2.c  -std=gnu++14  (test for 
> > warnings, line 332)
> > // omitting -std=gnu++17, -std=gnu++2a, -std=gnu++98
> >
> > I'll have a look and get back to you tomorrow.
> >
>
> The failure in c-c++-common/Wstringop-overflow-2.c occurs because the
> proc check_vect_slp_vnqihi_store_usage is currently cached, but its
> result can vary with the input pattern.  For rs6000 the test for
> v2qi fails, and the cached result then makes the v4qi check fail
> unexpectedly (it should pass).  I adjusted the caching for its users,
> check_effective_target_vect_slp_v*_store, and also refactored a bit.
> One trivial change is to add a new macro argument so that we compile
> only the corresponding foo* function instead of all of them; I hope
> that helps keep the debugging output compact.
>
> For the failure in Wstringop-overflow-76-novec.c, there is one typo
> compared to the original Wstringop-overflow-76.c.  I guess it failed
> on x86 too?  It would be surprising if it passed on x86.
> As for the failure in Wstringop-overflow-21-novec.c, I confirmed it's
> just noise: patching the typos caused this failure.
Thanks for explaining those failures and the typo; I'll
adjust the patch.
>
> A new round of ppc64le testing just finished with the diff below, and all
> previous regressions are fixed without any new regressions.
>
>
> diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-76-novec.c 
> b/gcc/testsuite/gcc.dg/Wstringop-overflow-76-novec.c
> index d000b587a65..1132348c5f4 100644
> --- a/gcc/testsuite/gcc.dg/Wstringop-overflow-76-novec.c
> +++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-76-novec.c
> @@ -82,7 +82,7 @@ void max_d8_p (char *q, int i)
>  struct A3_5
>  {
>char a3[3];  // { dg-message "at offset 3 into destination object 'a3' of 
> size 3" "pr??" { xfail *-*-* } }
> -  char a5[5];
> +  char a5[5];  // { dg-message "at offset 5 into destination object 'a5' of 
> size 5" "note" }
>  };
>
>  void max_A3_A5 (int i, struct A3_5 *pa3_5)
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 530c5769614..8736b908ec7 100644
> --- 

Re: [PATCH] Adjust testcase for O2 vectorization.

2021-10-17 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 15, 2021 at 11:37 PM Martin Sebor  wrote:
>
> On 10/14/21 1:11 AM, liuhongt wrote:
> > Hi Kewen:
> > Could you help verify whether this patch fixes those regressions
> > for the rs6000 port?
> >
> > As discussed in [1], this patch adds xfail/target selectors to those
> > testcases and also makes a copy of them so that they can be tested w/o
> > vectorization.
>
> Just to make sure I understand what's happening with the tests:
> the new -N-novec.c tests consist of just the cases xfailed due
> to vectorization in the corresponding -N.c tests?  Or are there
Wstringop-overflow-2-novec.c is the same as Wstringop-overflow-2.c
before O2 vectorization adjustment.
Do you want me to reduce them to only contain cases for new xfail/target?
> some other differences (e.g., new cases in them, etc.)?  I'd
> hope to eventually remove the -novec.c tests once all warnings
> behave as expected with vectorization as without it (maybe
> keeping just one case both ways as a sanity check).
>
> For the target-supports selectors, I confess I don't know enough
> about vectorization to find their names quite intuitive enough
> to know when to use each.  For instance, for vect_slp_v4qi_store:
It's for 4-byte char stores whose address is 4-byte aligned,
i.e.
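
A minimal sketch of the kind of store this selector is about, assuming a
hypothetical 4-byte aligned buffer named dst (the names here are illustrative
and are not taken from target-supports.exp):

```c
/* Hypothetical destination buffer, forced to 4-byte alignment.  */
static char dst[8] __attribute__ ((aligned (4)));

/* Four adjacent char stores starting at a 4-byte aligned address.
   An SLP vectorizer may merge them into one 4-byte (V4QI) store.  */
void
store4 (char a, char b, char c, char d)
{
  dst[0] = a;
  dst[1] = b;
  dst[2] = c;
  dst[3] = d;
}

/* Accessor so the stored bytes can be inspected.  */
const char *
store4_result (void)
{
  return dst;
}
```

Whether the four stores really become a single vector store depends on the
target and the optimization level; the sketch only shows the source pattern.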

>
> +# Return the true if target support vectorization of v4qi store.
> +proc check_effective_target_vect_slp_v4qi_store { } {
> +set pattern {add new stmt: MEM }
> +return [expr { [check_vect_slp_vnqihi_store_usage $pattern ] != 0 }]
> +}
>
> When should this selector be used?  In cases involving 4-byte
> char stores?  Only naturally aligned 4-bytes stores (i.e., on
> a 4 byte boundary, as the check_vect_slp_vnqihi_store_usage
> suggests?) Or 4-byte stores of any types (e.g., four chars
> as well as two 16-bit shorts), etc.?
>
> Hopefully once all the warnings handle vectorization we won't
> need to use them, but until then it would be good to document
> this in more detail in the .exp file.
>
> Finally, thank you for adding comments to the xfailed tests
> referencing the corresponding bugs!  Can you please mention
> the PR in the comment in each of the new xfails?  Like so:
>
> index 7d29b5f48c7..cb687c69324 100644
> --- a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
> +++ b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
> @@ -189,8 +189,9 @@ void ga1__ (void)
>
> struct A1 a = { 1 };
> a.a[0] = 0;
> +  // O2 vectorization regress Wstringop-overflow case (1), refer to
> pr102462.
> a.a[1] = 1;// { dg-warning
> "\\\[-Wstringop-overflow" }
> -  a.a[2] = 2;// { dg-warning
> "\\\[-Wstringop-overflow" "" { xfail { i?86-*-* x86_64-*-* } } }
> +  a.a[2] = 2;// { dg-warning
> "\\\[-Wstringop-overflow" "pr102462" { xfail { vect_slp_v2qi_store } } }
> 
> PR in dg-warning comment.
>
> This should make it easier to deal with the XFAILs once
> the warnings have improved to handle vectorization.
Will do.
>
> Martin



-- 
BR,
Hongtao


Re: [RFC] Don't move cold code out of loop by checking bb count

2021-10-17 Thread Xionghu Luo via Gcc-patches



On 2021/10/15 16:11, Richard Biener wrote:
> On Sat, Oct 9, 2021 at 5:45 AM Xionghu Luo  wrote:
>>
>> Hi,
>>
>> On 2021/9/28 20:09, Richard Biener wrote:
>>> On Fri, Sep 24, 2021 at 8:29 AM Xionghu Luo  wrote:

 Update the patch to v3, not sure whether you prefer the paste style
 and continue to link the previous thread as Segher dislikes this...


 [PATCH v3] Don't move cold code out of loop by checking bb count


 Changes:
 1. Handle max_loop in determine_max_movement instead of
 outermost_invariant_loop.
 2. Remove unnecessary changes.
 3. Add for_all_locs_in_loop (loop, ref, ref_in_loop_hot_body) in 
 can_sm_ref_p.
 4. "gsi_next ();" in move_computations_worker is kept: leaving it out caused
 an infinite loop when implementing v1, because the iterator was never
 advanced.

 v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576488.html
 v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579086.html

 There was an earlier patch that tried to avoid moving cold blocks out of a loop:

 https://gcc.gnu.org/pipermail/gcc/2014-November/215551.html

 Richard suggested to "never hoist anything from a bb with lower execution
 frequency to a bb with higher one in LIM invariantness_dom_walker
 before_dom_children".

 In the gimple LIM analysis, add find_coldest_out_loop to move invariants to
 the expected target loop; if the profile count of the loop bb is colder
 than the target loop's preheader, the invariant won't be hoisted out of the
 loop.  Likewise for store motion: if all locations of the REF in the loop
 are cold, don't do store motion on it.
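
 The hotness rule described above can be sketched in isolation.  This is only
 an illustration under stated assumptions: struct bb_sketch and
 hoist_allowed_p are hypothetical names, and a plain integer stands in for
 GCC's profile_count.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a basic block; only the profile count
   matters for this sketch.  */
struct bb_sketch
{
  uint64_t count;   /* estimated number of executions */
};

/* Return true if moving a statement from BB into PREHEADER respects the
   rule "never hoist from a colder block into a hotter one".  */
bool
hoist_allowed_p (const struct bb_sketch *preheader,
                 const struct bb_sketch *bb)
{
  return preheader->count <= bb->count;
}
```

 A block that runs less often than the preheader is rejected, which mirrors
 the "preheader->count > bb->count" early return in the RTL part of the patch.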

 SPEC2017 performance evaluation shows 1% performance improvement for
 intrate GEOMEAN and no obvious regression for others.  Especially,
 500.perlbench_r +7.52% (Perf shows function S_regtry of perlbench is
 largely improved.), and 548.exchange2_r+1.98%, 526.blender_r +1.00%
 on P8LE.

 gcc/ChangeLog:

 * loop-invariant.c (find_invariants_bb): Check profile count
 before motion.
 (find_invariants_body): Add argument.
 * tree-ssa-loop-im.c (find_coldest_out_loop): New function.
 (determine_max_movement): Use find_coldest_out_loop.
 (move_computations_worker): Adjust and fix iteration update.
 (execute_sm_exit): Check pointer validness.
 (class ref_in_loop_hot_body): New functor.
 (ref_in_loop_hot_body::operator): New.
 (can_sm_ref_p): Use for_all_locs_in_loop.

 gcc/testsuite/ChangeLog:

 * gcc.dg/tree-ssa/recip-3.c: Adjust.
 * gcc.dg/tree-ssa/ssa-lim-18.c: New test.
 * gcc.dg/tree-ssa/ssa-lim-19.c: New test.
 * gcc.dg/tree-ssa/ssa-lim-20.c: New test.
 ---
  gcc/loop-invariant.c   | 10 ++--
  gcc/tree-ssa-loop-im.c | 61 --
  gcc/testsuite/gcc.dg/tree-ssa/recip-3.c|  2 +-
  gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c | 20 +++
  gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c | 27 ++
  gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-20.c | 25 +
  gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c | 28 ++
  7 files changed, 165 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-20.c
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c

 diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
 index fca0c2b24be..5c3be7bf0eb 100644
 --- a/gcc/loop-invariant.c
 +++ b/gcc/loop-invariant.c
 @@ -1183,9 +1183,14 @@ find_invariants_insn (rtx_insn *insn, bool 
 always_reached, bool always_executed)
 call.  */

  static void
 -find_invariants_bb (basic_block bb, bool always_reached, bool 
 always_executed)
 +find_invariants_bb (class loop *loop, basic_block bb, bool always_reached,
 +   bool always_executed)
  {
rtx_insn *insn;
 +  basic_block preheader = loop_preheader_edge (loop)->src;
 +
 +  if (preheader->count > bb->count)
 +return;

FOR_BB_INSNS (bb, insn)
  {
 @@ -1214,8 +1219,7 @@ find_invariants_body (class loop *loop, basic_block 
 *body,
unsigned i;

for (i = 0; i < loop->num_nodes; i++)
 -find_invariants_bb (body[i],
 -   bitmap_bit_p (always_reached, i),
 +find_invariants_bb (loop, body[i], bitmap_bit_p (always_reached, i),
 bitmap_bit_p (always_executed, i));
  }

 diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
 index 4b187c2cdaf..655fab03442 100644
 --- a/gcc/tree-ssa-loop-im.c
 +++ b/gcc/tree-ssa-loop-im.c

[PATCH] tree-object-size: Avoid unnecessary processing of __builtin_object_size

2021-10-17 Thread Siddhesh Poyarekar
This is a minor cleanup to bail out early if the result of
__builtin_object_size is not assigned to anything and avoid initializing
the object size arrays.
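
For illustration, here is a sketch of the two situations at the source level.
The function names are hypothetical, and whether the second call actually
reaches the pass as a call without an LHS depends on earlier folding, so treat
it as an assumption rather than a reproducer.

```c
#include <stddef.h>

static char buf[16];

/* The result is assigned, so the call has an LHS and there is
   something to compute.  */
size_t
size_used (void)
{
  size_t n = __builtin_object_size (buf, 0);
  return n;
}

/* The result is discarded.  If such a call survives to the pass it has
   no LHS, which is the case the patch bails out on early.  */
void
size_unused (const char *p)
{
  (void) __builtin_object_size (p, 0);
}
```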

gcc/ChangeLog:

* tree-object-size.c (object_sizes_execute): Consolidate LHS null
check and do it early.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/tree-object-size.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-object-size.c b/gcc/tree-object-size.c
index 6a4dc724f34..46a976dfe10 100644
--- a/gcc/tree-object-size.c
+++ b/gcc/tree-object-size.c
@@ -1298,6 +1298,10 @@ object_sizes_execute (function *fun, bool 
insert_min_max_p)
  if (!gimple_call_builtin_p (call, BUILT_IN_OBJECT_SIZE))
continue;
 
+ tree lhs = gimple_call_lhs (call);
+ if (!lhs)
+   continue;
+
  init_object_sizes ();
 
  /* If insert_min_max_p, only attempt to fold
@@ -1312,11 +1316,9 @@ object_sizes_execute (function *fun, bool 
insert_min_max_p)
{
  unsigned HOST_WIDE_INT object_size_type = tree_to_uhwi (ost);
  tree ptr = gimple_call_arg (call, 0);
- tree lhs = gimple_call_lhs (call);
  if ((object_size_type == 1 || object_size_type == 3)
  && (TREE_CODE (ptr) == ADDR_EXPR
- || TREE_CODE (ptr) == SSA_NAME)
- && lhs)
+ || TREE_CODE (ptr) == SSA_NAME))
{
  tree type = TREE_TYPE (lhs);
  unsigned HOST_WIDE_INT bytes;
@@ -1339,10 +1341,6 @@ object_sizes_execute (function *fun, bool 
insert_min_max_p)
  continue;
}
 
- tree lhs = gimple_call_lhs (call);
- if (!lhs)
-   continue;
-
  result = gimple_fold_stmt_to_constant (call, do_valueize);
  if (!result)
{
-- 
2.31.1



Re: [RFC PATCH 0/8] RISC-V: Bit-manipulation extension.

2021-10-17 Thread Kito Cheng via Gcc-patches
Hi Vineet:

I am not familiar with buildroot, so I am not sure which GCC version will work,
but I think the patch set should apply to both gcc 11.1 and
trunk without conflicts.

Here is gcc 11.1 plus this patch set on my GitHub; I hope this helps :)
https://github.com/kito-cheng/riscv-gcc/tree/riscv-gcc-11.1.0-zbabcs

On Thu, Oct 14, 2021 at 4:22 AM Vineet Gupta  wrote:
>
> Hi Kito,
>
> On 9/23/21 12:57 AM, Kito Cheng wrote:
> > The bit-manipulation extension[1] is finishing public review and waiting for
> > the rest of the ratification process. I believe it will become a ratified
> > extension soon, so I think it's time to submit it upstream for review now :)
> >
> > As the RFC in the title indicates, there is no rush to merge to trunk yet;
> > I would like to hold off merging until it is officially ratified.
> >
> > This patch set implements the bit-manipulation extension, which
> > includes the zba, zbb, zbc and zbs extensions, but only as
> > instruction/md patterns, with no intrinsic function implementation.
> >
> > Most of the work was done by Jim Wilson and many other contributors
> > on https://github.com/riscv-collab/riscv-gcc.
> >
> >
> > [1] https://github.com/riscv/riscv-bitmanip/releases/tag/1.0.0
>
> I wanted to give these a try. Is it reasonable to apply these to a gcc
> 11.1 baseline and give them a spin in buildroot, or do these absolutely have
> to be on bleeding-edge gcc?
>
> Thx,
> -Vineet


Re: [PATCH] AVX512FP16: Add *_set1_pch intrinsics.

2021-10-17 Thread Hongtao Liu via Gcc-patches
On Fri, Oct 15, 2021 at 4:38 PM dianhong.xu--- via Gcc-patches
 wrote:
>
> From: dianhong xu 
>
> Add *_set1_pch (_Float16 _Complex A) intrinsics.
>
> gcc/ChangeLog:
>
> * config/i386/avx512fp16intrin.h:
> (_mm512_set1_pch): New intrinsic.
> * config/i386/avx512fp16vlintrin.h:
> (_mm256_set1_pch): New intrinsic.
> (_mm_set1_pch): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512fp16-set1-pch-1a.c: New test.
> * gcc.target/i386/avx512fp16-set1-pch-1b.c: New test.
> * gcc.target/i386/avx512fp16vl-set1-pch-1a.c: New test.
> * gcc.target/i386/avx512fp16vl-set1-pch-1b.c: New test.
LGTM.
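
As a side note for readers, the union trick in the patch relies on a
_Float16 _Complex occupying the same 32 bits as a float, so broadcasting the
float lane replicates the (real, imaginary) pair.  A portable sketch of that
idea, assuming a hypothetical struct half_pair of two 16-bit fields in place
of _Float16 _Complex and a plain array in place of a vector register:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for _Float16 _Complex: two 16-bit halves.  */
struct half_pair
{
  uint16_t re;
  uint16_t im;
};

/* Fill LANES 32-bit lanes of OUT with the same (re, im) pair, the way a
   32-bit broadcast replicates a complex half-precision value.  */
void
broadcast_pair (struct half_pair v, uint32_t *out, size_t lanes)
{
  uint32_t bits;

  /* Reinterpret the pair as one 32-bit unit.  */
  memcpy (&bits, &v, sizeof bits);
  for (size_t i = 0; i < lanes; i++)
    out[i] = bits;
}
```

With 16 lanes this corresponds to the 512-bit case; 8 and 4 lanes correspond
to the 256-bit and 128-bit variants.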
> ---
>  gcc/config/i386/avx512fp16intrin.h| 13 +
>  gcc/config/i386/avx512fp16vlintrin.h  | 26 +
>  .../gcc.target/i386/avx512fp16-set1-pch-1a.c  | 13 +
>  .../gcc.target/i386/avx512fp16-set1-pch-1b.c  | 42 ++
>  .../i386/avx512fp16vl-set1-pch-1a.c   | 20 +++
>  .../i386/avx512fp16vl-set1-pch-1b.c   | 57 +++
>  6 files changed, 171 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1b.c
>
> diff --git a/gcc/config/i386/avx512fp16intrin.h 
> b/gcc/config/i386/avx512fp16intrin.h
> index 079ce321c01..17025d68b8e 100644
> --- a/gcc/config/i386/avx512fp16intrin.h
> +++ b/gcc/config/i386/avx512fp16intrin.h
> @@ -7237,6 +7237,19 @@ _mm512_permutexvar_ph (__m512i __A, __m512h __B)
>  (__mmask32)-1);
>  }
>
> +extern __inline __m512h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_set1_pch (_Float16 _Complex __A)
> +{
> +  union
> +  {
> +_Float16 _Complex a;
> +float b;
> +  } u = { .a = __A};
> +
> +  return (__m512h) _mm512_set1_ps (u.b);
> +}
> +
>  #ifdef __DISABLE_AVX512FP16__
>  #undef __DISABLE_AVX512FP16__
>  #pragma GCC pop_options
> diff --git a/gcc/config/i386/avx512fp16vlintrin.h 
> b/gcc/config/i386/avx512fp16vlintrin.h
> index f83a429ba43..1de4513d7f1 100644
> --- a/gcc/config/i386/avx512fp16vlintrin.h
> +++ b/gcc/config/i386/avx512fp16vlintrin.h
> @@ -3315,6 +3315,32 @@ _mm_permutexvar_ph (__m128i __A, __m128h __B)
>  (__mmask8)-1);
>  }
>
> +extern __inline __m256h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm256_set1_pch (_Float16 _Complex __A)
> +{
> +  union
> +  {
> +_Float16 _Complex a;
> +float b;
> +  } u = { .a = __A };
> +
> +  return (__m256h) _mm256_set1_ps (u.b);
> +}
> +
> +extern __inline __m128h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm_set1_pch (_Float16 _Complex __A)
> +{
> +  union
> +  {
> +_Float16 _Complex a;
> +float b;
> +  } u = { .a = __A };
> +
> +  return (__m128h) _mm_set1_ps (u.b);
> +}
> +
>  #ifdef __DISABLE_AVX512FP16VL__
>  #undef __DISABLE_AVX512FP16VL__
>  #pragma GCC pop_options
> diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c 
> b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c
> new file mode 100644
> index 000..0055193f243
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile} */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +#include 
> +
> +__m512h
> +__attribute__ ((noinline, noclone))
> +test_mm512_set1_pch (_Float16 _Complex A)
> +{
> +  return _mm512_set1_pch(A);
> +}
> +
> +/* { dg-final { scan-assembler "vbroadcastss\[ \\t\]+\[^\n\r\]*%zmm\[01\]" } 
> } */
> diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c 
> b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c
> new file mode 100644
> index 000..450d7e37237
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c
> @@ -0,0 +1,42 @@
> +/* { dg-do run { target avx512fp16 } } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +#include
> +#include 
> +#include 
> +
> +static void do_test (void);
> +
> +#define DO_TEST do_test
> +#define AVX512FP16
> +
> +#include 
> +#include "avx512-check.h"
> +
> +static void
> +do_test (void)
> +{
> + _Float16 _Complex fc = 1.0 + 1.0*I;
> +  union
> +  {
> +_Float16 _Complex a;
> +float b;
> +  } u = { .a = fc };
> +  float ff= u.b;
> +
> +  typedef union
> +  {
> +float fp[16];
> +__m512h m512h;
> +  } u1;
> +
> +  __m512h test512 = _mm512_set1_pch(fc);
> +
> +  u1 test;
> +  test.m512h = test512;
> +  for (int i = 0; i<16; i++)
> +  {
> +if (test.fp[i] != ff) abort();
> +  }
> +
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c 
> b/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c
> new file mode 100644
> index 

Re: [PATCH] Convert strlen pass from evrp to ranger.

2021-10-17 Thread Jeff Law via Gcc-patches




On 10/8/2021 9:12 AM, Aldy Hernandez via Gcc-patches wrote:

The following patch converts the strlen pass from evrp to ranger,
leaving DOM as the last remaining user.
So is there any reason why we can't convert DOM as well?   DOM's use of 
EVRP is pretty limited.  You've mentioned FP bits before, but my 
recollection is those are not part of the EVRP analysis DOM uses. Hell, 
give me a little guidance and I'll do the work...




No additional cleanups have been done.  For example, the strlen pass
still has uses of VR_ANTI_RANGE, and the sprintf still passes around
pairs of integers instead of using a proper range.  Fixing this
could further improve these passes.

As a further enhancement, if the relevant maintainers deem useful,
the domwalk could be removed from strlen.  That is, unless the pass
needs it for something else.
The dom walk was strictly for the benefit of EVRP when it was added.  So 
I think it can get zapped once the pass is converted.


Jeff


Re: [PATCH] Convert strlen pass from evrp to ranger.

2021-10-17 Thread Jeff Law via Gcc-patches




On 10/15/2021 4:39 AM, Aldy Hernandez wrote:



On 10/15/21 2:47 AM, Andrew MacLeod wrote:

On 10/14/21 6:07 PM, Martin Sebor via Gcc-patches wrote:

On 10/9/21 12:47 PM, Aldy Hernandez via Gcc-patches wrote:

We seem to be passing a lot of context around in the strlen code.  I
certainly don't want to contribute to more.

Most of the handle_* functions are passing the gsi as well as either
ptr_qry or rvals.  That looks a bit messy.  May I suggest putting all
of that in the strlen pass object (well, the dom walker object, but we
can rename it to be less dom centric)?

Something like the attached (untested) patch could be the basis for
further cleanups.

Jakub, would this line of work interest you?


You didn't ask me but since no one spoke up against it let me add
some encouragement: this is exactly what I was envisioning and in
line with other such modernization we have been doing elsewhere.
Could you please submit it for review?

Martin


I'm willing to bet he didn't submit it for review because he doesn't 
have time this release to polish and track it...  (I think the 
threader has been quite consuming).  Rather, it was offered as a 
starting point for someone else who might be interested in continuing 
to pursue this work...  *everyone* is interested in cleanup work 
others do :-)


Exactly.  There's a lot of work that could be done in this area, and 
I'm trying to avoid the situation with the threaders where what 
started as refactoring ended up with me basically owning them ;-).

I wouldn't go that far ;-)  I'm still here, just focused on other stuff.



That being said, I think there are enough cleanups that are useful on their 
own.  I've removed all the passing around of GSIs, as well as ptr_qry, 
with the exception of anything dealing with the sprintf pass, since it 
has a slightly different interface.
You know, it's funny.   The 0001 patch looks a lot like what I end up 
doing here and there when I start cleaning things up.  Pull state into 
a class, make functions which need the state member functions, repeat 
until it works.


This is patch 0001, which I'm formally submitting for inclusion. No 
functional changes with this patch.  OK for trunk?

I'll ACK this now :-)




Also, I am PINGing patch 0002, which is the strlen pass conversion to 
the ranger.  As mentioned, this is just a change from an evrp client 
to a ranger client.  The APIs are exactly the same, and besides, the 
evrp analyzer is deprecated and slated for removal. OK for trunk?
I'll defer on this a bit.  I've got to step away and may not be back 
online tonight.  I worry about the unintended testsuite fallout 
here more than anything.  Which argues it should go into the tester to 
see if there is any such fallout :-)



jeff



Re: [PATCH] d-demangle: properly skip anonymous symbols

2021-10-17 Thread Jeff Law via Gcc-patches




On 10/5/2021 11:53 AM, Luís Ferreira wrote:

On Tue, 2021-10-05 at 18:13 +0100, Luís Ferreira wrote:

This patch fixes a bug in the D demangler by parsing and skipping anonymous
symbols correctly, according to the ABI specification. Furthermore, it also
includes tests to cover anonymous symbols.

The spec specifies [1] that a symbol name can be anonymous and multiple
anonymous symbols are allowed.

[1]: https://dlang.org/spec/abi.html#SymbolName
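
To illustrate the idea (this is not the libiberty code), here is a toy decoder
for a sequence of length-prefixed names in which a lone "0" stands for an
anonymous symbol and is skipped.  The function name decode_qualified and the
simplified grammar are assumptions made for this sketch; the real demangler
handles far more than this.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode a sequence of length-prefixed names such as "3foo03bar" into
   "foo.bar", skipping anonymous ("0") components.  Returns 0 on success,
   -1 on malformed input or if OUT is too small.  */
int
decode_qualified (const char *mangled, char *out, size_t outsz)
{
  size_t used = 0;

  if (outsz == 0)
    return -1;

  while (*mangled != '\0')
    {
      /* An anonymous symbol has no name: consume it and move on.  */
      if (*mangled == '0')
        {
          mangled++;
          continue;
        }

      if (!isdigit ((unsigned char) *mangled))
        return -1;

      char *end;
      unsigned long len = strtoul (mangled, &end, 10);

      /* The name must be fully present in the input.  */
      if (strlen (end) < len)
        return -1;

      /* Room for an optional '.', the name and the terminating NUL.  */
      size_t need = len + (used > 0 ? 1 : 0);
      if (used + need + 1 > outsz)
        return -1;

      if (used > 0)
        out[used++] = '.';
      memcpy (out + used, end, len);
      used += len;
      mangled = end + len;
    }

  out[used] = '\0';
  return 0;
}
```

Any number of consecutive anonymous components is accepted, matching the
statement above that multiple anonymous symbols are allowed.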

ChangeLog:
libiberty/
 * d-demangle.c (dlang_parse_qualified): Handle anonymous
symbols
   correctly.

 * testsuite/d-demangle-expected: New tests to cover anonymous
symbols.

Thanks.  I fixed a whitespace nit and installed this patch.

Jeff


Re: [PING^3] Generalize 'gcc/input.h:struct location_hash' (was: [Committed] [PATCH 2/4] (v4) On-demand locations within string-literals)

2021-10-17 Thread Jeff Law via Gcc-patches




On 9/30/2021 12:47 AM, Thomas Schwinge wrote:

Hi!

On 2021-09-17T13:16:14+0200, I wrote:

On 2021-09-10T09:48:56+0200, I wrote:

Ping.  My patches again attached, for easy reference.

Ping once again.

Jeff had ACKed "Don't record string concatenation data for
'RESERVED_LOCATION_P'" (thanks!), but "Generalize 'gcc/input.h:struct
location_hash'" is still awaiting review:


On 2021-09-03T18:33:37+0200, I wrote:

On 2021-09-02T21:09:54+0200, I wrote:

On 2021-09-02T15:59:14+0200, I wrote:

On 2016-08-05T14:16:58-0400, David Malcolm  wrote:

Committed to trunk as r239175; I'm attaching the final version of the
patch for reference.

David, you've added here 'gcc/input.h:struct location_hash' (see quoted
below), which will be useful elsewhere, so:

--- a/gcc/input.h
+++ b/gcc/input.h
+struct location_hash : int_hash  { };
+
+class GTY(()) string_concat_db
+{
+[...]
+  hash_map  *m_table;
+};

OK to push the attached
"Generalize 'gcc/input.h:struct location_hash'"?

Attached again, for easy reference.


Grüße
  Thomas



0002-Generalize-gcc-input.h-struct-location_hash.patch

 From 349a3172f64db93ee98ea39b36489b702b6596ab Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 31 Aug 2021 23:30:25 +0200
Subject: [PATCH 2/2] Generalize 'gcc/input.h:struct location_hash'

This is currently only used here ('gcc/input.h:class string_concat_db'), but is
actually generally useful, so advertise it as such.

Per the rationale given, we may use 'BUILTINS_LOCATION' as spare value for
'Deleted', in addition to the existing use of 'UNKNOWN_LOCATION' as spare value
for 'Empty'.

gcc/
* input.h (location_hash): Use 'BUILTINS_LOCATION' as spare value
for 'Deleted'.  Turn into a '#define'.

OK
jeff



Re: [PATCH] Try placing RTL folded constants in constant pool

2021-10-17 Thread Jeff Law via Gcc-patches




On 10/3/2021 8:26 AM, Roger Sayle wrote:

My recent attempts to come up with a testcase for my patch to evaluate
ss_plus in simplify-rtx.c, identified a missed optimization opportunity
(that's potentially a long-time regression): The RTL optimizers no longer
place constants in the constant pool.

The motivating x86_64 example is the simple program:

typedef char v8qi __attribute__ ((vector_size (8)));

v8qi foo()
{
   v8qi tx = { 1, 0, 0, 0, 0, 0, 0, 0 };
   v8qi ty = { 2, 0, 0, 0, 0, 0, 0, 0 };
   v8qi t = __builtin_ia32_paddsb(tx, ty);
   return t;
}

which (with my previous patch) currently results in:
foo:    movq    .LC0(%rip), %xmm0
        movq    .LC1(%rip), %xmm1
        paddsb  %xmm1, %xmm0
        ret

even though the RTL contains the result in a REG_EQUAL note:

(insn 7 6 12 2 (set (reg:V8QI 83)
 (ss_plus:V8QI (reg:V8QI 84)
 (reg:V8QI 85))) "ssaddqi3.c":7:12 1419 {*mmx_ssaddv8qi3}
  (expr_list:REG_DEAD (reg:V8QI 85)
 (expr_list:REG_DEAD (reg:V8QI 84)
 (expr_list:REG_EQUAL (const_vector:V8QI [
 (const_int 3 [0x3])
 (const_int 0 [0]) repeated x7
 ])
 (nil)

Together with the patch below, GCC will now generate the much
more sensible:
foo:    movq    .LC2(%rip), %xmm0
        ret

My first approach was to look in cse.c (where the REG_EQUAL note gets
added) and notice that the constant pool handling functionality has been
unreachable for a while.  A quick search for constant_pool_entries_cost
shows that it's initialized to zero, but never set to a non-zero value,
meaning that force_const_mem is never called.  This functionality used
to work way back in 2003, but has been lost over time:
https://gcc.gnu.org/pipermail/gcc-patches/2003-October/116435.html

The changes to cse.c below restore this functionality (placing suitable
constants in the constant pool) with two significant refinements;
(i) it only attempts to do this if the function already uses a constant
pool (thanks to the availability of crtl->uses_constant_pool since 2003).
(ii) it allows different constants (i.e. modes) to have different costs,
so that floating point "doubles" and 64-bit, 128-bit, 256-bit and 512-bit
vectors don't all have to share the same cost.  Back in 2003, the
assumption was that everything in a constant pool had the same
cost, hence the global variable constant_pool_entries_cost.

Although this is a useful CSE fix, it turns out that it doesn't cure my
motivating problem above.  CSE only considers a single instruction,
so determines that it's cheaper to perform the ss_plus (COSTS_N_INSNS(1))
than read the result from the constant pool (COSTS_N_INSNS(2)).  It's
only when the other reads from the constant pool are also eliminated,
that this transformation is a win.  Hence a better place to perform
this transformation is in combine, where after failing to "recog" the
load of a suitable constant, it can retry after calling force_const_mem.
This achieves the desired transformation and allows the backend insn_cost
call-back to control whether or not using the constant pool is preferable.
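
The cost argument above can be worked through with a small sketch.  The two
cost values are the ones quoted in the text; the function names are
hypothetical, and the scale factor in the macro is the one GCC's rtl.h uses as
far as I know, though only the ratios matter here.

```c
/* Scales an instruction count into cost units; the exact factor is not
   important for the comparison below.  */
#define COSTS_N_INSNS(N) ((N) * 4)

enum
{
  COST_SS_PLUS = COSTS_N_INSNS (1),   /* cost quoted for the ss_plus */
  COST_POOL_LOAD = COSTS_N_INSNS (2)  /* cost quoted for a constant-pool load */
};

/* Single-instruction view: is loading the folded result cheaper than
   the arithmetic instruction it would replace?  */
int
single_insn_prefers_pool (void)
{
  return COST_POOL_LOAD < COST_SS_PLUS;
}

/* Whole-sequence view, before: two operand loads plus the ss_plus.  */
int
sequence_cost_before (void)
{
  return 2 * COST_POOL_LOAD + COST_SS_PLUS;
}

/* Whole-sequence view, after: a single load of the folded constant.  */
int
sequence_cost_after (void)
{
  return COST_POOL_LOAD;
}
```

Looking at one instruction, the load loses to the ss_plus; looking at the
whole sequence, one load beats two loads plus the addition, which is why the
transformation pays off only once the operand loads disappear as well.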

Alas, it's rare to change code generation without affecting something in
GCC's testsuite.  On x86_64-pc-linux-gnu there were two families of new
failures (and I'd predict similar benign fallout on other platforms).
One failure was gcc.target/i386/387-12.c (aka PR target/26915), where
the test is missing an explicit -m32 flag.  On i686, it's very reasonable
to materialize -1.0 using "fld1; fchs", but on x86_64-pc-linux-gnu we
currently generate the awkward:
testm1: fld1
 fchs
 fstpl   -8(%rsp)
 movsd   -8(%rsp), %xmm0
 ret

which combine now very reasonably simplifies to just:
testm1: movsd   .LC3(%rip), %xmm0
 ret

The other class of x86_64-pc-linux-gnu failure was from materialization
of vector constants using vpbroadcast (e.g. gcc.target/i386/pr90773-17.c)
where the decision is finely balanced; the load of an integer register
with an immediate constant, followed by a vpbroadcast is deemed to be
COSTS_N_INSNS(2), whereas a load from the constant pool is also reported
as COSTS_N_INSNS(2).  My solution is to tweak the i386.c's rtx_costs
so that all other things being equal, an instruction (sequence) that
accesses memory is fractionally more expensive than one that doesn't.


Hopefully, this all makes sense.  If someone could benchmark this for
me that would be much appreciated.  This patch has been tested on
x86_64-pc-linux-gnu with "make bootstrap" and "make -k check" with no
new failures.  Ok for mainline?


2021-10-03  Roger Sayle  

gcc/ChangeLog
* combine.c (recog_for_combine): For an unrecognized move/set of
a constant, try force_const_mem to place it in the constant pool.
* cse.c (constant_pool_entries_cost, constant_pool_entries_regcost):
Delete global variables (that are no longer assigned a cost value).

Re: [PATCH] libiberty: d-demangle: use appendc for single chars append

2021-10-17 Thread Jeff Law via Gcc-patches




On 9/29/2021 9:32 AM, Luís Ferreira wrote:

> This may be optimized by some modern smart compilers inliner, but since
> strlen can be an external source, this can produce unoptimized code.

strlen has very well defined semantics by ISO and even if it's defined 
externally compilers know those semantics and can optimize 
appropriately.  In fact, if you build a testcase, compile it with a 
modern compiler, you should see the call to strlen optimized away & the 
call to memcpy turned into a simple store.


So I just don't see the value in adding more code here when we can just 
let the optimizer do its job and get the same result.


I won't object if Iain wants to go forward with this patch, but I'm not 
going forward with it independently.


jeff


Re: [PATCH 0/8] __builtin_dynamic_object_size and more

2021-10-17 Thread Jeff Law via Gcc-patches




On 10/7/2021 10:50 PM, Siddhesh Poyarekar wrote:

> On 10/8/21 03:44, Siddhesh Poyarekar wrote:
>
> > (from about 4% to 70% in bash), but that could well be due to the _chk
>
> I should also clarify that this is for memcpy.  For all fortifiable
> functions, the coverage percentage went from 30.81% to 84.5% for bash.
> Below is the full table.  Please note that this is only based on
> symbols emitted in the end as I didn't want to rebuild the
> _FORTIFY_SOURCE=2 binaries, so it does not take into account the
> fact that _chk could get folded to regular calls if we know at compile
> time that it's safe to do so.
>
> No more posting patches at 4am; it only leads to more clarification
> follow-ups :/

FWIW, that 30% number is roughly in line with the data we saw from a Red 
Hat partner a year or so ago.  Bringing that up to 80%+ would be a 
notable win, even if folks have to explicitly opt-in, as I expect some 
projects would without hesitation.


I'd really like it if Jakub could take the lead on this.  While I'm a 
big proponent of the work, Jakub knows the relevant code far better than 
I and it'll affect the Red Hat team far more than it'll affect me these 
days :-)



Jeff



Re: [PATCH] bfin: Popcount-related improvements to machine description.

2021-10-17 Thread Jeff Law via Gcc-patches




On 10/17/2021 7:08 AM, Roger Sayle wrote:

Blackfin processors support a ONES instruction that implements a
32-bit popcount returning a 16-bit result.  This instruction was
previously described by GCC's bfin backend using an UNSPEC, but
this patch uses a POPCOUNT:SI rtx to capture the semantics, allowing
it to be evaluated at compile-time.  I've decided to keep the instruction
name the same (avoiding any changes to the __builtin_bfin_ones
machinery), but have provided popcountsi2 and popcounthi2 expanders
so that the middle-end can use this instruction to implement
__builtin_popcount (and __builtin_parity).

The new testcase ones.c
short foo ()
{
   int t = 5;
   short r = __builtin_bfin_ones(t);
   return r;
}

previously generated:
_foo:   nop;
 nop;
 R0 = 5 (X);
 R0.L = ONES R0;
 rts;

with this patch, now generates:
_foo:   nop;
 nop;
 nop;
 R0 = 2 (X);
 rts;

The new testcase popcount.c
int foo(int x)
{
   return __builtin_popcount(x);
}

previously generated:
_foo:   [--SP] = RETS;
 SP += -12;
 call ___popcountsi2;
 SP += 12;
 RETS = [SP++];
 rts;

now generates:
_foo:   nop;
 nop;
 R0.L = ONES R0;
 R0 = R0.L (Z);
 rts;

And the new testcase parity.c
int foo(int x)
{
   return __builtin_parity(x);
}

previously generated:
_foo:   [--SP] = RETS;
 SP += -12;
 call ___paritysi2;
 SP += 12;
 RETS = [SP++];
 rts;

now generates:
_foo:   nop;
 R1 = 1 (X);
 R0.L = ONES R0;
 R0 = R1 & R0;
 rts;


This patch has been tested on a cross-compiler to bfin-elf hosted
on x86_64-pc-linux-gnu, but without a toolchain, and shows no
regressions in the compile-only parts of the testsuite.
Ok for mainline?


2021-10-17  Roger Sayle  

gcc/ChangeLog
* config/bfin/bfin.md (define_constants): Remove UNSPEC_ONES.
(define_insn "ones"): Replace UNSPEC_ONES with a truncate of
a popcount, allowing compile-time evaluation/simplification.
(popcountsi2, popcounthi2): New expanders using a "ones" insn.

gcc/testsuite/ChangeLog
* gcc.target/bfin/ones.c: New test case.
* gcc.target/bfin/parity.c: New test case.
* gcc.target/bfin/popcount.c: New test case.

OK
jeff



Re: [PATCH] Constant fold SS_NEG and SS_ABS in simplify-rtx.c

2021-10-17 Thread Jeff Law via Gcc-patches




On 10/17/2021 3:12 AM, Roger Sayle wrote:

This simple patch performs compile-time constant folding of
signed saturating negation and signed saturating absolute value
in the RTL optimizers.  Normally in two's complement arithmetic
the lowest representable signed value overflows on negation, but
with these saturating operators it "saturates" to the maximum
representable signed value, so SS_NEG:QI -128 is 127, and
SS_ABS:HI -32768 is 32767.

On bfin-elf, the following two short functions:

short foo()
{
   short t = -32768;
   short r = __builtin_bfin_negate_fr1x16(t);
   return r;
}

int bar()
{
   int t = -2147483648;
   int r = __builtin_bfin_abs_fr1x32(t);
   return r;
}

currently compile to:
_foo:   nop;
 nop;
 R0 = -32768 (X);
 R0 = -R0 (V);
 rts;

_bar:   nop;
 R0 = -1 (X);
 R0 <<= 31;
 R0 = abs R0;
 rts;

but with this middle-end patch now compile to:

_foo:   nop;
 nop;
 nop;
 R0 = 32767 (X);
 rts;

_bar:   nop;
 nop;
 R0 = -1 (X);
 R0.H = 32767;
 rts;


This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures.  Ok for mainline?


2021-10-17  Roger Sayle  

gcc/ChangeLog
* simplify-rtx.c (simplify_const_unary_operation) [SS_NEG, SS_ABS]:
Evaluate SS_NEG and SS_ABS of a constant argument.

gcc/testsuite/ChangeLog
* gcc.target/bfin/ssabs.c: New test case.
* gcc.target/bfin/ssneg.c: New test case.

OK.

Jeff




Re: [Patch] Fortran: Fix CLASS conversion check [PR102745]

2021-10-17 Thread Paul Richard Thomas via Gcc-patches
Hi Tobias,

This is OK for mainline and as far back in the branches as you feel
inclined to go.

Thanks for the patch.

Paul


On Fri, 15 Oct 2021 at 22:19, Tobias Burnus  wrote:

> This patch fixes two issues:
>
> First, to print 'CLASS(t2)' instead of:
> Error: Type mismatch in argument ‘x’ at (1); passed
> CLASS(__class_MAIN___T2_a) to TYPE(t)
>
> Additionally,
>
>class(t2) = class(t)  ! 't2' extends 't'
>class(t2) = class(any)
>
> was wrongly accepted.
>
> OK?
>
> Tobias
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201,
> 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer:
> Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München;
> Registergericht München, HRB 106955
>


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein


Re: [Patch] [v3] Fortran: Fix Bind(C) Array-Descriptor Conversion (Move to Front-End Code)

2021-10-17 Thread Paul Richard Thomas via Gcc-patches
Hi Tobias,

I can only echo Harald's comment that this is an impressive bit of work.

I spent some time messing with fc-descriptor-7.f90/gc-descriptor-7-c.cc
because it kept failing on me. This came about because I missed one of the
chunks not applying in the C component of the test; namely:
  for (int j = 0; j < 5; ++j)
    for (int i = 0; i < 10; ++i)
      {
        subscripts[0] = j; subscripts[1] = i;
        if (*(int *) CFI_address (a, subscripts) != (i+1) + 100*(j+1))
          abort ();
      }

This set me to wondering whether passing the result of the transpose
intrinsic in this way should generate a warning telling the user that the
CFI API must be used in this case, rather than depending on the data
being transposed?

Apart from this I have no other comments, still less corrections :-)

Many thanks for the patch - OK for mainline.

Paul


On Wed, 13 Oct 2021 at 21:11, Harald Anlauf  wrote:

> Hi Tobias,
>
> Am 13.10.21 um 18:01 schrieb Tobias Burnus:
> > Dear all,
> >
> > a minor update [→ v3].
>
> this has become an impressive work.
>
> > I searched for XFAIL in Sandra's c-interop/ and found
> > two remaining true** xfails, now fixed:
> >
> > - gfortran.dg/c-interop/typecodes-scalar-basic.f90
> >The conversion of scalars of type(c_ptr) was mishandled;
> >fixed now; the fix did run into issues converting a string_cst,
> >which has also been fixed.
> >
> > - gfortran.dg/c-interop/fc-descriptor-7.f90
> >this one uses TRANSPOSE which did not work [now mostly* does]
> >→ PR fortran/101309 now also fixed.
> >
> > I forgot what the exact issue for the latter was. However, when
> > looking at the testcase and extending it, I did run into the
> > following issue - and at the end the testcase does now pass.
> > The issue I had was that when a contiguous check was requested
> > (i.e. only copy in when needed) it failed to work, when
> > parmse->expr was (a pointer to) a descriptor. I fixed that and
> > now most* things work.
> >
> > OK for mainline? Comments? Suggestions? More PRs which this
> > patch fixes? Regressions? Test results?
>
> Doesn't break my own codes so far.
>
> If nobody else responds within the next days, assume an OK
> from my side.
>
> This will also provide Gerhard with a new playground.  ;-)
>
> Thanks for the patch!
>
> Harald
>
> > Tobias
> >
> > PS: I intend to commit this patch to the OG11 (devel/omp/gcc-11)
> > branch, in case someone wants to test it there.
> >
> > PPS: Nice to have an extensive testcase suite - kudos to Sandra
> > in this case. I am sure Gerald will find more issues and once
> > it is in, I think I/we have to check some PRs + José's patches
> > for additional testcases + follow-up fixes.
> >
> > (*) I write most as passing a (potentially) noncontiguous
> > assumed-rank array to a CONTIGUOUS assumed-rank array causes
> > an ICE as the scalarizer does not handle dynamic ranks alias
> > expr->rank == -1 / ss->dimen = -1.
> > I decided that that's a separate issue and filed:
> > https://gcc.gnu.org/PR102729
> > BTW, my impression is that fixing that PR might also solve
> > the trans*.c part of https://gcc.gnu.org/PR102641 - but I have
> > not investigated.
> >
> > (**) There are still some 'xfail' in comments (outside dg-*)
> > whose tests now pass. And those where for two bugs in the same
> > statement, only one is reported - and the other only after fixing
> > the first one, which is fine.
> >
> > On 09.10.21 23:48, Tobias Burnus wrote:
> >> Hi all,
> >>
> >> attached is the updated version. Changes:
> >> * Handle noncontiguous arrays – with BIND(C), (g)Fortran needs to make it
> >>   contiguous in the caller but also handle noncontiguous in the callee.
> >> * Fixes/handle 'character(len=*)' with BIND(C); those always use an
> >>   array descriptor - also with explicit-size and assumed-size arrays
> >> * Fixed a bunch of bugs, found when writing extensive testcases.
> >> * Fixed type(*) handling - those now pass properly type and elem_len
> >>   on when calling a new function (bind(C) or not).
> >>
> >> Besides adding the type itself (which is rather straightforward),
> >> this patch only had minor modifications – and then the two big
> >> conversion functions.
> >>
> >> While it looks intimidating, it should be comparatively simple to
> >> review as everything is in one place and hopefully sufficiently
> >> well documented.
> >>
> >> OK – for mainline?  Other comments? More PRs which are fixed?
> >> Issues not yet fixed (which are inside the scope of this patch)?
> >>
> >> (If this patch is too long, I also have a nine-day old pending patch
> >> at https://gcc.gnu.org/pipermail/gcc-patches/2021-October/580624.html )
> >>
> >> Tobias
> >>
> >> PS: The following still applies.
> >>
> >> On 06.09.21 12:52, Tobias Burnus wrote:
> >>> gfortran's internal array descriptor (xgfc descriptor) and
> >>> the descriptor used with BIND(C) (CFI descriptor, ISO_Fortran_binding.h
> >>> of TS29113 / Fortran 2018) are 

[PATCH v4] Fix ICE when mixing VLAs and statement expressions [PR91038]

2021-10-17 Thread Uecker, Martin


Here is the 4th version of the patch. I tried to implement
Jason's suggestion and this also fixes the problem. But
I am not sure I understand the condition on
the TREE_SIDE_EFFECTS ... 

And there is now another problem: 

c_finish_omp_for in c-family/c-omp.c does not seem
to understand the expressions anymore and I get a test
failure in 

testsuite/c-c++-common/gomp/for-5.c

where I now get an "invalid increment expression"
instead of the expected error.

(bootstrapping and all other tests work fine)


Martin




Fix ICE when mixing VLAs and statement expressions [PR91038]

When returning VM-types from statement expressions, this can
lead to an ICE when declarations from the statement expression
are referred to later. Most of these issues can be addressed by
gimplifying the base expression earlier in gimplify_compound_lval.
Another issue is fixed by adding SAVE_EXPRs in pointer_int_sum
in the FE to force a correct order of evaluation. This fixes
PR91038 and some of the test cases from PR29970 (structs with
VLA members need further work).


2021-08-01  Martin Uecker  

gcc/
PR c/91038
PR c/29970
* gimplify.c (gimplify_var_or_parm_decl): Update comment.
(gimplify_compound_lval): Gimplify base expression first.
(gimplify_target_expr): Add comment.
* c-family/c-common.c (pointer_int_sum): Wrap pointer
operand in SAVE_EXPR and also add it to the integer
argument.

gcc/testsuite/
PR c/91038
PR c/29970
* gcc.dg/vla-stexp-3.c: New test.
* gcc.dg/vla-stexp-4.c: New test.
* gcc.dg/vla-stexp-5.c: New test.
* gcc.dg/vla-stexp-6.c: New test.
* gcc.dg/vla-stexp-7.c: New test.
* gcc.dg/vla-stexp-8.c: New test.
* gcc.dg/vla-stexp-9.c: New test.


diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 9d19e352725..522085664f5 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -3348,6 +3348,16 @@ pointer_int_sum (location_t loc, enum tree_code resultcode,
 intop = convert (c_common_type_for_size (TYPE_PRECISION (sizetype),
 TYPE_UNSIGNED (sizetype)), intop);
 
+  /* Wrap the pointer expression in a SAVE_EXPR to make sure it
+   * is evaluated first because the size expression may depend on it
+   * for VM types.
+   */
+  if (TREE_SIDE_EFFECTS (size_exp))
+{
+ptrop = build1_loc (loc, SAVE_EXPR, TREE_TYPE (ptrop), ptrop);
+intop = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, intop);
+}
+
   /* Replace the integer argument with a suitable product by the object size.
  Do this multiplication as signed, then convert to the appropriate type
  for the pointer operation and disregard an overflow that occurred only
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index d8e4b139349..be5b00b6716 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -2958,7 +2958,10 @@ gimplify_var_or_parm_decl (tree *expr_p)
  declaration, for which we've already issued an error.  It would
  be really nice if the front end wouldn't leak these at all.
  Currently the only known culprit is C++ destructors, as seen
- in g++.old-deja/g++.jason/binding.C.  */
+ in g++.old-deja/g++.jason/binding.C.
+ Another possible culprit are size expressions for variably modified
+ types which are lost in the FE or not gimplified correctly.
+  */
   if (VAR_P (decl)
   && !DECL_SEEN_IN_BIND_EXPR_P (decl)
   && !TREE_STATIC (decl) && !DECL_EXTERNAL (decl)
@@ -3103,16 +3106,22 @@ gimplify_compound_lval (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
  expression until we deal with any variable bounds, sizes, or
  positions in order to deal with PLACEHOLDER_EXPRs.
 
- So we do this in three steps.  First we deal with the annotations
- for any variables in the components, then we gimplify the base,
- then we gimplify any indices, from left to right.  */
+ The base expression may contain a statement expression that
+ has declarations used in size expressions, so has to be
+ gimplified before gimplifying the size expressions.
+
+ So we do this in three steps.  First we deal with variable
+ bounds, sizes, and positions, then we gimplify the base,
+ then we deal with the annotations for any variables in the
+ components and any indices, from left to right.  */
+
   for (i = expr_stack.length () - 1; i >= 0; i--)
 {
   tree t = expr_stack[i];
 
   if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)
{
- /* Gimplify the low bound and element type size and put them into
+ /* Deal with the low bound and element type size and put them into
 the ARRAY_REF.  If these values are set, they have already been
 gimplified.  */
  if (TREE_OPERAND (t, 2) == NULL_TREE)
@@ -3121,18 +3130,8 @@ gimplify_compound_lval (tree 

Re: [PATCH v2 0/4] libffi: Sync with upstream

2021-10-17 Thread H.J. Lu via Gcc-patches
On Sat, Oct 16, 2021 at 1:07 PM David Edelsohn  wrote:
>
> On Sat, Oct 16, 2021 at 3:59 PM H.J. Lu  wrote:
> >
> > On Sat, Oct 16, 2021 at 12:53 PM David Edelsohn  wrote:
> > >
> > > On Sat, Oct 16, 2021 at 1:13 PM H.J. Lu  wrote:
> > > >
> > > > On Sat, Oct 16, 2021 at 10:04 AM David Edelsohn  
> > > > wrote:
> > > > >
> > > > > On Sat, Oct 16, 2021 at 7:48 AM H.J. Lu  wrote:
> > > > > >
> > > > > > On Fri, Oct 15, 2021 at 5:22 PM David Edelsohn  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Fri, Oct 15, 2021 at 8:06 PM H.J. Lu  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Wed, Oct 13, 2021 at 6:42 AM H.J. Lu  
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Oct 13, 2021 at 6:03 AM Richard Biener
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, Oct 13, 2021 at 2:56 PM H.J. Lu 
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Oct 13, 2021 at 5:45 AM Richard Biener
> > > > > > > > > > >  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Sep 2, 2021 at 5:50 PM H.J. Lu 
> > > > > > > > > > > >  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Change in the v2 patch:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. Disable static trampolines by default.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > GCC maintained a copy of libffi snapshot from 2009 
> > > > > > > > > > > > > and cherry-picked fixes
> > > > > > > > > > > > > from upstream over the last 10+ years.  In the 
> > > > > > > > > > > > > meantime, libffi upstream
> > > > > > > > > > > > > has been changed significantly with new features, bug 
> > > > > > > > > > > > > fixes and new target
> > > > > > > > > > > > > support.  Here is a set of patches to sync with 
> > > > > > > > > > > > > libffi 3.4.2 release and
> > > > > > > > > > > > > make it easier to sync with libffi upstream:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. Document how to sync with upstream.
> > > > > > > > > > > > > 2. Add scripts to help sync with upstream.
> > > > > > > > > > > > > 3. Sync with libffi 3.4.2. This patch is quite big.  
> > > > > > > > > > > > > It is available at
> > > > > > > > > > > > >
> > > > > > > > > > > > > https://gitlab.com/x86-gcc/gcc/-/commit/15e80c879c571f79a0e57702848a9df5fba5be2f
> > > > > > > > > > > > > 4. Integrate libffi build and testsuite with GCC.
> > > > > > > > > > > >
> > > > > > > > > > > > How did you test this?  It looks like libgo is the only 
> > > > > > > > > > > > consumer of
> > > > > > > > > > > > libffi these days.
> > > > > > > > > > > > In particular go/libgo seems to be supported on almost 
> > > > > > > > > > > > all targets besides
> > > > > > > > > > > > darwin/windows - did you test cross and canadian 
> > > > > > > > > > > > configurations?
> > > > > > > > > > >
> > > > > > > > > > > I only tested it on Linux/i686 and Linux/x86-64.   My 
> > > > > > > > > > > understanding is that
> > > > > > > > > > > the upstream libffi works on Darwin and Windows.
> > > > > > > > > > >
> > > > > > > > > > > > I applaud the attempt to sync to upstream but I fear you 
> > > > > > > > > > > > won't get any "review"
> > > > > > > > > > > > of this massive diff.
> > > > > > > > > > >
> > > > > > > > > > > I believe that it should just work.  Our libffi is very 
> > > > > > > > > > > much out of date.
> > > > > > > > > >
> > > > > > > > > > Yes, you can hope.  And yes, our libffi is out of date.
> > > > > > > > > >
> > > > > > > > > > Can you please do the extra step to test one weird 
> > > > > > > > > > architecture, namely
> > > > > > > > > > powerpc64-aix which is available on the compile-farm?
> > > > > > > > >
> > > > > > > > > I will give it a try and report back.
> > > > > > > > >
> > > > > > > > > > If that goes well I think it's good to "hope" at this point 
> > > > > > > > > > (and plenty of
> > > > > > > > > > time to fix fallout until the GCC 12 release).
> > > > > > > > > >
> > > > > > > > > > Thus OK after the extra testing dance and waiting until 
> > > > > > > > > > early next
> > > > > > > > > > week so others can throw in a veto.
> > > > > > > >
> > > > > > > > I tried to bootstrap GCC master branch on  gcc119.fsffrance.org:
> > > > > > > >
> > > > > > > > *  MT/MODEL: 8284-22A
> > > > > > > > * Partition: gcc119
> > > > > > > > *    System: power8-aix.osuosl.org
> > > > > > > > *       O/S: AIX V7.2 7200-04-03-2038
> > > > > > > >
> > > > > > > > I configured GCC with
> > > > > > > >
> > > > > > > > --with-as=/usr/bin/as --with-ld=/usr/bin/ld
> > > > > > > > --enable-version-specific-runtime-libs --disable-nls
> > > > > > > > --enable-decimal-float=dpd --disable-libstdcxx-pch 
> > > > > > > > --disable-werror
> > > > > > > > --enable-__cxa_atexit 

[PATCH] bfin: Popcount-related improvements to machine description.

2021-10-17 Thread Roger Sayle

Blackfin processors support a ONES instruction that implements a
32-bit popcount returning a 16-bit result.  This instruction was
previously described by GCC's bfin backend using an UNSPEC, but
this patch uses a POPCOUNT:SI rtx to capture the semantics, allowing
it to be evaluated at compile-time.  I've decided to keep the instruction
name the same (avoiding any changes to the __builtin_bfin_ones
machinery), but have provided popcountsi2 and popcounthi2 expanders
so that the middle-end can use this instruction to implement
__builtin_popcount (and __builtin_parity).

The new testcase ones.c
short foo ()
{
  int t = 5;
  short r = __builtin_bfin_ones(t);
  return r;
}

previously generated:
_foo:   nop;
nop;
R0 = 5 (X);
R0.L = ONES R0;
rts;

with this patch, now generates:
_foo:   nop;
nop;
nop;
R0 = 2 (X);
rts;

The new testcase popcount.c
int foo(int x)
{
  return __builtin_popcount(x);
}

previously generated:
_foo:   [--SP] = RETS;
SP += -12;
call ___popcountsi2;
SP += 12;
RETS = [SP++];
rts;

now generates:
_foo:   nop;
nop;
R0.L = ONES R0;
R0 = R0.L (Z);
rts;

And the new testcase parity.c
int foo(int x)
{
  return __builtin_parity(x);
}

previously generated:
_foo:   [--SP] = RETS;
SP += -12;
call ___paritysi2;
SP += 12;
RETS = [SP++];
rts;

now generates:
_foo:   nop;
R1 = 1 (X);
R0.L = ONES R0;
R0 = R1 & R0;
rts;


This patch has been tested on a cross-compiler to bfin-elf hosted
on x86_64-pc-linux-gnu, but without a toolchain, and shows no
regressions in the compile-only parts of the testsuite.
Ok for mainline?


2021-10-17  Roger Sayle  

gcc/ChangeLog
* config/bfin/bfin.md (define_constants): Remove UNSPEC_ONES.
(define_insn "ones"): Replace UNSPEC_ONES with a truncate of
a popcount, allowing compile-time evaluation/simplification.
(popcountsi2, popcounthi2): New expanders using a "ones" insn.

gcc/testsuite/ChangeLog
* gcc.target/bfin/ones.c: New test case.
* gcc.target/bfin/parity.c: New test case.
* gcc.target/bfin/popcount.c: New test case.

Thanks in advance,
Roger
--

diff --git a/gcc/config/bfin/bfin.md b/gcc/config/bfin/bfin.md
index 1ec0bbb..8b311f3 100644
--- a/gcc/config/bfin/bfin.md
+++ b/gcc/config/bfin/bfin.md
@@ -138,8 +138,7 @@
;; Distinguish a 32-bit version of an insn from a 16-bit version.
(UNSPEC_32BIT 11)
(UNSPEC_NOP 12)
-   (UNSPEC_ONES 13)
-   (UNSPEC_ATOMIC 14)])
+   (UNSPEC_ATOMIC 13)])
 
 (define_constants
   [(UNSPEC_VOLATILE_CSYNC 1)
@@ -1398,12 +1397,32 @@
 
 (define_insn "ones"
   [(set (match_operand:HI 0 "register_operand" "=d")
-   (unspec:HI [(match_operand:SI 1 "register_operand" "d")]
-   UNSPEC_ONES))]
+   (truncate:HI
+(popcount:SI (match_operand:SI 1 "register_operand" "d"))))]
   ""
   "%h0 = ONES %1;"
   [(set_attr "type" "alu0")])
 
+(define_expand "popcountsi2"
+  [(set (match_dup 2)
+   (truncate:HI (popcount:SI (match_operand:SI 1 "register_operand" ""))))
+   (set (match_operand:SI 0 "register_operand")
+   (zero_extend:SI (match_dup 2)))]
+  ""
+{
+  operands[2] = gen_reg_rtx (HImode);
+})
+
+(define_expand "popcounthi2"
+  [(set (match_dup 2)
+   (zero_extend:SI (match_operand:HI 1 "register_operand" "")))
+   (set (match_operand:HI 0 "register_operand") 
+   (truncate:HI (popcount:SI (match_dup 2))))]
+  ""
+{
+  operands[2] = gen_reg_rtx (SImode);
+})
+
 (define_insn "smaxsi3"
   [(set (match_operand:SI 0 "register_operand" "=d")
(smax:SI (match_operand:SI 1 "register_operand" "d")
/* { dg-do compile } */
/* { dg-options "-O2" } */

short foo ()
{
  int t = 5;
  short r = __builtin_bfin_ones(t);
  return r;
}

/* { dg-final { scan-assembler-not "ONES" } } */
/* { dg-do compile } */
/* { dg-options "-O2" } */

int foo(int x)
{
  return __builtin_parity(x);
}

/* { dg-final { scan-assembler "ONES" } } */
/* { dg-do compile } */
/* { dg-options "-O2" } */

int foo(int x)
{
  return __builtin_popcount(x);
}

/* { dg-final { scan-assembler "ONES" } } */


[PATCH] Constant fold SS_NEG and SS_ABS in simplify-rtx.c

2021-10-17 Thread Roger Sayle

This simple patch performs compile-time constant folding of
signed saturating negation and signed saturating absolute value
in the RTL optimizers.  Normally in two's complement arithmetic
the lowest representable signed value overflows on negation, but
with these saturating operators it "saturates" to the maximum
representable signed value, so SS_NEG:QI -128 is 127, and
SS_ABS:HI -32768 is 32767.

On bfin-elf, the following two short functions:

short foo()
{
  short t = -32768;
  short r = __builtin_bfin_negate_fr1x16(t);
  return r;
}

int bar()
{
  int t = -2147483648;
  int r = __builtin_bfin_abs_fr1x32(t);
  return r;
}

currently compile to:
_foo:   nop;
nop;
R0 = -32768 (X);
R0 = -R0 (V);
rts;

_bar:   nop;
R0 = -1 (X);
R0 <<= 31;
R0 = abs R0;
rts;

but with this middle-end patch now compile to:

_foo:   nop;
nop;
nop;
R0 = 32767 (X);
rts;

_bar:   nop;
nop;
R0 = -1 (X);
R0.H = 32767;
rts;


This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures.  Ok for mainline?


2021-10-17  Roger Sayle  

gcc/ChangeLog
* simplify-rtx.c (simplify_const_unary_operation) [SS_NEG, SS_ABS]:
Evaluate SS_NEG and SS_ABS of a constant argument.

gcc/testsuite/ChangeLog
* gcc.target/bfin/ssabs.c: New test case.
* gcc.target/bfin/ssneg.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index e4fae0b..2bb18fb 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -2026,6 +2026,20 @@ simplify_const_unary_operation (enum rtx_code code, machine_mode mode,
  result = wide_int::from (op0, width, SIGNED);
  break;
 
+   case SS_NEG:
+ if (wi::only_sign_bit_p (op0))
+   result = wi::max_value (GET_MODE_PRECISION (imode), SIGNED);
+ else
+   result = wi::neg (op0);
+ break;
+
+   case SS_ABS:
+ if (wi::only_sign_bit_p (op0))
+   result = wi::max_value (GET_MODE_PRECISION (imode), SIGNED);
+ else
+   result = wi::abs (op0);
+ break;
+
case SQRT:
default:
  return 0;
/* { dg-do compile } */
/* { dg-options "-O2" } */

int foo()
{
  int t = -2147483648;
  int r = __builtin_bfin_abs_fr1x32(t);
  return r;
}

/* { dg-final { scan-assembler "32767" } } */
/* { dg-do compile } */
/* { dg-options "-O2" } */

short foo()
{
  short t = -32768;
  short r = __builtin_bfin_negate_fr1x16(t);
  return r;
}

/* { dg-final { scan-assembler "32767" } } */


[committed] wwwdocs: nongnu.org wants to be known as www.nongnu.org

2021-10-17 Thread Gerald Pfeifer
---
 htdocs/git.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/git.html b/htdocs/git.html
index ac1f2eb9..881f1d38 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -334,7 +334,7 @@ in Git.
 
   modula-2
   This branch is for the
-http://nongnu.org/gm2/homepage.html;>GNU Modula-2
+http://www.nongnu.org/gm2/homepage.html;>GNU Modula-2
 front end to GCC prior to its integration with the mainline.  The
 branch will be regularly rebased against the mainline.  It is
 maintained by
-- 
2.33.0


[committed] wwwdocs: Remove link to DWARF standard

2021-10-17 Thread Gerald Pfeifer
We've got a number of links to the DWARF standard on our page, which
requires some link maintenance. Remove this one for GCC 7 which is
unlikely to be used (much).
---
 htdocs/gcc-7/changes.html | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/htdocs/gcc-7/changes.html b/htdocs/gcc-7/changes.html
index a040a80a..5bcb59b6 100644
--- a/htdocs/gcc-7/changes.html
+++ b/htdocs/gcc-7/changes.html
@@ -155,8 +155,7 @@ main (int argc, char **argv)
   UndefinedBehavior Sanitizer now diagnoses arithmetic overflows even on
   arithmetic operations with generic vectors.
 
-  Version 5 of the http://www.dwarfstd.org/Download.php;>DWARF debugging
+  Version 5 of the DWARF debugging
   information standard is supported through the -gdwarf-5
   option.  The DWARF version 4 debugging information remains the
   default until consumers of debugging information are adjusted.
-- 
2.33.0