Re: [r12-1045 Regression] FAIL: libgomp.c++/task-reduction-8.C execution test on Linux/x86_64

2021-05-25 Thread Aldy Hernandez via Gcc-patches




On 5/26/21 2:21 AM, sunil.k.pandey wrote:

On Linux/x86_64,

41ddc5b0a6b44a9df53a259636fa3b534ae41cbe is the first bad commit
commit 41ddc5b0a6b44a9df53a259636fa3b534ae41cbe
Author: Aldy Hernandez 
Date:   Tue May 25 08:36:44 2021 +0200

 Fix selftest for targets where short and int are the same size.

caused

FAIL: libgomp.c++/task-reduction-8.C execution test


This commit was a selftest fix.  I don't see how it can make a libgomp 
test fail.  If the selftest didn't succeed on your target, the entire 
build would've stopped before running tests.


Aldy



Re: [PATCH][i386] Split not+broadcast+pand to broadcast+pandn.

2021-05-25 Thread Hongtao Liu via Gcc-patches
On Wed, May 26, 2021 at 12:12 PM Andrew Pinski  wrote:
>
> On Tue, May 25, 2021 at 6:17 PM Hongtao Liu  wrote:
> >
> > Update patch:
> >   The new patch simplifies (vec_duplicate (not (nonimmediate_operand)))
> > to (not (vec_duplicate (nonimmediate_operand))). This is not a
> > straightforward simplification; it just adds a tendency to pull not
> > out of vec_duplicate.
> >
> >   For i386, it enables the optimization below:
> >
> > from
> >     notl    %edi
> >     vpbroadcastd    %edi, %xmm0
> >     vpand   %xmm1, %xmm0, %xmm0
> > to
> >     vpbroadcastd    %edi, %xmm0
> >     vpandn  %xmm1, %xmm0, %xmm0
> >
> >   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >   Ok for trunk?
> > gcc/ChangeLog:
> >
> > PR target/100711
> > * simplify-rtx.c (simplify_unary_operation_1):
> > Simplify (vec_duplicate (not (nonimmediate_operand)))
> > to (not (vec_duplicate (nonimmediate_operand))).
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/100711
> > * gcc.target/i386/avx2-pr100711.c: New test.
> > * gcc.target/i386/avx512bw-pr100711.c: New test.
>
> This patch should not use nonimmediate_operand at all in
There's no simplification opportunity for nonimmediate_operand, but
I'm not sure about other non-constant cases.
Reading the code for case NOT in simplify_unary_operation_1, there
may be something like (vec_duplicate (not (plus X -1)))?

> simplify-rtx.c.  Rather use !CONSTANT_P (XEXP (op, 0)) instead.
> And even then (not CONST_INT) will never be there anyways as it will
> always be simplified to a constant in the first place.  So removing
> that check is fine.
>
> Thanks,
> Andrew



-- 
BR,
Hongtao


Re: [PATCH] Extend is_cond_scalar_reduction to handle nop_expr after/before scalar reduction.[PR98365]

2021-05-25 Thread Hongtao Liu via Gcc-patches
On Tue, May 25, 2021 at 6:24 PM Richard Biener
 wrote:
>
> On Mon, May 24, 2021 at 11:52 AM Hongtao Liu  wrote:
> >
> > Hi:
> >   Details described in PR.
> >   Bootstrapped and regtest on
> > x86_64-linux-gnu{-m32,}/x86_64-linux-gnu{-m32\
> > -march=cascadelake,-march=cascadelake}
> >   Ok for trunk?
>
> +static tree
> +strip_nop_cond_scalar_reduction (bool has_nop, tree op)
> +{
> +  if (!has_nop)
> +return op;
> +
> +  if (TREE_CODE (op) != SSA_NAME)
> +return NULL_TREE;
> +
> +  gimple* stmt = SSA_NAME_DEF_STMT (op);
> +  if (!stmt
> +  || gimple_code (stmt) != GIMPLE_ASSIGN
> +  || gimple_has_volatile_ops (stmt)
> +  || gimple_assign_rhs_code (stmt) != NOP_EXPR)
> +return NULL_TREE;
> +
> +  return gimple_assign_rhs1 (stmt);
>
> this allows arbitrary conversions where the comment suggests you
> only want to allow conversions to the same precision but different sign.
> Sth like
>
>   gassign *stmt = safe_dyn_cast <gassign *> (SSA_NAME_DEF_STMT (op));
>   if (!stmt
>       || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))
>       || !tree_nop_conversion_p (TREE_TYPE (op), TREE_TYPE
> (gimple_assign_rhs1 (stmt))))
>     return NULL_TREE;
>
> +  if (gimple_bb (stmt) != gimple_bb (*nop_reduc)
> + || gimple_code (stmt) != GIMPLE_ASSIGN
> + || gimple_has_volatile_ops (stmt))
> +   return false;
>
> !is_gimple_assign (stmt) instead of gimple_code (stmt) != GIMPLE_ASSIGN
>
> the gimple_has_volatile_ops check is superfluous given you restrict
> the assign code.
>
> +  /* Check that R_NOP1 is used in nop_stmt or in PHI only.  */
> +  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, r_nop1)
> +   {
> + gimple *use_stmt = USE_STMT (use_p);
> + if (is_gimple_debug (use_stmt))
> +   continue;
> + if (use_stmt == SSA_NAME_DEF_STMT (r_op1))
> +   continue;
> + if (gimple_code (use_stmt) != GIMPLE_PHI)
> +   return false;
>
> can the last check be use_stmt == phi since we should have the
> PHI readily available?
>
> @@ -1735,6 +1822,23 @@ convert_scalar_cond_reduction (gimple *reduc,
> gimple_stmt_iterator *gsi,
>rhs = fold_build2 (gimple_assign_rhs_code (reduc),
>  TREE_TYPE (rhs1), op0, tmp);
>
> +  if (has_nop)
> +{
> +  /* Create assignment nop_rhs = op0 +/- _ifc_.  */
> +  tree nop_rhs = make_temp_ssa_name (TREE_TYPE (rhs1), NULL, "_nop_");
> +  gimple* new_assign2 = gimple_build_assign (nop_rhs, rhs);
> +  gsi_insert_before (gsi, new_assign2, GSI_SAME_STMT);
> +  /* Rebuild rhs for nop_expr.  */
> +  rhs = fold_build1 (NOP_EXPR,
> +TREE_TYPE (gimple_assign_lhs (nop_reduc)),
> +nop_rhs);
> +
> +  /* Delete nop_reduc.  */
> +  stmt_it = gsi_for_stmt (nop_reduc);
> +  gsi_remove (&stmt_it, true);
> +  release_defs (nop_reduc);
> +}
> +
>
> hmm, the whole function could be cleaned up with sth like
>
>  /* Build rhs for unconditional increment/decrement.  */
>  gimple_seq stmts = NULL;
>  rhs = gimple_build (&stmts, gimple_assign_rhs_code (reduc),
> TREE_TYPE (rhs1), op0, tmp);
>  if (has_nop)
>    rhs = gimple_convert (&stmts, TREE_TYPE (gimple_assign_lhs
> (nop_reduc)), rhs);
>  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>
> plus in the caller moving the
>
>   new_stmt = gimple_build_assign (res, rhs);
>   gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>
> to the else branch as well as the folding done on new_stmt (maybe return
> new_stmt instead of rhs from convert_scalar_cond_reduction.
Eventually, we need to assign rhs to res, and with an extra move stmt
from rhs to res, vectorization fails.
The only difference in 166t.ifcvt between the successful and the
failed vectorization is below:
   char * _24;
   char _25;
   unsigned char _ifc__29;
+  unsigned char _30;

[local count: 118111600]:
   if (n_10(D) != 0)
@@ -70,7 +71,8 @@ char foo2 (char * a, char * c, int n)
   _5 = c_14(D) + _1;
   _6 = *_5;
   _ifc__29 = _3 == _6 ? 1 : 0;
-  cnt_7 = cnt_18 + _ifc__29;
+  _30 = cnt_18 + _ifc__29;
+  cnt_7 = _30;
   i_16 = i_20 + 1;
   if (n_10(D) != i_16)
 goto ; [89.00%]
@@ -110,7 +112,7 @@ char foo2 (char * a, char * c, int n)
   goto ; [100.00%]

[local count: 105119324]:
-  # cnt_19 = PHI 
+  # cnt_19 = PHI <_30(3), cnt_27(15)>
   _21 = (char) cnt_19;

If we want to eliminate the extra move, gimple_build and
gimple_convert are not suitable since they create a new lhs. Is there
any interface like gimple_build_assign that accepts stmts?
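
For readers following the dump above, here is a minimal sketch of the kind of loop under discussion. It is reconstructed from the foo2 signature and the _ifc_ statements in the dump, so the source shape is an assumption, and the function names count_equal_cond and count_equal_ifcvt are made up for illustration. The first function keeps the guarded increment; the second mirrors the if-converted form, where a 0/1 value is added unconditionally in unsigned char and a single sign-changing conversion is applied at the end.

```c
#include <assert.h>

/* Assumed original shape: the increment is guarded by a condition.  */
char count_equal_cond(const char *a, const char *c, int n)
{
  char cnt = 0;
  assert(n >= 0);
  for (int i = 0; i < n; i++)
    if (a[i] == c[i])
      cnt++;
  return cnt;
}

/* If-converted shape: unconditional add of a 0/1 value, accumulated in
   unsigned char, with one conversion back to char after the loop.
   Converting values above CHAR_MAX back to char is implementation-defined
   in C, so the sketch is only meant for small counts.  */
char count_equal_ifcvt(const char *a, const char *c, int n)
{
  unsigned char cnt = 0;
  assert(n >= 0);
  for (int i = 0; i < n; i++)
    {
      unsigned char ifc = (a[i] == c[i]) ? 1 : 0;  /* like _ifc__29 */
      cnt = (unsigned char) (cnt + ifc);           /* like cnt_18 + _ifc__29 */
    }
  return (char) cnt;                               /* like _21 = (char) cnt_19 */
}
```

Both functions return the same value for small inputs; the point of contention in the thread is only whether the add result is written directly to the reduction variable or through an extra copy.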
>
> Richard.
>
> >   gcc/ChangeLog:
> >
> > PR tree-optimization/pr98365
> > * tree-if-conv.c (strip_nop_cond_scalar_reduction): New function.
> > (is_cond_scalar_reduction): Handle nop_expr in cond scalar 
> > reduction.
> > (convert_scalar_cond_reduction): Ditto.
> > (predicate_scalar_phi): Ditto.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR 

Re: [PATCH][i386] Split not+broadcast+pand to broadcast+pandn.

2021-05-25 Thread Andrew Pinski via Gcc-patches
On Tue, May 25, 2021 at 6:17 PM Hongtao Liu  wrote:
>
> Update patch:
>   The new patch simplifies (vec_duplicate (not (nonimmediate_operand)))
> to (not (vec_duplicate (nonimmediate_operand))). This is not a
> straightforward simplification; it just adds a tendency to pull not
> out of vec_duplicate.
>
>   For i386, it enables the optimization below:
>
> from
>     notl    %edi
>     vpbroadcastd    %edi, %xmm0
>     vpand   %xmm1, %xmm0, %xmm0
> to
>     vpbroadcastd    %edi, %xmm0
>     vpandn  %xmm1, %xmm0, %xmm0
>
>   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
>   Ok for trunk?
> gcc/ChangeLog:
>
> PR target/100711
> * simplify-rtx.c (simplify_unary_operation_1):
> Simplify (vec_duplicate (not (nonimmediate_operand)))
> to (not (vec_duplicate (nonimmediate_operand))).
>
> gcc/testsuite/ChangeLog:
>
> PR target/100711
> * gcc.target/i386/avx2-pr100711.c: New test.
> * gcc.target/i386/avx512bw-pr100711.c: New test.

This patch should not use nonimmediate_operand at all in
simplify-rtx.c.  Rather use !CONSTANT_P (XEXP (op, 0)) instead.
And even then (not CONST_INT) will never be there anyways as it will
always be simplified to a constant in the first place.  So removing
that check is fine.

Thanks,
Andrew


[PATCH] C-SKY: Support fpuv2:fldrd/fstrd and fpuv3:fldr.64/fstr.64.

2021-05-25 Thread Geng Qi via Gcc-patches
gcc/ChangeLog:

* config/csky/csky.c (ck810_legitimate_index_p): Modified to
support "base + index" with DF mode.
* config/csky/constraints.md ("Y"): New constraint for memory operands
without index register.
* config/csky/csky_insn_fpuv2.md
(fpuv2_movdf): In the constraints, use "Y" instead of "m" for moves between
memory and general registers, and move them to the end of the alternatives.
* config/csky/csky_insn_fpuv3.md
(fpv3_movdf): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/csky/fldrd_fstrd.c: New.
* gcc.target/csky/fpuv3/fldr64_fstr64.c: New.
---
 gcc/config/csky/constraints.md  |  4 
 gcc/config/csky/csky.c  |  3 ++-
 gcc/config/csky/csky_insn_fpuv2.md  |  4 ++--
 gcc/config/csky/csky_insn_fpuv3.md  | 16 
 gcc/testsuite/gcc.target/csky/fldrd_fstrd.c | 17 +
 gcc/testsuite/gcc.target/csky/fpuv3/fldr64_fstr64.c | 18 ++
 6 files changed, 51 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/csky/fldrd_fstrd.c
 create mode 100644 gcc/testsuite/gcc.target/csky/fpuv3/fldr64_fstr64.c

diff --git a/gcc/config/csky/constraints.md b/gcc/config/csky/constraints.md
index c9bc9f2..2641ab3 100644
--- a/gcc/config/csky/constraints.md
+++ b/gcc/config/csky/constraints.md
@@ -38,6 +38,10 @@
   "Memory operands with base register, index register"
   (match_test "csky_valid_mem_constraint_operand (op, \"W\")"))
 
+(define_memory_constraint "Y"
+  "Memory operands without index register"
+  (not (match_test "csky_valid_mem_constraint_operand (op, \"W\")")))
+
 (define_constraint "R"
   "Memory operands whose address is a label_ref"
   (and (match_code "mem")
diff --git a/gcc/config/csky/csky.c b/gcc/config/csky/csky.c
index e4c92fe..e55821f 100644
--- a/gcc/config/csky/csky.c
+++ b/gcc/config/csky/csky.c
@@ -3136,7 +3136,8 @@ ck810_legitimate_index_p (machine_mode mode, rtx index, 
int strict_p)
   /* The follow index is for ldr instruction, the ldr cannot
  load dword data, so the mode size should not be larger than
  4.  */
-  else if (GET_MODE_SIZE (mode) <= 4)
+  else if (GET_MODE_SIZE (mode) <= 4
+  || (TARGET_HARD_FLOAT && CSKY_VREG_MODE_P (mode)))
 {
   if (is_csky_address_register_rtx_p (index, strict_p))
return 1;
diff --git a/gcc/config/csky/csky_insn_fpuv2.md 
b/gcc/config/csky/csky_insn_fpuv2.md
index 0a680f8..5a06b22 100644
--- a/gcc/config/csky/csky_insn_fpuv2.md
+++ b/gcc/config/csky/csky_insn_fpuv2.md
@@ -461,8 +461,8 @@
 )
 
 (define_insn "*fpuv2_movdf"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=r,r, r,m, v,?r,Q,v,v,v")
-   (match_operand:DF 1 "general_operand"  " r,m,mF,r,?r, v,v,Q,v,m"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=r, v,?r,Q,v,v,v,r, r,Y")
+   (match_operand:DF 1 "general_operand"  " r,?r, v,v,Q,v,m,Y,YF,r"))]
   "CSKY_ISA_FEATURE (fpv2_df)"
   "* return csky_output_movedouble(operands, DFmode);"
   [(set (attr "length")
diff --git a/gcc/config/csky/csky_insn_fpuv3.md 
b/gcc/config/csky/csky_insn_fpuv3.md
index 053673c..7849795 100644
--- a/gcc/config/csky/csky_insn_fpuv3.md
+++ b/gcc/config/csky/csky_insn_fpuv3.md
@@ -71,27 +71,27 @@
 )
 
 (define_insn "*fpv3_movdf"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=r,r, r,m,v,?r,Q,v,v,v, v")
-   (match_operand:DF 1 "general_operand"  " 
r,m,mF,r,?r,v,v,Q,v,m,Dv"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=r, v,?r,Q,v,v,v, v,r, 
r,Y")
+   (match_operand:DF 1 "general_operand"  " r,?r, 
v,v,Q,v,m,Dv,Y,YF,r"))]
   "CSKY_ISA_FEATURE(fpv3_df)"
   "*
   switch (which_alternative)
 {
-case 4:
+case 1:
   if (TARGET_BIG_ENDIAN)
return \"fmtvr.64\\t%0, %R1, %1\";
   return \"fmtvr.64\\t%0, %1, %R1\";
-case 5:
+case 2:
   if (TARGET_BIG_ENDIAN)
return \"fmfvr.64\\t%R0, %0, %1\";
   return \"fmfvr.64\\t%0, %R0, %1\";
+case 3:
+case 4:
 case 6:
-case 7:
-case 9:
   return fpuv3_output_move(operands);
-case 8:
+case 5:
   return \"fmov.64\\t%0, %1\";
-case 10:
+case 7:
   return \"fmovi.64\\t%0, %1\";
 default:
   return csky_output_movedouble(operands, DFmode);
diff --git a/gcc/testsuite/gcc.target/csky/fldrd_fstrd.c 
b/gcc/testsuite/gcc.target/csky/fldrd_fstrd.c
new file mode 100644
index 000..024de18
--- /dev/null
+++ b/gcc/testsuite/gcc.target/csky/fldrd_fstrd.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-csky-options "-mcpu=ck810f -O1 -mhard-float" } */
+
+double fldrd (double *pd, int index)
+{
+  return pd[index];
+}
+
+/* { dg-final { scan-assembler "fldrd" } } */
+
+void fstrd (double *pd, int index, double d)
+{
+  pd[index] = d;
+}
+
+/* { dg-final { scan-assembler "fstrd" } } */
+
diff --git a/gcc/testsuite/gcc.target/csky/fpuv3/fldr64_fstr64.c 

Re: [PATCH v2] forwprop: Support vec perm fed by CTOR and CTOR/CST [PR99398]

2021-05-25 Thread Kewen.Lin via Gcc-patches
>> The attached patch v2 use the structure by considering the above
>> advice and the (code == CONSTRUCTOR || code == VECTOR_CST) part
>> can be shared with VIEW_CONVERT_EXPR handlings as below:
>>
>>   op0 gathering (leave V_C_E in code if it's met)  
>>
>>   else if (code == CONSTRUCTOR || code == VECTOR_CST || VIEW_CONVERT_EXPR) 
>> {
>>op1 gathering (leave V_C_E in code2)
>>
>>if (code == VIEW_CONVERT_EXPR || code2 == VIEW_CONVERT_EXPR)
>>  do the tricks on arg0/arg1/op2
>>
>>the previous handlings on CONSTRUCTOR/VECTOR_CST
>> }
>>
>> Also updated "shrinked" to "shrunk" as Segher pointed out.  :-)
>>
>> Does it look better now?
> 
> Yes.  The forwprop changes are OK - I'd still like Richard to
> review the vec-perm-indices change.
> 

Thanks Richi!



Hi Richard,

Gentle ping for the vec-perm-indices change in case this thread
escaped from your radar.

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570240.html

BR,
Kewen


PING^2 [PATCH/RFC] combine: Tweak the condition of last_set invalidation

2021-05-25 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html

BR,
Kewen

on 2021/5/7 10:45 AM, Kewen.Lin via Gcc-patches wrote:
> Hi Segher,
> 
>>>
>>> I think this should be postponed to stage 1 though?  Or is there
>>> anything very urgent in it?
>>>
>>
>> Yeah, I agree that this belongs to stage1, and there isn't anything
>> urgent about it.  Thanks for all further comments above!
>>
> 
> Gentle ping this:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562015.html
> 
> BR,
> Kewen
> 


[r12-1053 Regression] FAIL: libgomp.c++/task-reduction-8.C execution test on Linux/x86_64

2021-05-25 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

a6e94287d31525b3ad0963ad22a92e9f3dbcd3cf is the first bad commit
commit a6e94287d31525b3ad0963ad22a92e9f3dbcd3cf
Author: Andrew MacLeod 
Date:   Tue May 25 14:59:54 2021 -0400

Remove the logical stmt cache for now.

caused

FAIL: libgomp.c++/task-reduction-8.C execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-1053/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/task-reduction-8.C --target_board='unix{-m32\ 
-march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me 
at skpgkp2 at gmail dot com)


PING^1 [PATCH] rs6000: Support more short/char to float conversion

2021-05-25 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping this:

  https://gcc.gnu.org/pipermail/gcc-patches/2021-May/569792.html


BR,
Kewen

on 2021/5/7 10:30 AM, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> When we load unsigned char/short values from memory and convert
> them to double/single precision floating point values, there are
> implicit conversions to int first.  This prevents GCC from using
> the P9 instructions lxsibzx/lxsihzx.  This patch adds the related
> define_insn_and_split to support this kind of scenario.
> 
> Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
> powerpc64-linux-gnu P8.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> --
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.md
>   (floatsi2_lfiwax__mem_zext): New
>   define_insn_and_split.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/p9-fpcvt-3.c: New test.
> 
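
As an illustration of the scenario described in the quoted message, the sketch below shows the kind of source that loads an unsigned char or unsigned short from memory and converts it to floating point. The function names are hypothetical and this is not the content of p9-fpcvt-3.c, which is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Load an unsigned char from memory and convert it to double.  According
   to the quoted message, GCC first converts the loaded value to int and
   only then to floating point.  */
double load_uchar_as_double(const unsigned char *p)
{
  assert(p != NULL);
  return (double) *p;
}

/* Same pattern for unsigned short and single precision.  */
float load_ushort_as_float(const unsigned short *p)
{
  assert(p != NULL);
  return (float) *p;
}
```

Every unsigned char and unsigned short value is exactly representable in both float and double, so the result is the same whether or not an intermediate int conversion happens; the patch only concerns which instructions are selected.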



[PATCH v2] rs6000: Add load density heuristic

2021-05-25 Thread Kewen.Lin via Gcc-patches
Hi,

This is the updated version of the patch to deal with the bwaves_r
degradation due to vector construction fed by strided loads.

Following Richi's comments [1], this uses a similar idea: overprice
the vector construction fed by VMAT_ELEMENTWISE or
VMAT_STRIDED_SLP.  Instead of adding the extra cost immediately
when costing the vector construction, it first records how many
loads and vectorized statements are in the given loop.  Later, in
rs6000_density_test (called by finish_cost), it computes the
load density ratio against all vectorized stmts, checks it
against the corresponding thresholds DENSITY_LOAD_NUM_THRESHOLD
and DENSITY_LOAD_PCT_THRESHOLD, and does the actual extra pricing
if both thresholds are exceeded.
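
The check described above can be sketched in isolation as follows. This is a simplified model, not the code in rs6000.c: the struct load_density_data and its body_cost field are stand-ins invented for illustration, while the two threshold values and the nstmts, nloads and extra_ctor_cost names are taken from the patch below.

```c
#include <assert.h>

/* Threshold values as they appear in the patch.  */
static const unsigned DENSITY_LOAD_PCT_THRESHOLD = 45;
static const unsigned DENSITY_LOAD_NUM_THRESHOLD = 20;

/* Hypothetical, simplified stand-in for the target cost data.  */
struct load_density_data
{
  unsigned nstmts;           /* vectorized statements in the loop */
  unsigned nloads;           /* loads among those statements */
  unsigned extra_ctor_cost;  /* penalty recorded for vector constructions */
  unsigned body_cost;        /* stand-in for cost[vect_body] */
};

/* Add the recorded penalty to the body cost only when both the absolute
   number of loads and their share of all vectorized statements exceed
   the thresholds.  Returns 1 if the penalty was applied, 0 otherwise.  */
static int apply_load_density_penalty(struct load_density_data *d)
{
  if (d->extra_ctor_cost == 0)
    return 0;

  assert(d->nstmts > 0 && d->nloads <= d->nstmts);
  unsigned load_pct = (d->nloads * 100) / d->nstmts;

  if (d->nloads > DENSITY_LOAD_NUM_THRESHOLD
      && load_pct > DENSITY_LOAD_PCT_THRESHOLD)
    {
      d->body_cost += d->extra_ctor_cost;
      return 1;
    }
  return 0;
}
```

For example, a loop with 40 vectorized statements of which 24 are loads (60%) would be penalized, while 24 loads out of 100 statements (24%) would not.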

Note that this new load density heuristic check is based on
some fields in the target cost data which are updated as needed when
scanning each add_stmt_cost entry; it's independent of the
existing check in rs6000_density_test, which requires scanning
non_vect stmts.  Since it compares the load stmt count
against all vectorized stmts, it's a kind of density, so I put
it in function rs6000_density_test.  For the same reason, to
keep it independent, I didn't put it as an else arm of the
existing density threshold check hunk or before this
hunk.

In the investigation of the -1.04% degradation from 526.blender_r
on Power8, I noticed that the extra penalized cost of 320 on one
single vector construction with type V16QI is much exaggerated,
which makes the final body cost unreliable, so this patch adds
a maximum bound for the extra penalized cost for each vector
construction statement.

Bootstrapped/regtested on powerpc64le-linux-gnu P9.

Full SPEC2017 performance evaluation on Power8/Power9 with
option combinations:
  * -O2 -ftree-vectorize {,-fvect-cost-model=very-cheap} {,-ffast-math}
  * {-O3, -Ofast} {,-funroll-loops}

bwaves_r degradations on P8/P9 have been fixed, nothing else
remarkable was observed.

Is it ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570076.html

BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/rs6000.c (struct rs6000_cost_data): New members
nstmts, nloads and extra_ctor_cost.
(rs6000_density_test): Add load density related heuristics and the
checks, do extra costing on vector construction statements if need.
(rs6000_init_cost): Init new members.
(rs6000_update_target_cost_per_stmt): New function.
(rs6000_add_stmt_cost): Factor vect_nonmem hunk out to function
rs6000_update_target_cost_per_stmt and call it.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 83d29cbfac1..806c3335cbc 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5231,6 +5231,12 @@ typedef struct _rs6000_cost_data
 {
   struct loop *loop_info;
   unsigned cost[3];
+  /* Total number of vectorized stmts (loop only).  */
+  unsigned nstmts;
+  /* Total number of loads (loop only).  */
+  unsigned nloads;
+  /* Possible extra penalized cost on vector construction (loop only).  */
+  unsigned extra_ctor_cost;
   /* For each vectorized loop, this var holds TRUE iff a non-memory vector
  instruction is needed by the vectorization.  */
   bool vect_nonmem;
@@ -5292,9 +5298,45 @@ rs6000_density_test (rs6000_cost_data *data)
   if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
 "density %d%%, cost %d exceeds threshold, penalizing "
-"loop body cost by %d%%", density_pct,
+"loop body cost by %d%%\n", density_pct,
 vec_cost + not_vec_cost, DENSITY_PENALTY);
 }
+
+  /* Check if we need to penalize the body cost for latency and
+ execution resources bound from strided or elementwise loads
+ into a vector.  */
+  if (data->extra_ctor_cost > 0)
+{
+  /* Threshold for load stmts percentage in all vectorized stmts.  */
+  const int DENSITY_LOAD_PCT_THRESHOLD = 45;
+  /* Threshold for total number of load stmts.  */
+  const int DENSITY_LOAD_NUM_THRESHOLD = 20;
+
+  gcc_assert (data->nloads <= data->nstmts);
+  unsigned int load_pct = (data->nloads * 100) / (data->nstmts);
+
+  /* It's likely to be bounded by latency and execution resources
+from many scalar loads which are strided or elementwise loads
+into a vector if both conditions below are found:
+  1. there are many loads, it's easy to result in a long wait
+ for load units;
+  2. load has a big proportion of all vectorized statements,
+ it's not easy to schedule other statements to spread among
+ the loads.
+One typical case is the innermost loop of the hotspot of SPEC2017
+503.bwaves_r without loop interchange.  */
+  if (data->nloads > DENSITY_LOAD_NUM_THRESHOLD
+ && load_pct > DENSITY_LOAD_PCT_THRESHOLD)
+   {
+ data->cost[vect_body] += data->extra_ctor_cost;
+ if 

Re: [PATCH] C-SKY: Support for fpuv2:fldrd/fstrd and fpuv3:fldr.64/fstr.64.

2021-05-25 Thread Cooper Qu via Gcc-patches

Is there any test case for these instructions?

On 4/30/21 9:04 PM, Geng Qi wrote:

gcc/ChangeLog:

* config/csky/csky.c (ck810_legitimate_index_p): Modified to
support "base + index" with DF mode.
* config/csky/constraints.md ("Y"): New constraint for memory operands
without index register.
* config/csky/csky_insn_fpuv2.md
(fpuv2_movdf): In the constraints, use "Y" instead of "m" for moves between
memory and general registers, and move them to the end of the alternatives.
* config/csky/csky_insn_fpuv3.md
(fpv3_movdf): Likewise.
---
  gcc/config/csky/constraints.md |  4 
  gcc/config/csky/csky.c |  3 ++-
  gcc/config/csky/csky_insn_fpuv2.md |  4 ++--
  gcc/config/csky/csky_insn_fpuv3.md | 16 
  4 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/gcc/config/csky/constraints.md b/gcc/config/csky/constraints.md
index c9bc9f2..2641ab3 100644
--- a/gcc/config/csky/constraints.md
+++ b/gcc/config/csky/constraints.md
@@ -38,6 +38,10 @@
"Memory operands with base register, index register"
(match_test "csky_valid_mem_constraint_operand (op, \"W\")"))
  
+(define_memory_constraint "Y"

+  "Memory operands without index register"
+  (not (match_test "csky_valid_mem_constraint_operand (op, \"W\")")))
+
  (define_constraint "R"
"Memory operands whose address is a label_ref"
(and (match_code "mem")
diff --git a/gcc/config/csky/csky.c b/gcc/config/csky/csky.c
index e4c92fe..e55821f 100644
--- a/gcc/config/csky/csky.c
+++ b/gcc/config/csky/csky.c
@@ -3136,7 +3136,8 @@ ck810_legitimate_index_p (machine_mode mode, rtx index, 
int strict_p)
/* The follow index is for ldr instruction, the ldr cannot
   load dword data, so the mode size should not be larger than
   4.  */
-  else if (GET_MODE_SIZE (mode) <= 4)
+  else if (GET_MODE_SIZE (mode) <= 4
+  || (TARGET_HARD_FLOAT && CSKY_VREG_MODE_P (mode)))
  {
if (is_csky_address_register_rtx_p (index, strict_p))
return 1;
diff --git a/gcc/config/csky/csky_insn_fpuv2.md 
b/gcc/config/csky/csky_insn_fpuv2.md
index 0a680f8..5a06b22 100644
--- a/gcc/config/csky/csky_insn_fpuv2.md
+++ b/gcc/config/csky/csky_insn_fpuv2.md
@@ -461,8 +461,8 @@
  )
  
  (define_insn "*fpuv2_movdf"

-  [(set (match_operand:DF 0 "nonimmediate_operand" "=r,r, r,m, v,?r,Q,v,v,v")
-   (match_operand:DF 1 "general_operand"  " r,m,mF,r,?r, v,v,Q,v,m"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=r, v,?r,Q,v,v,v,r, r,Y")
+   (match_operand:DF 1 "general_operand"  " r,?r, v,v,Q,v,m,Y,YF,r"))]
"CSKY_ISA_FEATURE (fpv2_df)"
"* return csky_output_movedouble(operands, DFmode);"
[(set (attr "length")
diff --git a/gcc/config/csky/csky_insn_fpuv3.md 
b/gcc/config/csky/csky_insn_fpuv3.md
index 053673c..7849795 100644
--- a/gcc/config/csky/csky_insn_fpuv3.md
+++ b/gcc/config/csky/csky_insn_fpuv3.md
@@ -71,27 +71,27 @@
  )
  
  (define_insn "*fpv3_movdf"

-  [(set (match_operand:DF 0 "nonimmediate_operand" "=r,r, r,m,v,?r,Q,v,v,v, v")
-   (match_operand:DF 1 "general_operand"  " 
r,m,mF,r,?r,v,v,Q,v,m,Dv"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=r, v,?r,Q,v,v,v, v,r, 
r,Y")
+   (match_operand:DF 1 "general_operand"  " r,?r, 
v,v,Q,v,m,Dv,Y,YF,r"))]
"CSKY_ISA_FEATURE(fpv3_df)"
"*
switch (which_alternative)
  {
-case 4:
+case 1:
if (TARGET_BIG_ENDIAN)
return \"fmtvr.64\\t%0, %R1, %1\";
return \"fmtvr.64\\t%0, %1, %R1\";
-case 5:
+case 2:
if (TARGET_BIG_ENDIAN)
return \"fmfvr.64\\t%R0, %0, %1\";
return \"fmfvr.64\\t%0, %R0, %1\";
+case 3:
+case 4:
  case 6:
-case 7:
-case 9:
return fpuv3_output_move(operands);
-case 8:
+case 5:
return \"fmov.64\\t%0, %1\";
-case 10:
+case 7:
return \"fmovi.64\\t%0, %1\";
  default:
return csky_output_movedouble(operands, DFmode);


Re: [PATCH] C-SKY: Add insn "ldbs".

2021-05-25 Thread Cooper Qu via Gcc-patches

merged.

On 5/25/21 6:45 PM, Geng Qi wrote:

gcc/
* config/csky/csky.md (cskyv2_sextend_ldbs): New insn.

gcc/testsuite/
* gcc/testsuite/gcc.target/csky/ldbs.c: New.
---
  gcc/config/csky/csky.md  | 10 ++
  gcc/testsuite/gcc.target/csky/ldbs.c | 11 +++
  2 files changed, 21 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/csky/ldbs.c

diff --git a/gcc/config/csky/csky.md b/gcc/config/csky/csky.md
index c27d627..b980d4c 100644
--- a/gcc/config/csky/csky.md
+++ b/gcc/config/csky/csky.md
@@ -1533,6 +1533,7 @@
}"
  )
  
+;; hi -> si

  (define_insn "extendhisi2"
[(set (match_operand:SI   0 "register_operand" "=r")
(sign_extend:SI (match_operand:HI 1 "register_operand" "r")))]
@@ -1557,6 +1558,15 @@
"sextb  %0, %1"
  )
  
+(define_insn "*cskyv2_sextend_ldbs"

+  [(set (match_operand:SI0 "register_operand" "=r")
+(sign_extend:SI (match_operand:QI 1 "csky_simple_mem_operand" "m")))]
+  "CSKY_ISA_FEATURE (E2)"
+  "ld.bs\t%0, %1"
+  [(set_attr "length" "4")
+   (set_attr "type" "load")]
+)
+
  ;; qi -> hi
  (define_insn "extendqihi2"
[(set (match_operand:HI   0 "register_operand" "=r")
diff --git a/gcc/testsuite/gcc.target/csky/ldbs.c 
b/gcc/testsuite/gcc.target/csky/ldbs.c
new file mode 100644
index 000..27a0254
--- /dev/null
+++ b/gcc/testsuite/gcc.target/csky/ldbs.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-mcpu=ck801" "-march=ck801" } { "*" } } */
+/* { dg-csky-options "-O1" } */
+
+int foo (signed char *pb)
+{
+  return *pb;
+}
+
+/* { dg-final { scan-assembler "ld.bs" } } */
+


Re: [PATCH][i386] Split not+broadcast+pand to broadcast+pandn.

2021-05-25 Thread Hongtao Liu via Gcc-patches
Update patch:
  The new patch simplifies (vec_duplicate (not (nonimmediate_operand)))
to (not (vec_duplicate (nonimmediate_operand))). This is not a
straightforward simplification; it just adds a tendency to pull not
out of vec_duplicate.

  For i386, it enables the optimization below:

from
    notl    %edi
    vpbroadcastd    %edi, %xmm0
    vpand   %xmm1, %xmm0, %xmm0
to
    vpbroadcastd    %edi, %xmm0
    vpandn  %xmm1, %xmm0, %xmm0
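
The two instruction sequences above compute the same lanes, which can be checked with a small scalar model. This is only a sketch: the 4-lane array stands in for a 128-bit vector of 32-bit elements, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4  /* models four 32-bit lanes of an xmm register */

/* "from" sequence: invert the scalar, broadcast it, then AND each lane.  */
static void not_broadcast_and(uint32_t a, const uint32_t c[LANES],
                              uint32_t out[LANES])
{
  uint32_t v[LANES];
  uint32_t b = ~a;                      /* notl */
  for (size_t i = 0; i < LANES; i++)
    v[i] = b;                           /* vpbroadcastd */
  for (size_t i = 0; i < LANES; i++)
    out[i] = v[i] & c[i];               /* vpand */
}

/* "to" sequence: broadcast the scalar unchanged, then AND-NOT each lane.  */
static void broadcast_andn(uint32_t a, const uint32_t c[LANES],
                           uint32_t out[LANES])
{
  uint32_t v[LANES];
  for (size_t i = 0; i < LANES; i++)
    v[i] = a;                           /* vpbroadcastd */
  for (size_t i = 0; i < LANES; i++)
    out[i] = ~v[i] & c[i];              /* vpandn */
}

/* Returns 1 when both sequences produce identical lanes.  */
static int forms_agree(uint32_t a, const uint32_t c[LANES])
{
  uint32_t x[LANES], y[LANES];
  assert(c != NULL);
  not_broadcast_and(a, c, x);
  broadcast_andn(a, c, y);
  for (size_t i = 0; i < LANES; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}
```

Because bitwise NOT acts on each lane independently, applying it before or after the broadcast gives the same vector, which is why moving it outside lets the AND absorb it as an AND-NOT.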

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
  Ok for trunk?
gcc/ChangeLog:

PR target/100711
* simplify-rtx.c (simplify_unary_operation_1):
Simplify (vec_duplicate (not (nonimmediate_operand)))
to (not (vec_duplicate (nonimmediate_operand))).

gcc/testsuite/ChangeLog:

PR target/100711
* gcc.target/i386/avx2-pr100711.c: New test.
* gcc.target/i386/avx512bw-pr100711.c: New test.
From aa36def1266538fdda02177be8dbf9433d7e959c Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Tue, 25 May 2021 17:17:32 +0800
Subject: [PATCH] Simplify (vec_duplicate (not (nonimmediate_operand))) to (not
 (vec_duplicate (nonimmediate_operand))).

This is not a straightforward simplification, just adding some
tendency to pull not out of vec_duplicate.

For i386, it will enable below opt

from
    notl    %edi
    vpbroadcastd    %edi, %xmm0
    vpand   %xmm1, %xmm0, %xmm0
to
    vpbroadcastd    %edi, %xmm0
    vpandn  %xmm1, %xmm0, %xmm0

gcc/ChangeLog:

	PR target/100711
	* simplify-rtx.c (simplify_unary_operation_1):
	Simplify (vec_duplicate (not (nonimmediate_operand)))
	to (not (vec_duplicate (nonimmediate_operand))).

gcc/testsuite/ChangeLog:

	PR target/100711
	* gcc.target/i386/avx2-pr100711.c: New test.
	* gcc.target/i386/avx512bw-pr100711.c: New test.
---
 gcc/simplify-rtx.c|  9 +++
 gcc/testsuite/gcc.target/i386/avx2-pr100711.c | 73 +++
 .../gcc.target/i386/avx512bw-pr100711.c   | 48 
 3 files changed, 130 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx2-pr100711.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bw-pr100711.c

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 04423bbd195..bb23183a8e0 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "selftest.h"
 #include "selftest-rtl.h"
 #include "rtx-vector-builder.h"
+#include "tm_p.h"
 
 /* Simplification and canonicalization of RTL.  */
 
@@ -1708,6 +1709,14 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
 #endif
   break;
 
+/* Prefer (not (vec_duplicate (nonimmedaite_operand)))
+   to (vec_duplicate (not (nonimmedaite_operand))).  */
+case VEC_DUPLICATE:
+  if (GET_CODE (op) == NOT
+	  && nonimmediate_operand (XEXP (op, 0), GET_MODE (op)))
+	return gen_rtx_NOT (mode, gen_rtx_VEC_DUPLICATE (mode, XEXP (op, 0)));
+  break;
+
 default:
   break;
 }
diff --git a/gcc/testsuite/gcc.target/i386/avx2-pr100711.c b/gcc/testsuite/gcc.target/i386/avx2-pr100711.c
new file mode 100644
index 000..5b144623873
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx2-pr100711.c
@@ -0,0 +1,73 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "pandn" 8 } } */
+/* { dg-final { scan-assembler-not "not\[bwlq\]" } } */
+typedef char v16qi __attribute__((vector_size(16)));
+typedef char v32qi __attribute__((vector_size(32)));
+typedef short v8hi __attribute__((vector_size(16)));
+typedef short v16hi __attribute__((vector_size(32)));
+typedef int v4si __attribute__((vector_size(16)));
+typedef int v8si __attribute__((vector_size(32)));
+typedef long long v2di __attribute__((vector_size(16)));
+typedef long long v4di __attribute__((vector_size(32)));
+
+v16qi
+f1 (char a, v16qi c)
+{
+  char b = ~a;
+  return (__extension__(v16qi) {b, b, b, b, b, b, b, b,
+ b, b, b, b, b, b, b, b}) & c;
+}
+
+v32qi
+f2 (char a, v32qi c)
+{
+  char b = ~a;
+  return (__extension__(v32qi) {b, b, b, b, b, b, b, b,
+ b, b, b, b, b, b, b, b,
+ b, b, b, b, b, b, b, b,
+ b, b, b, b, b, b, b, b}) & c;
+}
+
+v8hi
+f3 (short a, v8hi c)
+{
+  short b = ~a;
+  return (__extension__(v8hi) {b, b, b, b, b, b, b, b}) & c;
+}
+
+v16hi
+f4 (short a, v16hi c)
+{
+  short b = ~a;
+  return (__extension__(v16hi) {b, b, b, b, b, b, b, b,
+ b, b, b, b, b, b, b, b}) & c;
+}
+
+v4si
+f5 (int a, v4si c)
+{
+  int b = ~a;
+  return (__extension__(v4si) {b, b, b, b}) & c;
+}
+
+v8si
+f6 (int a, v8si c)
+{
+  int b = ~a;
+  return (__extension__(v8si) {b, b, b, b, b, b, b, b}) & c;
+}
+
+v2di
+f7 (long long a, v2di c)
+{
+  long long b = ~a;
+  return (__extension__(v2di) {b, b}) & c;
+}
+
+v4di
+f8 (long long a, v4di c)
+{
+  long long b = ~a;
+  return (__extension__(v4di) {b, b, b, b}) & c;
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-pr100711.c 

[r12-1045 Regression] FAIL: libgomp.c++/task-reduction-8.C execution test on Linux/x86_64

2021-05-25 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

41ddc5b0a6b44a9df53a259636fa3b534ae41cbe is the first bad commit
commit 41ddc5b0a6b44a9df53a259636fa3b534ae41cbe
Author: Aldy Hernandez 
Date:   Tue May 25 08:36:44 2021 +0200

Fix selftest for targets where short and int are the same size.

caused

FAIL: libgomp.c++/task-reduction-8.C execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-1045/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/task-reduction-8.C 
--target_board='unix{-m64}'"

(Please do not reply to this email. For questions about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PATCH] libgccjit: add some reflection functions in the jit C api

2021-05-25 Thread Antoni Boucher via Gcc-patches
@David: PING

As far as I know, the only remaining question is about using `ssize_t`
for the return type of some functions.
Here's why I use this type:

It seemed off to return NULL from functions returning a size_t to
indicate an error, so I changed them to return -1 (and changed the return
type to ssize_t). Is that the proper way to indicate an error?

Once I know the answer for this error handling question, I'll fix the
types.

Thanks!
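
To make the error-handling question concrete, here is a minimal sketch of two possible conventions. It is not libgccjit code: toy_function, toy_get_param_count_signed and toy_get_param_count are hypothetical names, and the out-parameter variant is only one possible alternative, not something proposed in this thread.

```cpp
#include <cstddef>
#include <sys/types.h>  // ssize_t (POSIX, not ISO C++)

// Hypothetical stand-in for a recorded function; not a libgccjit type.
struct toy_function {
  std::size_t num_params;
};

// Convention 1: signed return type, with -1 signalling an error
// (here, a NULL argument).
ssize_t toy_get_param_count_signed(const toy_function* func) {
  if (func == nullptr) return -1;
  return static_cast<ssize_t>(func->num_params);
}

// Convention 2 (assumed alternative): keep the count unsigned, write it
// through an out-parameter, and report success or failure as a bool.
bool toy_get_param_count(const toy_function* func, std::size_t* out) {
  if (func == nullptr || out == nullptr) return false;
  *out = func->num_params;
  return true;
}
```

The first form keeps the call site short but gives up half of the value range; the second keeps size_t at the cost of an extra argument.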

Le jeudi 13 mai 2021 à 17:30 -0400, David Malcolm a écrit :
> On Tue, 2020-11-03 at 17:13 -0500, Antoni Boucher wrote:
> > I was missing a check in gcc_jit_struct_get_field, I added it in this
> > new patch.
> > 
> 
> Sorry about the long delay in reviewing this patch.
> 
> The main high-level points are:
> - currently the get_*_count functions return "ssize_t" - why?  Only
> unsigned values are meaningful; shouldn't they return "size_t" instead?
> 
> - the various "lookup by index" functions take "int" i.e. signed, but
> only >= 0 is meaningful.  I think it makes sense to make them take
> size_t instead.
> 
> Sorry if we covered that before in the review, it's been a while.
> 
> Various nitpicks inline below...
> 
> [...snip...]
>  
> > diff --git a/gcc/jit/docs/topics/compatibility.rst
> > b/gcc/jit/docs/topics/compatibility.rst
> > index 6bfa101ed71..236e5c72d81 100644
> > --- a/gcc/jit/docs/topics/compatibility.rst
> > +++ b/gcc/jit/docs/topics/compatibility.rst
> > @@ -226,3 +226,44 @@ entrypoints:
> >  
> >  ``LIBGCCJIT_ABI_14`` covers the addition of
> >  :func:`gcc_jit_global_set_initializer`
> > +
> > +.. _LIBGCCJIT_ABI_15:
> > +
> > +``LIBGCCJIT_ABI_15``
> > +
> > +``LIBGCCJIT_ABI_15`` covers the addition of reflection functions via
> > API
> > +entrypoints:
> 
> This needs updating, as I used LIBGCCJIT_ABI_15 for inline asm.
> 
> [...snip...]
> 
> > diff --git a/gcc/jit/docs/topics/functions.rst
> > b/gcc/jit/docs/topics/functions.rst
> > index eb40d64010e..aa6de87282d 100644
> > --- a/gcc/jit/docs/topics/functions.rst
> > +++ b/gcc/jit/docs/topics/functions.rst
> > @@ -171,6 +171,16 @@ Functions
> >     underlying string, so it is valid to pass in a pointer to an on-
> > stack
> >     buffer.
> >  
> > +.. function::  ssize_t \
> > +   gcc_jit_function_get_param_count (gcc_jit_function
> > *func)
> > +
> > +   Get the number of parameters of the function.
> > +
> > +.. function::  gcc_jit_type \*
> > +   gcc_jit_function_get_return_type (gcc_jit_function
> > *func)
> > +
> > +   Get the return type of the function.
> 
> As noted before, this doesn't yet document all the new entrypoints; I
> think you wanted to hold off until all the details were thrashed out,
> but hopefully we're close.
> 
> The documentation for an entrypoint should specify which ABI it was
> added in.
> 
> [...snip...]
> 
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::type::is_struct method, in
> > +   jit-recording.c.  */
> > +
> > +gcc_jit_struct *
> > +gcc_jit_type_is_struct (gcc_jit_type *type)
> > +{
> > +  RETURN_NULL_IF_FAIL (type, NULL, NULL, "NULL type");
> > +  gcc::jit::recording::struct_ *struct_type = type->is_struct ();
> > +  return (gcc_jit_struct *)struct_type;
> > +}
> > +
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::vector_type::get_num_units method, in
> > +   jit-recording.c.  */
> > +
> > +ssize_t
> > +gcc_jit_vector_type_get_num_units (gcc_jit_vector_type *vector_type)
> > +{
> > +  RETURN_VAL_IF_FAIL (vector_type, -1, NULL, NULL, "NULL
> > vector_type");
> > +  return vector_type->get_num_units ();
> > +}
> > +
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::vector_type::get_element_type method, in
> > +   jit-recording.c.  */
> > +
> > +gcc_jit_type *
> > +gcc_jit_vector_type_get_element_type (gcc_jit_vector_type
> > *vector_type)
> > +{
> > +  RETURN_NULL_IF_FAIL (vector_type, NULL, NULL, "NULL vector_type");
> > +  return (gcc_jit_type *)vector_type->get_element_type ();
> > +}
> > +
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::type::unqualified method, in
> > +   jit-recording.c.  */
> > +
> > +gcc_jit_type *
> > +gcc_jit_type_unqualified (gcc_jit_type *type)
> > +{
> > +  RETURN_NULL_IF_FAIL (type, NULL, NULL, "NULL type");
> > +
> > +  return (gcc_jit_type *)type->unqualified ();
> > +}
> > +
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::type::dyn_cast_function_type method, in
> > +   jit-recording.c.  */
> > +
> > +gcc_jit_function_type *
> > 

Re: [PATCH] libgccjit: Handle truncation and extension for casts [PR 95498]

2021-05-25 Thread Antoni Boucher via Gcc-patches
I updated the patch according to the comments by Tom Tromey.

One question remains, about your comment regarding
C_MAYBE_CONST_EXPR, David:

I am not sure if we can get a C_MAYBE_CONST_EXPR from libgccjit, and it
indeed seems like it's only created in c-family.
However, we do use it in libgccjit here:
https://github.com/gcc-mirror/gcc/blob/master/gcc/jit/jit-playback.c#L1180

I tried removing the condition `if (TREE_CODE (t_ret) !=
C_MAYBE_CONST_EXPR)` and all the tests of libgccjit still pass.

That code was copied from here:
https://github.com/gcc-mirror/gcc/blob/master/gcc/c/c-convert.c#L175
and might not be needed in libgccjit.

Should I just remove the condition, then?
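
For reference, the hoisted form that Tom suggested (quoted below) could look roughly like the following toy model. This is a sketch under stated assumptions, not the jit-playback.c code: ToyCode, toy_convert_to_integer, toy_fold and toy_convert are hypothetical, an unsigned value stands in for a tree, and std::nullopt stands in for the add_error/error_mark_node path. The toy has no equivalent of C_MAYBE_CONST_EXPR, so it says nothing about whether that check can be dropped.

```cpp
#include <optional>

// Hypothetical destination type codes; the real code switches on tree codes.
enum class ToyCode { Integer, Enumeral, Pointer };

// Placeholder for convert_to_integer: keep only the low 8 bits.
unsigned toy_convert_to_integer(unsigned expr) { return expr & 0xffu; }

// Placeholder for fold: a real folder would simplify the expression;
// the toy value is already as simple as it can be.
unsigned toy_fold(unsigned t) { return t; }

// The fold is done directly in the only case that needs it, so no
// label and no goto are required.
std::optional<unsigned> toy_convert(ToyCode dst_code, unsigned expr) {
  switch (dst_code) {
    case ToyCode::Integer:
    case ToyCode::Enumeral:
      return toy_fold(toy_convert_to_integer(expr));
    default:
      // Stands in for reporting "unhandled conversion".
      return std::nullopt;
  }
}
```

If a second case ever needs the same folding step, the alternative is to end each case with break and fold once after the switch.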

Le jeudi 13 mai 2021 à 19:58 -0400, David Malcolm a écrit :
> On Thu, 2021-05-13 at 19:31 -0400, Antoni Boucher wrote:
> > Thanks for your answer.
> > 
> > See my answers below:
> > 
> > Le jeudi 13 mai 2021 à 18:13 -0400, David Malcolm a écrit :
> > > On Sat, 2021-02-20 at 17:17 -0500, Antoni Boucher via Gcc-patches
> > > wrote:
> > > > Hi.
> > > > Thanks for your feedback!
> > > > 
> > > 
> > > Sorry about the delay in responding.
> > > 
> > > > In the past I was hesitant about adding more cast support to libgccjit
> > > > since I felt that the user could always just create a union to do the
> > > > cast.  Then I tried actually using the libgccjit API to do this, and
> > > > realized how much work it adds, so I now think we do want to support
> > > > casting more types.
> > > 
> > > 
> > > > See answers below:
> > > > 
> > > > On Sat, Feb 20, 2021 at 11:20:35AM -0700, Tom Tromey wrote:
> > > > > > > > > > "Antoni" == Antoni Boucher via Gcc-patches <   
> > > > > > > > > > gcc-patches@gcc.gnu.org> writes:
> > > > > 
> > > > > Antoni> gcc/jit/
> > > > > Antoni> PR target/95498
> > > > > > Antoni> * jit-playback.c: Add support to handle truncation and extension
> > > > > > Antoni> in the convert function.
> > > > > 
> > > > > Antoni> +  switch (dst_code)
> > > > > Antoni> +    {
> > > > > Antoni> +    case INTEGER_TYPE:
> > > > > Antoni> +    case ENUMERAL_TYPE:
> > > > > Antoni> +  t_ret = convert_to_integer (dst_type, expr);
> > > > > Antoni> +  goto maybe_fold;
> > > > > Antoni> +
> > > > > Antoni> +    default:
> > > > > Antoni> +  gcc_assert (gcc::jit::active_playback_ctxt);
> > > > > Antoni> +  gcc::jit::active_playback_ctxt->add_error (NULL,
> > > > > "unhandled conversion");
> > > > > Antoni> +  fprintf (stderr, "input expression:\n");
> > > > > Antoni> +  debug_tree (expr);
> > > > > Antoni> +  fprintf (stderr, "requested type:\n");
> > > > > Antoni> +  debug_tree (dst_type);
> > > > > Antoni> +  return error_mark_node;
> > > > > Antoni> +
> > > > > Antoni> +    maybe_fold:
> > > > > Antoni> +  if (TREE_CODE (t_ret) != C_MAYBE_CONST_EXPR)
> > > 
> > > > Do we even get C_MAYBE_CONST_EXPR in libgccjit?  That tree code is
> > > > defined in c-family/c-common.def; how can nodes of that kind be created
> > > > outside of the c-family?
> > 
> > I am not sure, but that seems like it's only created in c-family
> > indeed.
> > However, we do use it in libgccjit here:
> > 
> > https://github.com/gcc-mirror/gcc/blob/master/gcc/jit/jit-playback.c#L1180
> > 
> > > 
> > > > > Antoni> +   t_ret = fold (t_ret);
> > > > > Antoni> +  return t_ret;
> > > > > 
> > > > > > It seems weird to have a single 'goto' to maybe_fold, especially inside
> > > > > > a switch like this.
> > > > > > 
> > > > > > If you think the maybe_fold code won't be reused, then it should just be
> > > > > > hoisted up and the 'goto' removed.
> > > > 
> > > > > This actually depends on how the support for cast between integers and
> > > > > pointers will be implemented (see below).
> > > > > If we will support truncating pointers (does that even make sense? and I
> > > > > guess we cannot extend a pointer unless we add the support for
> > > > > uint128_t), that label will be reused for that case.
> > > > > Otherwise, it might not be reused.
> > > > > 
> > > > > So, please tell me which option to choose and I'll update my patch.
> > > 
> > > FWIW I don't think we'll want to support truncating or extending
> > > pointers.
> > 
> > Ok, but do you think we'll want to support casts between integers and
> > pointers?
> 
> Yes, though we probably want to reject truncating a pointer into a
> smaller integer type.
> 
> > I opened an issue about this
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95438) and would be
> > willing to do a patch for it eventually.
> > 
> > > > 
> > > > > > On the other hand, if the maybe_fold code might be reused for some other
> > > > > > case, then I suppose I would have the case end with 'break' and then
> > > > > > have this code outside the switch.
> > > > > 
> > > > > 
> > > > > In another message, you wrote:
> > > > > 
> > > > > Antoni> For 

[PATCH 7/8] Adjust fur_source internal api to use gori_compute not ranger_cache.

2021-05-25 Thread Andrew MacLeod via Gcc-patches
I introduced fur_source a week or so ago to act as the source for 
operands of fold_stmt, i.e., it encapsulates a range_query source used to get 
operands for the arguments of a stmt when fold_stmt is invoked.


One of the API points was an internal ranger version which contains a 
reference to a ranger_cache for various dependency set-up/query 
reasons.  That functionality has been relocated to gori_compute now, so change 
that class to use a gori_compute reference instead of a ranger_cache 
object.


Again, no functional change.  Bootstraps on x86_64-pc-linux-gnu with no 
regressions.  Pushed.


Andrew
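
As a rough picture of the shape of this change, here is a minimal sketch of a source object that optionally forwards dependency registrations to a dependency tracker. It is a toy model, not the gimple-range code: ToyGori, ToySource and note_dependency are hypothetical names, and plain ints stand in for SSA names.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the object that records dependencies.
struct ToyGori {
  std::vector<std::pair<int, int>> deps;  // (lhs, operand) pairs
  void register_dependency(int lhs, int op) { deps.emplace_back(lhs, op); }
};

// Hypothetical stand-in for the operand source.  The tracker pointer is
// optional: sources built without one simply skip registration.
class ToySource {
public:
  explicit ToySource(ToyGori* g = nullptr) : m_gori(g) {}

  // Register LHS as depending on OP, if a tracker is available.
  void note_dependency(int lhs, int op) const {
    if (m_gori) m_gori->register_dependency(lhs, op);
  }

private:
  ToyGori* m_gori;
};
```

Callers only test the pointer for null before registering, so swapping the pointed-to type leaves their logic unchanged.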


>From f630797a1ed2f82faf965a47b43b5f995bc6623a Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 25 May 2021 14:55:04 -0400
Subject: [PATCH 7/8] Adjust fur_source internal api to use gori_compute not
 ranger_cache.

In order to access the dependencies, the FoldUsingRange source API class
stored a ranger_cache.  This is now contained in the base gori_compute class,
so use that now.

	* gimple-range.cc (fold_using_range::range_of_range_op): Use m_gori
	instead of m_cache.
	(fold_using_range::range_of_address): Adjust.
	(fold_using_range::range_of_phi): Adjust.
	* gimple-range.h (class fur_source): Adjust.
	(fur_source::fur_source): Adjust.
---
 gcc/gimple-range.cc | 18 +-
 gcc/gimple-range.h  | 12 ++--
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 593ddb1c3f8..e2d24d6e451 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -435,17 +435,17 @@ fold_using_range::range_of_range_op (irange &r, gimple *s, fur_source &src)
 	  // Fold range, and register any dependency if available.
 	  int_range<2> r2 (type);
 	  handler->fold_range (r, type, range1, r2);
-	  if (lhs && src.m_cache)
-	src.m_cache->register_dependency (lhs, op1);
+	  if (lhs && src.m_gori)
+	src.m_gori->register_dependency (lhs, op1);
 	}
   else if (src.get_operand (range2, op2))
 	{
 	  // Fold range, and register any dependency if available.
 	  handler->fold_range (r, type, range1, range2);
-	  if (lhs && src.m_cache)
+	  if (lhs && src.m_gori)
 	{
-	  src.m_cache->register_dependency (lhs, op1);
-	  src.m_cache->register_dependency (lhs, op2);
+	  src.m_gori->register_dependency (lhs, op1);
+	  src.m_gori->register_dependency (lhs, op2);
 	}
 	}
   else
@@ -485,8 +485,8 @@ fold_using_range::range_of_address (irange &r, gimple *stmt, fur_source &src)
 {
   tree ssa = TREE_OPERAND (base, 0);
   tree lhs = gimple_get_lhs (stmt);
-  if (src.m_cache && lhs && gimple_range_ssa_p (ssa))
-	src.m_cache->register_dependency (lhs, ssa);
+  if (src.m_gori && lhs && gimple_range_ssa_p (ssa))
+	src.m_gori->register_dependency (lhs, ssa);
   gcc_checking_assert (irange::supports_type_p (TREE_TYPE (ssa)));
   src.get_operand (r, ssa);
   range_cast (r, TREE_TYPE (gimple_assign_rhs1 (stmt)));
@@ -563,8 +563,8 @@ fold_using_range::range_of_phi (irange &r, gphi *phi, fur_source &src)
   edge e = gimple_phi_arg_edge (phi, x);
 
   // Register potential dependencies for stale value tracking.
-  if (src.m_cache && gimple_range_ssa_p (arg))
-	src.m_cache->register_dependency (phi_def, arg);
+  if (src.m_gori && gimple_range_ssa_p (arg))
+	src.m_gori->register_dependency (phi_def, arg);
 
   // Get the range of the argument on its edge.
   fur_source e_src (src.m_query, e);
diff --git a/gcc/gimple-range.h b/gcc/gimple-range.h
index 08035a53238..707dcfe027b 100644
--- a/gcc/gimple-range.h
+++ b/gcc/gimple-range.h
@@ -84,10 +84,10 @@ class fur_source
 public:
   inline fur_source (range_query *q, edge e);
   inline fur_source (range_query *q, gimple *s);
-  inline fur_source (range_query *q, class ranger_cache *g, edge e, gimple *s);
+  inline fur_source (range_query *q, gori_compute *g, edge e, gimple *s);
   bool get_operand (irange &r, tree expr);
 protected:
-  ranger_cache *m_cache;
+  gori_compute *m_gori;
   range_query *m_query;
   edge m_edge;
   gimple *m_stmt;
@@ -124,7 +124,7 @@ inline
 fur_source::fur_source (range_query *q, edge e)
 {
   m_query = q;
-  m_cache = NULL;
+  m_gori = NULL;
   m_edge = e;
   m_stmt = NULL;
 }
@@ -135,7 +135,7 @@ inline
 fur_source::fur_source (range_query *q, gimple *s)
 {
   m_query = q;
-  m_cache = NULL;
+  m_gori = NULL;
   m_edge = NULL;
   m_stmt = s;
 }
@@ -144,10 +144,10 @@ fur_source::fur_source (range_query *q, gimple *s)
 // and can also set the dependency information as appropriate when invoked.
 
 inline
-fur_source::fur_source (range_query *q, ranger_cache *g, edge e, gimple *s)
+fur_source::fur_source (range_query *q, gori_compute *g, edge e, gimple *s)
 {
   m_query = q;
-  m_cache = g;
+  m_gori = g;
   m_edge = e;
   m_stmt = s;
 }
-- 
2.17.2



[PATCH 8/8] Remove the logical stmt cache for now.

2021-05-25 Thread Andrew MacLeod via Gcc-patches
We added the logical stmt depth limit back in January, I think, and the 
logical stmt cache is not currently in use.  This patch removes that 
code so it doesn't have to be maintained through these changes.


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew

>From a6e94287d31525b3ad0963ad22a92e9f3dbcd3cf Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 25 May 2021 14:59:54 -0400
Subject: [PATCH 8/8] Remove the logical stmt cache for now.

With the depth limiting, we are not currently using the logical stmt cache.

	* gimple-range-gori.cc (class logical_stmt_cache): Delete
	(logical_stmt_cache::logical_stmt_cache ): Delete.
	(logical_stmt_cache::~logical_stmt_cache): Delete.
	(logical_stmt_cache::cache_entry::dump): Delete.
	(logical_stmt_cache::get_range): Delete.
	(logical_stmt_cache::cached_name ): Delete.
	(logical_stmt_cache::same_cached_name): Delete.
	(logical_stmt_cache::cacheable_p): Delete.
	(logical_stmt_cache::slot_diagnostics ): Delete.
	(logical_stmt_cache::dump): Delete.
	(gori_compute_cache::gori_compute_cache): Delete.
	(gori_compute_cache::~gori_compute_cache): Delete.
	(gori_compute_cache::compute_operand_range): Delete.
	(gori_compute_cache::cache_stmt): Delete.
	* gimple-range-gori.h (gori_compute::compute_operand_range): Remove
	virtual.
	(class gori_compute_cache): Delete.
---
 gcc/gimple-range-gori.cc | 311 ---
 gcc/gimple-range-gori.h  |  31 +---
 2 files changed, 2 insertions(+), 340 deletions(-)

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 1a4ae45c986..a4c4bf507ba 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -1160,314 +1160,3 @@ gori_export_iterator::get_name ()
 }
   return NULL_TREE;
 }
-
-
-// --
-
-// Cache for SSAs that appear on the RHS of a boolean assignment.
-//
-// Boolean assignments of logical expressions (i.e. LHS = j_5 > 999)
-// have SSA operands whose range depend on the LHS of the assigment.
-// That is, the range of j_5 when LHS is true is different than when
-// LHS is false.
-//
-// This class caches the TRUE/FALSE ranges of such SSAs to avoid
-// recomputing.
-
-class logical_stmt_cache
-{
-public:
-  logical_stmt_cache ();
-  ~logical_stmt_cache ();
-  void set_range (tree lhs, tree name, const tf_range &);
-  bool get_range (tf_range &r, tree lhs, tree name) const;
-  bool cacheable_p (gimple *, const irange *lhs_range = NULL) const;
-  void dump (FILE *, gimple *stmt) const;
-  tree same_cached_name (tree lhs1, tree lh2) const;
-private:
-  tree cached_name (tree lhs) const;
-  void slot_diagnostics (tree lhs, const tf_range &range) const;
-  struct cache_entry
-  {
-cache_entry (tree name, const irange &t_range, const irange &f_range);
-void dump (FILE *out) const;
-tree name;
-tf_range range;
-  };
-  vec<cache_entry *> m_ssa_cache;
-};
-
-logical_stmt_cache::cache_entry::cache_entry (tree name,
-	  const irange &t_range,
-	  const irange &f_range)
-  : name (name), range (t_range, f_range)
-{
-}
-
-logical_stmt_cache::logical_stmt_cache ()
-{
-  m_ssa_cache.create (num_ssa_names + num_ssa_names / 10);
-  m_ssa_cache.safe_grow_cleared (num_ssa_names);
-}
-
-logical_stmt_cache::~logical_stmt_cache ()
-{
-  for (unsigned i = 0; i < m_ssa_cache.length (); ++i)
-if (m_ssa_cache[i])
-  delete m_ssa_cache[i];
-  m_ssa_cache.release ();
-}
-
-// Dump cache_entry to OUT.
-
-void
-logical_stmt_cache::cache_entry::dump (FILE *out) const
-{
-  fprintf (out, "name=");
-  print_generic_expr (out, name, TDF_SLIM);
-  fprintf (out, " ");
-  range.true_range.dump (out);
-  fprintf (out, ", ");
-  range.false_range.dump (out);
-  fprintf (out, "\n");
-}
-
-// Update range for cache entry of NAME as it appears in the defining
-// statement of LHS.
-
-void
-logical_stmt_cache::set_range (tree lhs, tree name, const tf_range &range)
-{
-  unsigned version = SSA_NAME_VERSION (lhs);
-  if (version >= m_ssa_cache.length ())
-m_ssa_cache.safe_grow_cleared (num_ssa_names + num_ssa_names / 10);
-
-  cache_entry *slot = m_ssa_cache[version];
-  slot_diagnostics (lhs, range);
-  if (slot)
-{
-  // The IL must have changed.  Update the carried SSA name for
-  // consistency.  Testcase is libgomp.fortran/doacross1.f90.
-  if (slot->name != name)
-	slot->name = name;
-  return;
-}
-  m_ssa_cache[version]
-= new cache_entry (name, range.true_range, range.false_range);
-}
-
-// If there is a cached entry of NAME, set it in R and return TRUE,
-// otherwise return FALSE.  LHS is the defining statement where NAME
-// appeared.
-
-bool
-logical_stmt_cache::get_range (tf_range &r, tree lhs, tree name) const
-{
-  gcc_checking_assert (cacheable_p (SSA_NAME_DEF_STMT (lhs)));
-  if (cached_name (lhs) == name)
-{
-  unsigned version = SSA_NAME_VERSION (lhs);
-  if (m_ssa_cache[version])
-	{
-	  r = m_ssa_cache[version]->range;
-	  return true;
-	}
-}
-  

[PATCH 6/8] Make expr_range_in_bb stmt based rather than block based.

2021-05-25 Thread Andrew MacLeod via Gcc-patches
The first step in moving gori_compute to range_query is to standardize its 
queries to be stmt based rather than block based.  Since gori_compute works 
from the bottom of the block back towards the top, it always has a 
stmt available for a range_of_expr query, so there is no need to use a 
non-standard entry range query.


No functional changes.

Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew

>From 2bccd9154e127909a4cdff5c19904a6562fcd0ff Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 25 May 2021 14:49:40 -0400
Subject: [PATCH 6/8] Make expr_range_in_bb stmt based rather than block based.

As a prerequisite to moving to a range_query model, make it stmt based.

	* gimple-range-gori.cc (gori_compute::expr_range_at_stmt): Rename
	from expr_range_in_bb and adjust.
	(gori_compute::compute_name_range_op): Adjust.
	(gori_compute::optimize_logical_operands): Adjust.
	(gori_compute::compute_logical_operands_in_chain): Adjust.
	(gori_compute::compute_operand1_range): Adjust.
	(gori_compute::compute_operand2_range): Adjust.
	(gori_compute_cache::cache_stmt): Adjust.
	* gimple-range-gori.h (gori_compute): Rename prototype.
---
 gcc/gimple-range-gori.cc | 36 ++--
 gcc/gimple-range-gori.h  |  2 +-
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 94640adc041..1a4ae45c986 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -582,10 +582,10 @@ gori_compute::ssa_range_in_bb (irange &r, tree name, basic_block)
 }
 
 void
-gori_compute::expr_range_in_bb (irange &r, tree expr, basic_block bb)
+gori_compute::expr_range_at_stmt (irange &r, tree expr, gimple *s)
 {
   if (gimple_range_ssa_p (expr))
-ssa_range_in_bb (r, expr, bb);
+ssa_range_in_bb (r, expr, gimple_bb (s));
   else
 get_tree_range (r, expr);
 }
@@ -606,7 +606,7 @@ gori_compute::compute_name_range_op (irange &r, gimple *stmt,
   // Operand 1 is the name being looked for, evaluate it.
   if (op1 == name)
 {
-  expr_range_in_bb (op1_range, op1, gimple_bb (stmt));
+  expr_range_at_stmt (op1_range, op1, stmt);
   if (!op2)
 	{
 	  // The second parameter to a unary operation is the range
@@ -616,7 +616,7 @@ gori_compute::compute_name_range_op (irange &r, gimple *stmt,
 	  return gimple_range_calc_op1 (r, stmt, lhs, op1_range);
 	}
   // If we need the second operand, get a value and evaluate.
-  expr_range_in_bb (op2_range, op2, gimple_bb (stmt));
+  expr_range_at_stmt (op2_range, op2, stmt);
   if (gimple_range_calc_op1 (r, stmt, lhs, op2_range))
 	r.intersect (op1_range);
   else
@@ -626,8 +626,8 @@ gori_compute::compute_name_range_op (irange &r, gimple *stmt,
 
   if (op2 == name)
 {
-  expr_range_in_bb (op1_range, op1, gimple_bb (stmt));
-  expr_range_in_bb (r, op2, gimple_bb (stmt));
+  expr_range_at_stmt (op1_range, op1, stmt);
+  expr_range_at_stmt (r, op2, stmt);
   if (gimple_range_calc_op2 (op2_range, stmt, lhs, op1_range))
 r.intersect (op2_range);
   return true;
@@ -877,7 +877,7 @@ gori_compute::optimize_logical_operands (tf_range &range,
 {
   if (!compute_operand_range (range.false_range, SSA_NAME_DEF_STMT (op),
   m_bool_zero, name))
-	expr_range_in_bb (range.false_range, name, gimple_bb (stmt));
+	expr_range_at_stmt (range.false_range, name, stmt);
   range.true_range = range.false_range;
   return true;
 }
@@ -886,7 +886,7 @@ gori_compute::optimize_logical_operands (tf_range &range,
 {
   if (!compute_operand_range (range.true_range, SSA_NAME_DEF_STMT (op),
   m_bool_one, name))
-	expr_range_in_bb (range.true_range, name, gimple_bb (stmt));
+	expr_range_at_stmt (range.true_range, name, stmt);
   range.false_range = range.true_range;
   return true;
 }
@@ -905,12 +905,12 @@ gori_compute::compute_logical_operands_in_chain (tf_range &range,
 		 tree op, bool op_in_chain)
 {
   gimple *src_stmt = gimple_range_ssa_p (op) ? SSA_NAME_DEF_STMT (op) : NULL;
-  basic_block bb = gimple_bb (stmt);
-  if (!op_in_chain || (src_stmt != NULL && bb != gimple_bb (src_stmt)))
+  if (!op_in_chain || (src_stmt != NULL
+  && gimple_bb (stmt) != gimple_bb (src_stmt)))
 {
   // If op is not in the def chain, or defined in this block,
   // use its known value on entry to the block.
-  expr_range_in_bb (range.true_range, name, gimple_bb (stmt));
+  expr_range_at_stmt (range.true_range, name, stmt);
   range.false_range = range.true_range;
   return;
 }
@@ -920,9 +920,9 @@ gori_compute::compute_logical_operands_in_chain (tf_range &range,
   // Calculate ranges for true and false on both sides, since the false
   // path is not always a simple inversion of the true side.
   if (!compute_operand_range (range.true_range, src_stmt, m_bool_one, name))
-expr_range_in_bb (range.true_range, name, bb);
+expr_range_at_stmt (range.true_range, name, stmt);
   if (!compute_operand_range 

[PATCH 5/8] Tweak location of non-null calls. revamp ranger debug output.

2021-05-25 Thread Andrew MacLeod via Gcc-patches
Just some minor tweaking of the location of calls to non_null_deref_p, 
as well as debug output.


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew

>From 35c78c6fc54721e067ed3a30ddd9184b45c5981d Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 25 May 2021 14:41:16 -0400
Subject: [PATCH 5/8] Tweak location of non-null calls. revamp ranger debug
 output.

range_on_entry shouldn't be checking non-null, but callers sometimes should
check it after calling range_on_entry.
Change the debug output a bit.

	* gimple-range.cc (gimple_ranger::range_of_expr): Non-null should be
	checked only after range_of_stmt, not range_on_entry.
	(gimple_ranger::range_on_entry): Check for non-null in any
	predecessor block, if it is not already non-null.
	(gimple_ranger::range_on_exit): Don't check for non-null after
	range on entry call.
	(gimple_ranger::dump_bb): New.  Split from dump.
	(gimple_ranger::dump): Adjust.
	* gimple-range.h (class gimple_ranger): Adjust.
---
 gcc/gimple-range.cc | 149 ++--
 gcc/gimple-range.h  |   1 +
 2 files changed, 74 insertions(+), 76 deletions(-)

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 06e9804494b..593ddb1c3f8 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -976,23 +976,16 @@ gimple_ranger::range_of_expr (irange &r, tree expr, gimple *stmt)
 
   // If name is defined in this block, try to get an range from S.
   if (def_stmt && gimple_bb (def_stmt) == bb)
-range_of_stmt (r, def_stmt, expr);
+{
+  range_of_stmt (r, def_stmt, expr);
+  if (!cfun->can_throw_non_call_exceptions && r.varying_p () &&
+	  m_cache.m_non_null.non_null_deref_p (expr, bb))
+	r = range_nonzero (TREE_TYPE (expr));
+}
   else
 // Otherwise OP comes from outside this block, use range on entry.
 range_on_entry (r, bb, expr);
 
-  // No range yet, see if there is a dereference in the block.
-  // We don't care if it's between the def and a use within a block
-  // because the entire block must be executed anyway.
-  // FIXME:?? For non-call exceptions we could have a statement throw
-  // which causes an early block exit.
-  // in which case we may need to walk from S back to the def/top of block
-  // to make sure the deref happens between S and there before claiming
-  // there is a deref.   Punt for now.
-  if (!cfun->can_throw_non_call_exceptions && r.varying_p () &&
-  m_cache.m_non_null.non_null_deref_p (expr, bb))
-r = range_nonzero (TREE_TYPE (expr));
-
   return true;
 }
 
@@ -1010,6 +1003,10 @@ gimple_ranger::range_on_entry (irange &r, basic_block bb, tree name)
   // Now see if there is any on_entry value which may refine it.
   if (m_cache.block_range (entry_range, bb, name))
 r.intersect (entry_range);
+
+  if (!cfun->can_throw_non_call_exceptions && r.varying_p () &&
+  m_cache.m_non_null.non_null_deref_p (name, bb))
+r = range_nonzero (TREE_TYPE (name));
 }
 
 // Calculate the range for NAME at the end of block BB and return it in R.
@@ -1032,13 +1029,7 @@ gimple_ranger::range_on_exit (irange &r, basic_block bb, tree name)
   if (s)
 range_of_expr (r, name, s);
   else
-{
-  range_on_entry (r, bb, name);
-  // See if there was a deref in this block, if applicable
-  if (!cfun->can_throw_non_call_exceptions && r.varying_p () &&
-	  m_cache.m_non_null.non_null_deref_p (name, bb))
-	r = range_nonzero (TREE_TYPE (name));
-}
+range_on_entry (r, bb, name);
   gcc_checking_assert (r.undefined_p ()
 		   || range_compatible_p (r.type (), TREE_TYPE (name)));
 }
@@ -1166,80 +1157,86 @@ gimple_ranger::export_global_ranges ()
 // Print the known table values to file F.
 
 void
-gimple_ranger::dump (FILE *f)
+gimple_ranger::dump_bb (FILE *f, basic_block bb)
 {
-  basic_block bb;
-
-  FOR_EACH_BB_FN (bb, cfun)
-{
-  unsigned x;
-  edge_iterator ei;
-  edge e;
-  int_range_max range;
-  fprintf (f, "\n=== BB %d \n", bb->index);
-  m_cache.dump (f, bb);
+  unsigned x;
+  edge_iterator ei;
+  edge e;
+  int_range_max range;
+  fprintf (f, "\n=== BB %d \n", bb->index);
+  m_cache.dump (f, bb);
 
-  dump_bb (f, bb, 4, TDF_NONE);
+  ::dump_bb (f, bb, 4, TDF_NONE);
 
-  // Now find any globals defined in this block.
-  for (x = 1; x < num_ssa_names; x++)
+  // Now find any globals defined in this block.
+  for (x = 1; x < num_ssa_names; x++)
+{
+  tree name = ssa_name (x);
+  if (gimple_range_ssa_p (name) && SSA_NAME_DEF_STMT (name) &&
+	  gimple_bb (SSA_NAME_DEF_STMT (name)) == bb &&
+	  m_cache.get_global_range (range, name))
 	{
-	  tree name = ssa_name (x);
-	  if (gimple_range_ssa_p (name) && SSA_NAME_DEF_STMT (name) &&
-	  gimple_bb (SSA_NAME_DEF_STMT (name)) == bb &&
-	  m_cache.get_global_range (range, name))
+	  if (!range.varying_p ())
 	{
-	  if (!range.varying_p ())
-	   {
-		 print_generic_expr (f, name, TDF_SLIM);
-		 fprintf (f, " : ");
-		 range.dump (f);
-	

[PATCH 4/8] Unify temporal cache with gori dependencies.

2021-05-25 Thread Andrew MacLeod via Gcc-patches
This patch removes the custom temporal cache from GCC 11 and replaces it 
with a simple timestamp vector that uses the direct dependents from 
gori_compute.


This allows the registration of said dependencies to be handled 
generically by the simplified fold_stmt class through the gori_compute 
class, so all the dependency analysis is now handled centrally in one 
place.


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew
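
To illustrate the idea of a timestamp-only cache whose dependencies are supplied by the caller, here is a minimal sketch. It is a simplified model and not the gimple-range-cache.cc implementation: ToyTemporalCache is a hypothetical class, plain unsigned indices stand in for SSA names, index 0 is assumed to mean "no dependency", and a timestamp of 0 is assumed to mean "always current".

```cpp
#include <cstddef>
#include <vector>

class ToyTemporalCache {
public:
  // Stamp NAME with a fresh time; later stamps are always larger.
  void set_timestamp(unsigned name) {
    grow(name);
    m_timestamp[name] = ++m_current_time;
  }

  // Mark NAME so that it is never considered stale.
  void set_always_current(unsigned name) {
    grow(name);
    m_timestamp[name] = 0;
  }

  // Return true if NAME is not older than either dependency.  The
  // dependencies are passed in by the caller rather than stored here.
  bool current_p(unsigned name, unsigned dep1, unsigned dep2) const {
    unsigned ts = temporal_value(name);
    if (ts == 0) return true;
    if (dep1 != 0 && ts < temporal_value(dep1)) return false;
    if (dep2 != 0 && ts < temporal_value(dep2)) return false;
    return true;
  }

private:
  // Names that were never stamped report time 0.
  unsigned temporal_value(unsigned ssa) const {
    return ssa < m_timestamp.size() ? m_timestamp[ssa] : 0;
  }

  void grow(unsigned ssa) {
    if (ssa >= m_timestamp.size()) m_timestamp.resize(ssa + 1, 0);
  }

  unsigned m_current_time = 0;
  std::vector<unsigned> m_timestamp;
};
```

In this model a value becomes stale as soon as one of the dependencies passed to current_p has been recalculated after it, which is the behaviour the commit message describes.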


>From 10b286ce335cca135a45a92581b28146f3e3209b Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 25 May 2021 14:34:06 -0400
Subject: [PATCH 4/8] Unify temporal cache with gori dependencies.

Move the temporal cache to strictly be a timestamp, and query GORI for
the dependencies rather than trying to register and maintain them.

	* gimple-range-cache.cc (struct range_timestamp): Delete.
	(class temporal_cache): Adjust.
	(temporal_cache::get_timestamp): Delete.
	(temporal_cache::set_dependency): Delete.
	(temporal_cache::temporal_value): Adjust.
	(temporal_cache::current_p): Take dependencies as params.
	(temporal_cache::set_timestamp): Adjust.
	(temporal_cache::set_always_current): Adjust.
	(ranger_cache::get_non_stale_global_range): Adjust.
	(ranger_cache::register_dependency): Delete.
	* gimple-range-cache.h (class range_cache): Adjust.
---
 gcc/gimple-range-cache.cc | 116 +++---
 gcc/gimple-range-cache.h  |   1 -
 2 files changed, 32 insertions(+), 85 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 8ad76048272..3969c4de220 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -474,43 +474,28 @@ ssa_global_cache::dump (FILE *f)
 // --
 
 
-// This struct provides a timestamp for a global range calculation.
-// it contains the time counter, as well as a limited number of ssa-names
-// that it is dependent upon.  If the timestamp for any of the dependent names
-// Are newer, then this range could need updating.
-
-struct range_timestamp
-{
-  unsigned time;
-  unsigned ssa1;
-  unsigned ssa2;
-};
-
 // This class will manage the timestamps for each ssa_name.
-// When a value is calcualted, its timestamp is set to the current time.
-// The ssanames it is dependent on have already been calculated, so they will
-// have older times.  If one fo those values is ever calculated again, it
-// will get a newer timestamp, and the "current_p" check will fail.
+// When a value is calculated, the timestamp is set to the current time.
+// Current time is then incremented.  Any dependencies will already have
+// been calculated, and will thus have older timestamps.
+// If one of those values is ever calculated again, it will get a newer
+// timestamp, and the "current_p" check will fail.
 
 class temporal_cache
 {
 public:
   temporal_cache ();
   ~temporal_cache ();
-  bool current_p (tree name) const;
+  bool current_p (tree name, tree dep1, tree dep2) const;
   void set_timestamp (tree name);
-  void set_dependency (tree name, tree dep);
   void set_always_current (tree name);
 private:
   unsigned temporal_value (unsigned ssa) const;
-  const range_timestamp *get_timestamp (unsigned ssa) const;
-  range_timestamp *get_timestamp (unsigned ssa);
 
   unsigned m_current_time;
-  vec <range_timestamp> m_timestamp;
+  vec <unsigned> m_timestamp;
 };
 
-
 inline
 temporal_cache::temporal_cache ()
 {
@@ -525,65 +510,35 @@ temporal_cache::~temporal_cache ()
   m_timestamp.release ();
 }
 
-// Return a pointer to the timetamp for ssa-name at index SSA, if there is
-// one, otherwise return NULL.
-
-inline const range_timestamp *
-temporal_cache::get_timestamp (unsigned ssa) const
-{
-  if (ssa >= m_timestamp.length ())
-return NULL;
-  return &(m_timestamp[ssa]);
-}
-
-// Return a reference to the timetamp for ssa-name at index SSA.  If the index
-// is past the end of the vector, extend the vector.
-
-inline range_timestamp *
-temporal_cache::get_timestamp (unsigned ssa)
-{
-  if (ssa >= m_timestamp.length ())
-m_timestamp.safe_grow_cleared (num_ssa_names + 20);
-  return &(m_timestamp[ssa]);
-}
-
-// This routine will fill NAME's next operand slot with DEP if DEP is a valid
-// SSA_NAME and there is a free slot.
-
-inline void
-temporal_cache::set_dependency (tree name, tree dep)
-{
-  if (dep && TREE_CODE (dep) == SSA_NAME)
-{
-  gcc_checking_assert (get_timestamp (SSA_NAME_VERSION (name)));
-  range_timestamp& ts = *(get_timestamp (SSA_NAME_VERSION (name)));
-  if (!ts.ssa1)
-	ts.ssa1 = SSA_NAME_VERSION (dep);
-  else if (!ts.ssa2 && ts.ssa1 != SSA_NAME_VERSION (name))
-	ts.ssa2 = SSA_NAME_VERSION (dep);
-}
-}
-
 // Return the timestamp value for SSA, or 0 if there isnt one.
+
 inline unsigned
 temporal_cache::temporal_value (unsigned ssa) const
 {
-  const range_timestamp *ts = get_timestamp (ssa);
-  return ts ? ts->time : 0;
+  if (ssa >= m_timestamp.length ())
+return 0;
+  return m_timestamp[ssa];
 

[PATCH 3/8] Add imports and strengthen the export definition to range_def and gori_map.

2021-05-25 Thread Andrew MacLeod via Gcc-patches
This patch introduces imports to range_def, and corrects some minor 
issues with export calculation.


When gori-compute is evaluating sequences in a block:

  - an "export" is defined as any ssa_name which may have a range 
calculated on an outgoing edge.  If the edge may CHANGE the value of 
the ssa_name from its on-entry/exit value, then it is an export.


  - an "import" is any ssa_name which may affect the outgoing 
value of an export.  This may be the ssa_name itself, or any ssa_name 
used in the calculation of the ssa_name value.


   bb5:
   b_6 = a_4 + 5
   c_5 = b_6 < z_9
   if (c_5 != 0)

In this small sample c_5, b_6, z_9 and a_4 are all exports, because we 
may be able to refine a range for any of them on one of the edges.


a_4 and z_9 are considered imports because those are the ssa_names which 
have definitions occurring outside the block which can affect the value 
of any of the exports.


This patch introduces imports to both range_def and gori_map, and 
standardizes the dependency chain registration API so we can also 
cache up to 2 direct dependent names and eventually utilize that for 
re-computation and the temporal cache.
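
To make the export/import definitions concrete, here is a small standalone 
sketch that recomputes both sets for the bb5 sample.  It is only a toy 
model: toy_block, toy_gori_info and analyze are hypothetical names, names 
are strings, and it assumes every statement in the block is one the range 
machinery can look through, which the real code checks.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy basic block: which names are defined here and what they use,
// plus the names used by the condition that ends the block.
struct toy_block
{
  std::map<std::string, std::vector<std::string>> defs;
  std::vector<std::string> cond_uses;
};

struct toy_gori_info
{
  std::set<std::string> exports;
  std::set<std::string> imports;
};

// Walk the definition chain backwards from the final condition.
// Every name reached is an export; a reached name with no definition
// in the block is also an import.
toy_gori_info
analyze (const toy_block &bb)
{
  toy_gori_info info;
  std::vector<std::string> worklist (bb.cond_uses);
  while (!worklist.empty ())
    {
      std::string name = worklist.back ();
      worklist.pop_back ();
      if (!info.exports.insert (name).second)
	continue;	// Already visited.
      auto def = bb.defs.find (name);
      if (def == bb.defs.end ())
	info.imports.insert (name);
      else
	for (const std::string &use : def->second)
	  worklist.push_back (use);
    }
  return info;
}

// The bb5 sample: b_6 = a_4 + 5; c_5 = b_6 < z_9; if (c_5 != 0).
// The constant 5 is not a name, so it is left out.
toy_block
make_bb5 ()
{
  toy_block bb;
  bb.defs["b_6"] = { "a_4" };
  bb.defs["c_5"] = { "b_6", "z_9" };
  bb.cond_uses = { "c_5" };
  return bb;
}
```

Running analyze on make_bb5 () gives the four exports and two imports listed 
in the description.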


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew

>From c21644704160710a17d1ea6c1cd212e079cd5e36 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 25 May 2021 14:15:50 -0400
Subject: [PATCH 3/8] Add imports and strengthen the export definition in
 range_def and gori_map.

Also add up to 2 direct dependencies for each ssa-name.
Add gori import/export iterators.

	* gimple-range-gori.cc (range_def_chain::range_def_chain): init
	bitmap obstack.
	(range_def_chain::~range_def_chain): Dispose of obstack rather than
	each individual bitmap.
	(range_def_chain::set_import): New.
	(range_def_chain::get_imports): New.
	(range_def_chain::chain_import_p): New.
	(range_def_chain::register_dependency): Rename from build_def_chain
	and set imports.
	(range_def_chain::def_chain_in_bitmap_p): New.
	(range_def_chain::add_def_chain_to_bitmap): New.
	(range_def_chain::has_def_chain): Just check first dependence.
	(range_def_chain::get_def_chain): Process imports, use generic
	register_dependency routine.
	(range_def_chain::dump): New.
	(gori_map::gori_map): Allocate import list.
	(gori_map::~gori_map): Release imports.
	(gori_map::exports): Check for past allocated block size.
	(gori_map::imports): New.
	(gori_map::def_chain_in_export_p): Delete.
	(gori_map::is_import_p): New.
	(gori_map::maybe_add_gori): Handle imports.
	(gori_map::dump): Adjust output, add imports.
	(gori_compute::has_edge_range_p): Remove def_chain_in_export call.
	(gori_export_iterator::gori_export_iterator): New.
	(gori_export_iterator::next): New.
	(gori_export_iterator::get_name): New.
	* gimple-range-gori.h (range_def_chain): Add imports and direct
	dependencies via struct rdc.
	(range_def_chain::depend1): New.
	(range_def_chain::depend2): New.
	(class gori_map): Adjust.
	(FOR_EACH_GORI_IMPORT_NAME): New.
	(FOR_EACH_GORI_EXPORT_NAME): New.
	(class gori_export_iterator): New.
---
 gcc/gimple-range-gori.cc | 356 ---
 gcc/gimple-range-gori.h  |  77 -
 2 files changed, 327 insertions(+), 106 deletions(-)

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index e30049edfbd..94640adc041 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -56,7 +56,7 @@ is_gimple_logical_p (const gimple *gs)
   return false;
 }
 
-/* RANGE_DEF_CHAIN is used to determine what SSA names in a block can
+/* RANGE_DEF_CHAIN is used to determine which SSA names in a block can
have range information calculated for them, and what the
dependencies on each other are.
 
@@ -95,6 +95,7 @@ is_gimple_logical_p (const gimple *gs)
 
 range_def_chain::range_def_chain ()
 {
+  bitmap_obstack_initialize (&m_bitmaps);
   m_def_chain.create (0);
   m_def_chain.safe_grow_cleared (num_ssa_names);
   m_logical_depth = 0;
@@ -104,11 +105,8 @@ range_def_chain::range_def_chain ()
 
 range_def_chain::~range_def_chain ()
 {
-  unsigned x;
-  for (x = 0; x < m_def_chain.length (); ++x)
-if (m_def_chain[x])
-  BITMAP_FREE (m_def_chain[x]);
   m_def_chain.release ();
+  bitmap_obstack_release (&m_bitmaps);
 }
 
 // Return true if NAME is in the def chain of DEF.  If BB is provided,
@@ -128,26 +126,112 @@ range_def_chain::in_chain_p (tree name, tree def)
   return bitmap_bit_p (chain, SSA_NAME_VERSION (name));
 }
 
+// Add either IMP or the import list B to the import set of DATA.
+
+void
+range_def_chain::set_import (struct rdc &data, tree imp, bitmap b)
+{
+  // If there are no imports, just return
+  if (imp == NULL_TREE && !b)
+return;
+  if (!data.m_import)
+data.m_import = BITMAP_ALLOC (&m_bitmaps);
+  if (imp != NULL_TREE)
+bitmap_set_bit (data.m_import, SSA_NAME_VERSION (imp));
+  else
+bitmap_ior_into (data.m_import, b);
+}
+
+// Return the import list for NAME.
+
+bitmap
+range_def_chain::get_imports (tree name)
+{
+  if 

[PATCH 2/8] fully populate the export list from range_cache, not gori_compute.

2021-05-25 Thread Andrew MacLeod via Gcc-patches

Ranger wants to prepopulate all the export blocks so that it has an initial
invariant set of names. GORI consumers shouldn't be penalized for ranger
requirements.  This way any gori client remains lightweight.

Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew

>From cb33af1a62b09576b0782ac36e5f5cff049f1035 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 25 May 2021 13:53:25 -0400
Subject: [PATCH 2/8] fully populate the export list from range_cache, not
 gori_compute.

Ranger wants to prepopulate all the export blocks so that it has an initial
invariant set of names. GORI consumers shouldn't be penalized for ranger
requirements.  This way any gori client remains lightweight.

	* gimple-range-cache.cc (ranger_cache::ranger_cache):  Move initial
	export cache filling to here.
	* gimple-range-gori.cc (gori_compute::gori_compute) : From Here.
---
 gcc/gimple-range-cache.cc | 10 ++
 gcc/gimple-range-gori.cc  | 10 --
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 2c922e32913..8ad76048272 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -618,6 +618,16 @@ ranger_cache::ranger_cache (gimple_ranger &q) : query (q)
   m_poor_value_list.safe_grow_cleared (20);
   m_poor_value_list.truncate (0);
   m_temporal = new temporal_cache;
+  unsigned x, lim = last_basic_block_for_fn (cfun);
+  // Calculate outgoing range info upfront.  This will fully populate the
+  // m_maybe_variant bitmap which will help eliminate processing of names
+  // which never have their ranges adjusted.
+  for (x = 0; x < lim ; x++)
+{
+  basic_block bb = BASIC_BLOCK_FOR_FN (cfun, x);
+  if (bb)
+	exports (bb);
+}
 }
 
 ranger_cache::~ranger_cache ()
diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 074c025be37..e30049edfbd 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -458,16 +458,6 @@ gori_compute::gori_compute ()
   // Create a boolean_type true and false range.
   m_bool_zero = int_range<2> (boolean_false_node, boolean_false_node);
   m_bool_one = int_range<2> (boolean_true_node, boolean_true_node);
-  unsigned x, lim = last_basic_block_for_fn (cfun);
-  // Calculate outgoing range info upfront.  This will fully populate the
-  // m_maybe_variant bitmap which will help eliminate processing of names
-  // which never have their ranges adjusted.
-  for (x = 0; x < lim ; x++)
-{
-  basic_block bb = BASIC_BLOCK_FOR_FN (cfun, x);
-  if (bb)
-	exports (bb);
-}
 }
 
 // Provide a default of VARYING for all incoming SSA names.
-- 
2.17.2



[PATCH 1/8] Change gori_compute to inherit from gori_map instead of having a gori-map.

2021-05-25 Thread Andrew MacLeod via Gcc-patches
This is the first in a set of, well, many patches.  It ultimately 
changes the gori-compute model into a consumer of the range_query class, 
which allows for much simpler and more consistent interaction with the 
fold_stmt class.  I'm basically evolving the whole code base to 
interact consistently with range_query.


The gori_compute class currently contains a gori_map class for all the 
dependency analysis and exports.  This patch changes it to inherit from 
a gori_map instead, which then exposes all the dependency and exports 
code to any client using a gori_compute.  The upcoming threader code 
makes extensive use of the dependency analysis, and exposing it also 
allows the temporal cache to stop maintaining its own (one of the 
follow-on patches).
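
The difference between the two shapes can be sketched in a few lines.  This 
is only an illustration with hypothetical toy_ classes and integer names, 
not the GCC declarations.

```cpp
#include <cassert>
#include <set>

// Toy dependency/export map.
class toy_gori_map
{
public:
  void add_export (int name) { m_exports.insert (name); }
  bool is_export_p (int name) const { return m_exports.count (name) != 0; }
private:
  std::set<int> m_exports;
};

// Old shape: the map is a private member, so a client holding the
// compute object cannot ask the map anything directly.
class toy_compute_with_member
{
public:
  toy_compute_with_member () { m_map.add_export (5); }
  bool has_edge_range_p (int name) const { return m_map.is_export_p (name); }
private:
  toy_gori_map m_map;
};

// New shape: the compute class is-a map, so every public map query is
// available to any client of the compute object.
class toy_compute_derived : public toy_gori_map
{
public:
  toy_compute_derived () { add_export (5); }
  bool has_edge_range_p (int name) const { return is_export_p (name); }
};
```

With the derived form a client can call is_export_p on the compute object 
itself, which is the kind of access the threader and the temporal cache are 
said to need.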


The range_def and gori_map class definitions are moved from 
gimple-range-gori.cc into the header file, and most of the change is 
adjustments to calling from the base class instead of invoking a method 
of the gori_map member.  Sometime in the next couple of weeks I'll write 
up exactly what range-def and gori_map provide, as the threader makes 
use of it, and I presume there are some other places it could be useful.


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew



>From 28ceee1b91f48b5ab09cbd20ea6a9de6ea137af8 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 25 May 2021 13:45:43 -0400
Subject: [PATCH 1/8] Change gori_compute to inherit from gori_map instead of
 having a gori-map.

Move the classes to the header file and inherit instead of instantiating.

	* gimple-range-gori.cc (range_def_chain): Move to gimple-range-gori.h.
	(gori_map): Move to gimple-range-gori.h.
	(gori_compute::gori_compute): Adjust.
	(gori_compute::~gori_compute): Delete.
	(gori_compute::compute_operand_range_switch): Adjust.
	(gori_compute::compute_operand_range): Adjust.
	(gori_compute::compute_logical_operands): Adjust.
	(gori_compute::has_edge_range_p ): Adjust.
	(gori_compute::set_range_invariant): Delete.
	(gori_compute::dump): Adjust.
	(gori_compute::outgoing_edge_range_p): Adjust.
	* gimple-range-gori.h (class range_def_chain): Relocate here.
	(class gori_map): Relocate here.
	(class gori_compute): Inherit from gori_map, and adjust.
---
 gcc/gimple-range-gori.cc | 76 ++--
 gcc/gimple-range-gori.h  | 48 ++---
 2 files changed, 55 insertions(+), 69 deletions(-)

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index 420282deb2d..074c025be37 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -91,21 +91,6 @@ is_gimple_logical_p (const gimple *gs)
 engine implements operations for.  */
 
 
-class range_def_chain
-{
-public:
-  range_def_chain ();
-  ~range_def_chain ();
-  bool has_def_chain (tree name);
-  bitmap get_def_chain (tree name);
-  bool in_chain_p (tree name, tree def);
-private:
-  vec<bitmap> m_def_chain;	// SSA_NAME : def chain components.
-  void build_def_chain (tree name, bitmap result, basic_block bb);
-  int m_logical_depth;
-};
-
-
 // Construct a range_def_chain.
 
 range_def_chain::range_def_chain ()
@@ -264,27 +249,6 @@ range_def_chain::get_def_chain (tree name)
entire def_chain of all SSA names used in the last statement of the
block which generate ranges.  */
 
-class gori_map : public range_def_chain
-{
-public:
-  gori_map ();
-  ~gori_map ();
-
-  bool is_export_p (tree name, basic_block bb = NULL);
-  bool def_chain_in_export_p (tree name, basic_block bb);
-  bitmap exports (basic_block bb);
-  void set_range_invariant (tree name);
-
-  void dump (FILE *f);
-  void dump (FILE *f, basic_block bb);
-private:
-  bitmap_obstack m_bitmaps;
-  vec<bitmap> m_outgoing;	// BB: Outgoing ranges calculatable on edges
-  bitmap m_maybe_variant;	// Names which might have outgoing ranges.
-  void maybe_add_gori (tree name, basic_block bb);
-  void calculate_gori (basic_block bb);
-};
-
 
 // Initialize a gori-map structure.
 
@@ -494,7 +458,6 @@ gori_compute::gori_compute ()
   // Create a boolean_type true and false range.
   m_bool_zero = int_range<2> (boolean_false_node, boolean_false_node);
   m_bool_one = int_range<2> (boolean_true_node, boolean_true_node);
-  m_gori_map = new gori_map;
   unsigned x, lim = last_basic_block_for_fn (cfun);
   // Calculate outgoing range info upfront.  This will fully populate the
   // m_maybe_variant bitmap which will help eliminate processing of names
@@ -503,17 +466,10 @@ gori_compute::gori_compute ()
 {
   basic_block bb = BASIC_BLOCK_FOR_FN (cfun, x);
   if (bb)
-	m_gori_map->exports (bb);
+	exports (bb);
 }
 }
 
-// Destruct a gori_compute_object.
-
-gori_compute::~gori_compute ()
-{
-  delete m_gori_map;
-}
-
 // Provide a default of VARYING for all incoming SSA names.
 
 void
@@ -597,7 +553,7 @@ gori_compute::compute_operand_range_switch (irange , gswitch *s,
 }
 
   // If op1 is in the defintion chain, pass lhs back.
-  if (gimple_range_ssa_p (op1) && m_gori_map->in_chain_p (name, 

Re: [_GLIBCXX_DEBUG] Enhance rendering of assert message

2021-05-25 Thread Jonathan Wakely via Gcc-patches

On 25/05/21 23:01 +0200, François Dumont wrote:

On 25/05/21 11:58 am, Jonathan Wakely wrote:

On 22/05/21 22:08 +0200, François Dumont via Libstdc++ wrote:
Here is the part of the libbacktrace patch with the enhancement to 
the rendering of assert message.


It only contains one real fix, the rendering of address. In 2 
places it was done with "0x%p", so resulting in something like: 
0x0x012345678


Otherwise it is just enhancements, mostly avoiding intermediate 
buffering.


I am adding the _Parameter::_Named type to help on the rendering. 
I hesitated in doing the same for the _M_iterator type but 
eventually managed it with a template function.


    libstdc++: [_GLIBCXX_DEBUG] Enhance rendering of assert message

    Avoid building an intermediate buffer to print to stderr, push 
directly to stderr.


    libstdc++-v3/ChangeLog:

    * include/debug/formatter.h
    (_Error_formatter::_Parameter::_Named): New.
    (_Error_formatter::_Parameter::_Type): Inherit latter.
    (_Error_formatter::_Parameter::_M_integer): Likewise.
    (_Error_formatter::_Parameter::_M_string): Likewise.
    * src/c++11/debug.cc: Include .
    (_Print_func_t): New.
    (print_raw(PrintContext&, const char*, ptrdiff_t)): New.
    (print_word): Use '%.*s' format in fprintf to render 
only expected number of chars.
    (pretty_print(PrintContext&, const char*, 
_Print_func_t)): New.

    (print_type): Rename in...
    (print_type_info): ...this. Use pretty_print.
    (print_address, print_integer): New.
    (print_named_name, print_iterator_constness, 
print_iterator_state): New.

    (print_iterator_seq_type): New.
    (print_named_field, print_type_field, 
print_instance_field, print_iterator_field): New.

    (print_field): Use latters.
    (print_quoted_named_name, print_type_type, print_type, 
print_instance): New.
    (print_string(PrintContext&, const char*, const 
_Parameter*, size_t)):

    Change signature to...
    (print_string(PrintContext&, const char*, ptrdiff_t, 
const _Parameter*, size_t)):
    ...this and adapt. Remove intermediate buffer to 
render input string.

    (print_string(PrintContext&, const char*, ptrdiff_t)): New.

Ok to commit ?

François




+  void
+  pretty_print(PrintContext& ctx, const char* str, _Print_func_t 
print_func)

+  {
+    const char cxx1998[] = "__cxx1998::";
+    const char uglification[] = "__";
+    for (;;)
+  {
+    auto idx = strstr(str, uglification);


This is confusing. strstr returns a pointer, not an index into the
string.


+    if (idx)
+  {
+    size_t offset =
+  (idx == strstr(str, cxx1998)
+   ? sizeof(cxx1998) : sizeof(uglification)) - 1;


This is a bit inefficient. Consider "int __foo(__cxx1998::bar)". The
first strstr returns a pointer to "__foo" and then the second one
searches the entire string from the beginning looking for
"__cxx1998::", and checks if it is the same position as "__foo".

The second strstr doesn't need to search from the beginning, and it
doesn't need to look all the way to the end. It should be memcmp.

  if (auto pos = strstr(str, uglification))
    {
  if (pos != str)
    print_func(ctx, str, pos - str);

  if (memcmp(pos, cxx1998, sizeof(cxx1998)-1) == 0)
    str = pos + (sizeof(cxx1998) - 1);
  else
    str = pos + (sizeof(uglification) - 1);
      while (*str && isspace((unsigned char)*str))
    ++str;

  if (!*str)
    break;
    }
  else

It doesn't even need to search from the position found by the first
strstr, because we already know it starts with "__", so:

  for (;;)
    {
  if (auto pos = strstr(str, "__"))
    {
  if (pos != str)
    print_func(ctx, str, pos - str);

  pos += 2; // advance past "__"

  if (memcmp(pos, "cxx1998::", 9) == 0)
    str = pos + 9; // advance past "cxx1998::"
      while (*str && isspace((unsigned char)*str))
    ++str;

  if (!*str)
    break;
    }
  else

But either is OK. Just not doing a second strstr through the entire
string again to look for "__cxx1998::".
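
For reference, the loop suggested above can be written out as a complete 
standalone function.  This is a sketch only: strip_uglification is a 
hypothetical name, it collects the output in a std::string instead of 
calling print_func, it leaves out the whitespace skipping that is questioned 
later in the review, and it uses strncmp rather than memcmp so that it never 
reads past the terminating NUL when fewer than 9 characters remain.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Copy STR to the result, dropping every "__" and every "__cxx1998::".
std::string
strip_uglification (const char *str)
{
  std::string out;
  for (;;)
    {
      const char *pos = std::strstr (str, "__");
      if (!pos)
	{
	  out += str;	// No more "__": emit the remainder and stop.
	  break;
	}

      out.append (str, pos - str);	// Emit the text before "__".
      pos += 2;				// Advance past "__".

      // Only compare at this position, never rescan the whole string.
      if (std::strncmp (pos, "cxx1998::", 9) == 0)
	pos += 9;			// Advance past "cxx1998::".

      str = pos;
    }
  return out;
}
```

Each "__" is located by exactly one strstr call, and the check for 
"cxx1998::" is a bounded comparison at that position.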




+
+    if (idx != str)
+  print_func(ctx, str, idx - str);
+
+    str = idx + offset;
+
+    while (*str && isspace((unsigned char)*str))
+  ++str;


Is this really needed?

Why would there be whitespace following "__" or "__cxx1998::" and why
would we want to skip it?


Yes, I cannot remember why I added it in the first place. So removed 
in this new proposal with your other changes.




I know it doesn't follow our usual naming scheme, but a symbol like
"__foo__ bar()" would get printed as "foobar()" wouldn't it?


Yes, it would.

The rest of the patch looks fine, I'm just unsure about pretty_print.

Re: [_GLIBCXX_DEBUG] Enhance rendering of assert message

2021-05-25 Thread François Dumont via Gcc-patches

On 25/05/21 11:58 am, Jonathan Wakely wrote:

On 22/05/21 22:08 +0200, François Dumont via Libstdc++ wrote:
Here is the part of the libbacktrace patch with the enhancement to 
the rendering of assert message.


It only contains one real fix, the rendering of address. In 2 places 
it was done with "0x%p", so resulting in something like: 0x0x012345678


Otherwise it is just enhancements, mostly avoiding intermediate 
buffering.


I am adding the _Parameter::_Named type to help on the rendering. I 
hesitated in doing the same for the _M_iterator type but eventually 
managed it with a template function.


    libstdc++: [_GLIBCXX_DEBUG] Enhance rendering of assert message

    Avoid building an intermediate buffer to print to stderr, push 
directly to stderr.


    libstdc++-v3/ChangeLog:

    * include/debug/formatter.h
    (_Error_formatter::_Parameter::_Named): New.
    (_Error_formatter::_Parameter::_Type): Inherit latter.
    (_Error_formatter::_Parameter::_M_integer): Likewise.
    (_Error_formatter::_Parameter::_M_string): Likewise.
    * src/c++11/debug.cc: Include .
    (_Print_func_t): New.
    (print_raw(PrintContext&, const char*, ptrdiff_t)): New.
    (print_word): Use '%.*s' format in fprintf to render only 
expected number of chars.
    (pretty_print(PrintContext&, const char*, 
_Print_func_t)): New.

    (print_type): Rename in...
    (print_type_info): ...this. Use pretty_print.
    (print_address, print_integer): New.
    (print_named_name, print_iterator_constness, 
print_iterator_state): New.

    (print_iterator_seq_type): New.
    (print_named_field, print_type_field, 
print_instance_field, print_iterator_field): New.

    (print_field): Use latters.
    (print_quoted_named_name, print_type_type, print_type, 
print_instance): New.
    (print_string(PrintContext&, const char*, const 
_Parameter*, size_t)):

    Change signature to...
    (print_string(PrintContext&, const char*, ptrdiff_t, 
const _Parameter*, size_t)):
    ...this and adapt. Remove intermediate buffer to render 
input string.

    (print_string(PrintContext&, const char*, ptrdiff_t)): New.

Ok to commit ?

François




+  void
+  pretty_print(PrintContext& ctx, const char* str, _Print_func_t 
print_func)

+  {
+    const char cxx1998[] = "__cxx1998::";
+    const char uglification[] = "__";
+    for (;;)
+  {
+    auto idx = strstr(str, uglification);


This is confusing. strstr returns a pointer, not an index into the
string.


+    if (idx)
+  {
+    size_t offset =
+  (idx == strstr(str, cxx1998)
+   ? sizeof(cxx1998) : sizeof(uglification)) - 1;


This is a bit inefficient. Consider "int __foo(__cxx1998::bar)". The
first strstr returns a pointer to "__foo" and then the second one
searches the entire string from the beginning looking for
"__cxx1998::", and checks if it is the same position as "__foo".

The second strstr doesn't need to search from the beginning, and it
doesn't need to look all the way to the end. It should be memcmp.

  if (auto pos = strstr(str, uglification))
    {
  if (pos != str)
    print_func(ctx, str, pos - str);

  if (memcmp(pos, cxx1998, sizeof(cxx1998)-1) == 0)
    str = pos + (sizeof(cxx1998) - 1);
  else
    str = pos + (sizeof(uglification) - 1);
      while (*str && isspace((unsigned char)*str))
    ++str;

  if (!*str)
    break;
    }
  else

It doesn't even need to search from the position found by the first
strstr, because we already know it starts with "__", so:

  for (;;)
    {
  if (auto pos = strstr(str, "__"))
    {
  if (pos != str)
    print_func(ctx, str, pos - str);

  pos += 2; // advance past "__"

  if (memcmp(pos, "cxx1998::", 9) == 0)
    str = pos + 9; // advance past "cxx1998::"
      while (*str && isspace((unsigned char)*str))
    ++str;

  if (!*str)
    break;
    }
  else

But either is OK. Just not doing a second strstr through the entire
string again to look for "__cxx1998::".




+
+    if (idx != str)
+  print_func(ctx, str, idx - str);
+
+    str = idx + offset;
+
+    while (*str && isspace((unsigned char)*str))
+  ++str;


Is this really needed?

Why would there be whitespace following "__" or "__cxx1998::" and why
would we want to skip it?


Yes, I cannot remember why I added it in the first place. So removed in 
this new proposal with your other changes.




I know it doesn't follow our usual naming scheme, but a symbol like
"__foo__ bar()" would get printed as "foobar()" wouldn't it?


Yes, it would.

The rest of the patch looks fine, I'm just unsure about pretty_print.
Maybe I've misunderstood the possible strings it 

Re: [PATCH 0/11] warning control by group and location (PR 74765)

2021-05-25 Thread Martin Sebor via Gcc-patches

On 5/25/21 3:04 AM, Richard Biener wrote:

On Tue, May 25, 2021 at 2:53 AM Martin Sebor via Gcc-patches
 wrote:


On 5/24/21 5:08 PM, David Malcolm wrote:

On Mon, 2021-05-24 at 16:02 -0600, Martin Sebor wrote:

The rare expressions that have no location
continue to have just one bit[1].


Where does this get stored?  I see the final patch in the kit removes
TREE_NO_WARNING, but I don't quite follow the logic for where the bit
would then be stored for an expr with UNKNOWN_LOCATION.


The patch just removes the TREE_NO_WARNING macro (along with
the gimple_no_warning_p/gimple_set_no_warning functions) but not
the no-warning bit itself.  It removes them to avoid accidentally
modifying the bit alone without going through the new API and
updating the location -> warning group mapping.  The bit is still
needed for expression/statements with no location.


I wonder if we could clone UNKNOWN_LOCATION, thus when
we set_no_warning on UNKNOWN_LOCATION create a new location
with the source location being still UNKNOWN but with the appropriate
ad-hoc data to disable the warning?  That of course requires the
API to be

location_t set_no_warning (...)

and users would need to update the container with the new location
(or we'd need to use a reference we can update in set_no_warning).


This could be done even in the new set_no_warning(tree, ...), without
changing the callers.  But I think the right place and time to set
the location is in the code that creates the expression.  Doing it
at the time the no-warning bit is being set, either in the new API
or in the caller, seems like papering over the underlying problem.



That said - do you have any stats on how many UNKNOWN_LOCATION
locations we run into with bootstrap / the testsuite?


During stage1, roughly 2.7% of all calls to set_no_warning() are
with arguments with no location.  Based on the code I've seen
some are to be expected (e.g. eh_filter_expr, try_catch_expr,
and var_decl for artificial temporaries).  The rest are:
compound_expr, cond_expr, eh_filter_expr, imagpart_expr,
indirect_ref, mem_ref, minus_expr, modify_expr, mult_expr,
ne_expr, plus_expr, realpart_expr, try_catch_expr, and var_decl.

The vast majority (90%) are plus_expr.  At least some of them come
from ASSERT_EXPRs.  The location isn't available at the point VRP
calls set_no_warning() but it is available earlier when the ASSERTs
are being created from the COND_EXPR statements.  I can look into
this when I'm done with this.

Martin


[r12-1044 Regression] FAIL: libgomp.c++/task-reduction-16.C execution test on Linux/x86_64

2021-05-25 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

fd97aeb494cdcffe0d21e7f15ab4593662e065bd is the first bad commit
commit fd97aeb494cdcffe0d21e7f15ab4593662e065bd
Author: Eric Botcazou 
Date:   Tue May 25 18:30:29 2021 +0200

Remove stalled TREE_READONLY flag on automatic variable

caused

FAIL: libgomp.c++/taskloop-reduction-1.C execution test
FAIL: libgomp.c++/task-reduction-15.C execution test
FAIL: libgomp.c++/task-reduction-16.C execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-1044/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/taskloop-reduction-1.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/taskloop-reduction-1.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/taskloop-reduction-1.C 
--target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/taskloop-reduction-1.C 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/task-reduction-15.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/task-reduction-15.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/task-reduction-15.C 
--target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/task-reduction-15.C 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/task-reduction-16.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/task-reduction-16.C 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PATCH] c++: constexpr and copy elision within mem init [PR100368]

2021-05-25 Thread Jason Merrill via Gcc-patches

On 5/25/21 1:12 PM, Patrick Palka wrote:

On Mon, 24 May 2021, Jason Merrill wrote:


On 5/24/21 1:48 PM, Patrick Palka wrote:

In the testcase below, the initializer for C::b inside C's default
constructor is encoded as a TARGET_EXPR wrapping the CALL_EXPR f() in
C++17 mode.  During massaging of this constexpr constructor,
build_target_expr_with_type called from bot_manip ends up trying to use
B's implicitly deleted copy constructor rather than preserving the
copy elision.



This patch makes bot_manip use force_target_expr instead of
build_target_expr_with_type so that it copies TARGET_EXPRs in a more
oblivious manner.


Even with that change we should fix build_target_expr_with_type to handle
CALL_EXPR properly; adding an extra copy is just wrong.


Sounds good.




Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?

PR c++/100368

gcc/cp/ChangeLog:

* tree.c (build_target_expr_with_type): Simplify now that
bot_manip is no longer a caller.
(bot_manip): Use force_target_expr instead of
build_target_expr_with_type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/elide6.C: New test.
---
   gcc/cp/tree.c   |  8 +++-
   gcc/testsuite/g++.dg/cpp1z/elide6.C | 16 
   2 files changed, 19 insertions(+), 5 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp1z/elide6.C

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 72f498f4b3b..84b84621d35 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -848,12 +848,10 @@ build_target_expr_with_type (tree init, tree type, tsubst_flags_t complain)
 || init == error_mark_node)
   return init;
 else if (CLASS_TYPE_P (type) && type_has_nontrivial_copy_init (type)
-  && !VOID_TYPE_P (TREE_TYPE (init))
   && TREE_CODE (init) != COND_EXPR
   && TREE_CODE (init) != CONSTRUCTOR
   && TREE_CODE (init) != VA_ARG_EXPR)
-/* We need to build up a copy constructor call.  A void initializer
-   means we're being called from bot_manip.


In general, a void initializer for a TARGET_EXPR means that the initialization
is more complex than initializing the object from the value of the expression.
The caller would need to handle making that initialization apply to the new
TARGET_EXPR_SLOT (and bot_manip does). If we change bot_manip to not call this
function, I think this function should reject void init.


I see, thanks.  How does the following look for trunk?  Bootstrapped and
regtested on x86_64-pc-linux-gnu.

-- >8 --


Subject: [PATCH] c++: constexpr and copy elision within mem init [PR100368]

In the testcase below, the member initializer b(f()) inside C's default
constructor is encoded as a TARGET_EXPR wrapping the CALL_EXPR f() in
C++17 mode.  During massaging of this constexpr constructor,
build_target_expr_with_type called from bot_manip tries to add an extra
copy using B's implicitly deleted copy constructor rather than just
preserving the copy elision.

Since it's wrong to introduce an extra copy when initializing a
temporary from a CALL_EXPR, this patch makes build_target_expr_with_type
avoid calling force_rvalue in this case.  Additionally, bot_manip should
be copying TARGET_EXPRs in a more oblivious manner, so this patch makes
bot_manip use force_target_expr instead of build_target_expr_with_type.
And since bot_manip is now no longer a caller, we can remove the void
initializer handling in build_target_expr_with_type and instead reject
such initializers.

PR c++/100368

gcc/cp/ChangeLog:

* tree.c (build_target_expr_with_type): Don't call force_rvalue
on CALL_EXPR initializer.  Simplify now that bot_manip is no
longer a caller.
(bot_manip): Use force_target_expr instead of
build_target_expr_with_type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/elide6.C: New test.
---
  gcc/cp/tree.c   | 12 ++--
  gcc/testsuite/g++.dg/cpp1z/elide6.C | 16 
  2 files changed, 22 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/elide6.C

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 72f498f4b3b..d97b220423d 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -843,17 +843,17 @@ tree
  build_target_expr_with_type (tree init, tree type, tsubst_flags_t complain)
  {
gcc_assert (!VOID_TYPE_P (type));
+  gcc_assert (!VOID_TYPE_P (TREE_TYPE (init)));
  
if (TREE_CODE (init) == TARGET_EXPR

|| init == error_mark_node)
  return init;
else if (CLASS_TYPE_P (type) && type_has_nontrivial_copy_init (type)
-  && !VOID_TYPE_P (TREE_TYPE (init))
   && TREE_CODE (init) != COND_EXPR
   && TREE_CODE (init) != CONSTRUCTOR
-  && TREE_CODE (init) != VA_ARG_EXPR)
-/* We need to build up a copy constructor call.  A void initializer
-   means we're being called from bot_manip.  COND_EXPR is a special
+  && TREE_CODE (init) != VA_ARG_EXPR
+  && 

[PATCH] Document that -fno-trampolines is for Ada only [PR100735]

2021-05-25 Thread Paul Eggert
The GCC manual's documentation of -fno-trampolines was apparently
written from an Ada point of view. However, when I read it I
understandably mistook it to say that -fno-trampolines also works for
C, C++, etc. It doesn't: it is silently ignored for these languages,
and I assume for any language other than Ada.

This confusion caused me to go in the wrong direction in a Gnulib
discussion, as I mistakenly thought that entire C apps with nested
functions could be compiled with -fno-trampolines and then use nested
C functions in stack overflow handlers where the alternate stack
is allocated via malloc. I was wrong, as this won't work on common
platforms like x86-64 where malloc yields non-executable storage.

gcc/
* doc/invoke.texi (Code Gen Options):
* doc/tm.texi.in (Trampolines):
Document that -fno-trampolines and -ftrampolines work
only with Ada.
---
 gcc/doc/invoke.texi | 5 +
 gcc/doc/tm.texi.in  | 4 
 2 files changed, 9 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5cd4e2d993c..b55bbf3e424 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16646,6 +16646,11 @@ Moreover, code compiled with @option{-ftrampolines} and code compiled with
 present.  This option must therefore be used on a program-wide basis and be
 manipulated with extreme care.
 
+For languages other than Ada, the @code{-ftrampolines} and
+@code{-fno-trampolines} options currently have no effect, and
+trampolines are always generated on platforms that need them
+for nested functions.
+
 @item -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]}
 @opindex fvisibility
 Set the default ELF image symbol visibility to the specified option---all
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index d9fbbe20e6f..20501607716 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3828,6 +3828,10 @@ addresses.  Since GCC's generic function descriptors are
 not ABI-compliant, this option is typically used only on a
 per-language basis (notably by Ada) or when it can otherwise be
 applied to the whole program.
+For languages other than Ada, the @code{-ftrampolines} and
+@code{-fno-trampolines} options currently have no effect, and
+trampolines are always generated on platforms that need them
+for nested functions.
 
 Define the following hook if your backend either implements ABI-specified
 descriptor support, or can use GCC's generic descriptor implementation
-- 
2.31.1



Re: [PATCH][DOCS] Remove install-old.texi

2021-05-25 Thread Joseph Myers
On Tue, 25 May 2021, Martin Liška wrote:

> +@quotation
> +aix@var{version}, amdhsa, aout, cygwin, darwin@var{version}

Missing comma at the end of this line.

> +eabi, eabialtivec, eabisim, eabisimaltivec, elf, elf32,
> +elfbare, elfoabi, freebsd@var{version}gnu, hpux, hpux@var{version},

Missing ", " between "freebsd@var{version}" and "gnu".

The patch is OK with those fixes.  There may be further cleanup that can 
be done to these lists, but they're certainly a lot better than the 
versions currently in install-old.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


PING [PATCH] PR fortran/100602 - [11/12 Regression] Erroneous "pointer argument is not associated" runtime error

2021-05-25 Thread Harald Anlauf via Gcc-patches
*PING*

> Sent: Tuesday, 18 May 2021 at 20:36
> From: "Harald Anlauf"
> To: "fortran" , "gcc-patches"
> Subject: [PATCH] PR fortran/100602 - [11/12 Regression] Erroneous "pointer argument is not associated" runtime error
>
> The generation of the new runtime check picked up the wrong attributes
> in the case of CLASS array arguments.  There is related new code in
> gfc_conv_procedure_call which served as reference for the fix.
>
> Regtested on x86_64-pc-linux-gnu.
>
> OK for mainline / 11-branch?
>
> Thanks,
> Harald
>
>
> Fortran: Fix erroneous "pointer argument is not associated" runtime error
>
> For CLASS arrays we need to use the CLASS data attributes to determine
> which runtime check to generate.
>
> gcc/fortran/ChangeLog:
>
>   PR fortran/100602
>   * trans-intrinsic.c (gfc_conv_intrinsic_size): Use CLASS data
>   attributes for CLASS arrays for generation of runtime error.
>
> gcc/testsuite/ChangeLog:
>
>   PR fortran/100602
>   * gfortran.dg/pointer_check_14.f90: New test.
>
>


Re: [PATCH][version 3]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-05-25 Thread Qing Zhao via Gcc-patches
Ping….

Qing

On May 12, 2021, at 12:16 PM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

Hi,

This is the 3rd version of the patch for the new security feature for GCC.

Please take a look and let me know your comments and suggestions.

thanks.

Qing

**Compared with the 2nd version, the major changes are:

1. Use lookup_attribute ("uninitialized",) directly instead of adding
   a new field "uninitialized" to tree_decl_with_vis.
2. Update the documentation to mention that the new option will not confuse
   -Wuninitialized; GCC still considers an auto variable without an explicit
   initializer as uninitialized.
3. Rename "build_pattern_cst" to the more specific
   "build_pattern_cst_for_auto_init".
4. Handle nested VLAs;
   add new test cases (auto-init-15/16.c) for this new handling.
5. Add new verification of calls to .DEFERRED_INIT in tree-cfg.c.
6. In tree-sra.c, update the handling of "grp_to_be_debug_replaced":
   bind the lhs variable to a call to .DEFERRED_INIT.
7. In tree-ssa-structalias.c, delete "find_func_aliases_for_deferred_init";
   return directly for a call to .DEFERRED_INIT in "find_func_aliases_for_call".
8. Add more detailed comments in tree-ssa-uninit.c and tree-ssa.c to explain
   the special handling of REALPART_EXPR/IMAGPART_EXPR.
9. In build_pattern_cst_for_auto_init:
   BOOLEAN_TYPE is always set to zero;
   INTEGER_TYPE (?and ENUMERAL_TYPE) use wi::from_buffer in order to
   correctly handle 128-bit integers;
   POINTER_TYPE no longer asserts on SIZE < 32;
   REAL_TYPE gets a fallback.
10. Change gcc_assert to gcc_unreachable in several places.
11. Add more comments.
12. Fix some style issues.

**Please see the version 2 at:
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567262.html


**The following 2 items are the ones I didn't address in this version; they
need further study and might need more discussion:

1. Using __builtin_clear_padding to replace type_has_padding.

My study shows that the call to __builtin_clear_padding is expanded during the
gimplification phase, and that there is no __builtin_clear_padding expansion
during the rtx expansion phase.
If so, for -ftrivial-auto-var-init, padding initialization should be done both
in the gimplification phase and in the rtx expansion phase.
And since __builtin_clear_padding might not be suitable for rtx expansion,
reusing __builtin_clear_padding might not work.

2. Pattern init for NULLPTR_TYPE and ENUMERAL_TYPE: need more comments from
Richard Biener on this.

**The changes in the 3rd version compared to the 2nd version are:



**The complete 3rd version of the patch is:



<3rd-version-ftrivial-auto-var-init.patch>



[PATCH] c++: Output less irrelevant info for function template decl [PR100716]

2021-05-25 Thread Matthias Kretz

From: Matthias Kretz 

Ensure dump_template_decl for function templates never prints template 
parameters after the function name (it did with -fno-pretty-templates) and 
skip output of irrelevant & confusing "[with T = T]" in dump_substitution.

gcc/cp/ChangeLog:

PR c++/100716
* error.c (dump_template_bindings): Include code to print
"[with" and ']', conditional on whether anything is printed at
all. This is tied to whether a semicolon is needed to separate
multiple template parameters. If the template argument repeats
the template parameter (T = T), then skip the parameter.
(dump_substitution): Moved code to print "[with" and ']' to
dump_template_bindings.
(dump_function_decl): Partial revert of PR50828, which masked
TFF_TEMPLATE_NAME for all of dump_function_decl. Now
TFF_TEMPLATE_NAME is masked for the scope of the function and
only carries through to dump_function_name.
(dump_function_name): Avoid calling dump_template_parms if
TFF_TEMPLATE_NAME is set.

gcc/testsuite/ChangeLog:

PR c++/100716
* g++.dg/diagnostic/pr100716.C: New test.
* g++.dg/diagnostic/pr100716-1.C: Same test with
-fno-pretty-templates.
---
 gcc/cp/error.c   | 59 +++-
 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C | 54 ++
 gcc/testsuite/g++.dg/diagnostic/pr100716.C   | 54 ++
 3 files changed, 152 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716.C
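
The lazy "[with ...]" logic described in the ChangeLog can be looked at in
isolation.  The following is only a minimal standalone sketch, not GCC code:
lazy_with_clause and format_bindings are hypothetical names, and a std::string
stands in for cxx_pretty_printer.  It illustrates the idea that the introducer
is emitted on first use, later uses emit a semicolon separator, the closing
bracket is emitted by the destructor only if anything was printed, and
bindings that merely repeat the parameter ("T = T") are skipped.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper modelled on the prepost_semicolon struct in the patch.
// It writes to a std::string instead of a cxx_pretty_printer.
class lazy_with_clause
{
  std::string &out_;
  bool opened_ = false;

public:
  explicit lazy_with_clause (std::string &out) : out_ (out) {}

  // Call before each binding: opens " [with " the first time,
  // separates with "; " on every later call.
  void operator() ()
  {
    if (opened_)
      out_ += "; ";
    else
      {
        out_ += " [with ";
        opened_ = true;
      }
  }

  // Close the bracket only if it was ever opened.
  ~lazy_with_clause ()
  {
    if (opened_)
      out_ += "]";
  }
};

// Hypothetical driver: each pair is (parameter name, argument text).
std::string
format_bindings (const std::vector<std::pair<std::string, std::string>> &bindings)
{
  std::string out;
  {
    lazy_with_clause sep (out);
    for (const auto &b : bindings)
      {
        // Skip noise such as "T = T".
        if (b.first == b.second)
          continue;
        sep ();
        out += b.first + " = " + b.second;
      }
  } // the destructor appends "]" here, if needed
  return out;
}
```

With this sketch, a list containing only ("T", "T") yields an empty string, so
no empty "[with ]" is ever printed.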


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 010fbce41a7..bc0b68f07e0 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -381,7 +381,32 @@ static void
 dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 vec *typenames)
 {
-  bool need_semicolon = false;
+  struct prepost_semicolon
+  {
+cxx_pretty_printer *pp;
+bool need_semicolon = false;
+
+void operator()()
+{
+  if (need_semicolon)
+	pp_separate_with_semicolon (pp);
+  else
+	{
+	  pp_cxx_whitespace (pp);
+	  pp_cxx_left_bracket (pp);
+	  pp->translate_string ("with");
+	  pp_cxx_whitespace (pp);
+	  need_semicolon = true;
+	}
+}
+
+~prepost_semicolon()
+{
+  if (need_semicolon)
+	pp_cxx_right_bracket (pp);
+}
+  } semicolon_or_introducer = {pp};
+
   int i;
   tree t;
 
@@ -405,10 +430,19 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 	  if (lvl_args && NUM_TMPL_ARGS (lvl_args) > arg_idx)
 	arg = TREE_VEC_ELT (lvl_args, arg_idx);
 
-	  if (need_semicolon)
-	pp_separate_with_semicolon (pp);
-	  dump_template_parameter (pp, TREE_VEC_ELT (p, i),
-   TFF_PLAIN_IDENTIFIER);
+	  tree parm_i = TREE_VEC_ELT (p, i);
+	  /* Skip this parameter if it just noise such as "T = T".  */
+	  if (arg && TREE_CODE (arg) == TEMPLATE_TYPE_PARM
+		&& TREE_CODE (parm_i) == TREE_LIST
+		&& TREE_CODE (TREE_VALUE (parm_i)) == TYPE_DECL
+		&& TREE_CODE (TREE_TYPE (TREE_VALUE (parm_i)))
+		 == TEMPLATE_TYPE_PARM
+		&& DECL_NAME (TREE_VALUE (parm_i))
+		 == DECL_NAME (TREE_CHAIN (arg)))
+	continue;
+
+	  semicolon_or_introducer();
+	  dump_template_parameter (pp, parm_i, TFF_PLAIN_IDENTIFIER);
 	  pp_cxx_whitespace (pp);
 	  pp_equal (pp);
 	  pp_cxx_whitespace (pp);
@@ -424,7 +458,6 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 	pp_string (pp, M_(""));
 
 	  ++arg_idx;
-	  need_semicolon = true;
 	}
 
   parms = TREE_CHAIN (parms);
@@ -446,8 +479,7 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 
   FOR_EACH_VEC_SAFE_ELT (typenames, i, t)
 {
-  if (need_semicolon)
-	pp_separate_with_semicolon (pp);
+  semicolon_or_introducer();
   dump_type (pp, t, TFF_PLAIN_IDENTIFIER);
   pp_cxx_whitespace (pp);
   pp_equal (pp);
@@ -1652,12 +1684,7 @@ dump_substitution (cxx_pretty_printer *pp,
   && !(flags & TFF_NO_TEMPLATE_BINDINGS))
 {
   vec *typenames = t ? find_typenames (t) : NULL;
-  pp_cxx_whitespace (pp);
-  pp_cxx_left_bracket (pp);
-  pp->translate_string ("with");
-  pp_cxx_whitespace (pp);
   dump_template_bindings (pp, template_parms, template_args, typenames);
-  pp_cxx_right_bracket (pp);
 }
 }
 
@@ -1698,7 +1725,8 @@ dump_function_decl (cxx_pretty_printer *pp, tree t, int flags)
   bool constexpr_p;
   tree ret = NULL_TREE;
 
-  flags &= ~(TFF_UNQUALIFIED_NAME | 

Re: [PATCH 2/5] Convert Walloca pass to RANGE_QUERY(cfun).

2021-05-25 Thread Martin Sebor via Gcc-patches

On 5/25/21 10:17 AM, Aldy Hernandez via Gcc-patches wrote:

Adjustments per discussion.

OK pending tests?

Aldy


I have no concern with the alloca changes.  The xfail removals from
the two tests in this patch (a nice improvement) don't seem to be
related to alloca so I'd expect them to fail unless the strlen changes
are committed first.

Martin


Re: [PATCH] c++: constexpr and copy elision within mem init [PR100368]

2021-05-25 Thread Patrick Palka via Gcc-patches
On Mon, 24 May 2021, Jason Merrill wrote:

> On 5/24/21 1:48 PM, Patrick Palka wrote:
> > In the testcase below, the initializer for C::b inside C's default
> > constructor is encoded as a TARGET_EXPR wrapping the CALL_EXPR f() in
> > C++17 mode.  During massaging of this constexpr constructor,
> > build_target_expr_with_type called from bot_manip ends up trying to use
> > B's implicitly deleted copy constructor rather than preserving the
> > copy elision.
> 
> > This patch makes bot_manip use force_target_expr instead of
> > build_target_expr_with_type so that it copies TARGET_EXPRs in a more
> > oblivious manner.
> 
> Even with that change we should fix build_target_expr_with_type to handle
> CALL_EXPR properly; adding an extra copy is just wrong.

Sounds good.

> 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> > for trunk?
> > 
> > PR c++/100368
> > 
> > gcc/cp/ChangeLog:
> > 
> > * tree.c (build_target_expr_with_type): Simplify now that
> > bot_manip is no longer a caller.
> > (bot_manip): Use force_target_expr instead of
> > build_target_expr_with_type.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp1z/elide6.C: New test.
> > ---
> >   gcc/cp/tree.c   |  8 +++-
> >   gcc/testsuite/g++.dg/cpp1z/elide6.C | 16 
> >   2 files changed, 19 insertions(+), 5 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp1z/elide6.C
> > 
> > diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
> > index 72f498f4b3b..84b84621d35 100644
> > --- a/gcc/cp/tree.c
> > +++ b/gcc/cp/tree.c
> > @@ -848,12 +848,10 @@ build_target_expr_with_type (tree init, tree type, tsubst_flags_t complain)
> > || init == error_mark_node)
> >   return init;
> > else if (CLASS_TYPE_P (type) && type_has_nontrivial_copy_init (type)
> > -  && !VOID_TYPE_P (TREE_TYPE (init))
> >&& TREE_CODE (init) != COND_EXPR
> >&& TREE_CODE (init) != CONSTRUCTOR
> >&& TREE_CODE (init) != VA_ARG_EXPR)
> > -/* We need to build up a copy constructor call.  A void initializer
> > -   means we're being called from bot_manip.
> 
> In general, a void initializer for a TARGET_EXPR means that the initialization
> is more complex than initializing the object from the value of the expression.
> The caller would need to handle making that initialization apply to the new
> TARGET_EXPR_SLOT (and bot_manip does). If we change bot_manip to not call this
> function, I think this function should reject void init.

I see, thanks.  How does the following look for trunk?  Bootstrapped and
regtested on x86_64-pc-linux-gnu.

-- >8 --


Subject: [PATCH] c++: constexpr and copy elision within mem init [PR100368]

In the testcase below, the member initializer b(f()) inside C's default
constructor is encoded as a TARGET_EXPR wrapping the CALL_EXPR f() in
C++17 mode.  During massaging of this constexpr constructor,
build_target_expr_with_type called from bot_manip tries to add an extra
copy using B's implicitly deleted copy constructor rather than just
preserving the copy elision.

Since it's wrong to introduce an extra copy when initializing a
temporary from a CALL_EXPR, this patch makes build_target_expr_with_type
avoid calling force_rvalue in this case.  Additionally, bot_manip should
be copying TARGET_EXPRs in a more oblivious manner, so this patch makes
bot_manip use force_target_expr instead of build_target_expr_with_type.
And since bot_manip is now no longer a caller, we can remove the void
initializer handling in build_target_expr_with_type and instead reject
such initializers.
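
For readers without the testcase at hand, the situation can be sketched as
follows.  This is not the contents of elide6.C; the types A, B, C and the
function f below are hypothetical stand-ins chosen to match the names used in
the description, and elided_value is an extra helper for illustration.  The
sketch assumes C++17 or later, where initializing b from the prvalue f() is
guaranteed to be elided, so B's implicitly deleted copy constructor must never
be needed.

```cpp
#include <cassert>

// A non-copyable member makes B's copy constructor implicitly deleted.
struct A
{
  constexpr A () = default;
  A (const A &) = delete;
};

struct B
{
  A a;
  int value = 42;
};

// Returns a prvalue; no copy or move is involved in C++17.
constexpr B f () { return B{}; }

struct C
{
  B b;
  // The member initializer b(f()) initializes b directly from the call.
  constexpr C () : b (f ()) {}
};

// Forces the constructor to be evaluated in a constant expression.
constexpr int elided_value () { return C ().b.value; }

static_assert (elided_value () == 42,
               "b must be initialized from f() without a copy");
```

If a compiler introduced an extra copy when building the temporary for f(),
this would be rejected because B cannot be copied.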

PR c++/100368

gcc/cp/ChangeLog:

* tree.c (build_target_expr_with_type): Don't call force_rvalue
on CALL_EXPR initializer.  Simplify now that bot_manip is no
longer a caller.
(bot_manip): Use force_target_expr instead of
build_target_expr_with_type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/elide6.C: New test.
---
 gcc/cp/tree.c   | 12 ++--
 gcc/testsuite/g++.dg/cpp1z/elide6.C | 16 
 2 files changed, 22 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/elide6.C

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 72f498f4b3b..d97b220423d 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -843,17 +843,17 @@ tree
 build_target_expr_with_type (tree init, tree type, tsubst_flags_t complain)
 {
   gcc_assert (!VOID_TYPE_P (type));
+  gcc_assert (!VOID_TYPE_P (TREE_TYPE (init)));
 
   if (TREE_CODE (init) == TARGET_EXPR
   || init == error_mark_node)
 return init;
   else if (CLASS_TYPE_P (type) && type_has_nontrivial_copy_init (type)
-  && !VOID_TYPE_P (TREE_TYPE (init))
   && TREE_CODE (init) != COND_EXPR
   && TREE_CODE (init) != CONSTRUCTOR
-  && TREE_CODE (init) != VA_ARG_EXPR)
-/* We need to build up a copy constructor call.  A void initializer
-   means we're being called from 

Re: [PATCH] Fix selftest for targets where short and int are the same size.

2021-05-25 Thread Jeff Law via Gcc-patches




On 5/25/2021 10:36 AM, Aldy Hernandez wrote:

> Ok, let's use build_nonstandard_integer_type which works for everyone
> and gets you unblocked.

Just to be clear, I'm not blocked on xstormy16.  The upstream GCC
tester flagged the failure and I did enough triage to blame you :-)  In
my day job I'm working in another tree, so you can break the trunk
willy-nilly and it won't block me.


jeff



Re: [PATCH] Fix selftest for targets where short and int are the same size.

2021-05-25 Thread Aldy Hernandez via Gcc-patches
Ok, let's use build_nonstandard_integer_type which works for everyone
and gets you unblocked.

Pushed.

Aldy

On Tue, May 25, 2021 at 3:15 PM Jeff Law  wrote:
>
>
>
> On 5/25/2021 12:44 AM, Aldy Hernandez wrote:
> > avr-elf seems to use HImode for both integer_type_node and
> > signed_char_type_node, which is causing the check for different sized
> > VARYING ranges to fail.
> >
> > I've fixed this by using a char which I think should always be smaller than an
> > int.  Is there a preferred way of fixing this?  Perhaps build_nonstandard_integer
> > or __attribute__((mode(XX)))?
> >
> > Tested on an x86-64 x avr-elf.
> >
> > gcc/ChangeLog:
> >
> >   * value-range.cc (range_tests_legacy): Use signed char instead
> >   of signed short.
> As you note, I wonder if we should just create our own types for this
> test.  In fact I wonder if that should be considered best practice for
> these tests.  Assumptions about the underlying sizes of the standard
> types have been slightly problematic for the range self-tests.
>
> The alternate approach would be to check the underlying sizes/signedness
> and skip the tests when they don't give us what we need.  But that seems
> inferior to just creating a suitable type.
>
> Jeff
>
> ps.  xstormy16-elf seems to be failing in the same way.  I'll assume
> it's the same problem ;-)
>
commit 41ddc5b0a6b44a9df53a259636fa3b534ae41cbe
Author: Aldy Hernandez 
Date:   Tue May 25 08:36:44 2021 +0200

Fix selftest for targets where short and int are the same size.

avr-elf seems to use HImode for both integer_type_node and
signed_char_type_node, which is causing the check for different sized
VARYING ranges to fail.

gcc/ChangeLog:

* value-range.cc (range_tests_legacy): Use
build_nonstandard_integer_type instead of int and short.

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 8d7b46c0239..f113fd7c905 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2250,11 +2250,13 @@ range_tests_legacy ()
   }
 
   // VARYING of different sizes should not be equal.
-  int_range_max r0 (integer_type_node);
-  int_range_max r1 (short_integer_type_node);
+  tree big_type = build_nonstandard_integer_type (32, 1);
+  tree small_type = build_nonstandard_integer_type (16, 1);
+  int_range_max r0 (big_type);
+  int_range_max r1 (small_type);
   ASSERT_TRUE (r0 != r1);
-  value_range vr0 (integer_type_node);
-  int_range_max vr1 (short_integer_type_node);
+  value_range vr0 (big_type);
+  int_range_max vr1 (small_type);
   ASSERT_TRUE (vr0 != vr1);
 }
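
The failure mode behind this fix can be modelled outside GCC.  The sketch
below is a toy model only: toy_type, toy_range and toy_varying are
hypothetical names, not GCC APIs, and the model simply treats a VARYING range
as the full value range of a type with a given precision.  It shows why
comparing the VARYING ranges of "int" and "short" is not a reliable test on a
target where both have the same width, and why fixed 32-bit and 16-bit types
avoid the problem.

```cpp
#include <cstdint>

// Toy description of an integer type: width in bits and signedness.
// Assumes 1 <= precision <= 32 so the bounds fit in int64_t.
struct toy_type
{
  unsigned precision;
  bool is_unsigned;
};

// Toy range: inclusive lower and upper bounds.
struct toy_range
{
  std::int64_t lo;
  std::int64_t hi;
};

// A "VARYING" range covers every value the type can hold.
inline toy_range
toy_varying (toy_type t)
{
  const std::int64_t one = 1;
  if (t.is_unsigned)
    return toy_range{0, (one << t.precision) - 1};
  return toy_range{-(one << (t.precision - 1)),
                   (one << (t.precision - 1)) - 1};
}

inline bool
operator== (const toy_range &a, const toy_range &b)
{
  return a.lo == b.lo && a.hi == b.hi;
}

inline bool
operator!= (const toy_range &a, const toy_range &b)
{
  return !(a == b);
}
```

In this model, two types of precision 16 produce equal VARYING ranges, which
mirrors the target-dependent assumption that made the original selftest fail,
while explicitly sized 32-bit and 16-bit types always differ.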
 


Re: [PATCH] tree-sra: Avoid refreshing into const base decls (PR 100453)

2021-05-25 Thread Eric Botcazou
> LGTM.

Thanks, but a bit too bold because gimplify_and_add can promote the non-static 
DECL to static memory and reinstate DECL_INITIAL, so first hunk adjusted.


* gimplify.c (gimplify_decl_expr): Clear TREE_READONLY on the DECL
when really creating an initialization statement for it.

-- 
Eric Botcazou
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index b62ea0efc1c..ed825a93aa1 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1828,6 +1828,9 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
 	  init = build2 (INIT_EXPR, void_type_node, decl, init);
 	  gimplify_and_add (init, seq_p);
 	  ggc_free (init);
+	  /* Clear TREE_READONLY if we really have an initialization.  */
+	  if (!DECL_INITIAL (decl))
+		TREE_READONLY (decl) = 0;
 	}
 	  else
 	/* We must still examine initializers for static variables


Re: [PATCH 2/5] Convert Walloca pass to RANGE_QUERY(cfun).

2021-05-25 Thread Jeff Law via Gcc-patches




On 5/25/2021 10:17 AM, Aldy Hernandez wrote:

Adjustments per discussion.

OK pending tests?

The latest revisions of #2-#5 are OK once #1 is ACK'd.



Jeff




[PATCH] arm: Auto-vectorization for MVE: vaddv

2021-05-25 Thread Christophe Lyon via Gcc-patches
This patch adds support for the reduc_plus_scal optab with MVE, which
maps to the vaddv instruction.

It moves the reduc_plus_scal_ expander from neon.md to
vec-common.md and adds support for MVE to it.

Since vaddv uses a 32-bit accumulator, we have to truncate its
result.

For instance:
int32_t test__s8x16 (int8_t *a) {
  int i;
  int8_t result = 0;
  for (i=0; i<16; i++) {
result += a[i];
  }
  return result;
}
is compiled into:
  vldrb.8  q3, [r0]
  vaddv.s8 r0, q3
  sxtb     r0, r0
  bx       lr

If we used uint8_t instead of int8_t, we would still use vaddv.s8 r0, q3,
but truncate with uxtb r0, r0.
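
The arithmetic behind this can be checked with a small scalar model.  The
sketch below does not use MVE intrinsics; vaddv_s8_model, vaddv_u8_model,
sxtb_model and uxtb_model are hypothetical helper names that only mimic the
described behaviour: the reduction accumulates into 32 bits, and the result is
then narrowed by sign- or zero-extending the low byte.  It also illustrates
why the signed reduction can serve the unsigned case: the two sums differ by a
multiple of 256, so their low 8 bits agree.

```cpp
#include <array>
#include <cstdint>

// Model of the signed reduction: 16 byte lanes summed into a 32-bit value.
inline std::int32_t
vaddv_s8_model (const std::array<std::int8_t, 16> &q)
{
  std::int32_t acc = 0;
  for (std::int8_t lane : q)
    acc += lane;
  return acc;
}

// Same lanes interpreted as unsigned bytes.
inline std::int32_t
vaddv_u8_model (const std::array<std::uint8_t, 16> &q)
{
  std::int32_t acc = 0;
  for (std::uint8_t lane : q)
    acc += lane;
  return acc;
}

// Model of uxtb: keep the low 8 bits, zero-extended.
inline std::int32_t
uxtb_model (std::int32_t x)
{
  return static_cast<std::int32_t> (static_cast<std::uint32_t> (x) & 0xFFu);
}

// Model of sxtb: keep the low 8 bits, sign-extended.
inline std::int32_t
sxtb_model (std::int32_t x)
{
  const std::int32_t low = uxtb_model (x);
  return low >= 128 ? low - 256 : low;
}
```

For sixteen lanes of value 10 the accumulator holds 160, which narrows to -96
as a signed byte and stays 160 as an unsigned byte.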

2021-05-25  Christophe Lyon  

gcc/
* config/arm/mve.md (mve_vaddvq_): Prefix with '@'.
* config/arm/neon.md (reduc_plus_scal_): Move to ..
* config/arm/vec-common.md: .. here. Add support for MVE.

gcc/testsuite/
* gcc.target/arm/simd/mve-vaddv-1.c: New test.
---
 gcc/config/arm/mve.md |  2 +-
 gcc/config/arm/neon.md| 13 --
 gcc/config/arm/vec-common.md  | 26 +++
 .../gcc.target/arm/simd/mve-vaddv-1.c | 26 +++
 4 files changed, 53 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vaddv-1.c

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 133ebe93cf3..0a6ba80c99d 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -464,7 +464,7 @@ (define_insn "mve_vclsq_s"
 ;;
 ;; [vaddvq_s, vaddvq_u])
 ;;
-(define_insn "mve_vaddvq_"
+(define_insn "@mve_vaddvq_"
   [
(set (match_operand:SI 0 "s_register_operand" "=Te")
(unspec:SI [(match_operand:MVE_2 1 "s_register_operand" "w")]
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 977adef5490..6a6573317cf 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1161,19 +1161,6 @@ (define_expand "reduc_plus_scal_"
   DONE;
 })
 
-(define_expand "reduc_plus_scal_"
-  [(match_operand: 0 "nonimmediate_operand")
-   (match_operand:VQ 1 "s_register_operand")]
-  "ARM_HAVE_NEON__ARITH && !BYTES_BIG_ENDIAN"
-{
-  rtx step1 = gen_reg_rtx (mode);
-
-  emit_insn (gen_quad_halves_plus (step1, operands[1]));
-  emit_insn (gen_reduc_plus_scal_ (operands[0], step1));
-
-  DONE;
-})
-
 (define_expand "reduc_plus_scal_v2di"
   [(match_operand:DI 0 "nonimmediate_operand")
(match_operand:V2DI 1 "s_register_operand")]
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index e8b2901b006..cc136e2865f 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -539,3 +539,29 @@ (define_expand "vec_store_lanesxi"
 emit_insn (gen_mve_vst4q (operands[0], operands[1]));
   DONE;
 })
+
+(define_expand "reduc_plus_scal_"
+  [(match_operand: 0 "nonimmediate_operand")
+   (match_operand:VQ 1 "s_register_operand")]
+  "ARM_HAVE_NEON__ARITH || (TARGET_HAVE_MVE && mode != V4SFmode)
+   && !TARGET_REALLY_IWMMXT
+   && !BYTES_BIG_ENDIAN"
+{
+  if (TARGET_NEON)
+{
+  rtx step1 = gen_reg_rtx (mode);
+
+  emit_insn (gen_quad_halves_plus (step1, operands[1]));
+  emit_insn (gen_reduc_plus_scal_ (operands[0], step1));
+}
+  else
+{
+  /* vaddv generates a 32 bits accumulator.  */
+  rtx op0 = gen_reg_rtx (SImode);
+
+  emit_insn (gen_mve_vaddvq (VADDVQ_S, mode, op0, operands[1]));
+  emit_insn (gen_rtx_SET (operands[0], gen_rtx_SUBREG (mode, op0, 0)));
+}
+
+  DONE;
+})
diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vaddv-1.c b/gcc/testsuite/gcc.target/arm/simd/mve-vaddv-1.c
new file mode 100644
index 000..b6b0bc368f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/mve-vaddv-1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O3" } */
+
+#include 
+
+#define FUNC(SIGN, TYPE, BITS, NB) \
+  TYPE##32_t test_ ##_ ## SIGN ## BITS ## x ## NB (TYPE##BITS##_t *a) { \
+int i; \
+TYPE##BITS##_t result = 0; \
+for (i=0; i

Re: [PATCH 5/5] Cleanup get_range_info

2021-05-25 Thread Aldy Hernandez via Gcc-patches

No changes needed for this patch.

OK pending tests?

Aldy



Re: [PATCH 4/5] Convert remaining passes to RANGE_QUERY.

2021-05-25 Thread Aldy Hernandez via Gcc-patches

Adjustments per discussion.

OK pending tests?

Aldy
From d701627d2b0a3cdfea7a11b3b4cf4105db08dcf5 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Wed, 19 May 2021 18:44:08 +0200
Subject: [PATCH 4/5] Convert remaining passes to get_range_query.

This patch converts the remaining users of get_range_info and
get_ptr_nonnull to the get_range_query API.

No effort was made to move passes away from VR_ANTI_RANGE, or any other
use of deprecated methods.  This was a straight up conversion to the new
API, nothing else.

gcc/ChangeLog:

	* builtins.c (check_nul_terminated_array): Convert to get_range_query.
	(expand_builtin_strnlen): Same.
	(determine_block_size): Same.
	* fold-const.c (expr_not_equal_to): Same.
	* gimple-fold.c (size_must_be_zero_p): Same.
	* gimple-match-head.c: Include gimple-range.h.
	* gimple-pretty-print.c (dump_ssaname_info): Convert to get_range_query.
	* gimple-ssa-warn-restrict.c
	(builtin_memref::extend_offset_range): Same.
	* graphite-sese-to-poly.c (add_param_constraints): Same.
	* internal-fn.c (get_min_precision): Same.
	* ipa-fnsummary.c (set_switch_stmt_execution_predicate): Same.
	* ipa-prop.c (ipa_compute_jump_functions_for_edge): Same.
	* match.pd: Same.
	* tree-data-ref.c (split_constant_offset): Same.
	(dr_step_indicator): Same.
	* tree-dfa.c (get_ref_base_and_extent): Same.
	* tree-scalar-evolution.c (iv_can_overflow_p): Same.
	* tree-ssa-loop-niter.c (refine_value_range_using_guard): Same.
	(determine_value_range): Same.
	(record_nonwrapping_iv): Same.
	(infer_loop_bounds_from_signedness): Same.
	(scev_var_range_cant_overflow): Same.
	* tree-ssa-phiopt.c (two_value_replacement): Same.
	* tree-ssa-pre.c (insert_into_preds_of_block): Same.
	* tree-ssa-reassoc.c (optimize_range_tests_to_bit_test): Same.
	* tree-ssa-strlen.c (handle_builtin_stxncpy_strncat): Same.
	(get_range): Same.
	(dump_strlen_info): Same.
	(set_strlen_range): Same.
	(maybe_diag_stxncpy_trunc): Same.
	(get_len_or_size): Same.
	(handle_integral_assign): Same.
	* tree-ssa-structalias.c (find_what_p_points_to): Same.
	* tree-ssa-uninit.c (find_var_cmp_const): Same.
	* tree-switch-conversion.c (bit_test_cluster::emit): Same.
	* tree-vect-patterns.c (vect_get_range_info): Same.
	(vect_recog_divmod_pattern): Same.
	* tree-vrp.c (intersect_range_with_nonzero_bits): Same.
	(register_edge_assert_for_2): Same.
	(determine_value_range_1): Same.
	* tree.c (get_range_pos_neg): Same.
	* vr-values.c (vr_values::get_lattice_entry): Same.
	(vr_values::update_value_range): Same.
	(simplify_conversion_using_ranges): Same.
---
 gcc/builtins.c | 40 ++--
 gcc/fold-const.c   |  8 +++-
 gcc/gimple-fold.c  |  7 ++-
 gcc/gimple-match-head.c|  1 +
 gcc/gimple-pretty-print.c  | 12 -
 gcc/gimple-ssa-warn-restrict.c |  8 +++-
 gcc/graphite-sese-to-poly.c|  9 +++-
 gcc/internal-fn.c  | 14 +++---
 gcc/ipa-fnsummary.c| 11 -
 gcc/ipa-prop.c | 16 +++
 gcc/match.pd   | 19 ++--
 gcc/tree-data-ref.c| 24 --
 gcc/tree-dfa.c | 14 +-
 gcc/tree-scalar-evolution.c| 13 +-
 gcc/tree-ssa-loop-niter.c  | 81 +---
 gcc/tree-ssa-phiopt.c  | 11 -
 gcc/tree-ssa-pre.c | 19 
 gcc/tree-ssa-reassoc.c |  9 ++--
 gcc/tree-ssa-strlen.c  | 85 --
 gcc/tree-ssa-structalias.c |  8 ++--
 gcc/tree-ssa-uninit.c  |  8 +++-
 gcc/tree-switch-conversion.c   | 10 ++--
 gcc/tree-vect-patterns.c   | 18 +--
 gcc/tree-vrp.c | 21 -
 gcc/tree.c | 13 +++---
 gcc/vr-values.c| 12 +++--
 26 files changed, 332 insertions(+), 159 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index e1b284846b1..135d6bbc2d0 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-outof-ssa.h"
 #include "attr-fnspec.h"
 #include "demangle.h"
+#include "gimple-range.h"
 
 struct target_builtins default_target_builtins;
 #if SWITCHABLE_TARGET
@@ -1214,14 +1215,15 @@ check_nul_terminated_array (tree expr, tree src,
   wide_int bndrng[2];
   if (bound)
 {
-  if (TREE_CODE (bound) == INTEGER_CST)
-	bndrng[0] = bndrng[1] = wi::to_wide (bound);
-  else
-	{
-	  value_range_kind rng = get_range_info (bound, bndrng, bndrng + 1);
-	  if (rng != VR_RANGE)
-	return true;
-	}
+  value_range r;
+
+  get_global_range_query ()->range_of_expr (r, bound);
+
+  if (r.kind () != VR_RANGE)
+	return true;
+
+  bndrng[0] = r.lower_bound ();
+  bndrng[1] = r.upper_bound ();
 
   if (exact)
 	{
@@ -3827,9 +3829,12 @@ expand_builtin_strnlen (tree exp, rtx target, machine_mode target_mode)
 return NULL_RTX;
 
   wide_int min, max;
-  enum value_range_kind rng = get_range_info (bound, &min, &max);
-  if (rng != VR_RANGE)
+  

Re: [PATCH 3/5] Convert evrp pass to RANGE_QUERY(cfun).

2021-05-25 Thread Aldy Hernandez via Gcc-patches

Adjustments per discussion.

OK pending tests?

Aldy
>From 1c275296ab64cd877bce795b9964532c8655fa3f Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Tue, 25 May 2021 17:44:51 +0200
Subject: [PATCH 2/5] Convert evrp pass to get_range_query.

gcc/ChangeLog:

	* gimple-ssa-evrp.c (rvrp_folder::rvrp_folder): Call
	enable_ranger.
	(rvrp_folder::~rvrp_folder): Call disable_ranger.
	(hybrid_folder::hybrid_folder): Call enable_ranger.
	(hybrid_folder::~hybrid_folder): Call disable_ranger.
---
 gcc/gimple-ssa-evrp.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c
index 829fdcdaef2..118d10365a0 100644
--- a/gcc/gimple-ssa-evrp.c
+++ b/gcc/gimple-ssa-evrp.c
@@ -117,11 +117,8 @@ class rvrp_folder : public substitute_and_fold_engine
 public:
 
   rvrp_folder () : substitute_and_fold_engine (), m_simplifier ()
-  { 
-if (param_evrp_mode & EVRP_MODE_TRACE)
-  m_ranger = new trace_ranger ();
-else
-  m_ranger = new gimple_ranger ();
+  {
+m_ranger = enable_ranger (cfun);
 m_simplifier.set_range_query (m_ranger);
   }
   
@@ -129,7 +126,9 @@ public:
   {
 if (dump_file && (dump_flags & TDF_DETAILS))
   m_ranger->dump (dump_file);
-delete m_ranger;
+
+m_ranger->export_global_ranges ();
+disable_ranger (cfun);
   }
 
   tree value_of_expr (tree name, gimple *s = NULL) OVERRIDE
@@ -175,10 +174,7 @@ class hybrid_folder : public evrp_folder
 public:
   hybrid_folder (bool evrp_first)
   {
-if (param_evrp_mode & EVRP_MODE_TRACE)
-  m_ranger = new trace_ranger ();
-else
-  m_ranger = new gimple_ranger ();
+m_ranger = enable_ranger (cfun);
 
 if (evrp_first)
   {
@@ -196,7 +192,9 @@ public:
   {
 if (dump_file && (dump_flags & TDF_DETAILS))
   m_ranger->dump (dump_file);
-delete m_ranger;
+
+m_ranger->export_global_ranges ();
+disable_ranger (cfun);
   }
 
   bool fold_stmt (gimple_stmt_iterator *gsi) OVERRIDE
-- 
2.31.1



Re: [PATCH 2/5] Convert Walloca pass to RANGE_QUERY(cfun).

2021-05-25 Thread Aldy Hernandez via Gcc-patches

Adjustments per discussion.

OK pending tests?

Aldy
>From 97bedf7dc0a7860802461b5fd3e72b687076ae30 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Wed, 19 May 2021 18:27:47 +0200
Subject: [PATCH 3/5] Convert Walloca pass to get_range_query.

This patch converts the Walloca pass to use an on-demand ranger
accessible with get_range_query instead of having to create a ranger and pass
it around.

gcc/ChangeLog:

	* gimple-ssa-warn-alloca.c (alloca_call_type): Use
	  get_range_query instead of query argument.
	(pass_walloca::execute): Enable and disable global ranger.
---
 gcc/gimple-ssa-warn-alloca.c | 10 ++
 gcc/testsuite/gcc.dg/Wstringop-overflow-55.c |  8 
 gcc/testsuite/gcc.dg/pr80776-1.c |  4 +---
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/gimple-ssa-warn-alloca.c b/gcc/gimple-ssa-warn-alloca.c
index e9a24d4d1d0..72480f1d8cb 100644
--- a/gcc/gimple-ssa-warn-alloca.c
+++ b/gcc/gimple-ssa-warn-alloca.c
@@ -165,7 +165,7 @@ adjusted_warn_limit (bool idx)
 // call was created by the gimplifier for a VLA.
 
 static class alloca_type_and_limit
-alloca_call_type (range_query &query, gimple *stmt, bool is_vla)
+alloca_call_type (gimple *stmt, bool is_vla)
 {
   gcc_assert (gimple_alloca_call_p (stmt));
   tree len = gimple_call_arg (stmt, 0);
@@ -217,7 +217,7 @@ alloca_call_type (range_query &query, gimple *stmt, bool is_vla)
   int_range_max r;
   if (warn_limit_specified_p (is_vla)
   && TREE_CODE (len) == SSA_NAME
-  && query.range_of_expr (r, len, stmt)
+  && get_range_query (cfun)->range_of_expr (r, len, stmt)
   && !r.varying_p ())
 {
   // The invalid bits are anything outside of [0, MAX_SIZE].
@@ -256,7 +256,7 @@ in_loop_p (gimple *stmt)
 unsigned int
 pass_walloca::execute (function *fun)
 {
-  gimple_ranger ranger;
+  gimple_ranger *ranger = enable_ranger (fun);
   basic_block bb;
   FOR_EACH_BB_FN (bb, fun)
 {
@@ -290,7 +290,7 @@ pass_walloca::execute (function *fun)
 	continue;
 
 	  class alloca_type_and_limit t
-	= alloca_call_type (ranger, stmt, is_vla);
+	= alloca_call_type (stmt, is_vla);
 
 	  unsigned HOST_WIDE_INT adjusted_alloca_limit
 	= adjusted_warn_limit (false);
@@ -383,6 +383,8 @@ pass_walloca::execute (function *fun)
 	}
 	}
 }
+  ranger->export_global_ranges ();
+  disable_ranger (fun);
   return 0;
 }
 
diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-55.c b/gcc/testsuite/gcc.dg/Wstringop-overflow-55.c
index 25f5b82d9be..8df5cb629ae 100644
--- a/gcc/testsuite/gcc.dg/Wstringop-overflow-55.c
+++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-55.c
@@ -66,7 +66,7 @@ void warn_ptrdiff_anti_range_add (ptrdiff_t i)
 {
   i |= 1;
 
-  char ca5[5];  // { dg-message "at offset \\\[1, 5]" "pr?" { xfail *-*-* } }
+  char ca5[5];  // { dg-message "at offset \\\[1, 5]" "pr?" }
   char *p0 = ca5;   // offset
   char *p1 = p0 + i;//  1-5
   char *p2 = p1 + i;//  2-5
@@ -74,7 +74,7 @@ void warn_ptrdiff_anti_range_add (ptrdiff_t i)
   char *p4 = p3 + i;//  4-5
   char *p5 = p4 + i;//   5
 
-  memset (p5, 0, 5);// { dg-warning "writing 5 bytes into a region of size 0" "pr?" { xfail *-*-* } }
+  memset (p5, 0, 5);// { dg-warning "writing 5 bytes into a region of size" "pr?" }
 
   sink (p0, p1, p2, p3, p4, p5);
 }
@@ -83,7 +83,7 @@ void warn_int_anti_range (int i)
 {
   i |= 1;
 
-  char ca5[5];  // { dg-message "at offset \\\[1, 5]" "pr?" { xfail *-*-* } }
+  char ca5[5];  // { dg-message "at offset \\\[1, 5]" "pr?" }
   char *p0 = ca5;   // offset
   char *p1 = p0 + i;//  1-5
   char *p2 = p1 + i;//  2-5
@@ -91,7 +91,7 @@ void warn_int_anti_range (int i)
   char *p4 = p3 + i;//  4-5
   char *p5 = p4 + i;//   5
 
-  memset (p5, 0, 5);// { dg-warning "writing 5 bytes into a region of size 0" "pr?" { xfail *-*-* } }
+  memset (p5, 0, 5);// { dg-warning "writing 5 bytes into a region of size" "pr?" }
 
   sink (p0, p1, p2, p3, p4, p5);
 }
diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c b/gcc/testsuite/gcc.dg/pr80776-1.c
index af41c0c2ffa..f3a120b6744 100644
--- a/gcc/testsuite/gcc.dg/pr80776-1.c
+++ b/gcc/testsuite/gcc.dg/pr80776-1.c
@@ -17,7 +17,5 @@ Foo (void)
 __builtin_unreachable ();
   if (! (0 <= i && i <= 99))
 __builtin_unreachable ();
-  /* The correctness bits for [E]VRP cannot handle chained conditionals
- when deciding to ignore a unreachable branch for setting SSA range info. */
-  sprintf (number, "%d", i); /* { dg-bogus "writing" "" { xfail *-*-* } } */
+  sprintf (number, "%d", i); /* { dg-bogus "writing" "" } */
 }
-- 
2.31.1



Re: [PATCH 1/5] Common API for accessing global and on-demand ranges.

2021-05-25 Thread Aldy Hernandez via Gcc-patches
The interface is now an inline function for get_range_query() and an 
external function for get_global_range_query(), as discussed.


There are no magic cfun uses behind the scenes.

The annoying downcast is gone.

Passes can now decide if they want to export global ranges after they 
use a ranger.


I've adjusted the ChangeLog entries, as well as the commit text.

I've addressed everything discussed.

OK pending tests?

Aldy
>From eeb7627ddf686d5affb08dcad3674b560ef3ce6d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Wed, 19 May 2021 18:27:05 +0200
Subject: [PATCH 1/5] Common API for accessing global and on-demand ranges.

This patch provides a generic API for accessing global ranges.  It is
meant to replace get_range_info() and get_ptr_nonnull() with one
common interface.  It uses the same API as the ranger (class
range_query), so there will now be one API for accessing local and
global ranges alike.

Follow-up patches will convert all users of get_range_info and
get_ptr_nonnull to this API.

For get_range_info, instead of:

  if (!POINTER_TYPE_P (TREE_TYPE (name)) && SSA_NAME_RANGE_INFO (name))
get_range_info (name, vr);

You can now do:

  get_range_query (cfun)->range_of_expr (vr, name, [stmt]);

...as well as any other of the range_query methods (range_on_edge,
range_of_stmt, value_of_expr, value_on_edge, value_on_stmt, etc).

As per the API, range_of_expr will work on constants, SSA names, and
anything we support in irange::supports_type_p().

For pointers, the interface is the same, so instead of:

  else if (POINTER_TYPE_P (TREE_TYPE (name)) && SSA_NAME_PTR_INFO (name))
{
  if (get_ptr_nonnull (name))
stuff();
}

One can do:

  get_range_query (cfun)->range_of_expr (vr, name, [stmt]);
  if (vr.nonzero_p ())
stuff ();

Along with this interface, we are providing a mechanism by which a
pass can use an on-demand ranger transparently, without having to
change its code.  Of course, this assumes all get_range_info() and
get_ptr_nonnull() users have been converted to the new API, which
follow-up patches will do.

If a pass would prefer to use an on-demand ranger with finer grained
and context aware ranges, all it would have to do is call
enable_ranger() at the beginning of the pass, and disable_ranger() at
the end of the pass.

Note, that to use context aware ranges, any user of range_of_expr()
would need to pass additional context.  For example, the optional
gimple statement (or perhaps use range_on_edge or range_of_stmt).

The observant reader will note that get_range_query is tied to a
struct function, which may not be available in certain contexts, such
as at RTL time, gimple-fold, or some other places where we may or may
not have cfun set.

For cases where we are sure there is no function, you can use
get_global_range_query() instead of get_range_query(fun).  The API is
the same.

For cases where a function may be called with or without a function,
you could use the following idiom:

  range_query *query = cfun ? get_range_query (cfun)
: get_global_range_query ();

  query->range_of_expr (range, expr, [stmt]);

The default range query obtained by get_range_query() is the global
range query, unless the user has enabled an on-demand ranger with
enable_ranger(), in which case it will use the currently active ranger.
That is, until disable_ranger() is called, at which point, we revert
back to global ranges.
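
The fallback behaviour described above can be pictured with a small toy model.  This is only a sketch under stated assumptions: every toy_* name below is hypothetical, nothing here is GCC's actual implementation, and ranges are reduced to strings so the example stays self-contained.

```cpp
// Toy model only: none of these types are GCC's real classes.
#include <cassert>
#include <string>

// Stand-in for the range_query interface: every query answers the same call.
struct toy_range_query {
  virtual ~toy_range_query() = default;
  virtual std::string range_of_expr(const std::string &name) const = 0;
};

// Stand-in for the global query: context-free and always available.
struct toy_global_query : toy_range_query {
  std::string range_of_expr(const std::string &name) const override {
    return "global range of " + name;
  }
};

// Stand-in for an on-demand ranger: only live between enable and disable.
struct toy_ranger : toy_range_query {
  std::string range_of_expr(const std::string &name) const override {
    return "on-demand range of " + name;
  }
};

// Stand-in for struct function; it only remembers the active ranger, if any.
struct toy_function {
  toy_range_query *active = nullptr;
};

toy_range_query *toy_get_global_range_query() {
  static toy_global_query global;
  return &global;
}

// Simplification: a null pointer means "no ranger enabled, use global ranges".
toy_range_query *toy_get_range_query(const toy_function *fun) {
  return fun->active ? fun->active : toy_get_global_range_query();
}

toy_ranger *toy_enable_ranger(toy_function *fun) {
  toy_ranger *ranger = new toy_ranger;
  fun->active = ranger;
  return ranger;
}

void toy_disable_ranger(toy_function *fun) {
  delete fun->active;     // virtual destructor makes this safe
  fun->active = nullptr;  // revert to global ranges
}

// The "with or without a function" idiom from the text above.
std::string toy_lookup(toy_function *fun, const std::string &name) {
  toy_range_query *query = fun ? toy_get_range_query(fun)
                               : toy_get_global_range_query();
  return query->range_of_expr(name);
}
```

In this model a caller never needs to know which query is active: the same toy_lookup call yields the global answer before toy_enable_ranger, the on-demand answer while the ranger is enabled, and the global answer again after toy_disable_ranger.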

We think this provides a generic way of accessing ranges, both
globally and locally, without having to keep track of types,
SSA_NAME_RANGE_INFO, and SSA_NAME_PTR_INFO.  We also hope this can be
used to transition passes from global to on-demand ranges when
appropriate.

gcc/ChangeLog:

	* function.c (allocate_struct_function): Set cfun->x_range_query.
	* function.h (struct function): Declare x_range_query.
	(get_range_query): New.
	(get_global_range_query): New.
	* gimple-range-cache.cc (ssa_global_cache::ssa_global_cache):
	Remove call to safe_grow_cleared.
	* gimple-range.cc (get_range_global): New.
	(gimple_range_global): Move from gimple-range.h.
	(get_global_range_query): New.
	(global_range_query::range_of_expr): New.
	(enable_ranger): New.
	(disable_ranger): New.
	* gimple-range.h (gimple_range_global): Move to gimple-range.cc.
	(class global_range_query): New.
	(enable_ranger): New.
	(disable_ranger): New.
	* gimple-ssa-evrp.c (evrp_folder::~evrp_folder): Rename
	dump_all_value_ranges to dump.
	* tree-vrp.c (vrp_prop::finalize): Same.
	* value-query.cc (range_query::dump): New.
	* value-query.h (range_query::dump): New.
	* vr-values.c (vr_values::dump_all_value_ranges): Rename to...
	(vr_values::dump): ...this.
	* vr-values.h (class vr_values): Rename dump_all_value_ranges to
	dump and make virtual.
---
 gcc/function.c|   4 ++
 gcc/function.h|  17 +
 gcc/gimple-range-cache.cc |   1 -
 gcc/gimple-range.cc   | 126 ++
 gcc/gimple-range.h|  60 +-
 

Re: [PATCH 1/2] c-family: Copy DECL_USER_ALIGN even if DECL_ALIGN is similar.

2021-05-25 Thread Jason Merrill via Gcc-patches

On 5/25/21 11:15 AM, Martin Sebor wrote:

On 5/25/21 4:38 AM, Robin Dapp wrote:

Hi Martin and Jason,


The removal of the dead code looks good to me.  The change to
"re-init lastalign" doesn't seem right.  When it's zero it means
the conflict is between two attributes on the same declaration,
in which case the note shouldn't be printed (it would just point
to the same location as the warning).


Agreed.


Did I get it correctly that you refer to printing a note in e.g. the 
following case?


  inline int __attribute__ ((aligned (16), aligned (4)))
  finline_align (int);


Yes, that's what I was referring to.



I indeed missed this but it could be fixed by checking (on top of the 
patch)


diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 98c98944405..7349da73f14 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -2324,7 +2324,7 @@ common_handle_aligned_attribute (tree *node, 
tree name, tree args, int flags,

    /* Either a prior attribute on the same declaration or one
  on a prior declaration of the same function specifies
  stricter alignment than this attribute.  */
-  bool note = lastalign != 0;
+  bool note = last_decl != decl && lastalign != 0;

As there wasn't any FAIL, I would add another test which checks for this.


That would be great, thank you!

I find the whole logic here a bit convoluted but when there is no real 
last_decl, then last_decl = decl.  A note would not be printed before 
the patch because we erroneously warned about the "conflict" of the 
function's default alignment (8) vs the requested alignment (4).


Ah, the problem is that we only give this warning because of 
DECL_USER_ALIGN on last_decl, but then don't use the alignment of last_decl.


As you say, the logic is convoluted.  Let's simplify it rather than make 
it more convoluted.  One possibility would be to change || to | to avoid 
the shortcut, and then


bool note = lastalign > curalign;
if (note)
  curalign = lastalign;

Jason



Re: [PATCH 8/11] use xxx_no_warning APIs in Objective-C

2021-05-25 Thread Iain Sandoe via Gcc-patches

Martin Sebor  wrote:


On 5/25/21 8:01 AM, Iain Sandoe via Gcc-patches wrote:

Hi Martin
Martin Sebor via Gcc-patches  wrote:

The attached patch replaces the uses of TREE_NO_WARNING in
the Objective-C front end.


I’ve been gradually trying to improve/add locations in the ObjC stuff.
To that end, I wonder if it might be worth considering always supplying
the intended masked warning (rather than omitting this when the node
currently has no location).  I guess that would mean that the  
setter/getter

would need to determine if there was some suitable location (more work
but better abstraction).
This would mean that an improvement/addition to location would  
automatically

gain the improvement in masked warnings.
This is not an objection (the patch is OK for ObjC as is) .. just a  
question,


I'm not sure I understand correctly.

Let me try to clarify: The calls to the {get,set}_no_warning() with
no option introduced by the patch are of two kinds: one where
the intent is to query or suppress all warnings for an expression
(or a DECL, like a synthesized artificial temporary), and another
where it's not apparent from the code which warning is meant to
be queried or suppressed.  I think all the ones in the ObjC front
end are of the latter sort.  (None of these calls are due to
the location being unknown.)


OK, thanks - that clarifies (and is not particularly surprising, there is
plenty to do there).


With that, if you are suggesting to try to find the suitable option
to pass to the latter kind of calls above, I agree.


Something for me to look at when there’s time, then…


 If you have
ideas for what those might be I can give them a try.  Looking at
the ObjC suppression code again, all the set_no_warning() calls
with no option are for what looks like synthesized types, so
maybe that's a clue: could -Wunused be the warning we want to
suppress there?


… I’d not hazard a guess right now,

thanks for the clarification
Iain



Re: [PATCH 8/11] use xxx_no_warning APIs in Objective-C

2021-05-25 Thread Martin Sebor via Gcc-patches

On 5/25/21 8:01 AM, Iain Sandoe via Gcc-patches wrote:

Hi Martin

Martin Sebor via Gcc-patches  wrote:


The attached patch replaces the uses of TREE_NO_WARNING in
the Objective-C front end.



I’ve been gradually trying to improve/add locations in the ObjC stuff.

To that end, I wonder if it might be worth considering always supplying
the intended masked warning (rather than omitting this when the node
currently has no location).  I guess that would mean that the setter/getter
would need to determine if there was some suitable location (more work
but better abstraction).

This would mean that an improvement/addition to location would 
automatically

gain the improvement in masked warnings.

This is not an objection (the patch is OK for ObjC as is) .. just a 
question,


I'm not sure I understand correctly.

Let me try to clarify: The calls to the {get,set}_no_warning() with
no option introduced by the patch are of two kinds: one where
the intent is to query or suppress all warnings for an expression
(or a DECL, like a synthesized artificial temporary), and another
where it's not apparent from the code which warning is meant to
be queried or suppressed.  I think all the ones in the ObjC front
end are of the latter sort.  (None of these calls are due to
the location being unknown.)

With that, if you are suggesting to try to find the suitable option
to pass to the latter kind of calls above, I agree.  If you have
ideas for what those might be I can give them a try.  Looking at
the ObjC suppression code again, all the set_no_warning() calls
with no option are for what looks like synthesized types, so
maybe that's a clue: could -Wunused be the warning we want to
suppress there?

Martin



thanks
Iain





Re: [PATCH] c++, v2: Avoid -Wunused-value false positives on nullptr passed to ellipsis [PR100666]

2021-05-25 Thread Jason Merrill via Gcc-patches

On 5/25/21 11:10 AM, Jakub Jelinek wrote:

On Tue, May 25, 2021 at 09:40:13AM -0400, Jason Merrill wrote:

Please also test the case of a [[nodiscard]] function returning an empty
class type.


Here it is.  I have also extended the decltype(nullptr) nodiscard test.
Retested on x86_64-linux (nothing other than testcases changed), ok for
trunk?


OK.


2021-05-25  Jakub Jelinek  

PR c++/100666
* call.c (convert_arg_to_ellipsis): For expressions with NULLPTR_TYPE
and side-effects, temporarily disable -Wunused-result warning when
building COMPOUND_EXPR.

* g++.dg/cpp1z/nodiscard8.C: New test.
* g++.dg/cpp1z/nodiscard9.C: New test.

--- gcc/cp/call.c.jj2021-05-25 10:55:55.105239017 +0200
+++ gcc/cp/call.c   2021-05-25 17:01:47.524719692 +0200
@@ -8178,7 +8178,10 @@ convert_arg_to_ellipsis (tree arg, tsubs
  {
arg = mark_rvalue_use (arg);
if (TREE_SIDE_EFFECTS (arg))
-   arg = cp_build_compound_expr (arg, null_pointer_node, complain);
+   {
+ warning_sentinel w(warn_unused_result);
+ arg = cp_build_compound_expr (arg, null_pointer_node, complain);
+   }
else
arg = null_pointer_node;
  }
--- gcc/testsuite/g++.dg/cpp1z/nodiscard8.C.jj  2021-05-25 17:01:47.524719692 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/nodiscard8.C 2021-05-25 17:04:34.668904548 
+0200
@@ -0,0 +1,15 @@
+// PR c++/100666
+// { dg-do compile { target c++11 } }
+
+[[nodiscard]] decltype(nullptr) bar ();
+extern void foo (...);
+template <typename T> void qux (T);
+
+void
+baz ()
+{
+  foo (bar ());// { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  bar ();  // { dg-warning "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  auto x = bar (); // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  qux (bar ());// { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+}
--- gcc/testsuite/g++.dg/cpp1z/nodiscard9.C.jj  2021-05-25 17:05:01.907608730 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/nodiscard9.C 2021-05-25 17:06:40.054542742 
+0200
@@ -0,0 +1,22 @@
+// PR c++/100666
+// { dg-do compile { target c++11 } }
+
+struct S {};
+[[nodiscard]] S bar ();
+struct U { S s; };
+[[nodiscard]] U corge ();
+extern void foo (...);
+template <typename T> void qux (T);
+
+void
+baz ()
+{
+  foo (bar ());// { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  bar ();  // { dg-warning "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  auto x = bar (); // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  qux (bar ());// { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  foo (corge ());  // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  corge ();// { dg-warning "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  auto y = corge ();   // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  qux (corge ());  // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+}
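
The call.c hunk above wraps the cp_build_compound_expr call in a new block so that the warning_sentinel only lives for that one call.  The general pattern is a scoped guard that turns a flag off and puts the old value back on scope exit.  A minimal sketch of that pattern, assuming the guard clears the flag on construction and restores it on destruction; toy_warn_flag and toy_flag_sentinel are hypothetical names, not GCC's code:

```cpp
#include <cassert>

// Hypothetical stand-in for a global warning flag such as warn_unused_result.
int toy_warn_flag = 1;

// Scoped guard: clears the flag on construction and restores the saved value
// on destruction, so the suppression cannot leak past the enclosing block.
class toy_flag_sentinel {
  int &flag_;
  int saved_;

public:
  explicit toy_flag_sentinel(int &flag) : flag_(flag), saved_(flag) {
    flag_ = 0;
  }
  ~toy_flag_sentinel() { flag_ = saved_; }

  // Copying would restore the flag twice, so forbid it.
  toy_flag_sentinel(const toy_flag_sentinel &) = delete;
  toy_flag_sentinel &operator=(const toy_flag_sentinel &) = delete;
};

// Returns the flag value observed while the guard is alive.
int toy_flag_inside_guard() {
  toy_flag_sentinel guard(toy_warn_flag);
  return toy_warn_flag;
}
```

Tying the suppression to a block keeps it as narrow as possible: in the sketch the flag is zero only inside toy_flag_inside_guard and is back to its previous value as soon as the function returns.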


Jakub





Re: [PATCH] c++: access for hidden friend of nested class template [PR100502]

2021-05-25 Thread Jason Merrill via Gcc-patches

On 5/25/21 10:50 AM, Patrick Palka wrote:

On Mon, 24 May 2021, Jason Merrill wrote:


On 5/21/21 4:35 PM, Patrick Palka wrote:

Here, during ahead of time access checking for the private member
EnumeratorRange::end_reached_ in the hidden friend f, we're triggering
the assert in enforce_access that verifies we're not trying to add a
dependent access check to TI_DEFERRED_ACCESS_CHECKS.

The special thing about this class member access expression is that it's
considered to be non-type-dependent (so finish_class_member_access_expr
doesn't exit early at template parse time), and then accessible_p
rejects the access (so we don't exit early from enforce access either,
and end up triggering the assert).  I think we're correct to reject it
because a hidden friend is not a member function, so [class.access.nest]
doesn't apply, and also a hidden friend of a nested class is not a
friend of the enclosing class.  (Note that Clang accepts the testcase
and MSVC and ICC reject it.)


Hmm, I think you're right, but that seems inconsistent with the change (long
ago) to give nested classes access to members of the enclosing class.


I guess the question is whether a hidden friend is considered to be a
class member for sake of access checking.  Along that note, I noticed
Clang/GCC/MSVC/ICC all accept the access of A::f in:

   struct A {
   protected:
 static void f();
   };

   struct B : A {
 friend void g() { A::f(); }
   };

But arguably this is valid iff g is considered to be a member of B.
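
For comparison, here is a hedged sketch of the uncontroversial spelling of the same access.  It differs from the example above in two ways, both of which are assumptions made only for illustration: g takes a B parameter so that argument-dependent lookup can find the hidden friend, and the member is named through B rather than A, so the naming class is the class that g is a friend of.

```cpp
#include <cassert>

struct A {
protected:
  static int f() { return 42; }
};

struct B : A {
  // Hidden friend: defined inside B and found only by argument-dependent
  // lookup, hence the B parameter (not present in the thread's example).
  friend int g(const B &) {
    // The naming class is B and g is a friend of B, so the access is allowed
    // regardless of whether a hidden friend counts as a member of B.
    return B::f();
  }
};

// Calls the hidden friend through ADL on the B argument.
int call_g() {
  B b;
  return g(b);
}
```

Spelling the access as A::f() inside g, as in the example above, is the case whose validity depends on whether g is treated as a member of B.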

If we adjust the above example to define the friend g at namespace
scope:

   struct A {
   protected:
 static void f();
   };

   struct B : A {
 friend void g();
   };

   void g() { A::f(); }

then GCC/MSVC/ICC accept and Clang rejects.  But this second example is
definitely invalid since it's just a special case of the example in
[class.protected], which says:

   void fr() {
 ...
 B::j = 5; // error: not a friend of naming class B
 ...
   }




This patch relaxes the problematic assert in enforce_access to check
dependent_scope_p instead of uses_template_parms, which is the more
accurate notion of dependence we care about.


Agreed.


This change alone is
sufficient to fix the ICE, but we now end up diagnosing each access
twice, once at substitution time and again from TI_DEFERRED_ACCESS_CHECKS.
So this patch additionally disables ahead of time access checking
during the call to lookup_member from finish_class_member_access_expr;
we're going to check the same access again at substitution time anyway.


That seems undesirable; it's better to diagnose when parsing if we can. Why is
it going on TI_DEFERRED_ACCESS_CHECKS after we already checked it?


At parse time, a negative accessible_p answer only means "maybe not
accessible" rather than "definitely not accessible", since access
may still be granted to some specialization of the current template
via a friend declaration.  I think we'd need to beef up accessible_p a
bit before we can begin diagnosing accesses at template parse time.
This probably wouldn't be too hairy to implement; I'll look into it.


Ah, I missed that you were saying twice at substitution time.  You're 
right that in general we can't diagnose at parse time.



For now, would the assert relaxation in enforce_access be OK for
trunk/11?


Yes.  And the other hunk is OK for trunk.


Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  For GCC 11, should we just backport the enforce_access hunk?

PR c++/100502

gcc/cp/ChangeLog:

* semantics.c (enforce_access): Relax assert about the type
dependence of the DECL_CONTEXT of the declaration.
* typeck.c (finish_class_member_access_expr): Disable ahead
of time access checking during the member lookup.

gcc/testsuite/ChangeLog:

* g++.dg/template/access37.C: New test.
* g++.dg/template/access37a.C: New test.
---
   gcc/cp/semantics.c|  2 +-
   gcc/cp/typeck.c   |  6 ++
   gcc/testsuite/g++.dg/template/access37.C  | 26 +++
   gcc/testsuite/g++.dg/template/access37a.C |  6 ++
   4 files changed, 39 insertions(+), 1 deletion(-)
   create mode 100644 gcc/testsuite/g++.dg/template/access37.C
   create mode 100644 gcc/testsuite/g++.dg/template/access37a.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 0d590c318fb..0de14316bba 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -365,7 +365,7 @@ enforce_access (tree basetype_path, tree decl, tree
diag_decl,
   check here.  */
gcc_assert (!uses_template_parms (decl));
if (TREE_CODE (decl) == FIELD_DECL)
- gcc_assert (!uses_template_parms (DECL_CONTEXT (decl)));
+ gcc_assert (!dependent_scope_p (DECL_CONTEXT (decl)));
/* Defer this access check until instantiation time.  */
deferred_access_check access_check;
diff --git a/gcc/cp/typeck.c 

Re: [PATCH 1/2] c-family: Copy DECL_USER_ALIGN even if DECL_ALIGN is similar.

2021-05-25 Thread Martin Sebor via Gcc-patches

On 5/25/21 4:38 AM, Robin Dapp wrote:

Hi Martin and Jason,


The removal of the dead code looks good to me.  The change to
"re-init lastalign" doesn't seem right.  When it's zero it means
the conflict is between two attributes on the same declaration,
in which case the note shouldn't be printed (it would just point
to the same location as the warning).


Agreed.


Did I get it correctly that you refer to printing a note in e.g. the 
following case?


  inline int __attribute__ ((aligned (16), aligned (4)))
  finline_align (int);


Yes, that's what I was referring to.



I indeed missed this but it could be fixed by checking (on top of the 
patch)


diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 98c98944405..7349da73f14 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -2324,7 +2324,7 @@ common_handle_aligned_attribute (tree *node, tree 
name, tree args, int flags,

    /* Either a prior attribute on the same declaration or one
  on a prior declaration of the same function specifies
  stricter alignment than this attribute.  */
-  bool note = lastalign != 0;
+  bool note = last_decl != decl && lastalign != 0;

As there wasn't any FAIL, I would add another test which checks for this.


That would be great, thank you!

Martin



I find the whole logic here a bit convoluted but when there is no real 
last_decl, then last_decl = decl.  A note would not be printed before 
the patch because we erroneously warned about the "conflict" of the 
function's default alignment (8) vs the requested alignment (4).


Regards
  Robin




Re: [PATCH] Add 3 target hooks for memset

2021-05-25 Thread H.J. Lu via Gcc-patches
On Tue, May 25, 2021 at 7:34 AM Richard Biener
 wrote:
>
> On Thu, May 20, 2021 at 10:50 PM H.J. Lu  wrote:
> >
> > On Wed, May 19, 2021 at 5:55 AM H.J. Lu  wrote:
> > >
> > > On Wed, May 19, 2021 at 2:25 AM Richard Biener
> > >  wrote:
> > > >
> > > > On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
> > > > >
> > > > > Add TARGET_READ_MEMSET_VALUE and TARGET_GEN_MEMSET_VALUE to support
> > > > > target instructions to duplicate QImode value to TImode/OImode/XImode
> > > > > > value for memset.
> > > > >
> > > > > PR middle-end/90773
> > > > > * builtins.c (builtin_memset_read_str): Call
> > > > > targetm.read_memset_value.
> > > > > (builtin_memset_gen_str): Call targetm.gen_memset_value.
> > > > > * target.def (read_memset_value): New hook.
> > > > > (gen_memset_value): Likewise.
> > > > > > * targhooks.c: Include "builtins.h".
> > > > > (default_read_memset_value): New function.
> > > > > (default_gen_memset_value): Likewise.
> > > > > * targhooks.h (default_read_memset_value): New prototype.
> > > > > (default_gen_memset_value): Likewise.
> > > > > * doc/tm.texi.in: Add TARGET_READ_MEMSET_VALUE and
> > > > > TARGET_GEN_MEMSET_VALUE hooks.
> > > > > * doc/tm.texi: Regenerated.
> > > > > ---
> > > > >  gcc/builtins.c | 47 --
> > > > >  gcc/doc/tm.texi| 16 +
> > > > >  gcc/doc/tm.texi.in |  4 
> > > > >  gcc/target.def | 20 +
> > > > >  gcc/targhooks.c| 56 
> > > > > ++
> > > > >  gcc/targhooks.h|  4 
> > > > >  6 files changed, 104 insertions(+), 43 deletions(-)
> > > > >
> > > > > diff --git a/gcc/builtins.c b/gcc/builtins.c
> > > > > index e1b284846b1..f78a36478ef 100644
> > > > > --- a/gcc/builtins.c
> > > > > +++ b/gcc/builtins.c
> > > > > @@ -6584,24 +6584,11 @@ expand_builtin_strncpy (tree exp, rtx target)
> > > > > previous iteration.  */
> > > > >
> > > > >  rtx
> > > > > -builtin_memset_read_str (void *data, void *prevp,
> > > > > +builtin_memset_read_str (void *data, void *prev,
> > > > >  HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
> > > > >  scalar_int_mode mode)
> > > > >  {
> > > > > -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
> > > > > -  if (prev != nullptr && prev->data != nullptr)
> > > > > -{
> > > > > -  /* Use the previous data in the same mode.  */
> > > > > -  if (prev->mode == mode)
> > > > > -   return prev->data;
> > > > > -}
> > > > > -
> > > > > -  const char *c = (const char *) data;
> > > > > -  char *p = XALLOCAVEC (char, GET_MODE_SIZE (mode));
> > > > > -
> > > > > -  memset (p, *c, GET_MODE_SIZE (mode));
> > > > > -
> > > > > -  return c_readstr (p, mode);
> > > > > +  return targetm.read_memset_value ((const char *) data, prev, mode);
> > > > >  }
> > > > >
> > > > >  /* Callback routine for store_by_pieces.  Return the RTL of a 
> > > > > register
> > > > > @@ -6611,37 +6598,11 @@ builtin_memset_read_str (void *data, void 
> > > > > *prevp,
> > > > > nullptr, it has the RTL info from the previous iteration.  */
> > > > >
> > > > >  static rtx
> > > > > -builtin_memset_gen_str (void *data, void *prevp,
> > > > > +builtin_memset_gen_str (void *data, void *prev,
> > > > > HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
> > > > > scalar_int_mode mode)
> > > > >  {
> > > > > -  rtx target, coeff;
> > > > > -  size_t size;
> > > > > -  char *p;
> > > > > -
> > > > > -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
> > > > > -  if (prev != nullptr && prev->data != nullptr)
> > > > > -{
> > > > > -  /* Use the previous data in the same mode.  */
> > > > > -  if (prev->mode == mode)
> > > > > -   return prev->data;
> > > > > -
> > > > > -  target = simplify_gen_subreg (mode, prev->data, prev->mode, 0);
> > > > > -  if (target != nullptr)
> > > > > -   return target;
> > > > > -}
> > > > > -
> > > > > -  size = GET_MODE_SIZE (mode);
> > > > > -  if (size == 1)
> > > > > -return (rtx) data;
> > > > > -
> > > > > -  p = XALLOCAVEC (char, size);
> > > > > -  memset (p, 1, size);
> > > > > -  coeff = c_readstr (p, mode);
> > > > > -
> > > > > -  target = convert_to_mode (mode, (rtx) data, 1);
> > > > > -  target = expand_mult (mode, target, coeff, NULL_RTX, 1);
> > > > > -  return force_reg (mode, target);
> > > > > +  return targetm.gen_memset_value ((rtx) data, prev, mode);
> > > > >  }
> > > > >
> > > > >  /* Expand expression EXP, which is a call to the memset builtin.  
> > > > > Return
> > > > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > > > > index 85ea9395560..51385044e76 100644
> > > > > --- a/gcc/doc/tm.texi
> > > > > +++ b/gcc/doc/tm.texi
> > > > > @@ -11868,6 +11868,22 @@ This function prepares to emit a conditional 
> > > > > comparison within a sequence
> > > > >   

[PATCH] c++, v2: Avoid -Wunused-value false positives on nullptr passed to ellipsis [PR100666]

2021-05-25 Thread Jakub Jelinek via Gcc-patches
On Tue, May 25, 2021 at 09:40:13AM -0400, Jason Merrill wrote:
> Please also test the case of a [[nodiscard]] function returning an empty
> class type.

Here it is.  I have also extended the decltype(nullptr) nodiscard test.
Retested on x86_64-linux (nothing other than testcases changed), ok for
trunk?

2021-05-25  Jakub Jelinek  

PR c++/100666
* call.c (convert_arg_to_ellipsis): For expressions with NULLPTR_TYPE
and side-effects, temporarily disable -Wunused-result warning when
building COMPOUND_EXPR.

* g++.dg/cpp1z/nodiscard8.C: New test.
* g++.dg/cpp1z/nodiscard9.C: New test.
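
To make the case concrete, here is a small standalone sketch (not part of the
patch) of the situation the call.c change is about: a [[nodiscard]] function
returning decltype(nullptr) is passed through an ellipsis.  The call still has
to be evaluated for its side effects even though the argument value is just a
null pointer, and since the result is used as an argument, a nodiscard warning
there would be a false positive.  The names bar_calls and first_vararg_is_null
are made up for illustration only.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical counter, only here to make the side effect observable.
static int bar_calls = 0;

// A [[nodiscard]] function returning std::nullptr_t, with a side effect.
[[nodiscard]] decltype(nullptr) bar ()
{
  ++bar_calls;
  return nullptr;
}

// Hypothetical variadic helper: reads the first variadic argument as a
// pointer and reports whether it is null.  A std::nullptr_t argument
// passed through the ellipsis is converted to void *.
bool first_vararg_is_null (int count, ...)
{
  assert (count >= 1);
  va_list ap;
  va_start (ap, count);
  void *p = va_arg (ap, void *);
  va_end (ap);
  return p == nullptr;
}
```

Calling first_vararg_is_null (1, bar ()) should return true and bump bar_calls
exactly once; with the patch applied that call is not expected to warn, while a
bare "bar ();" statement should still get the nodiscard warning.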

--- gcc/cp/call.c.jj2021-05-25 10:55:55.105239017 +0200
+++ gcc/cp/call.c   2021-05-25 17:01:47.524719692 +0200
@@ -8178,7 +8178,10 @@ convert_arg_to_ellipsis (tree arg, tsubs
 {
   arg = mark_rvalue_use (arg);
   if (TREE_SIDE_EFFECTS (arg))
-   arg = cp_build_compound_expr (arg, null_pointer_node, complain);
+   {
+ warning_sentinel w(warn_unused_result);
+ arg = cp_build_compound_expr (arg, null_pointer_node, complain);
+   }
   else
arg = null_pointer_node;
 }
--- gcc/testsuite/g++.dg/cpp1z/nodiscard8.C.jj  2021-05-25 17:01:47.524719692 +0200
+++ gcc/testsuite/g++.dg/cpp1z/nodiscard8.C 2021-05-25 17:04:34.668904548 +0200
@@ -0,0 +1,15 @@
+// PR c++/100666
+// { dg-do compile { target c++11 } }
+
+[[nodiscard]] decltype(nullptr) bar ();
+extern void foo (...);
+template <typename T> void qux (T);
+
+void
+baz ()
+{
+  foo (bar ());// { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  bar ();  // { dg-warning "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  auto x = bar (); // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  qux (bar ());// { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+}
--- gcc/testsuite/g++.dg/cpp1z/nodiscard9.C.jj  2021-05-25 17:05:01.907608730 +0200
+++ gcc/testsuite/g++.dg/cpp1z/nodiscard9.C 2021-05-25 17:06:40.054542742 +0200
@@ -0,0 +1,22 @@
+// PR c++/100666
+// { dg-do compile { target c++11 } }
+
+struct S {};
+[[nodiscard]] S bar ();
+struct U { S s; };
+[[nodiscard]] U corge ();
+extern void foo (...);
+template <typename T> void qux (T);
+
+void
+baz ()
+{
+  foo (bar ());// { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  bar ();  // { dg-warning "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  auto x = bar (); // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  qux (bar ());// { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  foo (corge ());  // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  corge ();// { dg-warning "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  auto y = corge ();   // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+  qux (corge ());  // { dg-bogus "ignoring return value of '\[^\n\r]*', declared with attribute 'nodiscard'" }
+}


Jakub



RE: [PATCH 3/4][AArch32]: Add support for sign differing dot-product usdot for NEON.

2021-05-25 Thread Tamar Christina via Gcc-patches
Forgot to include the list

> -Original Message-
> From: Tamar Christina
> Sent: Tuesday, May 25, 2021 3:57 PM
> To: Tamar Christina 
> Cc: Richard Earnshaw ; nd ;
> Ramana Radhakrishnan ; Kyrylo Tkachov
> 
> Subject: RE: [PATCH 3/4][AArch32]: Add support for sign differing dot-
> product usdot for NEON.
> 
> Hi All,
> 
> This is a respin based on the feedback gotten from the AArch64 review.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/arm/neon.md (usdot_prod): New.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/simd/vusdot-autovec.c: New test.
> 
> > -Original Message-
> > From: Gcc-patches  On Behalf Of
> Tamar
> > Christina via Gcc-patches
> > Sent: Wednesday, May 5, 2021 6:42 PM
> > To: gcc Patches 
> > Cc: Richard Earnshaw ; nd ;
> > Ramana Radhakrishnan 
> > Subject: FW: [PATCH 3/4][AArch32]: Add support for sign differing dot-
> > product usdot for NEON.
> >
> > Forgot to CC maintainers..
> >
> > -Original Message-
> > From: Tamar Christina 
> > Sent: Wednesday, May 5, 2021 6:39 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: nd 
> > Subject: [PATCH 3/4][AArch32]: Add support for sign differing
> > dot-product usdot for NEON.
> >
> > Hi All,
> >
> > This adds optabs implementing usdot_prod.
> >
> > The following testcase:
> >
> > #define N 480
> > #define SIGNEDNESS_1 unsigned
> > #define SIGNEDNESS_2 signed
> > #define SIGNEDNESS_3 signed
> > #define SIGNEDNESS_4 unsigned
> >
> > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > SIGNEDNESS_3 char *restrict a,
> >SIGNEDNESS_4 char *restrict b)
> > {
> >   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > {
> >   int av = a[i];
> >   int bv = b[i];
> >   SIGNEDNESS_2 short mult = av * bv;
> >   res += mult;
> > }
> >   return res;
> > }
> >
> > Generates
> >
> > f:
> > vmov.i32q8, #0  @ v4si
> > add r3, r2, #480
> > .L2:
> > vld1.8  {q10}, [r2]!
> > vld1.8  {q9}, [r1]!
> > vusdot.s8   q8, q9, q10
> > cmp r3, r2
> > bne .L2
> > vadd.i32d16, d16, d17
> > vpadd.i32   d16, d16, d16
> > vmov.32 r3, d16[0]
> > add r0, r0, r3
> > bx  lr
> >
> > instead of
> >
> > f:
> > vmov.i32q8, #0  @ v4si
> > add r3, r2, #480
> > .L2:
> > vld1.8  {q9}, [r2]!
> > vld1.8  {q11}, [r1]!
> > cmp r3, r2
> > vmull.s8 q10, d18, d22
> > vmull.s8 q9, d19, d23
> > vaddw.s16   q8, q8, d20
> > vaddw.s16   q8, q8, d21
> > vaddw.s16   q8, q8, d18
> > vaddw.s16   q8, q8, d19
> > bne .L2
> > vadd.i32d16, d16, d17
> > vpadd.i32   d16, d16, d16
> > vmov.32 r3, d16[0]
> > add r0, r0, r3
> > bx  lr
> >
> > For NEON.  I couldn't figure out if the MVE instruction vmlaldav.s16
> > could be used to emulate this.  Because it would require additional
> > widening to work I left MVE out of this patch set but perhaps someone
> > should take a look.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * config/arm/neon.md (usdot_prod): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/arm/simd/vusdot-autovec.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index
> >
> fec2cc91d24b6eff7b6fc8fdd54f39b3d646c468..23ad411178db77c5d19bee7452
> > bc1070331c1aa0 100644
> > --- a/gcc/config/arm/neon.md
> > +++ b/gcc/config/arm/neon.md
> > @@ -3075,6 +3075,24 @@ (define_expand "dot_prod"
> >DONE;
> >  })
> >
> > +;; Auto-vectorizer pattern for usdot
> > +(define_expand "usdot_prod"
> > +  [(set (match_operand:VCVTI 0 "register_operand")
> > +   (plus:VCVTI (unspec:VCVTI [(match_operand: 1
> > +   "register_operand")
> > +  (match_operand: 2
> > +   "register_operand")]
> > +UNSPEC_DOT_US)
> > +   (match_operand:VCVTI 3 "register_operand")))]
> > +  "TARGET_I8MM"
> > +{
> > +  emit_insn (
> > +gen_neon_usdot (operands[3], operands[3], operands[1],
> > +   operands[2]));
> > +  emit_insn (gen_rtx_SET (operands[0], operands[3]));
> > +  DONE;
> > +})
> > +
> >  (define_expand "neon_copysignf"
> >[(match_operand:VCVTF 0 "register_operand")
> > (match_operand:VCVTF 1 "register_operand") diff --git
> > a/gcc/testsuite/gcc.target/arm/simd/vusdot-autovec.c
> > b/gcc/testsuite/gcc.target/arm/simd/vusdot-autovec.c
> > new file mode 100644
> > index
> >
> ..7cc56f68817d77d6950df0ab37
> > 2d6fbaad6b3813
> > --- /dev/null

FW: [PATCH 4/4]middle-end: Add tests middle end generic tests for sign differing dotproduct.

2021-05-25 Thread Tamar Christina via Gcc-patches

Forgot the list...

-Original Message-
From: Tamar Christina 
Sent: Tuesday, May 25, 2021 3:58 PM
To: Tamar Christina 
Cc: nd ; rguent...@suse.de
Subject: RE: [PATCH 4/4]middle-end: Add tests middle end generic tests for sign 
differing dotproduct.

Hi All,

Adding a few more tests

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* doc/sourcebuild.texi (arm_v8_2a_i8mm_neon_hw): Document.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp
(check_effective_target_arm_v8_2a_imm8_neon_ok_nocache,
check_effective_target_arm_v8_2a_i8mm_neon_hw,
check_effective_target_vect_usdot_qi): New.
* gcc.dg/vect/vect-reduc-dot-9.c: New test.
* gcc.dg/vect/vect-reduc-dot-10.c: New test.
* gcc.dg/vect/vect-reduc-dot-11.c: New test.
* gcc.dg/vect/vect-reduc-dot-12.c: New test.
* gcc.dg/vect/vect-reduc-dot-13.c: New test.
* gcc.dg/vect/vect-reduc-dot-14.c: New test.
* gcc.dg/vect/vect-reduc-dot-15.c: New test.
* gcc.dg/vect/vect-reduc-dot-16.c: New test.
* gcc.dg/vect/vect-reduc-dot-17.c: New test.
* gcc.dg/vect/vect-reduc-dot-18.c: New test.

> -Original Message-
> From: Gcc-patches  On Behalf Of Tamar 
> Christina via Gcc-patches
> Sent: Wednesday, May 5, 2021 6:40 PM
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; rguent...@suse.de
> Subject: [PATCH 4/4]middle-end: Add tests middle end generic tests for 
> sign differing dotproduct.
> 
> Hi All,
> 
> This adds testcases to test for auto-vect detection of the new sign 
> differing dot product.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * doc/sourcebuild.texi (arm_v8_2a_i8mm_neon_hw): Document.
> 
> gcc/testsuite/ChangeLog:
> 
>   * lib/target-supports.exp
>   (check_effective_target_arm_v8_2a_imm8_neon_ok_nocache,
>   check_effective_target_arm_v8_2a_i8mm_neon_hw,
>   check_effective_target_vect_usdot_qi): New.
>   * gcc.dg/vect/vect-reduc-dot-10.c: New test.
>   * gcc.dg/vect/vect-reduc-dot-11.c: New test.
>   * gcc.dg/vect/vect-reduc-dot-12.c: New test.
>   * gcc.dg/vect/vect-reduc-dot-13.c: New test.
>   * gcc.dg/vect/vect-reduc-dot-14.c: New test.
>   * gcc.dg/vect/vect-reduc-dot-15.c: New test.
>   * gcc.dg/vect/vect-reduc-dot-16.c: New test.
>   * gcc.dg/vect/vect-reduc-dot-9.c: New test.
> 
> --- inline copy of patch --
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 
> b0001247795947c9dcab1a14884ecd585976dfdd..0034ac9d86b26e6674d71090b
> 9d04b6148f99e17 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1672,6 +1672,10 @@ Target supports a vector dot-product of 
> @code{signed char}.
>  @item vect_udot_qi
>  Target supports a vector dot-product of @code{unsigned char}.
> 
> +@item vect_usdot_qi
> +Target supports a vector dot-product where one operand of the 
> +multiply is @code{signed char} and the other of @code{unsigned char}.
> +
>  @item vect_sdot_hi
>  Target supports a vector dot-product of @code{signed short}.
> 
> @@ -1947,6 +1951,11 @@ ARM target supports executing instructions from 
> ARMv8.2-A with the Dot  Product extension. Some multilibs may be 
> incompatible with these options.
>  Implies arm_v8_2a_dotprod_neon_ok.
> 
> +@item arm_v8_2a_i8mm_neon_hw
> +ARM target supports executing instructions from ARMv8.2-A with the 
> +8-bit Matrix Multiply extension.  Some multilibs may be incompatible 
> +with these options.  Implies arm_v8_2a_i8mm_ok.
> +
>  @item arm_fp16fml_neon_ok
>  @anchor{arm_fp16fml_neon_ok}
>  ARM target supports extensions to generate the @code{VFMAL} and 
> @code{VFMLS} diff --git 
> a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-10.c
> b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-10.c
> new file mode 100644
> index
> ..7ce86965ea97d37c43d96b4d2
> 271df667dcb2aae
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-10.c
> @@ -0,0 +1,13 @@
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target arm_v8_2a_i8mm_neon_hw { target {
> +aarch64*-*-* || arm*-*-* } } } */
> +/* { dg-add-options arm_v8_2a_i8mm }  */
> +
> +#define SIGNEDNESS_1 unsigned
> +#define SIGNEDNESS_2 unsigned
> +#define SIGNEDNESS_3 unsigned
> +#define SIGNEDNESS_4 signed
> +
> +#include "vect-reduc-dot-9.c"
> +
> +/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern:
> +detected" "vect" } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { 
> +target vect_usdot_qi } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-11.c
> b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-11.c
> new file mode 100644
> index
> ..0f7cbbb87ef028f166366aea55
> bc4ef49d2f8e9b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-11.c
> 

[PATCH]AArch64: Correct dot-product auto-vect optab RTL

2021-05-25 Thread Tamar Christina via Gcc-patches
Hi All,

The current RTL for the vectorizer patterns for dot-product is incorrect.
Operand 3 isn't an output operand, so we can't write to it.

This patch fixes the issue and reduces the number of RTL patterns.
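
For readers who want the contract spelled out, a scalar sketch of how I read
it: the accumulator is a plain input and the result goes to a separate output,
so an expansion must not store into the accumulator operand.  The function
name sdot_prod_ref and the lanes parameter are hypothetical, and the
four-bytes-per-32-bit-lane grouping is a model of the byte dot-product, not
code taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a signed dot-product-accumulate: each 32-bit lane j of
// OUT receives ACC[j] plus the sum of four byte products.  ACC is only
// read; the result is written to OUT.
void sdot_prod_ref (std::int32_t *out, const std::int8_t *a,
                    const std::int8_t *b, const std::int32_t *acc,
                    std::size_t lanes)
{
  assert (out && a && b && acc);
  for (std::size_t j = 0; j < lanes; ++j)
    {
      std::int32_t sum = acc[j];
      for (std::size_t k = 0; k < 4; ++k)
        sum += std::int32_t (a[4 * j + k]) * std::int32_t (b[4 * j + k]);
      out[j] = sum;
    }
}
```

Declaring acc as pointer-to-const is the point of the sketch: a caller that
keeps using the accumulator value after the call should see it unchanged.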

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master? And backport to GCC 11, 10, 9?

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def (udot, sdot): Rename to...
(sdot_prod, udot_prod): ...These.
* config/aarch64/aarch64-simd.md (dot_prod): Remove.
((aarch64_dot): Rename to...
(dot_prod): ...This.
* config/aarch64/arm_neon.h (vdot_u32, vdotq_u32, vdot_s32, vdotq_s32):
Update builtins.

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
b/gcc/config/aarch64/aarch64-simd-builtins.def
index 
c869ed9a6ab7d63f0e3d5fe393a93c1cc9142e78..fa3bb7b96710122957933b5c0b0b276256892a4c
 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -362,8 +362,8 @@
   BUILTIN_VSDQ_I_DI (BINOP_UUS, urshl, 0, NONE)
 
   /* Implemented by _prod.  */
-  BUILTIN_VB (TERNOP, sdot, 0, NONE)
-  BUILTIN_VB (TERNOPU, udot, 0, NONE)
+  BUILTIN_VB (TERNOP, sdot_prod, 10, NONE)
+  BUILTIN_VB (TERNOPU, udot_prod, 10, NONE)
   BUILTIN_VB (TERNOP_SSUS, usdot_prod, 10, NONE)
   /* Implemented by aarch64__lane{q}.  */
   BUILTIN_VB (QUADOP_LANE, sdot_lane, 0, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 
253ddbe25d3a86af4b40b056132e6a86a0392ea6..638e2d103bcba0af2292b16efd02046d1195095b
 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -587,8 +587,28 @@ (define_expand "cmul3"
   DONE;
 })
 
-;; These instructions map to the __builtins for the Dot Product operations.
-(define_insn "aarch64_dot"
+;; These expands map to the Dot Product optab the vectorizer checks for
+;; and to the intrinsics pattern.
+;; The auto-vectorizer expects a dot product builtin that also does an
+;; accumulation into the provided register.
+;; Given the following pattern
+;;
+;; for (i=0; idot_prod"
   [(set (match_operand:VS 0 "register_operand" "=w")
(plus:VS (match_operand:VS 1 "register_operand" "0")
(unspec:VS [(match_operand: 2 "register_operand" "w")
@@ -613,41 +633,6 @@ (define_insn "usdot_prod"
   [(set_attr "type" "neon_dot")]
 )
 
-;; These expands map to the Dot Product optab the vectorizer checks for.
-;; The auto-vectorizer expects a dot product builtin that also does an
-;; accumulation into the provided register.
-;; Given the following pattern
-;;
-;; for (i=0; idot_prod"
-  [(set (match_operand:VS 0 "register_operand")
-   (plus:VS (unspec:VS [(match_operand: 1 "register_operand")
-   (match_operand: 2 "register_operand")]
-DOTPROD)
-   (match_operand:VS 3 "register_operand")))]
-  "TARGET_DOTPROD"
-{
-  emit_insn (
-gen_aarch64_dot (operands[3], operands[3], operands[1],
-   operands[2]));
-  emit_insn (gen_rtx_SET (operands[0], operands[3]));
-  DONE;
-})
-
 ;; These instructions map to the __builtins for the Dot Product
 ;; indexed operations.
 (define_insn "aarch64_dot_lane"
@@ -944,8 +929,7 @@ (define_expand "sadv16qi"
rtx ones = force_reg (V16QImode, CONST1_RTX (V16QImode));
rtx abd = gen_reg_rtx (V16QImode);
emit_insn (gen_aarch64_abdv16qi (abd, operands[1], operands[2]));
-   emit_insn (gen_aarch64_udotv16qi (operands[0], operands[3],
- abd, ones));
+   emit_insn (gen_udot_prodv16qi (operands[0], operands[3], abd, ones));
DONE;
   }
 rtx reduc = gen_reg_rtx (V8HImode);
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 
373f06a24ea6ce686d7e0cdf53dd364041c61092..90770411f177f05b4f1bdbd83890734612c31dc3
 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -32112,28 +32112,28 @@ __extension__ extern __inline uint32x2_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vdot_u32 (uint32x2_t __r, uint8x8_t __a, uint8x8_t __b)
 {
-  return __builtin_aarch64_udotv8qi_ (__r, __a, __b);
+  return __builtin_aarch64_udot_prodv8qi_ (__r, __a, __b);
 }
 
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vdotq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b)
 {
-  return __builtin_aarch64_udotv16qi_ (__r, __a, __b);
+  return __builtin_aarch64_udot_prodv16qi_ (__r, __a, __b);
 }
 
 __extension__ extern __inline int32x2_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vdot_s32 (int32x2_t __r, int8x8_t __a, int8x8_t __b)
 {
-  return __builtin_aarch64_sdotv8qi (__r, __a, __b);
+  return __builtin_aarch64_sdot_prodv8qi (__r, __a, __b);
 }
 
 __extension__ extern __inline int32x4_t
 __attribute__ 

Re: [PATCH] c++/88601 - [C/C++] __builtin_shufflevector support

2021-05-25 Thread Martin Sebor via Gcc-patches

On 5/25/21 7:32 AM, Jason Merrill via Gcc-patches wrote:

On 5/25/21 2:57 AM, Richard Biener wrote:

On Fri, 21 May 2021, Jason Merrill wrote:


On 5/21/21 8:33 AM, Richard Biener wrote:

This adds support for the clang __builtin_shufflevector extension to
the C and C++ frontends.  The builtin is lowered to VEC_PERM_EXPR.
Because VEC_PERM_EXPR does not support different sized vector inputs
or result or the special permute index of -1 (don't-care)
c_build_shufflevector applies lowering by widening inputs and output
to the widest vector, replacing -1 by a defined index and
subsetting the final vector if we produced a wider result than
desired.
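
As a rough illustration of the semantics being lowered, here is a scalar
sketch of a two-input shuffle with a don't-care index.  It is not the
c_build_shufflevector implementation; the name shufflevector_ref is
hypothetical, both inputs are assumed to have the same length for brevity, and
resolving -1 to element 0 is an assumption (the description only says a
defined index is substituted).

```cpp
#include <cassert>
#include <cstddef>

// Scalar model of a two-input shuffle.  Both inputs have N elements here
// (an assumption made for brevity).  Index i < N selects v0[i], index
// N <= i < 2N selects v1[i - N], and -1 means "don't care".
// This model resolves -1 to v0[0]; any defined index would do.
void shufflevector_ref (const int *v0, const int *v1, std::size_t n,
                        const int *idx, std::size_t n_out, int *out)
{
  for (std::size_t i = 0; i < n_out; ++i)
    {
      int sel = idx[i] == -1 ? 0 : idx[i];
      assert (sel >= 0 && static_cast<std::size_t> (sel) < 2 * n);
      std::size_t s = static_cast<std::size_t> (sel);
      out[i] = s < n ? v0[s] : v1[s - n];
    }
}
```

Note that the result length n_out comes from the number of indices rather than
from the inputs, which is why the lowering may have to widen or subset vectors
around the VEC_PERM_EXPR.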

Code generation thus can be sub-optimal, followup patches will
aim to fix that by recovering from part of the missing features
during RTL expansion and by relaxing the constraints of the GIMPLE
IL with regard to VEC_PERM_EXPR.

Bootstrapped on x86_64-unknown-linux-gnu, (re-)testing in progress.

Honza - you've filed PR88601, can you point me to testcases that
exercise common uses so we can look at code generation quality
and where time is spent best in improving things?

OK for trunk?

Thanks,
Richard.

2021-05-21  Richard Biener  

PR c++/88601
gcc/c-family/
  * c-common.c: Include tree-vector-builder.h and
  vec-perm-indices.h.
  (c_common_reswords): Add __builtin_shufflevector.
  (c_build_shufflevector): New function.
  * c-common.h (enum rid): Add RID_BUILTIN_SHUFFLEVECTOR.
  (c_build_shufflevector): Declare.

gcc/c/
  * c-decl.c (names_builtin_p): Handle RID_BUILTIN_SHUFFLEVECTOR.
  * c-parser.c (c_parser_postfix_expression): Likewise.

gcc/cp/
  * cp-objcp-common.c (names_builtin_p): Handle
  RID_BUILTIN_SHUFFLEVECTOR.
  * cp-tree.h (build_x_shufflevector): Declare.
  * parser.c (cp_parser_postfix_expression): Handle
  RID_BUILTIN_SHUFFLEVECTOR.
  * pt.c (tsubst_copy_and_build): Handle IFN_SHUFFLEVECTOR.
  * typeck.c (build_x_shufflevector): Build either a lowered
  VEC_PERM_EXPR or an unlowered shufflevector via a temporary
  internal function IFN_SHUFFLEVECTOR.

gcc/
  * internal-fn.c (expand_SHUFFLEVECTOR): Define.
  * internal-fn.def (SHUFFLEVECTOR): New.
  * internal-fn.h (expand_SHUFFLEVECTOR): Declare.

gcc/testsuite/
  * c-c++-common/builtin-shufflevector-2.c: New testcase.
  * c-c++-common/torture/builtin-shufflevector-1.c: Likewise.
  * g++.dg/builtin-shufflevector-1.C: Likewise.
  * g++.dg/builtin-shufflevector-2.C: Likewise.
---
   gcc/c-family/c-common.c   | 139 
++

   gcc/c-family/c-common.h   |   4 +-
   gcc/c/c-decl.c    |   1 +
   gcc/c/c-parser.c  |  38 +
   gcc/cp/cp-objcp-common.c  |   1 +
   gcc/cp/cp-tree.h  |   3 +
   gcc/cp/parser.c   |  15 ++
   gcc/cp/pt.c   |   9 ++
   gcc/cp/typeck.c   |  36 +
   gcc/internal-fn.c |   6 +
   gcc/internal-fn.def   |   3 +
   gcc/internal-fn.h |   1 +
   .../c-c++-common/builtin-shufflevector-2.c    |  18 +++
   .../torture/builtin-shufflevector-1.c |  49 ++
   .../g++.dg/builtin-shufflevector-1.C  |  18 +++
   .../g++.dg/builtin-shufflevector-2.C  |  12 ++
   16 files changed, 352 insertions(+), 1 deletion(-)
   create mode 100644 
gcc/testsuite/c-c++-common/builtin-shufflevector-2.c

   create mode 100644
   gcc/testsuite/c-c++-common/torture/builtin-shufflevector-1.c
   create mode 100644 gcc/testsuite/g++.dg/builtin-shufflevector-1.C
   create mode 100644 gcc/testsuite/g++.dg/builtin-shufflevector-2.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index b7daa2e2654..c4eb2b1c920 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -51,6 +51,8 @@ along with GCC; see the file COPYING3.  If not see
   #include "c-spellcheck.h"
   #include "selftest.h"
   #include "debug.h"
+#include "tree-vector-builder.h"
+#include "vec-perm-indices.h"
   cpp_reader *parse_in;    /* Declared in c-pragma.h.  */
@@ -383,6 +385,7 @@ const struct c_common_resword c_common_reswords[] =
 { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
 { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
 { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
+  { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 },
 { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
 { "__builtin_offsetof", RID_OFFSETOF, 0 },
 { "__builtin_types_compatible_p", RID_TYPES_COMPATIBLE_P, 
D_CONLY },
@@ -1108,6 +,142 @@ c_build_vec_perm_expr (location_t loc, tree 
v0, tree

v1, tree mask,
 return ret;
   }
+/* Build a VEC_PERM_EXPR if V0, V1 are not error_mark_nodes
+   and have vector types, V0 has the same element type as V1, and the
+   number of elements the result is that of MASK.  */
+tree

[PATCH][AArch32]: Correct sdot RTL on aarch32

2021-05-25 Thread Tamar Christina via Gcc-patches
Hi All,

The RTL generated from dot_prod is invalid: operand 3 is a normal input and
cannot be written to.  For the expand it's just another operand, but the
caller does not expect it to be written to.

Bootstrapped Regtested on arm-none-linux-gnueabihf and no issues.

Ok for master? and backport to GCC 11, 10, 9?

Thanks,
Tamar

gcc/ChangeLog:

* config/arm/neon.md (dot_prod): Drop statements.

--- inline copy of patch -- 
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 
61d81646475ce3bf62ece2cec2faf0c1fe978ec1..9602e9993aeebf4ec620d105fd20f64498a3b851
 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -3067,13 +3067,7 @@ (define_expand "dot_prod"
 DOTPROD)
(match_operand:VCVTI 3 "register_operand")))]
   "TARGET_DOTPROD"
-{
-  emit_insn (
-gen_neon_dot (operands[3], operands[3], operands[1],
-operands[2]));
-  emit_insn (gen_rtx_SET (operands[0], operands[3]));
-  DONE;
-})
+)
 
 ;; Auto-vectorizer pattern for usdot
 (define_expand "usdot_prod"





RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-25 Thread Tamar Christina via Gcc-patches
Hi Richi,

Here's a respun version of the patch.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* optabs.def (usdot_prod_optab): New.
* doc/md.texi: Document it and clarify other dot prod optabs.
* optabs-tree.h (enum optab_subtype): Add optab_vector_mixed_sign.
* optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
* optabs.c (expand_widen_pattern_expr): Likewise.
* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
* tree-vect-loop.c (vect_determine_dot_kind): New.
(vectorizable_reduction): Query dot-product kind.
* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
optab subtype.
(vect_joust_widened_type, vect_widened_op_tree): Optionally ignore
mismatch types.
(vect_recog_dot_prod_pattern): Support usdot_prod_optab.
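
For reference, a scalar sketch of why the intermediate type matters for the
mixed-sign case described in the original posting quoted below.  The two
function names are hypothetical, and res is a signed 32-bit value here purely
to keep the arithmetic easy to read (the testcase uses unsigned int).  With a
signed 16-bit temporary the product of a signed and an unsigned byte is kept
exactly; with an unsigned 16-bit temporary negative products wrap and the sum
changes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// signed char * unsigned char, product kept in a signed 16-bit temporary.
// The product range is [-32640, 32385], so nothing is lost.
std::int32_t usdot_signed_tmp (std::int32_t res, const std::int8_t *a,
                               const std::uint8_t *b, std::size_t n)
{
  assert (a && b);
  for (std::size_t i = 0; i < n; ++i)
    {
      int av = a[i];
      int bv = b[i];
      std::int16_t mult = static_cast<std::int16_t> (av * bv);
      res += mult;
    }
  return res;
}

// Same loop, but the product goes through an unsigned 16-bit temporary,
// so negative products wrap modulo 65536 before being accumulated.
std::int32_t usdot_unsigned_tmp (std::int32_t res, const std::int8_t *a,
                                 const std::uint8_t *b, std::size_t n)
{
  assert (a && b);
  for (std::size_t i = 0; i < n; ++i)
    {
      int av = a[i];
      int bv = b[i];
      std::uint16_t mult = static_cast<std::uint16_t> (av * bv);
      res += mult;
    }
  return res;
}
```

For a = {-1, 2, -3, 4}, b = {255, 1, 2, 3} and res = 10 the first function
returns -237, which matches doing everything in 32 bits, while the second
returns 130835.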


> -Original Message-
> From: Richard Biener 
> Sent: Monday, May 10, 2021 2:29 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> On Mon, 10 May 2021, Tamar Christina wrote:
> 
> >
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Monday, May 10, 2021 12:40 PM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd 
> > > Subject: RE: [PATCH 1/4]middle-end Vect: Add support for dot-product
> > > where the sign for the multiplicant changes.
> > >
> > > On Fri, 7 May 2021, Tamar Christina wrote:
> > >
> > > > Hi Richi,
> > > >
> > > > > -Original Message-
> > > > > From: Richard Biener 
> > > > > Sent: Friday, May 7, 2021 12:46 PM
> > > > > To: Tamar Christina 
> > > > > Cc: gcc-patches@gcc.gnu.org; nd 
> > > > > Subject: Re: [PATCH 1/4]middle-end Vect: Add support for
> > > > > dot-product where the sign for the multiplicant changes.
> > > > >
> > > > > On Wed, 5 May 2021, Tamar Christina wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > This patch adds support for a dot product where the sign of
> > > > > > the multiplication arguments differ. i.e. one is signed and
> > > > > > one is unsigned but the precisions are the same.
> > > > > >
> > > > > > #define N 480
> > > > > > #define SIGNEDNESS_1 unsigned
> > > > > > #define SIGNEDNESS_2 signed
> > > > > > #define SIGNEDNESS_3 signed
> > > > > > #define SIGNEDNESS_4 unsigned
> > > > > >
> > > > > > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int
> > > > > > res,
> > > > > > SIGNEDNESS_3 char *restrict a,
> > > > > >SIGNEDNESS_4 char *restrict b) {
> > > > > >   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > > > > > {
> > > > > >   int av = a[i];
> > > > > >   int bv = b[i];
> > > > > >   SIGNEDNESS_2 short mult = av * bv;
> > > > > >   res += mult;
> > > > > > }
> > > > > >   return res;
> > > > > > }
> > > > > >
> > > > > > > The operations are performed as if the operands were extended
> > > > > > > to a 32-bit value.
> > > > > > > As such this operation isn't valid if there is an intermediate
> > > > > > > conversion to an unsigned value, i.e. if SIGNEDNESS_2 is unsigned.
> > > > > > >
> > > > > > > Moreover, if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are
> > > > > > > flipped the same optab is used but the operands are flipped in
> > > > > > > the optab expansion.
> > > > > > >
> > > > > > > To support this the patch extends the dot-product detection to
> > > > > > > optionally ignore operands with different signs and stores
> > > > > > > this information in the optab subtype which is now made a bitfield.
> > > > > > >
> > > > > > > The subtype can now additionally control which optab an EXPR
> > > > > > > can expand to.
> > > > > >
> > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > > > >
> > > > > > Ok for master?
> > > > > >
> > > > > > Thanks,
> > > > > > Tamar
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > > * optabs.def (usdot_prod_optab): New.
> > > > > > * doc/md.texi: Document it.
> > > > > > * optabs-tree.c (optab_for_tree_code): Support
> usdot_prod_optab.
> > > > > > * optabs-tree.h (enum optab_subtype): Likewise.
> > > > > > * optabs.c (expand_widen_pattern_expr): Likewise.
> > > > > > * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > > > > > * tree-vect-loop.c (vect_determine_dot_kind): New.
> > > > > > (vectorizable_reduction): Query dot-product kind.
> > > > > > * tree-vect-patterns.c (vect_supportable_direct_optab_p):
> > > > > > Take
> > > > > optional
> > > > > > optab subtype.
> > > > > > (vect_joust_widened_type, vect_widened_op_tree):
> Optionally
> > > > > ignore
> > > > > > mismatch types.
> > > > > > (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> > > > > >
> > > > > > --- inline copy of patch --
> > > > > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index
> > 

RE: [PATCH 2/4]AArch64: Add support for sign differing dot-product usdot for NEON and SVE.

2021-05-25 Thread Tamar Christina via Gcc-patches
Hi Richard,

> -Original Message-
> From: Richard Sandiford 
> Sent: Monday, May 10, 2021 5:49 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH 2/4]AArch64: Add support for sign differing dot-product
> usdot for NEON and SVE.
> 
> Tamar Christina  writes:
> > diff --git a/gcc/config/aarch64/aarch64-simd.md
> > b/gcc/config/aarch64/aarch64-simd.md
> > index
> >
> 4edee99051c4e2112b546becca47da32aae21df2..c9fb8e702732dd311fb10de1
> 7126
> > 432e2a63a32b 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -648,6 +648,22 @@ (define_expand "dot_prod"
> >DONE;
> >  })
> >
> > +;; Auto-vectorizer pattern for usdot
> > +(define_expand "usdot_prod"
> > +  [(set (match_operand:VS 0 "register_operand")
> > +   (plus:VS (unspec:VS [(match_operand: 1
> "register_operand")
> > +   (match_operand: 2 "register_operand")]
> > +UNSPEC_USDOT)
> > +   (match_operand:VS 3 "register_operand")))]
> > +  "TARGET_I8MM"
> > +{
> > +  emit_insn (
> > +gen_aarch64_usdot (operands[3], operands[3], operands[1],
> > +  operands[2]));
> > +  emit_move_insn (operands[0], operands[3]);
> > +  DONE;
> > +})
> 
> We can't modify operands[3] here; it's an input rather than an output.

Sorry, I should have noticed this. I had blindly copied the existing pattern
for dot-product, and that one looks wrong as well.
I'll send a separate patch to fix that one.
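
For readers following along, here is a scalar sketch of what one 32-bit
lane of a mixed-sign dot product computes, and of why the accumulator is
treated as an input only.  This is an illustration under assumptions, not
the expander itself: usdot_lane, kU and kS are hypothetical names, and the
operand order (unsigned bytes first, signed bytes second) is assumed to
mirror the usdot instruction.

```cpp
#include <cstdint>

// Hypothetical scalar model of one lane: four unsigned 8-bit values are
// multiplied by four signed 8-bit values and the products are added to
// the incoming accumulator.  The accumulator is copied, never written
// back, which mirrors the rule that an input operand must not be modified.
constexpr std::int32_t
usdot_lane (std::int32_t acc,
            const std::uint8_t (&u)[4],
            const std::int8_t (&s)[4])
{
  std::int32_t result = acc;
  for (int i = 0; i < 4; ++i)
    result += static_cast<std::int32_t> (u[i])
              * static_cast<std::int32_t> (s[i]);
  return result;
}

// Illustrative data only.
constexpr std::uint8_t kU[4] = { 255, 1, 2, 3 };
constexpr std::int8_t kS[4] = { -1, 2, -3, 4 };

// 255*-1 + 1*2 + 2*-3 + 3*4 = -247, plus the accumulator 10.
static_assert (usdot_lane (10, kU, kS) == -237, "mixed-sign dot product");
```

Note that 255 is treated as +255 here; reading it as a signed byte (-1)
would change the result, which is the reason a sign-differing optab is
needed at all.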

> 
> It looks like this would work with just the {…} removed though.
> The pattern will match aarch64_usdot of its own accord.
> 
> Even better would be to rename __builtin_aarch64_usdot… to
> __builtin_usdot_prod…, change its arguments so that they line up with the
> optabs, and change arm_neon.h to match.
> 
> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
> > b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
> > new file mode 100644
> > index
> >
> ..b99a945903c043c7410becaf6f
> 09
> > 496dd038410d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
> > @@ -0,0 +1,38 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -march=armv8.2-a+i8mm" } */
> > +
> > +#define N 480
> > +#define SIGNEDNESS_1 unsigned
> > +#define SIGNEDNESS_2 signed
> > +#define SIGNEDNESS_3 signed
> > +#define SIGNEDNESS_4 unsigned
> > +
> > +SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > +SIGNEDNESS_3 char *restrict a,
> > +   SIGNEDNESS_4 char *restrict b)
> > +{
> > +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > +{
> > +  int av = a[i];
> > +  int bv = b[i];
> > +  SIGNEDNESS_2 short mult = av * bv;
> > +  res += mult;
> > +}
> > +  return res;
> > +}
> > +
> > +SIGNEDNESS_1 int __attribute__ ((noipa)) g (SIGNEDNESS_1 int res,
> > +SIGNEDNESS_3 char *restrict b,
> > +   SIGNEDNESS_4 char *restrict a)
> > +{
> > +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > +{
> > +  int av = a[i];
> > +  int bv = b[i];
> > +  SIGNEDNESS_2 short mult = av * bv;
> > +  res += mult;
> > +}
> > +  return res;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
> > b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
> > new file mode 100644
> > index
> >
> ..094dd51cea62e0ba05ec35056
> 57b
> > f05320e5fdbb
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
> > @@ -0,0 +1,38 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -march=armv8.2-a+i8mm+sve" } */
> > +
> > +#define N 480
> > +#define SIGNEDNESS_1 unsigned
> > +#define SIGNEDNESS_2 signed
> > +#define SIGNEDNESS_3 signed
> > +#define SIGNEDNESS_4 unsigned
> > +
> > +SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > +SIGNEDNESS_3 char *restrict a,
> > +   SIGNEDNESS_4 char *restrict b)
> > +{
> > +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > +{
> > +  int av = a[i];
> > +  int bv = b[i];
> > +  SIGNEDNESS_2 short mult = av * bv;
> > +  res += mult;
> > +}
> > +  return res;
> > +}
> > +
> > +SIGNEDNESS_1 int __attribute__ ((noipa)) g (SIGNEDNESS_1 int res,
> > +SIGNEDNESS_3 char *restrict b,
> > +   SIGNEDNESS_4 char *restrict a)
> > +{
> > +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > +{
> > +  int av = a[i];
> > +  int bv = b[i];
> > +  SIGNEDNESS_2 short mult = av * bv;
> > +  res += mult;
> > +}
> > +  return res;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
> 
> Guess this is personal preference, but I don't think the SIGNEDNESS_*
> macros add anything when used like this.  I remember doing something
> similar in the past when including .c files from other .c files(!) in order to
> avoid cut-&-paste, but 

Re: [PATCH] c++: access for hidden friend of nested class template [PR100502]

2021-05-25 Thread Patrick Palka via Gcc-patches
On Mon, 24 May 2021, Jason Merrill wrote:

> On 5/21/21 4:35 PM, Patrick Palka wrote:
> > Here, during ahead of time access checking for the private member
> > EnumeratorRange::end_reached_ in the hidden friend f, we're triggering
> > the assert in enforce_access that verifies we're not trying to add a
> > dependent access check to TI_DEFERRED_ACCESS_CHECKS.
> > 
> > The special thing about this class member access expression is that it's
> > considered to be non-type-dependent (so finish_class_member_access_expr
> > doesn't exit early at template parse time), and then accessible_p
> > rejects the access (so we don't exit early from enforce_access either,
> > and end up triggering the assert).  I think we're correct to reject it
> > because a hidden friend is not a member function, so [class.access.nest]
> > doesn't apply, and also a hidden friend of a nested class is not a
> > friend of the enclosing class.  (Note that Clang accepts the testcase
> > and MSVC and ICC reject it.)
> 
> Hmm, I think you're right, but that seems inconsistent with the change (long
> ago) to give nested classes access to members of the enclosing class.

I guess the question is whether a hidden friend is considered to be a
class member for the sake of access checking.  On that note, I noticed
Clang/GCC/MSVC/ICC all accept the access of A::f in:

  struct A {
  protected:
static void f();
  };

  struct B : A {
friend void g() { A::f(); }
  };

But arguably this is valid iff g is considered to be a member of B.

If we adjust the above example to define the friend g at namespace
scope:

  struct A {
  protected:
static void f();
  };

  struct B : A {
friend void g();
  };

  void g() { A::f(); }

then GCC/MSVC/ICC accept and Clang rejects.  But this second example is
definitely invalid since it's just a special case of the example in
[class.protected], which says:

  void fr() {
...
B::j = 5; // error: not a friend of naming class B
...
  }

> 
> > This patch relaxes the problematic assert in enforce_access to check
> > dependent_scope_p instead of uses_template_parms, which is the more
> > accurate notion of dependence we care about.
> 
> Agreed.
> 
> > This change alone is
> > sufficient to fix the ICE, but we now end up diagnosing each access
> > twice, once at substitution time and again from TI_DEFERRED_ACCESS_CHECKS.
> > So this patch additionally disables ahead of time access checking
> > during the call to lookup_member from finish_class_member_access_expr;
> > we're going to check the same access again at substitution time anyway.
> 
> That seems undesirable; it's better to diagnose when parsing if we can. Why is
> it going on TI_DEFERRED_ACCESS_CHECKS after we already checked it?

At parse time, a negative accessible_p answer only means "maybe not
accessible" rather than "definitely not accessible", since access
may still be granted to some specialization of the current template
via a friend declaration.  I think we'd need to beef up accessible_p a
bit before we can begin diagnosing accesses at template parse time.
This probably wouldn't be too hairy to implement; I'll look into it.
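
To illustrate why a parse-time "no" is only a "maybe", here is a small
made-up example (Vault and Reader are hypothetical names, not taken from
the PR or the testsuite).  The access to Vault::secret is non-dependent,
yet whether it is permitted depends on which specialization of Reader
ends up being instantiated:

```cpp
// Forward declaration so that Vault can befriend one specialization.
template <typename T> struct Reader;

class Vault
{
  // Private by default.
  static constexpr int secret () { return 42; }

  // Only Reader<int> is a friend; other specializations are not.
  friend struct Reader<int>;
};

template <typename T>
struct Reader
{
  // Nothing here depends on T, but the access check cannot be settled
  // while parsing the template: it passes for Reader<int> only.
  static constexpr int read () { return Vault::secret (); }
};

static_assert (Reader<int>::read () == 42,
               "Reader<int> is a friend of Vault");
```

Instantiating Reader<long>::read () would be expected to fail the access
check at instantiation time, which is the case a parse-time diagnostic
must not pre-empt.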

For now, would the assert relaxation in enforce_access be OK for
trunk/11?

> 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?  For GCC 11, should we just backport the enforce_access hunk?
> > 
> > PR c++/100502
> > 
> > gcc/cp/ChangeLog:
> > 
> > * semantics.c (enforce_access): Relax assert about the type
> > dependence of the DECL_CONTEXT of the declaration.
> > * typeck.c (finish_class_member_access_expr): Disable ahead
> > of time access checking during the member lookup.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/template/access37.C: New test.
> > * g++.dg/template/access37a.C: New test.
> > ---
> >   gcc/cp/semantics.c|  2 +-
> >   gcc/cp/typeck.c   |  6 ++
> >   gcc/testsuite/g++.dg/template/access37.C  | 26 +++
> >   gcc/testsuite/g++.dg/template/access37a.C |  6 ++
> >   4 files changed, 39 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/template/access37.C
> >   create mode 100644 gcc/testsuite/g++.dg/template/access37a.C
> > 
> > diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
> > index 0d590c318fb..0de14316bba 100644
> > --- a/gcc/cp/semantics.c
> > +++ b/gcc/cp/semantics.c
> > @@ -365,7 +365,7 @@ enforce_access (tree basetype_path, tree decl, tree
> > diag_decl,
> >check here.  */
> > gcc_assert (!uses_template_parms (decl));
> > if (TREE_CODE (decl) == FIELD_DECL)
> > - gcc_assert (!uses_template_parms (DECL_CONTEXT (decl)));
> > + gcc_assert (!dependent_scope_p (DECL_CONTEXT (decl)));
> > /* Defer this access check until instantiation time.  */
> > deferred_access_check access_check;
> > diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
> > index 

Re: RFA: fix gcc.dg/tree-ssa/popcount4l.c 16 bit failure, improve 64 bit popcount expansion for 32 bit target

2021-05-25 Thread Richard Biener via Gcc-patches
On Mon, May 17, 2021 at 3:18 PM Joern Wolfgang Rennecke
 wrote:
>
> Attached is the updated version of the patch.
> Bootstrapped and regtested on x86_64-pc-linux-gnu.
>
> OK to apply?

+   machine_mode m = mode_for_size ((prec + 1) / 2, MODE_INT, 1).require ();
+   int half_prec = GET_MODE_PRECISION (as_a  (m));
+   if (m != TYPE_MODE (type))

so I'd rather see it as

  opt_machine_mode m = mode_for_size ((prec + 1) / 2, MODE_INT, 1);
  int half_prec = 8;
  if (m.exists ()
  && m.require () != TYPE_MODE (type))
   {
  half_prec = GET_MODE_PRECISION (as_a  (m));
  half_type = build_nonstandard_integer_type (half_prec, 1);
   }

to avoid .require () on a possibly non-existent mode.  Maybe there's a more
clever way to formulate the mode comparison.  CCed Richard for this.
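
As background for the half_prec / half_type computation, the general idea
of expanding a wide popcount through two half-width popcounts can be
sketched as below.  This is only an illustration of the arithmetic,
assuming a 64-bit value split into 32-bit halves; popcount32 and
popcount64_via_halves are hypothetical helpers, not code from the patch.

```cpp
#include <cstdint>

// Portable reference popcount for a 32-bit value: repeatedly clear the
// lowest set bit and count how many times that was possible.
constexpr int
popcount32 (std::uint32_t x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
}

// A 64-bit popcount expressed as the sum of the popcounts of its halves,
// which is what a target with only a 32-bit popcount can do.
constexpr int
popcount64_via_halves (std::uint64_t x)
{
  const std::uint32_t lo = static_cast<std::uint32_t> (x);
  const std::uint32_t hi = static_cast<std::uint32_t> (x >> 32);
  return popcount32 (lo) + popcount32 (hi);
}

static_assert (popcount64_via_halves (0xF0000000000000FFull) == 12,
               "4 bits in the high half plus 8 bits in the low half");
```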

OK with such a change.

Thanks,
Richard.


Re: [PATCH v4 04/12] Remove MAX_BITSIZE_MODE_ANY_INT

2021-05-25 Thread Richard Biener via Gcc-patches
On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
>
> It is only defined for i386 and everyone uses the default:
>
>  #define MAX_BITSIZE_MODE_ANY_INT (64*BITS_PER_UNIT)
>
> Whatever problems we had before, they have been fixed now.

So I don't have a strong recollection here apart from memory usage
considerations with wide-int (possibly fixed by all the trailing-wide-int stuff
we now have).  So I'm fine if the target maintainer is - but then we probably
should remove all vestiges of non-default MAX_BITSIZE_MODE_ANY_INT,
or do we want to keep it just in case?

Thanks,
Richard.

> * config/i386/i386-modes.def (MAX_BITSIZE_MODE_ANY_INT): Removed.
> ---
>  gcc/config/i386/i386-modes.def | 15 +++
>  1 file changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/config/i386/i386-modes.def b/gcc/config/i386/i386-modes.def
> index dbddfd8e48f..4e7014be034 100644
> --- a/gcc/config/i386/i386-modes.def
> +++ b/gcc/config/i386/i386-modes.def
> @@ -107,19 +107,10 @@ INT_MODE (XI, 64);
>  PARTIAL_INT_MODE (HI, 16, P2QI);
>  PARTIAL_INT_MODE (SI, 32, P2HI);
>
> -/* Mode used for signed overflow checking of TImode.  As
> -   MAX_BITSIZE_MODE_ANY_INT is only 160, wide-int.h reserves only that
> -   rounded up to multiple of HOST_BITS_PER_WIDE_INT bits in wide_int etc.,
> -   so OImode is too large.  For the overflow checking we actually need
> -   just 1 or 2 bits beyond TImode precision.  Use 160 bits to have
> -   a multiple of 32.  */
> +/* Mode used for signed overflow checking of TImode.  For the overflow
> +   checking we actually need just 1 or 2 bits beyond TImode precision.
> +   Use 160 bits to have a multiple of 32.  */
>  PARTIAL_INT_MODE (OI, 160, POI);
>
> -/* Keep the OI and XI modes from confusing the compiler into thinking
> -   that these modes could actually be used for computation.  They are
> -   only holders for vectors during data movement.  Include POImode precision
> -   though.  */
> -#define MAX_BITSIZE_MODE_ANY_INT (160)
> -
>  /* The symbol Pmode stands for one of the above machine modes (usually SImode).
> The tm.h file specifies which one.  It is not a distinct mode.  */
> --
> 2.31.1
>


Re: [PATCH] Add 3 target hooks for memset

2021-05-25 Thread Richard Biener via Gcc-patches
On Thu, May 20, 2021 at 10:50 PM H.J. Lu  wrote:
>
> On Wed, May 19, 2021 at 5:55 AM H.J. Lu  wrote:
> >
> > On Wed, May 19, 2021 at 2:25 AM Richard Biener
> >  wrote:
> > >
> > > On Tue, May 18, 2021 at 9:16 PM H.J. Lu  wrote:
> > > >
> > > > Add TARGET_READ_MEMSET_VALUE and TARGET_GEN_MEMSET_VALUE to support
> > > > target instructions to duplicate QImode value to TImode/OImode/XImode
> > > > > value for memset.
> > > >
> > > > PR middle-end/90773
> > > > * builtins.c (builtin_memset_read_str): Call
> > > > targetm.read_memset_value.
> > > > (builtin_memset_gen_str): Call targetm.gen_memset_value.
> > > > * target.def (read_memset_value): New hook.
> > > > (gen_memset_value): Likewise.
> > > > > * targhooks.c: Include "builtins.h".
> > > > (default_read_memset_value): New function.
> > > > (default_gen_memset_value): Likewise.
> > > > * targhooks.h (default_read_memset_value): New prototype.
> > > > (default_gen_memset_value): Likewise.
> > > > * doc/tm.texi.in: Add TARGET_READ_MEMSET_VALUE and
> > > > TARGET_GEN_MEMSET_VALUE hooks.
> > > > * doc/tm.texi: Regenerated.
> > > > ---
> > > >  gcc/builtins.c | 47 --
> > > >  gcc/doc/tm.texi| 16 +
> > > >  gcc/doc/tm.texi.in |  4 
> > > >  gcc/target.def | 20 +
> > > >  gcc/targhooks.c| 56 ++
> > > >  gcc/targhooks.h|  4 
> > > >  6 files changed, 104 insertions(+), 43 deletions(-)
> > > >
> > > > diff --git a/gcc/builtins.c b/gcc/builtins.c
> > > > index e1b284846b1..f78a36478ef 100644
> > > > --- a/gcc/builtins.c
> > > > +++ b/gcc/builtins.c
> > > > @@ -6584,24 +6584,11 @@ expand_builtin_strncpy (tree exp, rtx target)
> > > > previous iteration.  */
> > > >
> > > >  rtx
> > > > -builtin_memset_read_str (void *data, void *prevp,
> > > > +builtin_memset_read_str (void *data, void *prev,
> > > >  HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
> > > >  scalar_int_mode mode)
> > > >  {
> > > > -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
> > > > -  if (prev != nullptr && prev->data != nullptr)
> > > > -{
> > > > -  /* Use the previous data in the same mode.  */
> > > > -  if (prev->mode == mode)
> > > > -   return prev->data;
> > > > -}
> > > > -
> > > > -  const char *c = (const char *) data;
> > > > -  char *p = XALLOCAVEC (char, GET_MODE_SIZE (mode));
> > > > -
> > > > -  memset (p, *c, GET_MODE_SIZE (mode));
> > > > -
> > > > -  return c_readstr (p, mode);
> > > > +  return targetm.read_memset_value ((const char *) data, prev, mode);
> > > >  }
> > > >
> > > >  /* Callback routine for store_by_pieces.  Return the RTL of a register
> > > > @@ -6611,37 +6598,11 @@ builtin_memset_read_str (void *data, void 
> > > > *prevp,
> > > > nullptr, it has the RTL info from the previous iteration.  */
> > > >
> > > >  static rtx
> > > > -builtin_memset_gen_str (void *data, void *prevp,
> > > > +builtin_memset_gen_str (void *data, void *prev,
> > > > HOST_WIDE_INT offset ATTRIBUTE_UNUSED,
> > > > scalar_int_mode mode)
> > > >  {
> > > > -  rtx target, coeff;
> > > > -  size_t size;
> > > > -  char *p;
> > > > -
> > > > -  by_pieces_prev *prev = (by_pieces_prev *) prevp;
> > > > -  if (prev != nullptr && prev->data != nullptr)
> > > > -{
> > > > -  /* Use the previous data in the same mode.  */
> > > > -  if (prev->mode == mode)
> > > > -   return prev->data;
> > > > -
> > > > -  target = simplify_gen_subreg (mode, prev->data, prev->mode, 0);
> > > > -  if (target != nullptr)
> > > > -   return target;
> > > > -}
> > > > -
> > > > -  size = GET_MODE_SIZE (mode);
> > > > -  if (size == 1)
> > > > -return (rtx) data;
> > > > -
> > > > -  p = XALLOCAVEC (char, size);
> > > > -  memset (p, 1, size);
> > > > -  coeff = c_readstr (p, mode);
> > > > -
> > > > -  target = convert_to_mode (mode, (rtx) data, 1);
> > > > -  target = expand_mult (mode, target, coeff, NULL_RTX, 1);
> > > > -  return force_reg (mode, target);
> > > > +  return targetm.gen_memset_value ((rtx) data, prev, mode);
> > > >  }
> > > >
> > > >  /* Expand expression EXP, which is a call to the memset builtin.  
> > > > Return
> > > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > > > index 85ea9395560..51385044e76 100644
> > > > --- a/gcc/doc/tm.texi
> > > > +++ b/gcc/doc/tm.texi
> > > > @@ -11868,6 +11868,22 @@ This function prepares to emit a conditional 
> > > > comparison within a sequence
> > > >   @var{bit_code} is @code{AND} or @code{IOR}, which is the op on the 
> > > > compares.
> > > >  @end deftypefn
> > > >
> > > > +@deftypefn {Target Hook} rtx TARGET_READ_MEMSET_VALUE (const char 
> > > > *@var{c}, void *@var{prev}, scalar_int_mode @var{mode})
> > > > +This function returns the RTL of a constant 

Re: [RFC PATCH] i386: Enable auto-vectorization for 32bit modes (+ testcases)

2021-05-25 Thread Richard Biener via Gcc-patches
On Fri, May 21, 2021 at 5:00 PM Uros Bizjak via Gcc-patches
 wrote:
>
> Here it is, the patch that enables auto-vectorization for 32bit modes.
>
> Sent as RFC, because the patch fails some vectorizer scans, as it
> obviously enables more vectorization to happen:
>
> Running target unix
> FAIL: gcc.dg/vect/pr71264.c -flto -ffat-lto-objects  scan-tree-dump
> vect "vectorized 1 loops in function"
> FAIL: gcc.dg/vect/pr71264.c scan-tree-dump vect "vectorized 1 loops in 
> function"
> FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorizing
> stmts using SLP" 1
> FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorized 3 loops" 1
> FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorized 3 loops" 1
> FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorizing stmts
> using SLP" 3
>
>
> Running target unix/-m32
> FAIL: gcc.dg/vect/no-vfa-vect-101.c scan-tree-dump-times vect "can't
> determine dependence" 1
> FAIL: gcc.dg/vect/no-vfa-vect-102.c scan-tree-dump-times vect
> "possible dependence between data-refs" 1
> FAIL: gcc.dg/vect/no-vfa-vect-102a.c scan-tree-dump-times vect
> "possible dependence between data-refs" 1
> FAIL: gcc.dg/vect/no-vfa-vect-37.c scan-tree-dump-times vect "can't
> determine dependence" 2
> FAIL: gcc.dg/vect/pr71264.c -flto -ffat-lto-objects  scan-tree-dump
> vect "vectorized 1 loops in function"
> FAIL: gcc.dg/vect/pr71264.c scan-tree-dump vect "vectorized 1 loops in 
> function"
> FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorizing
> stmts using SLP" 1
> FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorized 3 loops" 1
> FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorized 3 loops" 1
> FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorizing stmts
> using SLP" 3
> FAIL: gcc.dg/vect/vect-104.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "possible dependence between data-refs" 1
> FAIL: gcc.dg/vect/vect-104.c scan-tree-dump-times vect "possible
> dependence between data-refs" 1

Yeah, it's a bit iffy to adjust expectations.  If there's a way to disable
vectorization for 32bit modes on x86, that might be a way to "fix" them;
otherwise we're lacking a way to query for available vector modes/sizes in
the dejagnu vect targets.  There's available_vector_sizes, but its
implementation is hardly complete, nor is size the only important thing
(FP vs. INT).  At least one could add a vect32 predicate similar to the
existing vect64 one.

Richard.


> Please also note that V4QI and V2HI modes do not use MMX registers, so
> auto-vectorization can also be enabled on 32bit x86 targets.
>
> Uros.


Re: [PATCHv2] Add a couple of A?CST1:CST2 match and simplify optimizations

2021-05-25 Thread Richard Biener via Gcc-patches
On Sun, May 23, 2021 at 12:03 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> Instead of some of the more manual optimizations inside phi-opt,
> it would be a good idea to do a lot of the heavy lifting inside match
> and simplify instead. In the process, this moves the three simple
> A?CST1:CST2 (where CST1 or CST2 is zero) simplifications.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> Differences from V1:
> * Use bit_xor 1 instead of bit_not to fix the problem with boolean types
> which are not 1 bit precision.

OK.

Thanks,
Richard.
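
The identities behind the A?CST1:CST2 patterns can be checked with a few
lines of ordinary C++.  This is a sketch of the arithmetic only, not of
the match.pd implementation; the sel_* helpers and identities_hold are
hypothetical names, and a plain bool stands in for the boolean_type_node
conversion.

```cpp
// Branch-free forms for "a ? CST : 0".
constexpr int sel_1_0 (bool a) { return static_cast<int> (a); }
constexpr int sel_m1_0 (bool a) { return -static_cast<int> (a); }
constexpr int sel_pow2_0 (bool a, int k) { return static_cast<int> (a) << k; }

// Branch-free forms for "a ? 0 : CST"; "a ^ true" inverts the boolean,
// which matches the bit_xor with 1 used instead of bit_not.
constexpr int sel_0_1 (bool a) { return static_cast<int> (a ^ true); }
constexpr int sel_0_m1 (bool a) { return -static_cast<int> (a ^ true); }
constexpr int sel_0_pow2 (bool a, int k)
{
  return static_cast<int> (a ^ true) << k;
}

// Compare each branch-free form against the conditional it replaces.
constexpr bool
identities_hold ()
{
  for (int v = 0; v < 2; ++v)
    {
      const bool a = (v != 0);
      if (sel_1_0 (a) != (a ? 1 : 0)) return false;
      if (sel_m1_0 (a) != (a ? -1 : 0)) return false;
      if (sel_0_1 (a) != (a ? 0 : 1)) return false;
      if (sel_0_m1 (a) != (a ? 0 : -1)) return false;
      // Stop at bit 30 so that 1 << k stays within a 32-bit int.
      for (int k = 0; k < 31; ++k)
        {
          if (sel_pow2_0 (a, k) != (a ? (1 << k) : 0)) return false;
          if (sel_0_pow2 (a, k) != (a ? 0 : (1 << k))) return false;
        }
    }
  return true;
}

static_assert (identities_hold (), "A ? CST1 : CST2 identities");
```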

> Thanks,
> Andrew Pinski
>
> gcc:
> * match.pd (A?CST1:CST2): Add simplifications for A?0:+-1, A?+-1:0,
> A?POW2:0 and A?0:POW2.
> ---
>  gcc/match.pd | 41 +
>  1 file changed, 41 insertions(+)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 1fc6b7b1557..ad6b057c56d 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3711,6 +3711,47 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (if (integer_all_onesp (@1) && integer_zerop (@2))
>  @0
>
> +/* A few simplifications of "a ? CST1 : CST2". */
> +/* NOTE: Only do this on gimple as the if-chain-to-switch
> +   optimization depends on the gimple to have if statements in it. */
> +#if GIMPLE
> +(simplify
> + (cond @0 INTEGER_CST@1 INTEGER_CST@2)
> + (switch
> +  (if (integer_zerop (@2))
> +   (switch
> +/* a ? 1 : 0 -> a if 0 and 1 are integral types. */
> +(if (integer_onep (@1))
> + (convert (convert:boolean_type_node @0)))
> +/* a ? -1 : 0 -> -a. */
> +(if (integer_all_onesp (@1))
> + (negate (convert (convert:boolean_type_node @0
> +/* a ? powerof2cst : 0 -> a << (log2(powerof2cst)) */
> +(if (!POINTER_TYPE_P (type) && integer_pow2p (@1))
> + (with {
> +   tree shift = build_int_cst (integer_type_node, tree_log2 (@1));
> +  }
> +  (lshift (convert (convert:boolean_type_node @0)) { shift; })
> +  (if (integer_zerop (@1))
> +   (with {
> +  tree booltrue = constant_boolean_node (true, boolean_type_node);
> +}
> +(switch
> + /* a ? 0 : 1 -> !a. */
> + (if (integer_onep (@2))
> +  (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } )))
> + /* a ? 0 : -1 -> -(!a). */
> + (if (integer_all_onesp (@2))
> +  (negate (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } 
> 
> + /* a ? 0 : powerof2cst -> (!a) << (log2(powerof2cst)) */
> + (if (!POINTER_TYPE_P (type) && integer_pow2p (@2))
> +  (with {
> +   tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
> +   }
> +   (lshift (convert (bit_xor (convert:boolean_type_node @0) { booltrue; 
> } ))
> +{ shift; }
> +#endif
> +
>  /* Simplification moved from fold_cond_expr_with_comparison.  It may also
> be extended.  */
>  /* This pattern implements two kinds simplification:
> --
> 2.17.1
>


Re: [PATCHv2] Optimize x < 0 ? ~y : y to (x >> 31) ^ y in match.pd

2021-05-25 Thread Richard Biener via Gcc-patches
On Mon, May 24, 2021 at 3:27 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> This copies the optimization that is done in phiopt for
> "x < 0 ? ~y : y to (x >> 31) ^ y" into match.pd. The code
> for phiopt is kept around until phiopt uses match.pd (which
> I am working towards).
>
> Note the original testcase is now optimized early on and I added a
> new testcase to optimize during phiopt.
>
> OK?  Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> Thanks,
> Andrew Pinski
>
> Differences from v1:
> V2: Add check for integral type to make sure vector types are not done.
>
> gcc:
> * match.pd (x < 0 ? ~y : y): New patterns.
>
> gcc/testsuite:
> * gcc.dg/tree-ssa/pr96928.c: Update test for slightly different IR.
> * gcc.dg/tree-ssa/pr96928-1.c: New testcase.
> ---
>  gcc/match.pd  | 32 +++
>  gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c | 48 +++
>  gcc/testsuite/gcc.dg/tree-ssa/pr96928.c   |  7 +++-
>  3 files changed, 85 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index ad6b057c56d..dd730814942 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -4875,6 +4875,38 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(cmp (bit_and@2 @0 integer_pow2p@1) @1)
>(icmp @2 { build_zero_cst (TREE_TYPE (@0)); })))
>
> +(for cmp (ge lt)
> +/* x < 0 ? ~y : y into (x >> (prec-1)) ^ y. */
> +/* x >= 0 ? ~y : y into ~((x >> (prec-1)) ^ y). */
> + (simplify
> +  (cond (cmp @0 integer_zerop) (bit_not @1) @1)
> +   (if (INTEGRAL_TYPE_P (type)
> +   && INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +&& !TYPE_UNSIGNED (TREE_TYPE (@0))
> +&& TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type))
> +(with
> + {
> +   tree shifter = build_int_cst (integer_type_node, TYPE_PRECISION (type) - 1);
> + }
> +(if (cmp == LT_EXPR)
> + (bit_xor (convert (rshift @0 {shifter;})) @1)
> + (bit_not (bit_xor (convert (rshift @0 {shifter;})) @1))
> +/* x < 0 ? y : ~y into ~((x >> (prec-1)) ^ y). */
> +/* x >= 0 ? y : ~y into (x >> (prec-1)) ^ y. */
> + (simplify
> +  (cond (cmp @0 integer_zerop) @1 (bit_not @1))
> +   (if (INTEGRAL_TYPE_P (type)
> +   && INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +&& !TYPE_UNSIGNED (TREE_TYPE (@0))
> +&& TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type))
> +(with
> + {
> +   tree shifter = build_int_cst (integer_type_node, TYPE_PRECISION (type) - 1);
> + }
> +(if (cmp == GE_EXPR)
> + (bit_xor (convert (rshift @0 {shifter;})) @1)
> + (bit_not (bit_xor (convert (rshift @0 {shifter;})) @1)))

I wonder if it makes sense to support

 (for cmp (ge lt)
  (cond:c (cmp @0 integer_zerop) (bit_not @1) @1)
  ...

similar to how we support (cmp:c ...), so (cond:c (cmp ...) @0 @1) would
lower to

  (cond (cmp ..) @0 @1)

and

  (cond (invert_tree_comparison (cmp) ..) @1 @0)

with the caveat that invert_tree_comparison doesn't work for all compares
with HONOR_NANS (and thus compares ever matching FP codes).  Thus the
implementation might be a bit tricky.

Anyway, the patch looks OK to me, we can ponder over match.pd extensions
as followup.

Thanks,
Richard.
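
For reference, the sign-mask identity the patterns rely on can be
sketched in plain C++.  This is an illustration under assumptions, not
the match.pd code: flip_if_negative, flip_if_nonnegative and
matches_conditional are hypothetical names, and the sketch assumes that
right-shifting a negative int is an arithmetic shift (as it is with GCC
and Clang, and guaranteed from C++20 on).

```cpp
#include <limits>

// prec - 1: the number of value bits in int (31 for a 32-bit int).
constexpr int kShift = std::numeric_limits<int>::digits;

// x < 0 ? ~y : y.  The shift yields all ones for negative x and zero
// otherwise, so the XOR either inverts y or leaves it alone.
constexpr int
flip_if_negative (int x, int y)
{
  return (x >> kShift) ^ y;
}

// x >= 0 ? ~y : y, i.e. the same thing with the result inverted.
constexpr int
flip_if_nonnegative (int x, int y)
{
  return ~((x >> kShift) ^ y);
}

// Compare against the conditionals, including the swapped-arm form
// "x >= 0 ? y : ~y", which equals "x < 0 ? ~y : y" for integers.
constexpr bool
matches_conditional ()
{
  const int vals[] = { std::numeric_limits<int>::min (), -1, 0, 1,
                       std::numeric_limits<int>::max () };
  for (int x : vals)
    for (int y : vals)
      {
        if (flip_if_negative (x, y) != (x < 0 ? ~y : y)) return false;
        if (flip_if_negative (x, y) != (x >= 0 ? y : ~y)) return false;
        if (flip_if_nonnegative (x, y) != (x >= 0 ? ~y : y)) return false;
      }
  return true;
}

static_assert (matches_conditional (), "sign-mask XOR identity");
```

The swapped-arm equivalence checked above holds for integer comparisons;
it is exactly the step that needs care for floating-point compares when
NaNs are honored.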

>  /* If we have (A & C) != 0 ? D : 0 where C and D are powers of 2,
> convert this into a shift followed by ANDing with D.  */
>  (simplify
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> new file mode 100644
> index 000..a2770e5e896
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> @@ -0,0 +1,48 @@
> +/* PR tree-optimization/96928 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-phiopt2" } */
> +/* { dg-final { scan-tree-dump-times " = a_\[0-9]*\\\(D\\\) >> " 5 "phiopt2" 
> } } */
> +/* { dg-final { scan-tree-dump-times " = ~c_\[0-9]*\\\(D\\\);" 1 "phiopt2" } 
> } */
> +/* { dg-final { scan-tree-dump-times " = ~" 1 "phiopt2" } } */
> +/* { dg-final { scan-tree-dump-times " = \[abc_0-9\\\(\\\)D]* \\\^ " 5 
> "phiopt2" } } */
> +/* { dg-final { scan-tree-dump-not "a < 0" "phiopt2" } } */
> +
> +int
> +foo (int a)
> +{
> +  if (a < 0)
> +return ~a;
> +  return a;
> +}
> +
> +int
> +bar (int a, int b)
> +{
> +  if (a < 0)
> +return ~b;
> +  return b;
> +}
> +
> +unsigned
> +baz (int a, unsigned int b)
> +{
> +  if (a < 0)
> +return ~b;
> +  return b;
> +}
> +
> +unsigned
> +qux (int a, unsigned int c)
> +{
> +  if (a >= 0)
> +return ~c;
> +  return c;
> +}
> +
> +int
> +corge (int a, int b)
> +{
> +  if (a >= 0)
> +return b;
> +  return ~b;
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96928.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr96928.c
> index 20913572691..e8fd82fc26e 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96928.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96928.c
> @@ -1,8 +1,11 @@
>  /* PR tree-optimization/96928 */
>  /* { dg-do 

Re: [PATCH] c++tools: Include for exit [PR100731]

2021-05-25 Thread Jakub Jelinek via Gcc-patches
On Tue, May 25, 2021 at 11:18:55AM +0200, Richard Biener via Gcc-patches wrote:
> On Tue, May 25, 2021 at 11:15 AM Jakub Jelinek via Gcc-patches
>  wrote:
> >
> > Hi!
> >
> > This TU uses exit, but doesn't include  or  and relies
> > on some other header to include it indirectly, which apparently doesn't
> > happen on reporter's host.
> >
> > The other  headers aren't guarded either and we rely on a compiler
> > capable of C++11, so maybe we can rely on  being around
> > unconditionally.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/11?
> 
> OK, but as the reporter notes none of the functions pulled by
> c* are std:: qualified at calls ... is this not a requirement?

Jonathan said that too.  But the functions from the other  headers
are also used without std:: qualification.
In gcc/ I think we typically include both  and  headers and use
the global namespace entrypoints, but perhaps we should just use std::
qualification in c++tools.

> > 2021-05-25  Jakub Jelinek  
> >
> > PR bootstrap/100731
> > * server.cc: Include .
> >
> > --- c++tools/server.cc.jj   2021-05-24 14:20:01.905748402 +0200
> > +++ c++tools/server.cc  2021-05-24 14:24:29.760813389 +0200
> > @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  // OS
> >  #include 
> >  #include 

Jakub



Re: [PATCH 8/11] use xxx_no_warning APIs in Objective-C

2021-05-25 Thread Iain Sandoe via Gcc-patches

Hi Martin

Martin Sebor via Gcc-patches  wrote:


The attached patch replaces the uses of TREE_NO_WARNING in
the Objective-C front end.



I’ve been gradually trying to improve/add locations in the ObjC stuff.

To that end, I wonder if it might be worth considering always supplying
the intended masked warning (rather than omitting this when the node
currently has no location).  I guess that would mean that the setter/getter
would need to determine if there was some suitable location (more work
but better abstraction).

This would mean that an improvement/addition to location would automatically
gain the improvement in masked warnings.

This is not an objection (the patch is OK for ObjC as is) .. just a question,

thanks
Iain



Re: [PATCH] Use match-and-simplify in phi-opt

2021-05-25 Thread Richard Biener via Gcc-patches
On Mon, May 24, 2021 at 4:09 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> To simplify PHI-OPT and future improvements to it, use
> match-and-simplify: in most (but not all) cases it reduces how
> much code needs to be added.
>
> This depends on the following two patches:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571033.html
> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571054.html
> As this patch removes those parts from phiopt.
>
> Note I will be looking to move two_value_replacement and
> value_replacement to match-and-simplify next.
>
> Note also there is one latent bug found while working
> on this: https://gcc.gnu.org/PR100733 .
>
> OK?  Bootstrapped and tested on x86_64-linux-gnu with no regressions and all 
> languages.
>
> Thanks,
> Andrew Pinski
>
> gcc/ChangeLog:
>
> * tree-ssa-phiopt.c: Include explow.h.
> Fix up comment before the pass structure.
> (conditional_replacement): Remove function.
> (xor_replacement): Remove function.
> (match_simplify_replacement): New function.
> (tree_ssa_phiopt_worker): Don't call conditional_replacement
> or xor_replacement. Call match_simplify_replacement
> if everything else fails to happen.
> (block_with_single_simple_statement): New function.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr96928-1.c: Fix testcase for now that ~
> happens on the outside of the bit_xor.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c |   4 +-
>  gcc/tree-ssa-phiopt.c | 478 ++
>  2 files changed, 229 insertions(+), 253 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> index a2770e5e896..2e86620da11 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96928-1.c
> @@ -1,9 +1,9 @@
>  /* PR tree-optimization/96928 */
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-phiopt2" } */
> +/* { dg-options "-O2 -fdump-tree-phiopt2 -fdump-tree-optimized" } */
>  /* { dg-final { scan-tree-dump-times " = a_\[0-9]*\\\(D\\\) >> " 5 "phiopt2" 
> } } */
>  /* { dg-final { scan-tree-dump-times " = ~c_\[0-9]*\\\(D\\\);" 1 "phiopt2" } 
> } */
> -/* { dg-final { scan-tree-dump-times " = ~" 1 "phiopt2" } } */
> +/* { dg-final { scan-tree-dump-times " = ~" 1 "optimized" } } */
>  /* { dg-final { scan-tree-dump-times " = \[abc_0-9\\\(\\\)D]* \\\^ " 5 
> "phiopt2" } } */
>  /* { dg-final { scan-tree-dump-not "a < 0" "phiopt2" } } */
>
> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> index f133659a781..f7c82cf192f 100644
> --- a/gcc/tree-ssa-phiopt.c
> +++ b/gcc/tree-ssa-phiopt.c
> @@ -48,12 +48,11 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-eh.h"
>  #include "gimple-fold.h"
>  #include "internal-fn.h"
> +#include "explow.h" /* For promote_mode. */
>
>  static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
>  static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
>tree, tree);
> -static bool conditional_replacement (basic_block, basic_block,
> -edge, edge, gphi *, tree, tree);
>  static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, 
> tree,
> gimple *);
>  static int value_replacement (basic_block, basic_block,
> @@ -62,8 +61,6 @@ static bool minmax_replacement (basic_block, basic_block,
> edge, edge, gphi *, tree, tree);
>  static bool abs_replacement (basic_block, basic_block,
>  edge, edge, gphi *, tree, tree);
> -static bool xor_replacement (basic_block, basic_block,
> -edge, edge, gphi *, tree, tree);
>  static bool spaceship_replacement (basic_block, basic_block,
>edge, edge, gphi *, tree, tree);
>  static bool cond_removal_in_popcount_clz_ctz_pattern (basic_block, 
> basic_block,
> @@ -71,6 +68,8 @@ static bool cond_removal_in_popcount_clz_ctz_pattern 
> (basic_block, basic_block,
>   tree, tree);
>  static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> hash_set *);
> +static bool match_simplify_replacement (basic_block, basic_block,
> +   edge, edge, gimple_seq, bool, bool);
>  static bool cond_if_else_store_replacement (basic_block, basic_block, 
> basic_block);
>  static hash_set * get_non_trapping ();
>  static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> @@ -319,7 +318,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool 
> do_hoist_loads, bool early_p)
>
>   phi = single_non_singleton_phi_for_edges (phis, e1, e2);
>   if (!phi)
> -   continue;
> +   goto 

Re: [PATCH] c++: Avoid -Wunused-value false positives on nullptr passed to ellipsis [PR100666]

2021-05-25 Thread Jason Merrill via Gcc-patches

On 5/25/21 4:44 AM, Jakub Jelinek wrote:

When passing expressions with decltype(nullptr) type with side-effects to
ellipsis, we pass (void *)0 instead, but for the side-effects evaluate them
on the lhs of a COMPOUND_EXPR.  Unfortunately that means we warn about it
if the expression is a call to nodiscard marked function, even when the
result is really used, just needs to be transformed.


Please also test the case of a [[nodiscard]] function returning an empty 
class type.
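
A minimal sketch of what such an additional case could look like, next to
the nullptr one. All names here are hypothetical and this is not the test
that was committed; it only shows the two call shapes being discussed.

```cpp
#include <cassert>

// Hypothetical names throughout; not taken from the GCC testsuite.
struct Empty {};

static int calls = 0;

[[nodiscard]] decltype(nullptr) make_null () { ++calls; return nullptr; }
[[nodiscard]] Empty make_empty () { ++calls; return Empty{}; }

// Variadic sink: the arguments count as used because they are passed on,
// even though the callee never reads them.
void sink (...) {}

void exercise ()
{
  sink (make_null ());   // nullptr_t result handed to the ellipsis
  sink (make_empty ());  // empty class passed by value through the ellipsis
}

void demo_exercise ()
{
  int before = calls;
  exercise ();
  assert (calls == before + 2);  // both calls were evaluated for side effects
}
```

Neither call in exercise() discards its result, so one would not expect a
nodiscard diagnostic for either of them.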



Fixed by adding a warning_sentinel, bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?



2021-05-25  Jakub Jelinek  

PR c++/100666
* call.c (convert_arg_to_ellipsis): For expressions with NULLPTR_TYPE
and side-effects, temporarily disable -Wunused-result warning when
building COMPOUND_EXPR.

* g++.dg/cpp1z/nodiscard8.C: New test.

--- gcc/cp/call.c.jj	2021-05-21 10:34:09.139562923 +0200
+++ gcc/cp/call.c   2021-05-24 18:36:35.041184496 +0200
@@ -8178,7 +8178,10 @@ convert_arg_to_ellipsis (tree arg, tsubs
  {
arg = mark_rvalue_use (arg);
if (TREE_SIDE_EFFECTS (arg))
-   arg = cp_build_compound_expr (arg, null_pointer_node, complain);
+   {
+ warning_sentinel w(warn_unused_result);
+ arg = cp_build_compound_expr (arg, null_pointer_node, complain);
+   }
else
arg = null_pointer_node;
  }
--- gcc/testsuite/g++.dg/cpp1z/nodiscard8.C.jj  2021-05-24 19:14:43.472158432 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/nodiscard8.C 2021-05-24 19:13:54.959688504 
+0200
@@ -0,0 +1,11 @@
+// PR c++/100666
+// { dg-do compile { target c++11 } }
+
+[[nodiscard]] decltype(nullptr) bar ();
+extern void foo (...);
+
+void
+baz ()
+{
+  foo (bar ());// { dg-bogus "ignoring return value of '\[^\n\r]*', 
declared with attribute 'nodiscard'" }
+}

Jakub





[PATCH, OpenMP 5.0] Remove array section base-pointer mapping semantics, and other front-end adjustments (mainline trunk)

2021-05-25 Thread Chung-Lin Tang

Hi Jakub,
this is a version of this patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570075.html
for mainline trunk.

This patch largely implements three pieces of functionality:

(1) Per discussion and clarification on the omp-lang mailing list,
standards-conforming behavior for mapping array sections should *NOT* also map 
the base-pointer,
i.e., for this code:

struct S { int *ptr; ... };
struct S s;
#pragma omp target enter data map(to: s.ptr[:100])

Currently we generate after gimplify:
#pragma omp target enter data map(struct:s [len: 1]) map(alloc:s.ptr [len: 8]) \
   map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 
0])

which is deemed incorrect. After this patch, the gimplify results are now 
adjusted to:
#pragma omp target enter data map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0])
(the attach operation is still generated, and if s.ptr is already mapped prior, 
attachment will happen)

The correct way of achieving the base-pointer-also-mapped behavior would be to 
use:
#pragma omp target enter data map(to: s.ptr, s.ptr[:100])
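
As a rough illustration of that form on a target construct, here is a
minimal sketch. The function names are made up, it assumes a compiler
implementing the OpenMP 5.0 behavior described above, and it has not been
verified on an offloading setup. Built without -fopenmp the pragma is
ignored and the loop simply runs on the host.

```cpp
#include <cassert>
#include <vector>

struct S { int *ptr; };

// Sum the first n elements reachable through s.ptr inside a target region.
// Hypothetical helper, not part of the patch.
long sum_section (S s, int n)
{
  long total = 0;
  // List both the pointer member and the array section, which is the form
  // the message above recommends when the base pointer should be mapped too.
  #pragma omp target map(to: s.ptr, s.ptr[:n]) map(tofrom: total)
  for (int i = 0; i < n; ++i)
    total += s.ptr[i];
  return total;
}

void demo_sum_section ()
{
  std::vector<int> data (100, 2);
  S s = { data.data () };
  assert (sum_section (s, 100) == 200);
}
```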

This adjustment in behavior required a number of small adjustments here and 
there in gimplify, including
to accommodate map sequences for C++ references.

There is also a small Fortran front-end patch involved (hence CCing Tobias and 
fortran@).
The new gimplify processing changed behavior in handling 
GOMP_MAP_ALWAYS_POINTER maps such that
libgomp.fortran/struct-elem-map-1.f90 regressed. It appeared that the 
Fortran FE was generating
a GOMP_MAP_ALWAYS_POINTER for array types, which didn't seem quite correct, and 
the pre-patch behavior
was removing this map anyway. I have a small change in 
trans-openmp.c:gfc_trans_omp_array_section
to not generate the map in this case, and so far no bad test results.

(2) The second part (though kind of related to the first above) is a set of 
fixes in libgomp/target.c
to not overwrite attached pointers when handling device<->host copies, mainly for the 
"always" case.
This behavior is also noted in the 5.0 spec, but was not properly implemented before.

(3) The third is a set of changes to the C/C++ front-ends to extend the allowed 
component access syntax
in map clauses. This is actually mainly an effort to allow SPEC HPC to compile, 
so although in the long
term the entire map clause syntax parsing is probably going to be revamped, 
we're still adding this in
for now. These changes are enabled for both OpenACC and OpenMP.

Tested on x86_64-linux with nvptx offloading with no regressions. This patch 
was merged and tested atop
of the prior submitted patches:
 (a) https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570886.html
 "[PATCH, OpenMP 5.0] Improve OpenMP target support for C++ (includes PR92120 
v3)"
 (b) https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570365.html
 "[PATCH, OpenMP 5.0] Implement relaxation of implicit map vs. existing device 
mappings (for mainline trunk)"
so you might want to queue this one after those for review.

Thanks,
Chung-Lin

2021-05-25  Chung-Lin Tang  

gcc/c/ChangeLog:

* c-parser.c (struct omp_dim): New struct type for use inside
c_parser_omp_variable_list.
(c_parser_omp_variable_list): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(c_parser_omp_clause_to): Set 'allow_deref' to true in call to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_from): Likewise.
* c-typeck.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(c_finish_omp_clauses): Extend allowed range of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.

gcc/cp/ChangeLog:

* parser.c (struct omp_dim): New struct type for use inside
cp_parser_omp_var_list_no_open.
(cp_parser_omp_var_list_no_open): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to
cp_parser_omp_var_list for to/from clauses.
* semantics.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(handle_omp_array_sections): Adjust pointer map generation of
references.
(finish_omp_clauses): Extend allowed range of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_trans_omp_array_section): Do not generate
GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type.

gcc/ChangeLog:

* gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter,
accommodate case where 'offset' return of get_inner_reference is
non-NULL.
(is_or_contains_p): Further robustify 

Re: [PATCH] c++/88601 - [C/C++] __builtin_shufflevector support

2021-05-25 Thread Jason Merrill via Gcc-patches

On 5/25/21 2:57 AM, Richard Biener wrote:

On Fri, 21 May 2021, Jason Merrill wrote:


On 5/21/21 8:33 AM, Richard Biener wrote:

This adds support for the clang __builtin_shufflevector extension to
the C and C++ frontends.  The builtin is lowered to VEC_PERM_EXPR.
Because VEC_PERM_EXPR does not support different sized vector inputs
or result or the special permute index of -1 (don't-care)
c_build_shufflevector applies lowering by widening inputs and output
to the widest vector, replacing -1 by a defined index and
subsetting the final vector if we produced a wider result than
desired.
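
For readers unfamiliar with the builtin, a minimal usage sketch follows.
The helper name is made up, and the manual fallback is only there so the
snippet still builds on compilers that lack __builtin_shufflevector; it is
not how the patch lowers anything.

```cpp
#include <cassert>

typedef int v4si __attribute__ ((vector_size (16)));

// Detect the builtin where the compiler can tell us about it.
#ifdef __has_builtin
# if __has_builtin (__builtin_shufflevector)
#  define HAVE_SHUFFLEVECTOR 1
# endif
#endif

// Interleave the low halves of A and B: { a0, b0, a1, b1 }.
// Indices 0..3 select elements of A, indices 4..7 select elements of B.
v4si interleave_low (v4si a, v4si b)
{
#ifdef HAVE_SHUFFLEVECTOR
  return __builtin_shufflevector (a, b, 0, 4, 1, 5);
#else
  v4si r = { a[0], b[0], a[1], b[1] };
  return r;
#endif
}

void demo_interleave_low ()
{
  v4si a = { 1, 2, 3, 4 };
  v4si b = { 10, 20, 30, 40 };
  v4si r = interleave_low (a, b);
  assert (r[0] == 1 && r[1] == 10 && r[2] == 2 && r[3] == 20);
}
```

Unlike __builtin_shuffle, the permutation is given as constant index
arguments rather than as a mask vector.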

Code generation can thus be sub-optimal; follow-up patches will
aim to fix that by recovering from part of the missing features
during RTL expansion and by relaxing the constraints of the GIMPLE
IL with regard to VEC_PERM_EXPR.

Bootstrapped on x86_64-unknown-linux-gnu, (re-)testing in progress.

Honza - you've filed PR88601, can you point me to testcases that
exercise common uses so we can look at code generation quality
and where time is spent best in improving things?

OK for trunk?

Thanks,
Richard.

2021-05-21  Richard Biener  

PR c++/88601
gcc/c-family/
  * c-common.c: Include tree-vector-builder.h and
  vec-perm-indices.h.
  (c_common_reswords): Add __builtin_shufflevector.
  (c_build_shufflevector): New function.
  * c-common.h (enum rid): Add RID_BUILTIN_SHUFFLEVECTOR.
  (c_build_shufflevector): Declare.

gcc/c/
  * c-decl.c (names_builtin_p): Handle RID_BUILTIN_SHUFFLEVECTOR.
  * c-parser.c (c_parser_postfix_expression): Likewise.

gcc/cp/
  * cp-objcp-common.c (names_builtin_p): Handle
  RID_BUILTIN_SHUFFLEVECTOR.
  * cp-tree.h (build_x_shufflevector): Declare.
  * parser.c (cp_parser_postfix_expression): Handle
  RID_BUILTIN_SHUFFLEVECTOR.
  * pt.c (tsubst_copy_and_build): Handle IFN_SHUFFLEVECTOR.
  * typeck.c (build_x_shufflevector): Build either a lowered
  VEC_PERM_EXPR or an unlowered shufflevector via a temporary
  internal function IFN_SHUFFLEVECTOR.

gcc/
  * internal-fn.c (expand_SHUFFLEVECTOR): Define.
  * internal-fn.def (SHUFFLEVECTOR): New.
  * internal-fn.h (expand_SHUFFLEVECTOR): Declare.

gcc/testsuite/
  * c-c++-common/builtin-shufflevector-2.c: New testcase.
  * c-c++-common/torture/builtin-shufflevector-1.c: Likewise.
  * g++.dg/builtin-shufflevector-1.C: Likewise.
  * g++.dg/builtin-shufflevector-2.C: Likewise.
---
   gcc/c-family/c-common.c   | 139 ++
   gcc/c-family/c-common.h   |   4 +-
   gcc/c/c-decl.c|   1 +
   gcc/c/c-parser.c  |  38 +
   gcc/cp/cp-objcp-common.c  |   1 +
   gcc/cp/cp-tree.h  |   3 +
   gcc/cp/parser.c   |  15 ++
   gcc/cp/pt.c   |   9 ++
   gcc/cp/typeck.c   |  36 +
   gcc/internal-fn.c |   6 +
   gcc/internal-fn.def   |   3 +
   gcc/internal-fn.h |   1 +
   .../c-c++-common/builtin-shufflevector-2.c|  18 +++
   .../torture/builtin-shufflevector-1.c |  49 ++
   .../g++.dg/builtin-shufflevector-1.C  |  18 +++
   .../g++.dg/builtin-shufflevector-2.C  |  12 ++
   16 files changed, 352 insertions(+), 1 deletion(-)
   create mode 100644 gcc/testsuite/c-c++-common/builtin-shufflevector-2.c
   create mode 100644
   gcc/testsuite/c-c++-common/torture/builtin-shufflevector-1.c
   create mode 100644 gcc/testsuite/g++.dg/builtin-shufflevector-1.C
   create mode 100644 gcc/testsuite/g++.dg/builtin-shufflevector-2.C

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index b7daa2e2654..c4eb2b1c920 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -51,6 +51,8 @@ along with GCC; see the file COPYING3.  If not see
   #include "c-spellcheck.h"
   #include "selftest.h"
   #include "debug.h"
+#include "tree-vector-builder.h"
+#include "vec-perm-indices.h"
   
   cpp_reader *parse_in;		/* Declared in c-pragma.h.  */
   
@@ -383,6 +385,7 @@ const struct c_common_resword c_common_reswords[] =

 { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
 { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
 { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
+  { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 },
 { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
 { "__builtin_offsetof", RID_OFFSETOF, 0 },
 { "__builtin_types_compatible_p", RID_TYPES_COMPATIBLE_P, D_CONLY },
@@ -1108,6 +,142 @@ c_build_vec_perm_expr (location_t loc, tree v0, tree
v1, tree mask,
 return ret;
   }
   
+/* Build a VEC_PERM_EXPR if V0, V1 are not error_mark_nodes

+   and have vector types, V0 has the same element type as V1, and the
+   number of elements the result is that of MASK.  */
+tree
+c_build_shufflevector (location_t loc, tree v0, tree v1, vec mask,



[committed] RISC-V: Pass -mno-relax to assembler

2021-05-25 Thread Kito Cheng
gcc/ChangeLog:
* config/riscv/riscv.h (ASM_SPEC): Pass -mno-relax.
---
 gcc/config/riscv/riscv.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index f3e85723c85..f47d5b40a66 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -98,6 +98,7 @@ extern const char *riscv_default_mtune (int argc, const char 
**argv);
 %{" FPIE_OR_FPIC_SPEC ":-fpic} \
 %{march=*} \
 %{mabi=*} \
+%{mno-relax} \
 %{mbig-endian} \
 %{mlittle-endian} \
 %(subtarget_asm_spec)" \
-- 
2.31.1



Re: [PATCH] middle-end/100727 - fix call expansion with WITH_SIZE_EXPR arg

2021-05-25 Thread Jeff Law via Gcc-patches




On 5/25/2021 2:27 AM, Richard Biener wrote:

call expansion used the result of get_base_address to switch between
ABIs - with get_base_address now never returning NULL we have to
re-instantiate the check in a more explicit way.  This also adjusts
mark_addressable to skip WITH_SIZE_EXPRs, consistent with how
build_fold_addr_expr handles it.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2021-05-25  Richard Biener  

PR middle-end/100727
* calls.c (initialize_argument_information): Explicitly test
for WITH_SIZE_EXPR.
* gimple-expr.c (mark_addressable): Skip outer WITH_SIZE_EXPR.

Thanks.  I've got the v8 and mn103 ports respinning in the tester now.
jeff



Re: [PATCH] Fix selftest for targets where short and int are the same size.

2021-05-25 Thread Jeff Law via Gcc-patches




On 5/25/2021 12:44 AM, Aldy Hernandez wrote:

avr-elf seems to use HImode for both integer_type_node and
signed_char_type_node, which is causing the check for different sized
VARYING ranges to fail.

I've fixed this by using a char which I think should always be smaller than an
int.  Is there a preferred way of fixing this?  Perhaps 
build_nonstandard_integer
or __attribute__((mode(XX)))?

Tested on an x86-64 x avr-elf.

gcc/ChangeLog:

* value-range.cc (range_tests_legacy): Use signed char instead
of signed short.
As you note, I wonder if we should just create our own types for this 
test.  In fact I wonder if that should be considered best practice for 
these tests.  Assumptions about the underlying sizes of the standard 
types have been slightly problematic for the range self-tests.


The alternate approach would be to check the underlying sizes/signedness 
and skip the tests when they don't give us what we need.  But that seems 
inferior to just creating a suitable type.
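
To make the "create our own types" idea concrete, here is a toy sketch.
The types and helpers are hypothetical stand-ins, not GCC's tree or range
classes; the point is only that asking for a precision explicitly does not
depend on how wide the target's short and int happen to be.

```cpp
#include <cassert>
#include <climits>

// Toy stand-ins; hypothetical, not GCC's tree or irange types.
struct toy_type { unsigned precision; bool is_signed; };
struct toy_range { long long lo, hi; };

bool operator== (toy_range a, toy_range b)
{
  return a.lo == b.lo && a.hi == b.hi;
}

// The VARYING range of a type: every value the type can hold.
toy_range varying (toy_type t)
{
  assert (t.precision >= 1 && t.precision < 63);
  if (t.is_signed)
    return { -(1LL << (t.precision - 1)), (1LL << (t.precision - 1)) - 1 };
  return { 0, (1LL << t.precision) - 1 };
}

// Fragile: whatever the target gives us for short and int.  On a target
// where both have the same size these two describe the same type.
toy_type target_short () { return { unsigned (sizeof (short) * CHAR_BIT), true }; }
toy_type target_int () { return { unsigned (sizeof (int) * CHAR_BIT), true }; }

// Robust: ask for the precision the test actually needs.
toy_type make_int_type (unsigned precision, bool is_signed)
{
  return { precision, is_signed };
}

void demo_varying ()
{
  toy_type narrow = make_int_type (8, true);
  toy_type wide = make_int_type (16, true);
  // Holds on every target, because the widths were requested explicitly.
  assert (!(varying (narrow) == varying (wide)));
}
```

A test written against target_short () and target_int () would only pass
where the two widths differ, which is the failure mode described above.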


Jeff

ps.  xstormy16-elf seems to be failing in the same way.  I'll assume 
it's the same problem ;-)


[PATCH] C-SKY: Fix copyright of csky-modes.def.

2021-05-25 Thread Xianmiao Qu
From: Cooper Qu 

Tested and pushed.

The incorrect copyright comment format causes a build error:
builddir/source//gcc/gcc/config/csky/csky-modes.def: In function ‘void 
create_modes()’:
builddir/source//gcc/gcc/config/csky/csky-modes.def:1:4: error: ‘C’ was not 
declared in this scope
 ;; C-SKY extra machine modes.
^
builddir/source//gcc/gcc/config/csky/csky-modes.def:1:6: error: ‘SKY’ was not 
declared in this scope
 ;; C-SKY extra machine modes.
  ^
builddir/source//gcc/gcc/config/csky/csky-modes.def:2:16: error: ‘Copyright’ 
was not declared in this scope
 ;; Copyright (C) 2018-2021 Free Software Foundation, Inc.
^
builddir/source//gcc/gcc/config/csky/csky-modes.def:3:4: error: ‘Contributed’ 
was not declared in this scope
 ;; Contributed by C-SKY Microsystems and Mentor Graphics.
^
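
The errors above come from the file being compiled as C++ inside
create_modes(), so its header has to be a C comment rather than ";;" lines.
A toy sketch of the same situation, with a made-up two-argument macro and
made-up names (the real FLOAT_MODE takes more arguments):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy registry; hypothetical, not GCC's genmodes data structures.
struct toy_mode { std::string name; int bytes; };
static std::vector<toy_mode> toy_modes;

#define TOY_FLOAT_MODE(NAME, BYTES) toy_modes.push_back ({ #NAME, BYTES })

void create_toy_modes ()
{
  toy_modes.clear ();
  // --- begin: the text an #include of a toy .def file would paste here ---
  /* Toy extra machine modes.
     A C comment like this one is skipped by the compiler; a line
     starting with ";;" would be parsed as code and rejected.  */
  TOY_FLOAT_MODE (HF, 2);
  TOY_FLOAT_MODE (SF, 4);
  // --- end ---
}

void demo_create_toy_modes ()
{
  create_toy_modes ();
  assert (toy_modes.size () == 2);
  assert (toy_modes[0].name == "HF" && toy_modes[0].bytes == 2);
}
```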

gcc/ChangeLog:
* config/csky/csky-modes.def: Fix copyright.
---
 gcc/config/csky/csky-modes.def | 38 +-
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/gcc/config/csky/csky-modes.def b/gcc/config/csky/csky-modes.def
index 9062efcf929..109ee514040 100644
--- a/gcc/config/csky/csky-modes.def
+++ b/gcc/config/csky/csky-modes.def
@@ -1,22 +1,22 @@
-;; C-SKY extra machine modes.
-;; Copyright (C) 2018-2021 Free Software Foundation, Inc.
-;; Contributed by C-SKY Microsystems and Mentor Graphics.
-;;
-;; This file is part of GCC.
-;;
-;; GCC is free software; you can redistribute it and/or modify it
-;; under the terms of the GNU General Public License as published by
-;; the Free Software Foundation; either version 3, or (at your option)
-;; any later version.
-;;
-;; GCC is distributed in the hope that it will be useful, but
-;; WITHOUT ANY WARRANTY; without even the implied warranty of
-;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-;; General Public License for more details.
-;;
-;; You should have received a copy of the GNU General Public License
-;; along with GCC; see the file COPYING3.  If not see
-;; .  */
+/* C-SKY extra machine modes.
+   Copyright (C) 2018-2021 Free Software Foundation, Inc.
+   Contributed by C-SKY Microsystems and Mentor Graphics.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
 
 /* Float modes.  */
 FLOAT_MODE (HF, 2, ieee_half_format);/* Half-precision floating point 
*/
-- 
2.26.2



Re: [PATCH 4/5] Convert remaining passes to RANGE_QUERY.

2021-05-25 Thread Aldy Hernandez via Gcc-patches
Same as before, but with the minor change that the following IPA pass 
uses the correct struct function, as discussed.


Tests on x86-64 Linux on-going.

Aldy

diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index a0238710e72..d8292777647 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -43,6 +43,7 @@
 #include "attribs.h"
 #include "tree-ssa.h"
 #include "tree-cfg.h"
+#include "gimple-range.h"

 /* The idea behind this analyzer is to generate set constraints from the
program, then solve the resulting constraints in order to generate the
@@ -6740,7 +6741,9 @@ find_what_p_points_to (tree fndecl, tree p)
   struct ptr_info_def *pi;
   tree lookup_p = p;
   varinfo_t vi;
-  bool nonnull = get_ptr_nonnull (p);
+  value_range vr;
+  RANGE_QUERY (DECL_STRUCT_FUNCTION (fndecl))->range_of_expr (vr, p);
+  bool nonnull = vr.nonzero_p ();

[snip]
From 2e185f6caab305ac3779097005abf2153c91d1d5 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Wed, 19 May 2021 18:44:08 +0200
Subject: [PATCH 4/5] Convert remaining passes to RANGE_QUERY.

This patch converts the remaining users of get_range_info and
get_ptr_nonnull to the range_query API.

No effort was made to move passes away from VR_ANTI_RANGE, or any other
use of deprecated methods.  This was a straight up conversion to the new
API, nothing else.

gcc/ChangeLog:

	* builtins.c (check_nul_terminated_array): Convert to RANGE_QUERY.
	(expand_builtin_strnlen): Same.
	(determine_block_size): Same.
	* fold-const.c (expr_not_equal_to): Same.
	* gimple-fold.c (size_must_be_zero_p): Same.
	* gimple-match-head.c: Include gimple-range.h.
	* gimple-pretty-print.c (dump_ssaname_info): Convert to RANGE_QUERY.
	* gimple-ssa-warn-restrict.c
	(builtin_memref::extend_offset_range): Same.
	* graphite-sese-to-poly.c (add_param_constraints): Same.
	* internal-fn.c (get_min_precision): Same.
	* ipa-fnsummary.c (set_switch_stmt_execution_predicate): Same.
	* ipa-prop.c (ipa_compute_jump_functions_for_edge): Same.
	* match.pd: Same.
	* tree-data-ref.c (split_constant_offset): Same.
	(dr_step_indicator): Same.
	* tree-dfa.c (get_ref_base_and_extent): Same.
	* tree-scalar-evolution.c (iv_can_overflow_p): Same.
	* tree-ssa-loop-niter.c (refine_value_range_using_guard): Same.
	(determine_value_range): Same.
	(record_nonwrapping_iv): Same.
	(infer_loop_bounds_from_signedness): Same.
	(scev_var_range_cant_overflow): Same.
	* tree-ssa-phiopt.c (two_value_replacement): Same.
	* tree-ssa-pre.c (insert_into_preds_of_block): Same.
	* tree-ssa-reassoc.c (optimize_range_tests_to_bit_test): Same.
	* tree-ssa-strlen.c (handle_builtin_stxncpy_strncat): Same.
	(get_range): Same.
	(dump_strlen_info): Same.
	(set_strlen_range): Same.
	(maybe_diag_stxncpy_trunc): Same.
	(get_len_or_size): Same.
	(handle_integral_assign): Same.
	* tree-ssa-structalias.c (find_what_p_points_to): Same.
	* tree-ssa-uninit.c (find_var_cmp_const): Same.
	* tree-switch-conversion.c (bit_test_cluster::emit): Same.
	* tree-vect-patterns.c (vect_get_range_info): Same.
	(vect_recog_divmod_pattern): Same.
	* tree-vrp.c (intersect_range_with_nonzero_bits): Same.
	(register_edge_assert_for_2): Same.
	(determine_value_range_1): Same.
	* tree.c (get_range_pos_neg): Same.
	* vr-values.c (vr_values::get_lattice_entry): Same.
	(vr_values::update_value_range): Same.
	(simplify_conversion_using_ranges): Same.
---
 gcc/builtins.c | 40 ++--
 gcc/fold-const.c   |  8 +++-
 gcc/gimple-fold.c  |  7 ++-
 gcc/gimple-match-head.c|  1 +
 gcc/gimple-pretty-print.c  | 12 -
 gcc/gimple-ssa-warn-restrict.c |  8 +++-
 gcc/graphite-sese-to-poly.c|  9 +++-
 gcc/internal-fn.c  | 14 +++---
 gcc/ipa-fnsummary.c| 11 -
 gcc/ipa-prop.c | 16 +++
 gcc/match.pd   | 19 ++--
 gcc/tree-data-ref.c| 24 --
 gcc/tree-dfa.c | 14 +-
 gcc/tree-scalar-evolution.c| 13 +-
 gcc/tree-ssa-loop-niter.c  | 81 +---
 gcc/tree-ssa-phiopt.c  | 11 -
 gcc/tree-ssa-pre.c | 19 
 gcc/tree-ssa-reassoc.c |  9 ++--
 gcc/tree-ssa-strlen.c  | 85 --
 gcc/tree-ssa-structalias.c |  8 ++--
 gcc/tree-ssa-uninit.c  |  8 +++-
 gcc/tree-switch-conversion.c   | 10 ++--
 gcc/tree-vect-patterns.c   | 18 +--
 gcc/tree-vrp.c | 21 -
 gcc/tree.c | 13 +++---
 gcc/vr-values.c| 12 +++--
 26 files changed, 332 insertions(+), 159 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index e1b284846b1..deb7c083315 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-outof-ssa.h"
 #include "attr-fnspec.h"
 #include "demangle.h"
+#include "gimple-range.h"
 
 struct target_builtins default_target_builtins;
 #if 

Re: [wwwdocs, patch] htdocs/gitwrite.html: Clarify ChangeLog generation

2021-05-25 Thread Tobias Burnus

On 24.05.21 09:45, Gerald Pfeifer wrote:

On Sun, 23 May 2021, Tobias Burnus wrote:

As there was some confusion regarding when the ChangeLog is generated,
I propose the attached wwwdocs patch. Comments?

-Apply the patch to your local tree.  ChangeLog entries will be
-automatically added to the corresponding ChangeLog files based
-on the git commit message.  See the documentation of
+Apply the patch to your local tree.  On the release branches, ChangeLog
+entries will be automatically added to the corresponding ChangeLog files based
+on the git commit message by the daily-bump commit.

Just "On release branches".

And "by the daily-dump commit based on git commit messages" (plural
for commit messages and different order).


Can we assume everyone knows about the "daily-dump commit" here, or
should we just write "once a day" or something like that?

I think "once a day" is clearer. [The commit has the message "Daily
bump." and bumps (not "dumps") the date in DATESTAMP. I wanted to relate
those – but it seems as if this just adds more confusion.]

And a question for my understanding: Why only on release branches?
 From what I can tell the same applies for trunk?


Yes, to mainline ("master" branch, also available as "trunk") and to
(currently) 9, 10, and 11 – but not to closed (release) branches like
GCC 8 (releases/gcc-8) or to devel/*, vendor/*, user/* branches.

In my mind, "release branch" included the to-be-released mainline (in
line with the version list on gcc.gnu.org), but one can argue about
"branch" and "release". I now use "master" (as that's the branch name
in our git repo, and not "trunk" (alias) or "mainline" (generic GCC term)).

Hence, I attached a new version ...

Tobias

htdocs/gitwrite.html: Clarify ChangeLog generation

diff --git a/htdocs/gitwrite.html b/htdocs/gitwrite.html
index 8363e70c..596f3bef 100644
--- a/htdocs/gitwrite.html
+++ b/htdocs/gitwrite.html
@@ -233,9 +233,9 @@ pull" before attempting a checkin; this will save you a little
 time if someone else has modified the source tree since the last time
 you synced your sources.
 
-Apply the patch to your local tree.  ChangeLog entries will be
-automatically added to the corresponding ChangeLog files based
-on the git commit message.  See the documentation of
+Apply the patch to your local tree.  On master and release branches,
+ChangeLog entries will be automatically added to the corresponding ChangeLog
+files based on the git commit message once a day.  See the documentation of
 ChangeLog format.
 
 Make sure to rebuild any generated files affected by


Re: [PATCH 1/5] Common API for accessing global and on-demand ranges.

2021-05-25 Thread Richard Biener via Gcc-patches
On Tue, May 25, 2021 at 12:53 PM Aldy Hernandez  wrote:
>
> On 5/25/21 11:46 AM, Richard Biener wrote:
> > On Tue, May 25, 2021 at 11:36 AM Aldy Hernandez  wrote:
> >>
> >>
> >>
> >> On 5/25/21 10:57 AM, Richard Biener wrote:
> >>> On Mon, May 24, 2021 at 6:44 PM Aldy Hernandez via Gcc-patches
> >>>  wrote:
> 
> 
> 
>  On 5/21/21 1:39 PM, Aldy Hernandez wrote:
> > This patch provides a generic API for accessing global ranges.  It is
> > meant to replace get_range_info() and get_ptr_nonnull() with one
> > common interface.  It uses the same API as the ranger (class
> > range_query), so there will now be one API for accessing local and
> > global ranges alike.
> >
> > Follow-up patches will convert all users of get_range_info and
> > get_ptr_nonnull to this API.
> >
> > For get_range_info, instead of:
> >
> >  if (!POINTER_TYPE_P (TREE_TYPE (name)) && SSA_NAME_RANGE_INFO 
> > (name))
> >get_range_info (name, vr);
> >
> > You can now do:
> >
> >  RANGE_QUERY (cfun)->range_of_expr (vr, name, [stmt]);
> 
>  BTW, we're not wed to the idea of putting the current range object in
>  cfun.  The important thing is that the API is consistent across, not
>  where it lives.
> >>>
> >>> If the range object is specific for a function (and thus cannot handle
> >>> multiple functions in IPA mode) then struct function looks like the 
> >>> correct
> >>> place.  Accessing that unconditionally via 'cfun' sounds bad though 
> >>> because
> >>> that disallows use from IPA passes.
> >>
> >> The default range object can either be the "global_ranges" object
> >> (get_range_info / get_ptr_nonnull wrapper) or a ranger.  So, the former
> >> is global in nature and not tied to any function, and the latter is tied
> >> to the gimple IL in a function.
> >>
> >> What we want is a mechanism from which a pass can query the range of an
> >> SSA (or expression) at a statement or edge, etc agnostically.  If a
> >> ranger is activated, use that, otherwise use the global information.
> >>
> >> For convenience we wanted a mechanism in which we didn't have to pass an
> >> object between functions in a pass (be it a ranger or a struct
> >> function).  Back when I tried to convert some passes to a ranger, it was
> >> a pain to pass a ranger object around, and having to pass struct
> >> function would be similarly painful.
> >>
> >> ISTM, that most converted passes in this patchset already use cfun
> >> throughout.  For that matter, even the two IPA ones (ipa-fnsummary and
> >> ipa-prop) use cfun throughout (by first calling push_cfun (node->decl)).
> >>
> >> How about I use fun if easily accessible in a pass, otherwise cfun?  I'm
> >> trying to avoid having to pass around a struct function in passes that
> >> require surgery to do so (especially when they're already using cfun).
> >>
> >> Basically, we want minimal changes to clients for ease of use.
> >
> > I think it's fine to not fix "endusers", esp. if they already use 'cfun'
> > and fixing would be a lot mechanical work.  What we need to avoid
> > is implicit uses of cfun via APIs we introduce because that makes
> > a pass/API that is "cfun" clean, eventually even working on explicit
> > struct function (and thus IPA safe) no longer so and depend on "cfun"
> > without that being visible.
>
> Sounds reasonable.
>
> I have removed the use of cfun in get_global_range_query(), so no users
> of GLOBAL_RANGE_QUERY will implicitly use it.
>
> I have verified that all uses of cfun are in passes that already have
> cfun uses, or in the case of -Wrestrict in a pass that requires
> shuffling things around to avoid cfun.  Besides, -Wrestrict is not an
> IPA pass.
>
> Note that there are 3 uses of the following idiom:
>
> +  if (cfun)
> +   RANGE_QUERY (cfun)->range_of_expr (vr, t);
> +  else
> +   GLOBAL_RANGE_QUERY->range_of_expr (vr, t);
>
> This is for three functions in fold-const.c, gimple-fold.c, and
> tree-dfa.c that may or may not be called with a cfun.  We'd like these
> functions to pick up the current available range object.  But if doing
> so is problematic, I can change it to just use GLOBAL_RANGE_QUERY.

I think that's OK for now.  It means that for hypothetical IPA passes
using range queries that they'd get GLOBAL_RANGE_QUERY instead
of the per-function one when running into those functions.

But as you may have figured, there are quite a few cfun/current_function_decl
uses in "infrastructure" that have the same issue - we're just trying to reduce
that.

Richard.

> Aldy


Re: [PATCH 1/5] Common API for accessing global and on-demand ranges.

2021-05-25 Thread Aldy Hernandez via Gcc-patches

On 5/25/21 11:46 AM, Richard Biener wrote:

On Tue, May 25, 2021 at 11:36 AM Aldy Hernandez  wrote:




On 5/25/21 10:57 AM, Richard Biener wrote:

On Mon, May 24, 2021 at 6:44 PM Aldy Hernandez via Gcc-patches
 wrote:




On 5/21/21 1:39 PM, Aldy Hernandez wrote:

This patch provides a generic API for accessing global ranges.  It is
meant to replace get_range_info() and get_ptr_nonnull() with one
common interface.  It uses the same API as the ranger (class
range_query), so there will now be one API for accessing local and
global ranges alike.

Follow-up patches will convert all users of get_range_info and
get_ptr_nonnull to this API.

For get_range_info, instead of:

 if (!POINTER_TYPE_P (TREE_TYPE (name)) && SSA_NAME_RANGE_INFO (name))
   get_range_info (name, vr);

You can now do:

 RANGE_QUERY (cfun)->range_of_expr (vr, name, [stmt]);


BTW, we're not wed to the idea of putting the current range object in
cfun.  The important thing is that the API is consistent across, not
where it lives.


If the range object is specific for a function (and thus cannot handle
multiple functions in IPA mode) then struct function looks like the correct
place.  Accessing that unconditionally via 'cfun' sounds bad though because
that disallows use from IPA passes.


The default range object can either be the "global_ranges" object
(get_range_info / get_ptr_nonnull wrapper) or a ranger.  So, the former
is global in nature and not tied to any function, and the latter is tied
to the gimple IL in a function.

What we want is a mechanism from which a pass can query the range of an
SSA (or expression) at a statement or edge, etc agnostically.  If a
ranger is activated, use that, otherwise use the global information.

For convenience we wanted a mechanism in which we didn't have to pass an
object between functions in a pass (be it a ranger or a struct
function).  Back when I tried to convert some passes to a ranger, it was
a pain to pass a ranger object around, and having to pass struct
function would be similarly painful.

ISTM, that most converted passes in this patchset already use cfun
throughout.  For that matter, even the two IPA ones (ipa-fnsummary and
ipa-prop) use cfun throughout (by first calling push_cfun (node->decl)).

How about I use fun if easily accessible in a pass, otherwise cfun?  I'm
trying to avoid having to pass around a struct function in passes that
require surgery to do so (especially when they're already using cfun).

Basically, we want minimal changes to clients for ease of use.


I think it's fine to not fix "endusers", esp. if they already use 'cfun'
and fixing would be a lot mechanical work.  What we need to avoid
is implicit uses of cfun via APIs we introduce because that makes
a pass/API that is "cfun" clean, eventually even working on explicit
struct function (and thus IPA safe) no longer so and depend on "cfun"
without that being visible.


Sounds reasonable.

I have removed the use of cfun in get_global_range_query(), so no users 
of GLOBAL_RANGE_QUERY will implicitly use it.


I have verified that all uses of cfun are in passes that already have 
cfun uses, or in the case of -Wrestrict in a pass that requires 
shuffling things around to avoid cfun.  Besides, -Wrestrict is not an 
IPA pass.


Note that there are 3 uses of the following idiom:

+  if (cfun)
+   RANGE_QUERY (cfun)->range_of_expr (vr, t);
+  else
+   GLOBAL_RANGE_QUERY->range_of_expr (vr, t);

This is for three functions in fold-const.c, gimple-fold.c, and 
tree-dfa.c that may or may not be called with a cfun.  We'd like these 
functions to pick up the current available range object.  But if doing 
so is problematic, I can change it to just use GLOBAL_RANGE_QUERY.
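
For readers outside GCC, the fallback idiom can be sketched in standalone C++. This is only a model: range_query_model, function_model, global_ranges_model and pick_range_query are hypothetical names made up for illustration, not the declarations from the patch. The function is passed in explicitly, so the helper has no hidden dependency on a global such as cfun.

```cpp
#include <string>

// Hypothetical stand-ins; none of these are the real GCC declarations.
struct range_query_model
{
  std::string name;
  std::string range_of_expr () const { return name; }
};

struct function_model
{
  range_query_model *active;   // per-function query, e.g. a ranger
};

static range_query_model global_ranges_model{"global"};

// Use the per-function query when a function (with an active query) is
// available, otherwise fall back to the global one.
static const range_query_model &
pick_range_query (const function_model *fun)
{
  if (fun && fun->active)
    return *fun->active;
  return global_ranges_model;
}
```

A caller that has no function at hand simply passes a null pointer and gets the global answer.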


Aldy
From eb294702e7e6a900d521a548cae5a175ea319231 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Wed, 19 May 2021 18:27:05 +0200
Subject: [PATCH 1/6] Common API for accessing global and on-demand ranges.

This patch provides a generic API for accessing global ranges.  It is
meant to replace get_range_info() and get_ptr_nonnull() with one
common interface.  It uses the same API as the ranger (class
range_query), so there will now be one API for accessing local and
global ranges alike.

Follow-up patches will convert all users of get_range_info and
get_ptr_nonnull to this API.

For get_range_info, instead of:

  if (!POINTER_TYPE_P (TREE_TYPE (name)) && SSA_NAME_RANGE_INFO (name))
get_range_info (name, vr);

You can now do:

  RANGE_QUERY (cfun)->range_of_expr (vr, name, [stmt]);

...as well as any other of the range_query methods (range_on_edge,
range_of_stmt, value_of_expr, value_on_edge, value_on_stmt, etc).

As per the API, range_of_expr will work on constants, SSA names, and
anything we support in irange::supports_type_p().

For pointers, the interface is the same, so instead of:

  else if (POINTER_TYPE_P (TREE_TYPE (name)) && SSA_NAME_PTR_INFO (name))
{
  if (get_ptr_nonnull (name))
stuff();
}

One 

[PATCH] C-SKY: Add insn "ldbs".

2021-05-25 Thread Geng Qi via Gcc-patches
gcc/
* config/csky/csky.md (cskyv2_sextend_ldbs): New insn.

gcc/testsuite/
* gcc/testsuite/gcc.target/csky/ldbs.c: New.
---
 gcc/config/csky/csky.md  | 10 ++
 gcc/testsuite/gcc.target/csky/ldbs.c | 11 +++
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/csky/ldbs.c

diff --git a/gcc/config/csky/csky.md b/gcc/config/csky/csky.md
index c27d627..b980d4c 100644
--- a/gcc/config/csky/csky.md
+++ b/gcc/config/csky/csky.md
@@ -1533,6 +1533,7 @@
   }"
 )
 
+;; hi -> si
 (define_insn "extendhisi2"
   [(set (match_operand:SI 0 "register_operand" "=r")
(sign_extend:SI (match_operand:HI 1 "register_operand" "r")))]
@@ -1557,6 +1558,15 @@
   "sextb  %0, %1"
 )
 
+(define_insn "*cskyv2_sextend_ldbs"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(sign_extend:SI (match_operand:QI 1 "csky_simple_mem_operand" "m")))]
+  "CSKY_ISA_FEATURE (E2)"
+  "ld.bs\t%0, %1"
+  [(set_attr "length" "4")
+   (set_attr "type" "load")]
+)
+
 ;; qi -> hi
 (define_insn "extendqihi2"
   [(set (match_operand:HI 0 "register_operand" "=r")
diff --git a/gcc/testsuite/gcc.target/csky/ldbs.c 
b/gcc/testsuite/gcc.target/csky/ldbs.c
new file mode 100644
index 000..27a0254
--- /dev/null
+++ b/gcc/testsuite/gcc.target/csky/ldbs.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-mcpu=ck801" "-march=ck801" } { "*" } } */
+/* { dg-csky-options "-O1" } */
+
+int foo (signed char *pb)
+{
+  return *pb;
+}
+
+/* { dg-final { scan-assembler "ld.bs" } } */
+
-- 
2.7.4



Re: [PATCH 1/2] c-family: Copy DECL_USER_ALIGN even if DECL_ALIGN is similar.

2021-05-25 Thread Robin Dapp via Gcc-patches

Hi Martin and Jason,


The removal of the dead code looks good to me.  The change to
"re-init lastalign" doesn't seem right.  When it's zero it means
the conflict is between two attributes on the same declaration,
in which case the note shouldn't be printed (it would just point
to the same location as the warning).


Agreed.


Did I get it correctly that you refer to printing a note in e.g. the 
following case?


 inline int __attribute__ ((aligned (16), aligned (4)))
 finline_align (int);

I indeed missed this but it could be fixed by checking (on top of the patch)

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 98c98944405..7349da73f14 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -2324,7 +2324,7 @@ common_handle_aligned_attribute (tree *node, tree 
name, tree args, int flags,

   /* Either a prior attribute on the same declaration or one
 on a prior declaration of the same function specifies
 stricter alignment than this attribute.  */
-  bool note = lastalign != 0;
+  bool note = last_decl != decl && lastalign != 0;

As there wasn't any FAIL, I would add another test which checks for this.

I find the whole logic here a bit convoluted but when there is no real 
last_decl, then last_decl = decl.  A note would not be printed before 
the patch because we erroneously warned about the "conflict" of the 
function's default alignment (8) vs the requested alignment (4).
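
As a standalone illustration of that condition, here is a minimal sketch. It is a hypothetical model, not the c-attribs.c code: should_print_note is a made-up helper, and the declarations are represented by plain pointers.

```cpp
// Hypothetical model of the proposed condition; the parameter names mirror
// the variables in the patch, but this is not the c-attribs.c code.
bool should_print_note (const void *last_decl, const void *decl,
                        unsigned lastalign)
{
  // A note pointing at the previous declaration only helps when there
  // really is a distinct previous declaration with a known alignment.
  return last_decl != decl && lastalign != 0;
}
```

With two attributes on the same declaration, last_decl equals decl and no note is printed.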


Regards
 Robin


Re: [PATCH] Extend is_cond_scalar_reduction to handle nop_expr after/before scalar reduction.[PR98365]

2021-05-25 Thread Richard Biener via Gcc-patches
On Mon, May 24, 2021 at 11:52 AM Hongtao Liu  wrote:
>
> Hi:
>   Details described in PR.
>   Bootstrapped and regtest on
> x86_64-linux-gnu{-m32,}/x86_64-linux-gnu{-m32\
> -march=cascadelake,-march=cascadelake}
>   Ok for trunk?

+static tree
+strip_nop_cond_scalar_reduction (bool has_nop, tree op)
+{
+  if (!has_nop)
+return op;
+
+  if (TREE_CODE (op) != SSA_NAME)
+return NULL_TREE;
+
+  gimple* stmt = SSA_NAME_DEF_STMT (op);
+  if (!stmt
+  || gimple_code (stmt) != GIMPLE_ASSIGN
+  || gimple_has_volatile_ops (stmt)
+  || gimple_assign_rhs_code (stmt) != NOP_EXPR)
+return NULL_TREE;
+
+  return gimple_assign_rhs1 (stmt);

this allows arbitrary conversions where the comment suggests you
only want to allow conversions to the same precision but different sign.
Sth like

  gassign *stmt = safe_dyn_cast <gassign *> (SSA_NAME_DEF_STMT (op));
  if (!stmt
      || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))
      || !tree_nop_conversion_p (TREE_TYPE (op),
                                 TREE_TYPE (gimple_assign_rhs1 (stmt))))
    return NULL_TREE;
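
To make the "same precision, different sign" distinction concrete, here is a rough standalone model in plain C++. It is an assumption-laden sketch: is_nop_int_conversion is a made-up name, it compares storage widths rather than GCC precisions (so it ignores bool and padding bits), and it is not how tree_nop_conversion_p is implemented.

```cpp
#include <climits>
#include <type_traits>

// Rough model of a "nop" integer conversion: both types are integral and
// have the same width, so at most the signedness differs.
template <typename From, typename To>
constexpr bool is_nop_int_conversion ()
{
  return std::is_integral<From>::value && std::is_integral<To>::value
         && sizeof (From) * CHAR_BIT == sizeof (To) * CHAR_BIT;
}

// A sign change keeps the width, so it qualifies.
static_assert (is_nop_int_conversion<int, unsigned int> (),
               "sign change only");
// A non-integral type never qualifies.
static_assert (!is_nop_int_conversion<double, int> (),
               "not an integer conversion");
```

On typical targets a widening such as signed char to long long is rejected by the same check, which is the kind of conversion the original NOP_EXPR test would have let through.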

+  if (gimple_bb (stmt) != gimple_bb (*nop_reduc)
+ || gimple_code (stmt) != GIMPLE_ASSIGN
+ || gimple_has_volatile_ops (stmt))
+   return false;

!is_gimple_assign (stmt) instead of gimple_code (stmt) != GIMPLE_ASSIGN

the gimple_has_volatile_ops check is superfluous given you restrict
the assign code.

+  /* Check that R_NOP1 is used in nop_stmt or in PHI only.  */
+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, r_nop1)
+   {
+ gimple *use_stmt = USE_STMT (use_p);
+ if (is_gimple_debug (use_stmt))
+   continue;
+ if (use_stmt == SSA_NAME_DEF_STMT (r_op1))
+   continue;
+ if (gimple_code (use_stmt) != GIMPLE_PHI)
+   return false;

can the last check be use_stmt == phi since we should have the
PHI readily available?

@@ -1735,6 +1822,23 @@ convert_scalar_cond_reduction (gimple *reduc,
gimple_stmt_iterator *gsi,
   rhs = fold_build2 (gimple_assign_rhs_code (reduc),
 TREE_TYPE (rhs1), op0, tmp);

+  if (has_nop)
+{
+  /* Create assignment nop_rhs = op0 +/- _ifc_.  */
+  tree nop_rhs = make_temp_ssa_name (TREE_TYPE (rhs1), NULL, "_nop_");
+  gimple* new_assign2 = gimple_build_assign (nop_rhs, rhs);
+  gsi_insert_before (gsi, new_assign2, GSI_SAME_STMT);
+  /* Rebuild rhs for nop_expr.  */
+  rhs = fold_build1 (NOP_EXPR,
+TREE_TYPE (gimple_assign_lhs (nop_reduc)),
+nop_rhs);
+
+  /* Delete nop_reduc.  */
+  stmt_it = gsi_for_stmt (nop_reduc);
+  gsi_remove (&stmt_it, true);
+  release_defs (nop_reduc);
+}
+

hmm, the whole function could be cleaned up with sth like

 /* Build rhs for unconditional increment/decrement.  */
 gimple_seq stmts = NULL;
 rhs = gimple_build (&stmts, gimple_assign_rhs_code (reduc),
                     TREE_TYPE (rhs1), op0, tmp);
 if (has_nop)
   rhs = gimple_convert (&stmts, TREE_TYPE (gimple_assign_lhs (nop_reduc)),
                         rhs);
 gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);

plus in the caller moving the

  new_stmt = gimple_build_assign (res, rhs);
  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);

to the else branch as well as the folding done on new_stmt (maybe return
new_stmt instead of rhs from convert_scalar_cond_reduction).

Richard.

>   gcc/ChangeLog:
>
> PR tree-optimization/pr98365
> * tree-if-conv.c (strip_nop_cond_scalar_reduction): New function.
> (is_cond_scalar_reduction): Handle nop_expr in cond scalar reduction.
> (convert_scalar_cond_reduction): Ditto.
> (predicate_scalar_phi): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/pr98365
> * gcc.target/i386/pr98365.c: New test.
>
> --
> BR,
> Hongtao


[c++tools] Fix typo and weird syntax in configure script

2021-05-25 Thread Eric Botcazou
Tested on x86-64/Linux, applied on mainline and 11 branch as obvious.


2021-05-25  Eric Botcazou  

c++tools/
* configure.ac (--enable-maintainer-mode): Fix typo and weird syntax.
* configure: Regenerate.

-- 
Eric Botcazou
diff --git a/c++tools/configure.ac b/c++tools/configure.ac
index 5771f2ace68..c8f85209188 100644
--- a/c++tools/configure.ac
+++ b/c++tools/configure.ac
@@ -58,9 +58,9 @@ AS_HELP_STRING([--enable-maintainer-mode],
 [enable maintainer mode.  Add rules to rebuild configurey bits]),,
 [enable_maintainer_mode=no])
 case "$enable_maintainer_mode" in
-  ("yes") maintainer_mode=yes ;;
-  ("no") maintainer=no ;;
-  (*) AC_MSG_ERROR([unknown maintainer mode $enable_maintainer_mode]) ;;
+  yes) maintainer_mode=yes ;;
+  no) maintainer_mode=no ;;
+  *) AC_MSG_ERROR([unknown maintainer mode $enable_maintainer_mode]) ;;
 esac
 AC_MSG_CHECKING([maintainer-mode])
 AC_MSG_RESULT([$maintainer_mode])


Re: RFA: save/restore target options in handle_optimize_attribute

2021-05-25 Thread Richard Biener via Gcc-patches
On Mon, May 24, 2021 at 10:56 AM Martin Liška  wrote:
>
> On 5/20/21 9:55 AM, Richard Biener wrote:
> > On Thu, May 20, 2021 at 12:29 AM Joern Wolfgang Rennecke
> >  wrote:
> >>
> >> We set default for some target options in TARGET_OPTION_OPTIMIZATION_TABLE,
> >> but these can be overridden by specifying the corresponding explicit
> >> -mXXX / -mno-XXX options.
> >> When a function bears the attribute
> >> __attribute__ ((optimize("O2")))
> >> the target options are set to the default for that optimization level,
> >> which can be different from what was selected for the file as a whole.
> >> As handle_optimize_attribute is right now, it will thus clobber the
> >> target options, and with enable_checking it will then abort.
> >>
> >> The attached patch makes it save and restore the target options.
> >>
> >> Bootstrapped and regression tested on x86_64-pc-linux-gnu.
>
> Interesting, I prepared very similar patch for this stage1. My patch covers 
> few more
> cases where target options interfere with optimize options (and vice versa).
>
> >
> > That looks reasonable but of course it doesn't solve the issue that those
> > altered target options will not be in effect on the optimize("O2") function.
> >
> > IIRC Martin has changes in the works to unify target & optimize here
> > which should obsolete this fix.  Martin - what's the state of this?  Do you
> > think this patch makes sense in the mean time (and maybe also on
> > the branch though the assert is not in effect there but the behavior
> > is still observed and unexpected).
>
> Well, I really tried doing the merge but I failed. It's pretty huge task and 
> I was
> unable to get something reasonable for x86_64 target :/

I wonder what the big issue is - the point is that target and optimize options
should not vary independently and thus the caching should not happen
independently.  In the end this means unifying
DECL_FUNCTION_SPECIFIC_TARGET/OPTIMIZATION or as a first step,
updating them always in lock-step.  Like making the existing macros
return an rvalue and providing set_* wrappers that perform a forceful
update of both with unified caching.
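
A minimal sketch of that lock-step idea, in standalone C++. Everything here is hypothetical: function_options_model and set_options are made-up names, the option sets are modelled as strings, and no caching is shown. It is not the GCC macros or a proposed patch.

```cpp
#include <string>
#include <utility>

// Hypothetical model: two option sets that must never change independently.
class function_options_model
{
  std::string target_;
  std::string optimize_;

public:
  // Accessors return by value (an rvalue), so callers cannot assign
  // through them and update only one side.
  std::string target () const { return target_; }
  std::string optimize () const { return optimize_; }

  // The only way to modify either set updates both at once.
  void set_options (std::string target, std::string optimize)
  {
    target_ = std::move (target);
    optimize_ = std::move (optimize);
  }
};
```

Because there is no setter for a single field, code that used to assign one of the two now has to state what the other one becomes.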

> However, my patch mitigates
> 2 more cases.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK - might be worth backporting to GCC 11?

Thanks,
Richard.

> Thanks,
> Martin
>
> >
> > Thanks,
> > Richard.
> >
>


Re: [_GLIBCXX_DEBUG] Enhance rendering of assert message

2021-05-25 Thread Jonathan Wakely via Gcc-patches

On 22/05/21 22:08 +0200, François Dumont via Libstdc++ wrote:
Here is the part of the libbacktrace patch with the enhancement to the 
rendering of assert message.


It only contains one real fix, the rendering of address. In 2 places 
it was done with "0x%p", so resulting in something like: 0x0x012345678
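
The doubled prefix is easy to reproduce outside the library. The sketch below uses two made-up helpers, pointer_text and prefixed_pointer_text; it is not the debug.cc code. The text produced by %p is implementation-defined, and glibc already emits a leading "0x" for non-null pointers, which is where the "0x0x" comes from.

```cpp
#include <cstdio>
#include <string>

// Format a pointer with plain "%p".  What this prints is
// implementation-defined; glibc emits a "0x" prefix by itself.
std::string pointer_text (void *p)
{
  char buf[64];
  std::snprintf (buf, sizeof buf, "%p", p);
  return buf;
}

// Format a pointer with "0x%p", the form used before the fix.
std::string prefixed_pointer_text (void *p)
{
  char buf[64];
  std::snprintf (buf, sizeof buf, "0x%p", p);
  return buf;
}
```

Whatever the platform prints for %p, the second helper always returns "0x" followed by the output of the first.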


Otherwise it is just enhancements, mostly avoiding intermediate buffering.

I am adding the _Parameter::_Named type to help on the rendering. I 
hesitated in doing the same for the _M_iterator type but eventually 
managed it with a template function.


    libstdc++: [_GLIBCXX_DEBUG] Enhance rendering of assert message

    Avoid building an intermediate buffer to print to stderr, push 
directly to stderr.


    libstdc++-v3/ChangeLog:

    * include/debug/formatter.h
    (_Error_formatter::_Parameter::_Named): New.
    (_Error_formatter::_Parameter::_Type): Inherit latter.
    (_Error_formatter::_Parameter::_M_integer): Likewise.
    (_Error_formatter::_Parameter::_M_string): Likewise.
    * src/c++11/debug.cc: Include .
    (_Print_func_t): New.
    (print_raw(PrintContext&, const char*, ptrdiff_t)): New.
    (print_word): Use '%.*s' format in fprintf to render only 
expected number of chars.

    (pretty_print(PrintContext&, const char*, _Print_func_t)): New.
    (print_type): Rename in...
    (print_type_info): ...this. Use pretty_print.
    (print_address, print_integer): New.
    (print_named_name, print_iterator_constness, 
print_iterator_state): New.

    (print_iterator_seq_type): New.
    (print_named_field, print_type_field, 
print_instance_field, print_iterator_field): New.

    (print_field): Use latters.
    (print_quoted_named_name, print_type_type, print_type, 
print_instance): New.
    (print_string(PrintContext&, const char*, const 
_Parameter*, size_t)):

    Change signature to...
    (print_string(PrintContext&, const char*, ptrdiff_t, const 
_Parameter*, size_t)):
    ...this and adapt. Remove intermediate buffer to render 
input string.

    (print_string(PrintContext&, const char*, ptrdiff_t)): New.

Ok to commit ?

François




+  void
+  pretty_print(PrintContext& ctx, const char* str, _Print_func_t print_func)
+  {
+const char cxx1998[] = "__cxx1998::";
+const char uglification[] = "__";
+for (;;)
+  {
+   auto idx = strstr(str, uglification);


This is confusing. strstr returns a pointer, not an index into the
string.


+   if (idx)
+ {
+   size_t offset =
+ (idx == strstr(str, cxx1998)
+  ? sizeof(cxx1998) : sizeof(uglification)) - 1;


This is a bit inefficient. Consider "int __foo(__cxx1998::bar)". The
first strstr returns a pointer to "__foo" and then the second one
searches the entire string from the beginning looking for
"__cxx1998::", and checks if it is the same position as "__foo".

The second strstr doesn't need to search from the beginning, and it
doesn't need to look all the way to the end. It should be memcmp.

      if (auto pos = strstr(str, uglification))
        {
          if (pos != str)
            print_func(ctx, str, pos - str);

          if (memcmp(pos, cxx1998, sizeof(cxx1998)-1) == 0)
            str = pos + (sizeof(cxx1998) - 1);
          else
            str = pos + (sizeof(uglification) - 1);

          while (*str && isspace((unsigned char)*str))
            ++str;

          if (!*str)
            break;
        }
      else

It doesn't even need to search from the position found by the first
strstr, because we already know it starts with "__", so:

  for (;;)
    {
      if (auto pos = strstr(str, "__"))
        {
          if (pos != str)
            print_func(ctx, str, pos - str);

          pos += 2; // advance past "__"

          if (memcmp(pos, "cxx1998::", 9) == 0)
            pos += 9; // advance past "cxx1998::"

          str = pos;

          while (*str && isspace((unsigned char)*str))
            ++str;

          if (!*str)
            break;
        }
      else

But either is OK. Just not doing a second strstr through the entire
string again to look for "__cxx1998::".
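
For reference, the single-pass idea can be tried in a small standalone function. This is a sketch under assumptions: strip_uglification is a made-up name, appending to a std::string stands in for the patch's print callback, and strncmp is used instead of memcmp so a short tail is never read past its terminating NUL.

```cpp
#include <cstring>
#include <string>

// Sketch: drop every "__" and every "__cxx1998::" from a symbol name,
// searching the string only once.
std::string strip_uglification (const char *str)
{
  std::string out;
  while (const char *pos = std::strstr (str, "__"))
    {
      out.append (str, pos - str);   // text before the marker
      pos += 2;                      // advance past "__"
      if (std::strncmp (pos, "cxx1998::", 9) == 0)
        pos += 9;                    // advance past "cxx1998::"
      str = pos;
    }
  out.append (str);                  // remainder without any marker
  return out;
}
```

This sketch deliberately leaves out the whitespace skipping that the patch performs after each marker, so any space following a marker is kept in the output.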




+
+   if (idx != str)
+ print_func(ctx, str, idx - str);
+
+   str = idx + offset;
+
+   while (*str && isspace((unsigned char)*str))
+ ++str;


Is this really needed?

Why would there be whitespace following "__" or "__cxx1998::" and why
would we want to skip it?

I know it doesn't follow our usual naming scheme, but a symbol like
"__foo__ bar()" would get printed as "foobar()" wouldn't it?


The rest of the patch looks fine, I'm just unsure about pretty_print.
Maybe I've misunderstood the possible strings it gets used with?




Re: [PATCH] Add no_sanitize_coverage attribute.

2021-05-25 Thread Richard Biener via Gcc-patches
On Mon, May 24, 2021 at 10:27 AM Martin Liška  wrote:
>
> On 5/21/21 2:39 PM, Marco Elver wrote:
> > On Fri, May 21, 2021 at 10:50AM +0200, Martin Liška wrote:
> >> On 5/20/21 12:55 PM, Marco Elver wrote:
> >>> I think this came up with other no_sanitize [1] based on what I had
> >>> written to you last year [2].
> >>>
> >>> [1]https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547618.html
> >>> [2]https://lore.kernel.org/lkml/canpmjnnrz5ovkb6pe7k6gjfogbht_zhypkng9ad+kjndzk7...@mail.gmail.com/
> >>
> >> Ah, you're right. I've just updated the patch to address that.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >
> > Looks good, I also just built a kernel with the no_sanitize_coverage
> > attribute (without the objtool nop-workaround) and works as expected.
>
> Good, thanks!
>
> >
> > Not sure if required, but would such an additional test be useful:
>
> Yes, it is, thanks for it.
>
> >
> > ---
> >
> > diff --git a/gcc/testsuite/gcc.dg/sancov/attribute.c 
> > b/gcc/testsuite/gcc.dg/sancov/attribute.c
> > index bf6dbd4bae7..7cfa9134ff1 100644
> > --- a/gcc/testsuite/gcc.dg/sancov/attribute.c
> > +++ b/gcc/testsuite/gcc.dg/sancov/attribute.c
> > @@ -11,5 +11,17 @@ bar(void)
> >   {
> >   }
> >
> > +static void inline
> > +__attribute__((always_inline))
> > +inline_fn(void)
> > +{
> > +}
> > +
> > +void
> > +__attribute__((no_sanitize_coverage))
> > +baz(void)
> > +{
> > +  inline_fn();
> > +}
> >
> >   /* { dg-final { scan-tree-dump-times "__builtin___sanitizer_cov_trace_pc 
> > \\(\\)" 1 "optimized" } } */
> >
> > ---
> >
> > Otherwise, please go ahead. I assume this is targeting GCC 12?
>
> Yep.
>
> There's V3 I'm sending.
>
> Ready for master?

OK.

Thanks,
Richard.

> Thanks,
> Martin
>
> >
> > Thanks,
> > -- Marco
> >
>


Re: [PATCH 1/5] Common API for accessing global and on-demand ranges.

2021-05-25 Thread Richard Biener via Gcc-patches
On Tue, May 25, 2021 at 11:36 AM Aldy Hernandez  wrote:
>
>
>
> On 5/25/21 10:57 AM, Richard Biener wrote:
> > On Mon, May 24, 2021 at 6:44 PM Aldy Hernandez via Gcc-patches
> >  wrote:
> >>
> >>
> >>
> >> On 5/21/21 1:39 PM, Aldy Hernandez wrote:
> >>> This patch provides a generic API for accessing global ranges.  It is
> >>> meant to replace get_range_info() and get_ptr_nonnull() with one
> >>> common interface.  It uses the same API as the ranger (class
> >>> range_query), so there will now be one API for accessing local and
> >>> global ranges alike.
> >>>
> >>> Follow-up patches will convert all users of get_range_info and
> >>> get_ptr_nonnull to this API.
> >>>
> >>> For get_range_info, instead of:
> >>>
> >>> if (!POINTER_TYPE_P (TREE_TYPE (name)) && SSA_NAME_RANGE_INFO (name))
> >>>   get_range_info (name, vr);
> >>>
> >>> You can now do:
> >>>
> >>> RANGE_QUERY (cfun)->range_of_expr (vr, name, [stmt]);
> >>
> >> BTW, we're not wed to the idea of putting the current range object in
> >> cfun.  The important thing is that the API is consistent across, not
> >> where it lives.
> >
> > If the range object is specific for a function (and thus cannot handle
> > multiple functions in IPA mode) then struct function looks like the correct
> > place.  Accessing that unconditionally via 'cfun' sounds bad though because
> > that disallows use from IPA passes.
>
> The default range object can either be the "global_ranges" object
> (get_range_info / get_ptr_nonnull wrapper) or a ranger.  So, the former
> is global in nature and not tied to any function, and the latter is tied
> to the gimple IL in a function.
>
> What we want is a mechanism from which a pass can query the range of an
> SSA (or expression) at a statement or edge, etc agnostically.  If a
> ranger is activated, use that, otherwise use the global information.
>
> For convenience we wanted a mechanism in which we didn't have to pass an
> object between functions in a pass (be it a ranger or a struct
> function).  Back when I tried to convert some passes to a ranger, it was
> a pain to pass a ranger object around, and having to pass struct
> function would be similarly painful.
>
> ISTM, that most converted passes in this patchset already use cfun
> throughout.  For that matter, even the two IPA ones (ipa-fnsummary and
> ipa-prop) use cfun throughout (by first calling push_cfun (node->decl)).
>
> How about I use fun if easily accessible in a pass, otherwise cfun?  I'm
> trying to avoid having to pass around a struct function in passes that
> require surgery to do so (especially when they're already using cfun).
>
> Basically, we want minimal changes to clients for ease of use.

I think it's fine to not fix "endusers", esp. if they already use 'cfun'
and fixing would be a lot mechanical work.  What we need to avoid
is implicit uses of cfun via APIs we introduce because that makes
a pass/API that is "cfun" clean, eventually even working on explicit
struct function (and thus IPA safe) no longer so and depend on "cfun"
without that being visible.

The GSoC project to do threading did some cleanups here and
there and in the past we've transitioned to explicit 'cfun' uses
like BASIC_BLOCK_FOR_FN, etc. to expose those and make
pass refactorings meaningful so that a pass using passed around
'cfun' would survive with setting cfun to NULL.

Richard.

> Aldy
>


Re: [PATCH 1/5] Common API for accessing global and on-demand ranges.

2021-05-25 Thread Aldy Hernandez via Gcc-patches




On 5/25/21 10:57 AM, Richard Biener wrote:

On Mon, May 24, 2021 at 6:44 PM Aldy Hernandez via Gcc-patches
 wrote:




On 5/21/21 1:39 PM, Aldy Hernandez wrote:

This patch provides a generic API for accessing global ranges.  It is
meant to replace get_range_info() and get_ptr_nonnull() with one
common interface.  It uses the same API as the ranger (class
range_query), so there will now be one API for accessing local and
global ranges alike.

Follow-up patches will convert all users of get_range_info and
get_ptr_nonnull to this API.

For get_range_info, instead of:

if (!POINTER_TYPE_P (TREE_TYPE (name)) && SSA_NAME_RANGE_INFO (name))
  get_range_info (name, vr);

You can now do:

RANGE_QUERY (cfun)->range_of_expr (vr, name, [stmt]);


BTW, we're not wed to the idea of putting the current range object in
cfun.  The important thing is that the API is consistent across, not
where it lives.


If the range object is specific for a function (and thus cannot handle
multiple functions in IPA mode) then struct function looks like the correct
place.  Accessing that unconditionally via 'cfun' sounds bad though because
that disallows use from IPA passes.


The default range object can either be the "global_ranges" object 
(get_range_info / get_ptr_nonnull wrapper) or a ranger.  So, the former 
is global in nature and not tied to any function, and the latter is tied 
to the gimple IL in a function.


What we want is a mechanism from which a pass can query the range of an 
SSA (or expression) at a statement or edge, etc agnostically.  If a 
ranger is activated, use that, otherwise use the global information.


For convenience we wanted a mechanism in which we didn't have to pass an 
object between functions in a pass (be it a ranger or a struct 
function).  Back when I tried to convert some passes to a ranger, it was 
a pain to pass a ranger object around, and having to pass struct 
function would be similarly painful.


ISTM, that most converted passes in this patchset already use cfun 
throughout.  For that matter, even the two IPA ones (ipa-fnsummary and 
ipa-prop) use cfun throughout (by first calling push_cfun (node->decl)).


How about I use fun if easily accessible in a pass, otherwise cfun?  I'm 
trying to avoid having to pass around a struct function in passes that 
require surgery to do so (especially when they're already using cfun).


Basically, we want minimal changes to clients for ease of use.

Aldy



[PATCH] C-SKY: Amend copyrights of recently added files.

2021-05-25 Thread Xianmiao Qu
From: Cooper Qu 

This patch has been pushed.

gcc/ChangeLog:
* config/csky/csky-modes.def : Amend copyright.
* config/csky/csky_insn_fpuv2.md : Likewise.
* config/csky/csky_insn_fpuv3.md : Likewise.

gcc/testsuite/ChangeLog:
* gcc.target/csky/fpuv3/fpuv3.exp : Amend copyright.
---
 gcc/config/csky/csky-modes.def| 20 +++
 gcc/config/csky/csky_insn_fpuv2.md| 19 ++
 gcc/config/csky/csky_insn_fpuv3.md| 19 ++
 gcc/testsuite/gcc.target/csky/fpuv3/fpuv3.exp | 11 +-
 4 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/gcc/config/csky/csky-modes.def b/gcc/config/csky/csky-modes.def
index a2427ff17c7..9062efcf929 100644
--- a/gcc/config/csky/csky-modes.def
+++ b/gcc/config/csky/csky-modes.def
@@ -1,2 +1,22 @@
+;; C-SKY extra machine modes.
+;; Copyright (C) 2018-2021 Free Software Foundation, Inc.
+;; Contributed by C-SKY Microsystems and Mentor Graphics.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .  */
+
 /* Float modes.  */
 FLOAT_MODE (HF, 2, ieee_half_format);/* Half-precision floating point 
*/
diff --git a/gcc/config/csky/csky_insn_fpuv2.md 
b/gcc/config/csky/csky_insn_fpuv2.md
index 0a680f8bf35..d56b61f4032 100644
--- a/gcc/config/csky/csky_insn_fpuv2.md
+++ b/gcc/config/csky/csky_insn_fpuv2.md
@@ -1,3 +1,22 @@
+;; C-SKY FPUV2 instruction descriptions.
+;; Copyright (C) 2018-2021 Free Software Foundation, Inc.
+;; Contributed by C-SKY Microsystems and Mentor Graphics.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .  */
 
 ;; -
 ;; Float Abs instructions
diff --git a/gcc/config/csky/csky_insn_fpuv3.md 
b/gcc/config/csky/csky_insn_fpuv3.md
index 053673c49d2..b5f47980fa6 100644
--- a/gcc/config/csky/csky_insn_fpuv3.md
+++ b/gcc/config/csky/csky_insn_fpuv3.md
@@ -1,3 +1,22 @@
+;; C-SKY FPUV3 instruction descriptions.
+;; Copyright (C) 2018-2021 Free Software Foundation, Inc.
+;; Contributed by C-SKY Microsystems and Mentor Graphics.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .  */
 
 (define_c_enum "unspec" [
   UNSPEC_MAXNM_F3
diff --git a/gcc/testsuite/gcc.target/csky/fpuv3/fpuv3.exp 
b/gcc/testsuite/gcc.target/csky/fpuv3/fpuv3.exp
index 1170e12ac28..68c166cd485 100644
--- a/gcc/testsuite/gcc.target/csky/fpuv3/fpuv3.exp
+++ b/gcc/testsuite/gcc.target/csky/fpuv3/fpuv3.exp
@@ -1,20 +1,21 @@
-# Copyright (C) 1997, 2004, 2006, 2007 Free Software Foundation, Inc.
-
+# GCC testsuite for C-SKY targets FPUV3 instructions.
+# Copyright (C) 2012-2021 Free Software Foundation, Inc.
+# Contributed by C-SKY Microsystems and Mentor Graphics.
+#
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
 # the Free Software Foundation; either version 3 of the License, or
 # (at your option) any later version.
-#
+# 
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
 # 

Re: [PATCH] c++tools: Include <cstdlib> for exit [PR100731]

2021-05-25 Thread Richard Biener via Gcc-patches
On Tue, May 25, 2021 at 11:15 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> This TU uses exit, but doesn't include <stdlib.h> or <cstdlib> and relies
> on some other header to include it indirectly, which apparently doesn't
> happen on reporter's host.
>
> The other <c*> headers aren't guarded either and we rely on a compiler
> capable of C++11, so maybe we can rely on <cstdlib> being around
> unconditionally.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/11?

OK, but as the reporter notes none of the functions pulled by
c* are std:: qualified at calls ... is this not a requirement?

Richard.

> 2021-05-25  Jakub Jelinek  
>
> PR bootstrap/100731
> * server.cc: Include <cstdlib>.
>
> --- c++tools/server.cc.jj   2021-05-24 14:20:01.905748402 +0200
> +++ c++tools/server.cc  2021-05-24 14:24:29.760813389 +0200
> @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
>  #include 
>  #include 
>  #include 
> +#include <cstdlib>
>  // OS
>  #include 
>  #include 
>
> Jakub
>


RE: [backport gcc10, gcc9] Request to backport PR97969

2021-05-25 Thread Przemyslaw Wirkus via Gcc-patches
> -Original Message-
> From: Richard Biener 
> Sent: 02 February 2021 10:08
> To: Przemyslaw Wirkus 
> Cc: Vladimir Makarov ; gcc-patches@gcc.gnu.org;
> ja...@redhat.com; ni...@redhat.com; Richard Earnshaw
> ; Ramana Radhakrishnan
> ; Kyrylo Tkachov
> 
> Subject: RE: [backport gcc10, gcc9] Request to backport PR97969
> 
> On Tue, 2 Feb 2021, Przemyslaw Wirkus wrote:
> 
> > > On 2021-01-18 7:50 a.m., Richard Biener wrote:
> > > > On Mon, 18 Jan 2021, Przemyslaw Wirkus wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> Can we backport PR97969 patch to GCC 10 and (maybe) GCC 9 ?:
> > > >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97969
> > > >>
> > > >> IMHO bug is severe and could land in GCC 10 and 9. Vladimir's
> > > >> original patch:
> > > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563322.html
> > > >> applies without changes to both gcc-10 and gcc-9.
> > > >>
> > > >> I've regression tested this patch on both gcc-10 and gcc-9
> > > >> branches for x86_64 cross (arm-eabi target) and found no issues.
> > > >>
> > > >> OK for gcc-10 and gcc-9 ?
> > > > I see two fallout PRs with a trivial search: PR98643 and PR98722.
> > > > LRA patches quite easily trigger unexpected fallout unfortunately ...
> > > >
> > > Yes, I agree.  We should wait until the new regressions are
> > > fixed.  I am going to work on this patch more to fix the new
> > > regressions, although the basic idea of the original solution
> > > will probably stay the same.
> >
> > I've retested series of three patches which are related to this PR:
> >
> > 19af25c0b3aa2a78b4d45d295359ec26cb9fc607 [PR98777]
> > 79c57603602c4493b6baa1d47ed451e8f5e9c0f3 [PR98722]
> > 34aa56af2547e1646c0f07b9b88b210ebdb2a9f5 [PR97969]
> >
> > on top of gcc-10 branch.
> >
> > Bootstrapped and regression tested on aarch64-linux-gnu machine and no
> > issues.
> > Regression tested on x86_64 host (arm-eabi target) cross and no issues.
> >
> > OK for gcc-10 ?
> 
> I think this warrants waiting until at least the GCC 11 release.

Hi,
Just a follow up after GCC 11 release.

I've backported to gcc-10 branch (without any change to original patches)
PR97969 and following PR98722 & PR98777 patches.

Commits apply cleanly without changes.
Built and regression tested on:
* arm-none-eabi and
* aarch64-none-linux-gnu cross toolchains.

There were no issues and no regressions (all OK).

OK for backport to gcc-10 branch ?

Kind regards,
Przemyslaw Wirkus

---
commits I've backported:

commit cf2ac1c30af0fa783c8d72e527904dda5d8cc330
Author: Vladimir N. Makarov 
Date:   Tue Jan 12 11:26:15 2021 -0500

[PR97969] LRA: Transform pattern `plus (plus (hard reg, const), pseudo)` after elimination

commit 4334b524274203125193a08a8485250c41c2daa9
Author: Vladimir N. Makarov 
Date:   Wed Jan 20 11:40:14 2021 -0500

[PR98722] LRA: Check that target has no 3-op add insn to transform 2 plus expression.

commit 68ba1039c7daf0485b167fe199ed7e8031158091
Author: Vladimir N. Makarov 
Date:   Thu Jan 21 17:27:01 2021 -0500

[PR98777] LRA: Use preliminary created pseudo for in LRA elimination subpass

$ ./contrib/git-backport.py cf2ac1c30af0fa783c8d72e527904dda5d8cc330
$ ./contrib/git-backport.py 4334b524274203125193a08a8485250c41c2daa9
$ ./contrib/git-backport.py 68ba1039c7daf0485b167fe199ed7e8031158091


> Richard.


Re: [PATCH 4/5] Convert remaining passes to RANGE_QUERY.

2021-05-25 Thread Aldy Hernandez via Gcc-patches




On 5/25/21 10:47 AM, Richard Biener wrote:

On Mon, May 24, 2021 at 10:02 PM Martin Sebor via Gcc-patches
 wrote:


On 5/21/21 5:39 AM, Aldy Hernandez via Gcc-patches wrote:

This patch converts the remaining users of get_range_info and
get_ptr_nonnull to the range_query API.

No effort was made to move passes away from VR_ANTI_RANGE, or any other
use of deprecated methods.  This was a straight up conversion to the new
API, nothing else.


A question about the uses of the RANGE_QUERY() and GLOBAL_RANGE_QUERY()
macros (hopefully functions): some clients in this patch call one or
the other based on whether cfun is set or null, while others call it
without such testing.  That suggests that the former clients might
be making the assumption that cfun is null while the latter ones
make the opposite assumption that cfun is not null.  It seems that
the code would be safer/more future-proof if it avoided making
these assumptions.

That could be done by introducing a function like this one:

range_query&
get_range_query (const function *func = cfun)
{
  if (func)
    return func->x_range_query;
  return *get_global_range_query ();
}

This function would be easier to use since clients wouldn't have
to worry about when cfun is null.


Note that IPA passes also work on specific 'fun', not 'cfun' and
that 'cfun' stands in the way of threading GCC.  So please avoid
adding new references, even more so default args.

I wonder if SSA_NAME_RANGE_INFO is then obsolete?  What if
"ranger" is not enabled/available - will this effectively regress things
by not exposing SSA_NAME_RANGE_INFO (which also encodes
nonzero bits & friends)?


No, SSA_NAME_RANGE_INFO is not obsolete.  The default range mechanism 
when enable_ranger() has not been called is global_ranges, which is just 
a range_query object that uses SSA_NAME_RANGE_INFO and SSA_NAME_PTR_INFO 
under the covers.  See global_range_query::range_of_expr().
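
To make the fallback concrete, here is a rough, self-contained sketch of an accessor that takes the function explicitly (no cfun default, as Richard asked) and falls back to the global query when no ranger is active.  The struct definitions below are mock stand-ins rather than the real GCC declarations, and returning a pointer instead of a reference is only an assumption made for the illustration.

```cpp
// Mock stand-ins for GCC internals; none of these are the real declarations.
struct range_query
{
  const char *name;   // label only, so the two kinds can be told apart
};

struct function
{
  range_query *x_range_query;   // null until a ranger is enabled here
};

// Stand-in for the global query that reads SSA_NAME_RANGE_INFO.
inline range_query *
get_global_range_query ()
{
  static range_query global_ranges = { "global" };
  return &global_ranges;
}

// Hypothetical accessor: the caller passes the function explicitly,
// and a null FUN or a function without an active ranger falls back
// to the global query.
inline range_query *
get_range_query (const function *fun)
{
  if (fun && fun->x_range_query)
    return fun->x_range_query;
  return get_global_range_query ();
}
```

With something along these lines, an IPA pass would hand in whichever 'fun' it is working on, and callers would not need to test for null themselves.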


This patchset just provides a generic API so all things range related 
use the same interface.  In the future we may obsolete 
SSA_NAME_RANGE_INFO, but not before we provide all the functionality it 
already provides.  Note that get_nonzero_bits() is currently untouched.


However, what I will remove is global access to get_ptr_nonnull() and 
get_range_info(), since their only remaining user is from the 
range_query object this patchset provides.


Aldy



Re: [PATCH] c++tools: Include <cstdlib> for exit [PR100731]

2021-05-25 Thread Jonathan Wakely via Gcc-patches

On 25/05/21 10:37 +0200, Jakub Jelinek wrote:

Hi!

This TU uses exit, but doesn't include <stdlib.h> or <cstdlib> and relies
on some other header to include it indirectly, which apparently doesn't
happen on reporter's host.

The other <c*> headers aren't guarded either and we rely on a compiler
capable of C++11, so maybe we can rely on <cstdlib> being around
unconditionally.


<cstdlib> is required since C++98 anyway, as is std::exit.

But it's incorrect to include <cstdlib> and then use ::exit, it should
be <cstdlib> and std::exit, or <stdlib.h> and ::exit. But it probably
works everywhere this way too.
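
For illustration only, a minimal sketch of the strictly portable pairing, i.e. the <cstdlib> header together with the std::-qualified name.  The helper names are hypothetical and are not taken from server.cc.

```cpp
#include <cstdio>    // std::fflush, stdout
#include <cstdlib>   // std::exit, EXIT_SUCCESS, EXIT_FAILURE

// Hypothetical helper: map a success flag to a conventional
// process exit status.
int
exit_status_for (bool ok)
{
  return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}

// Hypothetical helper: flush stdout, then leave through the
// std::-qualified name that <cstdlib> is guaranteed to declare.
[[noreturn]] void
finish (bool ok)
{
  std::fflush (stdout);
  std::exit (exit_status_for (ok));
}
```

The other consistent choice would be <stdlib.h> together with the unqualified ::exit; mixing the two usually works in practice but is not guaranteed by the standard.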


Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/11?

2021-05-25  Jakub Jelinek  

PR bootstrap/100731
* server.cc: Include <cstdlib>.

--- c++tools/server.cc.jj   2021-05-24 14:20:01.905748402 +0200
+++ c++tools/server.cc  2021-05-24 14:24:29.760813389 +0200
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
#include 
#include 
#include 
+#include <cstdlib>
// OS
#include 
#include 

Jakub




[committed] openmp: Fix reduction clause handling on teams distribute simd [PR99928]

2021-05-25 Thread Jakub Jelinek via Gcc-patches
Hi!

When a directive isn't combined with a worksharing-loop, it takes a much
simpler clause-splitting path for reduction, and that path was missing
handling of teams when combined with simd.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk.
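
For reference, a small sketch of the kind of construct affected, modelled on the r28 case in pr99928-8.c.  The function name is made up for the example; with the fix the reduction clause should be visible on the teams construct as well as on simd, while the computed result is the same with or without -fopenmp.

```cpp
// Hypothetical example function: each of the N iterations bumps R once.
// Built with -fopenmp the reduction applies on both teams and simd;
// built without it the pragma is ignored and the loop runs serially.
int
count_iterations (int n)
{
  int r = 0;
  #pragma omp teams distribute simd reduction(+:r)
  for (int i = 0; i < n; i++)
    r++;
  return r;
}
```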

2021-05-25  Jakub Jelinek  

PR middle-end/99928
gcc/c-family/
* c-omp.c (c_omp_split_clauses): Copy reduction to teams when teams is
combined with simd and not with taskloop or for.
gcc/testsuite/
* c-c++-common/gomp/pr99928-8.c: Remove xfails from omp teams r21 and
r28 checks.
* c-c++-common/gomp/pr99928-9.c: Likewise.
* c-c++-common/gomp/pr99928-10.c: Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/reduction-17.c: New test.

--- gcc/c-family/c-omp.c.jj 2021-05-21 21:16:03.079850750 +0200
+++ gcc/c-family/c-omp.c    2021-05-24 17:34:13.514132206 +0200
@@ -2059,6 +2059,23 @@ c_omp_split_clauses (location_t loc, enu
                   OMP_CLAUSE_CHAIN (c) = cclauses[C_OMP_CLAUSE_SPLIT_TASKLOOP];
                   cclauses[C_OMP_CLAUSE_SPLIT_TASKLOOP] = c;
                 }
+              else if ((mask & (OMP_CLAUSE_MASK_1
+                                << PRAGMA_OMP_CLAUSE_NUM_TEAMS)) != 0)
+                {
+                  c = build_omp_clause (OMP_CLAUSE_LOCATION (clauses),
+                                        OMP_CLAUSE_REDUCTION);
+                  OMP_CLAUSE_DECL (c) = OMP_CLAUSE_DECL (clauses);
+                  OMP_CLAUSE_REDUCTION_CODE (c)
+                    = OMP_CLAUSE_REDUCTION_CODE (clauses);
+                  OMP_CLAUSE_REDUCTION_PLACEHOLDER (c)
+                    = OMP_CLAUSE_REDUCTION_PLACEHOLDER (clauses);
+                  OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER (c)
+                    = OMP_CLAUSE_REDUCTION_DECL_PLACEHOLDER (clauses);
+                  OMP_CLAUSE_REDUCTION_INSCAN (c)
+                    = OMP_CLAUSE_REDUCTION_INSCAN (clauses);
+                  OMP_CLAUSE_CHAIN (c) = cclauses[C_OMP_CLAUSE_SPLIT_TEAMS];
+                  cclauses[C_OMP_CLAUSE_SPLIT_TEAMS] = c;
+                }
               s = C_OMP_CLAUSE_SPLIT_SIMD;
             }
           else
--- gcc/testsuite/c-c++-common/gomp/pr99928-8.c.jj  2021-05-13 16:53:31.063368324 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr99928-8.c 2021-05-24 17:57:15.809001673 +0200
@@ -155,7 +155,7 @@ bar (void)
 r20++;
   /* { dg-final { scan-tree-dump "omp target\[^\n\r]*map\\(tofrom:r21" "gimple" { xfail *-*-* } } } */
   /* { dg-final { scan-tree-dump-not "omp target\[^\n\r]*firstprivate\\(r21\\)" "gimple" { xfail *-*-* } } } */
-  /* { dg-final { scan-tree-dump "omp teams\[^\n\r]*reduction\\(\\+:r21\\)" "gimple" { xfail *-*-* } } } */
+  /* { dg-final { scan-tree-dump "omp teams\[^\n\r]*reduction\\(\\+:r21\\)" "gimple" } } */
   /* { dg-final { scan-tree-dump-not "omp distribute\[^\n\r]*reduction\\(\\+:r21\\)" "gimple" } } */
   /* { dg-final { scan-tree-dump "omp simd\[^\n\r]*reduction\\(\\+:r21\\)" "gimple" } } */
   #pragma omp target teams distribute simd reduction(+:r21)
@@ -202,7 +202,7 @@ bar (void)
   #pragma omp teams distribute parallel for simd reduction(+:r27)
   for (int i = 0; i < 64; i++)
 r27++;
-  /* { dg-final { scan-tree-dump "omp teams\[^\n\r]*reduction\\(\\+:r28\\)" "gimple" { xfail *-*-* } } } */
+  /* { dg-final { scan-tree-dump "omp teams\[^\n\r]*reduction\\(\\+:r28\\)" "gimple" } } */
   /* { dg-final { scan-tree-dump-not "omp distribute\[^\n\r]*reduction\\(\\+:r28\\)" "gimple" } } */
   /* { dg-final { scan-tree-dump "omp simd\[^\n\r]*reduction\\(\\+:r28\\)" "gimple" } } */
   #pragma omp teams distribute simd reduction(+:r28)
--- gcc/testsuite/c-c++-common/gomp/pr99928-9.c.jj  2021-05-13 16:53:31.064368313 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr99928-9.c 2021-05-24 17:58:22.022277334 +0200
@@ -155,7 +155,7 @@ bar (void)
 r20[1]++;
   /* { dg-final { scan-tree-dump "omp target\[^\n\r]*map\\(tofrom:r21\\\[1\\\] \\\[len: 8\\\]" "gimple" { xfail *-*-* } } } */
   /* { dg-final { scan-tree-dump-not "omp target\[^\n\r]*firstprivate\\(r21\\)" "gimple" } } */
-  /* { dg-final { scan-tree-dump "omp teams\[^\n\r]*reduction\\(\\+:MEM\[^\n\r]* \\+ 4" "gimple" { xfail *-*-* } } } */
+  /* { dg-final { scan-tree-dump "omp teams\[^\n\r]*reduction\\(\\+:MEM\[^\n\r]* \\+ 4" "gimple" } } */
   /* { dg-final { scan-tree-dump-not "omp distribute\[^\n\r]*reduction\\(\\+:MEM\[^\n\r]* \\+ 4" "gimple" } } */
   /* { dg-final { scan-tree-dump "omp simd\[^\n\r]*reduction\\(\\+:MEM\[^\n\r]* \\+ 4" "gimple" } } */
   #pragma omp target teams distribute simd reduction(+:r21[1:2])
@@ -202,7 +202,7 @@ bar (void)
   #pragma omp teams distribute parallel for simd reduction(+:r27[1:2])
   for (int i = 0; i < 64; i++)
 r27[1]++;
-  /* { dg-final { scan-tree-dump "omp teams\[^\n\r]*reduction\\(\\+:MEM\[^\n\r]* \\+ 4" "gimple" { xfail *-*-* } } } */
+  /* { dg-final { scan-tree-dump "omp teams\[^\n\r]*reduction\\(\\+:MEM\[^\n\r]* \\+ 4" 

Re: [PATCH] tree-sra: Avoid refreshing into const base decls (PR 100453)

2021-05-25 Thread Richard Biener
On Tue, 25 May 2021, Eric Botcazou wrote:

> > The problem with this patch is that it causes:
> > 
> >   FAIL: gnat.dg/opt94.adb scan-tree-dump-times optimized "worker" 1
> > 
> > which is exactly the testcase from the commit which caused the bug I am
> > trying to address.
> 
> Sorry about that, a thinko in the original change, I'm testing this fixlet.

LGTM.

> 
>   * gimplify.c (gimplify_decl_expr): Clear TREE_READONLY on the DECL
>   when creating an initialization statement for it.
>   * tree-inline.c (setup_one_parameter): Fix thinko in new condition.
> 


Re: [PATCH] Update copyright years in c++tools

2021-05-25 Thread Richard Biener
On Tue, 25 May 2021, Jakub Jelinek wrote:

> Hi!
> 
> While looking at PR100731, I have noticed the copyright years are 2020-ish
> only.  This patch adds it to update-copyright.py and updates those.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2021-05-25  Jakub Jelinek  
> 
> contrib/
>   * update-copyright.py: Add c++tools.
> c++tools/
>   * Makefile.in: Update copyright year.
>   * configure.ac: Likewise.
>   * resolver.cc: Likewise.
>   * resolver.h: Likewise.
>   * server.cc: Likewise.
>   (print_version): Update copyright notice date.
> 
> --- contrib/update-copyright.py
> +++ contrib/update-copyright.py
> @@ -735,6 +735,7 @@ class GCCCmdLine (CmdLine):
>  
>  self.add_dir ('.', TopLevelFilter())
>  # boehm-gc is imported from upstream.
> +self.add_dir ('c++tools')
>  self.add_dir ('config', ConfigFilter())
>  # contrib isn't really part of GCC.
>  self.add_dir ('fixincludes')
> @@ -770,6 +771,7 @@ class GCCCmdLine (CmdLine):
>  # zlib is imported from upstream.
>  
>  self.default_dirs = [
> +'c++tools',
>  'gcc',
>  'include',
>  'libada',
> --- c++tools/Makefile.in
> +++ c++tools/Makefile.in
> @@ -1,5 +1,5 @@
>  # Makefile for c++tools
> -#   Copyright 2020 Free Software Foundation, Inc.
> +#   Copyright (C) 2020-2021 Free Software Foundation, Inc.
>  #
>  # This file is free software; you can redistribute it and/or modify
>  # it under the terms of the GNU General Public License as published by
> --- c++tools/configure.ac
> +++ c++tools/configure.ac
> @@ -1,5 +1,5 @@
>  # Configure script for c++tools
> -#   Copyright (C) 2020 Free Software Foundation, Inc.
> +#   Copyright (C) 2020-2021 Free Software Foundation, Inc.
>  #   Written by Nathan Sidwell  while at FaceBook
>  #
>  # This file is free software; you can redistribute it and/or modify it
> --- c++tools/resolver.cc
> +++ c++tools/resolver.cc
> @@ -1,5 +1,5 @@
>  /* C++ modules.  Experimental!   -*- c++ -*-
> -   Copyright (C) 2017-2020 Free Software Foundation, Inc.
> +   Copyright (C) 2017-2021 Free Software Foundation, Inc.
> Written by Nathan Sidwell  while at FaceBook
>  
> This file is part of GCC.
> --- c++tools/resolver.h
> +++ c++tools/resolver.h
> @@ -1,5 +1,5 @@
>  /* C++ modules.  Experimental!   -*- c++ -*-
> -   Copyright (C) 2017-2020 Free Software Foundation, Inc.
> +   Copyright (C) 2017-2021 Free Software Foundation, Inc.
> Written by Nathan Sidwell  while at FaceBook
>  
> This file is part of GCC.
> --- c++tools/server.cc
> +++ c++tools/server.cc
> @@ -1,5 +1,5 @@
>  /* C++ modules.  Experimental!
> -   Copyright (C) 2018-2020 Free Software Foundation, Inc.
> +   Copyright (C) 2018-2021 Free Software Foundation, Inc.
> Written by Nathan Sidwell  while at FaceBook
>  
> This file is part of GCC.
> @@ -290,7 +290,7 @@ static void ATTRIBUTE_NORETURN
>  print_version (void)
>  {
>fnotice (stdout, "%s %s%s\n", progname, pkgversion_string, version_string);
> -  fprintf (stdout, "Copyright %s 2018-2020 Free Software Foundation, Inc.\n",
> +  fprintf (stdout, "Copyright %s 2018-2021 Free Software Foundation, Inc.\n",
>  ("(C)"));
>fnotice (stdout,
>  ("This is free software; see the source for copying conditions.\n"
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH 0/11] warning control by group and location (PR 74765)

2021-05-25 Thread Richard Biener via Gcc-patches
On Tue, May 25, 2021 at 2:53 AM Martin Sebor via Gcc-patches
 wrote:
>
> On 5/24/21 5:08 PM, David Malcolm wrote:
> > On Mon, 2021-05-24 at 16:02 -0600, Martin Sebor wrote:
> >>The rare expressions that have no location
> >> continue to have just one bit[1].
> >
> > Where does this get stored?  I see the final patch in the kit removes
> > TREE_NO_WARNING, but I don't quite follow the logic for where the bit
> > would then be stored for an expr with UNKNOWN_LOCATION.
>
> The patch just removes the TREE_NO_WARNING macro (along with
> the gimple_no_warning_p/gimple_set_no_warning) functions but not
> the no-warning bit itself.  It removes them to avoid accidentally
> modifying the bit alone without going through the new API and
> updating the location -> warning group mapping.  The bit is still
> needed for expression/statements with no location.

I wonder if we could clone UNKNOWN_LOCATION, thus when
we set_no_warning on UNKNOWN_LOCATION create a new location
with the source location being still UNKNOWN but with the appropriate
ad-hoc data to disable the warning?  That of course requires the
API to be

location_t set_no_warning (...)

and users would need to update the container with the new location
(or we'd need to use a reference we can update in set_no_warning).
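
To sketch what I mean, very roughly: set_no_warning hands back a new location that keeps the (possibly unknown) source location and carries the disabled warning groups as ad-hoc data.  Everything below is a mock-up for illustration; the table class, the flag bit and the group masks are assumptions and not the libcpp ad-hoc location machinery.

```cpp
#include <cstdint>
#include <vector>

typedef std::uint32_t location_t;

// Stand-ins only: the real UNKNOWN_LOCATION and the real ad-hoc
// location encoding live in libcpp and look different.
const location_t UNKNOWN_LOC = 0;
const location_t ADHOC_FLAG = 0x80000000u;

class location_table
{
  struct adhoc_entry
  {
    location_t source;     // underlying source location, may be unknown
    unsigned suppressed;   // bit mask of disabled warning groups
  };
  std::vector<adhoc_entry> entries_;

public:
  // Underlying source location of LOC.
  location_t source_of (location_t loc) const
  {
    return (loc & ADHOC_FLAG) ? entries_[loc & ~ADHOC_FLAG].source : loc;
  }

  // Mask of warning groups disabled for LOC.
  unsigned suppressed_groups (location_t loc) const
  {
    return (loc & ADHOC_FLAG) ? entries_[loc & ~ADHOC_FLAG].suppressed : 0u;
  }

  bool warning_suppressed_p (location_t loc, unsigned group) const
  {
    return (suppressed_groups (loc) & group) != 0;
  }

  // Return a fresh location with the same source location as LOC and
  // GROUP added to its disabled set.  The caller has to store the
  // result back into the expression or statement.
  location_t set_no_warning (location_t loc, unsigned group)
  {
    adhoc_entry e = { source_of (loc), suppressed_groups (loc) | group };
    entries_.push_back (e);
    return ADHOC_FLAG | static_cast<location_t> (entries_.size () - 1);
  }
};
```

The cost is visible in the signature: callers cannot ignore the return value, which is why the container update is the awkward part.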

That said - do you have any stats on how many UNKNOWN_LOCATION
locations we run into with bootstrap / the testsuite?

Otherwise thanks for tackling this long-standing issue.

Richard.

