Re: [PATCH] rtl-optimization/80960 - avoid creating garbage RTL in DSE

2021-02-01 Thread Richard Biener
On Mon, 1 Feb 2021, Jakub Jelinek wrote:

> On Mon, Feb 01, 2021 at 12:54:50PM -0700, Jeff Law wrote:
> > >>> So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the
> > >>> patch but counts are _extremely_ small.  Statistics:
> > >>>
> > >>>   70148 dse: local deletions = 0, global deletions = 0
> > >>>      32 dse: local deletions = 0, global deletions = 1
> > >>>       9 dse: local deletions = 0, global deletions = 2
> > >>>       7 dse: local deletions = 0, global deletions = 3
> > >>>       2 dse: local deletions = 0, global deletions = 4
> > >>>       2 dse: local deletions = 0, global deletions = 5
> > >>>       3 dse: local deletions = 0, global deletions = 7
> > >>>      67 dse: local deletions = 1, global deletions = 0
> > >>>       1 dse: local deletions = 1, global deletions = 2
> > >>>      12 dse: local deletions = 2, global deletions = 0
> > >>>       1 dse: local deletions = 24, global deletions = 1
> > >>>       2 dse: local deletions = 3, global deletions = 0
> > >>>       4 dse: local deletions = 4, global deletions = 0
> > >>>       4 dse: local deletions = 6, global deletions = 0
> > >>>       1 dse: local deletions = 7, global deletions = 0
> > >>>       1 dse: local deletions = 8, global deletions = 0
> > >>>
> > >>> so not sure how much confidence this brings over the analytical
> > >>> reasoning that it shouldn't make a difference ...
> > >>>
> > >>> stats on just dse2 are even more depressing (given its cost)
> > >>>
> > >>>   35123 dse: local deletions = 0, global deletions = 0
> > >>>       2 dse: local deletions = 0, global deletions = 1
> > >>>      20 dse: local deletions = 1, global deletions = 0
> > >>>       1 dse: local deletions = 2, global deletions = 0
> > >>>       1 dse: local deletions = 3, global deletions = 0
> > >>>       1 dse: local deletions = 4, global deletions = 0
> > >> Based on that, I'd argue that DSE2 should go away and DSE1 should be
> > >> evaluated for the chopping block.  While RTL DSE was marginally
> > >> important in 1999 when it was first submitted, the tree-ssa pipeline as
> > >> a whole has probably made RTL DSE largely pointless.
> > > True. Though I'd argue that DSE2 might be the conceptually more useful 
> > > pass since it sees spill slots. 
> > True in concept, but I bet that the SSA pipeline has made this much less
> > common in RTL DSE than it was 20+ years ago.  Our allocator and reloader
> > are much improved as well which would further decrease the number of
> > opportunities.
> > 
> > I'd hazard a guess that what's left are locals that need to be
> > addressable and some optimization in the RTL pipeline exposed a dead
> > store that wasn't otherwise visible in the SSA pipeline.  But the only
> > way to be sure would be to dig into them.
> 
> Shouldn't we gather statistics from larger codebase first and perhaps
> compare against tree-ssa-dse statistics?  I mean, in many functions there
> are no DSE opportunities at all.

Of course.  Some DSE will definitely be required because we expose
ABI details only on RTL and expand sometimes is quite stupid.  ISTR
either DCE or CSE performs some limited amount of DSE as well?

The most needed and interesting work will be to disentangle RTL expansion
into the "complex" bits to be done on (lowered) GIMPLE and the
mechanical detail of GIMPLE to RTL one instruction at a time.  I guess
only during this work we'll learn what we need in lowered GIMPLE.

Richard.


Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-02-01 Thread Richard Biener
On Mon, 1 Feb 2021, Qing Zhao wrote:

> Hi, Richard,
> 
> I have adjusted the SRA phase to split calls to DEFERRED_INIT per your suggestion.
> 
> And now the routine “bump_map” in 511.povray looks like the following:
> ...
> 
>  # DEBUG BEGIN_STMT
>   xcoor = 0.0;
>   ycoor = 0.0;
>   # DEBUG BEGIN_STMT
>   index = .DEFERRED_INIT (index, 2);
>   index2 = .DEFERRED_INIT (index2, 2);
>   index3 = .DEFERRED_INIT (index3, 2);
>   # DEBUG BEGIN_STMT
>   colour1 = .DEFERRED_INIT (colour1, 2);
>   colour2 = .DEFERRED_INIT (colour2, 2);
>   colour3 = .DEFERRED_INIT (colour3, 2);
>   # DEBUG BEGIN_STMT
>   p1$0_181 = .DEFERRED_INIT (p1$0_195(D), 2);
>   # DEBUG p1$0 => p1$0_181
>   p1$1_184 = .DEFERRED_INIT (p1$1_182(D), 2);
>   # DEBUG p1$1 => p1$1_184
>   p1$2_172 = .DEFERRED_INIT (p1$2_185(D), 2);
>   # DEBUG p1$2 => p1$2_172
>   p2$0_177 = .DEFERRED_INIT (p2$0_173(D), 2);
>   # DEBUG p2$0 => p2$0_177
>   p2$1_135 = .DEFERRED_INIT (p2$1_178(D), 2);
>   # DEBUG p2$1 => p2$1_135
>   p2$2_137 = .DEFERRED_INIT (p2$2_136(D), 2);
>   # DEBUG p2$2 => p2$2_137
>   p3$0_377 = .DEFERRED_INIT (p3$0_376(D), 2);
>   # DEBUG p3$0 => p3$0_377
>   p3$1_379 = .DEFERRED_INIT (p3$1_378(D), 2);
>   # DEBUG p3$1 => p3$1_379
>   p3$2_381 = .DEFERRED_INIT (p3$2_380(D), 2);
>   # DEBUG p3$2 => p3$2_381
> 
> 
> In the above, p1, p2, and p3 are all split into calls to DEFERRED_INIT of 
> the components of p1, p2 and p3. 
> 
> With this change, the stack usage numbers with -fstack-usage for approach A, 
> old approach D and new D with the splitting in SRA are:
> 
>   Approach A    Approach D-old    Approach D-new
> 
>   272           624               368
> 
> From the above, we can see that splitting the call to DEFERRED_INIT in SRA 
> can reduce the stack usage increase dramatically. 
> 
> However, it looks like the stack size for D is still bigger than for A. 
> 
> I checked the IR again, and found that the alias analysis might be 
> responsible for this (by comparing the image.cpp.026t.ealias for both A and D):
> 
> (Due to the call to:
> 
>   colour1 = .DEFERRED_INIT (colour1, 2);
> )
> 
> **Approach A:
> 
> Points_to analysis:
> 
> Constraints:
> …
> colour1 = 
> …
> colour1 = 
> colour1 = 
> colour1 = 
> colour1 = 
> colour1 = 
> ...
> callarg(53) = 
> ...
> _53 = colour1
> 
> Points_to sets:
> …
> colour1 = { NULL ESCAPED NONLOCAL } same as _53
> ...
> CALLUSED(48) = { NULL ESCAPED NONLOCAL index colour1 }
> CALLCLOBBERED(49) = { NULL ESCAPED NONLOCAL index colour1 } same as 
> CALLUSED(48)
> ...
> callarg(53) = { NULL ESCAPED NONLOCAL colour1 }
> 
> **Approach D:
> 
> Points_to analysis:
> 
> Constraints:
> …
> callarg(19) = colour1
> callarg(19) = 
> colour1 = callarg(19) + UNKNOWN
> colour1 = 
> …
> colour1 = 
> colour1 = 
> colour1 = 
> colour1 = 
> colour1 = 
> …
> callarg(74) = 
> callarg(74) = callarg(74) + UNKNOWN
> callarg(74) = *callarg(74) + UNKNOWN
> …
> _53 = colour1
> _54 = _53
> _55 = _54 + UNKNOWN
> _55 = 
> _56 = colour1
> _57 = _56
> _58 = _57 + UNKNOWN
> _58 = 
> _59 = _55 + UNKNOWN
> _59 = _58 + UNKNOWN
> _60 = colour1
> _61 = _60
> _62 = _61 + UNKNOWN
> _62 = 
> _63 = _59 + UNKNOWN
> _63 = _62 + UNKNOWN
> _64 = _63 + UNKNOWN
> ..
> Points_to set:
> …
> colour1 = { ESCAPED NONLOCAL } same as callarg(19)
> …
> CALLUSED(69) = { ESCAPED NONLOCAL index colour1 }
> CALLCLOBBERED(70) = { ESCAPED NONLOCAL index colour1 } same as CALLUSED(69)
> callarg(71) = { ESCAPED NONLOCAL }
> callarg(72) = { ESCAPED NONLOCAL }
> callarg(73) = { ESCAPED NONLOCAL }
> callarg(74) = { ESCAPED NONLOCAL colour1 }
> 
> My question:
> 
> Is it possible to adjust alias analysis to resolve this issue?

You probably want to handle .DEFERRED_INIT in tree-ssa-structalias.c
find_func_aliases_for_call (it's not a builtin but you can look in
the respective subroutine for examples).  Specifically you want to
avoid making anything escaped or clobbered.

> thanks.
> 
> Qing
> 
> > On Jan 18, 2021, at 10:12 AM, Qing Zhao via Gcc-patches 
> >  wrote:
> > 
> > I checked the routine “pov::bump_map” in 511.povray_r, since it
> > has a large stack increase due to implementation D, by examining
> > the IR immediately before the RTL expansion phase
> > (image.cpp.244t.optimized). I found that we have the following
> > additional statements for the array elements:
> > 
> > void  pov::bump_map (double * EPoint, struct TNORMAL * Tnormal, double
> > * normal)
> > {
> > …
> > double p3[3];
> > double p2[3];
> > double p1[3];
> > float colour3[5];
> > float colour2[5];
> > float colour1[5];
> > …
> > # DEBUG BEGIN_STMT
> > colour1 = .DEFERRED_INIT (colour1, 2);
> > colour2 = .DEFERRED_INIT (colour2, 2);
> > colour3 = .DEFERRED_INIT (colour3, 2);
> > # DEBUG BEGIN_STMT
> > MEM  [(double[3] *)] = p1$0_144(D);
> > MEM  [(double[3] *) + 8B] = p1$1_135(D);
> > MEM  [(double[3] *) + 16B] = p1$2_138(D);
> > p1 = .DEFERRED_INIT (p1, 2);
> > # DEBUG 

[PATCH v2] PR target/98743: Fix ICE in convert_move for RISC-V

2021-02-01 Thread Kito Cheng
 - Check that the mode of `from` is not BLKmode before calling store_expr;
   calling store_expr with BLKmode will cause an ICE.

 - Verified with riscv64, x86_64 and aarch64; no new regressions.

Note: This logic was introduced by 3e60ddeb8220ed388819bb3f14e8caa9309fd3c2,
  so I cc Jakub for review.

Changes for V2:

 - Checking mode of `from` rather than mode of `to`.
 - Verified on riscv64, x86_64 and aarch64 again.

gcc/ChangeLog:

PR target/98743
* expr.c: Check mode before calling store_expr.

gcc/testsuite/ChangeLog:

PR target/98743
* g++.target/riscv/pr98743.C: New.
---
 gcc/expr.c   |  1 +
 gcc/testsuite/g++.target/riscv/pr98743.C | 27 
 2 files changed, 28 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/riscv/pr98743.C

diff --git a/gcc/expr.c b/gcc/expr.c
index 04ef5ad114d..86dc1b6c973 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -5459,6 +5459,7 @@ expand_assignment (tree to, tree from, bool nontemporal)
  /* If to_rtx is a promoted subreg, we need to zero or sign
 extend the value afterwards.  */
  if (TREE_CODE (to) == MEM_REF
+ && TYPE_MODE (TREE_TYPE (from)) != BLKmode
  && !REF_REVERSE_STORAGE_ORDER (to)
  && known_eq (bitpos, 0)
  && known_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (to_rtx))))
diff --git a/gcc/testsuite/g++.target/riscv/pr98743.C b/gcc/testsuite/g++.target/riscv/pr98743.C
new file mode 100644
index 000..41f476fbe8e
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/pr98743.C
@@ -0,0 +1,27 @@
+// Test for value-initialization via {}
+// { dg-do run { target c++11 } }
+/* { dg-options "-Og -fno-early-inlining -finline-small-functions -fpack-struct" */
+void * operator new (__SIZE_TYPE__, void *p) { return p; }
+void * operator new[] (__SIZE_TYPE__, void *p) { return p; }
+
+// Empty base so A isn't an aggregate
+struct B {};
+struct A: B {
+  int i;
+};
+
+struct C: A {
+  C(): A{} {}
+};
+
+int main()
+{
+  int space = 42;
+  A* ap = new (&space) A{};
+  int space1[1] = { 42 };
+  A* a1p = new (space1) A[1]{};
+  if (ap->i != 0 ||
+  a1p[0].i != 0)
+return 1;
+  return 0;
+}
-- 
2.30.0



Re: [PATCH] arm: Auto-vectorization for MVE: vorn

2021-02-01 Thread Christophe Lyon via Gcc-patches
On Mon, 1 Feb 2021 at 10:08, Kyrylo Tkachov  wrote:
>
>
>
> > -Original Message-
> > From: Christophe Lyon 
> > Sent: 29 January 2021 18:18
> > To: Kyrylo Tkachov 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH] arm: Auto-vectorization for MVE: vorn
> >
> > On Fri, 29 Jan 2021 at 16:03, Kyrylo Tkachov 
> > wrote:
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: Gcc-patches  On Behalf Of
> > > > Christophe Lyon via Gcc-patches
> > > > Sent: 29 January 2021 14:55
> > > > To: gcc-patches@gcc.gnu.org
> > > > Subject: [PATCH] arm: Auto-vectorization for MVE: vorn
> > > >
> > > > This patch enables MVE vornq instructions for auto-vectorization.  MVE
> > > > vornq insns in mve.md are modified to use ior instead of unspec
> > > > expression to support ior3.  The ior3 expander is added to
> > > > vec-common.md
> > > >
> > > > 2021-01-29  Christophe Lyon  
> > > >
> > > >   gcc/
> > > >   * config/arm/iterators.md (supf): Remove VORNQ_S and VORNQ_U.
> > > >   (VORNQ): Remove.
> > > >   * config/arm/mve.md (mve_vornq_s): New entry for vorn
> > > >   instruction using expression ior.
> > > >   (mve_vornq_u): New expander.
> > > >   (mve_vornq_f): Use ior code instead of unspec.
> > > >   * config/arm/unspecs.md (VORNQ_S, VORNQ_U, VORNQ_F):
> > > > Remove.
> > > >   * config/arm/vec-common.md (orn3): New expander.
> > > >
> > > >   gcc/testsuite/
> > > >   * gcc.target/arm/simd/mve-vorn.c: Add vorn tests.
> > > > ---
> > > >  gcc/config/arm/iterators.md  |  3 +--
> > > >  gcc/config/arm/mve.md| 23 +++--
> > > >  gcc/config/arm/unspecs.md|  3 ---
> > > >  gcc/config/arm/vec-common.md |  8 ++
> > > >  gcc/testsuite/gcc.target/arm/simd/mve-vorn.c | 38
> > > > 
> > > >  5 files changed, 62 insertions(+), 13 deletions(-)
> > > >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vorn.c
> > > >
> > > > diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> > > > index b902790..43aab23 100644
> > > > --- a/gcc/config/arm/iterators.md
> > > > +++ b/gcc/config/arm/iterators.md
> > > > @@ -1293,7 +1293,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s")
> > > > (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
> > > >  (VMULLBQ_INT_S "s") (VMULLBQ_INT_U "u")
> > > > (VQADDQ_S "s")
> > > >  (VMULLTQ_INT_S "s") (VMULLTQ_INT_U "u")
> > > > (VQADDQ_U "u")
> > > >  (VMULQ_N_S "s") (VMULQ_N_U "u") (VMULQ_S "s")
> > > > -(VMULQ_U "u") (VORNQ_S "s") (VORNQ_U "u")
> > > > +(VMULQ_U "u")
> > > >  (VQADDQ_N_S "s") (VQADDQ_N_U "u")
> > > >  (VQRSHLQ_N_S "s") (VQRSHLQ_N_U "u") (VQRSHLQ_S
> > > > "s")
> > > >  (VQRSHLQ_U "u") (VQSHLQ_N_S "s")
> > > >   (VQSHLQ_N_U "u")
> > > > @@ -1563,7 +1563,6 @@ (define_int_iterator VMULLBQ_INT
> > > > [VMULLBQ_INT_U VMULLBQ_INT_S])
> > > >  (define_int_iterator VMULLTQ_INT [VMULLTQ_INT_U VMULLTQ_INT_S])
> > > >  (define_int_iterator VMULQ [VMULQ_U VMULQ_S])
> > > >  (define_int_iterator VMULQ_N [VMULQ_N_U VMULQ_N_S])
> > > > -(define_int_iterator VORNQ [VORNQ_U VORNQ_S])
> > > >  (define_int_iterator VQADDQ [VQADDQ_U VQADDQ_S])
> > > >  (define_int_iterator VQADDQ_N [VQADDQ_N_S VQADDQ_N_U])
> > > >  (define_int_iterator VQRSHLQ [VQRSHLQ_S VQRSHLQ_U])
> > > > diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> > > > index 465f71c..ec0ef7b 100644
> > > > --- a/gcc/config/arm/mve.md
> > > > +++ b/gcc/config/arm/mve.md
> > > > @@ -1634,18 +1634,26 @@ (define_insn "mve_vmulq"
> > > >  ;;
> > > >  ;; [vornq_u, vornq_s])
> > > >  ;;
> > > > -(define_insn "mve_vornq_"
> > > > +(define_insn "mve_vornq_s"
> > > >[
> > > > (set (match_operand:MVE_2 0 "s_register_operand" "=w")
> > > > - (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
> > > > -(match_operand:MVE_2 2 "s_register_operand" "w")]
> > > > -  VORNQ))
> > > > + (ior:MVE_2 (not:MVE_2 (match_operand:MVE_2 2
> > > > "s_register_operand" "w"))
> > > > +(match_operand:MVE_2 1 "s_register_operand" "w")))
> > > >]
> > > >"TARGET_HAVE_MVE"
> > > > -  "vorn %q0, %q1, %q2"
> > > > +   "vorn\t%q0, %q1, %q2"
> > > >[(set_attr "type" "mve_move")
> > > >  ])
> > > >
> > > > +(define_expand "mve_vornq_u"
> > > > +  [
> > > > +   (set (match_operand:MVE_2 0 "s_register_operand")
> > > > + (ior:MVE_2 (not:MVE_2 (match_operand:MVE_2 2
> > > > "s_register_operand"))
> > > > +(match_operand:MVE_2 1 "s_register_operand")))
> > > > +  ]
> > > > +  "TARGET_HAVE_MVE"
> > > > +)
> > > > +
> > > >  ;;
> > > >  ;; [vorrq_s, vorrq_u])
> > > >  ;;
> > > > @@ -2630,9 +2638,8 @@ (define_insn "mve_vmulq_n_f"
> > > >  (define_insn "mve_vornq_f"
> > > >[
> > > > (set (match_operand:MVE_0 0 

Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-02-01 Thread Ed Smith-Rowland via Gcc-patches

On 2/2/21 12:12 AM, Jason Merrill wrote:

On 2/1/21 9:15 PM, Ed Smith-Rowland wrote:

On 2/1/21 2:23 PM, Jakub Jelinek wrote:

On Mon, Feb 01, 2021 at 01:46:13PM -0500, Ed Smith-Rowland wrote:

@@ -0,0 +1,8 @@
+// { dg-do compile { target c++23 } }
+
+#include 
+#include 
+
+static_assert(std::is_same_v);
+static_assert(std::is_same_v);
Shouldn't this be std::make_signed::type instead of 
std::ptrdiff_t
Yes it should. The paper goes on about ptrdiff_t but at the very end 
they punt on that in favor of what you have.


+std::ptrdiff_t pd1 = 1234z; // { dg-warning "use of C\+\+23 ptrdiff_t integer constant" "" { target c++20_down } }
+std::ptrdiff_t PD1 = 5678Z; // { dg-warning "use of C\+\+23 ptrdiff_t integer constant" "" { target c++20_down } }

Ditto here.

Agree.


+  const char *message = (result & CPP_N_UNSIGNED) == CPP_N_UNSIGNED
+    ? N_("use of C++23 size_t integer constant")
+    : N_("use of C++23 ptrdiff_t integer constant");

And here too (perhaps %::type%> )?
And maybe % too.

Agree.



--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -500,6 +500,9 @@ struct cpp_options
    /* Nonzero means tokenize C++20 module directives.  */
    unsigned char module_directives;
+  /* Nonzero for C++23 ptrdiff_t and size_t literals.  */

And drop "ptrdiff_t and " here?

+#define CPP_N_SIZE_T    0x200 /* C++23 size_t or ptrdiff_t literal  */

And " or ptrdiff_t" here?

While ptrdiff_t will usually be the same type, seems there is e.g.:
config/darwin.h:#define SIZE_TYPE "long unsigned int"
config/darwin.h:#define PTRDIFF_TYPE "int"
config/i386/djgpp.h:#define SIZE_TYPE "long unsigned int"
config/i386/djgpp.h:#define PTRDIFF_TYPE "int"
config/m32c/m32c.h:#define PTRDIFF_TYPE (TARGET_A16 ? "int" : "long int")
config/m32c/m32c.h:#define SIZE_TYPE "unsigned int"
config/rs6000/rs6000.h:#define PTRDIFF_TYPE "int"
config/rs6000/rs6000.h:#define SIZE_TYPE "long unsigned int"
config/s390/linux.h:#define SIZE_TYPE "long unsigned int"
config/s390/linux.h:#define PTRDIFF_TYPE (TARGET_64BIT ? "long int" : "int")
config/visium/visium.h:#define SIZE_TYPE "unsigned int"
config/visium/visium.h:#define PTRDIFF_TYPE "long int"
config/vms/vms.h:#define SIZE_TYPE  "unsigned int"
config/vms/vms.h:#define PTRDIFF_TYPE (flag_vms_pointer_size == VMS_POINTER_SIZE_NONE ? \
config/vms/vms.h-  "int" : "long long int")
so quite a few differences.

Jakub
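
To make the distinction discussed above concrete, here is a small standalone sketch (not part of the patch) that compares the signed counterpart of size_t with ptrdiff_t on whatever target it is compiled for. The names signed_size_t and signed_size_is_ptrdiff are made up for this illustration; only the standard type traits are real. It deliberately avoids the new literal suffixes so that it builds with any C++11 compiler.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical alias for illustration: the signed integer type that
// corresponds to std::size_t.
using signed_size_t = std::make_signed<std::size_t>::type;

// Hypothetical helper: true only when this target happens to use the very
// same type for ptrdiff_t and for the signed counterpart of size_t.
constexpr bool signed_size_is_ptrdiff()
{
  return std::is_same<signed_size_t, std::ptrdiff_t>::value;
}

// These properties hold on every target, whatever the answer above is.
static_assert(std::is_signed<signed_size_t>::value,
              "signed counterpart of size_t must be signed");
static_assert(sizeof(signed_size_t) == sizeof(std::size_t),
              "signed counterpart of size_t has the same width as size_t");
```

On targets like the ones listed above, where SIZE_TYPE and PTRDIFF_TYPE are based on different integer types, signed_size_is_ptrdiff() would be expected to return false, which is why the tests and diagnostics should not mention ptrdiff_t.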


Here is my last patch with all the concerns addressed.

I am not smart enough to get the dg-warning regex in 
Wsize_t-literals.C to work. If someone could carry this over the 
finish line that would be great. Or give me pointers. I can't any more.


Your regex will work fine if you wrap it in {} instead of "", e.g.

{ dg-warning {use of C\+\+23 .size_t. integer constant} }

Jason


Thank you Jason,

So here is the latest in testing.

Ed


diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index dca6815a876..48dec21d4b4 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1025,6 +1025,11 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_aggregate_paren_init=201902L");
  cpp_define (pfile, "__cpp_using_enum=201907L");
}
+  if (cxx_dialect > cxx20)
+   {
+ /* Set feature test macros for C++23.  */
+ cpp_define (pfile, "__cpp_size_t_suffix=202006L");
+   }
   if (flag_concepts)
 {
  if (cxx_dialect >= cxx20)
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index fe40a0f728b..6374b72ed2d 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -834,6 +834,14 @@ interpret_integer (const cpp_token *token, unsigned int flags,
 type = ((flags & CPP_N_UNSIGNED)
? widest_unsigned_literal_type_node
: widest_integer_literal_type_node);
+  else if (flags & CPP_N_SIZE_T)
+{
+  /* itk refers to fundamental types not aliased size types.  */
+  if (flags & CPP_N_UNSIGNED)
+   type = size_type_node;
+  else
+   type = signed_size_type_node;
+}
   else
 {
   type = integer_types[itk];
diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C b/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C
index f8d84ed..a30ec0f4f7e 100644
--- a/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C
@@ -17,6 +17,30 @@ unsigned long long int
 operator"" ull(unsigned long long int k)  // { dg-warning "integer suffix|shadowed by implementation" }
 { return k; }
 
+unsigned long long int
+operator"" z(unsigned long long int k)  // { dg-warning "integer suffix|shadowed by implementation" }
+{ return k; }
+
+unsigned long long int
+operator"" uz(unsigned long long int k)  // { dg-warning "integer suffix|shadowed by implementation" }
+{ return k; }
+
+unsigned long long int
+operator"" zu(unsigned long long int k)  // { dg-warning "integer suffix|shadowed by implementation" }
+{ return k; 

Re: [PATCH] c++: Fix ICE from op_unqualified_lookup [PR97582]

2021-02-01 Thread Jason Merrill via Gcc-patches

On 2/2/21 12:19 AM, Patrick Palka wrote:

In this testcase, we're crashing because the lookup of operator+ from
within the generic lambda via lookup_name finds multiple bindings
(namely C1::operator+ and C2::operator+) and returns a TREE_LIST
thereof, something which maybe_save_operator_binding isn't prepared to
handle.

Since we already discard the result of lookup_name when it returns a
class-scope binding here, it seems cleaner (and equivalent) to instead
communicate to lookup_name that we don't want such bindings in the first
place.  While this change seems like an improvement on its own, it also
fixes the mentioned PR, because the call to lookup_name now returns
NULL_TREE rather than a TREE_LIST of (unwanted) class-scope bindings.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/9/10?

gcc/cp/ChangeLog:

PR c++/97582
* name-lookup.c (op_unqualified_lookup): Pass BLOCK_NAMESPACE to
lookup_name in order to ignore class-scope bindings, rather
than discarding them after the fact.

gcc/testsuite/ChangeLog:

PR c++/97582
* g++.dg/cpp0x/lambda/lambda-template17.C: New test.
---
  gcc/cp/name-lookup.c  | 11 +++
  gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C |  8 
  2 files changed, 11 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 52e4a630e25..46d6cc0dfa4 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -9213,17 +9213,12 @@ op_unqualified_lookup (tree fnname)
return NULL_TREE;
  }
  
-  tree fns = lookup_name (fnname);

+  /* We don't need to remember class-scope functions or declarations,
+ normal unqualified lookup will find them again.  */
+  tree fns = lookup_name (fnname, LOOK_where::BLOCK_NAMESPACE);


Hmm, I'd expect this to look past class-scope declarations to find 
namespace-scope declarations, but we want class decls to hide decls in 
an outer scope.



if (!fns)
  /* Remember we found nothing!  */
  return error_mark_node;
-
-  tree d = is_overloaded_fn (fns) ? get_first_fn (fns) : fns;
-  if (DECL_CLASS_SCOPE_P (d))
-/* We don't need to remember class-scope functions or declarations,
-   normal unqualified lookup will find them again.  */
-fns = NULL_TREE;
-
return fns;
  }
  
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C

new file mode 100644
index 000..6cafbab8cb0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C
@@ -0,0 +1,8 @@
+// PR c++/97582
+// { dg-do compile { target c++11 } }
+
+struct C1 { void operator+(); };
+struct C2 { void operator+(); };
+struct C3 : C1, C2 {
+  template  void get() { [] (T x) { +x; }; }
+};





[PATCH] c++: Fix ICE from op_unqualified_lookup [PR97582]

2021-02-01 Thread Patrick Palka via Gcc-patches
In this testcase, we're crashing because the lookup of operator+ from
within the generic lambda via lookup_name finds multiple bindings
(namely C1::operator+ and C2::operator+) and returns a TREE_LIST
thereof, something which maybe_save_operator_binding isn't prepared to
handle.

Since we already discard the result of lookup_name when it returns a
class-scope binding here, it seems cleaner (and equivalent) to instead
communicate to lookup_name that we don't want such bindings in the first
place.  While this change seems like an improvement on its own, it also
fixes the mentioned PR, because the call to lookup_name now returns
NULL_TREE rather than a TREE_LIST of (unwanted) class-scope bindings.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/9/10?

gcc/cp/ChangeLog:

PR c++/97582
* name-lookup.c (op_unqualified_lookup): Pass BLOCK_NAMESPACE to
lookup_name in order to ignore class-scope bindings, rather
than discarding them after the fact.

gcc/testsuite/ChangeLog:

PR c++/97582
* g++.dg/cpp0x/lambda/lambda-template17.C: New test.
---
 gcc/cp/name-lookup.c  | 11 +++
 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C |  8 
 2 files changed, 11 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 52e4a630e25..46d6cc0dfa4 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -9213,17 +9213,12 @@ op_unqualified_lookup (tree fnname)
return NULL_TREE;
 }
 
-  tree fns = lookup_name (fnname);
+  /* We don't need to remember class-scope functions or declarations,
+ normal unqualified lookup will find them again.  */
+  tree fns = lookup_name (fnname, LOOK_where::BLOCK_NAMESPACE);
   if (!fns)
 /* Remember we found nothing!  */
 return error_mark_node;
-
-  tree d = is_overloaded_fn (fns) ? get_first_fn (fns) : fns;
-  if (DECL_CLASS_SCOPE_P (d))
-/* We don't need to remember class-scope functions or declarations,
-   normal unqualified lookup will find them again.  */
-fns = NULL_TREE;
-
   return fns;
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C
new file mode 100644
index 000..6cafbab8cb0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template17.C
@@ -0,0 +1,8 @@
+// PR c++/97582
+// { dg-do compile { target c++11 } }
+
+struct C1 { void operator+(); };
+struct C2 { void operator+(); };
+struct C3 : C1, C2 {
+  template  void get() { [] (T x) { +x; }; }
+};
-- 
2.30.0.335.ge6362826a0



Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-02-01 Thread Jason Merrill via Gcc-patches

On 2/1/21 9:15 PM, Ed Smith-Rowland wrote:

On 2/1/21 2:23 PM, Jakub Jelinek wrote:

On Mon, Feb 01, 2021 at 01:46:13PM -0500, Ed Smith-Rowland wrote:

@@ -0,0 +1,8 @@
+// { dg-do compile { target c++23 } }
+
+#include 
+#include 
+
+static_assert(std::is_same_v);
+static_assert(std::is_same_v);
Shouldn't this be std::make_signed::type instead of 
std::ptrdiff_t
Yes it should. The paper goes on about ptrdiff_t but at the very end 
they punt on that in favor of what you have.


+std::ptrdiff_t pd1 = 1234z; // { dg-warning "use of C\+\+23 ptrdiff_t integer constant" "" { target c++20_down } }
+std::ptrdiff_t PD1 = 5678Z; // { dg-warning "use of C\+\+23 ptrdiff_t integer constant" "" { target c++20_down } }

Ditto here.

Agree.



+  const char *message = (result & CPP_N_UNSIGNED) == CPP_N_UNSIGNED
+    ? N_("use of C++23 size_t integer constant")
+    : N_("use of C++23 ptrdiff_t integer constant");

And here too (perhaps %::type%> )?
And maybe % too.

Agree.



--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -500,6 +500,9 @@ struct cpp_options
    /* Nonzero means tokenize C++20 module directives.  */
    unsigned char module_directives;
+  /* Nonzero for C++23 ptrdiff_t and size_t literals.  */

And drop "ptrdiff_t and " here?

+#define CPP_N_SIZE_T    0x200 /* C++23 size_t or ptrdiff_t literal  */

And " or ptrdiff_t" here?

While ptrdiff_t will usually be the same type, seems there is e.g.:
config/darwin.h:#define SIZE_TYPE "long unsigned int"
config/darwin.h:#define PTRDIFF_TYPE "int"
config/i386/djgpp.h:#define SIZE_TYPE "long unsigned int"
config/i386/djgpp.h:#define PTRDIFF_TYPE "int"
config/m32c/m32c.h:#define PTRDIFF_TYPE (TARGET_A16 ? "int" : "long int")
config/m32c/m32c.h:#define SIZE_TYPE "unsigned int"
config/rs6000/rs6000.h:#define PTRDIFF_TYPE "int"
config/rs6000/rs6000.h:#define SIZE_TYPE "long unsigned int"
config/s390/linux.h:#define SIZE_TYPE "long unsigned int"
config/s390/linux.h:#define PTRDIFF_TYPE (TARGET_64BIT ? "long int" : "int")
config/visium/visium.h:#define SIZE_TYPE "unsigned int"
config/visium/visium.h:#define PTRDIFF_TYPE "long int"
config/vms/vms.h:#define SIZE_TYPE  "unsigned int"
config/vms/vms.h:#define PTRDIFF_TYPE (flag_vms_pointer_size == VMS_POINTER_SIZE_NONE ? \
config/vms/vms.h-  "int" : "long long int")
so quite a few differences.

Jakub


Here is my last patch with all the concerns addressed.

I am not smart enough to get the dg-warning regex in Wsize_t-literals.C 
to work. If someone could carry this over the finish line that would be 
great. Or give me pointers. I can't any more.


Your regex will work fine if you wrap it in {} instead of "", e.g.

{ dg-warning {use of C\+\+23 .size_t. integer constant} }

Jason



[PATCH] Make asm not contain prefixed addresses.

2021-02-01 Thread Michael Meissner via Gcc-patches
From 4ceff15935a16da9ec5833279807855a8afc47cd Mon Sep 17 00:00:00 2001
From: Michael Meissner 
Date: Mon, 1 Feb 2021 22:19:57 -0500
Subject: [PATCH] Make asm not contain prefixed addresses.

In PR target/98519, the assembler does not like asm memory references that are
prefixed.  We can't automatically change the instruction to prefixed form with
a 'p' like we do for normal RTL insns, since this is assembly code.  Instead,
the patch prevents prefixed memory addresses from being passed by default.

This patch uses the TARGET_MD_ASM_ADJUST target hook to change the 'm' and 'o'
constraints to be 'em' and 'eo'.  The 'em' and 'eo' constraints do not allow
prefixed instructions.  In addition, a new constraint 'ep' is added to
explicitly require a prefixed instruction.

I have tested this on a little-endian power9 system with a bootstrap build
and make check.  There were no regressions.  Can I check this into the master
branch?

I would like to also backport this change to the GCC 10 branch, since the bug
can happen in that release as well.
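
For illustration only, the constraint rewriting described above can be sketched outside of GCC as a plain string transformation. This is not the code from the patch: avoid_prefixed_memory is a hypothetical helper name, it works on std::string rather than on the vectors passed to the target hook, and it assumes (as the patch comments state) that 'e' and 'w' introduce two-letter constraints whose second letter must be copied unchanged.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the function from the patch): return a copy of an
// inline-asm constraint string in which every 'm' becomes "em" and every 'o'
// becomes "eo", so that prefixed memory operands are no longer accepted.
std::string avoid_prefixed_memory(const std::string &constraint)
{
  std::string result;
  // Worst case: every character is doubled.
  result.reserve(2 * constraint.size());

  for (std::size_t i = 0; i < constraint.size(); ++i)
    {
      const char ch = constraint[i];
      switch (ch)
        {
        case 'm':
        case 'o':
          // Rewrite to the non-prefixed variant of the constraint.
          result += 'e';
          result += ch;
          break;

        case 'e':
        case 'w':
          // Assumed to start a two-letter constraint: copy both letters so
          // that the second one is never rewritten by mistake.
          result += ch;
          if (i + 1 < constraint.size())
            result += constraint[++i];
          break;

        default:
          result += ch;
          break;
        }
    }
  return result;
}
```

With this sketch, avoid_prefixed_memory("=m") yields "=em" and avoid_prefixed_memory("=r,o") yields "=r,eo", while constraints that are already two-letter forms, such as "em", "ep" or "wa", are returned unchanged.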

[gcc]
2021-02-01  Michael Meissner  

PR target/98519
* config/rs6000/constraints.md (em): New constraint.
(eo): New constraint.
(ep): New constraint.
* config/rs6000/rs6000.c (rs6000_md_asm_adjust): If prefixed
addresses are allowed, change the 'm' and 'o' constraints to 'em'
and 'eo' that do not allow prefixed instructions.
* doc/md.texi (PowerPC and IBM RS6000 constraints): Document 'em',
'eo', and 'ep' constraints.

[gcc/testsuite]
2021-02-01  Michael Meissner  

PR target/98519
* gcc.target/powerpc/pr98519.c: New test.
---
 gcc/config/rs6000/constraints.md   | 19 
 gcc/config/rs6000/rs6000.c | 56 +-
 gcc/doc/md.texi|  9 
 gcc/testsuite/gcc.target/powerpc/pr98519.c | 17 +++
 4 files changed, 100 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr98519.c

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 561ce9797af..56a7eda4c85 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -245,6 +245,25 @@ (define_memory_constraint "es"
   (and (match_code "mem")
(match_test "GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) != RTX_AUTOINC")))
 
+;; Non-prefixed memory
+(define_memory_constraint "em"
+  "@internal Memory operand that is not prefixed."
+  (and (match_code "mem")
+   (not (match_operand 0 "prefixed_memory"))))
+
+;; Non-prefixed memory that is also offsettable
+(define_memory_constraint "eo"
+  "@internal Memory operand that is not prefixed but is offsettable."
+  (and (match_code "mem")
+   (not (match_operand 0 "prefixed_memory"))
+   (match_operand 0 "offsettable_mem_operand")))
+
+;; prefixed memory
+(define_memory_constraint "ep"
+  "@internal Memory operand that is prefixed."
+  (and (match_code "mem")
+   (match_operand 0 "prefixed_memory")))
+
 (define_memory_constraint "Q"
   "A memory operand addressed by just a base register."
   (and (match_code "mem")
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ec068c58aa5..726928a4d32 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3414,9 +3414,63 @@ rs6000_builtin_mask_calculate (void)
 
 static rtx_insn *
 rs6000_md_asm_adjust (vec &/*outputs*/, vec &/*inputs*/,
- vec &/*constraints*/,
+ vec ,
  vec , HARD_REG_SET _regs)
 {
+  /* If prefixed addresses are allowed, change the "m" constraint to "em" and
+ the "o" constraint to "eo".  The "em" and "eo" constraints do not allow
+ prefixed memory.  */
+  if (TARGET_PREFIXED)
+{
+  size_t max_len = 0;
+  for (unsigned i = 0; i < constraints.length (); ++i)
+   {
+ size_t len = strlen (constraints[i]);
+ if (len > max_len)
+   max_len = len;
+   }
+
+  char *buffer = (char *) alloca (2 * max_len + 1);
+  for (unsigned i = 0; i < constraints.length (); ++i)
+   {
+ const char *constraint = constraints[i];
+ char *new_constraint = buffer;
+ char ch;
+ bool found_m_or_o = false;
+
+ while ((ch = *constraint++) != '\0')
+   switch (ch)
+ {
+ default:
+   *new_constraint++ = ch;
+   break;
+
+   /* If we found 'm' or 'o', convert it to 'em' or 'eo' so that
+  prefixed memory is not allowed.  */
+ case 'm':
+ case 'o':
+   found_m_or_o = true;
+   *new_constraint++ = 'e';
+   *new_constraint++ = ch;
+   break;
+
+   /* 'e' and 'w' begin two letter constraints.  */
+ case 'e':
+ case 'w':
+   *new_constraint++ = ch;
+   *new_constraint++ = 

[committed] analyzer: directly explore within static functions [PR93355, PR96374]

2021-02-01 Thread David Malcolm via Gcc-patches
PR analyzer/93355 tracks that -fanalyzer fails to report the FILE *
leak in read_alias_file in intl/localealias.c.

One reason for the failure is that read_alias_file is marked as
"static", and the path leading to the single call of
read_alias_file is falsely rejected as infeasible due to
PR analyzer/96374.  I have been attempting to fix that bug, but
don't have a good solution yet.

Previously, -fanalyzer only directly explored "static" functions
if they were needed for call summaries, instead forcing them to
be indirectly explored, but if we have a feasibility bug like
above, we will fail to report any issues in a function that's
only called by such a falsely infeasible path.

It now seems wrong to me to reject directly exploring static
functions: even if there is currently no way to call a function,
it seems reasonable to warn about bugs within them, since
otherwise these latent bugs are a timebomb in the code.

Hence this patch reworks toplevel_function_p to directly explore
almost all functions, working around these feasibility issues.
It introduces a naming convention that "__analyzer_"-prefixed
function names don't get directly explored, since this is
useful in the analyzer's DejaGnu-based tests.

This workaround gets PR analyzer/93355 closer to working, but
unfortunately there is a second instance of PR analyzer/96374
within read_alias_file itself which means even with this patch
-fanalyzer falsely rejects the path as infeasible.

Still, this ought to help in other cases, and simplifies the
implementation.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7029-g8a2750086d57d1a2251d9239fa4e6c2dc9ec3a86.
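
As an illustration of the kind of code this change is aimed at, here is
a minimal sketch (not taken from intl/localealias.c or from the
testsuite) of a "static" function with a FILE * leak on one path.  The
function name first_line_length is hypothetical.  With static functions
explored directly, a leak like this one could be examined even if no
feasible caller is found; whether a diagnostic is actually issued still
depends on the feasibility checks inside the function itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example: returns the length of the first line of FNAME,
   or 0 if the file cannot be opened or the first line is empty.  */
static size_t first_line_length(const char *fname)
{
    FILE *fp;
    size_t n = 0;
    int ch;

    assert(fname != NULL);
    fp = fopen(fname, "r");
    if (fp == NULL)
        return 0;

    while ((ch = fgetc(fp)) != EOF && ch != '\n')
        ++n;

    if (n == 0)
        return 0; /* Deliberate bug: fp is not closed on this path.  */

    fclose(fp);
    return n;
}
```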

gcc/analyzer/ChangeLog:
PR analyzer/93355
PR analyzer/96374
* engine.cc (toplevel_function_p): Simplify so that
we only reject functions with a "__analyzer_" prefix.
(add_any_callbacks): Delete.
(exploded_graph::build_initial_worklist): Update for
dropped param of toplevel_function_p.
(exploded_graph::build_initial_worklist): Don't bother
looking for callbacks that are reachable from global
initializers.

gcc/testsuite/ChangeLog:
PR analyzer/93355
PR analyzer/96374
* gcc.dg/analyzer/conditionals-3.c: Add "__analyzer_"
prefix to support subroutines where necessary.
* gcc.dg/analyzer/data-model-1.c: Likewise.
* gcc.dg/analyzer/feasibility-1.c (called_by_test_6a): New.
(test_6a): New.
* gcc.dg/analyzer/params.c: Add "__analyzer_" prefix to support
subroutines where necessary.
* gcc.dg/analyzer/pr96651-2.c: Likewise.
* gcc.dg/analyzer/signal-4b.c: Likewise.
* gcc.dg/analyzer/single-field.c: Likewise.
* gcc.dg/analyzer/torture/conditionals-2.c: Likewise.
---
 gcc/analyzer/engine.cc| 68 +--
 .../gcc.dg/analyzer/conditionals-3.c  |  8 +--
 gcc/testsuite/gcc.dg/analyzer/data-model-1.c  |  4 +-
 gcc/testsuite/gcc.dg/analyzer/feasibility-1.c | 26 +++
 gcc/testsuite/gcc.dg/analyzer/params.c|  4 +-
 gcc/testsuite/gcc.dg/analyzer/pr96651-2.c |  4 +-
 gcc/testsuite/gcc.dg/analyzer/signal-4b.c | 18 ++---
 gcc/testsuite/gcc.dg/analyzer/single-field.c  |  8 +--
 .../gcc.dg/analyzer/torture/conditionals-2.c  |  8 +--
 9 files changed, 69 insertions(+), 79 deletions(-)

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index fc81e7523fb..45aed8f0d37 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -2348,38 +2348,27 @@ exploded_graph::get_per_function_data (function *fun) const
   return NULL;
 }
 
-/* Return true if NODE and FUN should be traversed directly, rather than
+/* Return true if FUN should be traversed directly, rather than only as
called via other functions.  */
 
 static bool
-toplevel_function_p (cgraph_node *node, function *fun, logger *logger)
+toplevel_function_p (function *fun, logger *logger)
 {
-  /* TODO: better logic here
- e.g. only if more than one caller, and significantly complicated.
- Perhaps some whole-callgraph analysis to decide if it's worth summarizing
- an edge, and if so, we need summaries.  */
-  if (flag_analyzer_call_summaries)
-{
-  int num_call_sites = 0;
-  for (cgraph_edge *edge = node->callers; edge; edge = edge->next_caller)
-   ++num_call_sites;
-
-  /* For now, if there's more than one in-edge, and we want call
-summaries, do it at the top level so that there's a chance
-we'll have a summary when we need one.  */
-  if (num_call_sites > 1)
-   {
- if (logger)
-   logger->log ("traversing %qE (%i call sites)",
-fun->decl, num_call_sites);
- return true;
-   }
-}
-
-  if (!TREE_PUBLIC (fun->decl))
+  /* Don't directly traverse into functions that have an "__analyzer_"
+ prefix.  Doing so is useful for the analyzer test suite, allowing
+ us 

[committed] analyzer: add more feasibility test cases [PR93355, PR96374]

2021-02-01 Thread David Malcolm via Gcc-patches
This patch adds a couple more reduced test cases derived from the
integration test for PR analyzer/93355.  In both cases, the analyzer
falsely rejects the buggy code paths as being infeasible due to
PR analyzer/96374, and so the tests are marked as XFAIL for now.

Tested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7028-gf2f639c4a781016ad146d44f463714fe4295cb6e.

gcc/testsuite/ChangeLog:
PR analyzer/93355
PR analyzer/96374
* gcc.dg/analyzer/pr93355-localealias-feasibility-2.c: New test.
* gcc.dg/analyzer/pr93355-localealias-feasibility-3.c: New test.
---
 .../pr93355-localealias-feasibility-2.c   | 31 +
 .../pr93355-localealias-feasibility-3.c   | 64 +++
 2 files changed, 95 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr93355-localealias-feasibility-2.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr93355-localealias-feasibility-3.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/pr93355-localealias-feasibility-2.c b/gcc/testsuite/gcc.dg/analyzer/pr93355-localealias-feasibility-2.c
new file mode 100644
index 000..1afc6df5da1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr93355-localealias-feasibility-2.c
@@ -0,0 +1,31 @@
+/* Simplified version of test to ensure we issue a FILE * leak diagnostic,
+   reproducing a feasibility issue.
+   Adapted from intl/localealias.c, with all #includes removed.  */
+
+/* { dg-do "compile" } */
+
+#include "analyzer-decls.h"
+
+#define NULL ((void *) 0)
+#define PATH_SEPARATOR ':'
+#define LOCALE_ALIAS_PATH "value for LOCALE_ALIAS_PATH"
+
+const char *
+_nl_expand_alias (void)
+{
+  static const char *locale_alias_path;
+
+  if (locale_alias_path == NULL)
+locale_alias_path = LOCALE_ALIAS_PATH;
+
+  const char *start = locale_alias_path;
+
+  while (locale_alias_path[0] != '\0'
+&& locale_alias_path[0] != PATH_SEPARATOR)
+++locale_alias_path;
+
+  if (start < locale_alias_path)
+__analyzer_dump_path (); /* { dg-message "path" "" { xfail *-*-* } } */
+  /* XFAIL: PR analyzer/96374
+ Use -fno-analyzer-feasibility to see the path.  */
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr93355-localealias-feasibility-3.c b/gcc/testsuite/gcc.dg/analyzer/pr93355-localealias-feasibility-3.c
new file mode 100644
index 000..a86483113ff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr93355-localealias-feasibility-3.c
@@ -0,0 +1,64 @@
+/* Simplified version of test to ensure we issue a FILE * leak diagnostic,
+   reproducing a feasibility issue.
+   Adapted from intl/localealias.c, with all #includes removed.  */
+
+/* { dg-do "compile" } */
+
+/* Handle aliases for locale names.
+   Copyright (C) 1995-1999, 2000-2001, 2003 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify it
+   under the terms of the GNU Library General Public License as published
+   by the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Library General Public License for more details.
+
+   You should have received a copy of the GNU Library General Public
+   License along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301,
+   USA.  */
+
+/* Minimal version of system headers.  */
+
+typedef __SIZE_TYPE__ size_t;
+#define NULL ((void *)0)
+
+typedef struct _IO_FILE FILE;
+extern FILE *fopen (const char *__restrict __filename,
+   const char *__restrict __modes);
+extern int fclose (FILE *__stream);
+
+extern int isspace (int) __attribute__((__nothrow__, __leaf__));
+
+/* Cleaned-up body of localealias.c follows.  */
+
+size_t
+read_alias_file (const char *fname, char *cp)
+{
+  FILE *fp;
+
+  fp = fopen (fname, "r"); /* { dg-message "opened here" "" { xfail *-*-* } } */
+  /* XFAIL: PR analyzer/96374
+ Use -fno-analyzer-feasibility to see the path.  */
+  if (fp == NULL)
+return 0;
+
+  if (cp[0] != '\0')
+*cp++ = '\0';
+
+  while (isspace ((unsigned char)cp[0]))
+++cp;
+
+  if (cp[0] != '\0')
+return 42; /* { dg-warning "leak of FILE 'fp'" "" { xfail *-*-* } } */
+  /* XFAIL: PR analyzer/96374
+ Use -fno-analyzer-feasibility to see the path.  */
+
+  fclose(fp);
+
+  return 0;
+}
-- 
2.26.2



Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-02-01 Thread Ed Smith-Rowland via Gcc-patches

On 2/1/21 2:23 PM, Jakub Jelinek wrote:

On Mon, Feb 01, 2021 at 01:46:13PM -0500, Ed Smith-Rowland wrote:

@@ -0,0 +1,8 @@
+// { dg-do compile { target c++23 } }
+
+#include 
+#include 
+
+static_assert(std::is_same_v);
+static_assert(std::is_same_v);

Shouldn't this be std::make_signed::type instead of std::ptrdiff_t
Yes it should. The paper goes on about ptrdiff_t but at the very end 
they punt on that in favor of what you have.



+std::ptrdiff_t pd1 = 1234z; // { dg-warning "use of C\+\+23 ptrdiff_t integer constant" "" { target c++20_down } }
+std::ptrdiff_t PD1 = 5678Z; // { dg-warning "use of C\+\+23 ptrdiff_t integer constant" "" { target c++20_down } }

Ditto here.

Agree.



+ const char *message = (result & CPP_N_UNSIGNED) == CPP_N_UNSIGNED
+   ? N_("use of C++23 size_t integer constant")
+   : N_("use of C++23 ptrdiff_t integer constant");

And here too (perhaps %::type%> )?
And maybe % too.

Agree.



--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -500,6 +500,9 @@ struct cpp_options
/* Nonzero means tokenize C++20 module directives.  */
unsigned char module_directives;
  
+  /* Nonzero for C++23 ptrdiff_t and size_t literals.  */

And drop "ptrdiff_t and " here?


+#define CPP_N_SIZE_T   0x200 /* C++23 size_t or ptrdiff_t literal  */

And " or ptrdiff_t" here?

While ptrdiff_t will usually be the same type, seems there is e.g.:
config/darwin.h:#define SIZE_TYPE "long unsigned int"
config/darwin.h:#define PTRDIFF_TYPE "int"
config/i386/djgpp.h:#define SIZE_TYPE "long unsigned int"
config/i386/djgpp.h:#define PTRDIFF_TYPE "int"
config/m32c/m32c.h:#define PTRDIFF_TYPE (TARGET_A16 ? "int" : "long int")
config/m32c/m32c.h:#define SIZE_TYPE "unsigned int"
config/rs6000/rs6000.h:#define PTRDIFF_TYPE "int"
config/rs6000/rs6000.h:#define SIZE_TYPE "long unsigned int"
config/s390/linux.h:#define SIZE_TYPE "long unsigned int"
config/s390/linux.h:#define PTRDIFF_TYPE (TARGET_64BIT ? "long int" : "int")
config/visium/visium.h:#define SIZE_TYPE "unsigned int"
config/visium/visium.h:#define PTRDIFF_TYPE "long int"
config/vms/vms.h:#define SIZE_TYPE  "unsigned int"
config/vms/vms.h:#define PTRDIFF_TYPE (flag_vms_pointer_size == 
VMS_POINTER_SIZE_NONE ? \
config/vms/vms.h-  "int" : "long long int")
so quite a few differences.

Jakub


Here is my last patch with all the concerns addressed.

I am not smart enough to get the dg-warning regex in Wsize_t-literals.C 
to work. If someone could carry this over the finish line that would be 
great. Or give me pointers. I can't any more.


Ed


diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index dca6815a876..48dec21d4b4 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1025,6 +1025,11 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_aggregate_paren_init=201902L");
  cpp_define (pfile, "__cpp_using_enum=201907L");
}
+  if (cxx_dialect > cxx20)
+   {
+ /* Set feature test macros for C++23.  */
+ cpp_define (pfile, "__cpp_size_t_suffix=202006L");
+   }
   if (flag_concepts)
 {
  if (cxx_dialect >= cxx20)
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index fe40a0f728b..6374b72ed2d 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -834,6 +834,14 @@ interpret_integer (const cpp_token *token, unsigned int flags,
 type = ((flags & CPP_N_UNSIGNED)
? widest_unsigned_literal_type_node
: widest_integer_literal_type_node);
+  else if (flags & CPP_N_SIZE_T)
+{
+  /* itk refers to fundamental types not aliased size types.  */
+  if (flags & CPP_N_UNSIGNED)
+   type = size_type_node;
+  else
+   type = signed_size_type_node;
+}
   else
 {
   type = integer_types[itk];
diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C b/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C
index f8d84ed..a30ec0f4f7e 100644
--- a/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C
@@ -17,6 +17,30 @@ unsigned long long int
 operator"" ull(unsigned long long int k)  // { dg-warning "integer suffix|shadowed by implementation" }
 { return k; }
 
+unsigned long long int
+operator"" z(unsigned long long int k)  // { dg-warning "integer suffix|shadowed by implementation" }
+{ return k; }
+
+unsigned long long int
+operator"" uz(unsigned long long int k)  // { dg-warning "integer suffix|shadowed by implementation" }
+{ return k; }
+
+unsigned long long int
+operator"" zu(unsigned long long int k)  // { dg-warning "integer suffix|shadowed by implementation" }
+{ return k; }
+
+unsigned long long int
+operator"" Z(unsigned long long int k)  // { dg-warning "integer suffix|shadowed by implementation" }
+{ return k; }
+
+unsigned long long int
+operator"" UZ(unsigned long long int k)  // { dg-warning "integer 

[committed] d: Fix junk in generated symbol on powerpc64-*-* (PR98921)

2021-02-01 Thread Iain Buclaw via Gcc-patches
Hi,

This patch merges the D front-end implementation with upstream dmd
5e2a81d9c, fixing PR98921.  This adds a special formatter to OutBuffer
to handle formatted printing of integers, a common case.  The
replacement is faster and safer.
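
The idea behind the new integer formatter can be sketched in plain C: 
fill a fixed-size buffer from the end with decimal digits, then copy out
only the part that was used.  This is an illustrative sketch, not the
upstream dmd code; the function name format_u64_decimal and its
NUL-terminating interface are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write VALUE in decimal to OUT and return the
   number of digits.  OUT must have room for 21 bytes: at most 20 digits
   for a 64-bit unsigned long long, plus the terminating NUL.  */
size_t format_u64_decimal(unsigned long long value, char *out)
{
    char digits[20];
    size_t i = sizeof digits;
    size_t len;

    assert(out != NULL);

    /* Produce digits from least to most significant, filling the
       buffer backwards.  The do/while ensures that 0 prints as "0".  */
    do {
        digits[--i] = (char)('0' + (int)(value % 10));
        value /= 10;
    } while (value != 0);

    len = sizeof digits - i;
    memcpy(out, digits + i, len);
    out[len] = '\0';
    return len;
}
```

Because the argument has a single fixed type, there is no format string
whose conversion specifier can disagree with the width of the value
being passed, which is the class of problem the varargs printf calls
were exposed to.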

In dmangle.c, it also gets rid of a number of problematic casts, as seen
on powerpc64 targets.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
on powerpc64le-linux-gnu.  Committed to mainline, and backported to the
releases/gcc-10 and gcc-9 branches.

Regards,
Iain.

---
gcc/d/ChangeLog:

PR d/98921
* dmd/MERGE: Merge upstream dmd 5e2a81d9c.
---
 gcc/d/dmd/MERGE|  2 +-
 gcc/d/dmd/dmangle.c| 29 -
 gcc/d/dmd/root/outbuffer.c | 31 +++
 gcc/d/dmd/root/outbuffer.h |  1 +
 4 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index 228eed838b2..342871f9a1a 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-609c3ce2d5d5d8a3dc4ba12c5e6e1100873f9ed1
+5e2a81d9cbcd653d9eed52344d664e72ba1355bc
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/dmangle.c b/gcc/d/dmd/dmangle.c
index f6eee52afbf..4a9a118ebba 100644
--- a/gcc/d/dmd/dmangle.c
+++ b/gcc/d/dmd/dmangle.c
@@ -279,7 +279,7 @@ public:
 {
 visit((Type *)t);
 if (t->dim)
-buf->printf("%llu", t->dim->toInteger());
+buf->print(t->dim->toInteger());
 if (t->next)
 visitWithMask(t->next, t->mod);
 }
@@ -377,7 +377,8 @@ public:
 visit((Type *)t);
 const char *name = t->ident->toChars();
 size_t len = strlen(name);
-buf->printf("%u%s", (unsigned)len, name);
+buf->print(len);
+buf->writestring(name);
 }
 
 void visit(TypeEnum *t)
@@ -493,7 +494,7 @@ public:
 s->error("excessive length %llu for symbol, possible recursive expansion?", buf->length() + len);
 else
 {
-buf->printf("%llu", (ulonglong)len);
+buf->print(len);
 buf->write(id, len);
 }
 }
@@ -822,9 +823,15 @@ public:
 void visit(IntegerExp *e)
 {
 if ((sinteger_t)e->value < 0)
-buf->printf("N%lld", -e->value);
+{
+buf->writeByte('N');
+buf->print(-e->value);
+}
 else
-buf->printf("i%lld",  e->value);
+{
+buf->writeByte('i');
+buf->print(e->value);
+}
 }
 
 void visit(RealExp *e)
@@ -946,7 +953,8 @@ public:
 }
 buf->reserve(1 + 11 + 2 * qlen);
 buf->writeByte(m);
-buf->printf("%d_", (int)qlen); // nbytes <= 11
+buf->print(qlen);
+buf->writeByte('_');// nbytes <= 11
 
 for (utf8_t *p = (utf8_t *)buf->slice().ptr + buf->length(), *pend = p + 2 * qlen;
  p < pend; p += 2, ++q)
@@ -962,7 +970,8 @@ public:
 void visit(ArrayLiteralExp *e)
 {
 size_t dim = e->elements ? e->elements->length : 0;
-buf->printf("A%u", dim);
+buf->writeByte('A');
+buf->print(dim);
 for (size_t i = 0; i < dim; i++)
 {
 e->getElement(i)->accept(this);
@@ -972,7 +981,8 @@ public:
 void visit(AssocArrayLiteralExp *e)
 {
 size_t dim = e->keys->length;
-buf->printf("A%u", dim);
+buf->writeByte('A');
+buf->print(dim);
 for (size_t i = 0; i < dim; i++)
 {
 (*e->keys)[i]->accept(this);
@@ -983,7 +993,8 @@ public:
 void visit(StructLiteralExp *e)
 {
 size_t dim = e->elements ? e->elements->length : 0;
-buf->printf("S%u", dim);
+buf->writeByte('S');
+buf->print(dim);
 for (size_t i = 0; i < dim; i++)
 {
 Expression *ex = (*e->elements)[i];
diff --git a/gcc/d/dmd/root/outbuffer.c b/gcc/d/dmd/root/outbuffer.c
index 8544697a3d5..81c2e901805 100644
--- a/gcc/d/dmd/root/outbuffer.c
+++ b/gcc/d/dmd/root/outbuffer.c
@@ -319,6 +319,37 @@ void OutBuffer::printf(const char *format, ...)
 va_end(ap);
 }
 
+/**
+ * Convert `u` to a string and append it to the buffer.
+ * Params:
+ *  u = integral value to append
+ */
+void OutBuffer::print(unsigned long long u)
+{
+unsigned long long value = u;
+char buf[20];
+const unsigned radix = 10;
+
+size_t i = sizeof(buf);
+do
+{
+if (value < radix)
+{
+unsigned char x = (unsigned char)value;
+buf[--i] = (char)(x + '0');
+break;
+}
+else
+{
+unsigned char x = (unsigned char)(value % radix);
+value = value / radix;
+buf[--i] = (char)(x + '0');
+}
+} while (value);
+
+write(buf + i, sizeof(buf) - i);
+}
+
 void OutBuffer::bracket(char left, 

Re: [PATCH v6] Practical improvement to libgcc complex divide

2021-02-01 Thread Joseph Myers
On Mon, 1 Feb 2021, Patrick McGehearty via Gcc-patches wrote:

> The message which contains the attached gzipped tarball of the
> development test programs is:
> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg254210.html
> I'll include that link in the new patch as well.

I think it's best to include the URL in the archives on gcc.gnu.org rather 
than on a third-party site.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] [PR91400] build only one __cpu_model variable

2021-02-01 Thread Ivan Sorokin via Gcc-patches
On 02.02.2021 02:00, Ivan Sorokin wrote:
> [PATCH] [PR91400] build only one __cpu_model variable

This is my first patch to GCC. So I might have done something totally
stupid or totally wrong. Caution is required for reviewing. :-)

> gcc/ChangeLog:
> 
>   PR target/91400
>   * config/i386/i386-builtins.c (fold_builtin_cpu): Extract
>   building of __cpu_model and __processor_model into new
>   function.
>   * config/i386/i386-builtins.c (init_cpu_model_var): New.
>   Cache creation of __cpu_model and __processor_model.
> 
> gcc/testsuite/Changelog:
> 
>   PR target/91400
>   * gcc.target/i386/pr91400.c: New.

I wrote the change log text manually. I hope I didn't mess up the
formatting.

> +static GTY(()) tree __cpu_model_var;
> +static GTY(()) tree __processor_model_type;

I felt a bit uneasy writing global variables, but this file contains
other global variables already and they are used for a similar purpose.

> +static void
> +init_cpu_model_var()
> +{
> +  if (__cpu_model_var != NULL_TREE)
> +{
> +  gcc_assert(__processor_model_type != NULL_TREE);
> +  return;
> +}
> +
> +  __processor_model_type = build_processor_model_struct ();
> +  __cpu_model_var = make_var_decl (__processor_model_type,
> +"__cpu_model");
> +
> +  varpool_node::add (__cpu_model_var);

I have no idea what this line does. But I decided that perhaps we want
to do it only once instead of once for each usage of __builtin_cpu_supports.

> diff --git a/gcc/testsuite/gcc.target/i386/pr91400.c 
> b/gcc/testsuite/gcc.target/i386/pr91400.c
> new file mode 100644
> index 000..e8b7d9285f9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr91400.c
> @@ -0,0 +1,11 @@
> +/* PR target/91400 */
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-times "andl" 1 } } */
> +/* { dg-final { scan-assembler-times "68" 2 } } */
> +/* { dg-final { scan-assembler-not "je" } } */
> +
> +_Bool f()
> +{
> +return __builtin_cpu_supports("popcnt") && 
> __builtin_cpu_supports("ssse3");
> +}

The test was the most complicated thing for me. Previous versions of GCC
emitted two andl instructions, so I wrote a check for them. Current master
emits a conditional jump, so I added a test for that too. Any suggestions
for better checks are welcome.


[PATCH] [PR91400] build only one __cpu_model variable

2021-02-01 Thread Ivan Sorokin via Gcc-patches
Prior to this commit GCC -O2 generated quite bad code for this
function:

bool f()
{
return __builtin_cpu_supports("popcnt")
&& __builtin_cpu_supports("ssse3");
}

f:
movl__cpu_model+12(%rip), %eax
xorl%r8d, %r8d
testb   $4, %al
je  .L1
shrl$6, %eax
movl%eax, %r8d
andl$1, %r8d
.L1:
movl%r8d, %eax
ret

The problem was caused by the fact that internally every invocation
of __builtin_cpu_supports built a new variable __cpu_model and a new
type __processor_model. Because of this GIMPLE level optimizers
weren't able to CSE the loads of __cpu_model and optimize
bit-operations properly.

This commit fixes the problem by caching the created __cpu_model
variable and __processor_model type. GCC -O2 now generates:

f:
movl__cpu_model+12(%rip), %eax
andl$68, %eax
cmpl$68, %eax
sete%al
ret
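
The two listings above compute the same predicate.  As a minimal sketch
in C, assuming only the bit positions that can be read off the assembly
(mask 4 for the first test, bit 6 for the second, 68 = 4 + 64 for the
combined mask), the equivalence can be checked directly.  The function
names here are hypothetical and do not exist in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the first listing: test bit 2, branch, then extract bit 6.  */
bool both_features_branchy(uint32_t features)
{
    if ((features & 4u) == 0)
        return false;
    return ((features >> 6) & 1u) != 0;
}

/* Mirrors the second listing: one mask and one compare.  */
bool both_features_masked(uint32_t features)
{
    return (features & 68u) == 68u;
}

/* Exhaustively compare the two forms over the low eight bits, which
   cover every combination of the two bits involved.  */
void check_equivalence(void)
{
    for (uint32_t f = 0; f < 256; ++f)
        assert(both_features_branchy(f) == both_features_masked(f));
}
```

The single-mask form is only available to the optimizer once both tests
are known to read the same variable, which is what caching the
declaration provides.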

gcc/ChangeLog:

PR target/91400
* config/i386/i386-builtins.c (fold_builtin_cpu): Extract
building of __cpu_model and __processor_model into new
function.
* config/i386/i386-builtins.c (init_cpu_model_var): New.
Cache creation of __cpu_model and __processor_model.

gcc/testsuite/ChangeLog:

PR target/91400
* gcc.target/i386/pr91400.c: New.
---
 gcc/config/i386/i386-builtins.c | 27 ++---
 gcc/testsuite/gcc.target/i386/pr91400.c | 11 ++
 2 files changed, 31 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91400.c

diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c
index 4fcdf4b89ee..96534318756 100644
--- a/gcc/config/i386/i386-builtins.c
+++ b/gcc/config/i386/i386-builtins.c
@@ -2085,6 +2085,25 @@ make_var_decl (tree type, const char *name)
   return new_decl;
 }
 
+static GTY(()) tree __cpu_model_var;
+static GTY(()) tree __processor_model_type;
+
+static void
+init_cpu_model_var()
+{
+  if (__cpu_model_var != NULL_TREE)
+{
+  gcc_assert(__processor_model_type != NULL_TREE);
+  return;
+}
+
+  __processor_model_type = build_processor_model_struct ();
+  __cpu_model_var = make_var_decl (__processor_model_type,
+  "__cpu_model");
+
+  varpool_node::add (__cpu_model_var);
+}
+
 /* FNDECL is a __builtin_cpu_is or a __builtin_cpu_supports call that is folded
into an integer defined in libgcc/config/i386/cpuinfo.c */
 
@@ -2096,13 +2115,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
 = (enum ix86_builtins) DECL_MD_FUNCTION_CODE (fndecl);
   tree param_string_cst = NULL;
 
-  tree __processor_model_type = build_processor_model_struct ();
-  tree __cpu_model_var = make_var_decl (__processor_model_type,
-   "__cpu_model");
-
-
-  varpool_node::add (__cpu_model_var);
-
+  init_cpu_model_var ();
   gcc_assert ((args != NULL) && (*args != NULL));
 
   param_string_cst = *args;
diff --git a/gcc/testsuite/gcc.target/i386/pr91400.c b/gcc/testsuite/gcc.target/i386/pr91400.c
new file mode 100644
index 000..e8b7d9285f9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr91400.c
@@ -0,0 +1,11 @@
+/* PR target/91400 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "andl" 1 } } */
+/* { dg-final { scan-assembler-times "68" 2 } } */
+/* { dg-final { scan-assembler-not "je" } } */
+
+_Bool f()
+{
+return __builtin_cpu_supports("popcnt") && __builtin_cpu_supports("ssse3");
+}
-- 
2.25.1



Re: [PATCH v6] Practical improvement to libgcc complex divide

2021-02-01 Thread Patrick McGehearty via Gcc-patches

The message which contains the attached gzipped tarball of the
development test programs is:
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg254210.html
I'll include that link in the new patch as well.

I can't recall why I did not use _LIBGCC_XF_EPSILON__ or
_LIBGCC_TF_EPSILON__ before. Perhaps a copy/paste oversight?
Fixed now.

On the issue of handling more float_modes than the well known FLT,
DBL, and LDBL, it took me some time to understand how and where the
additional modes were defined. For example, it is hard to find FLT64
when you grep for FLOAT64. Once I found gcc/ginclude/float.h, and
some other files, everything fell into place. Actually making the code
changes as suggested to support all float modes was easy.

The new patch [v7] will be forwarded as soon as testing is complete,
likely later today.

- patrick


On 1/21/2021 5:04 PM, Joseph Myers wrote:

On Fri, 18 Dec 2020, Patrick McGehearty via Gcc-patches wrote:


TEST Data

I'd still like to see the test data / code used to produce the accuracy
and performance results made available somewhere (presumably with a link
then being provided in the commit message).


+ if ((mode == TYPE_MODE (float_type_node))
+ || (mode == TYPE_MODE (double_type_node))
+ || (mode == TYPE_MODE (long_double_type_node)))
+   {
+ char val_name[64];
+ char fname[8] = "";
+ if (mode == TYPE_MODE (float_type_node))
+   memcpy (fname, "FLT", 4);
+ else if (mode == TYPE_MODE (double_type_node))
+   memcpy (fname, "DBL", 4);
+ else if (mode == TYPE_MODE (long_double_type_node))
+   memcpy (fname, "LDBL", 5);

This logic being used to generate EPSILON, MAX and MIN macros only handles
modes that match float, double or long double (so won't define the macros
for a mode that only matches another type such as _Float128, for example).

Earlier in the same function, there is existing code to define
__LIBGCC_%s_FUNC_EXT__.  That code has to do something similar, to
determine the matching type for a mode - but it also handles _FloatN /
_FloatNx types, and has an assertion at the end that some matching type
was found.

Rather than having this code which handles a more limited set of types, I
think the __LIBGCC_%s_FUNC_EXT__ code should be extended, so that as well
as computing a function suffix it also computes a prefix such as FLT, DBL,
FLT128 or FLT64X.  Then all supported floating-point modes can get these
three LIBGCC macros defined, rather than just those matching float, double
or long double.


  #elif defined(L_mulxc3) || defined(L_divxc3)
  # define MTYPE XFtype
  # define CTYPE XCtype
  # define MODE xc
  # define CEXT __LIBGCC_XF_FUNC_EXT__
  # define NOTRUNC (!__LIBGCC_XF_EXCESS_PRECISION__)
+# define RBIG  ((__LIBGCC_XF_MAX__)/2)
+# define RMIN  (__LIBGCC_XF_MIN__)
+# define RMIN2 (__LIBGCC_DF_EPSILON__)
+# define RMINSCAL (1/__LIBGCC_DF_EPSILON__)
+# define RMAX2 ((RBIG)*(RMIN2))

I'd then expect __LIBGCC_XF_EPSILON__ to be used for XFmode in place of
__LIBGCC_DF_EPSILON__ unless there is some good reason to use the latter
(which would need a comment to explain it if so).


  #elif defined(L_multc3) || defined(L_divtc3)
  # define MTYPE TFtype
  # define CTYPE TCtype
  # define MODE tc
  # define CEXT __LIBGCC_TF_FUNC_EXT__
  # define NOTRUNC (!__LIBGCC_TF_EXCESS_PRECISION__)
+#if defined(__LIBGCC_TF_MIN__)
+# define RBIG  ((__LIBGCC_TF_MAX__)/2)
+# define RMIN  (__LIBGCC_TF_MIN__)
+#else
+# define RBIG  ((__LIBGCC_XF_MAX__)/2)
+# define RMIN  (__LIBGCC_XF_MIN__)
+#endif
+# define RMIN2 (__LIBGCC_DF_EPSILON__)
+# define RMINSCAL (1/__LIBGCC_DF_EPSILON__)

And, likewise, with the suggested changes to c-cppbuiltin.c this code can
use __LIBGCC_TF_MAX__, __LIBGCC_TF_MIN__ and __LIBGCC_TF_EPSILON__
unconditionally, without ever needing to use XF or DF macros.  (If you
want to use a different EPSILON value in the case where TFmode is IBM long
double because of LDBL_EPSILON being too small in that case, condition
that on __LIBGCC_TF_MANT_DIG__ == 106, and use ((TFtype) 0x1p-106) in
place of __LIBGCC_TF_EPSILON__ in that case.)
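
To make the relationship between these derived constants concrete, here
is a small sketch for a single mode.  It is an assumption-laden
illustration: the __LIBGCC_*__ macros are only defined while building
libgcc, so the sketch substitutes the <float.h> DBL_* macros for the
DFmode values, and scale_roundtrip is a hypothetical helper that is not
part of the patch.

```c
#include <assert.h>
#include <float.h>

/* Stand-ins for the DFmode definitions, using <float.h> values.  */
#define RBIG     (DBL_MAX / 2)
#define RMIN     (DBL_MIN)
#define RMIN2    (DBL_EPSILON)
#define RMINSCAL (1 / DBL_EPSILON)
#define RMAX2    (RBIG * RMIN2)

/* Hypothetical helper: scale X up by RMINSCAL and back down by RMIN2.
   Both factors are powers of two for binary floating point, so the
   round trip is exact as long as the scaled value does not overflow,
   which the assertion guarantees.  */
double scale_roundtrip(double x)
{
    assert(x <= RMAX2 && x >= -RMAX2);
    return (x * RMINSCAL) * RMIN2;
}
```

Using the EPSILON of the mode being divided keeps RMIN2 and RMINSCAL
exact reciprocals in that mode's own precision, which is one reason to
prefer the per-mode macro over the DFmode one.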





[PATCH] aarch64: Reimplement vrshrn* intrinsics using builtins

2021-02-01 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This patch moves the vrshrn* intrinsics to builtins away from inline asm.
It's a bit of code, but it's very similar to the recent vshrn* reimplementation
except that we use an unspec rather than standard RTL codes for the
functionality.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.

Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def (rshrn, rshrn2): Define
builtins.
* config/aarch64/aarch64-simd.md (aarch64_rshrn_insn_le): Define.
(aarch64_rshrn_insn_be): Likewise.
(aarch64_rshrn): Likewise.
(aarch64_rshrn2_insn_le): Likewise.
(aarch64_rshrn2_insn_be): Likewise.
(aarch64_rshrn2): Likewise.
* config/aarch64/aarch64.md (unspec): Add UNSPEC_RSHRN.
* config/aarch64/arm_neon.h (vrshrn_high_n_s16): Reimplement using builtin.
(vrshrn_high_n_s32): Likewise.
(vrshrn_high_n_s64): Likewise.
(vrshrn_high_n_u16): Likewise.
(vrshrn_high_n_u32): Likewise.
(vrshrn_high_n_u64): Likewise.
(vrshrn_n_s16): Likewise.
(vrshrn_n_s32): Likewise.
(vrshrn_n_s64): Likewise.
(vrshrn_n_u16): Likewise.
(vrshrn_n_u32): Likewise.
(vrshrn_n_u64): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/narrow_high-intrinsics.c: Adjust rshrn2 assembly
scan.


vrshrn.patch
Description: vrshrn.patch


[r11-7011 Regression] FAIL: g++.dg/cpp0x/alias-decl-dr1558.C -std=c++17 (test for excess errors) on Linux/x86_64

2021-02-01 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

6e0a231a4aa2407bb7167daf98a37795a67364d8 is the first bad commit
commit 6e0a231a4aa2407bb7167daf98a37795a67364d8
Author: Jason Merrill 
Date:   Wed Jan 27 17:15:39 2021 -0500

c++: alias in qualified-id in template arg [PR98570]

caused

FAIL: g++.dg/cpp0x/alias-decl-52.C  -std=c++17 (internal compiler error)
FAIL: g++.dg/cpp0x/alias-decl-52.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp0x/alias-decl-dr1558.C  -std=c++17 (internal compiler error)
FAIL: g++.dg/cpp0x/alias-decl-dr1558.C  -std=c++17  (test for errors, line 13)
FAIL: g++.dg/cpp0x/alias-decl-dr1558.C  -std=c++17  (test for errors, line 7)
FAIL: g++.dg/cpp0x/alias-decl-dr1558.C  -std=c++17 (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-7011/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/alias-decl-52.C --target_board='unix{-m64\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/alias-decl-dr1558.C 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email.  For questions about this report, contact
me at skpgkp2 at gmail dot com)


[committed] analyzer: fix false positives with *UNKNOWN_PTR [PR98918]

2021-02-01 Thread David Malcolm via Gcc-patches
PR analyzer/98918 reports various false positives and state explosions
on correct code that frees nodes and other pointers in a singly-linked
list.

The issue is that state-merger in the loop leads to UNKNOWN_VALUEs,
and these are then erroneously used to form compound symbolic values
and regions, such as:
  INIT_VAL((*UNKNOWN(struct marker *)).ref)
and:
  (*INIT_VAL((*UNKNOWN(struct marker * *
The malloc state machine then treats these symbolic values as
identifying specific pointers, and thus e.g. erroneously reports a
double-free when
  INIT_VAL((*UNKNOWN(struct marker *)).ref)
is freed twice (on subsequent iterations of the loop).

Similarly, the increasingly complex compound symbolic values have
sm-state which prevents state merging, and eventually lead to the
analysis hitting safety limits and stopping.

This patch makes various compound values involving UNKNOWN be
themselves UNKNOWN, resolving both the false positives and the state
explosions.
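
As a rough illustration of why the folding helps, here is a toy model of the
idea.  Everything in it (Sym, make_unknown, initial_field_value,
looks_like_double_free) is a hypothetical stand-in, far simpler than the
analyzer's real svalue and region classes; it only shows that a compound
value built on UNKNOWN otherwise acquires a stable identity that a state
machine can track.

```cpp
#include <cassert>
#include <string>

// Toy symbolic value; the names here are hypothetical and much simpler
// than the analyzer's svalue/region classes.
struct Sym
{
  bool is_unknown;
  std::string text;
};

inline Sym make_unknown() { return Sym{true, "UNKNOWN"}; }

// Model of INIT_VAL((*ptr).field).  With fold_unknown set, a compound
// value built on top of UNKNOWN collapses to UNKNOWN, as the patch does.
inline Sym initial_field_value(const Sym &ptr, const std::string &field,
                               bool fold_unknown)
{
  assert(!field.empty());
  if (fold_unknown && ptr.is_unknown)
    return make_unknown();
  return Sym{false, "INIT_VAL((*" + ptr.text + ")." + field + ")"};
}

// A state machine that tracks pointers by identity sees two frees of the
// same non-UNKNOWN value as a double-free.
inline bool looks_like_double_free(const Sym &a, const Sym &b)
{
  return !a.is_unknown && !b.is_unknown && a.text == b.text;
}
```

Without folding, two loop iterations that each free the "ref" field of an
UNKNOWN pointer produce textually identical values and trip the check; with
folding both are plain UNKNOWN and nothing is reported.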

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7024-g11d4ec5d45c02a19b8ff9d7f26800637ad563e05.

gcc/analyzer/ChangeLog:
PR analyzer/98918
* region-model-manager.cc
(region_model_manager::get_or_create_initial_value):
Fold the initial value of *UNKNOWN_PTR to an UNKNOWN value.
(region_model_manager::get_field_region): Fold the value
of UNKNOWN_PTR->FIELD to *UNKNOWN_PTR_OF__TYPE.

gcc/testsuite/ChangeLog:
PR analyzer/98918
* gcc.dg/analyzer/pr98918.c: New test.
---
 gcc/analyzer/region-model-manager.cc| 13 +
 gcc/testsuite/gcc.dg/analyzer/pr98918.c | 22 ++
 2 files changed, 35 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr98918.c

diff --git a/gcc/analyzer/region-model-manager.cc 
b/gcc/analyzer/region-model-manager.cc
index c864a8fbc21..dfd2413e914 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -249,6 +249,10 @@ region_model_manager::get_or_create_initial_value (const 
region *reg)
 get_or_create_initial_value (original_reg));
 }
 
+  /* INIT_VAL (*UNKNOWN_PTR) -> UNKNOWN_VAL.  */
+  if (reg->symbolic_for_unknown_ptr_p ())
+return get_or_create_unknown_svalue (reg->get_type ());
+
   if (initial_svalue **slot = m_initial_values_map.get (reg))
 return *slot;
   initial_svalue *initial_sval = new initial_svalue (reg->get_type (), reg);
@@ -815,6 +819,15 @@ region_model_manager::get_field_region (const region 
*parent, tree field)
 {
   gcc_assert (TREE_CODE (field) == FIELD_DECL);
 
+  /* (*UNKNOWN_PTR).field is (*UNKNOWN_PTR_OF__TYPE).  */
+  if (parent->symbolic_for_unknown_ptr_p ())
+{
+  tree ptr_to_field_type = build_pointer_type (TREE_TYPE (field));
+  const svalue *unknown_ptr_to_field
+   = get_or_create_unknown_svalue (ptr_to_field_type);
+  return get_symbolic_region (unknown_ptr_to_field);
+}
+
   field_region::key_t key (parent, field);
   if (field_region *reg = m_field_regions.get (key))
 return reg;
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr98918.c 
b/gcc/testsuite/gcc.dg/analyzer/pr98918.c
new file mode 100644
index 000..ac626ba1f30
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr98918.c
@@ -0,0 +1,22 @@
+#include <stdlib.h>
+
+struct marker {
+  struct marker *next;
+  void *ref;
+};
+struct data {
+  struct marker *marker;
+};
+
+void data_free(struct data d)
+{
+  struct marker *nm, *m;
+
+  m = d.marker;
+  while (m) {
+nm = m->next;
+free(m->ref);
+free(m);
+m = nm;
+  }
+}
-- 
2.26.2



Re: [PATCH] rtl-optimization/80960 - avoid creating garbage RTL in DSE

2021-02-01 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 01, 2021 at 12:54:50PM -0700, Jeff Law wrote:
> >>> So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the
> >>> patch but counts are _extremely_ small.  Statistics:
> >>>
> >>>   70148 dse: local deletions = 0, global deletions = 0
> >>>  32 dse: local deletions = 0, global deletions = 1
> >>>   9 dse: local deletions = 0, global deletions = 2
> >>>   7 dse: local deletions = 0, global deletions = 3
> >>>   2 dse: local deletions = 0, global deletions = 4
> >>>   2 dse: local deletions = 0, global deletions = 5
> >>>   3 dse: local deletions = 0, global deletions = 7
> >>>  67 dse: local deletions = 1, global deletions = 0
> >>>   1 dse: local deletions = 1, global deletions = 2
> >>>  12 dse: local deletions = 2, global deletions = 0
> >>>   1 dse: local deletions = 24, global deletions = 1
> >>>   2 dse: local deletions = 3, global deletions = 0
> >>>   4 dse: local deletions = 4, global deletions = 0
> >>>   4 dse: local deletions = 6, global deletions = 0
> >>>   1 dse: local deletions = 7, global deletions = 0
> >>>   1 dse: local deletions = 8, global deletions = 0
> >>>
> >>> so not sure how much confidence this brings over the analytical
> >>> reasoning that it shouldn't make a difference ...
> >>>
> >>> stats on just dse2 are even more depressing (given it's cost)
> >>>
> >>>   35123 dse: local deletions = 0, global deletions = 0
> >>>   2 dse: local deletions = 0, global deletions = 1
> >>>  20 dse: local deletions = 1, global deletions = 0
> >>>   1 dse: local deletions = 2, global deletions = 0
> >>>   1 dse: local deletions = 3, global deletions = 0
> >>>   1 dse: local deletions = 4, global deletions = 0
> >> Based on that, I'd argue that DSE2 should go away and DSE1 should be
> >> evaluated for the chopping block.  While RTL DSE was marginally
> >> important in 1999 when it was first submitted, the tree-ssa pipeline as
> >> a whole has probably made RTL DSE largely pointless.
> > True. Though I'd argue that DSE2 might be the conceptually more useful pass 
> > since it sees spill slots. 
> True in concept, but I bet that the SSA pipeline has made this much less
> common in RTL DSE than it was 20+ years ago.  Our allocator and reloader
> are much improved as well which would further decrease the number of
> opportunities.
> 
> I'd hazard a guess that what's left are locals that need to be
> addressable and some optimization in the RTL pipeline exposed a dead
> store that wasn't otherwise visible in the SSA pipeline.  But the only
> way to be sure would be to dig into them.

Shouldn't we gather statistics from larger codebase first and perhaps
compare against tree-ssa-dse statistics?  I mean, in many functions there
are no DSE opportunities at all.

Jakub



Re: [PATCH] rtl-optimization/80960 - avoid creating garbage RTL in DSE

2021-02-01 Thread Jeff Law via Gcc-patches



On 2/1/21 12:47 PM, Richard Biener wrote:
> On February 1, 2021 8:34:35 PM GMT+01:00, Jeff Law  wrote:
>>
>> On 1/28/21 1:09 AM, Richard Biener wrote:
>>> On Wed, 27 Jan 2021, Jakub Jelinek wrote:
>>>
 On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote:
> The following avoids repeatedly turning VALUE RTXen into
> sth useful and re-applying a constant offset through get_addr
> via DSE check_mem_read_rtx.  Instead perform this once for
> all stores to be visited in check_mem_read_rtx.  This avoids
> allocating 1.6GB of garbage PLUS RTXen on the PR80960
> testcase, fixing the memory usage regression from old GCC.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
> 2021-01-27  Richard Biener  
>
>   PR rtl-optimization/80960
>   * dse.c (check_mem_read_rtx): Call get_addr on the
>   offsetted address.
> ---
>  gcc/dse.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/gcc/dse.c b/gcc/dse.c
> index c88587e7d94..da0df54a2dd 100644
> --- a/gcc/dse.c
> +++ b/gcc/dse.c
> @@ -2219,6 +2219,11 @@ check_mem_read_rtx (rtx *loc, bb_info_t
>> bb_info)
>  }
>if (maybe_ne (offset, 0))
>  mem_addr = plus_constant (get_address_mode (mem), mem_addr,
>> offset);
> +  /* Avoid passing VALUE RTXen as mem_addr to
>> canon_true_dependence
> + which will over and over re-create proper RTL and re-apply
>> the
> + offset above.  See PR80960 where we almost allocate 1.6GB of
>> PLUS
> + RTXen that way.  */
> +  mem_addr = get_addr (mem_addr);
>  
>if (group_id >= 0)
>  {
 Does that result in any changes on how much does DSE optimize?
 I mean, if you do 2 bootstraps/regtests, one with this patch and
>> another one
 without it, and at the end of rest_of_handle_dse dump
 locally_deleted, globally_deleted
 for each CU/function, do you get the same counts except perhaps for
>> dse.c?
>>> So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the
>>> patch but counts are _extremely_ small.  Statistics:
>>>
>>>   70148 dse: local deletions = 0, global deletions = 0
>>>  32 dse: local deletions = 0, global deletions = 1
>>>   9 dse: local deletions = 0, global deletions = 2
>>>   7 dse: local deletions = 0, global deletions = 3
>>>   2 dse: local deletions = 0, global deletions = 4
>>>   2 dse: local deletions = 0, global deletions = 5
>>>   3 dse: local deletions = 0, global deletions = 7
>>>  67 dse: local deletions = 1, global deletions = 0
>>>   1 dse: local deletions = 1, global deletions = 2
>>>  12 dse: local deletions = 2, global deletions = 0
>>>   1 dse: local deletions = 24, global deletions = 1
>>>   2 dse: local deletions = 3, global deletions = 0
>>>   4 dse: local deletions = 4, global deletions = 0
>>>   4 dse: local deletions = 6, global deletions = 0
>>>   1 dse: local deletions = 7, global deletions = 0
>>>   1 dse: local deletions = 8, global deletions = 0
>>>
>>> so not sure how much confidence this brings over the analytical
>>> reasoning that it shouldn't make a difference ...
>>>
>>> stats on just dse2 are even more depressing (given it's cost)
>>>
>>>   35123 dse: local deletions = 0, global deletions = 0
>>>   2 dse: local deletions = 0, global deletions = 1
>>>  20 dse: local deletions = 1, global deletions = 0
>>>   1 dse: local deletions = 2, global deletions = 0
>>>   1 dse: local deletions = 3, global deletions = 0
>>>   1 dse: local deletions = 4, global deletions = 0
>> Based on that, I'd argue that DSE2 should go away and DSE1 should be
>> evaluated for the chopping block.  While RTL DSE was marginally
>> important in 1999 when it was first submitted, the tree-ssa pipeline as
>> a whole has probably made RTL DSE largely pointless.
> True. Though I'd argue that DSE2 might be the conceptually more useful pass 
> since it sees spill slots. 
True in concept, but I bet that the SSA pipeline has made this much less
common in RTL DSE than it was 20+ years ago.  Our allocator and reloader
are much improved as well which would further decrease the number of
opportunities.

I'd hazard a guess that what's left are locals that need to be
addressable and some optimization in the RTL pipeline exposed a dead
store that wasn't otherwise visible in the SSA pipeline.  But the only
way to be sure would be to dig into them.

jeff



Re: [PATCH] rtl-optimization/80960 - avoid creating garbage RTL in DSE

2021-02-01 Thread Richard Biener
On February 1, 2021 8:34:35 PM GMT+01:00, Jeff Law  wrote:
>
>
>On 1/28/21 1:09 AM, Richard Biener wrote:
>> On Wed, 27 Jan 2021, Jakub Jelinek wrote:
>>
>>> On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote:
 The following avoids repeatedly turning VALUE RTXen into
 sth useful and re-applying a constant offset through get_addr
 via DSE check_mem_read_rtx.  Instead perform this once for
 all stores to be visited in check_mem_read_rtx.  This avoids
 allocating 1.6GB of garbage PLUS RTXen on the PR80960
 testcase, fixing the memory usage regression from old GCC.

 Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?

 Thanks,
 Richard.

 2021-01-27  Richard Biener  

PR rtl-optimization/80960
* dse.c (check_mem_read_rtx): Call get_addr on the
offsetted address.
 ---
  gcc/dse.c | 5 +
  1 file changed, 5 insertions(+)

 diff --git a/gcc/dse.c b/gcc/dse.c
 index c88587e7d94..da0df54a2dd 100644
 --- a/gcc/dse.c
 +++ b/gcc/dse.c
 @@ -2219,6 +2219,11 @@ check_mem_read_rtx (rtx *loc, bb_info_t
>bb_info)
  }
if (maybe_ne (offset, 0))
  mem_addr = plus_constant (get_address_mode (mem), mem_addr,
>offset);
 +  /* Avoid passing VALUE RTXen as mem_addr to
>canon_true_dependence
 + which will over and over re-create proper RTL and re-apply
>the
 + offset above.  See PR80960 where we almost allocate 1.6GB of
>PLUS
 + RTXen that way.  */
 +  mem_addr = get_addr (mem_addr);
  
if (group_id >= 0)
  {
>>> Does that result in any changes on how much does DSE optimize?
>>> I mean, if you do 2 bootstraps/regtests, one with this patch and
>another one
>>> without it, and at the end of rest_of_handle_dse dump
>>> locally_deleted, globally_deleted
>>> for each CU/function, do you get the same counts except perhaps for
>dse.c?
>> So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the
>> patch but counts are _extremely_ small.  Statistics:
>>
>>   70148 dse: local deletions = 0, global deletions = 0
>>  32 dse: local deletions = 0, global deletions = 1
>>   9 dse: local deletions = 0, global deletions = 2
>>   7 dse: local deletions = 0, global deletions = 3
>>   2 dse: local deletions = 0, global deletions = 4
>>   2 dse: local deletions = 0, global deletions = 5
>>   3 dse: local deletions = 0, global deletions = 7
>>  67 dse: local deletions = 1, global deletions = 0
>>   1 dse: local deletions = 1, global deletions = 2
>>  12 dse: local deletions = 2, global deletions = 0
>>   1 dse: local deletions = 24, global deletions = 1
>>   2 dse: local deletions = 3, global deletions = 0
>>   4 dse: local deletions = 4, global deletions = 0
>>   4 dse: local deletions = 6, global deletions = 0
>>   1 dse: local deletions = 7, global deletions = 0
>>   1 dse: local deletions = 8, global deletions = 0
>>
>> so not sure how much confidence this brings over the analytical
>> reasoning that it shouldn't make a difference ...
>>
>> stats on just dse2 are even more depressing (given it's cost)
>>
>>   35123 dse: local deletions = 0, global deletions = 0
>>   2 dse: local deletions = 0, global deletions = 1
>>  20 dse: local deletions = 1, global deletions = 0
>>   1 dse: local deletions = 2, global deletions = 0
>>   1 dse: local deletions = 3, global deletions = 0
>>   1 dse: local deletions = 4, global deletions = 0
>Based on that, I'd argue that DSE2 should go away and DSE1 should be
>evaluated for the chopping block.  While RTL DSE was marginally
>important in 1999 when it was first submitted, the tree-ssa pipeline as
>a whole has probably made RTL DSE largely pointless.

True. Though I'd argue that DSE2 might be the conceptually more useful pass 
since it sees spill slots. 

Richard. 

>Jeff



Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-02-01 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 01, 2021 at 08:23:18PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Mon, Feb 01, 2021 at 01:46:13PM -0500, Ed Smith-Rowland wrote:
> > @@ -0,0 +1,8 @@
> > +// { dg-do compile { target c++23 } }
> > +
> > +#include 
> > +#include 
> > +
> > +static_assert(std::is_same_v);
> > +static_assert(std::is_same_v);
> 
> Shouldn't this be std::make_signed<std::size_t>::type instead of
> std::ptrdiff_t

Or better std::make_signed_t<std::size_t> etc.

Jakub



Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-02-01 Thread Jason Merrill via Gcc-patches

On 2/1/21 1:46 PM, Ed Smith-Rowland wrote:

On 2/1/21 10:33 AM, Jason Merrill wrote:

On 1/30/21 6:22 PM, Ed Smith-Rowland wrote:

On 1/27/21 3:32 PM, Jakub Jelinek wrote:

On Sun, Oct 21, 2018 at 04:39:30PM -0400, Ed Smith-Rowland wrote:
This patch implements C++2a proposal P0330R2 Literal Suffixes for ptrdiff_t
and size_t*.  It's not official yet but looks very likely to pass.  It is
incomplete because I'm looking for some opinions.  (We also might wait 'till
it actually passes).

This paper takes the direction of a language change rather than a library
change through C++11 literal operators.  This was after feedback on that
paper after a few iterations.

As coded in this patch, integer suffixes involving 'z' are errors in C and
warnings for C++ <= 17 (in addition to the usual warning about
implementation suffixes shadowing user-defined ones).

OTOH, the 'z' suffix is not currently legal - it can't break
currently-correct code in any C/C++ dialect.  Furthermore, I suspect the
language direction was chosen to accommodate a similar addition to C20.

I'm thinking of making this feature available as an extension to all of
C/C++ perhaps with appropriate pedwarn.

GCC now supports -std=c++2b and -std=gnu++2b, are you going to update your
patch against it (and change for z/Z standing for ssize_t rather than
ptrdiff_t), plus incorporate the feedback from Joseph and Jason?

Jakub


Here is a rebased patch that is a bit leaner than the original.

Since I chose to be conservative in applying this just to C++23 I'm
not adding this to C or to earlier versions of C++ as extensions. We
can add that if people really want, maybe in stage 1.

The compat warning for C++ < 23 is not optional. Since the suffixes
are not preceded by '-' I don't have much sympathy if people tried to
make a literal 'z' operator. Which is the only reason I can see for a
warning suppression.



+  /* itk refers to fundamental types not aliased size types.  */
+  if (flags & CPP_N_UNSIGNED)
+    type = size_type_node;
+  else
+    type = ptrdiff_type_node;


I thought size_type_node and ptrdiff_type_node were sort of fundamental
and the others derived:


Yes, but the final proposal specifies a suffix for signed size_t, not 
ptrdiff_t.



ssize_t:
   signed_size_type_node = c_common_signed_type (size_type_node);


Ah, I wasn't aware of signed_size_type_node, that's better than my 
suggestion of calling c_common_signed_type again.


Jason



Re: [PATCH] rtl-optimization/80960 - avoid creating garbage RTL in DSE

2021-02-01 Thread Jeff Law via Gcc-patches



On 1/28/21 1:09 AM, Richard Biener wrote:
> On Wed, 27 Jan 2021, Jakub Jelinek wrote:
>
>> On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote:
>>> The following avoids repeatedly turning VALUE RTXen into
>>> sth useful and re-applying a constant offset through get_addr
>>> via DSE check_mem_read_rtx.  Instead perform this once for
>>> all stores to be visited in check_mem_read_rtx.  This avoids
>>> allocating 1.6GB of garbage PLUS RTXen on the PR80960
>>> testcase, fixing the memory usage regression from old GCC.
>>>
>>> Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
>>>
>>> Thanks,
>>> Richard.
>>>
>>> 2021-01-27  Richard Biener  
>>>
>>> PR rtl-optimization/80960
>>> * dse.c (check_mem_read_rtx): Call get_addr on the
>>> offsetted address.
>>> ---
>>>  gcc/dse.c | 5 +
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/gcc/dse.c b/gcc/dse.c
>>> index c88587e7d94..da0df54a2dd 100644
>>> --- a/gcc/dse.c
>>> +++ b/gcc/dse.c
>>> @@ -2219,6 +2219,11 @@ check_mem_read_rtx (rtx *loc, bb_info_t bb_info)
>>>  }
>>>if (maybe_ne (offset, 0))
>>>  mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
>>> +  /* Avoid passing VALUE RTXen as mem_addr to canon_true_dependence
>>> + which will over and over re-create proper RTL and re-apply the
>>> + offset above.  See PR80960 where we almost allocate 1.6GB of PLUS
>>> + RTXen that way.  */
>>> +  mem_addr = get_addr (mem_addr);
>>>  
>>>if (group_id >= 0)
>>>  {
>> Does that result in any changes on how much does DSE optimize?
>> I mean, if you do 2 bootstraps/regtests, one with this patch and another one
>> without it, and at the end of rest_of_handle_dse dump
>> locally_deleted, globally_deleted
>> for each CU/function, do you get the same counts except perhaps for dse.c?
> So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the
> patch but counts are _extremely_ small.  Statistics:
>
>   70148 dse: local deletions = 0, global deletions = 0
>  32 dse: local deletions = 0, global deletions = 1
>   9 dse: local deletions = 0, global deletions = 2
>   7 dse: local deletions = 0, global deletions = 3
>   2 dse: local deletions = 0, global deletions = 4
>   2 dse: local deletions = 0, global deletions = 5
>   3 dse: local deletions = 0, global deletions = 7
>  67 dse: local deletions = 1, global deletions = 0
>   1 dse: local deletions = 1, global deletions = 2
>  12 dse: local deletions = 2, global deletions = 0
>   1 dse: local deletions = 24, global deletions = 1
>   2 dse: local deletions = 3, global deletions = 0
>   4 dse: local deletions = 4, global deletions = 0
>   4 dse: local deletions = 6, global deletions = 0
>   1 dse: local deletions = 7, global deletions = 0
>   1 dse: local deletions = 8, global deletions = 0
>
> so not sure how much confidence this brings over the analytical
> reasoning that it shouldn't make a difference ...
>
> stats on just dse2 are even more depressing (given it's cost)
>
>   35123 dse: local deletions = 0, global deletions = 0
>   2 dse: local deletions = 0, global deletions = 1
>  20 dse: local deletions = 1, global deletions = 0
>   1 dse: local deletions = 2, global deletions = 0
>   1 dse: local deletions = 3, global deletions = 0
>   1 dse: local deletions = 4, global deletions = 0
Based on that, I'd argue that DSE2 should go away and DSE1 should be
evaluated for the chopping block.  While RTL DSE was marginally
important in 1999 when it was first submitted, the tree-ssa pipeline as
a whole has probably made RTL DSE largely pointless.

Jeff
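
For readers following the quoted patch, its core idea (resolve the read
address once, before visiting the stores, instead of once per store) can be
sketched with a toy model.  The names canonicalize, conflicts_per_store and
conflicts_hoisted are hypothetical and stand in for get_addr and the
dependence checks only loosely; this is not the code in dse.c.

```cpp
#include <cassert>
#include <vector>

// Stand-in for an expensive address canonicalization that allocates on
// every call; the counter records how often it runs.
inline int canonicalize(int addr, int &calls)
{
  ++calls;
  return addr;
}

// Before: the read address is re-canonicalized for every store visited.
inline int conflicts_per_store(const std::vector<int> &stores, int read_addr,
                               int &calls)
{
  int n = 0;
  for (int s : stores)
    if (s == canonicalize(read_addr, calls))
      ++n;
  return n;
}

// After: canonicalize once, then reuse the result for all stores.
inline int conflicts_hoisted(const std::vector<int> &stores, int read_addr,
                             int &calls)
{
  const int canon = canonicalize(read_addr, calls);
  int n = 0;
  for (int s : stores)
    if (s == canon)
      ++n;
  assert(n <= static_cast<int>(stores.size()));
  return n;
}
```

Both variants report the same conflicts; only the number of
canonicalizations differs, which is why the deletion counts discussed above
are expected to be unchanged.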



Re: [PATCH, rs6000] do not generate fusion.md, update contrib/gcc_update

2021-02-01 Thread David Edelsohn via Gcc-patches
Okay.

Thanks, David

On Mon, Feb 1, 2021 at 2:17 PM  wrote:
>
> From: Aaron Sawdey 
>
> In a previous fusion-combine patch for rs6000, Segher had asked me to
> comment out the automatic regeneration of fusion.md. And more recently
> Edelsohn pointed out that gcc_update needed to fix the timestamp of
> fusion.md so it didn't get unnecessarily regenerated.
>
> OK for trunk if bootstrap/regtest passes?
>
> Thanks,
>Aaron
>
> contrib/ChangeLog:
>
> * gcc_update (files_and_dependencies): Add dependency for
> gcc/config/rs6000/fusion.md on gcc/config/rs6000/genfusion.pl.
>
> gcc/ChangeLog:
>
> * config/rs6000/t-rs6000: Comment out auto generation of
> fusion.md for now.
> ---
>  contrib/gcc_update | 1 +
>  gcc/config/rs6000/t-rs6000 | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/contrib/gcc_update b/contrib/gcc_update
> index 43d284d8125..45a27b76cc3 100755
> --- a/contrib/gcc_update
> +++ b/contrib/gcc_update
> @@ -89,6 +89,7 @@ gcc/config/c6x/c6x-mult.md: gcc/config/c6x/c6x-mult.md.in 
> gcc/config/c6x/genmult
>  gcc/config/m68k/m68k-tables.opt: gcc/config/m68k/m68k-devices.def 
> gcc/config/m68k/m68k-isas.def gcc/config/m68k/m68k-microarchs.def 
> gcc/config/m68k/genopt.sh
>  gcc/config/mips/mips-tables.opt: gcc/config/mips/mips-cpus.def 
> gcc/config/mips/genopt.sh
>  gcc/config/rs6000/rs6000-tables.opt: gcc/config/rs6000/rs6000-cpus.def 
> gcc/config/rs6000/genopt.sh
> +gcc/config/rs6000/fusion.md: gcc/config/rs6000/genfusion.pl
>  gcc/config/tilegx/mul-tables.c: gcc/config/tilepro/gen-mul-tables.cc
>  gcc/config/tilepro/mul-tables.c: gcc/config/tilepro/gen-mul-tables.cc
>  # And then, language-specific files
> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
> index e3a58bf31bf..1541a653738 100644
> --- a/gcc/config/rs6000/t-rs6000
> +++ b/gcc/config/rs6000/t-rs6000
> @@ -47,8 +47,8 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
> $(COMPILE) $<
> $(POSTCOMPILE)
>
> -$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
> -   $(srcdir)/config/rs6000/genfusion.pl > 
> $(srcdir)/config/rs6000/fusion.md
> +#$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
> +#  $(srcdir)/config/rs6000/genfusion.pl > 
> $(srcdir)/config/rs6000/fusion.md
>
>  $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh 
> \
>$(srcdir)/config/rs6000/rs6000-cpus.def
> --
> 2.27.0
>


Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-02-01 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 01, 2021 at 01:46:13PM -0500, Ed Smith-Rowland wrote:
> @@ -0,0 +1,8 @@
> +// { dg-do compile { target c++23 } }
> +
> +#include 
> +#include 
> +
> +static_assert(std::is_same_v);
> +static_assert(std::is_same_v);

Shouldn't this be std::make_signed<std::size_t>::type instead of std::ptrdiff_t?

> +std::ptrdiff_t pd1 = 1234z; // { dg-warning "use of C\+\+23 ptrdiff_t 
> integer constant" "" { target c++20_down } }
> +std::ptrdiff_t PD1 = 5678Z; // { dg-warning "use of C\+\+23 ptrdiff_t 
> integer constant" "" { target c++20_down } }

Ditto here.

> +   const char *message = (result & CPP_N_UNSIGNED) == CPP_N_UNSIGNED
> + ? N_("use of C++23 size_t integer constant")
> + : N_("use of C++23 ptrdiff_t integer constant");

And here too (perhaps %::type%> )?
And maybe % too.

> --- a/libcpp/include/cpplib.h
> +++ b/libcpp/include/cpplib.h
> @@ -500,6 +500,9 @@ struct cpp_options
>/* Nonzero means tokenize C++20 module directives.  */
>unsigned char module_directives;
>  
> +  /* Nonzero for C++23 ptrdiff_t and size_t literals.  */

And drop "ptrdiff_t and " here?

> +#define CPP_N_SIZE_T 0x200 /* C++23 size_t or ptrdiff_t literal  */

And " or ptrdiff_t" here?

While ptrdiff_t will usually be the same type, seems there is e.g.:
config/darwin.h:#define SIZE_TYPE "long unsigned int"
config/darwin.h:#define PTRDIFF_TYPE "int"
config/i386/djgpp.h:#define SIZE_TYPE "long unsigned int"
config/i386/djgpp.h:#define PTRDIFF_TYPE "int"
config/m32c/m32c.h:#define PTRDIFF_TYPE (TARGET_A16 ? "int" : "long int")
config/m32c/m32c.h:#define SIZE_TYPE "unsigned int"
config/rs6000/rs6000.h:#define PTRDIFF_TYPE "int"
config/rs6000/rs6000.h:#define SIZE_TYPE "long unsigned int"
config/s390/linux.h:#define SIZE_TYPE "long unsigned int"
config/s390/linux.h:#define PTRDIFF_TYPE (TARGET_64BIT ? "long int" : "int")
config/visium/visium.h:#define SIZE_TYPE "unsigned int"
config/visium/visium.h:#define PTRDIFF_TYPE "long int"
config/vms/vms.h:#define SIZE_TYPE  "unsigned int"
config/vms/vms.h:#define PTRDIFF_TYPE (flag_vms_pointer_size == 
VMS_POINTER_SIZE_NONE ? \
config/vms/vms.h-  "int" : "long long int")
so quite a few differences.
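
A small sketch of how a test could avoid depending on that, assuming only
standard C++17 library facilities.  The alias signed_size_t and the names
ptrdiff_is_signed_size and check_signed_size_properties are hypothetical
and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// The signed counterpart of size_t, which need not be ptrdiff_t.
using signed_size_t = std::make_signed_t<std::size_t>;

// True on most targets, false on those listed above where PTRDIFF_TYPE
// and SIZE_TYPE are based on different integer types.
constexpr bool ptrdiff_is_signed_size
  = std::is_same_v<std::ptrdiff_t, signed_size_t>;

// These properties hold on every target, unlike the comparison above.
inline void check_signed_size_properties()
{
  assert(std::is_signed_v<signed_size_t>);
  assert(sizeof(signed_size_t) == sizeof(std::size_t));
}
```

A testcase written against signed_size_t rather than std::ptrdiff_t would
then behave the same on all of these configurations.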

Jakub



[PATCH,rs6000] do not generate fusion.md, update contrib/gcc_update

2021-02-01 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

In a previous fusion-combine patch for rs6000, Segher had asked me to
comment out the automatic regeneration of fusion.md. And more recently
Edelsohn pointed out that gcc_update needed to fix the timestamp of
fusion.md so it didn't get unnecessarily regenerated.

OK for trunk if bootstrap/regtest passes?

Thanks,
   Aaron

contrib/ChangeLog:

* gcc_update (files_and_dependencies): Add dependency for
gcc/config/rs6000/fusion.md on gcc/config/rs6000/genfusion.pl.

gcc/ChangeLog:

* config/rs6000/t-rs6000: Comment out auto generation of
fusion.md for now.
---
 contrib/gcc_update | 1 +
 gcc/config/rs6000/t-rs6000 | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/contrib/gcc_update b/contrib/gcc_update
index 43d284d8125..45a27b76cc3 100755
--- a/contrib/gcc_update
+++ b/contrib/gcc_update
@@ -89,6 +89,7 @@ gcc/config/c6x/c6x-mult.md: gcc/config/c6x/c6x-mult.md.in 
gcc/config/c6x/genmult
 gcc/config/m68k/m68k-tables.opt: gcc/config/m68k/m68k-devices.def 
gcc/config/m68k/m68k-isas.def gcc/config/m68k/m68k-microarchs.def 
gcc/config/m68k/genopt.sh
 gcc/config/mips/mips-tables.opt: gcc/config/mips/mips-cpus.def 
gcc/config/mips/genopt.sh
 gcc/config/rs6000/rs6000-tables.opt: gcc/config/rs6000/rs6000-cpus.def 
gcc/config/rs6000/genopt.sh
+gcc/config/rs6000/fusion.md: gcc/config/rs6000/genfusion.pl
 gcc/config/tilegx/mul-tables.c: gcc/config/tilepro/gen-mul-tables.cc
 gcc/config/tilepro/mul-tables.c: gcc/config/tilepro/gen-mul-tables.cc
 # And then, language-specific files
diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index e3a58bf31bf..1541a653738 100644
--- a/gcc/config/rs6000/t-rs6000
+++ b/gcc/config/rs6000/t-rs6000
@@ -47,8 +47,8 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
$(COMPILE) $<
$(POSTCOMPILE)
 
-$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
-   $(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md
+#$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
+#  $(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md
 
 $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
   $(srcdir)/config/rs6000/rs6000-cpus.def
-- 
2.27.0



Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-02-01 Thread Qing Zhao via Gcc-patches
Hi, Richard,

I have adjusted the SRA phase to split calls to DEFERRED_INIT per your suggestion.

And now the routine “bump_map” in 511.povray looks like the following:
...

 # DEBUG BEGIN_STMT
  xcoor = 0.0;
  ycoor = 0.0;
  # DEBUG BEGIN_STMT
  index = .DEFERRED_INIT (index, 2);
  index2 = .DEFERRED_INIT (index2, 2);
  index3 = .DEFERRED_INIT (index3, 2);
  # DEBUG BEGIN_STMT
  colour1 = .DEFERRED_INIT (colour1, 2);
  colour2 = .DEFERRED_INIT (colour2, 2);
  colour3 = .DEFERRED_INIT (colour3, 2);
  # DEBUG BEGIN_STMT
  p1$0_181 = .DEFERRED_INIT (p1$0_195(D), 2);
  # DEBUG p1$0 => p1$0_181
  p1$1_184 = .DEFERRED_INIT (p1$1_182(D), 2);
  # DEBUG p1$1 => p1$1_184
  p1$2_172 = .DEFERRED_INIT (p1$2_185(D), 2);
  # DEBUG p1$2 => p1$2_172
  p2$0_177 = .DEFERRED_INIT (p2$0_173(D), 2);
  # DEBUG p2$0 => p2$0_177
  p2$1_135 = .DEFERRED_INIT (p2$1_178(D), 2);
  # DEBUG p2$1 => p2$1_135
  p2$2_137 = .DEFERRED_INIT (p2$2_136(D), 2);
  # DEBUG p2$2 => p2$2_137
  p3$0_377 = .DEFERRED_INIT (p3$0_376(D), 2);
  # DEBUG p3$0 => p3$0_377
  p3$1_379 = .DEFERRED_INIT (p3$1_378(D), 2);
  # DEBUG p3$1 => p3$1_379
  p3$2_381 = .DEFERRED_INIT (p3$2_380(D), 2);
  # DEBUG p3$2 => p3$2_381


In the above, p1, p2, and p3 are all split into calls to DEFERRED_INIT for the
components of p1, p2 and p3.

With this change, the stack usage numbers with -fstack-usage for approach A, 
old approach D and new D with the splitting in SRA are:

  Approach A    Approach D-old    Approach D-new

  272           624               368

From the above, we can see that splitting the call to DEFERRED_INIT in SRA can 
reduce the stack usage increase dramatically. 

However, it looks like the stack size for D is still bigger than for A.

I checked the IR again, and found that the alias analysis might be responsible
for this (by comparing the image.cpp.026t.ealias dumps for both A and D):

(Due to the call to:

  colour1 = .DEFERRED_INIT (colour1, 2);
)

**Approach A:

Points_to analysis:

Constraints:
…
colour1 = 
…
colour1 = 
colour1 = 
colour1 = 
colour1 = 
colour1 = 
...
callarg(53) = 
...
_53 = colour1

Points_to sets:
…
colour1 = { NULL ESCAPED NONLOCAL } same as _53
...
CALLUSED(48) = { NULL ESCAPED NONLOCAL index colour1 }
CALLCLOBBERED(49) = { NULL ESCAPED NONLOCAL index colour1 } same as CALLUSED(48)
...
callarg(53) = { NULL ESCAPED NONLOCAL colour1 }

**Approach D:

Points_to analysis:

Constraints:
…
callarg(19) = colour1
callarg(19) = 
colour1 = callarg(19) + UNKNOWN
colour1 = 
…
colour1 = 
colour1 = 
colour1 = 
colour1 = 
colour1 = 
…
callarg(74) = 
callarg(74) = callarg(74) + UNKNOWN
callarg(74) = *callarg(74) + UNKNOWN
…
_53 = colour1
_54 = _53
_55 = _54 + UNKNOWN
_55 = 
_56 = colour1
_57 = _56
_58 = _57 + UNKNOWN
_58 = 
_59 = _55 + UNKNOWN
_59 = _58 + UNKNOWN
_60 = colour1
_61 = _60
_62 = _61 + UNKNOWN
_62 = 
_63 = _59 + UNKNOWN
_63 = _62 + UNKNOWN
_64 = _63 + UNKNOWN
..
Points_to set:
…
colour1 = { ESCAPED NONLOCAL } same as callarg(19)
…
CALLUSED(69) = { ESCAPED NONLOCAL index colour1 }
CALLCLOBBERED(70) = { ESCAPED NONLOCAL index colour1 } same as CALLUSED(69)
callarg(71) = { ESCAPED NONLOCAL }
callarg(72) = { ESCAPED NONLOCAL }
callarg(73) = { ESCAPED NONLOCAL }
callarg(74) = { ESCAPED NONLOCAL colour1 }

My question:

Is it possible to adjust alias analysis to resolve this issue?

thanks.

Qing

> On Jan 18, 2021, at 10:12 AM, Qing Zhao via Gcc-patches 
>  wrote:
> 
> I checked the routine “poverties::bump_map” in 511.povray_r since it
> has a lot stack increase 
> due to implementation D, by examine the IR immediate before RTL
> expansion phase.  
> (image.cpp.244t.optimized), I found that we have the following
> additional statements for the array elements:
> 
> void  pov::bump_map (double * EPoint, struct TNORMAL * Tnormal, double
> * normal)
> {
> …
> double p3[3];
> double p2[3];
> double p1[3];
> float colour3[5];
> float colour2[5];
> float colour1[5];
> …
> # DEBUG BEGIN_STMT
> colour1 = .DEFERRED_INIT (colour1, 2);
> colour2 = .DEFERRED_INIT (colour2, 2);
> colour3 = .DEFERRED_INIT (colour3, 2);
> # DEBUG BEGIN_STMT
> MEM  [(double[3] *)] = p1$0_144(D);
> MEM  [(double[3] *) + 8B] = p1$1_135(D);
> MEM  [(double[3] *) + 16B] = p1$2_138(D);
> p1 = .DEFERRED_INIT (p1, 2);
> # DEBUG D#12 => MEM  [(double[3] *)]
> # DEBUG p1$0 => D#12
> # DEBUG D#11 => MEM  [(double[3] *) + 8B]
> # DEBUG p1$1 => D#11
> # DEBUG D#10 => MEM  [(double[3] *) + 16B]
> # DEBUG p1$2 => D#10
> MEM  [(double[3] *)] = p2$0_109(D);
> MEM  [(double[3] *) + 8B] = p2$1_111(D);
> MEM  [(double[3] *) + 16B] = p2$2_254(D);
> p2 = .DEFERRED_INIT (p2, 2);
> # DEBUG D#9 => MEM  [(double[3] *)]
> # DEBUG p2$0 => D#9
> # DEBUG D#8 => MEM  [(double[3] *) + 8B]
> # DEBUG p2$1 => D#8
> # DEBUG D#7 => MEM  [(double[3] *) + 16B]
> # DEBUG p2$2 => D#7
> MEM  [(double[3] *)] = p3$0_256(D);

Re: [PATCH][Bug libstdc++/70303] Value-initialized debug iterators

2021-02-01 Thread Jonathan Wakely via Gcc-patches

On 01/02/21 19:30 +0100, François Dumont via Libstdc++ wrote:

On 01/02/21 6:43 pm, Jonathan Wakely wrote:

On 31/01/21 16:59 +0100, François Dumont via Libstdc++ wrote:
After the debug issue has been fixed in PR 98466 the problem was 
not in the debug iterator implementation itself but in the deque 
iterator operator- implementation.


    libstdc++: Make deque iterator operator- usable with value-init iterators

    N3644 implies that operator- can be used on value-init iterators. We now
    return 0 if both iterators are value initialized. If only one is value
    initialized we keep the UB by returning the result of a normal computation
    which is an unexpected value.

    libstdc++/ChangeLog:

    PR libstdc++/70303
    * include/bits/stl_deque.h
    (std::deque<>::operator-(iterator, iterator)):
    Return 0 if both iterators are value-initialized.
    * testsuite/23_containers/deque/70303.cc: New test.
    * testsuite/23_containers/vector/70303.cc: New test.

Tested under Linux x86_64, ok to commit ?


OK.

I don't like adding the branch there though. Even with the
__builtin_expect it causes worse code to be generated than the
original.

This would be branchless, but a bit harder to understand:

    return difference_type(__x._S_buffer_size())
  * (__x._M_node - __y._M_node - int(__x._M_node == __y._M_node))
  + (__x._M_cur - __x._M_first) + (__y._M_last - __y._M_cur);

Please commit the fix and we can think about it later.


I do not think this works because for value-init iterators we have 
__x._M_node == __y._M_node == nullptr so the diff would give 
-_S_buffer_size().


But I got the idea, I'll prepare the patch.


Yeah, I just came back to the computer to say that it's wrong. But
maybe this:

    return difference_type(_S_buffer_size())
  * (__x._M_node - __y._M_node - int(__x._M_node && __y._M_node))
  + (__x._M_cur - __x._M_first) + (__y._M_last - __y._M_cur);

For value-init'd iterators we'd get _S_buffer_size() * 0 + 0 - 0

We could even just use int(__x._M_node != 0) because if one is null
and the other isn't, it's UB anyway.
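
To make the arithmetic of the branchless variant concrete, here is a minimal sketch on a toy iterator rather than the real _Deque_iterator. ToyIter, kBufSize, make_iter and toy_diff are names made up for this illustration, and the buffer size of 4 is an assumption; only the shape of the expression follows the formula above, using the "!= 0" form.

```cpp
#include <cstddef>

// Hypothetical stand-in for a deque iterator: a position inside one
// fixed-size buffer plus a pointer into the map of buffers.
struct ToyIter {
  int*  cur   = nullptr;  // current element
  int*  first = nullptr;  // start of the current buffer
  int*  last  = nullptr;  // one past the end of the current buffer
  int** node  = nullptr;  // slot in the map of buffers
};

constexpr std::ptrdiff_t kBufSize = 4;  // assumed buffer size, illustration only

// Build an iterator pointing at element `offset` of the buffer in `*node`.
ToyIter make_iter(int** node, std::ptrdiff_t offset) {
  ToyIter it;
  it.node  = node;
  it.first = *node;
  it.last  = *node + kBufSize;
  it.cur   = *node + offset;
  return it;
}

// Branchless difference: the usual "- 1" is only subtracted when the
// node pointer is non-null, so two value-initialized iterators give
// kBufSize * (0 - 0) + 0 + 0 == 0.
std::ptrdiff_t toy_diff(const ToyIter& x, const ToyIter& y) {
  return kBufSize * (x.node - y.node - std::ptrdiff_t(x.node != nullptr))
       + (x.cur - x.first) + (y.last - y.cur);
}
```

With two adjacent buffers, an iterator at offset 2 of the second buffer minus an iterator at offset 1 of the first gives 4 * (1 - 1) + 2 + 3 = 5, and two default-constructed ToyIter objects give 0.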





Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-02-01 Thread Ed Smith-Rowland via Gcc-patches

On 2/1/21 10:33 AM, Jason Merrill wrote:

On 1/30/21 6:22 PM, Ed Smith-Rowland wrote:

On 1/27/21 3:32 PM, Jakub Jelinek wrote:

On Sun, Oct 21, 2018 at 04:39:30PM -0400, Ed Smith-Rowland wrote:
This patch implements C++2a proposal P0330R2 Literal Suffixes for ptrdiff_t
and size_t*.  It's not official yet but looks very likely to pass.  It is
incomplete because I'm looking for some opinions. (We also might wait 'till
it actually passes).

This paper takes the direction of a language change rather than a library
change through C++11 literal operators.  This was after feedback on that
paper after a few iterations.

As coded in this patch, integer suffixes involving 'z' are errors in C and
warnings for C++ <= 17 (in addition to the usual warning about
implementation suffixes shadowing user-defined ones).

OTOH, the 'z' suffix is not currently legal - it can't break
currently-correct code in any C/C++ dialect.  Furthermore, I suspect the
language direction was chosen to accommodate a similar addition to C20.

I'm thinking of making this feature available as an extension to all of
C/C++ perhaps with appropriate pedwarn.

GCC now supports -std=c++2b and -std=gnu++2b, are you going to update your
patch against it (and change for z/Z standing for ssize_t rather than
ptrdiff_t), plus incorporate the feedback from Joseph and Jason?

Jakub


Here is a rebased patch that is a bit leaner than the original.

Since I chose to be conservative in applying this just to C++23 I'm 
not adding this to C or to earlier versions of C++ as extensions. We 
can add that if people really want, maybe in stage 1.


The compat warning for C++ < 23 is not optional. Since the suffixes 
are not preceded by '-' I don't have much sympathy if people tried to 
make a literal 'z' operator, which is the only reason I can see for a 
warning suppression.



+  /* itk refers to fundamental types not aliased size types.  */
+  if (flags & CPP_N_UNSIGNED)
+    type = size_type_node;
+  else
+    type = ptrdiff_type_node;


I thought size_type_node and ptrdiff_type_node were sort of fundamental 
and the others derived:


ssize_t:
  signed_size_type_node = c_common_signed_type (size_type_node);
???:
  unsigned_ptrdiff_type_node = c_common_unsigned_type (ptrdiff_type_node);

But I see in tree.c that things can be... interesting.

Fixed...

This is wrong if ptrdiff_t is a different size from size_t; it should 
be c_common_signed_type (size_type_node).
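
For readers following the type question, a small user-level sketch of what the suffixes are expected to yield: `uz` gives std::size_t and plain `z` gives the signed counterpart of std::size_t, not necessarily ptrdiff_t. The helper name size_t_suffix_ok is made up for this illustration, and the check is guarded by the feature-test macro so that it is simply skipped on compilers or dialects that do not advertise the feature.

```cpp
#include <cstddef>
#include <type_traits>

// Returns true when the suffix types are as expected, or when the
// compiler does not define __cpp_size_t_suffix and the check is skipped.
bool size_t_suffix_ok() {
#ifdef __cpp_size_t_suffix
  auto u = 42uz;  // unsigned form: expected to be std::size_t
  auto s = 42z;   // signed form: expected to be the signed version of size_t
  return std::is_same<decltype(u), std::size_t>::value
      && std::is_same<decltype(s),
                      std::make_signed<std::size_t>::type>::value;
#else
  return true;    // feature not advertised; nothing to check
#endif
}
```

On a target where ptrdiff_t and size_t differ in width, comparing against std::make_signed<std::size_t>::type rather than std::ptrdiff_t is what keeps the check meaningful.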



+  | (z ? (CPP_N_SIZE_T | CPP_N_LARGE) : 0));


Why CPP_N_LARGE here?  That would seem to suggest that size_t is 
always the same size as unsigned long long.



This doesn't need to be there. I don't use it anywhere.

Jason


Is this Ok if it passes testing on x86_64-linux?

Ed


diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index dca6815a876..48dec21d4b4 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1025,6 +1025,11 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_aggregate_paren_init=201902L");
  cpp_define (pfile, "__cpp_using_enum=201907L");
}
+  if (cxx_dialect > cxx20)
+   {
+ /* Set feature test macros for C++23.  */
+ cpp_define (pfile, "__cpp_size_t_suffix=202006L");
+   }
   if (flag_concepts)
 {
  if (cxx_dialect >= cxx20)
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index fe40a0f728b..bc4f6f9dfa7 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -834,6 +834,14 @@ interpret_integer (const cpp_token *token, unsigned int 
flags,
 type = ((flags & CPP_N_UNSIGNED)
? widest_unsigned_literal_type_node
: widest_integer_literal_type_node);
+  else if (flags & CPP_N_SIZE_T)
+{
+  /* itk refers to fundamental types not aliased size types.  */
+  if (flags & CPP_N_UNSIGNED)
+   type = size_type_node;
+  else
+   type = c_common_signed_type (size_type_node);
+}
   else
 {
   type = integer_types[itk];
diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C 
b/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C
index f8d84ed..a30ec0f4f7e 100644
--- a/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-shadow-neg.C
@@ -17,6 +17,30 @@ unsigned long long int
 operator"" ull(unsigned long long int k)  // { dg-warning "integer 
suffix|shadowed by implementation" }
 { return k; }
 
+unsigned long long int
+operator"" z(unsigned long long int k)  // { dg-warning "integer 
suffix|shadowed by implementation" }
+{ return k; }
+
+unsigned long long int
+operator"" uz(unsigned long long int k)  // { dg-warning "integer 
suffix|shadowed by implementation" }
+{ return k; }
+
+unsigned long long int
+operator"" zu(unsigned long long int k)  // { dg-warning "integer 
suffix|shadowed by implementation" }
+{ return k; }
+
+unsigned long long int
+operator"" Z(unsigned long long int k)  // { dg-warning "integer 

[PATCH][_GLIBCXX_DEBUG] Enhance detection of invalid iterators usage

2021-02-01 Thread François Dumont via Gcc-patches
At the moment some iterators like std::list<>::end() look like 
value-init iterators once detached.

I think using an iterator in such a state is wrong so here is a patch to 
detect this.

This patch also adds a new iterator state: singular (value-initialized)

Example of the output of the forward_list/iterator2_neg.cc test:

In function:
    bool __gnu_debug::operator!=(const _Self&, const _Self&)

Error: attempt to compare a singular iterator to a
singular (value-initialized) iterator.

Objects involved in the operation:
    iterator "__lhs" @ 0x0x7fffb777d4b0 {
  type = std::__cxx1998::_Fwd_list_iterator (mutable iterator);
  state = singular;
    }
    iterator "__rhs" @ 0x0x7fffb777d4e0 {
  type = std::__cxx1998::_Fwd_list_iterator (mutable iterator);
  state = singular (value-initialized);
    }

    libstdc++: [_GLIBCXX_DEBUG] Do not consider detached iterators as
    value-initialized

    An attached iterator has its _M_version set to something != 0. This value
    shall be preserved when detaching it so that the iterator does not look
    like a value-initialized one.
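
The idea can be sketched with a toy model, under the assumption that a container hands out non-zero version numbers and that only a value-initialized iterator ever carries version 0. ToySafeIter and its members are hypothetical names for this illustration and are not the libstdc++ debug-mode classes.

```cpp
// Hypothetical model of a debug iterator's bookkeeping.
struct ToySafeIter {
  const void* sequence = nullptr;  // owning container, null when detached
  unsigned    version  = 0;        // 0 only for value-initialized iterators

  // Attach to a container; seq_version is assumed to be non-zero.
  void attach(const void* seq, unsigned seq_version) {
    sequence = seq;
    version  = seq_version;
  }

  // Detach from the container but keep the version, so the iterator
  // stays distinguishable from a value-initialized one.
  void detach() { sequence = nullptr; }

  bool singular() const { return sequence == nullptr; }

  bool value_initialized() const {
    return sequence == nullptr && version == 0;
  }
};
```

A default-constructed ToySafeIter is both singular and value-initialized, while one that was attached and then detached is singular only, which is the distinction the new state is meant to report.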


    libstdc++-v3/ChangeLog:

    * include/debug/formatter.h (__singular_value_init): New
    _Iterator_state enum entry.
    (_Parameter<>(const _Safe_iterator<>&, const char*, _Is_iterator)):
    Check if iterator parameter is value-initialized.
    (_Parameter<>(const _Safe_local_iterator<>&, const char*,
    _Is_iterator)): Likewise.
    * include/debug/safe_iterator.h
    (_Safe_iterator<>::_M_value_initialized()): New. Adapt checks.
    * include/debug/safe_local_iterator.h
    (_Safe_local_iterator<>::_M_value_initialized()): New. Adapt checks.
    * src/c++11/debug.cc (_Safe_iterator_base::_M_reset): Do not reset
    _M_version.
    (print_field(PrintContext&, const _Parameter&, const char*)): Adapt
    state_names.
    * testsuite/23_containers/deque/debug/iterator1_neg.cc: New test.
    * testsuite/23_containers/deque/debug/iterator2_neg.cc: New test.
    * testsuite/23_containers/forward_list/debug/iterator1_neg.cc: New test.
    * testsuite/23_containers/forward_list/debug/iterator2_neg.cc: New test.


Tested under Linux x86_64.

Ok to commit ?

François

diff --git a/libstdc++-v3/include/debug/formatter.h b/libstdc++-v3/include/debug/formatter.h
index 7140fed5e83..1b96de07261 100644
--- a/libstdc++-v3/include/debug/formatter.h
+++ b/libstdc++-v3/include/debug/formatter.h
@@ -185,6 +185,7 @@ namespace __gnu_debug
   __rbegin,		// dereferenceable, and at the reverse-beginning
   __rmiddle,	// reverse-dereferenceable, not at the reverse-beginning
   __rend,		// reverse-past-the-end
+  __singular_value_init,	// singular, value initialized
   __last_state
 };
 
@@ -278,7 +279,12 @@ namespace __gnu_debug
 	  _M_variant._M_iterator._M_seq_type = _GLIBCXX_TYPEID(_Sequence);
 
 	  if (__it._M_singular())
-	_M_variant._M_iterator._M_state = __singular;
+	{
+	  if (__it._M_value_initialized())
+		_M_variant._M_iterator._M_state = __singular_value_init;
+	  else
+		_M_variant._M_iterator._M_state = __singular;
+	}
 	  else
 	{
 	  if (__it._M_is_before_begin())
@@ -306,7 +312,12 @@ namespace __gnu_debug
 	  _M_variant._M_iterator._M_seq_type = _GLIBCXX_TYPEID(_Sequence);
 
 	  if (__it._M_singular())
-	_M_variant._M_iterator._M_state = __singular;
+	{
+	  if (__it._M_value_initialized())
+		_M_variant._M_iterator._M_state = __singular_value_init;
+	  else
+		_M_variant._M_iterator._M_state = __singular;
+	}
 	  else
 	{
 	  if (__it._M_is_end())
diff --git a/libstdc++-v3/include/debug/safe_iterator.h b/libstdc++-v3/include/debug/safe_iterator.h
index a10df190969..99d8830e45e 100644
--- a/libstdc++-v3/include/debug/safe_iterator.h
+++ b/libstdc++-v3/include/debug/safe_iterator.h
@@ -41,8 +41,8 @@
 
 #define _GLIBCXX_DEBUG_VERIFY_OPERANDS(_Lhs, _Rhs, _BadMsgId, _DiffMsgId) \
   _GLIBCXX_DEBUG_VERIFY(!_Lhs._M_singular() && !_Rhs._M_singular()	\
-			|| (_Lhs.base() == _Iterator()			\
-			&& _Rhs.base() == _Iterator()),		\
+			|| (_Lhs._M_value_initialized()			\
+			&& _Rhs._M_value_initialized()),		\
 			_M_message(_BadMsgId)\
 			._M_iterator(_Lhs, #_Lhs)			\
 			._M_iterator(_Rhs, #_Rhs));			\
@@ -177,7 +177,7 @@ namespace __gnu_debug
 	// _GLIBCXX_RESOLVE_LIB_DEFECTS
 	// DR 408. Is vector > forbidden?
 	_GLIBCXX_DEBUG_VERIFY(!__x._M_singular()
-			  || __x.base() == _Iterator(),
+			  || __x._M_value_initialized(),
 			  _M_message(__msg_init_copy_singular)
 			  ._M_iterator(*this, "this")
 			  ._M_iterator(__x, "other"));
@@ -193,7 +193,7 @@ namespace __gnu_debug
   : _Iter_base()
   {
 	_GLIBCXX_DEBUG_VERIFY(!__x._M_singular()
-			  || __x.base() == _Iterator(),
+			  || __x._M_value_initialized(),
 			  _M_message(__msg_init_copy_singular)
 			  ._M_iterator(*this, "this")

Re: [Patch, fortran] PR98342 - Allocatable component in call to assumed-rank routine causes invalid pointer

2021-02-01 Thread Tobias Burnus

On 01.02.21 19:28, Paul Richard Thomas wrote:


I have attached a memory leak free version of the testcase. I have
asked for Thomas's help to use frontend-passes.c tools to do the same
for compound constructors with allocatable components. My attempts to
do the job in other ways have failed totally.


Well, if we can do it in the FE passes, why not.

I also do not mind if we have memory leaks in the testcase – if either
the standard permits it or if (as here) a PR exists, which tracks the
issue. (Especially as it is not a new issue.)

I am fine with either testcase – the non-memory-leaking one is nicer, if
we are reasonably sure it tests the right thing. (I think we are.) If
not, we should instead/additionally add the currently leaking variant.

Thanks for looking into this,

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: [PATCH][Bug libstdc++/70303] Value-initialized debug iterators

2021-02-01 Thread François Dumont via Gcc-patches

On 01/02/21 6:43 pm, Jonathan Wakely wrote:

On 31/01/21 16:59 +0100, François Dumont via Libstdc++ wrote:
After the debug issue has been fixed in PR 98466 the problem was not 
in the debug iterator implementation itself but in the deque iterator 
operator- implementation.


    libstdc++: Make deque iterator operator- usable with value-init iterators

    N3644 implies that operator- can be used on value-init iterators. We now
    return 0 if both iterators are value initialized. If only one is value
    initialized we keep the UB by returning the result of a normal computation
    which is an unexpected value.

    libstdc++/ChangeLog:

    PR libstdc++/70303
    * include/bits/stl_deque.h
    (std::deque<>::operator-(iterator, iterator)):
    Return 0 if both iterators are value-initialized.
    * testsuite/23_containers/deque/70303.cc: New test.
    * testsuite/23_containers/vector/70303.cc: New test.

Tested under Linux x86_64, ok to commit ?


OK.

I don't like adding the branch there though. Even with the
__builtin_expect it causes worse code to be generated than the
original.

This would be branchless, but a bit harder to understand:

    return difference_type(__x._S_buffer_size())
  * (__x._M_node - __y._M_node - int(__x._M_node == __y._M_node))
  + (__x._M_cur - __x._M_first) + (__y._M_last - __y._M_cur);

Please commit the fix and we can think about it later.


I do not think this works because for value-init iterators we have 
__x._M_node == __y._M_node == nullptr so the diff would give 
-_S_buffer_size().


But I got the idea, I'll prepare the patch.




François



diff --git a/libstdc++-v3/include/bits/stl_deque.h 
b/libstdc++-v3/include/bits/stl_deque.h

index d41c27717a3..04b70b77621 100644
--- a/libstdc++-v3/include/bits/stl_deque.h
+++ b/libstdc++-v3/include/bits/stl_deque.h
@@ -352,9 +352,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  friend difference_type
  operator-(const _Self& __x, const _Self& __y) _GLIBCXX_NOEXCEPT
  {
-    return difference_type(_S_buffer_size())
-  * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
-  + (__y._M_last - __y._M_cur);
+    if (__builtin_expect(__x._M_node || __y._M_node, true))
+  return difference_type(_S_buffer_size())
+    * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
+    + (__y._M_last - __y._M_cur);
+
+    return 0;
  }

  // _GLIBCXX_RESOLVE_LIB_DEFECTS
@@ -366,9 +369,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
operator-(const _Self& __x,
  const _Deque_iterator<_Tp, _RefR, _PtrR>& __y) 
_GLIBCXX_NOEXCEPT

{
-  return difference_type(_S_buffer_size())
-    * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
-    + (__y._M_last - __y._M_cur);
+  if (__builtin_expect(__x._M_node || __y._M_node, true))
+    return difference_type(_S_buffer_size())
+  * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
+  + (__y._M_last - __y._M_cur);
+
+  return 0;
}

  friend _Self
diff --git a/libstdc++-v3/testsuite/23_containers/deque/70303.cc 
b/libstdc++-v3/testsuite/23_containers/deque/70303.cc

new file mode 100644
index 000..e0e63694170
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/deque/70303.cc
@@ -0,0 +1,67 @@
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run }
+
+#include 
+#include 
+
+// PR libstdc++/70303
+
+void test01()
+{
+  typedef typename std::deque::iterator It;
+  It it = It();
+  VERIFY(it == it);
+  VERIFY(!(it != it));
+  VERIFY(it - it == 0);
+  VERIFY(!(it < it));
+  VERIFY(!(it > it));
+  VERIFY(it <= it);
+  VERIFY(it >= it);
+
+  typedef typename std::deque::const_iterator CIt;
+  CIt cit = CIt();
+  VERIFY(cit == cit);
+  VERIFY(!(cit != cit));
+  VERIFY(cit - cit == 0);
+  VERIFY(!(cit < cit));
+  VERIFY(!(cit > cit));
+  VERIFY(cit <= cit);
+  VERIFY(cit >= cit);
+
+  VERIFY(it == cit);
+  VERIFY(!(it != cit));
+  VERIFY(cit == it);
+  VERIFY(!(cit != it));
+  VERIFY(it - cit == 0);
+  VERIFY(cit - it == 0);
+  VERIFY(!(it < cit));
+  VERIFY(!(it > cit));
+  VERIFY(it <= cit);
+  VERIFY(it >= cit);
+  VERIFY(!(cit < it));
+  VERIFY(!(cit > it));
+  VERIFY(cit <= it);
+  

Re: [Patch, fortran] PR98342 - Allocatable component in call to assumed-rank routine causes invalid pointer

2021-02-01 Thread Paul Richard Thomas via Gcc-patches
Hi Tobias,

I have attached a memory leak free version of the testcase. I have asked
for Thomas's help to use frontend-passes.c tools to do the same for
compound constructors with allocatable components. My attempts to do the
job in other ways have failed totally.

Cheers

Paul


On Fri, 29 Jan 2021 at 14:56, Tobias Burnus  wrote:

> Hi Paul,
>
> On 29.01.21 15:20, Paul Richard Thomas via Fortran wrote:
> > Regtests on FC33/x86_64
> > OK for master (and maybe for 10-branch?)
>
> The patch by itself looks good to me, but
>
>   gfortran-trunk assumed_rank_20.f90 -fsanitize=address,undefined -g
>
> shows three times the warning:
>
> Direct leak of 12 byte(s) in 1 object(s) allocated from:
>  #0 0x7f2d5ef6e517 in malloc
> (/usr/lib/x86_64-linux-gnu/libasan.so.6+0xb0517)
>  #1 0x404221 in __mod_MOD_get_tuple /dev/shm/assumed_rank_20.f90:60
>  #2 0x40ad8e in alloc_rank /dev/shm/assumed_rank_20.f90:78 (+ line 84,
> + line 90)
>  #3 0x40d9e7 in main /dev/shm/assumed_rank_20.f90:67
>
> Thus, the function-result temporary does not seem to get deallocated
> when a constructor is used:
>
> 78:  output = sel_rank1([get_tuple(x)])  ! This worked OK
> 84:  output = sel_rank2([get_tuple(x)])  ! This worked OK
> 90:  output = sel_rank3([get_tuple(x)])  ! runtime: segmentation fault
>
> Thanks,
>
> Tobias
>
> > Fortran: Fix memory problems with assumed rank formal args [PR98342].
> >
> > 2021-01-29  Paul Thomas  
> >
> > gcc/fortran
> > PR fortran/98342
> > * trans-expr.c (gfc_conv_derived_to_class): Add optional arg.
> > 'derived_array' to hold the fixed, parmse expr in the case of
> > assumed rank formal arguments. Deal with optional arguments.
> > (gfc_conv_procedure_call): Null 'derived' array for each actual
> > argument. Add its address to the call to gfc_conv_derived_to_
> > class. Access the 'data' field of scalar descriptors before
> > deallocating allocatable components. Also strip NOPs before the
> > calls to gfc_deallocate_alloc_comp. Use 'derived' array as the
> > input to gfc_deallocate_alloc_comp if it is available.
> > * trans.h : Include the optional argument 'derived_array' to
> > the prototype of gfc_conv_derived_to_class. The default value
> > is NULL_TREE.
> >
> > gcc/testsuite/
> > PR fortran/98342
> > * gfortran.dg/assumed_rank_20.f90 : New test.
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München /
> Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
> Alexander Walter
>


-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein
! { dg-do run }
!
! Test the fix for PR98342.
!
! Contributed by Martin Stein  
!
module mod
  implicit none
  private
  public get_tuple, sel_rank1, sel_rank2, sel_rank3

  type, public :: tuple
  integer, dimension(:), allocatable :: t
end type tuple

contains

function sel_rank1(x) result(s)
  character(len=:), allocatable :: s
  type(tuple), dimension(..), intent(in) :: x
  select rank (x)
rank (0)
  s = '10'
rank (1)
  s = '11'
rank default
  s = '?'
  end select
end function sel_rank1

function sel_rank2(x) result(s)
  character(len=:), allocatable :: s
  class(tuple), dimension(..), intent(in) :: x
  select rank (x)
rank (0)
  s = '20'
rank (1)
  s = '21'
rank default
  s = '?'
  end select
end function sel_rank2

function sel_rank3(x) result(s)
  character(len=:), allocatable :: s
  class(*), dimension(..), intent(in) :: x
  select rank (x)
rank (0)
  s = '30'
rank (1)
  s = '31'
rank default
  s = '?'
  end select
end function sel_rank3

function get_tuple(t) result(a)
  type(tuple) :: a
  integer, dimension(:), intent(in) :: t
  allocate(a%t, source=t)
end function get_tuple

end module mod


program alloc_rank
  use mod
  implicit none

  integer, dimension(1:3) :: x
  character(len=:), allocatable :: output
  type(tuple) :: z

  x = [1,2,3]
  z = get_tuple (x)
   ! Derived type formal arg
  output = sel_rank1(get_tuple (x))! runtime: Error in `./alloc_rank.x':
  if (output .ne. '10') stop 1
  output = sel_rank1([z])  ! This worked OK
  if (output .ne. '11') stop 2

   ! Class formal arg
  output = sel_rank2(get_tuple (x))! runtime: Error in `./alloc_rank.x':
  if (output .ne. '20') stop 3
  output = sel_rank2([z])  ! This worked OK
  if (output .ne. '21') stop 4

   ! Unlimited polymorphic formal arg
  output = sel_rank3(get_tuple (x))! runtime: Error in `./alloc_rank.x':
  if (output .ne. '30') stop 5
  output = sel_rank3([z])  ! runtime: segmentation fault
  if (output .ne. '31') stop 6

  deallocate (output)
  deallocate (z%t)
end program alloc_rank


[RFC] Feedback on approach for adding support for V8QI->V8HI widening patterns

2021-02-01 Thread Joel Hutton via Gcc-patches
Hi Richard(s),

I'm just looking to see if I'm going about this the right way, based on the 
discussion we had on IRC. I've managed to hack something together, I've 
attached a (very) WIP patch which gives the correct codegen for the testcase in 
question (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98772). It would 
obviously need to support other widening patterns and differentiate between 
big/little endian among other things.

I added a backend pattern because I wasn't quite clear which changes to make in 
order to allow the existing backend patterns to be used with a V8QI, or how to 
represent V16QI where we don't care about the top/bottom 8. I made some attempt 
in optabs.c, which is in the patch commented out, but I'm not sure if I'm going 
about this the right way.
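
As a reference point for what the V8QI->V8HI pattern is meant to compute, here is a scalar model of an unsigned widening subtract: eight 8-bit lanes in, eight 16-bit lanes out, with each difference taken in the wider type and wrapping modulo 2^16. The function name widen_usubl_model is made up for this illustration, and it is an assumption that the hypothetical "half" pattern has exactly these lane-wise semantics.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model: out[i] = (uint16_t)a[i] - (uint16_t)b[i], wrapping
// modulo 2^16. All eight input lanes are consumed, so there is no
// hi/lo split as with a 16-lane input spread over two results.
std::array<std::uint16_t, 8>
widen_usubl_model(const std::array<std::uint8_t, 8>& a,
                  const std::array<std::uint8_t, 8>& b) {
  std::array<std::uint16_t, 8> out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    // Widen both operands first, then subtract and truncate to 16 bits.
    out[i] = static_cast<std::uint16_t>(
        static_cast<std::uint16_t>(a[i]) - static_cast<std::uint16_t>(b[i]));
  }
  return out;
}
```

For example, lanes 5 and 3 give 2, while lanes 0 and 1 give 0xFFFF, since the subtraction wraps in the 16-bit result type.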

Joel
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index be2a5a865172bdd7848be4082abb0fdfb0b35937..c66b8a367623c8daf4423677d292e292feee3606 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3498,6 +3498,14 @@
   DONE;
 })
 
+(define_insn "vec_widen_usubl_half_v8qi"
+  [(match_operand:V8HI 0 "register_operand")
+(match_operand:V8QI 1 "register_operand")
+(match_operand:V8QI 2 "register_operand")]
+  "TARGET_SIMD"
+  "usubl\t%0., %1., %2."
+)
+
 (define_expand "vec_widen_subl_hi_"
   [(match_operand: 0 "register_operand")
(ANY_EXTEND: (match_operand:VQW 1 "register_operand"))
diff --git a/gcc/expr.c b/gcc/expr.c
index 04ef5ad114d0662948c896cdbf58e67737b39c7e..0939a156deef63f1cf2fa7e29c2c94925820f2ba 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9785,6 +9785,7 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
 
 case VEC_WIDEN_PLUS_HI_EXPR:
 case VEC_WIDEN_PLUS_LO_EXPR:
+case VEC_WIDEN_MINUS_HALF_EXPR:
 case VEC_WIDEN_MINUS_HI_EXPR:
 case VEC_WIDEN_MINUS_LO_EXPR:
 case VEC_WIDEN_MULT_HI_EXPR:
diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
index 876a3a6f348de122e5a52e6dd70d7946bc810162..10aa21d07595325fd8ef3057444853fc946385de 100644
--- a/gcc/optabs-query.h
+++ b/gcc/optabs-query.h
@@ -186,6 +186,9 @@ bool can_vec_perm_const_p (machine_mode, const vec_perm_indices &,
 enum insn_code find_widening_optab_handler_and_mode (optab, machine_mode,
 		 machine_mode,
 		 machine_mode *);
+enum insn_code find_half_mode_optab_and_mode (optab, machine_mode,
+		 machine_mode,
+		 machine_mode *);
 int can_mult_highpart_p (machine_mode, bool);
 bool can_vec_mask_load_store_p (machine_mode, machine_mode, bool);
 opt_machine_mode get_len_load_store_mode (machine_mode, bool);
diff --git a/gcc/optabs-query.c b/gcc/optabs-query.c
index 3248ce2c06e65c9c0366757907ab057407f7c594..7abfc04aa18b7ee5b734a1b1f4378b4615ee31fd 100644
--- a/gcc/optabs-query.c
+++ b/gcc/optabs-query.c
@@ -462,6 +462,17 @@ can_vec_perm_const_p (machine_mode mode, const vec_perm_indices ,
   return false;
 }
 
+enum insn_code
+find_half_mode_optab_and_mode (optab op, machine_mode to_mode,
+  machine_mode from_mode,
+  machine_mode *found_mode)
+{
+insn_code icode = CODE_FOR_nothing;
+if (GET_MODE_2XWIDER_MODE(from_mode).exists(found_mode))
+  icode = optab_handler (op, *found_mode);
+return icode;
+}
+
 /* Find a widening optab even if it doesn't widen as much as we want.
E.g. if from_mode is HImode, and to_mode is DImode, and there is no
direct HI->SI insn, then return SI->DI, if that exists.  */
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index c94073e3ed98f8c4cab65891f65dedebdb1ec274..eb52dc15f8094594c4aa22d5fc1c442886e4ebf6 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -185,6 +185,9 @@ optab_for_tree_code (enum tree_code code, const_tree type,
 case VEC_WIDEN_MINUS_HI_EXPR:
   return (TYPE_UNSIGNED (type)
 	  ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
+
+case VEC_WIDEN_MINUS_HALF_EXPR:
+  return vec_widen_usubl_half_optab;
 
 case VEC_UNPACK_HI_EXPR:
   return (TYPE_UNSIGNED (type)
@@ -308,6 +311,16 @@ supportable_convert_operation (enum tree_code code,
   if (!VECTOR_MODE_P (m1) || !VECTOR_MODE_P (m2))
 return false;
 
+  /* The case where vectype_in is half the vector width, as opposed to the
+ normal case for widening patterns of vector width input, with output in
+ multiple registers. */
+  if (code == WIDEN_MINUS_EXPR &&
+  known_eq(TYPE_VECTOR_SUBPARTS(vectype_in),TYPE_VECTOR_SUBPARTS(vectype_out)) )
+  {
+*code1 = VEC_WIDEN_MINUS_HALF_EXPR;
+return true;
+  }
+
   /* First check if we can done conversion directly.  */
   if ((code == FIX_TRUNC_EXPR
&& can_fix_p (m1,m2,TYPE_UNSIGNED (vectype_out), )
diff --git a/gcc/optabs.c b/gcc/optabs.c
index f4614a394587787293dc8b680a38901f7906f61c..1252097be9893d7d65ea844fc0eda9bad70b9256 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -293,6 +293,13 @@ expand_widen_pattern_expr (sepops ops, rtx op0, rtx op1, rtx wide_op,
 icode = 

Re: [PATCH][Bug libstdc++/70303] Value-initialized debug iterators

2021-02-01 Thread Jonathan Wakely via Gcc-patches

On 31/01/21 16:59 +0100, François Dumont via Libstdc++ wrote:
After the debug issue has been fixed in PR 98466 the problem was not 
in the debug iterator implementation itself but in the deque iterator 
operator- implementation.


    libstdc++: Make deque iterator operator- usable with value-init iterators

    N3644 implies that operator- can be used on value-init iterators. We now
    return 0 if both iterators are value initialized. If only one is value
    initialized we keep the UB by returning the result of a normal computation
    which is an unexpected value.

    libstdc++/ChangeLog:

    PR libstdc++/70303
    * include/bits/stl_deque.h
    (std::deque<>::operator-(iterator, iterator)):
    Return 0 if both iterators are value-initialized.
    * testsuite/23_containers/deque/70303.cc: New test.
    * testsuite/23_containers/vector/70303.cc: New test.

Tested under Linux x86_64, ok to commit ?


OK.

I don't like adding the branch there though. Even with the
__builtin_expect it causes worse code to be generated than the
original.

This would be branchless, but a bit harder to understand:

return difference_type(__x._S_buffer_size())
  * (__x._M_node - __y._M_node - int(__x._M_node == __y._M_node))
  + (__x._M_cur - __x._M_first) + (__y._M_last - __y._M_cur);

Please commit the fix and we can think about it later.


François




diff --git a/libstdc++-v3/include/bits/stl_deque.h 
b/libstdc++-v3/include/bits/stl_deque.h
index d41c27717a3..04b70b77621 100644
--- a/libstdc++-v3/include/bits/stl_deque.h
+++ b/libstdc++-v3/include/bits/stl_deque.h
@@ -352,9 +352,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  friend difference_type
  operator-(const _Self& __x, const _Self& __y) _GLIBCXX_NOEXCEPT
  {
-   return difference_type(_S_buffer_size())
- * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
- + (__y._M_last - __y._M_cur);
+   if (__builtin_expect(__x._M_node || __y._M_node, true))
+ return difference_type(_S_buffer_size())
+   * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
+   + (__y._M_last - __y._M_cur);
+
+   return 0;
  }

  // _GLIBCXX_RESOLVE_LIB_DEFECTS
@@ -366,9 +369,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
operator-(const _Self& __x,
  const _Deque_iterator<_Tp, _RefR, _PtrR>& __y) _GLIBCXX_NOEXCEPT
{
- return difference_type(_S_buffer_size())
-   * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
-   + (__y._M_last - __y._M_cur);
+ if (__builtin_expect(__x._M_node || __y._M_node, true))
+   return difference_type(_S_buffer_size())
+ * (__x._M_node - __y._M_node - 1) + (__x._M_cur - __x._M_first)
+ + (__y._M_last - __y._M_cur);
+
+ return 0;
}

  friend _Self
diff --git a/libstdc++-v3/testsuite/23_containers/deque/70303.cc b/libstdc++-v3/testsuite/23_containers/deque/70303.cc
new file mode 100644
index 000..e0e63694170
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/deque/70303.cc
@@ -0,0 +1,67 @@
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run }
+
+#include <deque>
+#include <testsuite_hooks.h>
+
+// PR libstdc++/70303
+
+void test01()
+{
+  typedef typename std::deque<int>::iterator It;
+  It it = It();
+  VERIFY(it == it);
+  VERIFY(!(it != it));
+  VERIFY(it - it == 0);
+  VERIFY(!(it < it));
+  VERIFY(!(it > it));
+  VERIFY(it <= it);
+  VERIFY(it >= it);
+
+  typedef typename std::deque<int>::const_iterator CIt;
+  CIt cit = CIt();
+  VERIFY(cit == cit);
+  VERIFY(!(cit != cit));
+  VERIFY(cit - cit == 0);
+  VERIFY(!(cit < cit));
+  VERIFY(!(cit > cit));
+  VERIFY(cit <= cit);
+  VERIFY(cit >= cit);
+
+  VERIFY(it == cit);
+  VERIFY(!(it != cit));
+  VERIFY(cit == it);
+  VERIFY(!(cit != it));
+  VERIFY(it - cit == 0);
+  VERIFY(cit - it == 0);
+  VERIFY(!(it < cit));
+  VERIFY(!(it > cit));
+  VERIFY(it <= cit);
+  VERIFY(it >= cit);
+  VERIFY(!(cit < it));
+  VERIFY(!(cit > it));
+  VERIFY(cit <= it);
+  VERIFY(cit >= it);
+}
+
+int main()
+{
+  test01();
+  return 0;
+}
diff --git a/libstdc++-v3/testsuite/23_containers/vector/70303.cc 

Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)

2021-02-01 Thread Martin Sebor via Gcc-patches

On 2/1/21 9:27 AM, Jakub Jelinek wrote:

On Mon, Feb 01, 2021 at 09:11:20AM -0700, Martin Sebor via Gcc-patches wrote:

Because free_lang_data only frees anything when LTO is enabled and
we want these trees cleared regardless to keep them from getting
clobbered during gimplification, this change also modifies the pass
to do the clearing even when the pass is otherwise inactive.


    if (TREE_CODE (bound) == NOP_EXPR)
+    bound = TREE_OPERAND (bound, 0);
+
+  if (TREE_CODE (bound) == CONVERT_EXPR)
+    {
+  tree op0 = TREE_OPERAND (bound, 0);
+  tree bndtyp = TREE_TYPE (bound);
+  tree op0typ = TREE_TYPE (op0);
+  if (TYPE_PRECISION (bndtyp) == TYPE_PRECISION (op0typ))
+   bound = op0;
+    }
+
+  if (TREE_CODE (bound) == NON_LVALUE_EXPR)
+    bound = TREE_OPERAND (bound, 0);

all of the above can be just

     STRIP_NOPS (bound);

which also handles nesting of the above in any order.


No, it can't be just STRIP_NOPS.

The goal is to strip the meaningless (to the user) cast to sizetype
from the array type.  For example:

    void f (int n, int[n]);
    void f (int n, int[n + 1]);

I want the type in the warning to reflect the source:

    warning: argument 2 of type ‘int[n + 1]’ declared with mismatched
bound ‘n + 1’ [-Wvla-parameter]

and not:

    warning: ... ‘int[(sizetype)(n + 1)]’ ...
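
The "nesting in any order" remark can be illustrated with a toy model. This is only a sketch: `Node`, `Code`, `strip_once_in_order` and `strip_all` are hypothetical names, not GCC's tree API, and the model ignores the precision check in the quoted hunk as well as whatever type conditions the real STRIP_NOPS applies. It only shows why a fixed sequence of single checks and a loop behave differently on nested wrappers.

```cpp
#include <cassert>

// Hypothetical expression codes; only the wrapper/leaf distinction matters here.
enum class Code { Nop, Convert, NonLvalue, Var };

struct Node {
  Code code;
  const Node* op0;  // wrapped operand; null for leaves
};

inline bool is_wrapper(Code c) {
  return c == Code::Nop || c == Code::Convert || c == Code::NonLvalue;
}

// One check per wrapper kind, in a fixed order (the shape of the quoted hunk).
inline const Node* strip_once_in_order(const Node* n) {
  if (n->code == Code::Nop)
    n = n->op0;
  if (n->code == Code::Convert)
    n = n->op0;
  if (n->code == Code::NonLvalue)
    n = n->op0;
  return n;
}

// Loop until no wrapper is left, regardless of nesting order or depth.
inline const Node* strip_all(const Node* n) {
  while (is_wrapper(n->code))
    n = n->op0;
  return n;
}
```

With a chain NonLvalue -> Nop -> Var, the fixed-order version stops at the Nop node, while the loop reaches the variable.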



+  if (TREE_CODE (bound) == PLUS_EXPR
+  && integer_all_onesp (TREE_OPERAND (bound, 1)))
+    {
+  bound = TREE_OPERAND (bound, 0);
+  if (TREE_CODE (bound) == NOP_EXPR)
+   bound = TREE_OPERAND (bound, 0);
+    }

so it either does or does not strip a -1 but has no indication on
whether it did that?  That looks fragile and broken.


Indication to what?  The caller?  The function is only used to recover
a meaningful VLA bound for warnings and its callers aren't interested
in whether the -1 was stripped or not.



Anyway, the split out of this function seems unrelated to the
original problem and should be submitted separately.


It was a remnant of the previous patch where it was used to format
the string representation of the VLA bounds and called from three
sites.  I've removed the function from this revision (and restored
the one site in the pretty printer that open-codes the same thing).



+  for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist))
+   {
+ tree *pvbnd = &TREE_VALUE (vblist);
+ if (!*pvbnd || DECL_P (*pvbnd))
+   continue;

so this doesn't let constant bounds prevail but only symbolical ones?  Not
that I care but I'd have expected || CONSTANT_CLASS_P (*pvbnd)


There must be some confusion here.  There are no constant VLA bounds.
The essential purpose of this patch is to remove expressions from
the attributes, such as PLUS_EXPR, that denote nontrivial VLA bounds.
The test above retains decls that might refer to function parameters
or global variables so that they can be mentioned in middle end
warnings.

Attached is yet another revision of this fix that moves the call
to attr_access::free_lang_data() to c_parse_final_cleanups() as
Jakub suggested.


With no further comments I have committed the final patch in
g:0718336a528.


This is unacceptable, you chose to ignore Richard's comments,
nobody has approved the patch and you've committed it anyway.


You might want to look at the commit first before making accusations.
I committed the subset that Richard approved in the place you suggested.
That's also what I posted in my last reply:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564475.html



The code of course should be using STRIP_NOPS, and if the callers
don't care if you sometimes strip away + -1 from it or not, they are just
broken.  Either the expression stands for the largest valid index into the
array, or it stands for the number of array elements.  If the former, you
don't want to strip away + -1 when it appears, if the latter, you do want to
strip it away but if you don't find it, you need to add + 1 yourself, the +
-1 could disappear from earlier folding.


None of this was committed.

Martin


Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)

2021-02-01 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 01, 2021 at 09:11:20AM -0700, Martin Sebor via Gcc-patches wrote:
> > > > Because free_lang_data only frees anything when LTO is enabled and
> > > > we want these trees cleared regardless to keep them from getting
> > > > clobbered during gimplification, this change also modifies the pass
> > > > to do the clearing even when the pass is otherwise inactive.
> > > 
> > >    if (TREE_CODE (bound) == NOP_EXPR)
> > > +    bound = TREE_OPERAND (bound, 0);
> > > +
> > > +  if (TREE_CODE (bound) == CONVERT_EXPR)
> > > +    {
> > > +  tree op0 = TREE_OPERAND (bound, 0);
> > > +  tree bndtyp = TREE_TYPE (bound);
> > > +  tree op0typ = TREE_TYPE (op0);
> > > +  if (TYPE_PRECISION (bndtyp) == TYPE_PRECISION (op0typ))
> > > +   bound = op0;
> > > +    }
> > > +
> > > +  if (TREE_CODE (bound) == NON_LVALUE_EXPR)
> > > +    bound = TREE_OPERAND (bound, 0);
> > > 
> > > all of the above can be just
> > > 
> > >     STRIP_NOPS (bound);
> > > 
> > > which also handles nesting of the above in any order.
> > 
> > No, it can't be just STRIP_NOPS.
> > 
> > The goal is to strip the meaningless (to the user) cast to sizetype
> > from the array type.  For example:
> > 
> >    void f (int n, int[n]);
> >    void f (int n, int[n + 1]);
> > 
> > I want the type in the warning to reflect the source:
> > 
> >    warning: argument 2 of type ‘int[n + 1]’ declared with mismatched
> > bound ‘n + 1’ [-Wvla-parameter]
> > 
> > and not:
> > 
> >    warning: ... ‘int[(sizetype)(n + 1)]’ ...
> > 
> > > 
> > > +  if (TREE_CODE (bound) == PLUS_EXPR
> > > +  && integer_all_onesp (TREE_OPERAND (bound, 1)))
> > > +    {
> > > +  bound = TREE_OPERAND (bound, 0);
> > > +  if (TREE_CODE (bound) == NOP_EXPR)
> > > +   bound = TREE_OPERAND (bound, 0);
> > > +    }
> > > 
> > > so it either does or does not strip a -1 but has no indication on
> > > whether it did that?  That looks fragile and broken.
> > 
> > Indication to what?  The caller?  The function is only used to recover
> > a meaningful VLA bound for warnings and its callers aren't interested
> > in whether the -1 was stripped or not.
> > 
> > > 
> > > Anyway, the split out of this function seems unrelated to the
> > > original problem and should be submitted separately.
> > 
> > It was a remnant of the previous patch where it was used to format
> > the string representation of the VLA bounds and called from three
> > sites.  I've removed the function from this revision (and restored
> > the one site in the pretty printer that open-codes the same thing).
> > 
> > > 
> > > +  for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist))
> > > +   {
> > > + tree *pvbnd = &TREE_VALUE (vblist);
> > > + if (!*pvbnd || DECL_P (*pvbnd))
> > > +   continue;
> > > 
> > > so this doesn't let constant bounds prevail but only symbolical
> > > ones?  Not
> > > that I care but I'd have expected || CONSTANT_CLASS_P (*pvbnd)
> > 
> > There must be some confusion here.  There are no constant VLA bounds.
> > The essential purpose of this patch is to remove expressions from
> > the attributes, such as PLUS_EXPR, that denote nontrivial VLA bounds.
> > The test above retains decls that might refer to function parameters
> > or global variables so that they can be mentioned in middle end
> > warnings.
> > 
> > Attached is yet another revision of this fix that moves the call
> > > to attr_access::free_lang_data() to c_parse_final_cleanups() as
> > Jakub suggested.
> 
> With no further comments I have committed the final patch in
> g:0718336a528.

This is unacceptable, you chose to ignore Richard's comments,
nobody has approved the patch and you've committed it anyway.

The code of course should be using STRIP_NOPS, and if the callers
don't care if you sometimes strip away + -1 from it or not, they are just
broken.  Either the expression stands for the largest valid index into the
array, or it stands for the number of array elements.  If the former, you
don't want to strip away + -1 when it appears, if the latter, you do want to
strip it away but if you don't find it, you need to add + 1 yourself, the +
-1 could disappear from earlier folding.
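
A small symbolic sketch of the "number of elements" case described above, under the assumption that the stored expression denotes the largest valid index. `Expr`, `print` and `element_count` are hypothetical names for illustration and have nothing to do with GCC's tree representation.

```cpp
#include <cassert>
#include <string>

// Hypothetical mini expression tree: a variable, a constant, or a sum.
struct Expr {
  enum Kind { Var, Const, Plus } kind;
  std::string name;  // used by Var
  long value;        // used by Const
  const Expr* lhs;   // used by Plus
  const Expr* rhs;   // used by Plus
};

inline std::string print(const Expr& e) {
  switch (e.kind) {
    case Expr::Var:   return e.name;
    case Expr::Const: return std::to_string(e.value);
    case Expr::Plus:  return print(*e.lhs) + " + " + print(*e.rhs);
  }
  return "";
}

// Given an expression for the largest valid index, produce text for the
// number of elements: drop a trailing "+ -1" if present, otherwise add "+ 1".
inline std::string element_count(const Expr& max_index) {
  if (max_index.kind == Expr::Plus
      && max_index.rhs->kind == Expr::Const
      && max_index.rhs->value == -1)
    return print(*max_index.lhs);
  return print(max_index) + " + 1";
}
```

The second branch is the point of the objection: if earlier folding has already removed the `+ -1`, silently returning the expression unchanged would be off by one.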

Jakub



Re: [PATCH] document BLOCK_ABSTRACT_ORIGIN et al.

2021-02-01 Thread Martin Sebor via Gcc-patches

I have pushed the tree.h comments in g:6a2053773b8.  I will wait
for an approval of the changes to the manual.

On 1/27/21 5:54 PM, Martin Sebor wrote:

Attached is an updated patch for both tree.h and the internals manual
documenting the most important BLOCK_ macros and what they represent.

On 1/21/21 2:52 PM, Martin Sebor wrote:

On 1/18/21 6:25 AM, Richard Biener wrote:

PS Here are my notes on the macros and the two related functions:

BLOCK: Denotes a lexical scope.  Contains BLOCK_VARS of variables
declared in it, BLOCK_SUBBLOCKS of scopes nested in it, and
BLOCK_CHAIN pointing to the next BLOCK.  Its BLOCK_SUPERCONTEXT
points to the BLOCK of the enclosing scope.  May have
a BLOCK_ABSTRACT_ORIGIN and a BLOCK_SOURCE_LOCATION.

BLOCK_SUPERCONTEXT: The scope of the enclosing block, or FUNCTION_DECL
for the "outermost" function scope.  Inlined functions are chained by
this so that given expression E and its TREE_BLOCK(E) B,
BLOCK_SUPERCONTEXT(B) is the scope (BLOCK) in which E has been made
or into which E has been inlined.  In the latter case,

BLOCK_ORIGIN(B) evaluates either to the enclosing BLOCK or to
the enclosing function DECL.  It's never null.

BLOCK_ABSTRACT_ORIGIN(B) is the FUNCTION_DECL of the function into
which it has been inlined, or null if B is not inlined.


It's the BLOCK or FUNCTION it was inlined _from_, not where it was inlined to.
It's the "ultimate" source, thus the abstract copy of the block or function decl
(for the outermost scope, aka inlined_function_outer_scope_p).  It corresponds
to what you'd expect for the DWARF abstract origin.


Thanks for the correction!  It's just the "innermost" block that
points to the "ultimate" destination into which it's been inlined.



BLOCK_ABSTRACT_ORIGIN can be NULL (in case it isn't an inline instance).


BLOCK_ABSTRACT_ORIGIN: A BLOCK, or FUNCTION_DECL of the function
into which a block has been inlined.  In a BLOCK immediately enclosing
an inlined leaf expression points to the outermost BLOCK into which it
has been inlined (thus bypassing all intermediate BLOCK_SUPERCONTEXTs).

BLOCK_FRAGMENT_ORIGIN: ???
BLOCK_FRAGMENT_CHAIN: ???


that's for scope blocks split by hot/cold partitioning and only temporarily
populated.


Thanks, I now see these documented in detail in tree.h.




bool inlined_function_outer_scope_p(BLOCK)   [tree.h]
    Returns true if a BLOCK has a source location.
    True for all but the innermost (no SUBBLOCKs?) and outermost blocks
    into which an expression has been inlined. (Is this always true?)

tree block_ultimate_origin(BLOCK)   [tree.c]
    Returns BLOCK_ABSTRACT_ORIGIN(BLOCK), AO, after asserting that
    (DECL_P(AO) && DECL_ORIGIN(AO) == AO) || BLOCK_ORIGIN(AO) == AO.
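
As a rough picture of the relationships in these notes, here is a toy model with two links per scope: the enclosing scope and, for inline instances, the scope it was inlined from (following the correction above). `Scope` and `nearest_inline_instance` are hypothetical names. Unlike the real BLOCK_SUPERCONTEXT, which the notes say ends at a FUNCTION_DECL, this sketch simply ends the chain with a null pointer.

```cpp
#include <cassert>

// Hypothetical stand-in for a lexical scope.
struct Scope {
  const char* label;
  const Scope* supercontext;     // enclosing scope; null ends the chain in this toy
  const Scope* abstract_origin;  // scope this one was inlined from, or null
};

// Walk outward from `s` and return the first scope that is an inline instance.
inline const Scope* nearest_inline_instance(const Scope* s) {
  for (; s; s = s->supercontext)
    if (s->abstract_origin)
      return s;
  return nullptr;
}
```

A scope nested inside an inlined body reaches the inline instance by following the enclosing-scope links; a scope in code that was never inlined reaches none.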


The attached diff adds the comments above to tree.h.

I looked for a good place in the manual to add the same text but I'm
not sure.  Would the Blocks @subsection in generic.texi be appropriate?

Martin







Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)

2021-02-01 Thread Martin Sebor via Gcc-patches

On 1/28/21 1:59 PM, Martin Sebor wrote:

On 1/28/21 1:31 AM, Richard Biener wrote:

On Thu, Jan 28, 2021 at 12:08 AM Martin Sebor via Gcc-patches wrote:


Attached is another attempt to fix the problem caused by allowing
front-end trees representing nontrivial VLA bound expressions to
stay in attribute access attached to functions.  Since removing
these trees seems to be everyone's preference this patch does that
by extending the free_lang_data pass to look for and zero out these
trees.

Because free_lang_data only frees anything when LTO is enabled and
we want these trees cleared regardless to keep them from getting
clobbered during gimplification, this change also modifies the pass
to do the clearing even when the pass is otherwise inactive.


   if (TREE_CODE (bound) == NOP_EXPR)
+    bound = TREE_OPERAND (bound, 0);
+
+  if (TREE_CODE (bound) == CONVERT_EXPR)
+    {
+  tree op0 = TREE_OPERAND (bound, 0);
+  tree bndtyp = TREE_TYPE (bound);
+  tree op0typ = TREE_TYPE (op0);
+  if (TYPE_PRECISION (bndtyp) == TYPE_PRECISION (op0typ))
+   bound = op0;
+    }
+
+  if (TREE_CODE (bound) == NON_LVALUE_EXPR)
+    bound = TREE_OPERAND (bound, 0);

all of the above can be just

    STRIP_NOPS (bound);

which also handles nesting of the above in any order.


No, it can't be just STRIP_NOPS.

The goal is to strip the meaningless (to the user) cast to sizetype
from the array type.  For example:

   void f (int n, int[n]);
   void f (int n, int[n + 1]);

I want the type in the warning to reflect the source:

   warning: argument 2 of type ‘int[n + 1]’ declared with mismatched 
bound ‘n + 1’ [-Wvla-parameter]


and not:

   warning: ... ‘int[(sizetype)(n + 1)]’ ...



+  if (TREE_CODE (bound) == PLUS_EXPR
+  && integer_all_onesp (TREE_OPERAND (bound, 1)))
+    {
+  bound = TREE_OPERAND (bound, 0);
+  if (TREE_CODE (bound) == NOP_EXPR)
+   bound = TREE_OPERAND (bound, 0);
+    }

so it either does or does not strip a -1 but has no indication on
whether it did that?  That looks fragile and broken.


Indication to what?  The caller?  The function is only used to recover
a meaningful VLA bound for warnings and its callers aren't interested
in whether the -1 was stripped or not.



Anyway, the split out of this function seems unrelated to the
original problem and should be submitted separately.


It was a remnant of the previous patch where it was used to format
the string representation of the VLA bounds and called from three
sites.  I've removed the function from this revision (and restored
the one site in the pretty printer that open-codes the same thing).



+  for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist))
+   {
+ tree *pvbnd = &TREE_VALUE (vblist);
+ if (!*pvbnd || DECL_P (*pvbnd))
+   continue;

so this doesn't let constant bounds prevail but only symbolical ones?  Not
that I care but I'd have expected || CONSTANT_CLASS_P (*pvbnd)


There must be some confusion here.  There are no constant VLA bounds.
The essential purpose of this patch is to remove expressions from
the attributes, such as PLUS_EXPR, that denote nontrivial VLA bounds.
The test above retains decls that might refer to function parameters
or global variables so that they can be mentioned in middle end
warnings.
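
The filtering described here can be sketched with a toy list of bounds. `Tree`, `Kind`, `is_decl` and `clear_nontrivial_bounds` are hypothetical names, not GCC's API. The quoted hunk only shows the `continue` test; that the remaining entries are set to null is an assumption based on the "zero out these trees" wording earlier in the thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node kinds: declarations versus a nontrivial expression.
enum class Kind { ParmDecl, VarDecl, PlusExpr };

struct Tree {
  Kind kind;
};

inline bool is_decl(const Tree* t) {
  return t->kind == Kind::ParmDecl || t->kind == Kind::VarDecl;
}

// Keep null entries and plain declarations; clear everything else.
// Returns the number of entries that were cleared.
inline int clear_nontrivial_bounds(std::vector<const Tree*>& bounds) {
  int cleared = 0;
  for (const Tree*& b : bounds) {
    if (!b || is_decl(b))
      continue;
    b = nullptr;
    ++cleared;
  }
  return cleared;
}
```

In this model a bound such as `n` (a parameter) survives so a later warning can still name it, while a bound such as `n + 1` is dropped.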

Attached is yet another revision of this fix that moves the call
to attr_access::free_lang_data() to c_parse_final_cleanups() as
Jakub suggested.


With no further comments I have committed the final patch in
g:0718336a528.

Martin


[committed] libstdc++: Update C++17 status table for <charconv>

2021-02-01 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2011.xml: Update std::call_once
status.
* doc/xml/manual/status_cxx2014.xml: Likewise.
* doc/xml/manual/status_cxx2017.xml: Likewise. Update
std::from_chars and std::to_chars status. Fix formatting.
* doc/html/manual/status.html: Regenerate.

Committed to trunk.

commit 90c9b2c17688f7be434415e90c5a655a6ecfaa9e
Author: Jonathan Wakely 
Date:   Mon Feb 1 15:39:24 2021

libstdc++: Update C++17 status table for <charconv>

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2011.xml: Update std::call_once
status.
* doc/xml/manual/status_cxx2014.xml: Likewise.
* doc/xml/manual/status_cxx2017.xml: Likewise. Update
std::from_chars and std::to_chars status. Fix formatting.
* doc/html/manual/status.html: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index e8f8784c1e9..e13ca566ea3 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -2401,10 +2401,11 @@ particular release.
   
   30.4.4.2
   Function call_once
-  Broken
-  See http://www.w3.org/1999/xlink;
+  Y
+  Exception support is broken on non-Linux targets.
+   See http://www.w3.org/1999/xlink;
xlink:href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146;>PR
-   66146
+   66146.
   
 
 
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
index 0d138abf794..7b2d4603b24 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
@@ -1388,9 +1388,10 @@ not in any particular release.
   30.4.4.2
   Function call_once
   Broken
-  See http://www.w3.org/1999/xlink;
+  Exception support is broken on non-Linux targets.
+   See http://www.w3.org/1999/xlink;
xlink:href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146;>PR
-   66146
+   66146.
   
 
 
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index f97fc060fa0..7b5df3d1385 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -457,7 +457,7 @@ Feature-testing recommendations for C++.
P0185R1

   
-   7 (__is_swappable available since 
6.1)
+   7.1 (__is_swappable available since 
6.1)
__cpp_lib_is_swappable >= 201603 
 
 
@@ -641,7 +641,7 @@ Feature-testing recommendations for C++.

   
9.1 
-   __has_include(execution) ,
+   __has_include(execution),
  __cpp_lib_execution >= 201603 ,
  __cpp_lib_parallel_algorithm >= 201603 
  (requires linking with -ltbb, see Note 3)
@@ -702,7 +702,7 @@ Feature-testing recommendations for C++.

   
8.1 
-   __has_include(filesystem) ,
+   __has_include(filesystem),
  __cpp_lib_filesystem >= 201603 
 (GCC 8.x requires linking with -lstdc++fs)
   
@@ -787,15 +787,14 @@ Feature-testing recommendations for C++.
 
 
 
-  
Elementary string conversions 
   
http://www.w3.org/1999/xlink; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0067r5.html;>
P0067R5

   
-   8.1 (only integral types supported) 
-   __has_include(charconv) 
+   11.1 (integral types supported since 8.1) 
+   __has_include(charconv),
   __cpp_lib_to_chars >= 201611 
 
 
@@ -1182,18 +1181,16 @@ since C++14 and the implementation is complete.
   
 
 
-  
   23.2.8
   Primitive numeric output conversion
   Partial
-  Only integer types supported, not floating-point types
+  
 
 
-  
   23.2.9
   Primitive numeric input conversion
   Partial
-  Only integer types supported, not floating-point types
+  
 
 
   23.3
@@ -2508,10 +2505,11 @@ since C++14 and the implementation is complete.
   
   33.4.6.2
   Function call_once
-  Broken
-  See http://www.w3.org/1999/xlink;
+  Y
+  Exception support is broken on non-Linux targets.
+   See http://www.w3.org/1999/xlink;
xlink:href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146;>PR
-   66146
+   66146.
   
 
 


[PATCH] aarch64: Reimplement vmovl_high_* intrinsics using builtins

2021-02-01 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

The vmovl_high_* intrinsics map down to the SXTL2/UXTL2 instructions that
already have appropriately-named patterns and expanders, so it's
straightforward to wire them up.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def (vec_unpacks_hi,
vec_unpacku_hi_): Define builtins.
* config/aarch64/arm_neon.h (vmovl_high_s8): Reimplement using builtin.
(vmovl_high_s16): Likewise.
(vmovl_high_s32): Likewise.
(vmovl_high_u8): Likewise.
(vmovl_high_u16): Likewise.
(vmovl_high_u32): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/simd/vmovl_high_1.c: New test.


vmovl-hi.patch
Description: vmovl-hi.patch


[PATCH] aarch64: Reimplement vabdl_* intrinsics using builtins

2021-02-01 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

Another simple set of intrinsics moved to builtins in the straightforward way.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def (sabdl, uabdl): Define
builtins.
* config/aarch64/aarch64-simd.md (aarch64_abdl): New pattern.
* config/aarch64/aarch64.md (unspec): Define UNSPEC_SABDL, UNSPEC_UABDL.
* config/aarch64/arm_neon.h (vabdl_s8): Reimplement using builtin.
(vabdl_s16): Likewise.
(vabdl_s32): Likewise.
(vabdl_u8): Likewise.
(vabdl_u16): Likewise.
(vabdl_u32): Likewise.
* config/aarch64/iterators.md (ABDL): New int iterator.
(sur): Handle UNSPEC_SABDL, UNSPEC_UABDL.


vabdl.patch
Description: vabdl.patch


[committed] add a new -Wclass-memaccess test (PR 98835)

2021-02-01 Thread Martin Sebor via Gcc-patches

The warning reported in PR 98835 is a true positive but there was
no test for this aspect of it.  I have added one in the attached
diff.

Martin
commit c2f8e378d64f65645e5f9c41a8221ca102c71208
Author: Martin Sebor 
Date:   Mon Feb 1 08:42:58 2021 -0700

Verify a warning for a class with a ref-qualified assignment (PR c++/98835).

gcc/testsuite/ChangeLog:
PR c++/98835
* g++.dg/Wclass-memaccess-6.C: New test.

diff --git a/gcc/testsuite/g++.dg/Wclass-memaccess-6.C b/gcc/testsuite/g++.dg/Wclass-memaccess-6.C
new file mode 100644
index 000..7f6fe03a939
--- /dev/null
+++ b/gcc/testsuite/g++.dg/Wclass-memaccess-6.C
@@ -0,0 +1,18 @@
+/* PR c++/98835 - -Wclass-memaccess with class with ref-qualified
+   copy-assignment operator
+   { dg-do compile { target { c++11 } } }
+   { dg-options "-Wall" } */
+
+struct Bad
+{
+  Bad* operator& () { return this; }
+  Bad & operator=(Bad const &) & = default;
+};
+
+void test ()
+{
+  static_assert (__has_trivial_copy (Bad));
+
+  // T () = T ();  // error
+  __builtin_memcpy (&Bad (), &Bad (), sizeof (Bad));   // { dg-warning "\\\[-Wclass-memaccess" }
+}


Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-02-01 Thread Jason Merrill via Gcc-patches

On 1/30/21 6:22 PM, Ed Smith-Rowland wrote:

On 1/27/21 3:32 PM, Jakub Jelinek wrote:

On Sun, Oct 21, 2018 at 04:39:30PM -0400, Ed Smith-Rowland wrote:
This patch implements C++2a proposal P0330R2 Literal Suffixes for ptrdiff_t
and size_t*.  It's not official yet but looks very likely to pass.  It is
incomplete because I'm looking for some opinions. (We also might wait 'til
it actually passes).

This paper takes the direction of a language change rather than a library
change through C++11 literal operators.  This was after feedback on that
paper after a few iterations.

As coded in this patch, integer suffixes involving 'z' are errors in C and
warnings for C++ <= 17 (in addition to the usual warning about
implementation suffixes shadowing user-defined ones).

OTOH, the 'z' suffix is not currently legal - it can't break
currently-correct code in any C/C++ dialect.  Furthermore, I suspect the
language direction was chosen to accommodate a similar addition to C20.

I'm thinking of making this feature available as an extension to all of
C/C++ perhaps with appropriate pedwarn.

GCC now supports -std=c++2b and -std=gnu++2b, are you going to update your
patch against it (and change for z/Z standing for ssize_t rather than
ptrdiff_t), plus incorporate the feedback from Joseph and Jason?

Jakub


Here is a rebased patch that is a bit leaner than the original.

Since I chose to be conservative in applying this just to C++23 I'm not 
adding this to C or to earlier versions of C++ as extensions. We can add 
that if people really want, maybe in stage 1.


The compat warning for C++ < 23 is not optional. Since the suffixes are 
not preceded by '_' I don't have much sympathy if people tried to make a 
literal 'z' operator. Which is the only reason I can see for a warning 
suppression.



+  /* itk refers to fundamental types not aliased size types.  */
+  if (flags & CPP_N_UNSIGNED)
+   type = size_type_node;
+  else
+   type = ptrdiff_type_node;


This is wrong if ptrdiff_t is a different size from size_t; it should be 
c_common_signed_type (size_type_node).



+ | (z ? (CPP_N_SIZE_T | CPP_N_LARGE) : 0));


Why CPP_N_LARGE here?  That would seem to suggest that size_t is always 
the same size as unsigned long long.
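
A library-level analogy for the first point, not the front-end code itself: the signed counterpart of size_t can be derived from size_t, which guarantees the two have the same width, whereas ptrdiff_t is a separately defined type. `signed_size` is a hypothetical alias used only for this sketch.

```cpp
#include <cstddef>
#include <type_traits>

// Signed type derived from std::size_t, so it has the same width by construction.
using signed_size = std::make_signed<std::size_t>::type;

static_assert(std::is_signed<signed_size>::value,
              "signed_size must be a signed type");
static_assert(sizeof(signed_size) == sizeof(std::size_t),
              "signed_size must be as wide as size_t");
```

No comparable assertion is written for `std::ptrdiff_t` here, since the review comment is precisely that its size need not match that of size_t on every target.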


Jason



Re: [PATCH] c++: Fix ICE in verify_ctor_sanity [PR98295]

2021-02-01 Thread Jason Merrill via Gcc-patches

On 2/1/21 7:48 AM, Patrick Palka wrote:

On Fri, 29 Jan 2021, Jason Merrill wrote:


On 1/29/21 12:28 PM, Patrick Palka wrote:

In this testcase we're crashing during constexpr evaluation of the
ARRAY_REF b[0] as part of folding the lambda's by-copy capture of b
(which is encoded as a VEC_INIT_EXPR).  Since A's default constructor
is not yet defined, b's initializer is not actually constant, but
because A is an empty type, evaluation of the referent b from
cxx_eval_array_ref yields an empty CONSTRUCTOR.  From there we proceed
to {}-initialize the missing array element at index 0.  We crash from
verify_ctor_sanity during evaluation of this initializer because we
updated constexpr_ctx::ctor without updating ::object; the former has
type A[3] and the latter is the target of the TARGET_EXPR for b[0][0]
created from cxx_eval_vec_init_1 (and so has type A).

This patch conservatively fixes this issue by clearing new_ctx.object at
the same time that we set new_ctx.ctor.  Strictly speaking, the object
under construction should perhaps be the ARRAY_REF itself


Yes.


but I haven't
been able to come up with a testcase for which this difference matters.


I suspect that any case where it would matter wouldn't get to this point,
because we would have already built up the initializer under digest_init.

But I also think it shouldn't hurt to use 't' as new_ctx.object, so let's do
that.


Sounds good.  I had convinced myself earlier that using 't' wouldn't be
quite right and that we ought to use something like
cxx_eval_array_reference(t, /*lval=*/true) as new_ctx.object instead in
order to ensure the operands of the ARRAY_REF are fully reduced.  But it
seems this isn't necessary since the routines which inspect ctx->object
already pass it through cxx_eval_constant_expression appropriately.  So
using 't', even when it's unreduced, should be safe in this respect.

Does the following look OK for trunk/10?


OK.


-- >8 --

Subject: [PATCH] c++: Fix ICE from verify_ctor_sanity [PR98295]

In this testcase we're crashing during constexpr evaluation of the
ARRAY_REF b[0] as part of evaluation of the lambda's by-copy capture of b
(which is encoded as a VEC_INIT_EXPR).  Since A's constexpr default
constructor is not yet defined, b's initialization is not actually
constant, but because A is an empty type, evaluation of the referent b
from cxx_eval_array_ref is successful and yields an empty CONSTRUCTOR.
Since the CONSTRUCTOR is empty, we {}-initialize the desired array
element, and we end up crashing from verify_ctor_sanity during
evaluation of this initializer because we updated new_ctx.ctor without
updating new_ctx.object: the former now has type A[3] and the latter is
still the target of a TARGET_EXPR for b[0][0] created from
cxx_eval_vec_init_1 (and so has type A).

This patch fixes this by setting new_ctx.object appropriately at the
same time that we set new_ctx.ctor from cxx_eval_array_reference.

gcc/cp/ChangeLog:

PR c++/98295
* constexpr.c (cxx_eval_array_reference): Also set
new_ctx.object when setting new_ctx.ctor.

gcc/testsuite/ChangeLog:

PR c++/98295
* g++.dg/cpp0x/constexpr-98295.C: New test.
---
  gcc/cp/constexpr.c   |  1 +
  gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C | 11 +++
  2 files changed, 12 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index baa97a0ef17..1dbc2db9643 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3760,6 +3760,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
tree empty_ctor = build_constructor (init_list_type_node, NULL);
val = digest_init (elem_type, empty_ctor, tf_warning_or_error);
new_ctx = *ctx;
+  new_ctx.object = t;
new_ctx.ctor = build_constructor (elem_type, NULL);
ctx = &new_ctx;
  }
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C
new file mode 100644
index 000..930bd5a67da
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C
@@ -0,0 +1,11 @@
+// PR c++/98295
+// { dg-do compile { target c++11 } }
+
+struct A { constexpr A(); };
+
+void f() {
+  A b[2][3];
+  [b] { };
+}
+
+constexpr A::A() {}





Re: [PATCH v2] c++: Improve sorry for __builtin_has_attribute [PR98355]

2021-02-01 Thread Jason Merrill via Gcc-patches

On 2/1/21 10:07 AM, Marek Polacek wrote:
> On Fri, Jan 29, 2021 at 10:31:15PM -0500, Jason Merrill via Gcc-patches wrote:
> > On 1/29/21 6:28 PM, Marek Polacek wrote:
> > > On Fri, Jan 29, 2021 at 06:18:35PM -0500, Jason Merrill via Gcc-patches wrote:
> > > > On 1/29/21 5:52 PM, Marek Polacek wrote:
> > > > > On Fri, Jan 29, 2021 at 04:23:14PM -0500, Marek Polacek via Gcc-patches wrote:
> > > > > > On Fri, Jan 29, 2021 at 04:02:51PM -0500, Marek Polacek via Gcc-patches wrote:
> > > > > > > __builtin_has_attribute doesn't work in templates yet (bug 92104), so
> > > > > > > in r11-471 I added a sorry.  But that only caught type-dependent
> > > > > > > expressions and we also want to sorry on value-dependent expressions.
> > > > > > > This patch uses v_d_e_p rather than uses_template_parms because u_t_p
> > > > > > > sets p_t_d and then v_d_e_p considers variables with reference types
> > > > > > > value-dependent, which breaks builtin-has-attribute-6.c.
> > > > > > >
> > > > > > > This is a regression and I also plan to apply this to gcc-10.
> > > > > > >
> > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?
> > > > > > >
> > > > > > > gcc/cp/ChangeLog:
> > > > > > >
> > > > > > > PR c++/98355
> > > > > > > * parser.c (cp_parser_has_attribute_expression): Use
> > > > > > > value_dependent_expression_p instead of type_dependent_expression_p.
> > > > > > >
> > > > > > > gcc/testsuite/ChangeLog:
> > > > > > >
> > > > > > > PR c++/98355
> > > > > > > * g++.dg/ext/builtin-has-attribute2.C: New test.
> > > > > > > ---
> > > > > > >  gcc/cp/parser.c   | 2 +-
> > > > > > >  gcc/testsuite/g++.dg/ext/builtin-has-attribute2.C | 8
> > > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > > >  create mode 100644 gcc/testsuite/g++.dg/ext/builtin-has-attribute2.C
> > > > > > >
> > > > > > > diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> > > > > > > index 5c1d880c9fc..7b1dc0dc93f 100644
> > > > > > > --- a/gcc/cp/parser.c
> > > > > > > +++ b/gcc/cp/parser.c
> > > > > > > @@ -8934,7 +8934,7 @@ cp_parser_has_attribute_expression (cp_parser *parser)
> > > > > > > {
> > > > > > >   if (oper == error_mark_node)
> > > > > > > /* Nothing.  */;
> > > > > > > -  else if (type_dependent_expression_p (oper))
> > > > > > > +  else if (value_dependent_expression_p (oper))
> > > > > > > sorry_at (atloc, "%<__builtin_has_attribute%> with dependent argument "
> > > > > > >   "not supported yet");
> > > > > >
> > > > > > Actually I don't like this.  I think we want
> > > > > > processing_template_decl && uses_template_parms ()
> > > > > >
> > > > > > here.
> > > > >
> > > > > So here's v2.  Sorry for the self-review.
> > > > >
> > > > > -- >8 --
> > > > > __builtin_has_attribute doesn't work in templates yet (bug 92104), so
> > > > > in r11-471 I added a sorry.  But that only caught type-dependent
> > > > > expressions and we also want to sorry on value-dependent expressions.
> > > > > This patch uses uses_template_parms, but guarded with p_t_d, because
> > > > > u_t_p sets p_t_d and then v_d_e_p considers variables with reference
> > > > > types value-dependent, which breaks builtin-has-attribute-6.c.
> > > >
> > > > Maybe instantiation_dependent_expression_p?
> > >
> > > That didn't work here because "oper" can be a type, e.g. integer_type,
> > > and then potential_constant_expression_1 called from i_d_e_p crashes
> > > because it doesn't expect to see a TYPE_P.
> >
> > Hmm, if we were calling type_dependent_expression_p with a type argument,
> > that was a bug, and we probably should have aborted; please add such an
> > assert.  But then your patch fixes the bug here, so the patch is OK with the
> > assert added.
>
> Thanks.  Unfortunately adding
>
>   gcc_assert (!TYPE_P (expression));
>
> to type_dependent_expression_p causes some amount of ICEs in the testsuite.
>
> Can I go ahead with my patch without the assert and deal with the assert
> later?

Yes.

Jason



Re: [PATCH v2] c++: Improve sorry for __builtin_has_attribute [PR98355]

2021-02-01 Thread Marek Polacek via Gcc-patches
On Fri, Jan 29, 2021 at 10:31:15PM -0500, Jason Merrill via Gcc-patches wrote:
> On 1/29/21 6:28 PM, Marek Polacek wrote:
> > On Fri, Jan 29, 2021 at 06:18:35PM -0500, Jason Merrill via Gcc-patches 
> > wrote:
> > > On 1/29/21 5:52 PM, Marek Polacek wrote:
> > > > On Fri, Jan 29, 2021 at 04:23:14PM -0500, Marek Polacek via Gcc-patches 
> > > > wrote:
> > > > > On Fri, Jan 29, 2021 at 04:02:51PM -0500, Marek Polacek via 
> > > > > Gcc-patches wrote:
> > > > > > __builtin_has_attribute doesn't work in templates yet (bug 92104), 
> > > > > > so
> > > > > > in r11-471 I added a sorry.  But that only caught type-dependent
> > > > > > expressions and we also want to sorry on value-dependent 
> > > > > > expressions.
> > > > > > This patch uses v_d_e_p rather than uses_template_parms because 
> > > > > > u_t_p
> > > > > > sets p_t_d and then v_d_e_p considers variables with reference types
> > > > > > value-dependent, which breaks builtin-has-attribute-6.c.
> > > > > > 
> > > > > > This is a regression and I also plan to apply this to gcc-10.
> > > > > > 
> > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?
> > > > > > 
> > > > > > gcc/cp/ChangeLog:
> > > > > > 
> > > > > > PR c++/98355
> > > > > > * parser.c (cp_parser_has_attribute_expression): Use
> > > > > > value_dependent_expression_p instead of 
> > > > > > type_dependent_expression_p.
> > > > > > 
> > > > > > gcc/testsuite/ChangeLog:
> > > > > > 
> > > > > > PR c++/98355
> > > > > > * g++.dg/ext/builtin-has-attribute2.C: New test.
> > > > > > ---
> > > > > >gcc/cp/parser.c   | 2 +-
> > > > > >gcc/testsuite/g++.dg/ext/builtin-has-attribute2.C | 8 
> > > > > >2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > >create mode 100644 
> > > > > > gcc/testsuite/g++.dg/ext/builtin-has-attribute2.C
> > > > > > 
> > > > > > diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> > > > > > index 5c1d880c9fc..7b1dc0dc93f 100644
> > > > > > --- a/gcc/cp/parser.c
> > > > > > +++ b/gcc/cp/parser.c
> > > > > > @@ -8934,7 +8934,7 @@ cp_parser_has_attribute_expression (cp_parser 
> > > > > > *parser)
> > > > > >{
> > > > > >  if (oper == error_mark_node)
> > > > > > /* Nothing.  */;
> > > > > > -  else if (type_dependent_expression_p (oper))
> > > > > > +  else if (value_dependent_expression_p (oper))
> > > > > > sorry_at (atloc, "%<__builtin_has_attribute%> with dependent 
> > > > > > argument "
> > > > > >   "not supported yet");
> > > > > 
> > > > > Actually I don't like this.  I think we want
> > > > > processing_template_decl && uses_template_parms ()
> > > > > 
> > > > > here.
> > > > 
> > > > So here's v2.  Sorry for the self-review.
> > > > 
> > > > -- >8 --
> > > > __builtin_has_attribute doesn't work in templates yet (bug 92104), so
> > > > in r11-471 I added a sorry.  But that only caught type-dependent
> > > > expressions and we also want to sorry on value-dependent expressions.
> > > > This patch uses uses_template_parms, but guarded with p_t_d, because
> > > > u_t_p sets p_t_d and then v_d_e_p considers variables with reference
> > > > types value-dependent, which breaks builtin-has-attribute-6.c.
> > > 
> > > Maybe instantiation_dependent_expression_p?
> > 
> > That didn't work here because "oper" can be a type, e.g. integer_type,
> > and then potential_constant_expression_1 called from i_d_e_p crashes
> > because it doesn't expect to see a TYPE_P.
> 
> Hmm, if we were calling type_dependent_expression_p with a type argument,
> that was a bug, and we probably should have aborted; please add such an
> assert.  But then your patch fixes the bug here, so the patch is OK with the
> assert added.

Thanks.  Unfortunately adding 

  gcc_assert (!TYPE_P (expression));

to type_dependent_expression_p causes some amount of ICEs in the testsuite.

Can I go ahead with my patch without the assert and deal with the assert
later?

Marek



[PATCH] Fix statistic accounting for auto_vec and auto_bitmap

2021-02-01 Thread Richard Biener
This fixes accounting issues with using auto_vec and auto_bitmap
for -fmem-report.

Bootstrap running on x86_64-unknown-linux-gnu, with and without
--enable-gather-detailed-mem-stats

2021-02-01  Richard Biener  

* vec.h (auto_vec::auto_vec): Add memory stat parameters
and pass them on.
* bitmap.h (auto_bitmap::auto_bitmap): Likewise.
---
 gcc/bitmap.h | 6 --
 gcc/vec.h| 6 +++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/bitmap.h b/gcc/bitmap.h
index f9d6f4a39db..84632af7009 100644
--- a/gcc/bitmap.h
+++ b/gcc/bitmap.h
@@ -939,8 +939,10 @@ bmp_iter_and_compl (bitmap_iterator *bi, unsigned *bit_no)
 class auto_bitmap
 {
  public:
-  auto_bitmap () { bitmap_initialize (&m_bits, &bitmap_default_obstack); }
-  explicit auto_bitmap (bitmap_obstack *o) { bitmap_initialize (&m_bits, o); }
+  auto_bitmap (ALONE_CXX_MEM_STAT_INFO)
+    { bitmap_initialize (&m_bits, &bitmap_default_obstack PASS_MEM_STAT); }
+  explicit auto_bitmap (bitmap_obstack *o CXX_MEM_STAT_INFO)
+    { bitmap_initialize (&m_bits, o PASS_MEM_STAT); }
   ~auto_bitmap () { bitmap_clear (&m_bits); }
   // Allow calling bitmap functions on our bitmap.
   operator bitmap () { return &m_bits; }
diff --git a/gcc/vec.h b/gcc/vec.h
index bc32299779e..24df2db0eeb 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -1519,11 +1519,11 @@ public:
     this->m_vec = &m_auto;
   }
 
-  auto_vec (size_t s)
+  auto_vec (size_t s CXX_MEM_STAT_INFO)
   {
 if (s > N)
   {
-   this->create (s);
+   this->create (s PASS_MEM_STAT);
return;
   }
 
@@ -1548,7 +1548,7 @@ class auto_vec<T, 0> : public vec<T, va_heap>
 {
 public:
   auto_vec () { this->m_vec = NULL; }
-  auto_vec (size_t n) { this->create (n); }
+  auto_vec (size_t n CXX_MEM_STAT_INFO) { this->create (n PASS_MEM_STAT); }
   ~auto_vec () { this->release (); }
 
   auto_vec (vec<T, va_heap>&& r)
-- 
2.26.2
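
For readers unfamiliar with the mechanism, here is a minimal sketch of why forwarding the location parameters matters for the accounting. It is not GCC's code: alloc_site, recorded_sites, record_alloc, untracked_vec and tracked_vec are hypothetical names, and the sketch assumes the macros boil down to defaulted parameters built on __builtin_FILE and __builtin_LINE, which GCC and recent Clang evaluate at the call site.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record of one allocation request; GCC's real descriptors
// carry more information than this.
struct alloc_site
{
  std::string file;
  int line;
};

// Hypothetical global log standing in for the -fmem-report tables.
static std::vector<alloc_site> recorded_sites;

// Plays the role of vec::create: the defaulted parameters capture the
// location of whoever calls this function.
static void
record_alloc (std::size_t /* n */, const char *file = __builtin_FILE (),
              int line = __builtin_LINE ())
{
  recorded_sites.push_back (alloc_site { file, line });
}

// No forwarding: record_alloc sees the constructor body as its caller,
// so every user of this class is attributed to the same line.
struct untracked_vec
{
  explicit untracked_vec (std::size_t n) { record_alloc (n); }
};

// Forwarding: the constructor takes the same defaulted parameters and
// passes them on, so the user of the class is recorded instead.
struct tracked_vec
{
  explicit tracked_vec (std::size_t n, const char *file = __builtin_FILE (),
                        int line = __builtin_LINE ())
  { record_alloc (n, file, line); }
};

// Usage sketch: two users on different lines are recorded separately.
static void
demo ()
{
  recorded_sites.clear ();
  tracked_vec a (8); int line_a = __LINE__;
  tracked_vec b (8); int line_b = __LINE__;
  assert (recorded_sites[0].line == line_a);
  assert (recorded_sites[1].line == line_b);
}
```

With untracked_vec, both constructions would be recorded against the single record_alloc call inside the constructor, which is the kind of misattribution the patch description refers to.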


[pushed] c++: alias in qualified-id in template arg [PR98570]

2021-02-01 Thread Jason Merrill via Gcc-patches
template_args_equal has handled dependent alias specializations for a while,
but in this testcase the actual template argument is a SCOPE_REF, so we
called cp_tree_equal, which doesn't handle aliases specially when we get to
them.

This patch generalizes this by setting a flag so structural_comptypes will
check for template alias equivalence (if we aren't doing partial ordering).
The existing flag, comparing_specializations, was too broad; in particular,
when we're doing decls_match, we want to treat corresponding parameters as
equivalent, so we need to separate that from alias comparison.  So I
introduce the comparing_dependent_aliases flag.

From looking at other uses of comparing_specializations, it seems to me that
the new flag is what modules wants, as well.  Nathan, does this look right to
you?

The other use of comparing_specializations in structural_comptypes is a hack
to deal with spec_hasher::equal not calling push_to_top_level, which we
also don't want to tie to the alias comparison semantics.

This patch also changes how we get to structural comparison of aliases from
checking TYPE_CANONICAL in comptypes to marking the aliases as getting
structural comparison when they are built, which is more consistent with how
e.g. typename is handled.

As I mention in the comment for comparing_dependent_aliases, I think the
default should be to treat different dependent aliases for the same type as
distinct, only treating them as equal during deduction (particularly partial
ordering).  But that's a matter for the C++ committee, to explore in stage 1.
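
As background for why a dependent alias specialization cannot simply be replaced by the type it names, here is a small sketch. It is not the PR98570 testcase (which uses a qualified-id as the template argument); void_t_sketch, has_member_type, with_type and without_type are hypothetical names, and it assumes a compiler that implements the CWG 1558 resolution.

```cpp
#include <type_traits>

// Hypothetical alias template: every specialization names 'void', but a
// dependent specialization must still remember its template arguments.
template <typename...> using void_t_sketch = void;

// Primary template: assume there is no nested 'type'.
template <typename T, typename = void>
struct has_member_type : std::false_type {};

// If void_t_sketch<typename T::type> were treated as identical to plain
// 'void' while T is still dependent, the substitution failure on
// T::type would be lost and this specialization would always match.
template <typename T>
struct has_member_type<T, void_t_sketch<typename T::type>> : std::true_type {};

struct with_type { using type = int; };
struct without_type {};

static_assert (has_member_type<with_type>::value, "nested type detected");
static_assert (!has_member_type<without_type>::value, "no nested type");
```

The two static_asserts only hold because the dependent alias keeps its arguments until instantiation.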

gcc/cp/ChangeLog:

PR c++/98570
* cp-tree.h: Declare it.
* pt.c (comparing_dependent_aliases): New flag.
(template_args_equal, spec_hasher::equal): Set it.
(dependent_alias_template_spec_p): Assert that we don't
get non-types other than error_mark_node.
(instantiate_alias_template): SET_TYPE_STRUCTURAL_EQUALITY
on complex alias specializations.  Set TYPE_DEPENDENT_P here.
(tsubst_decl): Not here.
* module.cc (module_state::read_cluster): Set
comparing_dependent_aliases instead of
comparing_specializations.
* tree.c (cp_tree_equal): Remove comparing_specializations
module handling.
* typeck.c (structural_comptypes): Adjust.
(comptypes): Remove comparing_specializations handling.

gcc/testsuite/ChangeLog:

PR c++/98570
* g++.dg/cpp0x/alias-decl-targ1.C: New test.
---
 gcc/cp/cp-tree.h  |  9 ++--
 gcc/cp/module.cc  |  4 +-
 gcc/cp/pt.c   | 52 ---
 gcc/cp/tree.c |  7 +--
 gcc/cp/typeck.c   |  9 ++--
 gcc/testsuite/g++.dg/cpp0x/alias-decl-targ1.C |  9 
 6 files changed, 53 insertions(+), 37 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-targ1.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index f31319904eb..aed85d79287 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5449,11 +5449,14 @@ extern GTY(()) tree integer_two_node;
function, two inside the body of a function in a local class, etc.)  */
 extern int function_depth;
 
-/* Nonzero if we are inside eq_specializations, which affects
-   comparison of PARM_DECLs in cp_tree_equal and alias specializations
-   in structrual_comptypes.  */
+/* Nonzero if we are inside spec_hasher::equal, which affects
+   comparison of PARM_DECLs in cp_tree_equal.  */
 extern int comparing_specializations;
 
+/* Nonzero if we want different dependent aliases to compare as unequal.
+   FIXME we should always do this except during deduction/ordering.  */
+extern int comparing_dependent_aliases;
+
 /* When comparing specializations permit context _FROM to match _TO.  */
 extern tree map_context_from;
 extern tree map_context_to;
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 2d761452505..41ce2011525 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -14801,7 +14801,7 @@ module_state::read_cluster (unsigned snum)
   dump.indent ();
 
   /* We care about structural equality.  */
-  comparing_specializations++;
+  comparing_dependent_aliases++;
 
   /* First seed the imports.  */
   while (tree import = sec.tree_node ())
@@ -14976,7 +14976,7 @@ module_state::read_cluster (unsigned snum)
 #undef cfun
   cfun = old_cfun;
   current_function_decl = old_cfd;
-  comparing_specializations--;
+  comparing_dependent_aliases--;
 
   dump.outdent ();
   dump () && dump ("Read section:%u", snum);
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 9089afb6ae8..db0ff73bdeb 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -1709,6 +1709,7 @@ register_specialization (tree spec, tree tmpl, tree args, bool is_friend,
 
 /* Restricts tree and type comparisons.  */
 int comparing_specializations;
+int comparing_dependent_aliases;
 
 /* Returns true iff two spec_entry nodes are equivalent.  */
 
@@ -1718,6 

[committed] c++: Add testcase for PR84494

2021-02-01 Thread Patrick Palka via Gcc-patches
We correctly accept this testcase ever since r10-5143.

gcc/testsuite/ChangeLog:

PR c++/84494
* g++.dg/cpp1y/constexpr-84494.C: New test.
---
 gcc/testsuite/g++.dg/cpp1y/constexpr-84494.C | 11 +++
 1 file changed, 11 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-84494.C

diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-84494.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-84494.C
new file mode 100644
index 000..762cfb4d396
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-84494.C
@@ -0,0 +1,11 @@
+// PR c++/84494
+// { dg-do compile { target c++14 } }
+
+struct X {
+  constexpr X() = default;
+  constexpr X(int x) : m_value(x) {}
+  constexpr X& operator=(const X &) = default;
+  int m_value {};
+};
+
+static_assert((X() = X(10)).m_value == 10, "");
-- 
2.30.0.335.ge6362826a0



RE: [PATCH]AArch64 Change canonization of smlal and smlsl in order to be able to optimize the vec_dup

2021-02-01 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Tamar Christina 
> Sent: 01 February 2021 12:39
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> Marcus Shawcroft ; Kyrylo Tkachov
> ; Richard Sandiford
> 
> Subject: [PATCH]AArch64 Change canonization of smlal and smlsl in order to
> be able to optimize the vec_dup
> 
> Hi All,
> 
> g:87301e3956d44ad45e384a8eb16c79029d20213a and
> g:ee4c4fe289e768d3c6b6651c8bfa3fdf458934f4 changed the intrinsics to be
> proper RTL but accidentally ended up creating a regression because of the
> ordering in the RTL pattern.
> 
> The existing RTL that combine should try to match to remove the vec_dup is
> aarch64_vec_mlal_lane and
> aarch64_vec_mult_lane which
> expects the select register to be the second operand of mult.
> 
> The pattern introduced has it as the first operand so combine was unable to
> remove the vec_dup.  This flips the order such that the patterns optimize
> correctly.
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> 
> Ok for master?

Ok. I wonder how many of these unfortunate occurrences we have in the backend...
Thanks,
Kyrill

> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-simd.md (aarch64_mlal_n,
>   aarch64_mlsl, aarch64_mlsl_n): Flip mult
> operands.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/aarch64/advsimd-intrinsics/smlal-smlsl-mull-
> optimized.c: New test.
> 
> --- inline copy of patch --
> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> index
> bca2d8a3437fdcee77c7c357663c78c418b32a88..d1858663a4e78c0861d902
> b37e93c0b00d75e661 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1950,10 +1950,10 @@ (define_insn "aarch64_mlal_n"
>  (plus:
>(mult:
>  (ANY_EXTEND:
> -  (vec_duplicate:VD_HSI
> -   (match_operand: 3 "register_operand" "")))
> +  (match_operand:VD_HSI 2 "register_operand" "w"))
>  (ANY_EXTEND:
> -  (match_operand:VD_HSI 2 "register_operand" "w")))
> +  (vec_duplicate:VD_HSI
> +   (match_operand: 3 "register_operand" ""
>(match_operand: 1 "register_operand" "0")))]
>"TARGET_SIMD"
>"mlal\t%0., %2., %3.[0]"
> @@ -1980,10 +1980,10 @@ (define_insn "aarch64_mlsl_n"
>(match_operand: 1 "register_operand" "0")
>(mult:
>  (ANY_EXTEND:
> -  (vec_duplicate:VD_HSI
> -   (match_operand: 3 "register_operand" "")))
> +  (match_operand:VD_HSI 2 "register_operand" "w"))
>  (ANY_EXTEND:
> -  (match_operand:VD_HSI 2 "register_operand" "w")]
> +  (vec_duplicate:VD_HSI
> +   (match_operand: 3 "register_operand" ""))]
>"TARGET_SIMD"
>"mlsl\t%0., %2., %3.[0]"
>[(set_attr "type" "neon_mla__long")]
> @@ -2078,10 +2078,10 @@ (define_insn "aarch64_mull_n"
>[(set (match_operand: 0 "register_operand" "=w")
>  (mult:
>(ANY_EXTEND:
> -(vec_duplicate:
> -   (match_operand: 2 "register_operand" "")))
> +(match_operand:VD_HSI 1 "register_operand" "w"))
>(ANY_EXTEND:
> -(match_operand:VD_HSI 1 "register_operand" "w"]
> +(vec_duplicate:
> +   (match_operand: 2 "register_operand" "")]
>"TARGET_SIMD"
>"mull\t%0., %1., %2.[0]"
>[(set_attr "type" "neon_mul__scalar_long")]
> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/smlal-smlsl-
> mull-optimized.c b/gcc/testsuite/gcc.target/aarch64/advsimd-
> intrinsics/smlal-smlsl-mull-optimized.c
> new file mode 100644
> index
> ..1e963e5002e666e32e12b
> 2eef965b206c7344015
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/smlal-smlsl-mull-
> optimized.c
> @@ -0,0 +1,45 @@
> +/* { dg-do compile { target aarch64-*-* } } */
> +
> +#include <arm_neon.h>
> +
> +/*
> +**add:
> +** smlal   v0.4s, v1.4h, v2.h[3]
> +** ret
> +*/
> +
> +int32x4_t add(int32x4_t acc, int16x4_t b, int16x4_t c) {
> +return vmlal_n_s16(acc, b, c[3]);
> +}
> +
> +/*
> +**sub:
> +** smlsl   v0.4s, v1.4h, v2.h[3]
> +** ret
> +*/
> +
> +int32x4_t sub(int32x4_t acc, int16x4_t b, int16x4_t c) {
> +return vmlsl_n_s16(acc, b, c[3]);
> +}
> +
> +/*
> +**smull:
> +** smull   v0.4s, v1.4h, v2.h[3]
> +** ret
> +*/
> +
> +int32x4_t smull(int16x4_t b, int16x4_t c) {
> +return vmull_n_s16(b, c[3]);
> +}
> +
> +/*
> +**umull:
> +** umull   v0.4s, v1.4h, v2.h[3]
> +** ret
> +*/
> +
> +uint32x4_t umull(uint16x4_t b, uint16x4_t c) {
> +return vmull_n_u16(b, c[3]);
> +}
> +
> +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" {-O[^0]} } } */
> 
> 
> --


RE: [PATCH] testsuite: aarch64: Add tests for vmlXl_high intrinsics

2021-02-01 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Jonathan Wright 
> Sent: 01 February 2021 12:35
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH] testsuite: aarch64: Add tests for vmlXl_high intrinsics
> 
> Hi,
> 
> As subject, this patch adds tests for vmlal_high_* and vmlsl_high_*
> Neon intrinsics. Since these intrinsics are only supported for AArch64,
> these tests are restricted to only run on AArch64 targets.
> 
> Ok for master?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Jonathan
> 
> ---
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-01-31  Jonathan Wright  
> 
> * gcc.target/aarch64/advsimd-intrinsics/vmlXl_high.inc:
> New test template.
> * gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_lane.inc:
> New test template.
> * gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_laneq.inc:
> New test template.
> * gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_n.inc:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vmlal_high.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vmlal_high_lane.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vmlal_high_laneq.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vmlal_high_n.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vmlsl_high.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_lane.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_laneq.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_n.c:
> New test.



Re: [PATCH] libstdc++: Don't use reserved identifiers in simd headers

2021-02-01 Thread Matthias Kretz
On Montag, 1. Februar 2021 13:21:33 CET Rainer Orth wrote:
> Two simd tests FAIL on Solaris, both SPARC and x86:
> 
> FAIL: experimental/simd/standard_abi_usable.cc -msse2 -O2 -Wno-psabi (test
> for excess errors) FAIL: experimental/simd/standard_abi_usable_2.cc -msse2
> -O2 -Wno-psabi (test for excess errors)
> 
> This happens because the simd headers use identifiers documented in the
> libstdc++ manual as reserved by system headers.

Sorry, this code was originally written as non-stdlib code, i.e. without any 
reserved identifiers. I had hoped I found all issues...

> Fixed as follows, tested on i386-pc-solaris2.11, sparc-sun-solaris2.11,
> and x86_64-pc-linux-gnu.
> 
> Ok for master?

Looks good to me.

> As an aside, the use of vim: markers initially confused the hell out of
> me.  As an Emacs user, I rarely use vi for much more than a pager, but
> when I wanted to check the lines mentioned in the g++ errors, I had no
> idea what was going on or how to disable the folding enabled there:
> 
> // vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
> 
> I can't help but feel that this is just a personal preference and
> doesn't belong into the upstream code.

Yes. I guess it's better to remove at least foldmethod. The rest isn't 
personal preference, but coding style requirements. However, I don't need any 
of it anymore: by now my vim config autodetects GCC / libstdc++ code. If the 
rest of libstdc++ doesn't have it, the simd headers probably shouldn't have it 
either.

Best,
  Matthias

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


RE: [PATCH] testsuite: aarch64: Add tests for vmull_high intrinsics

2021-02-01 Thread Kyrylo Tkachov via Gcc-patches
Ok.
Thanks for doing this.
Kyrill

From: Jonathan Wright  
Sent: 01 February 2021 11:45
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov 
Subject: Re: [PATCH] testsuite: aarch64: Add tests for vmull_high intrinsics

Woops, didn't attach the diff. Here we go.

Thanks,
Jonathan

From: Jonathan Wright
Sent: 01 February 2021 11:42
To: gcc-patches@gcc.gnu.org
Cc: Kyrylo Tkachov 
Subject: [PATCH] testsuite: aarch64: Add tests for vmull_high intrinsics 
 
Hi,

As subject, this patch adds tests for vmull_high_* Neon intrinsics. Since
these intrinsics are only supported for AArch64, these tests are
restricted to only run on AArch64 targets.

Ok for master?

Thanks,
Jonathan

---

gcc/testsuite/ChangeLog:

2021-01-29  Jonathan Wright  

    * gcc.target/aarch64/advsimd-intrinsics/vmull_high.c:
    New test.
    * gcc.target/aarch64/advsimd-intrinsics/vmull_high_lane.c:
    New test.
    * gcc.target/aarch64/advsimd-intrinsics/vmull_high_laneq.c:
    New test.
    * gcc.target/aarch64/advsimd-intrinsics/vmull_high_n.c:
    New test.


Re: [PATCH] c++: Fix ICE in verify_ctor_sanity [PR98295]

2021-02-01 Thread Patrick Palka via Gcc-patches
On Fri, 29 Jan 2021, Jason Merrill wrote:

> On 1/29/21 12:28 PM, Patrick Palka wrote:
> > In this testcase we're crashing during constexpr evaluation of the
> > ARRAY_REF b[0] as part of folding the lambda's by-copy capture of b
> > (which is encoded as a VEC_INIT_EXPR).  Since A's default constructor
> > is not yet defined, b's initializer is not actually constant, but
> > because A is an empty type, evaluation of the referent b from
> > cxx_eval_array_ref yields an empty CONSTRUCTOR.  From there we proceed
> > to {}-initialize the missing array element at index 0.  We crash from
> > verify_ctor_sanity during evaluation of this initializer because we
> > updated constexpr_ctx::ctor without updating ::object; the former has
> > type A[3] and the latter is the target of the TARGET_EXPR for b[0][0]
> > created from cxx_eval_vec_init_1 (and so has type A).
> > 
> > This patch conservatively fixes this issue by clearing new_ctx.object at
> > the same time that we set new_ctx.ctor.  Strictly speaking, the object
> > under construction should perhaps be the ARRAY_REF itself
> 
> Yes.
> 
> > but I haven't
> > been able to come up with a testcase for which this difference matters.
> 
> I suspect that any case where it would matter wouldn't get to this point,
> because we would have already built up the initializer under digest_init.
> 
> But I also think it shouldn't hurt to use 't' as new_ctx.object, so let's do
> that.

Sounds good.  I had convinced myself earlier that using 't' wouldn't be
quite right and that we ought to use something like
cxx_eval_array_reference(t, /*lval=*/true) as new_ctx.object instead in
order to ensure the operands of the ARRAY_REF are fully reduced.  But it
seems this isn't necessary since the routines which inspect ctx->object
already pass it through cxx_eval_constant_expression appropriately.  So
using 't', even when it's unreduced, should be safe in this respect.

Does the following look OK for trunk/10?

-- >8 --

Subject: [PATCH] c++: Fix ICE from verify_ctor_sanity [PR98295]

In this testcase we're crashing during constexpr evaluation of the
ARRAY_REF b[0] as part of evaluation of the lambda's by-copy capture of b
(which is encoded as a VEC_INIT_EXPR).  Since A's constexpr default
constructor is not yet defined, b's initialization is not actually
constant, but because A is an empty type, evaluation of the referent b
from cxx_eval_array_ref is successful and yields an empty CONSTRUCTOR.
Since the CONSTRUCTOR is empty, we {}-initialize the desired array
element, and we end up crashing from verify_ctor_sanity during
evaluation of this initializer because we updated new_ctx.ctor without
updating new_ctx.object: the former now has type A[3] and the latter is
still the target of a TARGET_EXPR for b[0][0] created from
cxx_eval_vec_init_1 (and so has type A).

This patch fixes this by setting new_ctx.object appropriately at the
same time that we set new_ctx.ctor from cxx_eval_array_reference.
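
The mismatch described above can be modelled with a toy sketch. This is not GCC's code: ctx_sketch, ctor_matches_object, descend_ctor_only and descend_both are hypothetical names, types are plain strings rather than trees, and the check is only assumed to resemble what verify_ctor_sanity enforces.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified stand-in for the two fields discussed
// above; the real constexpr_ctx holds trees, not strings.
struct ctx_sketch
{
  std::string object_type; // type of the object under construction ("" if none)
  std::string ctor_type;   // type of the CONSTRUCTOR being filled in
};

// Toy version of the sanity check: if an object is set, the constructor
// being built must have the same type.
static bool
ctor_matches_object (const ctx_sketch &ctx)
{
  return ctx.object_type.empty () || ctx.object_type == ctx.ctor_type;
}

// Pre-patch shape: only the constructor is replaced, so the object
// inherited from the outer context can disagree with it.
static ctx_sketch
descend_ctor_only (const ctx_sketch &ctx, const std::string &elem_type)
{
  ctx_sketch new_ctx = ctx;
  new_ctx.ctor_type = elem_type;
  return new_ctx;
}

// Patched shape: the object is replaced together with the constructor.
static ctx_sketch
descend_both (const ctx_sketch &ctx, const std::string &elem_type)
{
  ctx_sketch new_ctx = ctx;
  new_ctx.object_type = elem_type;
  new_ctx.ctor_type = elem_type;
  return new_ctx;
}

// Usage sketch with the types from the description: outer object of
// type A, missing array element of type A[3].
static void
demo ()
{
  ctx_sketch outer { "A", "A" };
  assert (!ctor_matches_object (descend_ctor_only (outer, "A[3]")));
  assert (ctor_matches_object (descend_both (outer, "A[3]")));
}
```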

gcc/cp/ChangeLog:

PR c++/98295
* constexpr.c (cxx_eval_array_reference): Also set
new_ctx.object when setting new_ctx.ctor.

gcc/testsuite/ChangeLog:

PR c++/98295
* g++.dg/cpp0x/constexpr-98295.C: New test.
---
 gcc/cp/constexpr.c   |  1 +
 gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C | 11 +++
 2 files changed, 12 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index baa97a0ef17..1dbc2db9643 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3760,6 +3760,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
   tree empty_ctor = build_constructor (init_list_type_node, NULL);
   val = digest_init (elem_type, empty_ctor, tf_warning_or_error);
   new_ctx = *ctx;
+  new_ctx.object = t;
   new_ctx.ctor = build_constructor (elem_type, NULL);
   ctx = &new_ctx;
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C
new file mode 100644
index 000..930bd5a67da
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-98295.C
@@ -0,0 +1,11 @@
+// PR c++/98295
+// { dg-do compile { target c++11 } }
+
+struct A { constexpr A(); };
+
+void f() {
+  A b[2][3];
+  [b] { };
+}
+
+constexpr A::A() {}
-- 
2.30.0.335.ge6362826a0



Re: c++: cross-module __cxa_atexit use [PR 98531]

2021-02-01 Thread Rainer Orth
Hi Nathan,

> As Rainer pointed out, there were some regressions in the library tests.
> That's because we didn't build the correct ehspec for __cxa_atexit. 
> This adds that, but also, I realize we can use the 'hidden' flag on
> pushdecl to make this lazy-builtin not visible to user name lookup 
> without them already declaring it.  And I realized we should be setting the
> location as BUILTIN_LOCATION because that's what we're doing. Finally the
> duplicate_decl processing needed augmenting to allow such a lazy builtin's
> eh spec to differ from one already known.  (The converse is already dealt
> with, as that's the only case we had before.)
>
> As I mentioned in the first patch, a cleanup of the 'lazily declare a
> builtin' API would be a nice stage-1 thing.
>
> I'll hold off applying this until next week.  Rainer, I don't yet know if
> this resolves all of the issues you encountered in the PR.

unfortunately, the results are not very encouraging: while the new
libstdc++ failures caused by the v1 patch are gone, the g++.dg/modules
ICEs remain unchanged, and on Solaris 11.3 (or with
-fno-use-cxa-atexit), the new tests ICE, too.  Full details in the PR.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH]AArch64 Change canonization of smlal and smlsl in order to be able to optimize the vec_dup

2021-02-01 Thread Tamar Christina via Gcc-patches
Hi All,

g:87301e3956d44ad45e384a8eb16c79029d20213a and
g:ee4c4fe289e768d3c6b6651c8bfa3fdf458934f4 changed the intrinsics to be
proper RTL but accidentally ended up creating a regression because of the
ordering in the RTL pattern.

The existing RTL that combine should try to match to remove the vec_dup is 
aarch64_vec_mlal_lane and aarch64_vec_mult_lane which
expects the select register to be the second operand of mult.

The pattern introduced has it as the first operand so combine was unable to
remove the vec_dup.  This flips the order such that the patterns optimize
correctly.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (aarch64_mlal_n,
aarch64_mlsl, aarch64_mlsl_n): Flip mult operands.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/advsimd-intrinsics/smlal-smlsl-mull-optimized.c: 
New test.

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bca2d8a3437fdcee77c7c357663c78c418b32a88..d1858663a4e78c0861d902b37e93c0b00d75e661 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1950,10 +1950,10 @@ (define_insn "aarch64_mlal_n"
 (plus:
   (mult:
 (ANY_EXTEND:
-  (vec_duplicate:VD_HSI
- (match_operand: 3 "register_operand" "")))
+  (match_operand:VD_HSI 2 "register_operand" "w"))
 (ANY_EXTEND:
-  (match_operand:VD_HSI 2 "register_operand" "w")))
+  (vec_duplicate:VD_HSI
+ (match_operand: 3 "register_operand" ""
   (match_operand: 1 "register_operand" "0")))]
   "TARGET_SIMD"
   "mlal\t%0., %2., %3.[0]"
@@ -1980,10 +1980,10 @@ (define_insn "aarch64_mlsl_n"
   (match_operand: 1 "register_operand" "0")
   (mult:
 (ANY_EXTEND:
-  (vec_duplicate:VD_HSI
- (match_operand: 3 "register_operand" "")))
+  (match_operand:VD_HSI 2 "register_operand" "w"))
 (ANY_EXTEND:
-  (match_operand:VD_HSI 2 "register_operand" "w")]
+  (vec_duplicate:VD_HSI
+ (match_operand: 3 "register_operand" ""))]
   "TARGET_SIMD"
   "mlsl\t%0., %2., %3.[0]"
   [(set_attr "type" "neon_mla__long")]
@@ -2078,10 +2078,10 @@ (define_insn "aarch64_mull_n"
   [(set (match_operand: 0 "register_operand" "=w")
 (mult:
   (ANY_EXTEND:
-(vec_duplicate:
- (match_operand: 2 "register_operand" "")))
+(match_operand:VD_HSI 1 "register_operand" "w"))
   (ANY_EXTEND:
-(match_operand:VD_HSI 1 "register_operand" "w"]
+(vec_duplicate:
+ (match_operand: 2 "register_operand" "")]
   "TARGET_SIMD"
   "mull\t%0., %1., %2.[0]"
   [(set_attr "type" "neon_mul__scalar_long")]
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/smlal-smlsl-mull-optimized.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/smlal-smlsl-mull-optimized.c
new file mode 100644
index ..1e963e5002e666e32e12b2eef965b206c7344015
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/smlal-smlsl-mull-optimized.c
@@ -0,0 +1,45 @@
+/* { dg-do compile { target aarch64-*-* } } */
+
+#include <arm_neon.h>
+
+/*
+**add:
+** smlal   v0.4s, v1.4h, v2.h[3]
+** ret
+*/
+
+int32x4_t add(int32x4_t acc, int16x4_t b, int16x4_t c) {
+return vmlal_n_s16(acc, b, c[3]);
+}
+
+/*
+**sub:
+** smlsl   v0.4s, v1.4h, v2.h[3]
+** ret
+*/
+
+int32x4_t sub(int32x4_t acc, int16x4_t b, int16x4_t c) {
+return vmlsl_n_s16(acc, b, c[3]);
+}
+
+/*
+**smull:
+** smull   v0.4s, v1.4h, v2.h[3]
+** ret
+*/
+
+int32x4_t smull(int16x4_t b, int16x4_t c) {
+return vmull_n_s16(b, c[3]);
+}
+
+/*
+**umull:
+** umull   v0.4s, v1.4h, v2.h[3]
+** ret
+*/
+
+uint32x4_t umull(uint16x4_t b, uint16x4_t c) {
+return vmull_n_u16(b, c[3]);
+}
+
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" {-O[^0]} } } */



[PATCH] testsuite: aarch64: Add tests for vmlXl_high intrinsics

2021-02-01 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch adds tests for vmlal_high_* and vmlsl_high_*
Neon intrinsics. Since these intrinsics are only supported for AArch64,
these tests are restricted to only run on AArch64 targets.
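
For readers who want to sanity-check the expected values by hand, the following is a minimal scalar sketch of what the signed 8-bit variants compute. It is not part of the patch: the function and type names are hypothetical, and the fill values 0x55 and 0xBB are simply the ones the test template passes to VDUP.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Illustrative scalar model; these names do not exist in the testsuite.
using WideSketch = std::array<std::int16_t, 8>;
using NarrowSketch = std::array<std::int8_t, 16>;

// sign = +1 models vmlal_high_s8, sign = -1 models vmlsl_high_s8.
// Only lanes 8..15 (the "high" half) of the narrow inputs take part.
inline WideSketch
mlxl_high_s8_model (const WideSketch &acc, const NarrowSketch &b,
		    const NarrowSketch &c, int sign)
{
  WideSketch res{};
  for (int i = 0; i < 8; ++i)
    {
      // Widen each high-half lane before multiplying.
      const int product = int (b[i + 8]) * int (c[i + 8]);
      // The accumulate wraps modulo 2^16, like the 16-bit result lanes.
      res[i] = static_cast<std::int16_t> (
	static_cast<std::uint16_t> (acc[i] + sign * product));
    }
  return res;
}
```

With every accumulator lane set to -16, b filled with 0x55 and c filled with 0xBB (which is -69 as a signed byte), the model gives -5881 per lane for the accumulate form and 5849 for the subtract form.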

Ok for master?

Thanks,
Jonathan

---

gcc/testsuite/ChangeLog:

2021-01-31  Jonathan Wright  

* gcc.target/aarch64/advsimd-intrinsics/vmlXl_high.inc:
New test template.
* gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_lane.inc:
New test template.
* gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_laneq.inc:
New test template.
* gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_n.inc:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlal_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlal_high_lane.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlal_high_laneq.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlal_high_n.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlsl_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_lane.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_laneq.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmlsl_high_n.c:
New test.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlXl_high.inc b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlXl_high.inc
new file mode 100644
index ..7c9ee26b142669c48d27aca6bd11988e948cf52d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlXl_high.inc
@@ -0,0 +1,89 @@
+#define FNNAME1(NAME) exec_ ## NAME
+#define FNNAME(NAME) FNNAME1(NAME)
+
+void FNNAME (INSN_NAME) (void)
+{
+  /* vector_res = OP(vector, vector3, vector4),
+ then store the result.  */
+#define TEST_VMLXL_HIGH1(INSN, T1, T2, W1, W2, N1, N2)			   \
+  VECT_VAR(vector_res, T1, W1, N1) =	   \
+INSN##_##T2##W2(VECT_VAR(vector, T1, W1, N1),			   \
+VECT_VAR(vector3, T1, W2, N2),			   \
+VECT_VAR(vector4, T1, W2, N2));			   \
+  vst1q_##T2##W1(VECT_VAR(result, T1, W1, N1), VECT_VAR(vector_res, T1, W1, N1))
+
+#define TEST_VMLXL_HIGH(INSN, T1, T2, W1, W2, N1, N2)			   \
+  TEST_VMLXL_HIGH1(INSN, T1, T2, W1, W2, N1, N2)
+
+  DECL_VARIABLE(vector, int, 16, 8);
+  DECL_VARIABLE(vector3, int, 8, 16);
+  DECL_VARIABLE(vector4, int, 8, 16);
+  DECL_VARIABLE(vector_res, int, 16, 8);
+
+  DECL_VARIABLE(vector, int, 32, 4);
+  DECL_VARIABLE(vector3, int, 16, 8);
+  DECL_VARIABLE(vector4, int, 16, 8);
+  DECL_VARIABLE(vector_res, int, 32, 4);
+
+  DECL_VARIABLE(vector, int, 64, 2);
+  DECL_VARIABLE(vector3, int, 32, 4);
+  DECL_VARIABLE(vector4, int, 32, 4);
+  DECL_VARIABLE(vector_res, int, 64, 2);
+
+  DECL_VARIABLE(vector, uint, 16, 8);
+  DECL_VARIABLE(vector3, uint, 8, 16);
+  DECL_VARIABLE(vector4, uint, 8, 16);
+  DECL_VARIABLE(vector_res, uint, 16, 8);
+
+  DECL_VARIABLE(vector, uint, 32, 4);
+  DECL_VARIABLE(vector3, uint, 16, 8);
+  DECL_VARIABLE(vector4, uint, 16, 8);
+  DECL_VARIABLE(vector_res, uint, 32, 4);
+
+  DECL_VARIABLE(vector, uint, 64, 2);
+  DECL_VARIABLE(vector3, uint, 32, 4);
+  DECL_VARIABLE(vector4, uint, 32, 4);
+  DECL_VARIABLE(vector_res, uint, 64, 2);
+
+  clean_results ();
+
+  VLOAD(vector, buffer, q, int, s, 16, 8);
+  VLOAD(vector, buffer, q, int, s, 32, 4);
+  VLOAD(vector, buffer, q, int, s, 64, 2);
+  VLOAD(vector, buffer, q, uint, u, 16, 8);
+  VLOAD(vector, buffer, q, uint, u, 32, 4);
+  VLOAD(vector, buffer, q, uint, u, 64, 2);
+
+  VDUP(vector3, q, int, s, 8, 16, 0x55);
+  VDUP(vector4, q, int, s, 8, 16, 0xBB);
+  VDUP(vector3, q, int, s, 16, 8, 0x55);
+  VDUP(vector4, q, int, s, 16, 8, 0xBB);
+  VDUP(vector3, q, int, s, 32, 4, 0x55);
+  VDUP(vector4, q, int, s, 32, 4, 0xBB);
+  VDUP(vector3, q, uint, u, 8, 16, 0x55);
+  VDUP(vector4, q, uint, u, 8, 16, 0xBB);
+  VDUP(vector3, q, uint, u, 16, 8, 0x55);
+  VDUP(vector4, q, uint, u, 16, 8, 0xBB);
+  VDUP(vector3, q, uint, u, 32, 4, 0x55);
+  VDUP(vector4, q, uint, u, 32, 4, 0xBB);
+
+  TEST_VMLXL_HIGH(INSN_NAME, int, s, 16, 8, 8, 16);
+  TEST_VMLXL_HIGH(INSN_NAME, int, s, 32, 16, 4, 8);
+  TEST_VMLXL_HIGH(INSN_NAME, int, s, 64, 32, 2, 4);
+  TEST_VMLXL_HIGH(INSN_NAME, uint, u, 16, 8, 8, 16);
+  TEST_VMLXL_HIGH(INSN_NAME, uint, u, 32, 16, 4, 8);
+  TEST_VMLXL_HIGH(INSN_NAME, uint, u, 64, 32, 2, 4);
+
+  CHECK(TEST_MSG, int, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, int, 32, 4, PRIx32, expected, "");
+  CHECK(TEST_MSG, int, 64, 2, PRIx64, expected, "");
+  CHECK(TEST_MSG, uint, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, uint, 32, 4, PRIx32, expected, "");
+  CHECK(TEST_MSG, uint, 64, 2, PRIx64, expected, "");
+}
+
+int main (void)
+{
+  FNNAME (INSN_NAME) ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlXl_high_lane.inc 

[PATCH] libstdc++: Don't use reserved identifiers in simd headers

2021-02-01 Thread Rainer Orth
Two simd tests FAIL on Solaris, both SPARC and x86:

FAIL: experimental/simd/standard_abi_usable.cc -msse2 -O2 -Wno-psabi (test for excess errors)
FAIL: experimental/simd/standard_abi_usable_2.cc -msse2 -O2 -Wno-psabi (test for excess errors)

This happens because the simd headers use identifiers documented in the
libstdc++ manual as reserved by system headers.
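
To illustrate the failure mode, here is a minimal sketch. It assumes a system header that defines short macros named _B and _X; the macro values and the *Sketch names are made up for the example, and the underscore-uppercase identifiers are used only to mirror what the library headers do.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Assumption for illustration: pretend a system header defined these.
// The values are invented.
#define _B 0x80
#define _X 0x40

// Writing `template <std::size_t _X>` here would be expanded by the
// preprocessor to `template <std::size_t 0x40>` and fail to parse.
// The renamed parameters are not touched by the macros.
template <std::size_t _Xp>
  using SizeConstantSketch = std::integral_constant<std::size_t, _Xp>;

template <typename _Ap, typename _Bp>
  constexpr bool
  same_size_sketch ()
  { return sizeof (_Ap) == sizeof (_Bp); }

// Keep the demo macros from leaking into code that follows.
#undef _B
#undef _X
```

Renaming to _Xp and _Bp keeps the names in the implementation's reserved namespace while avoiding the ones the manual lists as taken by system headers.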

Fixed as follows, tested on i386-pc-solaris2.11, sparc-sun-solaris2.11,
and x86_64-pc-linux-gnu.

Ok for master?

As an aside, the use of vim: markers initially confused the hell out of
me.  As an Emacs user, I rarely use vi for much more than a pager, but
when I wanted to check the lines mentioned in the g++ errors, I had no
idea what was going on or how to disable the folding enabled there:

// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80

I can't help but feel that this is just a personal preference and
doesn't belong in the upstream code.

For avoidance of doubt, I'd consider equivalent Emacs local variables
equally inappropriate.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2021-02-01  Rainer Orth  

libstdc++-v3:
* include/experimental/bits/simd.h: Replace reserved _X, _B by
_Xp, _Bp.
* include/experimental/bits/simd_builtin.h: Likewise.
* include/experimental/bits/simd_x86.h: Likewise.

# HG changeset patch
# Parent  585afa0bd4a3567358ed9efd210d1a1dab740005
libstdc++: Don't use reserved identifiers in simd headers

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -201,8 +201,8 @@ template 
   inline constexpr overaligned_tag<_Np> overaligned = {};
 
 // }}}
-template <size_t _X>
-  using _SizeConstant = integral_constant<size_t, _X>;
+template <size_t _Xp>
+  using _SizeConstant = integral_constant<size_t, _Xp>;
 
 // unrolled/pack execution helpers
 // __execute_n_times{{{
@@ -4060,11 +4060,11 @@ template  class _A0, temp
 	  return typename __decay_abi<_A0<_Bytes>>::type{};
 	else
 	  {
-		using _B =
+		using _Bp =
 		  typename __find_next_valid_abi<_A0, _Bytes, _Tp>::type;
-		if constexpr (_B::template _S_is_valid_v<
-_Tp> && _B::template _S_size<_Tp> <= _Np)
-		  return _B{};
+		if constexpr (_Bp::template _S_is_valid_v<
+_Tp> && _Bp::template _S_size<_Tp> <= _Np)
+		  return _Bp{};
 		else
 		  return
 		typename _AbiList<_Rest...>::template _BestAbi<_Tp, _Np>{};
diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h
--- a/libstdc++-v3/include/experimental/bits/simd_builtin.h
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -894,12 +894,12 @@ template ;
-  using _B = __vector_type_t<_Tp, _Np>;
+  using _Bp = __vector_type_t<_Tp, _Np>;
   _SimdMember _M_data;
 
 public:
   _SimdCastType2(_Ap __a) : _M_data(__vector_bitcast<_Tp>(__a)) {}
-  _SimdCastType2(_B __b) : _M_data(__b) {}
+  _SimdCastType2(_Bp __b) : _M_data(__b) {}
   operator _SimdMember() const { return _M_data; }
 };
 
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
--- a/libstdc++-v3/include/experimental/bits/simd_x86.h
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -55,10 +55,10 @@ template ,
+template ,
 	  typename _Trait = _VectorTraits<_Tp>>
   _GLIBCXX_SIMD_INTRINSIC constexpr _Tp
-  __interleave128_lo(const _Ap& __av, const _B& __bv)
+  __interleave128_lo(const _Ap& __av, const _Bp& __bv)
   {
 const _Tp __a(__av);
 const _Tp __b(__bv);


Re: [PATCH] testsuite: aarch64: Add tests for vmull_high intrinsics

2021-02-01 Thread Jonathan Wright via Gcc-patches
Whoops, I didn't attach the diff. Here we go.

Thanks,
Jonathan

From: Jonathan Wright
Sent: 01 February 2021 11:42
To: gcc-patches@gcc.gnu.org 
Cc: Kyrylo Tkachov 
Subject: [PATCH] testsuite: aarch64: Add tests for vmull_high intrinsics

Hi,

As subject, this patch adds tests for vmull_high_* Neon intrinsics. Since
these intrinsics are only supported for AArch64, these tests are
restricted to only run on AArch64 targets.

Ok for master?

Thanks,
Jonathan

---

gcc/testsuite/ChangeLog:

2021-01-29  Jonathan Wright  

* gcc.target/aarch64/advsimd-intrinsics/vmull_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmull_high_lane.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmull_high_laneq.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmull_high_n.c:
New test.
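
As a cross-check of the expected results in vmull_high.c, the 8-bit cases can be modelled in scalar code. This is a sketch and not part of the patch: the names are hypothetical, and it assumes the reference input buffer is a ramp starting at 0xf0, which is inferred from the expected values rather than stated in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Illustrative scalar model of a widening multiply of the high halves.
// Lanes 8..15 of each 16-lane input are widened, then multiplied.
template <typename WideT, typename NarrowT>
  std::array<WideT, 8>
  mull_high_model (const std::array<NarrowT, 16> &a,
		   const std::array<NarrowT, 16> &b)
  {
    std::array<WideT, 8> res{};
    for (int i = 0; i < 8; ++i)
      res[i] = static_cast<WideT> (static_cast<WideT> (a[i + 8])
				   * static_cast<WideT> (b[i + 8]));
    return res;
  }

// Assumed input: 0xf0, 0xf1, ..., 0xff, reinterpreted as NarrowT.
template <typename NarrowT>
  std::array<NarrowT, 16>
  ramp_from_0xf0 ()
  {
    std::array<NarrowT, 16> v{};
    for (int i = 0; i < 16; ++i)
      v[i] = static_cast<NarrowT> (0xf0 + i);
    return v;
  }
```

Under that assumption the signed high half is -8..-1, so squaring it gives 0x40, 0x31, ..., 0x1, and the unsigned high half is 0xf8..0xff, which gives 0xf040, 0xf231, ..., 0xfe01, matching the first two expected arrays.
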
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmull_high.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmull_high.c
new file mode 100644
index ..36094fce24f364f6a314f66ae153a211b2a75dff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmull_high.c
@@ -0,0 +1,78 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include <arm_neon.h>
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+/* Expected results.  */
+VECT_VAR_DECL(expected, int, 16, 8) [] = { 0x40, 0x31,0x24, 0x19,
+	   0x10, 0x9, 0x4, 0x1 };
+VECT_VAR_DECL(expected, int, 32, 4) [] = { 0x90, 0x79, 0x64, 0x51 };
+VECT_VAR_DECL(expected, int, 64, 2) [] = { 0xc4, 0xa9 };
+VECT_VAR_DECL(expected, uint, 16, 8) [] = { 0xf040, 0xf231, 0xf424, 0xf619,
+	0xf810, 0xfa09, 0xfc04, 0xfe01 };
+VECT_VAR_DECL(expected, uint, 32, 4) [] = { 0xffe80090, 0xffea0079,
+	0xffec0064, 0xffee0051 };
+VECT_VAR_DECL(expected, uint, 64, 2) [] = { 0xffe400c4,
+	0xffe600a9 };
+VECT_VAR_DECL(expected, poly, 16, 8) [] = { 0x5540, 0x5541, 0x5544, 0x5545,
+	0x5550, 0x5551, 0x5554, 0x };
+
+#define TEST_MSG "VMULL_HIGH"
+void exec_vmull_high (void)
+{
+  /* Basic test: y = vmull_high(x, x), then store the result.  */
+#define TEST_VMULL_HIGH(T1, T2, W1, W2, N1, N2) \
+  VECT_VAR(vector_res, T1, W2, N1) =	 \
+vmull_high_##T2##W1(VECT_VAR(vector, T1, W1, N2),			 \
+			VECT_VAR(vector, T1, W1, N2));			 \
+  vst1q_##T2##W2(VECT_VAR(result, T1, W2, N1), \
+		 VECT_VAR(vector_res, T1, W2, N1))
+
+  DECL_VARIABLE(vector, int, 8, 16);
+  DECL_VARIABLE(vector, int, 16, 8);
+  DECL_VARIABLE(vector, int, 32, 4);
+  DECL_VARIABLE(vector, uint, 8, 16);
+  DECL_VARIABLE(vector, uint, 16, 8);
+  DECL_VARIABLE(vector, uint, 32, 4);
+  DECL_VARIABLE(vector, poly, 8, 16);
+  DECL_VARIABLE(vector_res, int, 16, 8);
+  DECL_VARIABLE(vector_res, int, 32, 4);
+  DECL_VARIABLE(vector_res, int, 64, 2);
+  DECL_VARIABLE(vector_res, uint, 16, 8);
+  DECL_VARIABLE(vector_res, uint, 32, 4);
+  DECL_VARIABLE(vector_res, uint, 64, 2);
+  DECL_VARIABLE(vector_res, poly, 16, 8);
+
+  clean_results ();
+
+  VLOAD(vector, buffer, q, int, s, 8, 16);
+  VLOAD(vector, buffer, q, int, s, 16, 8);
+  VLOAD(vector, buffer, q, int, s, 32, 4);
+  VLOAD(vector, buffer, q, uint, u, 8, 16);
+  VLOAD(vector, buffer, q, uint, u, 16, 8);
+  VLOAD(vector, buffer, q, uint, u, 32, 4);
+  VLOAD(vector, buffer, q, poly, p, 8, 16);
+
+  TEST_VMULL_HIGH(int, s, 8, 16, 8, 16);
+  TEST_VMULL_HIGH(int, s, 16, 32, 4, 8);
+  TEST_VMULL_HIGH(int, s, 32, 64, 2, 4);
+  TEST_VMULL_HIGH(uint, u, 8, 16, 8, 16);
+  TEST_VMULL_HIGH(uint, u, 16, 32, 4, 8);
+  TEST_VMULL_HIGH(uint, u, 32, 64, 2, 4);
+  TEST_VMULL_HIGH(poly, p, 8, 16, 8, 16);
+
+  CHECK(TEST_MSG, int, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, int, 32, 4, PRIx32, expected, "");
+  CHECK(TEST_MSG, int, 64, 2, PRIx64, expected, "");
+  CHECK(TEST_MSG, uint, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, uint, 32, 4, PRIx32, expected, "");
+  CHECK(TEST_MSG, uint, 64, 2, PRIx64, expected, "");
+  CHECK_POLY(TEST_MSG, poly, 16, 8, PRIx16, expected, "");
+}
+
+int main (void)
+{
+  exec_vmull_high ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmull_high_lane.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmull_high_lane.c
new file mode 100644
index ..30bc954cd18f9f9f72f985aba8745fc1808dbbf1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmull_high_lane.c
@@ -0,0 +1,69 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include <arm_neon.h>
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+/* Expected results.  */
+VECT_VAR_DECL(expected, int, 32, 4) [] = { 0x4000, 0x4000, 0x4000, 0x4000 };
+VECT_VAR_DECL(expected, int, 64, 2) [] = { 0x2000, 0x2000 };
+VECT_VAR_DECL(expected, uint, 32, 4) [] = { 0x4000, 0x4000, 0x4000, 0x4000 };
+VECT_VAR_DECL(expected, uint, 64, 2) [] = { 0x2000, 0x2000 };
+
+#define TEST_MSG "VMULL_HIGH_LANE"
+void exec_vmull_high_lane (void)

[PATCH] testsuite: aarch64: Add tests for vmull_high intrinsics

2021-02-01 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch adds tests for vmull_high_* Neon intrinsics. Since
these intrinsics are only supported for AArch64, these tests are
restricted to only run on AArch64 targets.

Ok for master?

Thanks,
Jonathan

---

gcc/testsuite/ChangeLog:

2021-01-29  Jonathan Wright  

* gcc.target/aarch64/advsimd-intrinsics/vmull_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmull_high_lane.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmull_high_laneq.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vmull_high_n.c:
New test.


Re: [PATCH] tree-optimization/98499 - fix modref analysis on RVO statements

2021-02-01 Thread Jan Hubicka
> From: Sergei Trofimovich 
> 
> Before the change RVO gimple statements were treated as local
> stores by modref analysis. But in practice the RVO target escapes.
> 
> 2021-01-30  Sergei Trofimovich  
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/98499
>   * ipa-modref.c: treat RVO conservatively and assume
>   all possible side-effects.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/98499
>   * g++.dg/pr98499.C: new test.

This is OK.  Thanks a lot for debugging this.
> +   /* Return slot optimization would require bit of propagation;
> +  give up for now.  */
> +   if (gimple_call_return_slot_opt_p (call)
> +   && gimple_call_lhs (call) != NULL_TREE
> +   && TREE_ADDRESSABLE (TREE_TYPE (gimple_call_lhs (call))))
> + {
> +   if (dump_file)
> + fprintf (dump_file, "%*s  Unhandled return slot opt\n",
> +  depth * 4, "");
> +   lattice[index].merge (0);

Here we are really missing a way to track that an argument is "write only".
We could probably still set EAF_DIRECT, but it is useless without
noescape anyway, so 0 is OK.

I implemented tracking of noescape here, but only for local modref, since
for global modref we are missing jump functions tracking the fact that
the return value is passed to another function, so this is probably
something for next stage1 (I am gathering some stats now, though).

Honza
> + }
> /* Recursion would require bit of propagation; give up for now.  */
> -   if (callee && !ipa && recursive_call_p (current_function_decl,
> +   else if (callee && !ipa && recursive_call_p (current_function_decl,
> callee))
>   lattice[index].merge (0);
> else
> diff --git a/gcc/testsuite/g++.dg/pr98499.C b/gcc/testsuite/g++.dg/pr98499.C
> new file mode 100644
> index 000..ace088aeed9
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr98499.C
> @@ -0,0 +1,31 @@
> +/* PR tree-optimization/98499.  */
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +struct string {
> +  // pointer to local store
> +  char * _M_buf;
> +  // local store
> +  char _M_local_buf[16];
> +
> +  __attribute__((noinline)) string() : _M_buf(_M_local_buf) {}
> +
> +  ~string() {
> +if (_M_buf != _M_local_buf)
> +  __builtin_trap();
> +  }
> +
> +  string(const string &__str); // no copies
> +};
> +
> +__attribute__((noinline)) static string dir_name() { return string(); }
> +class Importer {
> +  string base_path;
> +
> +public:
> +  __attribute__((noinline)) Importer() : base_path (dir_name()) {}
> +};
> +
> +int main() {
> +  Importer imp;
> +}
> -- 
> 2.30.0
> 
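
A small sketch of why the return slot cannot be treated as a purely local store. This is illustrative only, not the PR testcase; the names are hypothetical and it relies on copy elision for the returned prvalue (guaranteed from C++17 on).

```cpp
#include <cassert>

struct SelfRefSketch
{
  const SelfRefSketch *self;

  // The constructor records the address of the storage it runs in.
  SelfRefSketch () : self (this) {}

  // A real copy would carry the source's address along, so a copy
  // would be observable as self != this.
  SelfRefSketch (const SelfRefSketch &other) : self (other.self) {}
};

// The callee constructs its result directly in the caller's storage.
SelfRefSketch
make_sketch ()
{ return SelfRefSketch (); }

bool
constructed_in_caller_storage ()
{
  SelfRefSketch s = make_sketch ();
  return s.self == &s;
}
```

Because the callee writes the address of the caller's object into the object itself, the call's LHS has effectively escaped into the callee, which is the situation the patch handles conservatively.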


Re: [PATCH] PR target/98743: Fix ICE in convert_move for RISC-V

2021-02-01 Thread Kito Cheng via Gcc-patches
On Mon, Feb 1, 2021 at 6:10 PM Jakub Jelinek  wrote:
>
> On Mon, Feb 01, 2021 at 05:57:28PM +0800, Kito Cheng wrote:
> > > >  - Check that `TO` mode is not BLKmode before calling store_expr;
> > > >    calling store_expr with BLKmode will cause an ICE.
> > >
> > > How do you end up with a SUBREG_PROMOTED* of something that has bitsize 
> > > of 0
> > > (GET_MODE_BITSIZE of BLKmode is 0, right)?
> >
> > to_rtx already has a mode other than BLKmode at this point;
> > it's SImode for riscv64*-*-*, so bitsize is 32 rather than 0.
> >
> > I guess my comment wasn't clear enough: the root cause of the ICE in
> > `store_expr (from, to_rtx, 0, nontemporal, false)` is
> > that `from` is still BLKmode here.
>
> But mode is TYPE_MODE (TREE_TYPE (to)), that is expanded already at this
> point and got some different mode.  So, if anything, you care about
> TYPE_MODE (TREE_TYPE (from)) rather than mode, don't you?

Yeah, that's correct. Let me send a v2 that updates the check and the comment.

>
> Jakub
>


Re: [PATCH 14/16] Implement hmin and hmax

2021-02-01 Thread Matthias Kretz
On Mittwoch, 27. Januar 2021 21:42:50 CET Matthias Kretz wrote:
> --- a/libstdc++-v3/include/experimental/bits/simd.h
> +++ b/libstdc++-v3/include/experimental/bits/simd.h
> @@ -204,6 +204,27 @@ template 
>  template 
>using _SizeConstant = integral_constant;
> 
> +namespace __detail {
> +  struct _Minimum {
> +template 
> +  _GLIBCXX_SIMD_INTRINSIC constexpr
> +  _Tp
> +  operator()(_Tp __a, _Tp __b) const {

Reviewing my own patch :) This needs line breaks before { for namespace, 
struct, and operator(). And another line break before the next struct. New 
patch attached.

From: Matthias Kretz 

From 9.7.4 in Parallelism TS 2. For some reason I overlooked these two
functions. Implement them via call to _S_reduce.

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Add __detail::_Minimum and
__detail::_Maximum to use them as _BinaryOperation to _S_reduce.
Add hmin and hmax overloads for simd and const_where_expression.
* include/experimental/bits/simd_scalar.h
(_SimdImplScalar::_S_reduce): Make unused _BinaryOperation
parameter const-ref to allow calling _S_reduce with an rvalue.
* testsuite/experimental/simd/tests/reductions.cc: Add tests for
hmin and hmax. Since the compiler statically determined that all
tests pass, repeat the test after a call to make_value_unknown.
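
The idea can be sketched without the simd types. This is a plain C++ sketch over std::array under stated assumptions: every name below is hypothetical, the loop stands in for _S_reduce, and the identity element is picked with numeric_limits instead of the helpers used in the patch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>

// Stand-in for __detail::_Minimum: a binary operation returning the minimum.
struct MinimumSketch
{
  template <typename T>
    constexpr T
    operator() (T a, T b) const
    {
      using std::min;
      return min (a, b);
    }
};

// Stand-in for _S_reduce: fold all lanes with a binary operation (N > 0).
template <typename T, std::size_t N, typename Op>
  T
  reduce_sketch (const std::array<T, N> &v, Op op)
  {
    T acc = v[0];
    for (std::size_t i = 1; i < N; ++i)
      acc = op (acc, v[i]);
    return acc;
  }

template <typename T, std::size_t N>
  T
  hmin_sketch (const std::array<T, N> &v)
  { return reduce_sketch (v, MinimumSketch ()); }

// Masked variant: inactive lanes are replaced by the identity element
// of min, so they cannot influence the result.
template <typename T, std::size_t N>
  T
  masked_hmin_sketch (const std::array<bool, N> &mask,
		      const std::array<T, N> &v)
  {
    const T id = std::numeric_limits<T>::has_infinity
		   ? std::numeric_limits<T>::infinity ()
		   : std::numeric_limits<T>::max ();
    std::array<T, N> tmp;
    tmp.fill (id);
    for (std::size_t i = 0; i < N; ++i)
      if (mask[i])
	tmp[i] = v[i];
    return reduce_sketch (tmp, MinimumSketch ());
  }
```

For the lanes {3, 1, 4, 2} the unmasked sketch returns 1; with only lanes 0 and 2 active it returns 3, and with no active lanes it returns the identity element. The hmax case is symmetric, with the lowest value (or negative infinity) as the identity.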

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 14179491f9d..a90cb3b2d98 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -204,6 +204,33 @@ template 
 template 
   using _SizeConstant = integral_constant;
 
+namespace __detail
+{
+  struct _Minimum
+  {
+template 
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  _Tp
+  operator()(_Tp __a, _Tp __b) const
+  {
+	using std::min;
+	return min(__a, __b);
+  }
+  };
+
+  struct _Maximum
+  {
+template 
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  _Tp
+  operator()(_Tp __a, _Tp __b) const
+  {
+	using std::max;
+	return max(__a, __b);
+  }
+  };
+} // namespace __detail
+
 // unrolled/pack execution helpers
 // __execute_n_times{{{
 template 
@@ -3408,7 +3435,7 @@ template 
 
 // }}}1
 // reductions [simd.reductions] {{{1
-  template >
+template >
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
   reduce(const simd<_Tp, _Abi>& __v,
 	 _BinaryOperation __binary_op = _BinaryOperation())
@@ -3454,6 +3481,61 @@ template 
   reduce(const const_where_expression<_M, _V>& __x, bit_xor<> __binary_op)
   { return reduce(__x, 0, __binary_op); }
 
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmin(const simd<_Tp, _Abi>& __v) noexcept
+  {
+return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Minimum());
+  }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmax(const simd<_Tp, _Abi>& __v) noexcept
+  {
+return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Maximum());
+  }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmin(const const_where_expression<_M, _V>& __x) noexcept
+  {
+using _Tp = typename _V::value_type;
+constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+  __finite_max_v<_Tp>;
+#else
+  __value_or<__infinity, _Tp>(__finite_max_v<_Tp>);
+#endif
+_V __tmp = __id_elem;
+_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+__data(__get_lvalue(__x)));
+return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Minimum());
+  }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmax(const const_where_expression<_M, _V>& __x) noexcept
+  {
+using _Tp = typename _V::value_type;
+constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+  __finite_min_v<_Tp>;
+#else
+  [] {
+	if constexpr (__value_exists_v<__infinity, _Tp>)
+	  return -__infinity_v<_Tp>;
+	else
+	  return __finite_min_v<_Tp>;
+  }();
+#endif
+_V __tmp = __id_elem;
+_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+__data(__get_lvalue(__x)));
+return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Maximum());
+  }
+
 // }}}1
 // algorithms [simd.alg] {{{
 template 
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
index 7680bc39c30..7e480ecdb37 100644
--- a/libstdc++-v3/include/experimental/bits/simd_scalar.h
+++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h
@@ -182,7 +182,7 @@ 

Re: [PATCH] PR target/98743: Fix ICE in convert_move for RISC-V

2021-02-01 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 01, 2021 at 05:57:28PM +0800, Kito Cheng wrote:
> > >  - Check that `TO` mode is not BLKmode before calling store_expr;
> > >    calling store_expr with BLKmode will cause an ICE.
> >
> > How do you end up with a SUBREG_PROMOTED* of something that has bitsize of 0
> > (GET_MODE_BITSIZE of BLKmode is 0, right)?
> 
> to_rtx already has a mode other than BLKmode at this point;
> it's SImode for riscv64*-*-*, so bitsize is 32 rather than 0.
> 
> I guess my comment wasn't clear enough: the root cause of the ICE in
> `store_expr (from, to_rtx, 0, nontemporal, false)` is
> that `from` is still BLKmode here.

But mode is TYPE_MODE (TREE_TYPE (to)), that is expanded already at this
point and got some different mode.  So, if anything, you care about
TYPE_MODE (TREE_TYPE (from)) rather than mode, don't you?

Jakub



Re: [PATCH] PR target/98743: Fix ICE in convert_move for RISC-V

2021-02-01 Thread Kito Cheng
> >  - Check that `TO` mode is not BLKmode before calling store_expr;
> >    calling store_expr with BLKmode will cause an ICE.
>
> How do you end up with a SUBREG_PROMOTED* of something that has bitsize of 0
> (GET_MODE_BITSIZE of BLKmode is 0, right)?

to_rtx already has a mode other than BLKmode at this point;
it's SImode for riscv64*-*-*, so bitsize is 32 rather than 0.

I guess my comment wasn't clear enough: the root cause of the ICE in
`store_expr (from, to_rtx, 0, nontemporal, false)` is
that `from` is still BLKmode here.

> That makes no sense.
>
> Jakub
>


Re: [PATCH] PR target/98743: Fix ICE in convert_move for RISC-V

2021-02-01 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 01, 2021 at 05:38:46PM +0800, Kito Cheng wrote:
>  - Check that `TO` mode is not BLKmode before calling store_expr;
>    calling store_expr with BLKmode will cause an ICE.

How do you end up with a SUBREG_PROMOTED* of something that has bitsize of 0
(GET_MODE_BITSIZE of BLKmode is 0, right)?

That makes no sense.

Jakub



[PATCH] PR target/98743: Fix ICE in convert_move for RISC-V

2021-02-01 Thread Kito Cheng
 - Check that `TO` mode is not BLKmode before calling store_expr; calling
   store_expr with BLKmode will cause an ICE.

 - Verified on riscv64, x86_64 and aarch64; no new regressions.

Note: This logic was introduced by 3e60ddeb8220ed388819bb3f14e8caa9309fd3c2,
  so I cc'd Jakub for review.

gcc/ChangeLog:

PR target/98743
* expr.c: Check mode before calling store_expr.

gcc/testsuite/ChangeLog:

PR target/98743
* g++.target/riscv/pr98743.C: New.
---
 gcc/expr.c   |  1 +
 gcc/testsuite/g++.target/riscv/pr98743.C | 27 
 2 files changed, 28 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/riscv/pr98743.C

diff --git a/gcc/expr.c b/gcc/expr.c
index 04ef5ad114d..3cf492acea3 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -5459,6 +5459,7 @@ expand_assignment (tree to, tree from, bool nontemporal)
  /* If to_rtx is a promoted subreg, we need to zero or sign
 extend the value afterwards.  */
  if (TREE_CODE (to) == MEM_REF
+ && mode != BLKmode
  && !REF_REVERSE_STORAGE_ORDER (to)
  && known_eq (bitpos, 0)
  && known_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (to_rtx)))
diff --git a/gcc/testsuite/g++.target/riscv/pr98743.C b/gcc/testsuite/g++.target/riscv/pr98743.C
new file mode 100644
index 000..1b94597ac68
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/pr98743.C
@@ -0,0 +1,27 @@
+// Test for value-initialization via {}
+// { dg-do run { target c++11 } }
+/* { dg-options "-Og -version -fno-early-inlining -finline-small-functions -fpack-struct" */
+void * operator new (__SIZE_TYPE__, void *p) { return p; }
+void * operator new[] (__SIZE_TYPE__, void *p) { return p; }
+
+// Empty base so A isn't an aggregate
+struct B {};
+struct A: B {
+  int i;
+};
+
+struct C: A {
+  C(): A{} {}
+};
+
+int main()
+{
+  int space = 42;
+  A* ap = new (&space) A{};
+  int space1[1] = { 42 };
+  A* a1p = new (space1) A[1]{};
+  if (ap->i != 0 ||
+  a1p[0].i != 0)
+return 1;
+  return 0;
+}
-- 
2.30.0



Re: [PATCH] RISC-V: Fix -march option parsing when `p` extension exists.

2021-02-01 Thread Kito Cheng via Gcc-patches
Pushed, thanks :)

On Mon, Feb 1, 2021 at 4:58 PM Xing GUO  wrote:
>
> Hi,
>
> I've reproduced the failure. It's because my gcc is configured as a
> bare-metal toolchain and built with binutils that supports RISC-V
> attribute. That is to say, my gcc emits RISC-V attributes by default.
> Below is the patch that should fix the failure. Sorry for the
> inconvenience.
>
> diff --git a/gcc/testsuite/gcc.target/riscv/attribute-18.c
> b/gcc/testsuite/gcc.target/riscv/attribute-18.c
> index 1fd80fed51b..492360cf7c1 100644
> --- a/gcc/testsuite/gcc.target/riscv/attribute-18.c
> +++ b/gcc/testsuite/gcc.target/riscv/attribute-18.c
> @@ -1,4 +1,4 @@
>  /* { dg-do compile } */
> -/* { dg-options "-march=rv64imafdcp -mabi=lp64d -misa-spec=2.2" } */
> +/* { dg-options "-mriscv-attribute -march=rv64imafdcp -mabi=lp64d
> -misa-spec=2.2" } */
>  int foo() {}
>  /* { dg-final { scan-assembler ".attribute arch,
> \"rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_p\"" } } */
>
> On 2/1/21, Xing GUO  wrote:
> > Hi Andreas and Kito,
> >
> > I haven't reproduced this failure, but it looks like I forgot to
> > append `-mriscv-attribute` to dg-options in attribute-18.c. I'll reply
> > to this thread ASAP.
> >
> > Thanks,
> > Xing
> >
> > On 2/1/21, Andreas Schwab  wrote:
> >> FAIL: gcc.target/riscv/attribute-18.c scan-assembler .attribute arch,
> >> "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_p"
> >>
> >> $ grep -c 'attribute arch' attribute-18.s
> >> 0
> >>
> >> Andreas.
> >>
> >> --
> >> Andreas Schwab, sch...@linux-m68k.org
> >> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> >> "And now for something completely different."
> >>
> >
> >
> > --
> > Cheers,
> > Xing
> >
>
>
> --
> Cheers,
> Xing


RE: [PATCH] arm: Auto-vectorization for MVE: vorn

2021-02-01 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Christophe Lyon 
> Sent: 29 January 2021 18:18
> To: Kyrylo Tkachov 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] arm: Auto-vectorization for MVE: vorn
> 
> On Fri, 29 Jan 2021 at 16:03, Kyrylo Tkachov 
> wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Gcc-patches  On Behalf Of
> > > Christophe Lyon via Gcc-patches
> > > Sent: 29 January 2021 14:55
> > > To: gcc-patches@gcc.gnu.org
> > > Subject: [PATCH] arm: Auto-vectorization for MVE: vorn
> > >
> > > > This patch enables MVE vornq instructions for auto-vectorization.  MVE
> > > > vornq insns in mve.md are modified to use ior instead of unspec
> > > > expression to support ior3.  The ior3 expander is added to
> > > > vec-common.md
> > >
> > > 2021-01-29  Christophe Lyon  
> > >
> > >   gcc/
> > >   * config/arm/iterators.md (supf): Remove VORNQ_S and VORNQ_U.
> > >   (VORNQ): Remove.
> > >   * config/arm/mve.md (mve_vornq_s): New entry for vorn
> > >   instruction using expression ior.
> > >   (mve_vornq_u): New expander.
> > >   (mve_vornq_f): Use ior code instead of unspec.
> > >   * config/arm/unspecs.md (VORNQ_S, VORNQ_U, VORNQ_F):
> > > Remove.
> > >   * config/arm/vec-common.md (orn3): New expander.
> > >
> > >   gcc/testsuite/
> > >   * gcc.target/arm/simd/mve-vorn.c: Add vorn tests.
> > > ---
> > >  gcc/config/arm/iterators.md  |  3 +--
> > >  gcc/config/arm/mve.md| 23 +++--
> > >  gcc/config/arm/unspecs.md|  3 ---
> > >  gcc/config/arm/vec-common.md |  8 ++
> > >  gcc/testsuite/gcc.target/arm/simd/mve-vorn.c | 38
> > > 
> > >  5 files changed, 62 insertions(+), 13 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vorn.c
> > >
> > > diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
> > > index b902790..43aab23 100644
> > > --- a/gcc/config/arm/iterators.md
> > > +++ b/gcc/config/arm/iterators.md
> > > @@ -1293,7 +1293,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s")
> > > (VCVTQ_TO_F_U "u") (VREV16Q_S "s")
> > >  (VMULLBQ_INT_S "s") (VMULLBQ_INT_U "u")
> > > (VQADDQ_S "s")
> > >  (VMULLTQ_INT_S "s") (VMULLTQ_INT_U "u")
> > > (VQADDQ_U "u")
> > >  (VMULQ_N_S "s") (VMULQ_N_U "u") (VMULQ_S "s")
> > > -(VMULQ_U "u") (VORNQ_S "s") (VORNQ_U "u")
> > > +(VMULQ_U "u")
> > >  (VQADDQ_N_S "s") (VQADDQ_N_U "u")
> > >  (VQRSHLQ_N_S "s") (VQRSHLQ_N_U "u") (VQRSHLQ_S
> > > "s")
> > >  (VQRSHLQ_U "u") (VQSHLQ_N_S "s")
> > >   (VQSHLQ_N_U "u")
> > > @@ -1563,7 +1563,6 @@ (define_int_iterator VMULLBQ_INT
> > > [VMULLBQ_INT_U VMULLBQ_INT_S])
> > >  (define_int_iterator VMULLTQ_INT [VMULLTQ_INT_U VMULLTQ_INT_S])
> > >  (define_int_iterator VMULQ [VMULQ_U VMULQ_S])
> > >  (define_int_iterator VMULQ_N [VMULQ_N_U VMULQ_N_S])
> > > -(define_int_iterator VORNQ [VORNQ_U VORNQ_S])
> > >  (define_int_iterator VQADDQ [VQADDQ_U VQADDQ_S])
> > >  (define_int_iterator VQADDQ_N [VQADDQ_N_S VQADDQ_N_U])
> > >  (define_int_iterator VQRSHLQ [VQRSHLQ_S VQRSHLQ_U])
> > > diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> > > index 465f71c..ec0ef7b 100644
> > > --- a/gcc/config/arm/mve.md
> > > +++ b/gcc/config/arm/mve.md
> > > @@ -1634,18 +1634,26 @@ (define_insn "mve_vmulq"
> > >  ;;
> > >  ;; [vornq_u, vornq_s])
> > >  ;;
> > > -(define_insn "mve_vornq_"
> > > +(define_insn "mve_vornq_s"
> > >[
> > > (set (match_operand:MVE_2 0 "s_register_operand" "=w")
> > > - (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
> > > -(match_operand:MVE_2 2 "s_register_operand" "w")]
> > > -  VORNQ))
> > > + (ior:MVE_2 (not:MVE_2 (match_operand:MVE_2 2
> > > "s_register_operand" "w"))
> > > +(match_operand:MVE_2 1 "s_register_operand" "w")))
> > >]
> > >"TARGET_HAVE_MVE"
> > > -  "vorn %q0, %q1, %q2"
> > > +   "vorn\t%q0, %q1, %q2"
> > >[(set_attr "type" "mve_move")
> > >  ])
> > >
> > > +(define_expand "mve_vornq_u"
> > > +  [
> > > +   (set (match_operand:MVE_2 0 "s_register_operand")
> > > + (ior:MVE_2 (not:MVE_2 (match_operand:MVE_2 2
> > > "s_register_operand"))
> > > +(match_operand:MVE_2 1 "s_register_operand")))
> > > +  ]
> > > +  "TARGET_HAVE_MVE"
> > > +)
> > > +
> > >  ;;
> > >  ;; [vorrq_s, vorrq_u])
> > >  ;;
> > > @@ -2630,9 +2638,8 @@ (define_insn "mve_vmulq_n_f"
> > >  (define_insn "mve_vornq_f"
> > >[
> > > (set (match_operand:MVE_0 0 "s_register_operand" "=w")
> > > - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w")
> > > -(match_operand:MVE_0 2 "s_register_operand" "w")]
> > > -  VORNQ_F))
> > > + (ior:MVE_0 (not:MVE_0 (match_operand:MVE_0 2
> > > "s_register_operand" "w"))
> > > +  
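
For readers following the quoted patch: the RTL pattern
(ior (not operand 2) operand 1) computes operand1 | ~operand2 per lane,
which is the loop shape the auto-vectorizer has to recognise. A minimal
scalar sketch of that operation follows. The function name orn_u32x4 and
the fixed lane count of four are assumptions made for illustration; they
are not taken from the patch or from mve-vorn.c.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: element-wise OR-NOT over
   four 32-bit lanes.  Each result lane is a[i] | ~b[i], matching the
   operand order of "vorn %q0, %q1, %q2".  */
void
orn_u32x4 (uint32_t *dest, const uint32_t *a, const uint32_t *b)
{
  for (int i = 0; i < 4; i++)
    dest[i] = a[i] | ~b[i];
}
```

A loop of this form is the kind of code that could plausibly map to a
single vorn once the pattern is expressed with ior and not instead of an
unspec.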

RE: [PATCH] aarch64: Remove testing of saturation cumulative QC bit

2021-02-01 Thread Kyrylo Tkachov via Gcc-patches


> -----Original Message-----
> From: Christophe Lyon 
> Sent: 29 January 2021 18:26
> To: Kyrylo Tkachov 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] aarch64: Remove testing of saturation cumulative QC
> bit
> 
> On Tue, 19 Jan 2021 at 18:31, Kyrylo Tkachov via Gcc-patches
>  wrote:
> >
> > Hi all,
> >
> > Since we don't guarantee the ordering of the QC flag in FPSR in the
> > saturation intrinsics, we shouldn't be testing for it.
> > I want to relax the flags for some of the builtins to enable more
> > optimisation but that triggers the QC flag tests in advsimd-intrinsics.exp.
> > We don't implement the saturation flag access intrinsics in aarch64 anyway
> > and we don't want to.
> 
> So this means that the tests would no longer catch an unexpected
> regression on arm wrt QC flag?

Yes, the things that make it impractical to support the QC flag in aarch64 also 
apply to arm.

> 
> IIUC, this also means that the QC flag is only meaningful with writing
> assembly after these two patches?

I suppose so, yes. Perhaps we can note it somewhere in the documentation.
Thanks,
Kyrill

> 
> >
> > Tested on aarch64-none-elf and arm-none-eabi.
> >
> > Pushing to master.
> > Thanks,
> > Kyrill
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
> > (CHECK_CUMULATIVE_SAT): Delete.
> > (CHECK_CUMULATIVE_SAT_NAMED): Likewise.  Delete related
> > variables.
> > * gcc.target/aarch64/advsimd-intrinsics/binary_sat_op.inc: Remove
> > uses of the above.
> > * gcc.target/aarch64/advsimd-intrinsics/unary_sat_op.inc: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqabs.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqadd.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmlXl.inc: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmlXl_lane.inc: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmlXl_n.inc: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmlal.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmlal_lane.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmlal_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmlsl.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmlsl_lane.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmlsl_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmulh.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmulh_lane.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmulh_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmull.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmull_lane.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqdmull_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqmovn.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqmovun.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqneg.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrdmlXh.inc: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrdmlXh_lane.inc: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrdmlah.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrdmlah_lane.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrdmlsh.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrdmlsh_lane.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrdmulh.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrdmulh_lane.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrdmulh_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrshl.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrshrn_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqrshrun_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqshl.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqshl_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqshlu_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqshrn_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqshrun_n.c: Likewise.
> > * gcc.target/aarch64/advsimd-intrinsics/vqsub.c: Likewise.


Re: [PATCH] RISC-V: Fix -march option parsing when `p` extension exists.

2021-02-01 Thread Xing GUO via Gcc-patches
Hi,

I've reproduced the failure. It's because my gcc is configured as a
bare-metal toolchain and built with a binutils that supports RISC-V
attributes. That is to say, my gcc emits RISC-V attributes by default.
Below is the patch that should fix the failure. Sorry for the
inconvenience.

diff --git a/gcc/testsuite/gcc.target/riscv/attribute-18.c b/gcc/testsuite/gcc.target/riscv/attribute-18.c
index 1fd80fed51b..492360cf7c1 100644
--- a/gcc/testsuite/gcc.target/riscv/attribute-18.c
+++ b/gcc/testsuite/gcc.target/riscv/attribute-18.c
@@ -1,4 +1,4 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64imafdcp -mabi=lp64d -misa-spec=2.2" } */
+/* { dg-options "-mriscv-attribute -march=rv64imafdcp -mabi=lp64d -misa-spec=2.2" } */
 int foo() {}
 /* { dg-final { scan-assembler ".attribute arch, \"rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_p\"" } } */

On 2/1/21, Xing GUO  wrote:
> Hi Andreas and Kito,
>
> I haven't reproduced this failure, but it looks like I forgot to
> append `-mriscv-attribute` to dg-options in attribute-18.c. I'll reply
> to this thread ASAP.
>
> Thanks,
> Xing
>
> On 2/1/21, Andreas Schwab  wrote:
>> FAIL: gcc.target/riscv/attribute-18.c scan-assembler .attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_p"
>>
>> $ grep -c 'attribute arch' attribute-18.s
>> 0
>>
>> Andreas.
>>
>> --
>> Andreas Schwab, sch...@linux-m68k.org
>> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
>> "And now for something completely different."
>>
>
>
> --
> Cheers,
> Xing
>


-- 
Cheers,
Xing


[PATCH] rtl-optimization/98863 - prune RD with LIVE in STV

2021-02-01 Thread Richard Biener
This sets DF_RD_PRUNE_DEAD_DEFS, as all other users of the UD/DU
chain problems do, which makes the RD problem consume a lot less memory.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-02-01  Richard Biener  

PR rtl-optimization/98863
* config/i386/i386-features.c (convert_scalars_to_vector):
Set DF_RD_PRUNE_DEAD_DEFS.
---
 gcc/config/i386/i386-features.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
index c7d64822a13..41891c94469 100644
--- a/gcc/config/i386/i386-features.c
+++ b/gcc/config/i386/i386-features.c
@@ -1627,7 +1627,7 @@ convert_scalars_to_vector (bool timode_p)
     bitmap_initialize (&candidates[i], &bitmap_default_obstack);
 
   calculate_dominance_info (CDI_DOMINATORS);
-  df_set_flags (DF_DEFER_INSN_RESCAN);
+  df_set_flags (DF_DEFER_INSN_RESCAN | DF_RD_PRUNE_DEAD_DEFS);
   df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
   df_analyze ();
 
-- 
2.26.2
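
As a rough illustration of why pruning reaching definitions with liveness
saves memory: a definition of a register that is not live can never feed
a UD/DU chain, so its bit need not be kept in the RD sets. The sketch
below is a toy model only. The names prune_dead_defs, def_reg and
NUM_DEFS are hypothetical, the sets are plain 32-bit masks, and none of
it is GCC's df implementation.

```c
#include <stdint.h>

/* Toy model, not GCC's df code.  Six definitions, numbered 0..5.  */
enum { NUM_DEFS = 6 };

/* def_reg[d] is the (hypothetical) register written by definition d.  */
static const unsigned def_reg[NUM_DEFS] = { 0, 1, 1, 2, 3, 3 };

/* Return RD_SET restricted to definitions whose register is set in
   LIVE_REGS.  Bit d of RD_SET stands for definition d, bit r of
   LIVE_REGS stands for register r.  */
uint32_t
prune_dead_defs (uint32_t rd_set, uint32_t live_regs)
{
  uint32_t live_defs = 0;

  /* Build the mask of all definitions that write a live register.  */
  for (unsigned d = 0; d < NUM_DEFS; d++)
    if (live_regs & (UINT32_C (1) << def_reg[d]))
      live_defs |= UINT32_C (1) << d;

  /* Definitions of dead registers are dropped from the set.  */
  return rd_set & live_defs;
}
```

With all six definitions reaching a point where only registers 1 and 3
are live, the pruned set keeps definitions 1, 2, 4 and 5 and drops the
rest, so sparse bitmaps for functions with many dead definitions stay
much smaller.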