Re: [RS6000] PR96493, powerpc local call linkage failure

2020-08-11 Thread Alan Modra via Gcc-patches
On Tue, Aug 11, 2020 at 01:30:36PM -0500, Segher Boessenkool wrote:
> Either always running or what this patch does will work.  But please add
> comments what the test case wants to test,

That's already in the testcase.

/* Test local calls between pcrel and non-pcrel code.

The comment goes on to say

   Despite the cpu=power10 option, the code generated here should just
   be plain powerpc64, even the necessary linker stubs.  */

which was the justification for using "dg-do run" unqualified in the
current testcase.

> and for the tricky bits.

There aren't any.  And other tests use multiple dg-do lines, eg.
gcc/testsuite/g++.dg/ext/altivec-3.C

/* { dg-do run { target { powerpc*-*-* && vmx_hw } } } */
/* { dg-do compile { target { powerpc*-*-* && { ! vmx_hw } } } } */

Committed 2ba0674c657, and apologies for missing the power10_ok first
time around on this test.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/2] PowerPC: Add power10 IEEE 128-bit min/max/cmove.

2020-08-11 Thread Michael Meissner via Gcc-patches
On Tue, Aug 11, 2020 at 08:01:50PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Aug 11, 2020 at 12:23:07PM -0400, Michael Meissner wrote:
> > +  /* See if we can use the ISA 3.1 min/max/compare instructions for IEEE
> > + 128-bit floating point.  At present, don't worry about doing 
> > conditional
> > + moves with different types for the comparison and movement (unlike 
> > SF/DF,
> > + where you can do a conditional test between double and use float as 
> > the
> > + if/then parts. */
> 
> Why is that?  That makes the code quite different too (harder to
> review), but also, it could just use existing code more.

It is a combinatorial expansion problem.

In order to do:

double cmove (double x, double y, float a, float b)
{
  return (a == b) ? x : y;
}

You need two pairs of SFDF iterators.  You use SFDF for the target, and SFDF2
for the comparison.

(define_insn_and_split "*movcc_p9"
  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&,")
        (if_then_else:SFDF
         (match_operator:CCFP 1 "fpmask_comparison_operator"
                [(match_operand:SFDF2 2 "vsx_register_operand" ",")
                 (match_operand:SFDF2 3 "vsx_register_operand" ",")])
         (match_operand:SFDF 4 "vsx_register_operand" ",")
         (match_operand:SFDF 5 "vsx_register_operand" ",")))
   (clobber (match_scratch:V2DI 6 "=0,"))]
  "TARGET_P9_MINMAX"
  "#"
  ""
  [(set (match_dup 6)
        (if_then_else:V2DI (match_dup 1)
                           (match_dup 7)
                           (match_dup 8)))
   (set (match_dup 0)
        (if_then_else:SFDF (ne (match_dup 6)
                               (match_dup 8))
                           (match_dup 4)
                           (match_dup 5)))]
{
  if (GET_CODE (operands[6]) == SCRATCH)
    operands[6] = gen_reg_rtx (V2DImode);

  operands[7] = CONSTM1_RTX (V2DImode);
  operands[8] = CONST0_RTX (V2DImode);
}
 [(set_attr "length" "8")
  (set_attr "type" "vecperm")])

This means there are 4 versions of the pattern:
1: DF target, DF comparison
2: DF target, SF comparison
3: SF target, DF comparison
4: SF target, SF comparison

The dueling iterators were added on May 26th, 2016.  I believe we had separate
insns for the 4 cases for fsel before I added the xscmpeqdp, etc. support.

To grow this for IEEE 128, you would need two iterators, each with 4 elements
(DF, SF, KF, and optionally TF).  Thus you would need 16 patterns to represent
all of the patterns.  I can do this, I just didn't think it was worth it.
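
For illustration, the four SF/DF combinations listed above correspond to C
functions like the ones in the sketch below.  The function names are made up
for this example and do not come from the patch.  Adding KF/TF to both the
result type and the comparison type is what multiplies these out to 16.

```c
/* Illustrative only: the function names are hypothetical, not from the
   patch.  Each function pairs one result type with one comparison type,
   matching the four SFDF/SFDF2 combinations.  */

/* 1: DF target, DF comparison */
double cmove_df_df (double x, double y, double a, double b)
{
  return (a == b) ? x : y;
}

/* 2: DF target, SF comparison */
double cmove_df_sf (double x, double y, float a, float b)
{
  return (a == b) ? x : y;
}

/* 3: SF target, DF comparison */
float cmove_sf_df (float x, float y, double a, double b)
{
  return (a == b) ? x : y;
}

/* 4: SF target, SF comparison */
float cmove_sf_sf (float x, float y, float a, float b)
{
  return (a == b) ? x : y;
}
```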

In addition, it becomes more involved due to constraints.  Currently for SF/DF
you essentially want to use any vector register at this point (when I added it
in 2016, there was still the support for limiting whether SF/DF could use the
Altivec registers).  But for IEEE 128-bit fp types, you want "v" for the
registers used for comparison.  You might want "v" for the conditional move
result, or you might want "wa".

I looked at combining the SF/DF and IEEE 128-bit cases, but it was becoming too
complex to describe these cases.  It is doable, but it makes the diffs even
harder to read.

> > +;; IEEE 128-bit min/max
> > +(define_insn "s3"
> > +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > +   (fp_minmax:IEEE128
> > +(match_operand:IEEE128 1 "altivec_register_operand" "v")
> > +(match_operand:IEEE128 2 "altivec_register_operand" "v")))]
> > +  "TARGET_FLOAT128_HW && TARGET_POWER10 && FLOAT128_IEEE_P (mode)"
> > +  "xscqp %0,%1,%2"
> > +  [(set_attr "type" "fp")
> > +   (set_attr "size" "128")])
> 
> So why do we want the asymmetrical xsmincqp instead of xsminqp?  This
> should be documented at the very least.  (We also should have min/max
> that work correctly without -ffast-math, of course :-( ).

We don't have an xsminqp or xsmaxqp instruction in power10.  We only have
xsmincqp and xsmaxcqp instructions.

> > +;; IEEE 128-bit conditional move.  At present, don't worry about doing
> > +;; conditional moves with different types for the comparison and movement
> > +;; (unlike SF/DF, where you can do a conditional test between double and 
> > use
> > +;; float as the if/then parts.
> 
> (Unmatched brackets).  So why is this?  "At present" doesn't belong in
> the code btw, but in your patch description.

Typo.

> > +(define_insn_and_split "*movcc_hardware"
> 
> Please don't use meaningless names like this.
> 
> > +(define_insn "*xxsel"

Yes, this was more due to the fact that I was cloning the SF/DF code, and that
doesn't have XXSEL support.  I'll look at that.

> This already exists for other vector modes.  Put it together with that
> please.  Same for all other insn patterns.
> 
> Ideally you can make those one pattern then, 

PING: [PATCH] AArch64: Add if condition in aarch64_function_value [PR96479]

2020-08-11 Thread qiaopeixin
PING this issue.

-----Original Message-----
From: qiaopeixin 
Sent: August 6, 2020 21:01
To: 'gcc-patches@gcc.gnu.org' 
Cc: 'richard.sandif...@arm.com' 
Subject: [PATCH] AArch64: Add if condition in aarch64_function_value [PR96479]

Hi,

The test case vector-subscript-2.c in the GCC testsuite reports an ICE in 
the expand pass, because ‘-mgeneral-regs-only’ is incompatible with the use of 
V4SI mode. I propose reporting a diagnostic instead of the ICE; the problem 
has been discussed in PR 96479.

I have attached a patch that fixes the problem. Bootstrapped and tested on 
aarch64-linux-gnu. Any suggestions?

All the best,
Peixin


Re: [PATCH] rs6000: ICE when using an MMA type as a function param

2020-08-11 Thread Segher Boessenkool
On Tue, Aug 11, 2020 at 09:07:40PM -0500, Peter Bergner wrote:
> >> +  static struct function *fn = NULL;
> >> +
> >> +  /* We do not allow MMA types being used as return values.  Only report
> >> + the invalid return value usage the first time we encounter it.  */
> >> +  if (for_return
> >> +  && fn != cfun
> >> +  && (mode == POImode || mode == PXImode))
> > 
> > "fn" is always zero here.
> > 
> >> +{
> >> +  fn = cfun;
> > 
> > And what you set here is unused.
> 
> It's a static local variable, so how is it always zero and unused?

Oh, trickiness with it being called a second time.  Ouch!

This needs a H U G E comment then...  Or better, get rid of that?
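
To spell out the trick being discussed: a function-local static keeps its
value across calls, so comparing it against the current function reports only
the first offending call per function.  A minimal standalone sketch follows;
check_return_mode, the opaque context pointer and report_count are invented
for the illustration and are not the rs6000 code.

```c
#include <stddef.h>

/* Number of diagnostics issued; stands in for calling error ().  */
int report_count = 0;

/* Returns 1 if a diagnostic was issued.  CONTEXT plays the role of
   cfun, and BAD_MODE the role of the POImode/PXImode check.  */
int check_return_mode (const void *context, int bad_mode)
{
  /* Initialized once, then remembers the last context reported.  */
  static const void *last_reported = NULL;

  if (bad_mode && last_reported != context)
    {
      last_reported = context;
      report_count++;
      return 1;
    }
  return 0;
}
```

Two calls in a row with the same context report once.  Note that only the
most recent context is remembered, so interleaved contexts would be reported
again, which is part of why this wants a comment.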


Segher


Re: [PATCH] rs6000: ICE when using an MMA type as a function param

2020-08-11 Thread Peter Bergner via Gcc-patches
On 8/11/20 9:00 PM, Segher Boessenkool wrote:
> Not just params, but return values as well.  "Error on MMA types in
> function prototype"?

Yes, it started out as a function param issue and then while working
on this, I decided I better look at what happens when they're used
as return values.  I'll update the commit message to include return
values.



>> +  static struct function *fn = NULL;
>> +
>> +  /* We do not allow MMA types being used as return values.  Only report
>> + the invalid return value usage the first time we encounter it.  */
>> +  if (for_return
>> +  && fn != cfun
>> +  && (mode == POImode || mode == PXImode))
> 
> "fn" is always zero here.
> 
>> +{
>> +  fn = cfun;
> 
> And what you set here is unused.

It's a static local variable, so how is it always zero and unused?



>> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -w" } */
> 
> Do you need -w or could a less heavy hammer work as well?

I could probably declare bar0(), bar1(), bar2() and bar3() and
those might go away?  I didn't for some reason, but that may have
been for some earlier iteration of the test case.  I'll have a
look at removing that.

Peter




Re: [PATCH] rs6000: ICE when using an MMA type as a function param

2020-08-11 Thread Segher Boessenkool
Hi!

Not just params, but return values as well.  "Error on MMA types in
function prototype"?

On Sun, Aug 09, 2020 at 10:03:35PM -0500, Peter Bergner wrote:
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -6444,8 +6444,23 @@ machine_mode
>  rs6000_promote_function_mode (const_tree type ATTRIBUTE_UNUSED,
> machine_mode mode,
> int *punsignedp ATTRIBUTE_UNUSED,
> -   const_tree, int)
> +   const_tree, int for_return)
>  {
> +  static struct function *fn = NULL;
> +
> +  /* We do not allow MMA types being used as return values.  Only report
> + the invalid return value usage the first time we encounter it.  */
> +  if (for_return
> +  && fn != cfun
> +  && (mode == POImode || mode == PXImode))

"fn" is always zero here.

> +{
> +  fn = cfun;

And what you set here is unused.

So just remove fn?

> +  if (TYPE_CANONICAL (type) != NULL_TREE)

!= NULL_TREE != false != 0

(sorry sorry)

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96506.c
> @@ -0,0 +1,61 @@
> +/* PR target/96506 */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -w" } */

Do you need -w or could a less heavy hammer work as well?

Okay for trunk (and backports after some simmering) with those things
looked at.  Thanks!


Segher


Re: [PATCH 2/2] PowerPC: Add power10 IEEE 128-bit min/max/cmove.

2020-08-11 Thread Segher Boessenkool
Hi!

On Tue, Aug 11, 2020 at 12:23:07PM -0400, Michael Meissner wrote:
> +  /* See if we can use the ISA 3.1 min/max/compare instructions for IEEE
> + 128-bit floating point.  At present, don't worry about doing conditional
> + moves with different types for the comparison and movement (unlike 
> SF/DF,
> + where you can do a conditional test between double and use float as the
> + if/then parts. */

Why is that?  That makes the code quite different too (harder to
review), but also, it could just use existing code more.

> +;; IEEE 128-bit min/max
> +(define_insn "s3"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> + (fp_minmax:IEEE128
> +  (match_operand:IEEE128 1 "altivec_register_operand" "v")
> +  (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
> +  "TARGET_FLOAT128_HW && TARGET_POWER10 && FLOAT128_IEEE_P (mode)"
> +  "xscqp %0,%1,%2"
> +  [(set_attr "type" "fp")
> +   (set_attr "size" "128")])

So why do we want the asymmetrical xsmincqp instead of xsminqp?  This
should be documented at the very least.  (We also should have min/max
that work correctly without -ffast-math, of course :-( ).

> +;; IEEE 128-bit conditional move.  At present, don't worry about doing
> +;; conditional moves with different types for the comparison and movement
> +;; (unlike SF/DF, where you can do a conditional test between double and use
> +;; float as the if/then parts.

(Unmatched brackets).  So why is this?  "At present" doesn't belong in
the code btw, but in your patch description.

> +(define_insn_and_split "*movcc_hardware"

Please don't use meaningless names like this.

> +(define_insn "*xxsel"

This already exists for other vector modes.  Put it together with that
please.  Same for all other insn patterns.

Ideally you can make those one pattern then, even.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,70 @@
> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
> +/* { dg-final { scan-assembler-not "xscmpuqp"  } } */
> +/* { dg-final { scan-assembler "xscmpeqqp" } } */
> +/* { dg-final { scan-assembler "xscmpgtqp" } } */
> +/* { dg-final { scan-assembler "xscmpgeqp" } } */
> +/* { dg-final { scan-assembler "xsmaxcqp"  } } */
> +/* { dg-final { scan-assembler "xsmincqp"  } } */
> +/* { dg-final { scan-assembler "xxsel" } } */

Use \m \M please.  Don't ask for powerpc* in gcc.target/powerpc/ (it is
implied there).  Why does it need lp64?  There should be some test for
__float128, instead.
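
For reference, \m and \M are the Tcl regular-expression anchors for the start
and end of a word, so a pattern wrapped in them does not match a longer
mnemonic that merely contains the name.  A sketch of how the directives might
then look; which directives to keep is an assumption here, reusing the ones
from the patch and dropping only the powerpc* target selector:

```c
/* { dg-do compile { target lp64 } } */
/* { dg-require-effective-target power10_ok } */
/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
/* { dg-final { scan-assembler-not {\mxscmpuqp\M} } } */
/* { dg-final { scan-assembler     {\mxsmaxcqp\M} } } */
/* { dg-final { scan-assembler     {\mxsmincqp\M} } } */
/* { dg-final { scan-assembler     {\mxxsel\M}    } } */
```

Writing the patterns in braces rather than double quotes keeps Tcl from
interpreting the backslashes before the regular-expression engine sees them.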

All in all, this is very hard to review :-(


Segher


Re: [PATCH] rs6000: Update powerpc test cases to use -mdejagnu-cpu=.

2020-08-11 Thread Segher Boessenkool
Hi!

On Tue, Aug 11, 2020 at 08:22:47AM -0500, Peter Bergner wrote:
> I was looking through some POWER10 test cases and noticed that we used
> -mcpu=power10 rather than the preferred -mdejagnu-cpu=power10.

It is not just *preferred*; things work incorrectly on many systems
without it.  If all systems we tested on had a brand new software
install (a very bad idea -- we should test on older systems as well!),
or if we included dejagnu with GCC itself, this would not be a problem.

> I went
> looking for more tests that were not converted over and came up with the
> following patch.  Ok for trunk?

Very many, huh.  I swept through these not so long ago...


> gcc/testsuite/
>   * g++.dg/ext/spe1.C (dg-options): Use -mdejagnu-cpu=.

That test is not for us (it is for powerpcspe) (and the testcase could
really be deleted by now).  So NAK for this one.

>   * gcc.target/powerpc/pr93122.c: Likewise.

My fault, thanks.

>   * gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise.
>   * gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise.
>   * gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise.
>   * gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise.

Also new.

>   * gfortran.dg/pr47614.f (dg-options): Likewise.
>   * gfortran.dg/pr58968.f: Likewise.

Ah, I missed these two from the start.

>   * gfortran.dg/nint_p7.f90: Likewise.  Remove unneeded dg-skip-if.
>   * gfortran.dg/vect/pr45714-b.f: Likewise.

And these.

Maybe those should be moved to ppc-fortran/ ?

>   * g++.dg/pr65240-1.C: Likewise.
>   * g++.dg/pr65240-2.C: Likewise.
>   * g++.dg/pr65240-3.C: Likewise.
>   * g++.dg/pr65240-4.C: Likewise.
>   * g++.dg/pr65242.C: Likewise.
>   * g++.dg/pr67211.C: Likewise.
>   * g++.dg/pr69667.C: Likewise.
>   * g++.dg/pr71294.C: Likewise.
>   * g++.dg/pr84279.C: Likewise.
>   * g++.dg/torture/ppc-ldst-array.C: Likewise.
>   * g++.dg/torture/pr69264.C (dg-additional-options): Use -mdejagnu-cpu=.

Similar for these, but you did skip the SPE ones here?  And you skipped
a whole bunch more C++ tests?

>   * gcc.dg/pr84032.c: Likewise.
>   * gcc.dg/torture/pr90972.c: Likewise.
>   * gcc.dg/vect/O3-pr70130.c: Likewise.
>   * gcc.dg/vect/pr48765.c: Likewise.  Remove unneeded dg-skip-if.

5234d2e686ff shows the scripts I used originally:

perl -ni -e 'print unless /dg-skip-if "do not override -mcpu"/' \
  $(find gcc/testsuite/gcc.target/powerpc/ -type f)
perl -pi -e 's/(dg-options.*)-mcpu=/\1-mdejagnu-cpu=/'  \
  $(find gcc/testsuite/gcc.target/powerpc/ -type f)

You might find more false positives than I did with that in other dirs,
but if not, might be useful to keep in a script ;-) (And I know it
doesn't find everything, yes :-/ )

So other than the SPE ones it is fine, but perhaps you missed some?


Segher


[PATCH] c++: Fixing the wording of () aggregate-init [PR92812]

2020-08-11 Thread Marek Polacek via Gcc-patches
P1975R0 tweaks the static_cast wording: it says that "An expression e can be
explicitly converted to a type T if [...] T is an aggregate type having a first
element x and there is an implicit conversion sequence from e to the type of
x."  This already works for classes, e.g.:

  struct Aggr { int x; int y; };
  Aggr a = static_cast<Aggr>(1);

for which we create TARGET_EXPR .

The proposal also mentions "If T is ``array of unknown bound of U'',
this direct-initialization defines the type of the expression as U[1]" which
suggests that this should work for arrays (they're aggregates too, after all):

  int (&&r)[3] = static_cast<int[3]>(42);
  int (&&r2)[1] = static_cast<int[]>(42);

So I handled that specifically in build_static_cast_1: wrap the
expression in { } and initialize from that.  For the 'r' case above
this creates TARGET_EXPR .

There are multiple things in play, as usual, so the tests test brace
elision, narrowing, explicit constructors, and lifetime extension too.
I think it's in line with what we discussed on the core reflector.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/92812
* typeck.c (build_static_cast_1): Implement P1975R0 by allowing
static_cast to aggregate type.

gcc/testsuite/ChangeLog:

PR c++/92812
* g++.dg/cpp2a/paren-init27.C: New test.
* g++.dg/cpp2a/paren-init28.C: New test.
* g++.dg/cpp2a/paren-init29.C: New test.
* g++.dg/cpp2a/paren-init30.C: New test.
* g++.dg/cpp2a/paren-init31.C: New test.
* g++.dg/cpp2a/paren-init32.C: New test.
---
 gcc/cp/typeck.c   | 14 +
 gcc/testsuite/g++.dg/cpp2a/paren-init27.C | 24 +++
 gcc/testsuite/g++.dg/cpp2a/paren-init28.C | 15 ++
 gcc/testsuite/g++.dg/cpp2a/paren-init29.C | 15 ++
 gcc/testsuite/g++.dg/cpp2a/paren-init30.C | 23 ++
 gcc/testsuite/g++.dg/cpp2a/paren-init31.C | 10 ++
 gcc/testsuite/g++.dg/cpp2a/paren-init32.C | 21 
 7 files changed, 122 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init27.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init28.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init29.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init30.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init31.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/paren-init32.C

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index a557f3439a8..9166156a5d5 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -7480,6 +7480,20 @@ build_static_cast_1 (location_t loc, tree type, tree expr, bool c_cast_p,
  t.  */
   result = perform_direct_initialization_if_possible (type, expr,
  c_cast_p, complain);
+  /* P1975 allows static_cast(42), as well as static_cast(42),
+ which initialize the first element of the aggregate.  We need to handle
+ the array case specifically.  */
+  if (result == NULL_TREE
+  && cxx_dialect >= cxx20
+  && TREE_CODE (type) == ARRAY_TYPE)
+{
+  /* Create { EXPR } and perform direct-initialization from it.  */
+  tree e = build_constructor_single (init_list_type_node, NULL_TREE, expr);
+  CONSTRUCTOR_IS_DIRECT_INIT (e) = true;
+  CONSTRUCTOR_IS_PAREN_INIT (e) = true;
+  result = perform_direct_initialization_if_possible (type, e, c_cast_p,
+ complain);
+}
   if (result)
 {
   if (processing_template_decl)
diff --git a/gcc/testsuite/g++.dg/cpp2a/paren-init27.C b/gcc/testsuite/g++.dg/cpp2a/paren-init27.C
new file mode 100644
index 000..0b8cbe33b69
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/paren-init27.C
@@ -0,0 +1,24 @@
+// PR c++/92812
+// P1975R0
+// { dg-do run { target c++20 } }
+// { dg-options "-Wall -Wextra" }
+
+struct Aggr { int x; int y; };
+struct Base { int i; Base(int i_) : i{i_} { } };
+struct BaseAggr : Base { };
+struct X { };
+struct AggrSDM { static X x; int i; int j; };
+
+int
+main ()
+{
+  Aggr a = static_cast<Aggr>(42); // { dg-warning "missing initializer" }
+  if (a.x != 42 || a.y != 0)
+    __builtin_abort ();
+  BaseAggr b = static_cast<BaseAggr>(42);
+  if (b.i != 42)
+    __builtin_abort ();
+  AggrSDM s = static_cast<AggrSDM>(42); // { dg-warning "missing initializer" }
+  if (s.i != 42 || s.j != 0)
+    __builtin_abort ();
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/paren-init28.C b/gcc/testsuite/g++.dg/cpp2a/paren-init28.C
new file mode 100644
index 000..8c57dc8e155
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/paren-init28.C
@@ -0,0 +1,15 @@
+// PR c++/92812
+// P1975R0
+// { dg-do compile { target c++20 } }
+
+// In both cases the reference declarations lifetime-extend the array
+// temporary.
+int (&&r)[3] = static_cast<int[3]>(42);
+int (&&r2)[1] = static_cast<int[]>(42);
+
+// Make sure we've lifetime-extended.
+// { dg-final { scan-assembler "_ZGR1r_" } }
+// { dg-final { 

Re: [PATCH] emit-rtl.c: Allow splitting of RTX_FRAME_RELATED_P insns?

2020-08-11 Thread Segher Boessenkool
On Tue, Aug 11, 2020 at 07:59:44AM +0100, Richard Sandiford wrote:
> >> I agree there's no obvious reason why splitting to a single insn
> >> should be rejected but a peephole2 to a single instruction should be OK.
> >> And reusing the existing, tried-and-tested code is the way to go.
> >
> > The only obvious difference is that the splitters run many times, while
> > peep2 runs only once, very late.  If you make this only do stuff for
> > reload_completed splitters, that difference is gone as well.
> 
> Yeah, but I was talking specifically about RTX_FRAME_RELATED_P stuff,
> rather than in general, and RTX_FRAME_RELATED_P insns shouldn't exist
> until prologue/epilogue generation.

Yeah, good points.

> The reference to “single insn”
> was because both passes would still reject splitting/peepholing an
> RTX_FRAME_RELATED_P insn to multiple insns.

Is that the only thing RTX_FRAME_RELATED_P is actually useful for now,
btw?  (I do get that it is handy to have the simple cases in the
prologue automatically figured out, but that is at best a nicety).

What is actually bad about splitting FRAME_RELATED insns, anyway?  I
can think of many things that could go wrong, but all of those can go
wrong with 1-1 splits as well.  Maybe this all just works because not
very many 1-1 splits are used in practice?

So many questions, feel free to ignore all :-)


Segher


Re: [PATCH V3] Practical Improvement to libgcc Complex Divide

2020-08-11 Thread Patrick McGehearty via Gcc-patches

2nd ping.

Any estimate on when a reviewer might get to this improvement
in the accuracy of Complex Divide?

I'm happy to supply more info on what testing I've done
and details about design decisions. I'd prefer to do that
sooner than later as who knows when corporate priority decisions
might prevent me from having time for rapid response.

- Patrick McGehearty (patrick.mcgehea...@oracle.com)


On 7/21/2020 12:19 PM, Patrick McGehearty via Gcc-patches wrote:

Ping



On 7/1/2020 11:30 AM, Patrick McGehearty via Gcc-patches wrote:

(Version 3)

(Added in version 3)
Support for half, float, extended, and long double precision has
been added to the prior work for double precision. Since half precision
is computed with float precision as per current libgcc practice,
the enhanced underflow/overflow tests provide no benefit for half
precision and would cost performance. Therefore half precision is
left unchanged.

The existing constants for each precision:
float: FLT_MAX, FLT_MIN;
double: DBL_MAX, DBL_MIN;
extended and/or long double: LDBL_MAX, LDBL_MIN
are used for avoiding the more common overflow/underflow cases.

Additional investigation showed that when both parts of the denominator
have exponents small enough that any subnormal values can be shifted to
normal values, all input values can be scaled up without risking
unnecessary overflow, giving a clear improvement in accuracy. The test
and scaling values used all fit
within the allowed exponent range for each precision required by the C
standard. The remaining number of troubling results in version 3 is
measurably smaller than in versions 1 and 2.

The timing and precision tables below have been revised appropriately
to match the algorithms used in this version for double precision
and additional tables added to include results for other precisions.

In prior versions, I omitted mention of the bug report that started me
on this project: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714
("complex division is surprising on targets with FMA (was: on aarch64)").
With the proposed method, whether using FMA or not, dividing
1.0+3.0i by 1.0+3.0i correctly returns 1.0+0.0i.

I also have added a reference to Beebe's "The Mathematical Function
Computation Handbook" [4] which was my starting point for research
into better complex divide methods.

(Added for Version 2)
In my initial research, I missed Elen Kalda's proposed patch
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-08/msg01629.html [3]
Thanks to Joseph Myers for providing me with the pointer.
This version includes performance and accuracy comparisons
between Elen's proposed patch and my latest patch version for
double precision.

(from earlier Versions)

The following patch to libgcc/libgcc2.c __divdc3 provides an
opportunity to gain important improvements to the quality of answers
for the default complex divide routine (half, float, double, extended,
long double precisions) when dealing with very large or very small
exponents.


The current code correctly implements Smith's method (1962) [1]
further modified by C99's requirements for dealing with NaN (not a
number) results. When working with input values where the exponents
are greater than *_MAX_EXP/2 or less than -(*_MAX_EXP)/2, results are
substantially different from the answers provided by quad precision
more than 1% of the time. This error rate may be unacceptable for many
applications that cannot a priori restrict their computations to the
safe range. The proposed method reduces the frequency of
"substantially different" answers by more than 99% for double
precision at a modest cost of performance.
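
For readers who have not seen it, Smith's method can be sketched in C roughly
as follows.  This is a simplified illustration with invented names; it omits
the C99 NaN/infinity recovery that the real libgcc routine performs and is
not the code from the patch.

```c
/* Illustrative names; not taken from libgcc.  */
typedef struct { double re, im; } cplx;

/* Local absolute value so the sketch needs no math library.  */
static double absd (double v)
{
  return v < 0 ? -v : v;
}

/* Smith's method: divide through by the larger component of the
   denominator so that c*c + d*d is never formed.  */
cplx div_smith (cplx x, cplx y)
{
  double a = x.re, b = x.im, c = y.re, d = y.im;
  double r, denom;
  cplx q;

  if (absd (c) < absd (d))
    {
      r = c / d;
      denom = c * r + d;
      q.re = (a * r + b) / denom;
      q.im = (b * r - a) / denom;
    }
  else
    {
      r = d / c;
      denom = c + d * r;
      q.re = (a + b * r) / denom;
      q.im = (b - a * r) / denom;
    }
  return q;
}
```

Dividing 2^600+2^600i by itself with this routine returns exactly 1.0+0.0i,
where forming c*c + d*d directly would overflow.  The failures described
above only show up nearer the ends of the exponent range.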

Differences between current gcc methods and the new method will be
described. Then accuracy and performance differences will be discussed.

NOTATION

For all of the following, the notation is:
Input complex values:
   a+bi  (a= real part, b= imaginary part)
   c+di
Output complex value:
   e+fi = (a+bi)/(c+di)

For the result tables:
current = current method (SMITH)
b1div = method proposed by Elen Kalda
b2div = alternate method considered by Elen Kalda
new1 = new method using 1 divide and 2 multiplies
new = new method proposed by this patch

DESCRIPTIONS of different complex divide methods:

NAIVE COMPUTATION (-fcx-limited-range):
   e = (a*c + b*d)/(c*c + d*d)
   f = (b*c - a*d)/(c*c + d*d)

Note that c*c and d*d will overflow or underflow if either
c or d is outside the range 2^-538 to 2^512.

This method is available in gcc when the switch -fcx-limited-range is
used. That switch is also enabled by -ffast-math. Only one who has a
clear understanding of the maximum range of intermediate values
generated by a computation should consider using this switch.
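
To make the overflow concrete, here is a small sketch of the naive formulas
above in C.  The struct and function names are invented for the example; this
is not the libgcc code.

```c
/* Illustrative names; not taken from libgcc.  */
typedef struct { double re, im; } cplx;

/* Naive complex divide: (a+bi)/(c+di) using the textbook formulas.  */
cplx div_naive (cplx x, cplx y)
{
  double a = x.re, b = x.im, c = y.re, d = y.im;
  double denom = c * c + d * d;   /* overflows or underflows easily */
  cplx q;

  q.re = (a * c + b * d) / denom;
  q.im = (b * c - a * d) / denom;
  return q;
}
```

For 1.0+3.0i divided by itself this gives 1.0+0.0i.  For 2^600+2^600i divided
by itself, c*c overflows to infinity and both parts of the result come out as
NaN, although the true quotient is 1.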

SMITH's METHOD (current libgcc):
   if(fabs(c) /* Changing the order of operations avoids the underflow of r 
impacting

  the result. */
 x = (a + (d * (b / c))) * t;
 y = (b - (d * (a / c))) * t;
 }
   }

   if (FABS (d) < FABS (c)) {
   

Re: [PATCH] c++: Improve RANGE_EXPR optimization in cxx_eval_vec_init

2020-08-11 Thread Patrick Palka via Gcc-patches
On Tue, 11 Aug 2020, Jason Merrill wrote:

> On 8/10/20 9:21 AM, Patrick Palka wrote:
> > On Fri, 7 Aug 2020, Jason Merrill wrote:
> > 
> > > On 8/6/20 1:50 PM, Patrick Palka wrote:
> > > > This patch eliminates an exponential dependence in cxx_eval_vec_init on
> > > > the array dimension of a VEC_INIT_EXPR when the RANGE_EXPR optimization
> > > > applies.  This is achieved by using a single constructor_elt (with index
> > > > RANGE_EXPR 0...max-1) per dimension instead of two constructor_elts
> > > > (with index 0 and RANGE_EXPR 1...max-1 respectively).  In doing so, we
> > > > can also get rid of the call to unshare_constructor since the element
> > > > initializer now gets used in exactly one spot.
> > > > 
> > > > The patch also removes the 'eltinit = new_ctx.ctor' assignment within
> > > > the
> > > > RANGE_EXPR optimization since eltinit should already always be equal to
> > > > new_ctx.ctor here (modulo encountering an error when computing eltinit).
> > > > This was verified by running the testsuite against an appropriate
> > > > assert.
> > > 
> > > Maybe keep that assert?
> > 
> > FWIW, the assert was
> > 
> >gcc_assert (*non_constant_p || eltinit == new_ctx->ctor);
> > 
> > and apparently it survives the testsuite when added to either the
> > RANGE_EXPR or non-RANGE_EXPR code paths in cxx_eval_vec_init.
> > 
> > I then tried adding an analogous assert to cxx_eval_bare_aggregate, but
> > this assert triggers for lots of our testcases, in particular when (but
> > not only when) an elt initializer is already a reduced constant
> > CONSTRUCTOR (since then cxx_eval_constant_expression just returns this
> > already-reduced CONSTRUCTOR without updating ctx->ctor).
> > 
> > I'm not sure why the assert should necessarily hold in cxx_eval_vec_init
> > but not in cxx_eval_bare_aggregate.  I guess we never see a
> > VEC_INIT_EXPR whose elt initializer is a reduced constant CONSTRUCTOR or
> > similar?
> 
> That sounds like a plausible reason.
> 
> > > 
> > > > Finally, this patch reverses the sense of the ctx->quiet test that
> > > > controls whether to short-circuit evaluation upon seeing an error.  This
> > > > should speed up speculative evaluation of non-constant VEC_INIT_EXPRs
> > > > (since ctx->quiet is true then).  I'm not sure why we were testing
> > > > !ctx->quiet originally; it's inconsistent with how we short-circuit in
> > > > other spots.
> > > 
> > > Good question.  That code seems to go back to the initial implementation
> > > of
> > > constexpr.
> > > 
> > >I contrived the testcase array60.C below which verifies
> > > > that we now short-circuit quickly.
> > > > 
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK to
> > > > commit?
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * constexpr.c (cxx_eval_vec_init_1): Move the i == 0 test to the
> > > > if statement that guards the RANGE_EXPR optimization.  Invert
> > > > the ctx->quiet test. Apply the RANGE_EXPR optimization before we
> > > > append the first element initializer.  Truncate ctx->ctor when
> > > > performing the RANGE_EXPR optimization.  Make the built
> > > > RANGE_EXPR start at index 0 instead of 1.  Don't call
> > > > unshare_constructor.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/cpp0x/constexpr-array28.C: New test.
> > > > * g++.dg/init/array60.C: New test.
> > > > ---
> > > >gcc/cp/constexpr.c| 34
> > > > ++-
> > > >.../g++.dg/cpp0x/constexpr-array28.C  | 14 
> > > >gcc/testsuite/g++.dg/init/array60.C   | 13 +++
> > > >3 files changed, 45 insertions(+), 16 deletions(-)
> > > >create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-array28.C
> > > >create mode 100644 gcc/testsuite/g++.dg/init/array60.C
> > > > 
> > > > diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
> > > > index ab747a58fa0..e67ce5da355 100644
> > > > --- a/gcc/cp/constexpr.c
> > > > +++ b/gcc/cp/constexpr.c
> > > > @@ -4205,7 +4205,7 @@ cxx_eval_vec_init_1 (const constexpr_ctx *ctx,
> > > > tree
> > > > atype, tree init,
> > > >   if (value_init || init == NULL_TREE)
> > > > {
> > > >   eltinit = NULL_TREE;
> > > > - reuse = i == 0;
> > > > + reuse = true;
> > > > }
> > > >   else
> > > > eltinit = cp_build_array_ref (input_location, init, idx,
> > > > complain);
> > > > @@ -4222,7 +4222,7 @@ cxx_eval_vec_init_1 (const constexpr_ctx *ctx,
> > > > tree
> > > > atype, tree init,
> > > > return ctx->ctor;
> > > >   eltinit = cxx_eval_constant_expression (_ctx, init,
> > > > lval,
> > > >   non_constant_p,
> > > > overflow_p);
> > > > - reuse = i == 0;
> > > > + reuse = true;
> 
> The patch seems to replace checking i == 0 here with checking it in the
> 

libgo patch committed: Fix system call numbers for ppc

2020-08-11 Thread Ian Lance Taylor via Gcc-patches
The libgo update to the Go1.15rc1 release accidentally broke the
system call numbers used for 32-bit PPC.  This patch fixes the
problem.  This fixes GCC PR 96567.  Bootstrapped and ran Go tests on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
2dda65380ae0b038884ed0e8b30c624486a516c2
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 93aa18cec06..08daa1a5924 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-c512af85eb8c75a759b5e4fc6b72041fe09b75f1
+e08f1d7d1bc14c0a29eb9ee17980f14fa2397239
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/internal/syscall/unix/sysnum_linux_ppc64x.go 
b/libgo/go/internal/syscall/unix/sysnum_linux_ppc64x.go
index 576937e3f5c..aa2e81a667e 100644
--- a/libgo/go/internal/syscall/unix/sysnum_linux_ppc64x.go
+++ b/libgo/go/internal/syscall/unix/sysnum_linux_ppc64x.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-// +build ppc64 ppc64le
+// +build ppc ppc64 ppc64le
 
 package unix
 


[Patch 1/5] rs6000, Add 128-bit sign extension support

2020-08-11 Thread Carl Love via Gcc-patches
Segher, Will:

Patch 1 adds support for the sign extension instructions and the
corresponding builtins.
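
As a rough illustration of what a byte-to-word sign extension computes, here is a
minimal scalar sketch in portable C.  It assumes the instruction sign-extends the
low-order byte of each 32-bit word element; the helper names are hypothetical and
are not GCC builtins.

```c
#include <stdint.h>

/* Scalar model of one word lane: take the low-order byte of the 32-bit
   lane and sign-extend it to 32 bits.  Hypothetical helper, not a GCC
   builtin.  */
static int32_t
sign_extend_byte_to_word (uint32_t lane)
{
  uint32_t low = lane & 0xffu;
  /* Computed arithmetically to avoid an implementation-defined
     narrowing conversion.  */
  return (low & 0x80u) ? (int32_t) low - 256 : (int32_t) low;
}

/* Apply the lane model to the four word elements of a 128-bit vector.  */
static void
model_signexti_bytes (const uint32_t in[4], int32_t out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = sign_extend_byte_to_word (in[i]);
}
```

On real hardware the vec_signexti builtin added by this patch would be used
instead; the model is only meant to make the per-lane arithmetic concrete.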

 Carl Love

-
RS6000 Add 128-bit sign extension support

gcc/ChangeLog

2020-08-10  Carl Love  
* config/rs6000/altivec.h (vec_signextll, vec_signexti): Add define
for new builtins.
* config/rs6000/rs6000-builtin.def (VSIGNEXTI, VSIGNEXTLL):  Add
overloaded builtin definitions.
(VSIGNEXTSB2W, VSIGNEXTSB2D, VSIGNEXTSH2D, VSIGNEXTSW2D): Add builtin
expansions.
* config/rs6000/rs6000-call.c (P9V_BUILTIN_VEC_VSIGNEXTI,
P9V_BUILTIN_VEC_VSIGNEXTLL): Add overloaded argument definitions.
* config/rs6000/vsx.md: Make define_insn vsx_sign_extend_si_v2di
visible.
* doc/extend.texi:  Add documentation for the vec_signexti and
vec_signextll builtins.

gcc/testsuite/ChangeLog

2020-08-10  Carl Love  
* gcc.target/powerpc/p9-sign_extend-runnable.c:  New test case.
---
 gcc/config/rs6000/altivec.h   |   3 +
 gcc/config/rs6000/rs6000-builtin.def  |   9 ++
 gcc/config/rs6000/rs6000-call.c   |  13 ++
 gcc/config/rs6000/vsx.md  |   2 +-
 gcc/doc/extend.texi   |  15 ++
 .../powerpc/p9-sign_extend-runnable.c | 128 ++
 6 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index bf2240f16a2..09320df14ca 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -498,6 +498,9 @@
 
 #define vec_xlx __builtin_vec_vextulx
 #define vec_xrx __builtin_vec_vexturx
+#define vec_signexti  __builtin_vec_vsignexti
+#define vec_signextll __builtin_vec_vsignextll
+
 #endif
 
 /* Predicates.
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index f9f0fece549..667c2450d41 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2691,6 +2691,8 @@ BU_P9V_OVERLOAD_1 (VPRTYBD,   "vprtybd")
 BU_P9V_OVERLOAD_1 (VPRTYBQ,"vprtybq")
 BU_P9V_OVERLOAD_1 (VPRTYBW,"vprtybw")
 BU_P9V_OVERLOAD_1 (VPARITY_LSBB,   "vparity_lsbb")
+BU_P9V_OVERLOAD_1 (VSIGNEXTI,  "vsignexti")
+BU_P9V_OVERLOAD_1 (VSIGNEXTLL, "vsignextll")
 
 /* 2 argument functions added in ISA 3.0 (power9).  */
 BU_P9_2 (CMPRB,"byte_in_range",CONST,  cmprb)
@@ -2702,6 +2704,13 @@ BU_P9_OVERLOAD_2 (CMPRB, "byte_in_range")
 BU_P9_OVERLOAD_2 (CMPRB2,  "byte_in_either_range")
 BU_P9_OVERLOAD_2 (CMPEQB,  "byte_in_set")
 
+/* Sign extend builtins that work on ISA 3.0, but not defined until ISA 3.1.  
*/
+BU_P9V_AV_1 (VSIGNEXTSB2W, "vsignextsb2w", CONST,  
vsx_sign_extend_qi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSH2W, "vsignextsh2w", CONST,  
vsx_sign_extend_hi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSB2D, "vsignextsb2d", CONST,  
vsx_sign_extend_qi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSH2D, "vsignextsh2d", CONST,  
vsx_sign_extend_hi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSW2D, "vsignextsw2d", CONST,  
vsx_sign_extend_si_v2di)
+
 /* Builtins for scalar instructions added in ISA 3.1 (power10).  */
 BU_P10_MISC_2 (CFUGED, "cfuged", CONST, cfuged)
 BU_P10_MISC_2 (CNTLZDM, "cntlzdm", CONST, cntlzdm)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 189497efb45..87699be8a07 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -5527,6 +5527,19 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
 RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
 RS6000_BTI_INTSI, RS6000_BTI_INTSI },
 
+  /* Sign extend builtins that work on ISA 3.0, not added until ISA 3.1 */
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSB2W,
+RS6000_BTI_V4SI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSH2W,
+RS6000_BTI_V4SI, RS6000_BTI_V8HI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSB2D,
+RS6000_BTI_V2DI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSH2D,
+RS6000_BTI_V2DI, RS6000_BTI_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSW2D,
+RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
+
   /* Overloaded built-in functions for ISA3.1 (power10). */
   { P10_BUILTIN_VEC_CLRL, P10_BUILTIN_VCLRLB,
 RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index dd750210758..1153a01b4ef 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4787,7 +4787,7 @@
   "vextsh2 %0,%1"
   [(set_attr "type" "vecexts")])
 
-(define_insn "*vsx_sign_extend_si_v2di"
+(define_insn "vsx_sign_extend_si_v2di"
   [(set (match_operand:V2DI 0 "vsx_register_operand" "=v")

[Patch 5/5] rs6000, Conversions between 128-bit integer and floating point values.

2020-08-11 Thread Carl Love via Gcc-patches
Segher, Will:

Patch 5 adds conversions between 128-bit integers and 128-bit floating
point.  The resolve functions select the 128-bit hardware instructions
when running on Power 10 and fall back to the software routines on
pre-Power 10 systems.

  Carl 
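
The hardware-or-software selection described above can be sketched as follows.
This is a simplified model, not the libgcc code: it uses a plain function pointer
rather than a GNU ifunc resolver, stands in double and long long for the 128-bit
types, and all names (conv_sw, conv_hw, conv_resolve) are hypothetical.

```c
#include <stdbool.h>

/* Common signature for both implementations of the conversion.  */
typedef long long (*conv_fn) (double);

/* Hypothetical stand-in for the software routine.  */
static long long
conv_sw (double x)
{
  return (long long) x;
}

/* Hypothetical stand-in for the routine that would use the hardware
   instruction.  */
static long long
conv_hw (double x)
{
  return (long long) x;
}

/* Hypothetical resolver: choose an implementation from a capability
   flag that the caller has already determined.  */
static conv_fn
conv_resolve (bool have_hw)
{
  return have_hw ? conv_hw : conv_sw;
}
```

A caller would resolve once and then call through the returned pointer, so the
capability check is not repeated on every conversion.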

---
Conversions between 128-bit integer and floating point values.

gcc/ChangeLog

2020-08-10  Carl Love  
config/rs6000/rs6000.md (floatti<mode>2, floatunsti<mode>2,
fix_trunc<mode>ti2, fixuns_trunc<mode>ti2): Add
define_insn for mode IEEE 128.
libgcc/config/rs6000/fixkfi-sw.c: New file.
libgcc/config/rs6000/fixkfi.c: Remove file.
libgcc/config/rs6000/fixunskfi-sw.c: New file.
libgcc/config/rs6000/fixunskfi.c: Remove file.
libgcc/config/rs6000/float128-hw.c (__floattikf_hw,
__floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw):
New functions.
libgcc/config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1):
New macro.
(__floattikf_resolve, __floatuntikf_resolve, __fixkfti_resolve,
__fixunskfti_resolve): Add resolve functions.
(__floattikf, __floatuntikf, __fixkfti, __fixunskfti): New
functions.
libgcc/config/rs6000/float128-sed (floattitf, __floatuntitf,
__fixtfti, __fixunstfti): Add editor commands to change
names.
libgcc/config/rs6000/float128-sed-hw (__floattitf,
__floatuntitf, __fixtfti, __fixunstfti): Add editor commands
to change names.
libgcc/config/rs6000/floattikf-sw.c: New file.
libgcc/config/rs6000/floattikf.c: Remove file.
libgcc/config/rs6000/floatuntikf-sw.c: New file.
libgcc/config/rs6000/floatuntikf.c: Remove file.
libgcc/config/rs6000/quad-float128.h (__floattikf_sw,
__floatuntikf_sw, __fixkfti_sw, __fixunskfti_sw, __floattikf_hw,
__floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw, __floattikf,
__floatuntikf, __fixkfti, __fixunskfti): New extern declarations.
libgcc/config/rs6000/t-float128 (floattikf, floatuntikf,
fixkfti, fixunskfti): Remove file names from fp128_ppc_funcs.
(floattikf-sw, floatuntikf-sw, fixkfti-sw, fixunskfti-sw): Add
file names to fp128_ppc_funcs.

gcc/testsuite/ChangeLog

2020-08-10  Carl Love  
gcc.target/powerpc/fp128_conversions.c: New file.
---
 gcc/config/rs6000/rs6000.md   |  36 +++
 .../gcc.target/powerpc/fp128_conversions.c| 287 ++
 .../config/rs6000/{fixkfti.c => fixkfti-sw.c} |   4 +-
 .../rs6000/{fixunskfti.c => fixunskfti-sw.c}  |   4 +-
 libgcc/config/rs6000/float128-hw.c|  24 ++
 libgcc/config/rs6000/float128-ifunc.c |  44 ++-
 libgcc/config/rs6000/float128-sed |   4 +
 libgcc/config/rs6000/float128-sed-hw  |   4 +
 .../rs6000/{floattikf.c => floattikf-sw.c}|   4 +-
 .../{floatuntikf.c => floatuntikf-sw.c}   |   4 +-
 libgcc/config/rs6000/quad-float128.h  |  17 +-
 libgcc/config/rs6000/t-float128   |   3 +-
 12 files changed, 415 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/fp128_conversions.c
 rename libgcc/config/rs6000/{fixkfti.c => fixkfti-sw.c} (96%)
 rename libgcc/config/rs6000/{fixunskfti.c => fixunskfti-sw.c} (96%)
 rename libgcc/config/rs6000/{floattikf.c => floattikf-sw.c} (96%)
 rename libgcc/config/rs6000/{floatuntikf.c => floatuntikf-sw.c} (96%)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 43b620ae1c0..3853ebd4195 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -6390,6 +6390,42 @@
xscvsxddp %x0,%x1"
   [(set_attr "type" "fp")])
 
+(define_insn "floatti<mode>2"
+  [(set (match_operand:IEEE128 0 "vsx_register_operand" "=v")
+   (float:IEEE128 (match_operand:TI 1 "vsx_register_operand" "v")))]
+  "TARGET_POWER10"
+{
+  return  "xscvsqqp %0,%1";
+}
+  [(set_attr "type" "fp")])
+
+(define_insn "floatunsti<mode>2"
+  [(set (match_operand:IEEE128 0 "vsx_register_operand" "=v")
+   (unsigned_float:IEEE128 (match_operand:TI 1 "vsx_register_operand" 
"v")))]
+  "TARGET_POWER10"
+{
+  return  "xscvuqqp %0,%1";
+}
+  [(set_attr "type" "fp")])
+
+(define_insn "fix_trunc<mode>ti2"
+  [(set (match_operand:TI 0 "vsx_register_operand" "=v")
+   (fix:TI (match_operand:IEEE128 1 "vsx_register_operand" "v")))]
+  "TARGET_POWER10"
+{
+  return  "xscvqpsqz %0,%1";
+}
+  [(set_attr "type" "fp")])
+
+(define_insn "fixuns_trunc<mode>ti2"
+  [(set (match_operand:TI 0 "vsx_register_operand" "=v")
+   (unsigned_fix:TI (match_operand:IEEE128 1 "vsx_register_operand" "v")))]
+  "TARGET_POWER10"
+{
+  return  "xscvqpuqz %0,%1";
+}
+  [(set_attr "type" "fp")])
+
 ; Allow the combiner to merge source memory operands to the conversion so that
 ; the optimizer/register allocator doesn't try to load the value too early in a
 ; GPR 

[Patch 2/5] rs6000, 128-bit multiply, divide, modulo, shift, compare

2020-08-11 Thread Carl Love via Gcc-patches
Segher, Will:

Patch 2 adds instruction and builtin support for multiply, divide,
modulo, shift, and compare of 128-bit integers.

 Carl Love


---
rs6000, 128-bit multiply, divide, shift, compare

gcc/ChangeLog

2020-08-10  Carl Love  
* config/rs6000/altivec.h (vec_signextq, vec_dive, vec_mod): Add defines
for new builtins.
* config/rs6000/altivec.md (UNSPEC_VMULEUD, UNSPEC_VMULESD,
UNSPEC_VMULOUD, UNSPEC_VMULOSD): New unspecs.
(altivec_eqv1ti, altivec_gtv1ti, altivec_gtuv1ti, altivec_vmuleud,
altivec_vmuloud, altivec_vmulesd, altivec_vmulosd, altivec_vrlq,
altivec_vrlqmi, altivec_vrlqmi_inst, altivec_vrlqnm,
altivec_vrlqnm_inst, altivec_vslq, altivec_vsrq, altivec_vsraq,
altivec_vcmpequt_p, altivec_vcmpgtst_p, altivec_vcmpgtut_p): New
define_insn.
(vec_widen_umult_even_v2di, vec_widen_smult_even_v2di,
vec_widen_umult_odd_v2di, vec_widen_smult_odd_v2di, altivec_vrlqmi,
altivec_vrlqnm): New define_expands.
* config/rs6000/rs6000-builtin.def (BU_P10_P, BU_P10_128BIT_1,
BU_P10_128BIT_2, BU_P10_128BIT_3): New macro definitions.
(VCMPEQUT_P, VCMPGTST_P, VCMPGTUT_P): Add macro expansions.
(VCMPGTUT, VCMPGTST, VCMPEQUT, CMPNET, CMPGE_1TI,
CMPGE_U1TI, CMPLE_1TI, CMPLE_U1TI, VNOR_V1TI_UNS, VNOR_V1TI, VCMPNET_P,
VCMPAET_P): New macro expansions.
(VSIGNEXTSD2Q,VMULEUD, VMULESD, VMULOUD, VMULOSD, VRLQ, VSLQ,
VSRQ, VSRAQ, VRLQNM, DIV_V1TI, UDIV_V1TI, DIVES_V1TI, DIVEU_V1TI,
MODS_V1TI, MODU_V1TI, VRLQMI): New macro expansions.
(VRLQ, VSLQ, VSRQ, VSRAQ, SIGNEXT): New overload expansions.
* config/rs6000/rs6000-call.c (P10_BUILTIN_VCMPEQUT,
P10_BUILTIN_VCMPEQUT, P10_BUILTIN_CMPGE_1TI,
P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT,
P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
P10_BUILTIN_128BIT_DIV_V1TI, P10_BUILTIN_128BIT_UDIV_V1TI,
P10_BUILTIN_128BIT_VMULESD, P10_BUILTIN_128BIT_VMULEUD,
P10_BUILTIN_128BIT_VMULOSD, P10_BUILTIN_128BIT_VMULOUD,
P10_BUILTIN_VNOR_V1TI, P10_BUILTIN_VNOR_V1TI_UNS,
P10_BUILTIN_128BIT_VRLQ, P10_BUILTIN_128BIT_VRLQMI,
P10_BUILTIN_128BIT_VRLQNM, P10_BUILTIN_128BIT_VSLQ,
P10_BUILTIN_128BIT_VSRQ, P10_BUILTIN_128BIT_VSRAQ,
P10_BUILTIN_VCMPGTUT_P, P10_BUILTIN_VCMPGTST_P,
P10_BUILTIN_VCMPEQUT_P, P10_BUILTIN_VCMPGTUT_P,
P10_BUILTIN_VCMPGTST_P, P10_BUILTIN_CMPNET,
P10_BUILTIN_VCMPNET_P, P10_BUILTIN_VCMPAET_P,
P10_BUILTIN_128BIT_VSIGNEXTSD2Q, P10_BUILTIN_128BIT_DIVES_V1TI,
P10_BUILTIN_128BIT_MODS_V1TI, P10_BUILTIN_128BIT_MODU_V1TI):
New overloaded definitions.
(int_ftype_int_v1ti_v1ti) [P10_BUILTIN_VCMPEQUT,
P10_BUILTIN_CMPNET, P10_BUILTIN_CMPGE_1TI,
P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT,
P10_BUILTIN_VCMPGTST, P10_BUILTIN_CMPLE_1TI,
P10_BUILTIN_CMPLE_U1TI, E_V1TImode]: New case statements.
(int_ftype_int_v1ti_v1ti) [bool_V1TI_type_node, 
int_ftype_int_v1ti_v1ti]:
New assignments.
(int_ftype_int_v1ti_v1ti)[P10_BUILTIN_128BIT_VMULEUD,
P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_128BIT_DIVEU_V1TI,
P10_BUILTIN_128BIT_MODU_V1TI, P10_BUILTIN_CMPGE_U1TI,
P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPEQUT]: New case statements.
* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): New
TARGET_TI_VECTOR_OPS definition.
(rs6000_option_override_internal): Add if TARGET_POWER10 statement.
(rs6000_handle_altivec_attribute)[ E_TImode, E_V1TImode]: New case
statements.
(rs6000_opt_masks): Add ti-vector-ops entry.
* config/rs6000/rs6000.h (MASK_TI_VECTOR_OPS, RS6000_BTM_P10_128BIT,
RS6000_BTM_TI_VECTOR_OPS, bool_V1TI_type_node): New defines.
(rs6000_builtin_type_index): New enum value RS6000_BTI_bool_V1TI.
* config/rs6000/rs6000.opt: New mti-vector-ops entry.
* config/rs6000/vector.md (vector_eqv1ti, vector_gtv1ti,
vector_nltv1ti, vector_gtuv1ti, vector_nltuv1ti, vector_ngtv1ti,
vector_ngtuv1ti, vector_eq_v1ti_p, vector_ne_v1ti_p, vector_ae_v1ti_p,
vector_gt_v1ti_p, vector_gtu_v1ti_p, vrotlv1ti3, vashlv1ti3,
vlshrv1ti3, vashrv1ti3): New define_expands.
* config/rs6000/vsx.md (UNSPEC_VSX_DIVSQ, UNSPEC_VSX_DIVUQ,
UNSPEC_VSX_DIVESQ, UNSPEC_VSX_DIVEUQ, UNSPEC_VSX_MODSQ,
UNSPEC_VSX_MODUQ, UNSPEC_XXSWAPD_V1TI): New unspecs.
(vsx_div_v1ti, vsx_udiv_v1ti, vsx_dives_v1ti, vsx_diveu_v1ti,
vsx_mods_v1ti, vsx_modu_v1ti, xxswapd_v1ti, vsx_sign_extend_v2di_v1ti):
New define_insns.
(vcmpnet): New define_expand.
* gcc/doc/extend.texi: Add documentation for the new builtins vec_rl,
vec_rlmi, vec_rlnm, vec_sl, vec_sr, vec_sra, vec_mule, vec_mulo,
vec_div, 

[Patch 4/5] rs6000, Test 128-bit shifts for just the int128 type.

2020-08-11 Thread Carl Love via Gcc-patches
Segher, Will:

Patch 4 adds 128-bit integer shift instruction support.
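
The comments in the diff below say the shift amount must sit in bits[57:63] of
the 128-bit count operand, which is why the expanders swap doublewords first.  A
minimal sketch of that behaviour in C, assuming IBM bit numbering (bit 0 is the
most significant bit, so bits 57:63 are the low 7 bits of the high doubleword)
and a compiler that provides unsigned __int128.  The model_* names are
hypothetical.

```c
#include <stdint.h>

typedef unsigned __int128 u128;

/* Build a 128-bit value from two 64-bit halves.  */
static u128
make_u128 (uint64_t hi, uint64_t lo)
{
  return ((u128) hi << 64) | lo;
}

/* Model of the doubleword swap done before the shift.  */
static u128
model_swapd (u128 x)
{
  return (x << 64) | (x >> 64);
}

/* Assumed location of the shift count: the low 7 bits of the high
   doubleword of the count operand.  */
static unsigned
model_shift_count (u128 count_operand)
{
  return (unsigned) ((uint64_t) (count_operand >> 64) & 0x7f);
}

/* Model of a 128-bit logical shift left.  */
static u128
model_vslq (u128 value, u128 count_operand)
{
  return value << model_shift_count (count_operand);
}

/* Model of a 128-bit logical shift right.  */
static u128
model_vsrq (u128 value, u128 count_operand)
{
  return value >> model_shift_count (count_operand);
}
```

In this model, passing an ordinary integer count without the swap leaves the
count field zero, so the value comes back unshifted.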

 Carl Love

-
Test 128-bit shifts for just the int128 type.

gcc/ChangeLog

2020-08-10  Carl Love  
* config/rs6000/altivec.md (altivec_vslq, altivec_vsrq): Add mode
VEC_I128.
* config/rs6000/vector.md (VEC_I128): New mode iterator.
(vashlv1ti3): Change to vashl<mode>3, mode VEC_I128.
(vlshrv1ti3): Change to vlshr<mode>3, mode VEC_I128.
* config/rs6000/vsx.md (UNSPEC_XXSWAPD_V1TI): Change to
UNSPEC_XXSWAPD_VEC_I128.
(xxswapd_v1ti): Change to xxswapd_<mode>, mode VEC_I128.

gcc/testsuite/ChangeLog

2020-08-10  Carl Love  
gcc.target/powerpc/int_128bit-runnable.c: Add shift_right, shift_left
tests.
---
 gcc/config/rs6000/altivec.md  | 16 +--
 gcc/config/rs6000/vector.md   | 27 ++-
 gcc/config/rs6000/vsx.md  | 14 +-
 .../gcc.target/powerpc/int_128bit-runnable.c  | 24 +++--
 4 files changed, 52 insertions(+), 29 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 2763d920828..cba39852070 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2219,10 +2219,10 @@
   "vsl %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
-(define_insn "altivec_vslq"
-  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
-   (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
-(match_operand:V1TI 2 "vsx_register_operand" "v")))]
+(define_insn "altivec_vslq_<mode>"
+  [(set (match_operand:VEC_I128 0 "vsx_register_operand" "=v")
+   (ashift:VEC_I128 (match_operand:VEC_I128 1 "vsx_register_operand" "v")
+(match_operand:VEC_I128 2 "vsx_register_operand" "v")))]
   "TARGET_TI_VECTOR_OPS"
   /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
   "vslq %0,%1,%2"
@@ -2236,10 +2236,10 @@
   "vsr %0,%1,%2"
   [(set_attr "type" "vecsimple")])
 
-(define_insn "altivec_vsrq"
-  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
-   (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
-  (match_operand:V1TI 2 "vsx_register_operand" "v")))]
+(define_insn "altivec_vsrq_<mode>"
+  [(set (match_operand:VEC_I128 0 "vsx_register_operand" "=v")
+   (lshiftrt:VEC_I128 (match_operand:VEC_I128 1 "vsx_register_operand" "v")
+  (match_operand:VEC_I128 2 "vsx_register_operand" 
"v")))]
   "TARGET_TI_VECTOR_OPS"
   /* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
   "vsrq %0,%1,%2"
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 2deff282076..682aabc4657 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -26,6 +26,9 @@
 ;; Vector int modes
 (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI])
 
+;; 128-bit int modes
+(define_mode_iterator VEC_I128 [V1TI TI])
+
 ;; Vector int modes for parity
 (define_mode_iterator VEC_IP [V8HI
  V4SI
@@ -1635,17 +1638,17 @@
   "")
 
 ;; No immediate version of this 128-bit instruction
-(define_expand "vashlv1ti3"
-  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
-   (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
-(match_operand:V1TI 2 "vsx_register_operand" "v")))]
+(define_expand "vashl<mode>3"
+  [(set (match_operand:VEC_I128 0 "vsx_register_operand" "=v")
+   (ashift:VEC_I128 (match_operand:VEC_I128 1 "vsx_register_operand")
+(match_operand:VEC_I128 2 "vsx_register_operand")))]
   "TARGET_TI_VECTOR_OPS"
 {
   /* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
-  rtx tmp = gen_reg_rtx (V1TImode);
+  rtx tmp = gen_reg_rtx (<MODE>mode);
 
   emit_insn(gen_xxswapd_v1ti (tmp, operands[2]));
-  emit_insn(gen_altivec_vslq (operands[0], operands[1], tmp));
+  emit_insn(gen_altivec_vslq_<mode> (operands[0], operands[1], tmp));
   DONE;
 })
 
@@ -1658,17 +1661,17 @@
   "")
 
 ;; No immediate version of this 128-bit instruction
-(define_expand "vlshrv1ti3"
-  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
-   (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
-  (match_operand:V1TI 2 "vsx_register_operand" "v")))]
+(define_expand "vlshr<mode>3"
+  [(set (match_operand:VEC_I128 0 "vsx_register_operand" "=v")
+   (lshiftrt:VEC_I128 (match_operand:VEC_I128 1 "vsx_register_operand")
+  (match_operand:VEC_I128 2 "vsx_register_operand")))]
   "TARGET_TI_VECTOR_OPS"
 {
   /* Shift amount in needs to be put into bits[57:63] of 128-bit operand2. */
-  rtx tmp = gen_reg_rtx (V1TImode);
+  rtx tmp = gen_reg_rtx (<MODE>mode);
 
   emit_insn(gen_xxswapd_v1ti (tmp, operands[2]));
-  emit_insn(gen_altivec_vsrq (operands[0], operands[1], tmp));
+  emit_insn(gen_altivec_vsrq_ (operands[0], operands[1], 

[Patch 3/5] rs6000, Add TI to TD (128-bit DFP) and TD to TI support

2020-08-11 Thread Carl Love via Gcc-patches
Segher, Will:

Patch 3 adds support for conversions between 128-bit integers and 128-bit
decimal floating point formats.

  Carl Love



Add TI to TD (128-bit DFP) and TD to TI support

gcc/ChangeLog

2020-08-10  Carl Love  
* config/rs6000/dfp.md (floattitd2, fixtdti2): New define_insns.

gcc/testsuite/ChangeLog

2020-08-10  Carl Love  
* gcc.target/powerpc/int_128bit-runnable.c:  Add tests.
---
 gcc/config/rs6000/dfp.md  | 15 +
 .../gcc.target/powerpc/int_128bit-runnable.c  | 64 +++
 2 files changed, 79 insertions(+)

diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
index 8f822732bac..ac9fe189f3e 100644
--- a/gcc/config/rs6000/dfp.md
+++ b/gcc/config/rs6000/dfp.md
@@ -222,6 +222,13 @@
   "dcffixq %0,%1"
   [(set_attr "type" "dfp")])
 
+(define_insn "floattitd2"
+  [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
+   (float:TD (match_operand:TI 1 "gpc_reg_operand" "v")))]
+  "TARGET_TI_VECTOR_OPS"
+  "dcffixqq %0,%1"
+  [(set_attr "type" "dfp")])
+
 ;; Convert a decimal64/128 to a decimal64/128 whose value is an integer.
 ;; This is the first stage of converting it to an integer type.
 
@@ -241,6 +248,14 @@
   "TARGET_DFP"
   "dctfix %0,%1"
   [(set_attr "type" "dfp")])
+
+  ;; carll
+(define_insn "fixtdti2"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=v")
+   (fix:TI (match_operand:TD 1 "gpc_reg_operand" "d")))]
+  "TARGET_TI_VECTOR_OPS"
+  "dctfixqq %0,%1"
+  [(set_attr "type" "dfp")])
 
 ;; Decimal builtin support
 
diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
index c84494fc28d..d1e69cea021 100644
--- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
@@ -38,6 +38,7 @@
 #if DEBUG
 #include 
 #include 
+#include 
 
 
 void print_i128(__int128_t val)
@@ -59,6 +60,13 @@ int main ()
   __int128_t arg1, result;
   __uint128_t uarg2;
 
+  _Decimal128 arg1_dfp128, result_dfp128, expected_result_dfp128;
+
+  struct conv_t {
+__uint128_t u128;
+_Decimal128 d128;
+  } conv, conv2;
+
   vector signed long long int vec_arg1_di, vec_arg2_di;
   vector unsigned long long int vec_uarg1_di, vec_uarg2_di, vec_uarg3_di;
   vector unsigned long long int vec_uresult_di;
@@ -2249,6 +2257,62 @@ int main ()
 abort();
 #endif
   }
+  
+  /* DFP to __int128 and __int128 to DFP conversions */
+  /* Can't get printing of DFP values to work.  Print the DFP value as an
+ unsigned int so we can see the bit patterns.  */
+#if 1
+  conv.u128 = 0x2208ULL;
+  conv.u128 = (conv.u128 << 64) | 0x4ULL;   //DFP bit pattern for integer 4
+  expected_result_dfp128 = conv.d128;
+
+  arg1 = 4;
+
+  conv.d128 = (_Decimal128) arg1;
+
+  result_dfp128 = (_Decimal128) arg1;
+  if (((conv.u128 >>64) != 0x2208ULL) &&
+  ((conv.u128 & 0x) != 0x4ULL)) {
+#if DEBUG
+printf("ERROR:  convert int128 value ");
+print_i128 (arg1);
+conv.d128 = result_dfp128;
+printf("\nto DFP value 0x%llx %llx (printed as hex bit string) ",
+  (unsigned long long)((conv.u128) >>64),
+  (unsigned long long)((conv.u128) & 0x));
+
+conv.d128 = expected_result_dfp128;
+printf("\ndoes not match expected_result = 0x%llx %llx\n\n",
+  (unsigned long long) (conv.u128>>64),
+  (unsigned long long) (conv.u128 & 0x));
+#else
+abort();
+#endif
+  }
+#endif
+
+  expected_result = 4;
 
+  conv.u128 = 0x2208ULL;
+  conv.u128 = (conv.u128 << 64) | 0x4ULL;  // 4 as DFP
+  arg1_dfp128 = conv.d128;
+
+  result = (__int128_t) arg1_dfp128;
+
+  if (result != expected_result) {
+#if DEBUG
+printf("ERROR:  convert DFP value ");
+printf("0x%llx %llx (printed as hex bit string) ",
+  (unsigned long long)(conv.u128>>64),
+  (unsigned long long)(conv.u128 & 0x));
+printf("to __int128 value = ");
+print_i128 (result);
+printf("\ndoes not match expected_result = ");
+print_i128 (expected_result);
+printf("\n");
+#else
+abort();
+#endif
+  }
   return 0;
 }
-- 
2.25.1




[Patch 0/5] rs6000, 128-bit Binary Integer Operations

2020-08-11 Thread Carl Love via Gcc-patches
Segher:

The following is a five patch series for the 128-bit Binary Integer
Operations (RFC 2608).

The last patch adds the conversions between 128-bit integers and 128-bit
floating point.  It has been reviewed by Michael Meissner to make sure
the 128-bit floating point mode handling is correct.

The patches have been tested on Power 8 and Power 9 to ensure there are
no regression errors.  The new tests have been manually compiled and
run on mambo to ensure they work correctly.

Please review the patches and let me know if they are acceptable for
mainline.  Thanks.

   Carl Love



Re: [PATCH] c++: Improve RANGE_EXPR optimization in cxx_eval_vec_init

2020-08-11 Thread Jason Merrill via Gcc-patches

On 8/10/20 9:21 AM, Patrick Palka wrote:

On Fri, 7 Aug 2020, Jason Merrill wrote:


On 8/6/20 1:50 PM, Patrick Palka wrote:

This patch eliminates an exponential dependence in cxx_eval_vec_init on
the array dimension of a VEC_INIT_EXPR when the RANGE_EXPR optimization
applies.  This is achieved by using a single constructor_elt (with index
RANGE_EXPR 0...max-1) per dimension instead of two constructor_elts
(with index 0 and RANGE_EXPR 1...max-1 respectively).  In doing so, we
can also get rid of the call to unshare_constructor since the element
initializer now gets used in exactly one spot.
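
To make the growth rate concrete, here is a toy count in C.  It is not GCC code:
it assumes the cost comes from every element of a dimension carrying its own
(unshared) copy of the inner initializer, and the function name is hypothetical.

```c
/* Toy model: number of constructor nodes built for an array with DIMS
   dimensions when each dimension holds ELTS_PER_DIM elements, each with
   its own copy of the initializer for the next dimension down.  */
static unsigned long long
model_ctor_nodes (unsigned dims, unsigned elts_per_dim)
{
  if (dims == 0)
    return 0;
  return 1 + elts_per_dim * model_ctor_nodes (dims - 1, elts_per_dim);
}
```

Under that assumption, two elements per dimension give 2^dims - 1 nodes, while a
single RANGE_EXPR element per dimension gives exactly dims nodes.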

The patch also removes the 'eltinit = new_ctx.ctor' assignment within the
RANGE_EXPR optimization since eltinit should already always be equal to
new_ctx.ctor here (modulo encountering an error when computing eltinit).
This was verified by running the testsuite against an appropriate assert.


Maybe keep that assert?


FWIW, the assert was

   gcc_assert (*non_constant_p || eltinit == new_ctx->ctor);

and apparently it survives the testsuite when added to either the
RANGE_EXPR or non-RANGE_EXPR code paths in cxx_eval_vec_init.

I then tried adding an analogous assert to cxx_eval_bare_aggregate, but
this assert triggers for lots of our testcases, in particular when (but
not only when) an elt initializer is already a reduced constant
CONSTRUCTOR (since then cxx_eval_constant_expression just returns this
already-reduced CONSTRUCTOR without updating ctx->ctor).

I'm not sure why the assert should necessarily hold in cxx_eval_vec_init
but not in cxx_eval_bare_aggregate.  I guess we never see a
VEC_INIT_EXPR whose elt initializer is a reduced constant CONSTRUCTOR or
similar?


That sounds like a plausible reason.




Finally, this patch reverses the sense of the ctx->quiet test that
controls whether to short-circuit evaluation upon seeing an error.  This
should speed up speculative evaluation of non-constant VEC_INIT_EXPRs
(since ctx->quiet is true then).  I'm not sure why we were testing
!ctx->quiet originally; it's inconsistent with how we short-circuit in
other spots.


Good question.  That code seems to go back to the initial implementation of
constexpr.

I contrived the testcase array60.C below, which verifies
that we now short-circuit quickly.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK to
commit?

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_vec_init_1): Move the i == 0 test to the
if statement that guards the RANGE_EXPR optimization.  Invert
the ctx->quiet test. Apply the RANGE_EXPR optimization before we
append the first element initializer.  Truncate ctx->ctor when
performing the RANGE_EXPR optimization.  Make the built
RANGE_EXPR start at index 0 instead of 1.  Don't call
unshare_constructor.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-array28.C: New test.
* g++.dg/init/array60.C: New test.
---
   gcc/cp/constexpr.c| 34 ++-
   .../g++.dg/cpp0x/constexpr-array28.C  | 14 
   gcc/testsuite/g++.dg/init/array60.C   | 13 +++
   3 files changed, 45 insertions(+), 16 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-array28.C
   create mode 100644 gcc/testsuite/g++.dg/init/array60.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index ab747a58fa0..e67ce5da355 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -4205,7 +4205,7 @@ cxx_eval_vec_init_1 (const constexpr_ctx *ctx, tree
atype, tree init,
  if (value_init || init == NULL_TREE)
{
  eltinit = NULL_TREE;
- reuse = i == 0;
+ reuse = true;
}
  else
eltinit = cp_build_array_ref (input_location, init, idx,
complain);
@@ -4222,7 +4222,7 @@ cxx_eval_vec_init_1 (const constexpr_ctx *ctx, tree
atype, tree init,
return ctx->ctor;
  eltinit = cxx_eval_constant_expression (&new_ctx, init, lval,
  non_constant_p, overflow_p);
- reuse = i == 0;
+ reuse = true;


The patch seems to replace checking i == 0 here with checking it in the 
condition below, which seems equivalent.  Why?



}
 else
{
@@ -4236,35 +4236,37 @@ cxx_eval_vec_init_1 (const constexpr_ctx *ctx, tree
atype, tree init,
  eltinit = cxx_eval_constant_expression (&new_ctx, eltinit, lval,
  non_constant_p, overflow_p);
}
-  if (*non_constant_p && !ctx->quiet)
+  if (*non_constant_p && ctx->quiet)
break;
-  if (new_ctx.ctor != ctx->ctor)
-   {
- /* We appended this element above; update the value.  */
- gcc_assert ((*p)->last().index == idx);
- (*p)->last().value = eltinit;
-   }
-  else
-   CONSTRUCTOR_APPEND_ELT (*p, idx, eltinit);
+
 /* Reuse the result of cxx_eval_constant_expression 

[PR96519] Re: [PATCH][testsuite] Add gcc.dg/ia64-sync-5.c

2020-08-11 Thread Kwok Cheung Yeung

Hello

On 06/08/2020 1:23 pm, Tom de Vries wrote:
> +static char AC[4];
> +static char init_qi[4] = { -30,-30,-50,-50 };
> +static char test_qi[4] = { -115,-115,25,25 };
> +
> +static void
> +do_qi (void)
> +{
> +  if (__sync_val_compare_and_swap(AC+0, -30, -115) != -30)
> +abort ();

If 'char' is unsigned by default, then init_qi will contain { 226, 226, 206, 
206 } and test_qi { 141, 141, 25, 25 }. The comparison against -30 then fails, 
because the previous value of AC[0] (226) stays positive when it is implicitly 
promoted to signed int. This can be fixed by making the array element types 
explicitly signed.
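
The promotion issue can be reproduced in isolation.  A minimal sketch, with
hypothetical helper names, that forces each signedness explicitly instead of
relying on the target's default for plain char:

```c
/* Store -30 in an unsigned char and compare it against -30, as the test
   effectively does on targets where plain char is unsigned.  The stored
   value is 226, which promotes to int 226, so the comparison is false.  */
static int
matches_as_unsigned (void)
{
  unsigned char stored = (unsigned char) -30;
  return stored == -30;
}

/* The same comparison with an explicitly signed element type.  The
   stored value is -30, which promotes to int -30, so it is true.  */
static int
matches_as_signed (void)
{
  signed char stored = -30;
  return stored == -30;
}
```

A compiler may warn that the first comparison is always false, which is exactly
the failure mode described above.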


This is tracked as PR 96519. I have checked that the test now passes on 
PowerPC and AArch64. Is the fix okay for trunk?


Thanks

Kwok
commit fc6ac3af45a238da0bd65e020ae6f0f165b57b87
Author: Kwok Cheung Yeung 
Date:   Tue Aug 11 09:41:10 2020 -0700

Fix gcc.dg/ia64-sync-5.c for architectures with unsigned char as default 
(PR 96519)

If char is unsigned, then comparisons of the char array elements against
negative integers in the test will fail, as values in the array will always
be non-negative and will remain so when promoted to signed int.

2020-08-11  Kwok Cheung Yeung  

PR testsuite/96519

gcc/testsuite/
* gcc.dg/ia64-sync-5.c (AC, init_qi, test_qi): Change element type to
signed char.

diff --git a/gcc/testsuite/gcc.dg/ia64-sync-5.c 
b/gcc/testsuite/gcc.dg/ia64-sync-5.c
index 8b16b29..a3923b0 100644
--- a/gcc/testsuite/gcc.dg/ia64-sync-5.c
+++ b/gcc/testsuite/gcc.dg/ia64-sync-5.c
@@ -14,9 +14,9 @@ extern void abort (void);
 extern void *memcpy (void *, const void *, size_t);
 extern int memcmp (const void *, const void *, size_t);
 
-static char AC[4];
-static char init_qi[4] = { -30,-30,-50,-50 };
-static char test_qi[4] = { -115,-115,25,25 };
+static signed char AC[4];
+static signed char init_qi[4] = { -30,-30,-50,-50 };
+static signed char test_qi[4] = { -115,-115,25,25 };
 
 static void
 do_qi (void)


Re: [RS6000] PR96493, powerpc local call linkage failure

2020-08-11 Thread Segher Boessenkool
On Tue, Aug 11, 2020 at 12:36:28PM -0500, Peter Bergner wrote:
> On 8/11/20 11:35 AM, Segher Boessenkool wrote:
> > Hi Alan,
> > 
> > On Tue, Aug 11, 2020 at 06:38:53PM +0930, Alan Modra wrote:
> >> This fixes a fail when power10 isn't supported by binutils, and
> >> ensures the test isn't run without power10 hardware or simulation on
> >> the off chance that power10 insns are emitted in the future for this
> >> testcase.
> > 
> > The testcases said it wanted power8, so why did it fail?  GCC shouldn't
> > use anything that requires p10 support in binutils then, or what do I
> > miss here?
> 
> It failed with an assembler error because one of the functions in the
> test uses an attribute target power10 and GCC emits a ".machine power10"
> assembler directive in case we do generate a power10 instruction(s).
> The old binutils Bill used doesn't know about power10, so boom.
> That is what requires the dg-require-effective-target power10_ok.

Ah, okay.

> Now given the power10 function is so small (just a call to a p8
> function), the chance we'll generate a p10 instruction is low (zero?),
> so we could just keep the dg-do run as is (ie, always run), but
> might that change one day?

On a non-p10 it will just use the generated non-p10 code, and that will
just work, now and for forever (yeah right :-) )

Either always running or what this patch does will work.  But please add
comments saying what the test case wants to test, and for the tricky bits.


Segher


[wwwdocs] Update C++ DR table

2020-08-11 Thread Marek Polacek via Gcc-patches
Three new DRs.  Pushed.

commit 0f4de13033aff42c43e1bbde9a3f7a5a31f33559
Author: Marek Polacek 
Date:   Tue Aug 11 14:17:15 2020 -0400

Update C++ DR table.

diff --git a/htdocs/projects/cxx-dr-status.html b/htdocs/projects/cxx-dr-status.html
index 5ee13d6d..7c0b9f3e 100644
--- a/htdocs/projects/cxx-dr-status.html
+++ b/htdocs/projects/cxx-dr-status.html
@@ -17164,11 +17164,32 @@
   ?
   
 
+
+  https://wg21.link/cwg2448;>2448
+  open
+  Cv-qualification of arithmetic types and deprecation of volatile
+  -
+  
+
+
+  https://wg21.link/cwg2449;>2449
+  open
+  Thunks as an implementation technique for pointers to virtual functions
+  -
+  
+
+
+  https://wg21.link/cwg2450;>2450
+  open
+  braced-init-list as a template-argument
+  -
+  
+
   
 
   This page is currently maintained by mailto:pola...@redhat.com;>pola...@redhat.com.
   Last update:
-Mon 10 Aug 2020 01:59:22 PM EDT
+Tue 11 Aug 2020 02:12:01 PM EDT
   
 
 



Re: [PATCH 1/2] PowerPC: Rename min/max/cmove functions.

2020-08-11 Thread Segher Boessenkool
Hi!

On Tue, Aug 11, 2020 at 12:22:06PM -0400, Michael Meissner wrote:
>   (rs6000_emit_p9_fp_minmax,generate_fp_min_max): Rename.

Space after comma.  "Rename." is never useful (and you should show what
is renamed to what).  The patch does *not* do the same or similar to the
two names inside the parens here, so you cannot do this in one entry.

> -/* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
> -   for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the last
> -   comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if the
> -   hardware has no such operation.  */
> +/* Possibly emit an appropriate minimum or maximum instruction for floating
> +   point scalars.
> +
> +   Move TRUE_COND to DEST if OP of the operands of the last comparison is
> +   nonzero/true, FALSE_COND if it is zero/false.
> +
> +   Return 0 if we can't generate the appropriate minimum or maximum, and 1 if
> +   we can did the minimum or maximum.  */

It should say these are "C" variants, not the proper min/max ones.
"Appropriate" is very misleading here.
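
To make the distinction concrete, here is a minimal sketch (plain C, not
GCC code; the names c_min and c_max are made up for the example).  The "C"
variants are assumed to behave like the C conditional expression, which
differs from fmin/fmax when a NaN is involved: fmin returns the non-NaN
operand, while the conditional expression returns its second operand
whenever the comparison is false.

```c
#include <math.h>

/* "C"-style minimum: exactly the C conditional expression.  If either
   operand is a NaN the comparison is false, so B is returned.  */
static double
c_min (double a, double b)
{
  return a < b ? a : b;
}

/* "C"-style maximum, with the same NaN behaviour as c_min.  */
static double
c_max (double a, double b)
{
  return a > b ? a : b;
}
```

With these definitions c_min (NAN, 1.0) is 1.0 but c_min (1.0, NAN) is a
NaN, so the result depends on operand order in a way a "proper" minimum
does not.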

> -/* ISA 3.0 (power9) conditional move subcase to emit XSCMP{EQ,GE,GT,NE}DP and
> -   XXSEL instructions for SF/DF scalars.  Move TRUE_COND to DEST if OP of the
> -   operands of the last comparison is nonzero/true, FALSE_COND if it is
> -   zero/false.  Return 0 if the hardware has no such operation.  */
> +/* Possibly emit a floating point conditional move by generating a compare that
> +   sets a mask instruction and a XXSEL select instruction.
> +
> +   Move TRUE_COND to DEST if OP of the operands of the last comparison is
> +   nonzero/true, FALSE_COND if it is zero/false.
> +
> +   Return 0 if the operation cannot be generated, and 1 if we could generate
> +   the instruction.  */

... and 1 if we *did* generate it.

Change this to a bool, and rename to "maybe_emit" etc.?

"generate_" is a horrible name...  We already have "gen_" things, with
different semantics, and this emits the insn, doesn't just generate it.

Thanks,


Segher


Re: [RS6000] PR96493, powerpc local call linkage failure

2020-08-11 Thread Peter Bergner via Gcc-patches
On 8/11/20 11:35 AM, Segher Boessenkool wrote:
> Hi Alan,
> 
> On Tue, Aug 11, 2020 at 06:38:53PM +0930, Alan Modra wrote:
>> This fixes a fail when power10 isn't supported by binutils, and
>> ensures the test isn't run without power10 hardware or simulation on
>> the off chance that power10 insns are emitted in the future for this
>> testcase.
> 
> The testcases said it wanted power8, so why did it fail?  GCC shouldn't
> use anything that requires p10 support in binutils then, or what do I
> miss here?

It failed with an assembler error because one of the functions in the
test uses an attribute target power10 and GCC emits a ".machine power10"
assembler directive in case we do generate a power10 instruction(s).
The old binutils Bill used doesn't know about power10, so boom.
That is what requires the dg-require-effective-target power10_ok.

Now given the power10 function is so small (just a call to a p8
function), the chance we'll generate a p10 instruction is low (zero?),
so we could just keep the dg-do run as is (ie, always run), but
might that change one day?

Peter




Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-11 Thread Qing Zhao via Gcc-patches
Hi, Alexandre,

CC’ing Richard for his comments on this.


> On Aug 10, 2020, at 9:39 PM, Alexandre Oliva  wrote:
>> I think that moving the part that decides how to zero the registers to each
>> target will be a better solution, since each target has a better idea of how
>> to use the most efficient insns to do the work.
> 
> It's certainly good to allow machine-specific optimized code sequences,
> but it would certainly be desirable to have a machine-independent
> fallback.  It doesn't seem exceedingly hard to loop over the registers
> and emit a (set (reg:M N) (const_int 0)) for each one that is to be
> zeroed out.

The current implementation already includes such machine-independent code, so it 
should be very easy to add this.
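
As a rough illustration of the fallback being discussed, here is a toy
model in plain C.  It is not GCC code: NUM_REGS, the SI mode and the
function name emit_zero_sets are assumptions made for the sketch, and it
only prints RTL-like text instead of emitting insns.

```c
#include <stdio.h>

#define NUM_REGS 8   /* Hypothetical register count for the toy model.  */

/* Print one RTL-like set per register selected in MASK and return how
   many were printed.  Bit N of MASK stands for register N; bits at or
   above NUM_REGS are ignored.  */
static int
emit_zero_sets (unsigned mask, FILE *out)
{
  int count = 0;
  for (int regno = 0; regno < NUM_REGS; regno++)
    if (mask & (1u << regno))
      {
        fprintf (out, "(set (reg:SI %d) (const_int 0))\n", regno);
        count++;
      }
  return count;
}
```

Calling emit_zero_sets (0x5, stdout) prints sets for registers 0 and 2.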

Richard, what’s your opinion on this?
Do we need a machine-independent implementation for zeroing the registers as 
the default when the target does not provide an optimized
implementation?

Thanks.

Qing

> 
> 



Re: [PATCH] arm: Clear canary value after stack_protect_test [PR96191]

2020-08-11 Thread Richard Sandiford
Christophe Lyon  writes:
> On Mon, 10 Aug 2020 at 17:27, Richard Sandiford
>  wrote:
>>
>> Christophe Lyon  writes:
>> > On Wed, 5 Aug 2020 at 16:33, Richard Sandiford
>> >  wrote:
>> >>
>> >> The stack_protect_test patterns were leaving the canary value in the
>> >> temporary register, meaning that it was often still in registers on
>> >> return from the function.  An attacker might therefore have been
>> >> able to use it to defeat stack-smash protection for a later function.
>> >>
>> >> Tested on arm-linux-gnueabi, arm-linux-gnueabihf and armeb-eabi.
>> >> I tested the thumb1.md part using arm-linux-gnueabi with the
>> >> test flags -march=armv5t -mthumb.  OK for trunk and branches?
>> >>
>> >> As I mentioned in the corresponding aarch64 patch, this is needed
>> >> to make arm conform to GCC's current -fstack-protector implementation.
>> >> However, I think we should reconsider whether the zeroing is actually
>> >> necessary and what it's actually protecting against.  I'll send a
>> >> separate message about that to gcc@.  But since the port isn't even
>> >> self-consistent (the *set patterns do clear the registers), I think
>> >> we should do this first rather than wait for any outcome of that
>> >> discussion.
>> >>
>> >> Richard
>> >>
>> >>
>> >> gcc/
>> >> PR target/96191
>> >> * config/arm/arm.md (arm_stack_protect_test_insn): Zero out
>> >> operand 2 after use.
>> >> * config/arm/thumb1.md (thumb1_stack_protect_test_insn): Likewise.
>> >>
>> >> gcc/testsuite/
>> >> * gcc.target/arm/stack-protector-1.c: New test.
>> >> * gcc.target/arm/stack-protector-2.c: Likewise.
>> >
>> > Hi Richard,
>> >
>> > The new tests fail when compiled with -mcpu=cortex-mXX because gas complains:
>> > use of r13 is deprecated
>> > It has a comment saying: "In the Thumb-2 ISA, use of R13 as Rm is
>> > deprecated, but valid."
>> >
>> > It's a minor nuisance; I'm not sure what the best way of getting rid of it is.
>> > Add #ifndef __thumb2__ around CHECK(r13) ?
>>
>> Hmm, maybe we should just drop that line altogether.  It wasn't exactly
>> likely that r13 would be the register to leak the value :-)
>>
>> Should I post a patch or do you already have one ready?
>
> I was about to push the patch that removes the line CHECK(r13).
>
> However, I've noticed that when using -mcpu=cortex-m[01], we have an
> error from gas:
> Error: Thumb does not support this addressing mode -- `str r0,[sp,#-8]!'

Seems like writing a correct arm.exp test is almost as difficult
(for me) as writing a correct vect.exp test :-)

> This patch replaces the str instruction with
>  sub   sp, sp, #8
>  str r0, [sp]
> and removes the check for r13, which is unlikely to leak the canary
> value.
>
> 2020-08-11  Christophe Lyon  
>
>   gcc/testsuite/
>   * gcc.target/arm/stack-protector-1.c: Adapt code to Cortex-M
>   restrictions.

OK, thanks.  I'm afraid this is already on GCC 10 and 9, so OK there too.
I'll fold this in when backporting to GCC 8.

Richard


Re: [PATCH] testsuite: Fix gcc.target/arm/multilib.exp use of gcc_opts

2020-08-11 Thread Richard Sandiford
Christophe Lyon via Gcc-patches  writes:
> This patch fixes an incorrect parameter passing for $gcc_opts, which
> produces a DejaGnu error: (DejaGnu) proc "gcc_opts" does not exist.

Huh, wonder how that went unnoticed for so long…

> 2020-08-11  Christophe Lyon  
>
> gcc/testsuite/
> * gcc.target/arm/multilib.exp: Fix parameter passing for gcc_opts.

OK everywhere that needs it, thanks.

Richard

> diff --git a/gcc/testsuite/gcc.target/arm/multilib.exp b/gcc/testsuite/gcc.target/arm/multilib.exp
> index f67a92a..c5f3c02 100644
> --- a/gcc/testsuite/gcc.target/arm/multilib.exp
> +++ b/gcc/testsuite/gcc.target/arm/multilib.exp
> @@ -40,7 +40,7 @@ proc multilib_config {profile} {
>  proc check_multi_dir { gcc_opts multi_dir } {
>  global tool
>  
> -set options [list "additional_flags=[concat "--print-multi-directory" [gcc_opts]]"]
> +set options [list "additional_flags=[concat "--print-multi-directory" $gcc_opts]"]
>  set gcc_output [${tool}_target_compile "" "" "none" $options]
>  if { [string match "$multi_dir\n" $gcc_output] } {
>   pass "multilibdir $gcc_opts $multi_dir"


Re: [RS6000] PR96493, powerpc local call linkage failure

2020-08-11 Thread Segher Boessenkool
Hi Alan,

On Tue, Aug 11, 2020 at 06:38:53PM +0930, Alan Modra wrote:
> This fixes a fail when power10 isn't supported by binutils, and
> ensures the test isn't run without power10 hardware or simulation on
> the off chance that power10 insns are emitted in the future for this
> testcase.

The testcases said it wanted power8, so why did it fail?  GCC shouldn't
use anything that requires p10 support in binutils then, or what do I
miss here?


Segher


> --- a/gcc/testsuite/gcc.target/powerpc/pr96493.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96493.c
> @@ -1,6 +1,8 @@
> -/* { dg-do run } */
> +/* { dg-do run { target { power10_hw } } } */
> +/* { dg-do link { target { ! power10_hw } } } */
>  /* { dg-options "-mdejagnu-cpu=power8 -O2" } */
>  /* { dg-require-effective-target powerpc_elfv2 } */
> +/* { dg-require-effective-target power10_ok } */
>  
>  /* Test local calls between pcrel and non-pcrel code.
>  


[committed][testsuite] Add missing require-effective-target directives in gcc.dg

2020-08-11 Thread Tom de Vries
Hi,

Add some missing require-effective-target directives (alloca, indirect_jumps,
label_values and nonlocal_goto).

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[testsuite] Add missing require-effective-target directives in gcc.dg

gcc/testsuite/ChangeLog:

* gcc.dg/Warray-bounds-46.c: Add missing require-effective-target
directive.
* gcc.dg/Warray-bounds-48.c: Same.
* gcc.dg/Warray-bounds-50.c: Same.
* gcc.dg/Wreturn-local-addr-2.c: Same.
* gcc.dg/Wreturn-local-addr-3.c: Same.
* gcc.dg/Wreturn-local-addr-4.c: Same.
* gcc.dg/Wreturn-local-addr-6.c: Same.
* gcc.dg/Wstack-usage.c: Same.
* gcc.dg/Wstringop-overflow-15.c: Same.
* gcc.dg/Wstringop-overflow-23.c: Same.
* gcc.dg/Wstringop-overflow-25.c: Same.
* gcc.dg/Wstringop-overflow-27.c: Same.
* gcc.dg/Wstringop-overflow-39.c: Same.
* gcc.dg/analyzer/alloca-leak.c: Same.
* gcc.dg/analyzer/data-model-1.c: Same.
* gcc.dg/analyzer/data-model-16.c: Same.
* gcc.dg/analyzer/malloc-1.c: Same.
* gcc.dg/analyzer/malloc-paths-8.c: Same.
* gcc.dg/analyzer/pr93546.c: Same.
* gcc.dg/analyzer/setjmp-1.c: Same.
* gcc.dg/analyzer/setjmp-2.c: Same.
* gcc.dg/analyzer/setjmp-3.c: Same.
* gcc.dg/analyzer/setjmp-4.c: Same.
* gcc.dg/analyzer/setjmp-5.c: Same.
* gcc.dg/analyzer/setjmp-6.c: Same.
* gcc.dg/analyzer/setjmp-7.c: Same.
* gcc.dg/analyzer/setjmp-7a.c: Same.
* gcc.dg/analyzer/setjmp-8.c: Same.
* gcc.dg/analyzer/setjmp-9.c: Same.
* gcc.dg/analyzer/setjmp-pr93378.c: Same.
* gcc.dg/gimplefe-44.c: Same.
* gcc.dg/pr84131.c: Same.
* gcc.dg/pr93986.c: Same.
* gcc.dg/pr95133.c: Same.
* gcc.dg/pr95857.c: Same.
* gcc.dg/strlenopt-83.c: Same.
* gcc.dg/strlenopt-84.c: Same.
* gcc.dg/strlenopt-91.c: Same.
* gcc.dg/uninit-32.c: Same.
* gcc.dg/uninit-36.c: Same.

---
 gcc/testsuite/gcc.dg/Warray-bounds-46.c| 3 ++-
 gcc/testsuite/gcc.dg/Warray-bounds-48.c| 3 ++-
 gcc/testsuite/gcc.dg/Warray-bounds-50.c| 3 ++-
 gcc/testsuite/gcc.dg/Wreturn-local-addr-2.c| 3 ++-
 gcc/testsuite/gcc.dg/Wreturn-local-addr-3.c| 3 ++-
 gcc/testsuite/gcc.dg/Wreturn-local-addr-4.c| 3 ++-
 gcc/testsuite/gcc.dg/Wreturn-local-addr-6.c| 3 ++-
 gcc/testsuite/gcc.dg/Wstack-usage.c| 3 ++-
 gcc/testsuite/gcc.dg/Wstringop-overflow-15.c   | 3 ++-
 gcc/testsuite/gcc.dg/Wstringop-overflow-23.c   | 3 ++-
 gcc/testsuite/gcc.dg/Wstringop-overflow-25.c   | 3 ++-
 gcc/testsuite/gcc.dg/Wstringop-overflow-27.c   | 3 ++-
 gcc/testsuite/gcc.dg/Wstringop-overflow-39.c   | 3 ++-
 gcc/testsuite/gcc.dg/analyzer/alloca-leak.c| 2 ++
 gcc/testsuite/gcc.dg/analyzer/data-model-1.c   | 2 ++
 gcc/testsuite/gcc.dg/analyzer/data-model-16.c  | 2 ++
 gcc/testsuite/gcc.dg/analyzer/malloc-1.c   | 2 ++
 gcc/testsuite/gcc.dg/analyzer/malloc-paths-8.c | 1 +
 gcc/testsuite/gcc.dg/analyzer/pr93546.c| 1 +
 gcc/testsuite/gcc.dg/analyzer/setjmp-1.c   | 1 +
 gcc/testsuite/gcc.dg/analyzer/setjmp-2.c   | 1 +
 gcc/testsuite/gcc.dg/analyzer/setjmp-3.c   | 1 +
 gcc/testsuite/gcc.dg/analyzer/setjmp-4.c   | 1 +
 gcc/testsuite/gcc.dg/analyzer/setjmp-5.c   | 1 +
 gcc/testsuite/gcc.dg/analyzer/setjmp-6.c   | 2 ++
 gcc/testsuite/gcc.dg/analyzer/setjmp-7.c   | 2 ++
 gcc/testsuite/gcc.dg/analyzer/setjmp-7a.c  | 1 +
 gcc/testsuite/gcc.dg/analyzer/setjmp-8.c   | 1 +
 gcc/testsuite/gcc.dg/analyzer/setjmp-9.c   | 1 +
 gcc/testsuite/gcc.dg/analyzer/setjmp-pr93378.c | 1 +
 gcc/testsuite/gcc.dg/gimplefe-44.c | 1 +
 gcc/testsuite/gcc.dg/pr84131.c | 3 ++-
 gcc/testsuite/gcc.dg/pr93986.c | 3 ++-
 gcc/testsuite/gcc.dg/pr95133.c | 1 +
 gcc/testsuite/gcc.dg/pr95857.c | 1 +
 gcc/testsuite/gcc.dg/strlenopt-83.c| 3 ++-
 gcc/testsuite/gcc.dg/strlenopt-84.c| 3 ++-
 gcc/testsuite/gcc.dg/strlenopt-91.c| 3 ++-
 gcc/testsuite/gcc.dg/uninit-32.c   | 3 ++-
 gcc/testsuite/gcc.dg/uninit-36.c   | 3 ++-
 40 files changed, 66 insertions(+), 20 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-46.c b/gcc/testsuite/gcc.dg/Warray-bounds-46.c
index 3f1c6c715ea..4effe5c2051 100644
--- a/gcc/testsuite/gcc.dg/Warray-bounds-46.c
+++ b/gcc/testsuite/gcc.dg/Warray-bounds-46.c
@@ -3,7 +3,8 @@
Test to verify that past-the-end accesses by string functions to member
arrays by-reference objects are diagnosed.
{ dg-do compile }
-   { dg-options "-O2 -Wall -Wno-unused-local-typedefs -Wno-stringop-overflow -ftrack-macro-expansion=0" }  */
+   { dg-options "-O2 -Wall -Wno-unused-local-typedefs -Wno-stringop-overflow -ftrack-macro-expansion=0" }
+   { dg-require-effective-target alloca } */
 
 #define SA(expr) typedef int 

[PATCH 2/2] PowerPC: Add power10 IEEE 128-bit min/max/cmove.

2020-08-11 Thread Michael Meissner via Gcc-patches
PowerPC: Add power10 IEEE 128-bit min/max/cmove.

This patch adds support for the new ISA 3.1 (power10) xscmp{eq,gt,ge}qp,
xsmincqp, and xsmaxcqp instructions for IEEE 128-bit conditional move, minimum,
and maximum.

gcc/
2020-08-07  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_emit_cmove): Add support for IEEE
128-bit min, max, and comparisons on ISA 3.1.
(rs6000_emit_minmax): Add support for IEEE 128-bit min/max on ISA
3.1.
* config/rs6000/rs6000.md (s3, IEEE128 iterator):
New insns for IEEE 128-bit min/max.
(movcc, IEEE128 iterator): New insns for IEEE 128-bit
conditional move.
(movcc_hardware, IEEE128 iterator): New insns for IEEE
128-bit conditional move.
(movcc_invert_hardware, IEEE128 iterator): New insns for
IEEE 128-bit conditional move.
(fpmask, IEEE128 iterator): New insns for IEEE 128-bit
conditional move.

gcc/testsuite/
2020-08-07  Michael Meissner  

* gcc.target/powerpc/float128-minmax-2.c: New test.
---
 gcc/config/rs6000/rs6000.c |  18 ++-
 gcc/config/rs6000/rs6000.md| 121 +
 .../gcc.target/powerpc/float128-minmax-2.c |  70 
 3 files changed, 208 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 1c042b0..6874262 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15177,6 +15177,21 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
return 1;
 }
 
+  /* See if we can use the ISA 3.1 min/max/compare instructions for IEEE
+ 128-bit floating point.  At present, don't worry about doing conditional
+ moves with different types for the comparison and movement (unlike SF/DF,
+ where you can do a conditional test between double and use float as the
+ if/then parts. */
+  if (TARGET_FLOAT128_HW && TARGET_POWER10 && FLOAT128_IEEE_P (compare_mode)
+  && compare_mode == result_mode)
+{
+  if (generate_fp_min_max (dest, op, true_cond, false_cond))
+   return 1;
+
+  if (generate_fp_cmove (dest, op, true_cond, false_cond))
+   return 1;
+}
+
   /* Don't allow using floating point comparisons for integer results for
  now.  */
   if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode))
@@ -15400,7 +15415,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
   && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
- || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
+ || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+ || (TARGET_FLOAT128_HW && TARGET_POWER10 && FLOAT128_IEEE_P (mode))))
 {
   emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
   return;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 43b620a..7c2f3e7 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14671,6 +14671,127 @@ (define_insn "*cmp_hw"
"xscmpuqp %0,%1,%2"
   [(set_attr "type" "veccmp")
(set_attr "size" "128")])
+
+;; IEEE 128-bit min/max
+(define_insn "s3"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+   (fp_minmax:IEEE128
+(match_operand:IEEE128 1 "altivec_register_operand" "v")
+(match_operand:IEEE128 2 "altivec_register_operand" "v")))]
+  "TARGET_FLOAT128_HW && TARGET_POWER10 && FLOAT128_IEEE_P (mode)"
+  "xscqp %0,%1,%2"
+  [(set_attr "type" "fp")
+   (set_attr "size" "128")])
+
+;; IEEE 128-bit conditional move.  At present, don't worry about doing
+;; conditional moves with different types for the comparison and movement
+;; (unlike SF/DF, where you can do a conditional test between double and use
+;; float as the if/then parts.
+(define_expand "movcc"
+   [(set (match_operand:IEEE128 0 "gpc_reg_operand")
+(if_then_else:IEEE128 (match_operand 1 "comparison_operator")
+  (match_operand:IEEE128 2 "gpc_reg_operand")
+  (match_operand:IEEE128 3 "gpc_reg_operand")))]
+  "TARGET_FLOAT128_HW && TARGET_POWER10 && FLOAT128_IEEE_P (mode)"
+{
+  if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
+DONE;
+  else
+FAIL;
+})
+
+(define_insn_and_split "*movcc_hardware"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=,v")
+   (if_then_else:IEEE128
+(match_operator:CCFP 1 "fpmask_comparison_operator"
+   [(match_operand:IEEE128 2 "altivec_register_operand" "v,v")
+(match_operand:IEEE128 3 "altivec_register_operand" "v,v")])
+(match_operand:IEEE128 4 "altivec_register_operand" "v,v")
+(match_operand:IEEE128 5 "altivec_register_operand" "v,v")))
+   (clobber 

[PATCH 1/2] PowerPC: Rename min/max/cmove functions.

2020-08-11 Thread Michael Meissner via Gcc-patches
PowerPC: Rename min/max/cmove functions.

This patch renames some of the functions in rs6000.c that are used to generate
floating point scalar minimum, maximum, and conditional move sequences to use a
more generic name than _p9.

gcc/
2020-08-07  Michael Meissner  

* config/rs6000/rs6000.c
(rs6000_emit_p9_fp_minmax,generate_fp_min_max): Rename.
(rs6000_emit_p9_fp_cmove,generate_fp_cmove): Rename.
(rs6000_emit_cmove): Update to use renamed functions.
---
 gcc/config/rs6000/rs6000.c | 32 
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d26a18f..1c042b0 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15032,13 +15032,17 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   return 1;
 }
 
-/* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
-   for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the last
-   comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if the
-   hardware has no such operation.  */
+/* Possibly emit an appropriate minimum or maximum instruction for floating
+   point scalars.
+
+   Move TRUE_COND to DEST if OP of the operands of the last comparison is
+   nonzero/true, FALSE_COND if it is zero/false.
+
+   Return 0 if we can't generate the appropriate minimum or maximum, and 1 if
+   we can did the minimum or maximum.  */
 
 static int
-rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+generate_fp_min_max (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
@@ -15074,13 +15078,17 @@ rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   return 1;
 }
 
-/* ISA 3.0 (power9) conditional move subcase to emit XSCMP{EQ,GE,GT,NE}DP and
-   XXSEL instructions for SF/DF scalars.  Move TRUE_COND to DEST if OP of the
-   operands of the last comparison is nonzero/true, FALSE_COND if it is
-   zero/false.  Return 0 if the hardware has no such operation.  */
+/* Possibly emit a floating point conditional move by generating a compare that
+   sets a mask instruction and a XXSEL select instruction.
+
+   Move TRUE_COND to DEST if OP of the operands of the last comparison is
+   nonzero/true, FALSE_COND if it is zero/false.
+
+   Return 0 if the operation cannot be generated, and 1 if we could generate
+   the instruction.  */
 
 static int
-rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+generate_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
@@ -15162,10 +15170,10 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   && (compare_mode == SFmode || compare_mode == DFmode)
   && (result_mode == SFmode || result_mode == DFmode))
 {
-  if (rs6000_emit_p9_fp_minmax (dest, op, true_cond, false_cond))
+  if (generate_fp_min_max (dest, op, true_cond, false_cond))
return 1;
 
-  if (rs6000_emit_p9_fp_cmove (dest, op, true_cond, false_cond))
+  if (generate_fp_cmove (dest, op, true_cond, false_cond))
return 1;
 }
 
-- 
1.8.3.1


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH, 0 of 2] Add support for power10 IEEE 128-bit min, max, and compare

2020-08-11 Thread Michael Meissner via Gcc-patches
These two patches are a reworking of similar patches that I submitted in July.
The changes in these patches are to give better names to the functions that
generate the IEEE 128-bit minimum, maximum, and compare-to-produce-a-mask
sequences, and to rework the comments.  In addition, I have changed the target
from 'future' to 'power10'.

The first patch renames the support functions in rs6000.c so that they no
longer have _p9 in their names.  These functions were added for ISA 3.0
(power9) to support the new C-format minimum and maximum instructions, along
with the compare instruction that sets a bit mask.  Power10 has similar
instructions for IEEE 128-bit floating point.

The second patch adds the specific support for IEEE 128-bit floating point.  It
also adds a new test to make sure the IEEE 128-bit support works.

I have built and bootstrapped compilers with/without the patches, and there
were no regressions in the test suite.  Can I check these patches into the
master branch?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH] correct handling of indices into arrays with elements larger than 1 (PR c++/96511)

2020-08-11 Thread Martin Sebor via Gcc-patches

-Wplacement-new handles array indices and pointer offsets the same:
by adjusting them by the size of the element.  That's correct for
the latter but wrong for the former, causing false positives when
the element size is greater than one.
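
As an illustration of the arithmetic involved (a hedged sketch in plain C,
not the code in the patch; the function names are invented here): an
element index has to be scaled by the element size to get a byte offset,
whereas an offset that is already in bytes must be used as is.  Treating
the two alike gives the wrong remaining size whenever the element size is
greater than one.

```c
#include <stddef.h>

/* Bytes left in an object of OBJSIZE bytes when it is accessed at element
   INDEX of an array whose elements are ELTSIZE bytes each.  The index
   counts elements, so it has to be scaled.  Overflow of the product is
   not handled in this sketch.  */
static size_t
bytes_left_at_index (size_t objsize, size_t eltsize, size_t index)
{
  size_t off = index * eltsize;
  return off < objsize ? objsize - off : 0;
}

/* Bytes left when the access is at byte offset OFF.  The offset is
   already in bytes, so it must not be scaled again.  */
static size_t
bytes_left_at_offset (size_t objsize, size_t off)
{
  return off < objsize ? objsize - off : 0;
}
```

For a 16-byte array of 4-byte elements, index 1 and byte offset 4 both
leave 12 bytes.  Scaling the byte offset 4 a second time would leave 0 and
make a valid 4-byte placement look like an overflow.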

In addition, the warning doesn't even attempt to handle arrays of
arrays.  I'm not sure if I forgot or if I simply didn't think of
it.

The attached patch corrects these oversights by replacing most
of the -Wplacement-new code with a call to compute_objsize which
handles all this correctly (plus more), and is also better tested.
But even compute_objsize has bugs: it trips up while converting
wide_int to offset_int for some pointer offset ranges.  Since
handling the C++ IL required changes in this area the patch also
fixes that.
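
The kind of inverted range the get_range change rejects can be sketched as
follows.  This is a simplified model that uses 64-bit integers in place of
wide_int and offset_int, and the name signed_range is made up for the
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reinterpret the unsigned bounds [ULO, UHI] as signed offsets and store
   them in R.  Return true only if the result is a valid range, i.e. the
   lower bound does not exceed the upper bound.  */
static bool
signed_range (uint64_t ulo, uint64_t uhi, int64_t r[2])
{
  /* Converting a value above INT64_MAX is implementation-defined in C;
     GCC documents it as wrapping modulo 2^64, which is assumed here.  */
  r[0] = (int64_t) ulo;
  r[1] = (int64_t) uhi;
  return r[0] <= r[1];
}
```

Under that assumption the unsigned range [1, UINT64_MAX] becomes [1, -1],
which is inverted and is rejected, while [UINT64_MAX - 3, UINT64_MAX]
becomes the valid range [-4, -1].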

For review purposes, the patch affects just the middle end.
The C++ diff pretty much just removes code from the front end.

Tested on x86_64-linux plus by building the latest Glibc and
confirming no new warnings.

Martin
Correct handling of indices into arrays with elements larger than 1 (PR c++/96511)

Resolves:
PR c++/96511 - Incorrect -Wplacement-new on POINTER_PLUS into an array with 4-byte elements
PR middle-end/96561 - missing warning for buffer overflow with negative offset
PR middle-end/96384 - bogus -Wstringop-overflow= storing into multidimensional array with index in range

gcc/ChangeLog:

	PR c++/96511
	PR middle-end/96384
	* builtins.c (get_range): Return full range of type when neither
	value nor its range is available.  Fail for ranges inverted due
	to the signedness of offsets.
	(compute_objsize): Handle more special array members.  Handle
	POINTER_PLUS_EXPR and VIEW_CONVERT_EXPR that come up in front end
	code.
	(access_ref::offset_bounded): Define new member function.
	* builtins.h (access_ref::eval): New data member.
	(access_ref::offset_bounded): New member function.
	(access_ref::offset_zero): New member function.
	(compute_objsize): Declare a new overload.
	* gimple-array-bounds.cc (array_bounds_checker::check_array_ref): Use
	enum special_array_member.
	* tree-object-size.c (decl_init_size): Return the size of the structure
	type if the decl size is null.
	* tree.c (component_ref_size): Use special_array_member.
	* tree.h (special_array_member): Define a new type.
	(component_ref_size): Change signature.

gcc/cp/ChangeLog:

	PR c++/96511
	PR middle-end/96384
	* init.c (warn_placement_new_too_small): Call builtin_objsize instead
	of duplicating what it does.

gcc/testsuite/ChangeLog:

	PR c++/96511
	PR middle-end/96384
	* g++.dg/warn/Wplacement-new-size-1.C: Relax warnings.
	* g++.dg/warn/Wplacement-new-size-2.C: Same.
	* g++.dg/warn/Wplacement-new-size-6.C: Same.
	* g++.dg/warn/Wplacement-new-size-7.C: New test.
	* gcc.dg/Wstringop-overflow-40.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index beb56e06d8a..4f34a99c2f9 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3977,13 +3977,32 @@ static bool
 get_range (tree x, signop sgn, offset_int r[2],
 	   const vr_values *rvals /* = NULL */)
 {
+  tree type = TREE_TYPE (x);
+  if (TREE_CODE (x) != INTEGER_CST
+  && TREE_CODE (x) != SSA_NAME)
+{
+  if (TYPE_UNSIGNED (type))
+	{
+	  if (sgn == SIGNED)
+	type = signed_type_for (type);
+	}
+  else if (sgn == UNSIGNED)
+	type = unsigned_type_for (type);
+
+  r[0] = wi::to_offset (TYPE_MIN_VALUE (type));
+  r[1] = wi::to_offset (TYPE_MAX_VALUE (type));
+  return x;
+}
+
   wide_int wr[2];
   if (!get_range (x, wr, rvals))
 return false;
 
   r[0] = offset_int::from (wr[0], sgn);
   r[1] = offset_int::from (wr[1], sgn);
-  return true;
+  /* Succeed only for valid ranges (pointer offsets are represented
+ as unsigned despite taking on "negative" values).  */
+  return r[0] <= r[1];
 }
 
 /* Helper to compute the size of the object referenced by the PTR
@@ -4001,9 +4020,11 @@ get_range (tree x, signop sgn, offset_int r[2],
to influence code generation or optimization.  */
 
 static bool
-compute_objsize (tree ptr, int ostype, access_ref *pref,
-		 bitmap *visited, const vr_values *rvals /* = NULL */)
+compute_objsize (tree ptr, int ostype, access_ref *pref, bitmap *visited,
+		 const vr_values *rvals)
 {
+  STRIP_NOPS (ptr);
+
   const bool addr = TREE_CODE (ptr) == ADDR_EXPR;
   if (addr)
 ptr = TREE_OPERAND (ptr, 0);
@@ -4015,12 +4036,15 @@ compute_objsize (tree ptr, int ostype, access_ref *pref,
   if (!addr && POINTER_TYPE_P (TREE_TYPE (ptr)))
 	return false;
 
-  tree size = decl_init_size (ptr, false);
-  if (!size || TREE_CODE (size) != INTEGER_CST)
-	return false;
-
   pref->ref = ptr;
-  pref->sizrng[0] = pref->sizrng[1] = wi::to_offset (size);
+  if (tree size = decl_init_size (ptr, false))
+	if (TREE_CODE (size) == INTEGER_CST)
+	  {
+	pref->sizrng[0] = pref->sizrng[1] = wi::to_offset (size);
+	return true;
+	  }
+  pref->sizrng[0] = 0;
+  pref->sizrng[1] = wi::to_offset (TYPE_MAX_VALUE (ptrdiff_type_node));
   

Re: [committed] libstdc++: Make Networking TS work without gthreads [PR 89760]

2020-08-11 Thread Jonathan Wakely via Gcc-patches

On 11/08/20 16:38 +0100, Jonathan Wakely wrote:

Make the experimental Networking TS code work without std::mutex and
std::condition_variable.

libstdc++-v3/ChangeLog:

PR libstdc++/89760
* include/experimental/executor [!_GLIBCXX_HAS_GTHREADS]:
(execution_context::mutex_type): Define dummy mutex type.
(system_context): Use execution_context::mutex_type.
(system_context) [!_GLIBCXX_HAS_GTHREADS]: Define dummy
thread and condition variable types.
[!_GLIBCXX_HAS_GTHREADS] (system_context::_M_run()): Do not
define.
(system_context::_M_post) [!_GLIBCXX_HAS_GTHREADS]: Throw
an exception when threads aren't available.
(strand::running_in_this_thread()): Defer to _M_state.
(strand::_State::running_in_this_thread()): New function.
(use_future_t): Do not depend on _GLIBCXX_USE_C99_STDINT_TR1.
* include/experimental/io_context (io_context): Use the
execution_context::mutex_type alias. Replace stack of thread
IDs with counter.
* testsuite/experimental/net/execution_context/use_service.cc:
Enable test for non-pthread targets.


For the branches I'm just making the tests depend on gthreads.

Tested x86_64-linux, with both --enable-threads and --disable-threads.
Committed to trunk.


commit afd61b43808cebe0882cdf13dcdd766cae4ce4e7
Author: Jonathan Wakely 
Date:   Tue Aug 11 16:55:01 2020

libstdc++: Disable net tests that depend on threads [PR 89760]

libstdc++-v3/ChangeLog:

PR libstdc++/89760
* testsuite/experimental/net/execution_context/make_service.cc:
Add dg-require-gthreads.
* testsuite/experimental/net/executor/1.cc: Likewise.
* testsuite/experimental/net/headers.cc: Likewise.
* testsuite/experimental/net/internet/address/v4/comparisons.cc:
Likewise.
* testsuite/experimental/net/internet/address/v4/cons.cc:
Likewise.
* testsuite/experimental/net/internet/address/v4/creation.cc:
Likewise.
* testsuite/experimental/net/internet/address/v4/members.cc:
Likewise.
* testsuite/experimental/net/internet/resolver/base.cc:
Likewise.
* testsuite/experimental/net/internet/resolver/ops/lookup.cc:
Likewise.
* testsuite/experimental/net/internet/resolver/ops/reverse.cc:
Likewise.
* testsuite/experimental/net/socket/basic_socket.cc: Likewise.
* testsuite/experimental/net/timer/waitable/cons.cc: Likewise.
* testsuite/experimental/net/timer/waitable/dest.cc: Likewise.
* testsuite/experimental/net/timer/waitable/ops.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/experimental/net/execution_context/make_service.cc b/libstdc++-v3/testsuite/experimental/net/execution_context/make_service.cc
index 0898d12927a..fe8d385b0f7 100644
--- a/libstdc++-v3/testsuite/experimental/net/execution_context/make_service.cc
+++ b/libstdc++-v3/testsuite/experimental/net/execution_context/make_service.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-do compile { target c++14 } }
+// { dg-require-gthreads "" }
 
 #include 
 
diff --git a/libstdc++-v3/testsuite/experimental/net/executor/1.cc b/libstdc++-v3/testsuite/experimental/net/executor/1.cc
index cd0af4b7737..88e263297ee 100644
--- a/libstdc++-v3/testsuite/experimental/net/executor/1.cc
+++ b/libstdc++-v3/testsuite/experimental/net/executor/1.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-do run { target c++14 } }
+// { dg-require-gthreads "" }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/experimental/net/headers.cc b/libstdc++-v3/testsuite/experimental/net/headers.cc
index 957135bbf23..a896f9509ee 100644
--- a/libstdc++-v3/testsuite/experimental/net/headers.cc
+++ b/libstdc++-v3/testsuite/experimental/net/headers.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-do compile }
+// { dg-require-gthreads "" }
 
 #include 
 
diff --git a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/comparisons.cc b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/comparisons.cc
index 098fc5e18e2..51fc2917d0a 100644
--- a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/comparisons.cc
+++ b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/comparisons.cc
@@ -17,6 +17,7 @@
 
 // { dg-do run { target c++14 } }
 // { dg-add-options net_ts }
+// { dg-require-gthreads "" }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc
index 93c42c59b0f..0f47d9a863b 100644
--- a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc
+++ b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc
@@ -17,6 +17,7 @@
 
 // { dg-do run { target c++14 

[committed] libstdc++: Fix failing tests for AIX

2020-08-11 Thread Jonathan Wakely via Gcc-patches
These two tests fail on AIX because  defines struct thread
in the global namespace (despite it not being a reserved name). That
means the using-declaration that adds it to the global namespace causes
a redeclaration error.

libstdc++-v3/ChangeLog:

* testsuite/30_threads/thread/cons/84535.cc: Use a custom
namespace.
* testsuite/30_threads/thread/cons/lwg2097.cc: Likewise.

Tested powerpc64le-linux and powerpc-aix. Committed to trunk.

commit fe8d7fec4db838cae536eeef1965db83959cf6ee
Author: Jonathan Wakely 
Date:   Tue Aug 11 16:16:22 2020

libstdc++: Fix failing tests for AIX

These two tests fail on AIX because  defines struct thread
in the global namespace (despite it not being a reserved name). That
means the using-declaration that adds it to the global namespace causes
a redeclaration error.

libstdc++-v3/ChangeLog:

* testsuite/30_threads/thread/cons/84535.cc: Use a custom
namespace.
* testsuite/30_threads/thread/cons/lwg2097.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/30_threads/thread/cons/84535.cc 
b/libstdc++-v3/testsuite/30_threads/thread/cons/84535.cc
index 7846d3f7b68..711687b4f5c 100644
--- a/libstdc++-v3/testsuite/30_threads/thread/cons/84535.cc
+++ b/libstdc++-v3/testsuite/30_threads/thread/cons/84535.cc
@@ -20,6 +20,8 @@
 
 #include 
 
+namespace __gnu_test
+{
 using std::is_constructible;
 using std::thread;
 
@@ -28,3 +30,4 @@ static_assert(!is_constructible::value, 
"");
 static_assert(!is_constructible::value, "");
 static_assert(!is_constructible::value, "");
 static_assert(!is_constructible::value, "");
+}
diff --git a/libstdc++-v3/testsuite/30_threads/thread/cons/lwg2097.cc 
b/libstdc++-v3/testsuite/30_threads/thread/cons/lwg2097.cc
index e0d588e51f9..1ad2a76cb58 100644
--- a/libstdc++-v3/testsuite/30_threads/thread/cons/lwg2097.cc
+++ b/libstdc++-v3/testsuite/30_threads/thread/cons/lwg2097.cc
@@ -20,9 +20,12 @@
 
 #include 
 
+namespace __gnu_test
+{
 using std::thread;
 using std::is_constructible;
 
 static_assert( !is_constructible::value, "" );
 static_assert( !is_constructible::value, "" );
 static_assert( !is_constructible::value, "" );
+}


Re: [PATCH] libstdc++: Implement integer-class types as defined in [iterator.concept.winc]

2020-08-11 Thread Patrick Palka via Gcc-patches
On Thu, 7 May 2020, Patrick Palka wrote:

> On Mon, 2 Mar 2020, Patrick Palka wrote:
> 
> > On Mon, 24 Feb 2020, Patrick Palka wrote:
> > 
> > > On Mon, 24 Feb 2020, Patrick Palka wrote:
> > > 
> > > > This implements signed and unsigned integer-class types, whose width is 
> > > > one bit
> > > > larger than the widest native signed and unsigned integral type 
> > > > respectively.
> > > > In our case this is either __int128 and unsigned __int128, or long long 
> > > > and
> > > > unsigned long long.
> > > > 
> > > > Internally, the two integer-class types are represented as a largest 
> > > > native
> > > > unsigned integral type plus one extra bit.  The signed integer-class 
> > > > type is
> > > > represented in two's complement form with the extra bit acting as the 
> > > > sign bit.
> > > > 
> > > > libstdc++-v3/ChangeLog:
> > > > 
> > > > * include/bits/iterator_concepts.h 
> > > > (ranges::__detail::__max_diff_type):
> > > > Remove definition, replace with forward declaration of class
> > > > __max_diff_type.
> > > > (ranges::__detail::__max_size_type): Remove definition, replace 
> > > > with
> > > > forward declaration of class __max_size_type.
> > > > (__detail::__is_integer_like): Accept __int128 and unsigned 
> > > > __int128.
> > > > (__detail::__is_signed_integer_like): Accept __int128.
> > > > * include/bits/range_access.h (__detail::__max_size_type): New 
> > > > class.
> > > > (__detail::__max_diff_type): New class.
> > > > (__detail::__max_size_type::__max_size_type): Define this 
> > > > constructor
> > > > out-of-line to break the cycle.
> > > > (__detail::__to_unsigned_like): New function.
> > > > (numeric_limits<__detail::__max_size_type>): New explicit 
> > > > specialization.
> > > > (numeric_limits<__detail::__max_diff_type>): New explicit 
> > > > specialization.
> > > > * testsuite/std/ranges/iota/difference_type.cc: New test.
> > > 
> > > Here's v2 of the patch that splits out __max_size_type and
> > > __max_diff_type into a dedicated header, along with other misc
> > > improvements and fixes.
> > > 
> > > -- >8 --
> > 
> > Here's v3 of the patch.  Changes from v2:
> > 
> > * The arithmetic tests in difference_type.cc have been split out to a
> > separate file.
> > 
> > * The arithmetic tests now run successfully in strict ANSI mode.  The
> > issue was that __int128 does not model the integral concept in strict
> > ANSI mode, which we use to make operations on this type behave as
> > integer operations do.  But for that we need to always treat __int128 as
> > an integer type in this API.  So a new concept __integralish which is
> > always modelled by __int128 is introduced and used in the API instead.
> > 
> > * Comments have been added explaining why __int128 is always used as the
> > underlying type even when the widest integer type in strict ANSI mode is
> > long long.
> > 
> > * New tests, some minor code clean-ups, and added comments to the
> > unsigned division and multiplication routines.
> > 
> > Tested on x86_64-pc-linux-gnu in both strict and GNU compilation modes,
> > with and without -U__SIZEOF_INT128__.
> 
> Ping (now that stage 1 is open).  Here's the latest rebased version
> of the patch:

Here's the patch rebased against today's trunk.  Compared to the
previous version, this version resolves some trivial merge conflicts in
include/bits/{iterator_concepts.h,range_access.h} and it replaces the
use of the removed trait __detail::__int_limits::digits
with __gnu_cxx::__int_traits::__digits.

Tested with and without -U__SIZEOF_INT128__, and in both strict and GNU
c++20 modes.

-- >8 --

Subject: [PATCH] libstdc++: integer-class types as per [iterator.concept.winc]

This implements signed and unsigned integer-class types, whose width is
one bit larger than the widest supported signed and unsigned integral
type respectively.  In our case this is either __int128 and unsigned
__int128, or long long and unsigned long long.

Internally, the two integer-class types are represented as a largest
supported unsigned integral type plus one extra bit.  The signed
integer-class type is represented in two's complement form with the
extra bit acting as the sign bit.
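
As a rough illustration of the representation described above (not the
patch's actual code), an unsigned integer-class value can be modelled as a
native unsigned word plus one extra high-order bit.  The sketch below assumes
a 64-bit word for portability, and the name wide_unsigned is hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: an unsigned value one bit wider than uint64_t,
// i.e. arithmetic is performed modulo 2^65.
struct wide_unsigned
{
  std::uint64_t low = 0;  // the 64 low-order bits
  bool high = false;      // the one extra (most significant) bit

  friend wide_unsigned operator+(wide_unsigned a, wide_unsigned b)
  {
    wide_unsigned r;
    r.low = a.low + b.low;             // wraps modulo 2^64
    const bool carry = r.low < a.low;  // carry out of the low word
    // The extra bit is the sum of both extra bits and the carry, modulo 2.
    r.high = (a.high != b.high) != carry;
    return r;
  }

  friend bool operator==(wide_unsigned a, wide_unsigned b)
  { return a.low == b.low && a.high == b.high; }
};
```

Adding 1 to the largest 64-bit value sets only the extra bit instead of
wrapping to zero.  A signed counterpart would interpret that extra bit as the
two's complement sign bit.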

libstdc++-v3/ChangeLog:

* include/Makefile.am (bits_headers): Add new header
.
* include/Makefile.in: Regenerate.
* include/bits/iterator_concepts.h
(ranges::__detail::__max_diff_type): Remove definition, replace
with forward declaration of class __max_diff_type.
(__detail::__max_size_type): Remove definition, replace with
forward declaration of class __max_size_type.
(__detail::__is_unsigned_int128, __is_signed_int128,
__is_int128): New concepts.
(__detail::__is_integer_like): Accept __int128 and unsigned
__int128.
(__detail::__is_signed_integer_like): Accept __int128.
* 

[committed] libstdc++: Make net::system_context tag type constructor explicit

2020-08-11 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/experimental/executor (system_context::__tag): Make
default constructor explicit.

Tested powerpc64le-linux. Committed to trunk.
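
For context, a small sketch of what an explicit defaulted default constructor
changes for an empty tag type: it can no longer be created implicitly from an
empty brace-init-list such as {}.  The names below are hypothetical and the
detection idiom is only one way to observe the difference:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tag types, named for illustration only.
struct plain_tag { };
struct explicit_tag { explicit explicit_tag() = default; };

// Declared but never defined; only used in unevaluated contexts.
template<typename T> void accept(const T&);

// Primary template: assume "{}" does not convert implicitly to T.
template<typename T, typename = void>
struct braces_convert : std::false_type { };

// Chosen only when accept<T>({}) is well-formed, i.e. when T can be
// copy-list-initialized from an empty brace-init-list.
template<typename T>
struct braces_convert<T, decltype(accept<T>({}))> : std::true_type { };
```

With the explicit constructor, a function taking the tag by value can only be
called by code that names the tag type, which is the point of a private tag.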

commit 2a6918e4fa57edbe0dc326d5f142350b1dd4afd7
Author: Jonathan Wakely 
Date:   Tue Aug 11 16:16:21 2020

libstdc++: Make net::system_context tag type constructor explicit

libstdc++-v3/ChangeLog:

* include/experimental/executor (system_context::__tag): Make
default constructor explicit.

diff --git a/libstdc++-v3/include/experimental/executor 
b/libstdc++-v3/include/experimental/executor
index b5dc3035756..1561050ae23 100644
--- a/libstdc++-v3/include/experimental/executor
+++ b/libstdc++-v3/include/experimental/executor
@@ -884,7 +884,7 @@ inline namespace v1
   private:
 friend system_executor;
 
-struct __tag { };
+struct __tag { explicit __tag() = default; };
 system_context(__tag) { }
 
 thread _M_thread;


[committed] libstdc++: Make Networking TS work without gthreads [PR 89760]

2020-08-11 Thread Jonathan Wakely via Gcc-patches
Make the experimental Networking TS code work without std::mutex and
std::condition_variable.
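
The general technique can be sketched as follows: pick the real mutex when
threads are available, otherwise a no-op type with the same interface, so the
locking code itself stays unchanged.  HAVE_THREADS_SKETCH and the counter
class are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <mutex>

// HAVE_THREADS_SKETCH is a hypothetical stand-in for a configuration macro
// such as _GLIBCXX_HAS_GTHREADS; it is deliberately left undefined here.
#if defined(HAVE_THREADS_SKETCH)
using mutex_type = std::mutex;
#else
// No-op replacement that still satisfies the BasicLockable requirements.
struct mutex_type
{
  void lock() const { }
  void unlock() const { }
};
#endif

class counter
{
  mutable mutex_type mtx_;
  int value_ = 0;

public:
  int increment()
  {
    std::lock_guard<mutex_type> lock(mtx_);  // same code for both variants
    return ++value_;
  }

  int get() const
  {
    std::lock_guard<mutex_type> lock(mtx_);
    return value_;
  }
};
```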

libstdc++-v3/ChangeLog:

PR libstdc++/89760
* include/experimental/executor [!_GLIBCXX_HAS_GTHREADS]:
(execution_context::mutex_type): Define dummy mutex type.
(system_context): Use execution_context::mutex_type.
(system_context) [!_GLIBCXX_HAS_GTHREADS]: Define dummy
thread and condition variable types.
[!_GLIBCXX_HAS_GTHREADS] (system_context::_M_run()): Do not
define.
(system_context::_M_post) [!_GLIBCXX_HAS_GTHREADS]: Throw
an exception when threads aren't available.
(strand::running_in_this_thread()): Defer to _M_state.
(strand::_State::running_in_this_thread()): New function.
(use_future_t): Do not depend on _GLIBCXX_USE_C99_STDINT_TR1.
* include/experimental/io_context (io_context): Use the
execution_context::mutex_type alias. Replace stack of thread
IDs with counter.
* testsuite/experimental/net/execution_context/use_service.cc:
Enable test for non-pthread targets.

Tested powerpc64le-linux and powerpc-aix, with both --enable-threads
and --disable-threads. Committed to trunk.

commit 18095be17013444d9e91aa8c73ebe5cf58ccb3f1
Author: Jonathan Wakely 
Date:   Tue Aug 11 16:16:22 2020

libstdc++: Make Networking TS work without gthreads [PR 89760]

Make the experimental Networking TS code work without std::mutex and
std::condition_variable.

libstdc++-v3/ChangeLog:

PR libstdc++/89760
* include/experimental/executor [!_GLIBCXX_HAS_GTHREADS]:
(execution_context::mutex_type): Define dummy mutex type.
(system_context): Use execution_context::mutex_type.
(system_context) [!_GLIBCXX_HAS_GTHREADS]: Define dummy
thread and condition variable types.
[!_GLIBCXX_HAS_GTHREADS] (system_context::_M_run()): Do not
define.
(system_context::_M_post) [!_GLIBCXX_HAS_GTHREADS]: Throw
an exception when threads aren't available.
(strand::running_in_this_thread()): Defer to _M_state.
(strand::_State::running_in_this_thread()): New function.
(use_future_t): Do not depend on _GLIBCXX_USE_C99_STDINT_TR1.
* include/experimental/io_context (io_context): Use the
execution_context::mutex_type alias. Replace stack of thread
IDs with counter.
* testsuite/experimental/net/execution_context/use_service.cc:
Enable test for non-pthread targets.

diff --git a/libstdc++-v3/include/experimental/executor 
b/libstdc++-v3/include/experimental/executor
index 1561050ae23..45e813f6747 100644
--- a/libstdc++-v3/include/experimental/executor
+++ b/libstdc++-v3/include/experimental/executor
@@ -506,7 +506,16 @@ inline namespace v1
   bool _M_active;
 };
 
-mutable std::mutex _M_mutex;
+#if defined(_GLIBCXX_HAS_GTHREADS)
+using mutex_type = std::mutex;
+#else
+struct mutex_type
+{
+  void lock() const { }
+  void unlock() const { }
+};
+#endif
+mutable mutex_type _M_mutex;
 
 // Sorted in order of beginning of service object lifetime.
 std::list<_ServicePtr> _M_services;
@@ -553,7 +562,7 @@ inline namespace v1
   static_assert(is_base_of<_Key, _Service>::value,
  "a service type must match or derive from its key_type");
   auto __key = execution_context::_S_key<_Key>();
-  std::lock_guard __lock(__ctx._M_mutex);
+  lock_guard __lock(__ctx._M_mutex);
   auto& __svc = __ctx._M_keys[__key];
   if (__svc == nullptr)
{
@@ -577,7 +586,7 @@ inline namespace v1
   static_assert(is_base_of<_Key, _Service>::value,
  "a service type must match or derive from its key_type");
   auto __key = execution_context::_S_key<_Key>();
-  std::lock_guard __lock(__ctx._M_mutex);
+  lock_guard __lock(__ctx._M_mutex);
   auto& __svc = __ctx._M_keys[__key];
   if (__svc != nullptr)
throw service_already_exists();
@@ -599,7 +608,7 @@ inline namespace v1
  "a service type must derive from execution_context::service");
   static_assert(is_base_of<_Key, _Service>::value,
  "a service type must match or derive from its key_type");
-  std::lock_guard __lock(__ctx._M_mutex);
+  lock_guard __lock(__ctx._M_mutex);
   return __ctx._M_keys.count(execution_context::_S_key<_Key>());
 }
 
@@ -865,20 +874,21 @@ inline namespace v1
 
 void stop()
 {
-  lock_guard __lock(_M_mtx);
+  lock_guard __lock(_M_mtx);
   _M_stopped = true;
   _M_cv.notify_all();
 }
 
 bool stopped() const noexcept
 {
-  lock_guard __lock(_M_mtx);
+  lock_guard __lock(_M_mtx);
   return _M_stopped;
 }
 
 void join()
 {
-  _M_thread.join();
+  if (_M_thread.joinable())
+   _M_thread.join();
 }
 
   private:

[committed] libstdc++: Fix net::system_context stop condition

2020-08-11 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/experimental/executor (system_context::_M_run()):
Fix predicate.
* testsuite/experimental/net/system_context/1.cc: New test.

Tested powerpc64le-linux and powerpc-aix, with both --enable-threads
and --disable-threads. Committed to trunk.
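
The predicate passed to condition_variable::wait has to return true when the
waiting thread should wake up.  A minimal sketch of the two predicates as
plain functions (hypothetical names, no threads involved) shows why the old
form could never let a stopped context with an empty queue leave the wait:

```cpp
#include <cassert>

// Old predicate: false whenever the context is stopped, so a worker
// waiting on it would not wake up to observe the stop request.
constexpr bool old_wake_up(bool stopped, bool has_tasks)
{ return !stopped && has_tasks; }

// Fixed predicate: wake up on a stop request or when work is available.
constexpr bool new_wake_up(bool stopped, bool has_tasks)
{ return stopped || has_tasks; }
```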

commit 61759518adc7679a6f46369543e30a761a16490a
Author: Jonathan Wakely 
Date:   Tue Aug 11 16:16:21 2020

libstdc++: Fix net::system_context stop condition

libstdc++-v3/ChangeLog:

* include/experimental/executor (system_context::_M_run()):
Fix predicate.
* testsuite/experimental/net/system_context/1.cc: New test.

diff --git a/libstdc++-v3/include/experimental/executor 
b/libstdc++-v3/include/experimental/executor
index 763f4ce0e17..b5dc3035756 100644
--- a/libstdc++-v3/include/experimental/executor
+++ b/libstdc++-v3/include/experimental/executor
@@ -902,7 +902,7 @@ inline namespace v1
  {
unique_lock __lock(_M_mtx);
_M_cv.wait(__lock,
-  [this]{ return !_M_stopped && !_M_tasks.empty(); });
+  [this]{ return _M_stopped || !_M_tasks.empty(); });
if (_M_stopped)
  return;
__f = std::move(_M_tasks.front());
diff --git a/libstdc++-v3/testsuite/experimental/net/system_context/1.cc 
b/libstdc++-v3/testsuite/experimental/net/system_context/1.cc
new file mode 100644
index 000..bc2405d31d3
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/net/system_context/1.cc
@@ -0,0 +1,42 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run }
+// { dg-options "-pthread"  }
+// { dg-require-effective-target c++14 }
+// { dg-require-effective-target pthread }
+// { dg-require-gthreads "" }
+
+#include 
+#include 
+
+namespace net = std::experimental::net;
+
+void
+test01()
+{
+  net::system_context& c = net::system_executor{}.context();
+  net::post( [&c] { c.stop(); } );
+  c.join();
+  VERIFY( c.stopped() );
+}
+
+int
+main()
+{
+  test01();
+}


[committed] libstdc++: Fix <stop_token> to compile without gthreads

2020-08-11 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/std/stop_token: Check _GLIBCXX_HAS_GTHREADS using
#ifdef instead of #if.
(stop_token::_S_yield()): Check _GLIBCXX_HAS_GTHREADS before
using __gthread_yield.

Tested powerpc64le-linux and powerpc-aix, with both --enable-threads
and --disable-threads. Committed to trunk.
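
As a reminder of the difference the ChangeLog refers to, #ifdef tests only
whether a macro is defined, while #if evaluates its value.  The sketch below
uses a hypothetical macro FEATURE_SKETCH defined to 0 to make the two checks
disagree:

```cpp
#include <cassert>

#define FEATURE_SKETCH 0  // hypothetical macro: defined, but with value 0

#ifdef FEATURE_SKETCH
constexpr bool seen_by_ifdef = true;   // taken: the macro is defined
#else
constexpr bool seen_by_ifdef = false;
#endif

#if FEATURE_SKETCH
constexpr bool seen_by_if = true;
#else
constexpr bool seen_by_if = false;     // taken: the value is 0
#endif
```

An undefined identifier in #if is treated as 0, so the two forms agree only
as long as the macro is either undefined or defined to a nonzero value.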


commit 35e5294c4b779f8fc24fdc86464f999867332995
Author: Jonathan Wakely 
Date:   Tue Aug 11 16:16:21 2020

libstdc++: Fix <stop_token> to compile without gthreads

libstdc++-v3/ChangeLog:

* include/std/stop_token: Check _GLIBCXX_HAS_GTHREADS using
#ifdef instead of #if.
(stop_token::_S_yield()): Check _GLIBCXX_HAS_GTHREADS before
using __gthread_yield.

diff --git a/libstdc++-v3/include/std/stop_token 
b/libstdc++-v3/include/std/stop_token
index 847d12f7454..ccec6fab15c 100644
--- a/libstdc++-v3/include/std/stop_token
+++ b/libstdc++-v3/include/std/stop_token
@@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
 #if defined __i386__ || defined __x86_64__
   __builtin_ia32_pause();
-#elif defined _GLIBCXX_USE_SCHED_YIELD
+#elif defined _GLIBCXX_HAS_GTHREADS && defined _GLIBCXX_USE_SCHED_YIELD
   __gthread_yield();
 #endif
 }
@@ -162,7 +162,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   std::atomic _M_owners{1};
   std::atomic _M_value{_S_ssrc_counter_inc};
   _Stop_cb* _M_head = nullptr;
-#if _GLIBCXX_HAS_GTHREADS
+#ifdef _GLIBCXX_HAS_GTHREADS
   __gthread_t _M_requester;
 #endif
 
@@ -237,7 +237,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  }
while (!_M_try_lock_and_stop(__old));
 
-#if _GLIBCXX_HAS_GTHREADS
+#ifdef _GLIBCXX_HAS_GTHREADS
_M_requester = __gthread_self();
 #endif
 
@@ -266,7 +266,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if (!__destroyed)
  {
__cb->_M_destroyed = nullptr;
-#if _GLIBCXX_HAS_GTHREADS
+#ifdef _GLIBCXX_HAS_GTHREADS
// synchronize with destructor of stop_callback that owns *__cb
__cb->_M_done.release();
 #endif
@@ -343,7 +343,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
// Callback is not in the list, so must have been removed by a call to
// _M_request_stop.
 
-#if _GLIBCXX_HAS_GTHREADS
+#ifdef _GLIBCXX_HAS_GTHREADS
// Despite appearances there is no data race on _M_requester. The only
// write to it happens before the callback is removed from the list,
// and removing it from the list happens before this read.


[committed] libstdc++: Make std::this_thread functions work without gthreads

2020-08-11 Thread Jonathan Wakely via Gcc-patches
The only function in namespace std::this_thread that actually depends on
thread support being present is this_thread::get_id(). The other
functions (yield, sleep_for and sleep_until) can be defined for targets
without gthreads.

A small change is needed in std::this_thread::sleep_for which currently
uses the __gthread_time_t typedef. Since it just calls nanosleep
directly, it should use timespec directly instead of the typedef.

Even std::this_thread::get_id() could be made to work, the only
difficulty is that it returns a value of type std::thread::id and
std::thread is only defined when gthreads support exists.
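
A minimal sketch of a sleep built directly on timespec and nanosleep, along
the lines described above.  It assumes a POSIX system providing nanosleep;
to_timespec and sleep_for_sketch are hypothetical names, not the library's
functions:

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <time.h>  // timespec, nanosleep (POSIX)

// Split a duration into whole seconds and the remaining nanoseconds.
inline timespec to_timespec(std::chrono::nanoseconds d)
{
  const auto secs = std::chrono::duration_cast<std::chrono::seconds>(d);
  const auto nsecs = d - secs;
  timespec ts;
  ts.tv_sec = static_cast<time_t>(secs.count());
  ts.tv_nsec = static_cast<long>(nsecs.count());
  return ts;
}

// Sleep for at least d, restarting with the remaining time whenever the
// call is interrupted by a signal (EINTR).
inline void sleep_for_sketch(std::chrono::nanoseconds d)
{
  if (d <= std::chrono::nanoseconds::zero())
    return;
  timespec ts = to_timespec(d);
  while (::nanosleep(&ts, &ts) == -1 && errno == EINTR)
    { }
}
```

Nothing here depends on a thread library, which is why such a function can be
offered on targets without gthreads.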

libstdc++-v3/ChangeLog:

* include/std/thread [!_GLIBCXX_HAS_GTHREADS] (this_thread::yield)
(this_thread::sleep_until): Define.
[!_GLIBCXX_HAS_GTHREADS] (this_thread::sleep_for): Define. Replace
use of __gthread_time_t typedef with timespec.
* src/c++11/thread.cc [!_GLIBCXX_HAS_GTHREADS] (__sleep_for):
Likewise.
* testsuite/30_threads/this_thread/2.cc: Moved to...
* testsuite/30_threads/this_thread/yield.cc: ...here.
* testsuite/30_threads/this_thread/3.cc: Moved to...
* testsuite/30_threads/this_thread/sleep_for-mt.cc: ...here.
* testsuite/30_threads/this_thread/4.cc: Moved to...
* testsuite/30_threads/this_thread/sleep_until-mt.cc: ...here.
* testsuite/30_threads/this_thread/58038.cc: Add
dg-require-sleep.
* testsuite/30_threads/this_thread/60421.cc: Likewise.
* testsuite/30_threads/this_thread/sleep_for.cc: New test.
* testsuite/30_threads/this_thread/sleep_until.cc: New test.

Tested powerpc64le-linux and powerpc-aix, with both --enable-threads
and --disable-threads. Committed to trunk.

commit 5bbb1f3000c57fd4d95969b30fa0e35be6d54ffb
Author: Jonathan Wakely 
Date:   Tue Aug 11 16:16:21 2020

libstdc++: Make std::this_thread functions work without gthreads

The only function in namespace std::this_thread that actually depends on
thread support being present is this_thread::get_id(). The other
functions (yield, sleep_for and sleep_until) can be defined for targets
without gthreads.

A small change is needed in std::this_thread::sleep_for which currently
uses the __gthread_time_t typedef. Since it just calls nanosleep
directly, it should use timespec directly instead of the typedef.

Even std::this_thread::get_id() could be made to work, the only
difficulty is that it returns a value of type std::thread::id and
std::thread is only defined when gthreads support exists.

libstdc++-v3/ChangeLog:

* include/std/thread [!_GLIBCXX_HAS_GTHREADS] (this_thread::yield)
(this_thread::sleep_until): Define.
[!_GLIBCXX_HAS_GTHREADS] (this_thread::sleep_for): Define. Replace
use of __gthread_time_t typedef with timespec.
* src/c++11/thread.cc [!_GLIBCXX_HAS_GTHREADS] (__sleep_for):
Likewise.
* testsuite/30_threads/this_thread/2.cc: Moved to...
* testsuite/30_threads/this_thread/yield.cc: ...here.
* testsuite/30_threads/this_thread/3.cc: Moved to...
* testsuite/30_threads/this_thread/sleep_for-mt.cc: ...here.
* testsuite/30_threads/this_thread/4.cc: Moved to...
* testsuite/30_threads/this_thread/sleep_until-mt.cc: ...here.
* testsuite/30_threads/this_thread/58038.cc: Add
dg-require-sleep.
* testsuite/30_threads/this_thread/60421.cc: Likewise.
* testsuite/30_threads/this_thread/sleep_for.cc: New test.
* testsuite/30_threads/this_thread/sleep_until.cc: New test.

diff --git a/libstdc++-v3/include/std/thread b/libstdc++-v3/include/std/thread
index 0445ab1e319..30ae93a0d5b 100644
--- a/libstdc++-v3/include/std/thread
+++ b/libstdc++-v3/include/std/thread
@@ -35,12 +35,16 @@
 # include 
 #else
 
-#include 
+#include  // std::chrono::*
+
+#ifdef _GLIBCXX_USE_NANOSLEEP
+# include   // errno, EINTR
+# include   // nanosleep
+#endif
 
 #if defined(_GLIBCXX_HAS_GTHREADS)
 #include 
 
-#include  // std::chrono::*
 #include  // std::unique_ptr
 #include   // std::tuple
 
@@ -49,14 +53,11 @@
 # include  // std::stop_source, std::stop_token, std::nostopstate
 #endif
 
-#ifdef _GLIBCXX_USE_NANOSLEEP
-# include   // errno, EINTR
-# include   // nanosleep
-#endif
-
 #include  // std::hash
 #include  // std::__invoke
 
+#endif // _GLIBCXX_HAS_GTHREADS
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -69,6 +70,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @{
*/
 
+#if defined(_GLIBCXX_HAS_GTHREADS)
   /// thread
   class thread
   {
@@ -352,6 +354,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   else
return __out << __id._M_thread;
 }
+#endif // _GLIBCXX_HAS_GTHREADS
 
   /** @namespace std::this_thread
*  @brief ISO C++ 2011 namespace for interacting with the current 

Re: [PATCH 1/3] vec: add exact argument for various grow functions.

2020-08-11 Thread Martin Sebor via Gcc-patches

On 8/11/20 5:36 AM, Martin Liška wrote:

> Hello.
>
> All right, I did it in 3 steps:
> 1) - new exact argument is added (no default value) - I tested that on
> x86_64-linux-gnu and I built all cross targets.
> 2) set default value of exact = false
> 3) change places which calculate their own growth to use the default argument


The usual intent of a default argument is to supply a value the function
is the most commonly invoked with.   But in this case it looks like it's
the opposite: most of the callers (hundreds) provide the non-default
value (true) and only a handful make use of the default.  I feel I must
be  missing something.  What is it?
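
To make the trade-off concrete, here is a hypothetical sketch (not GCC's vec
API) of a grow function with a defaulted exact flag.  The class name, the
doubling policy, and the meaning given to exact are all assumptions made for
illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical growable buffer; not the GCC vec implementation.
class growable_buffer
{
  std::unique_ptr<int[]> data_;
  std::size_t size_ = 0;
  std::size_t capacity_ = 0;

public:
  // exact == true:  allocate exactly len slots when more room is needed.
  // exact == false: at least double the capacity to amortize reallocation.
  void grow(std::size_t len, bool exact = false)
  {
    if (len > capacity_)
      {
        std::size_t cap = len;
        if (!exact)
          cap = std::max(len, 2 * capacity_);
        std::unique_ptr<int[]> fresh(new int[cap]());
        std::copy(data_.get(), data_.get() + size_, fresh.get());
        data_ = std::move(fresh);
        capacity_ = cap;
      }
    size_ = len;
  }

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return capacity_; }
};
```

If most call sites end up writing grow(n, true), the default is not
describing the common case, which is the concern raised above.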

Martin



> I would like to install first 1) and then wait some time before the rest
> is installed.
>
> Thoughts?
> Martin




[PATCH] diagnostics: Add new option -fdiagnostics-plain-output

2020-08-11 Thread Lewis Hyatt via Gcc-patches
Hello-

Attached is the patch I mentioned in another discussion here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551442.html

This adds a new option -fdiagnostics-plain-output that currently means the
same thing as:
-fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -fdiagnostics-urls=never

The idea is that over time, if diagnostics output changes to get more bells
and whistles by default (such as the UTF-8 output I suggested in the above
discussion), -fdiagnostics-plain-output will adjust to turn that back off,
so that the testsuite needs only the one option and doesn't need to get
updated every time something new is added. It seems to me that when
diagnostics change, it's otherwise a bit hard to update the testsuite
correctly, especially for compat.exp that is often not run by default. I
think this would also be useful for utilities that want to parse the
diagnostics (but aren't using -fdiagnostics-output-format=json).

BTW, I considered avoiding adding a new switch by having this option take
the form -fdiagnostics-output-format=plain instead, but this seems to have
problematic semantics when multiple related options are specified. Given that
this option needs to be expanded early in the parsing process, so that it
can be compatible with the special handling for -fdiagnostics-color, it
seemed best to just make it a simple option with no negated form.
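
The expansion idea can be sketched roughly as below.  This is only an
illustration of treating one option as an alias for several; the function
name is hypothetical, and expanding in place (so that later options can still
override individual settings) is an assumption, not a description of what
opts-common.c does:

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Sketch only: replace the alias with the options it stands for, keeping
// every other argument in its original position.
inline std::vector<std::string>
expand_plain_output(const std::vector<std::string>& args)
{
  static const char* const expansion[] = {
    "-fno-diagnostics-show-caret",
    "-fno-diagnostics-show-line-numbers",
    "-fdiagnostics-color=never",
    "-fdiagnostics-urls=never",
  };

  std::vector<std::string> out;
  for (const std::string& arg : args)
    {
      if (arg == "-fdiagnostics-plain-output")
        out.insert(out.end(), std::begin(expansion), std::end(expansion));
      else
        out.push_back(arg);
    }
  return out;
}
```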

I hope this may be useful, please let me know if you'd like me to push
it. bootstrap and regtest were done for all languages on x86-64 Linux, all
tests the same before and after, and same for the compat.exp with
alternate compiler GCC 8.2.

-Lewis
Subject: [PATCH] diagnostics: Add new option -fdiagnostics-plain-output

Adds the new option -fdiagnostics-plain-output, which is an alias for
several others:

-fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers
-fdiagnostics-color=never
-fdiagnostics-urls=never

The idea is that in the future, if the default behavior of diagnostics is
changed to add some fancy feature or other, then the
-fdiagnostics-plain-output option will also be changed accordingly so that
the old behavior is preserved in the presence of this option. This allows
us to use -fdiagnostics-plain-output in the testsuite, such that the
testsuite (specifically the setting of TEST_ALWAYS_FLAGS in prune.exp)
does not need to be touched whenever diagnostics get a new look. This also
removes the need to add workarounds to compat.exp for every new option
that may be needed in a newer version of the compiler, but is not
supported in older versions.

gcc/ChangeLog:

* common.opt: Add new option -fdiagnostics-plain-output.
* doc/invoke.texi: Document it.
* opts-common.c (decode_cmdline_options_to_array): Implement it.
(decode_cmdline_option): Add missing const qualifier to argv.

libstdc++-v3/ChangeLog:

* testsuite/lib/libstdc++.exp: Use the new option
-fdiagnostics-plain-output.

gcc/testsuite/ChangeLog:

* lib/prune.exp: Change TEST_ALWAYS_FLAGS to use 
-fdiagnostics-plain-output.
* lib/c-compat.exp: Adapt to the prune.exp change.

diff --git a/gcc/common.opt b/gcc/common.opt
index c16d1faff88..20fdcc45fe9 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1378,6 +1378,10 @@ fdiagnostics-path-format=
 Common Joined RejectNegative Var(flag_diagnostics_path_format) 
Enum(diagnostic_path_format) Init(DPF_INLINE_EVENTS)
 Specify how to print any control-flow path associated with a diagnostic.
 
+fdiagnostics-plain-output
+Driver Common RejectNegative
+Turn off any diagnostics features that complicate the output, such as line 
numbers, color, and warning URLs.
+
 ftabstop=
 Common Joined RejectNegative UInteger
 -ftabstop=  Distance between tab stops for column reporting.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index dea1e1866a4..70dc1ab73a1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -280,6 +280,7 @@ Objective-C and Objective-C++ Dialects}.
 @item Diagnostic Message Formatting Options
 @xref{Diagnostic Message Formatting Options,,Options to Control Diagnostic 
Messages Formatting}.
 @gccoptlist{-fmessage-length=@var{n}  @gol
+-fdiagnostics-plain-output @gol
 -fdiagnostics-show-location=@r{[}once@r{|}every-line@r{]}  @gol
 -fdiagnostics-color=@r{[}auto@r{|}never@r{|}always@r{]}  @gol
 -fdiagnostics-urls=@r{[}auto@r{|}never@r{|}always@r{]}  @gol
@@ -4291,6 +4292,19 @@ Note - this option also affects the display of the 
@samp{#error} and
 function/type/variable attribute.  It does not however affect the
 @samp{pragma GCC warning} and @samp{pragma GCC error} pragmas.
 
+@item -fdiagnostics-plain-output
+This option requests that diagnostic output look as plain as possible, which
+may be useful when running @command{dejagnu} or other utilities that need to
+parse diagnostics output and prefer that it remain more stable over time.
+@option{-fdiagnostics-plain-output} is 

Re: [PATCH] c-family: Fix ICE in get_atomic_generic_size [PR96545]

2020-08-11 Thread Marek Polacek via Gcc-patches
On Tue, Aug 11, 2020 at 10:50:28AM +0200, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> As the testcase shows, we would ICE if the type of the first argument of
> various atomic builtins was pointer to (non-void) incomplete type, we would
> assume that TYPE_SIZE_UNIT must be non-NULL.  This patch diagnoses it
> instead.  And also changes the TREE_CODE != INTEGER_CST check to
> !tree_fits_uhwi_p, as we use tree_to_uhwi after this and at least in theory
> the int could be too large and not fit.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.
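
For reference, a small sketch of a well-formed call to the generic
__atomic_exchange builtin that the patch touches, where all three pointers
refer to the same complete type.  The wrapper name exchange_int is
hypothetical, and the builtin is assumed to be available (GCC or a compatible
compiler):

```cpp
#include <cassert>

// Atomically store desired into *target and return the previous value.
// All three pointer arguments point to the same complete type (int), which
// is what the builtin needs in order to know the object size.
inline int exchange_int(int* target, int desired)
{
  int previous;
  __atomic_exchange(target, &desired, &previous, __ATOMIC_SEQ_CST);
  return previous;
}
```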

> 2020-08-10  Jakub Jelinek  
> 
>   PR c/96545
>   * c-common.c (get_atomic_generic_size): Require that first argument's
>   type points to a complete type and use tree_fits_uhwi_p instead of
>   just INTEGER_CST TREE_CODE check for the TYPE_SIZE_UNIT.
> 
>   * c-c++-common/pr96545.c: New test.
> 
> --- gcc/c-family/c-common.c.jj2020-07-31 23:07:00.566153515 +0200
> +++ gcc/c-family/c-common.c   2020-08-10 12:03:35.236841534 +0200
> @@ -7017,8 +7017,15 @@ get_atomic_generic_size (location_t loc,
>return 0;
>  }
>  
> +  if (!COMPLETE_TYPE_P (TREE_TYPE (type_0)))
> +{
> +  error_at (loc, "argument 1 of %qE must be a pointer to a complete type",
> + function);
> +  return 0;
> +}
> +
>/* Types must be compile time constant sizes. */
> -  if (TREE_CODE ((TYPE_SIZE_UNIT (TREE_TYPE (type_0)))) != INTEGER_CST)
> +  if (!tree_fits_uhwi_p ((TYPE_SIZE_UNIT (TREE_TYPE (type_0)))))
>  {
>error_at (loc, 
>   "argument 1 of %qE must be a pointer to a constant size type",
> --- gcc/testsuite/c-c++-common/pr96545.c.jj   2020-08-10 12:28:43.296222401 
> +0200
> +++ gcc/testsuite/c-c++-common/pr96545.c  2020-08-10 12:28:28.258428487 
> +0200
> @@ -0,0 +1,31 @@
> +/* PR c/96545 */
> +/* { dg-do compile } */
> +
> +extern char x[], y[], z[];
> +struct S;
> +extern struct S s, t, u;
> +int v, w;
> +
> +void
> +foo (void)
> +{
> +  __atomic_exchange (, , , 0); /* { dg-error "must be a pointer to a 
> complete type" } */
> +}
> +
> +void
> +bar (void)
> +{
> +  __atomic_exchange (, , , 0); /* { dg-error "must be a pointer to a 
> complete type" } */
> +}
> +
> +void
> +baz (void)
> +{
> +  __atomic_exchange (, , , 0); /* { dg-error "size mismatch in 
> argument 2 of" } */
> +}
> +
> +void
> +qux (void)
> +{
> +  __atomic_exchange (, , , 0); /* { dg-error "size mismatch in 
> argument 3 of" } */
> +}
> 
>   Jakub
> 

Marek
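
For readers unfamiliar with the generic atomic builtins, here is a minimal sketch of a call the new check accepts: every pointer argument refers to a complete type of the same size. The helper name swap_int is an assumption made for illustration; it is not part of the patch or the testcase.

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch: atomically store NEW_VAL
   into *SLOT and return the value that was stored there before.  */
int
swap_int (int *slot, int new_val)
{
  int old_val;

  assert (slot != 0);
  /* All three pointers refer to the complete type int, so the size of
     the pointed-to object is a known compile-time constant.  */
  __atomic_exchange (slot, &new_val, &old_val, __ATOMIC_SEQ_CST);
  return old_val;
}
```

Passing a pointer to an incomplete type, such as a struct that is declared but never defined, as the first argument is the case the patch now rejects with an error rather than an ICE.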



[PATCH] rs6000: Update powerpc test cases to use -mdejagnu-cpu=.

2020-08-11 Thread Peter Bergner via Gcc-patches
I was looking through some POWER10 test cases and noticed that we used
-mcpu=power10 rather than the preferred -mdejagnu-cpu=power10.  I went
looking for more tests that were not converted over and came up with the
following patch.  Ok for trunk?

Peter


gcc/testsuite/
* g++.dg/ext/spe1.C (dg-options): Use -mdejagnu-cpu=.
* gcc.target/powerpc/pr93122.c: Likewise.
* gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise.
* gfortran.dg/pr47614.f (dg-options): Likewise.
* gfortran.dg/pr58968.f: Likewise.
* gfortran.dg/nint_p7.f90: Likewise.  Remove unneeded dg-skip-if.
* g++.dg/pr65240-1.C: Likewise.
* g++.dg/pr65240-2.C: Likewise.
* g++.dg/pr65240-3.C: Likewise.
* g++.dg/pr65240-4.C: Likewise.
* g++.dg/pr65242.C: Likewise.
* g++.dg/pr67211.C: Likewise.
* g++.dg/pr69667.C: Likewise.
* g++.dg/pr71294.C: Likewise.
* g++.dg/pr84279.C: Likewise.
* g++.dg/torture/ppc-ldst-array.C: Likewise.
* g++.dg/torture/pr69264.C (dg-additional-options): Use -mdejagnu-cpu=.
* gcc.dg/pr84032.c: Likewise.
* gcc.dg/torture/pr90972.c: Likewise.
* gcc.dg/vect/O3-pr70130.c: Likewise.
* gfortran.dg/vect/pr45714-b.f: Likewise.
* gcc.dg/vect/pr48765.c: Likewise.  Remove unneeded dg-skip-if.

diff --git a/gcc/testsuite/g++.dg/ext/spe1.C b/gcc/testsuite/g++.dg/ext/spe1.C
index b98d4b27b3d..a2b0224f09e 100644
--- a/gcc/testsuite/g++.dg/ext/spe1.C
+++ b/gcc/testsuite/g++.dg/ext/spe1.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-mcpu=8540 -mspe -mabi=spe -mfloat-gprs=single -O0" } */
+/* { dg-options "-mdejagnu-cpu=8540 -mspe -mabi=spe -mfloat-gprs=single -O0" } 
*/
 /* { dg-skip-if "not an SPE target" { ! powerpc_spe_nocache } } */
 
 typedef int v2si __attribute__ ((vector_size (8)));
diff --git a/gcc/testsuite/gcc.target/powerpc/pr93122.c 
b/gcc/testsuite/gcc.target/powerpc/pr93122.c
index 8ea4eb6a48b..97bcb0cea5f 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr93122.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr93122.c
@@ -1,7 +1,7 @@
 /* PR target/93122 */
 /* { dg-require-effective-target power10_ok } */
 /* { dg-do compile { target lp64 } } */
-/* { dg-options "-fstack-clash-protection -mprefixed -mcpu=power10" } */
+/* { dg-options "-fstack-clash-protection -mprefixed -mdejagnu-cpu=power10" } 
*/
 
 void bar (char *);
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
index f1e3860ee43..ca7e11ba83f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-mcpu=power10 -O2" } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
 /* { dg-require-effective-target power10_hw } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
index 0c5695e4807..f61f62a3c24 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-mcpu=power10 -O2" } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
 /* { dg-require-effective-target power10_hw } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
index 93c3c720246..63ab59005ad 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-mcpu=power10 -O2" } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
 /* { dg-require-effective-target power10_hw } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
index 41dee583e59..ed08172bca6 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-mcpu=power10 -O2" } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
 /* { dg-require-effective-target power10_hw } */
 
 /* Check that the expected 128-bit instructions are generated if the processor
diff --git a/gcc/testsuite/g++.dg/pr65240-1.C b/gcc/testsuite/g++.dg/pr65240-1.C
index d2e25b65fca..ff8910df6a1 

[PATCH] testsuite: Fix gcc.target/arm/multilib.exp use of gcc_opts

2020-08-11 Thread Christophe Lyon via Gcc-patches
This patch fixes an incorrect parameter passing for $gcc_opts, which
produces a DejaGnu error: (DejaGnu) proc "gcc_opts" does not exist.

2020-08-11  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/multilib.exp: Fix parameter passing for gcc_opts.

diff --git a/gcc/testsuite/gcc.target/arm/multilib.exp
b/gcc/testsuite/gcc.target/arm/multilib.exp
index f67a92a..c5f3c02 100644
--- a/gcc/testsuite/gcc.target/arm/multilib.exp
+++ b/gcc/testsuite/gcc.target/arm/multilib.exp
@@ -40,7 +40,7 @@ proc multilib_config {profile} {
 proc check_multi_dir { gcc_opts multi_dir } {
 global tool

-set options [list "additional_flags=[concat
"--print-multi-directory" [gcc_opts]]"]
+set options [list "additional_flags=[concat
"--print-multi-directory" $gcc_opts]"]
 set gcc_output [${tool}_target_compile "" "" "none" $options]
 if { [string match "$multi_dir\n" $gcc_output] } {
pass "multilibdir $gcc_opts $multi_dir"


Re: [PATCH] arm: Clear canary value after stack_protect_test [PR96191]

2020-08-11 Thread Christophe Lyon via Gcc-patches
On Mon, 10 Aug 2020 at 17:27, Richard Sandiford
 wrote:
>
> Christophe Lyon  writes:
> > On Wed, 5 Aug 2020 at 16:33, Richard Sandiford
> >  wrote:
> >>
> >> The stack_protect_test patterns were leaving the canary value in the
> >> temporary register, meaning that it was often still in registers on
> >> return from the function.  An attacker might therefore have been
> >> able to use it to defeat stack-smash protection for a later function.
> >>
> >> Tested on arm-linux-gnueabi, arm-linux-gnueabihf and armeb-eabi.
> >> I tested the thumb1.md part using arm-linux-gnueabi with the
> >> test flags -march=armv5t -mthumb.  OK for trunk and branches?
> >>
> >> As I mentioned in the corresponding aarch64 patch, this is needed
> >> to make arm conform to GCC's current -fstack-protector implementation.
> >> However, I think we should reconsider whether the zeroing is actually
> >> necessary and what it's actually protecting against.  I'll send a
> >> separate message about that to gcc@.  But since the port isn't even
> >> self-consistent (the *set patterns do clear the registers), I think
> >> we should do this first rather than wait for any outcome of that
> >> discussion.
> >>
> >> Richard
> >>
> >>
> >> gcc/
> >> PR target/96191
> >> * config/arm/arm.md (arm_stack_protect_test_insn): Zero out
> >> operand 2 after use.
> >> * config/arm/thumb1.md (thumb1_stack_protect_test_insn): Likewise.
> >>
> >> gcc/testsuite/
> >> * gcc.target/arm/stack-protector-1.c: New test.
> >> * gcc.target/arm/stack-protector-2.c: Likewise.
> >
> > Hi Richard,
> >
> > The new tests fail when compiled with -mcpu=cortex-mXX because gas 
> > complains:
> > use of r13 is deprecated
> > It has a comment saying: "In the Thumb-2 ISA, use of R13 as Rm is
> > deprecated, but valid."
> >
> > It's a minor nuisance, I'm not sure what the best way of getting rid of it?
> > Add #ifndef __thumb2__ around CHECK(r13) ?
>
> Hmm, maybe we should just drop that line altogether.  It wasn't exactly
> likely that r13 would be the register to leak the value :-)
>
> Should I post a patch or do you already have one ready?

I was about to push the patch that removes the line CHECK(r13).

However, I've noticed that when using -mcpu=cortex-m[01], we have an
error from gas:
Error: Thumb does not support this addressing mode -- `str r0,[sp,#-8]!'

The attached patch replaces this instruction with
sub sp,sp,#8
str r0,[sp]

Checked with cortex-m0 and cortex-m3.

OK?

Thanks,

Christophe


>
> Thanks,
> Richard
testsuite: Fix gcc.target/arm/stack-protector-1.c for Cortex-M

The stack-protector-1.c test fails when compiled for Cortex-M:
- for Cortex-M0/M1, str r0, [sp, #-8]! is not supported
- for Cortex-M3/M4..., the assembler complains that "use of r13 is
  deprecated"

This patch replaces the str instruction with
 sub   sp, sp, #8
 str r0, [sp]
and removes the check for r13, which is unlikely to leak the canary
value.

2020-08-11  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/stack-protector-1.c: Adapt code to Cortex-M
restrictions.

diff --git a/gcc/testsuite/gcc.target/arm/stack-protector-1.c 
b/gcc/testsuite/gcc.target/arm/stack-protector-1.c
index b03ea14..8d28b0a 100644
--- a/gcc/testsuite/gcc.target/arm/stack-protector-1.c
+++ b/gcc/testsuite/gcc.target/arm/stack-protector-1.c
@@ -34,7 +34,8 @@ asm (
 "  .type   main, %function\n"
 "main:\n"
 "  bl  get_ptr\n"
-"  str r0, [sp, #-8]!\n"
+"  sub sp, sp, #8\n"
+"  str r0, [sp]\n"
 "  bl  f\n"
 "  str r0, [sp, #4]\n"
 "  ldr r0, [sp]\n"
@@ -51,7 +52,6 @@ asm (
CHECK (r10)
CHECK (r11)
CHECK (r12)
-   CHECK (r13)
CHECK (r14)
 "  ldr r1, [sp, #4]\n"
CHECK (r1)


PING: Fwd: [PATCH] Rewrite get_size_range for irange API.

2020-08-11 Thread Aldy Hernandez via Gcc-patches
-- Forwarded message -
From: Aldy Hernandez 
Date: Thu, Aug 6, 2020, 16:54
Subject: [PATCH] Rewrite get_size_range for irange API.
To: 
Cc: , Aldy Hernandez 


[Martin, does this sound reasonable to you?]

The following patch converts get_size_range to the irange API, thus
removing the use of VR_ANTI_RANGE.

This was a bit tricky because of the gymnastics we do in get_size_range
to ignore negatives and all that.  I didn't convert the function for
multi-ranges.  The code still returns a pair of trees indicating the
massaged range.  But I do believe the code is cleaner and smaller.

I'm not sure the current code (or my adaptation) gets all cases, but
the goal was to keep to the existing functionality, nothing more.
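
As a rough illustration of the massaging described above, the sketch below models a multi-range as a small sorted array of closed pairs. It is a toy stand-in and not the irange API: the struct and function names are assumptions made for the example. It first drops the non-positive values and then, if more than one pair is left and the last one runs up to the type maximum, discards that open-ended tail.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* One closed sub-range [lo, hi].  */
struct pair { long lo, hi; };

/* Toy stand-in for a multi-range: up to 4 sorted, disjoint pairs.  */
struct multi_range { struct pair p[4]; size_t n; };

/* Keep only the values >= 0 (if ALLOW_ZERO) or >= 1 (otherwise).  */
void
remove_non_positives (struct multi_range *r, int allow_zero)
{
  long lowest = allow_zero ? 0 : 1;
  size_t out = 0;

  assert (r->n <= 4);
  for (size_t i = 0; i < r->n; i++)
    {
      struct pair q = r->p[i];
      if (q.hi < lowest)
        continue;               /* Pair lies entirely below the floor.  */
      if (q.lo < lowest)
        q.lo = lowest;          /* Clip a pair that straddles the floor.  */
      r->p[out++] = q;
    }
  r->n = out;
}

/* If several pairs remain and the last one reaches LONG_MAX, drop it,
   so that [5,10][20,MAX] becomes [5,10].  */
void
drop_open_ended_tail (struct multi_range *r)
{
  if (r->n > 1 && r->p[r->n - 1].hi == LONG_MAX)
    r->n--;
}
```

With the input [-3,-1][5,10][20,MAX] and allow_zero false, the first step leaves [5,10][20,MAX] and the second leaves [5,10].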

OK?

gcc/ChangeLog:

* calls.c (range_remove_non_positives): New.
(set_bounds_from_range): New.
(get_size_range): Rewrite for irange API.
* tree-affine.c (expr_to_aff_combination): Call
determine_value_range with a range.
* tree-vrp.c (determine_value_range_1): Rename to...
(determine_value_range): ...this.
* tree-vrp.h (determine_value_range): Adjust prototype.
---
 gcc/calls.c   | 139 ++
 gcc/tree-affine.c |   5 +-
 gcc/tree-vrp.c|  44 ++-
 gcc/tree-vrp.h|   2 +-
 4 files changed, 73 insertions(+), 117 deletions(-)

diff --git a/gcc/calls.c b/gcc/calls.c
index 44401e6350d..4aeeb36a2be 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -57,6 +57,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "builtins.h"
 #include "gimple-fold.h"
+#include "range.h"

 /* Like PREFERRED_STACK_BOUNDARY but in units of bytes, not bits.  */
 #define STACK_BYTES (PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT)
@@ -1237,6 +1238,31 @@ alloc_max_size (void)
   return alloc_object_size_limit;
 }

+// Remove non-positive numbers from a range.  ALLOW_ZERO is TRUE if 0
+// is considered positive.
+
+static void
+range_remove_non_positives (irange *vr, bool allow_zero)
+{
+  tree floor, type = vr->type ();
+  if (allow_zero)
+floor = build_zero_cst (type);
+  else
+floor = build_one_cst (type);
+  value_range positives (floor, TYPE_MAX_VALUE (type));
+  vr->intersect (positives);
+}
+
+// Set the extreme bounds of range VR into range[].
+
+static bool
+set_bounds_from_range (const irange *vr, tree range[2])
+{
+  range[0] = wide_int_to_tree (vr->type (), vr->lower_bound ());
+  range[1] = wide_int_to_tree (vr->type (), vr->upper_bound ());
+  return true;
+}
+
 /* Return true when EXP's range can be determined and set RANGE[] to it
after adjusting it if necessary to make EXP a represents a valid size
of object, or a valid size argument to an allocation function declared
@@ -1250,9 +1276,11 @@ alloc_max_size (void)
 bool
 get_size_range (tree exp, tree range[2], bool allow_zero /* = false */)
 {
-  if (!exp)
-return false;
-
+  if (!exp || !INTEGRAL_TYPE_P (TREE_TYPE (exp)))
+{
+  range[0] = range[1] = NULL_TREE;
+  return false;
+}
   if (tree_fits_uhwi_p (exp))
 {
   /* EXP is a constant.  */
@@ -1261,91 +1289,30 @@ get_size_range (tree exp, tree range[2], bool
allow_zero /* = false */)
 }

   tree exptype = TREE_TYPE (exp);
-  bool integral = INTEGRAL_TYPE_P (exptype);
-
-  wide_int min, max;
-  enum value_range_kind range_type;
-
-  if (integral)
-range_type = determine_value_range (exp, , );
-  else
-range_type = VR_VARYING;
-
-  if (range_type == VR_VARYING)
+  value_range vr;
+  determine_value_range (, exp);
+  if (vr.num_pairs () == 1)
+return set_bounds_from_range (, range);
+
+  widest_irange positives (vr);
+  range_remove_non_positives (, allow_zero);
+
+  // If all numbers are negative, let the caller sort it out.
+  if (positives.undefined_p ())
+return set_bounds_from_range (, range);
+
+  // Remove the unknown parts of a multi-range.
+  // This will transform [5,10][20,MAX] into [5,10].
+  int pairs = positives.num_pairs ();
+  if (pairs > 1
+  && positives.upper_bound () == wi::to_wide (TYPE_MAX_VALUE
(exptype)))
 {
-  if (integral)
-   {
- /* Use the full range of the type of the expression when
-no value range information is available.  */
- range[0] = TYPE_MIN_VALUE (exptype);
- range[1] = TYPE_MAX_VALUE (exptype);
- return true;
-   }
-
-  range[0] = NULL_TREE;
-  range[1] = NULL_TREE;
-  return false;
+  value_range last_range (exptype,
+ positives.lower_bound (pairs - 1),
+ positives.upper_bound (pairs - 1),
VR_ANTI_RANGE);
+  positives.intersect (last_range);
 }
-
-  unsigned expprec = TYPE_PRECISION (exptype);
-
-  bool signed_p = !TYPE_UNSIGNED (exptype);
-
-  if (range_type == VR_ANTI_RANGE)
-{
-  if (signed_p)
-   {
- if (wi::les_p (max, 0))
-   {
- /* EXP is not in a strictly 

PING: Fwd: [PATCH] Adjust tree-ssa-strlen.c for irange API.

2020-08-11 Thread Aldy Hernandez via Gcc-patches
-- Forwarded message -
From: Aldy Hernandez 
Date: Tue, Aug 4, 2020, 13:34
Subject: [PATCH] Adjust tree-ssa-strlen.c for irange API.
To: 
Cc: , , Aldy Hernandez 


This patch adapts the strlen pass to use the irange API.

I wasn't able to remove the one annoying use of VR_ANTI_RANGE, because
I'm not sure what to do.  Perhaps Martin can shed some light.  The
current code has:

  else if (rng == VR_ANTI_RANGE)
{
  wide_int maxobjsize = wi::to_wide (TYPE_MAX_VALUE
(ptrdiff_type_node));
  if (wi::ltu_p (cntrange[1], maxobjsize))
{
  cntrange[0] = cntrange[1] + 1;
  cntrange[1] = maxobjsize;

Suppose we have ~[10,20], won't the above set cntrange[] to [21,MAX]?  Won't
this ignore the 0..9 that is part of the range?  What should we do here?
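
To make the question concrete, here is a small sketch that expands an anti-range over an unsigned domain [0, max] into its ordinary pieces. It is plain C and not GCC code; the struct and function names are assumptions made for illustration.

```c
#include <assert.h>

/* A closed range [lo, hi] of unsigned values.  */
struct bounds { unsigned long lo, hi; };

/* Expand the anti-range ~[ANTI.lo, ANTI.hi] over the domain [0, MAX]
   into at most two ordinary ranges stored in OUT.  Return the number
   of pieces written.  */
int
anti_range_pieces (struct bounds anti, unsigned long max,
                   struct bounds out[2])
{
  int n = 0;

  assert (anti.lo <= anti.hi && anti.hi <= max);
  if (anti.lo > 0)
    out[n++] = (struct bounds) { 0, anti.lo - 1 };    /* Part below.  */
  if (anti.hi < max)
    out[n++] = (struct bounds) { anti.hi + 1, max };  /* Part above.  */
  return n;
}
```

For ~[10,20] this yields two pieces, [0,9] and [21,max], whereas the quoted code keeps only the upper one.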

Anyways, I've left the anti-range in place, but the rest of the patch still
stands.

OK?

gcc/ChangeLog:

* tree-ssa-strlen.c (get_range): Adjust for irange API.
(compare_nonzero_chars): Same.
(dump_strlen_info): Same.
(get_range_strlen_dynamic): Same.
(set_strlen_range): Same.
(maybe_diag_stxncpy_trunc): Same.
(get_len_or_size): Same.
(count_nonzero_bytes_addr): Same.
(handle_integral_assign): Same.
---
 gcc/tree-ssa-strlen.c | 122 --
 1 file changed, 57 insertions(+), 65 deletions(-)

diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index fbaee745f7d..e6009874ee5 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -220,21 +220,25 @@ get_range (tree val, wide_int minmax[2], const
vr_values *rvals /* = NULL */)
 GCC 11).  */
   const value_range *vr
= (CONST_CAST (class vr_values *, rvals)->get_value_range (val));
-  value_range_kind rng = vr->kind ();
-  if (rng != VR_RANGE || !range_int_cst_p (vr))
+  if (vr->undefined_p () || vr->varying_p ())
return NULL_TREE;

-  minmax[0] = wi::to_wide (vr->min ());
-  minmax[1] = wi::to_wide (vr->max ());
+  minmax[0] = vr->lower_bound ();
+  minmax[1] = vr->upper_bound ();
   return val;
 }

-  value_range_kind rng = get_range_info (val, minmax, minmax + 1);
-  if (rng == VR_RANGE)
-return val;
+  value_range vr;
+  get_range_info (val, vr);
+  if (!vr.undefined_p () && !vr.varying_p ())
+{
+  minmax[0] = vr.lower_bound ();
+  minmax[1] = vr.upper_bound ();
+  return val;
+}

-  /* Do not handle anti-ranges and instead make use of the on-demand
- VRP if/when it becomes available (hopefully in GCC 11).  */
+  /* We should adjust for the on-demand VRP if/when it becomes
+ available (hopefully in GCC 11).  */
   return NULL_TREE;
 }

@@ -278,16 +282,18 @@ compare_nonzero_chars (strinfo *si, unsigned
HOST_WIDE_INT off,
 = (CONST_CAST (class vr_values *, rvals)
->get_value_range (si->nonzero_chars));

-  value_range_kind rng = vr->kind ();
-  if (rng != VR_RANGE || !range_int_cst_p (vr))
+  if (vr->undefined_p () || vr->varying_p ())
 return -1;

   /* If the offset is less than the minimum length or if the bounds
  of the length range are equal return the result of the comparison
  same as in the constant case.  Otherwise return a conservative
  result.  */
-  int cmpmin = compare_tree_int (vr->min (), off);
-  if (cmpmin > 0 || tree_int_cst_equal (vr->min (), vr->max ()))
+  tree type = TREE_TYPE (si->nonzero_chars);
+  tree tmin = wide_int_to_tree (type, vr->lower_bound ());
+  tree tmax = wide_int_to_tree (type, vr->upper_bound ());
+  int cmpmin = compare_tree_int (tmin, off);
+  if (cmpmin > 0 || tree_int_cst_equal (tmin, tmax))
 return cmpmin;

   return -1;
@@ -905,32 +911,14 @@ dump_strlen_info (FILE *fp, gimple *stmt, const
vr_values *rvals)
  print_generic_expr (fp, si->nonzero_chars);
  if (TREE_CODE (si->nonzero_chars) == SSA_NAME)
{
- value_range_kind rng = VR_UNDEFINED;
- wide_int min, max;
+ value_range vr;
  if (rvals)
-   {
- const value_range *vr
-   = CONST_CAST (class vr_values *, rvals)
-   ->get_value_range (si->nonzero_chars);
- rng = vr->kind ();
- if (range_int_cst_p (vr))
-   {
- min = wi::to_wide (vr->min ());
- max = wi::to_wide (vr->max ());
-   }
- else
-   rng = VR_UNDEFINED;
-   }
+   vr = *(CONST_CAST (class vr_values *, rvals)
+  ->get_value_range (si->nonzero_chars));
  else
-   rng = get_range_info (si->nonzero_chars, ,
);
-
- 

PING: Fwd: [PATCH 2/2] Decouple adjust_range_from_scev from vr_values and value_range_equiv.

2020-08-11 Thread Aldy Hernandez via Gcc-patches
-- Forwarded message -
From: Aldy Hernandez 
Date: Tue, Aug 4, 2020, 14:03
Subject: [PATCH 2/2] Decouple adjust_range_from_scev from vr_values and
value_range_equiv.
To: 
Cc: , Aldy Hernandez 


I've abstracted out the parts of the code that had nothing to do with
value_range_equiv into an externally visible range_of_var_in_loop().
This way, it can be called with any range.

adjust_range_with_scev still works as before, intersecting with a
known range.  Due to the way value_range_equiv::intersect works,
intersecting a value_range_equiv with no equivalences into one
with equivalences will result in the resulting range maintaining
whatever equivalences it had.  So everything works as the
vr->update() did before (remember that ::update() retains
equivalences).

OK?

gcc/ChangeLog:

* vr-values.c (check_for_binary_op_overflow): Change type of store
to range_query.
(vr_values::adjust_range_with_scev): Abstract most of the code...
(range_of_var_in_loop): ...here.  Remove value_range_equiv uses.
(simplify_using_ranges::simplify_using_ranges): Change type of store
to range_query.
* vr-values.h (class range_query): New.
(class simplify_using_ranges): Use range_query.
(class vr_values): Add OVERRIDE to get_value_range.
(range_of_var_in_loop): New.
---
 gcc/vr-values.c | 140 ++--
 gcc/vr-values.h |  23 ++--
 2 files changed, 81 insertions(+), 82 deletions(-)

diff --git a/gcc/vr-values.c b/gcc/vr-values.c
index 9002d87c14b..e7f97bdbf7b 100644
--- a/gcc/vr-values.c
+++ b/gcc/vr-values.c
@@ -1004,7 +1004,7 @@ vr_values::extract_range_from_comparison
(value_range_equiv *vr,
overflow.  */

 static bool
-check_for_binary_op_overflow (vr_values *store,
+check_for_binary_op_overflow (range_query *store,
  enum tree_code subcode, tree type,
  tree op0, tree op1, bool *ovf)
 {
@@ -1737,22 +1737,18 @@ compare_range_with_value (enum tree_code comp,
const value_range *vr,

   gcc_unreachable ();
 }
+
 /* Given a range VR, a LOOP and a variable VAR, determine whether it
would be profitable to adjust VR using scalar evolution information
for VAR.  If so, update VR with the new limits.  */

 void
-vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop,
-  gimple *stmt, tree var)
+range_of_var_in_loop (irange *vr, range_query *query,
+ class loop *loop, gimple *stmt, tree var)
 {
-  tree init, step, chrec, tmin, tmax, min, max, type, tem;
+  tree init, step, chrec, tmin, tmax, min, max, type;
   enum ev_direction dir;

-  /* TODO.  Don't adjust anti-ranges.  An anti-range may provide
- better opportunities than a regular range, but I'm not sure.  */
-  if (vr->kind () == VR_ANTI_RANGE)
-return;
-
   chrec = instantiate_parameters (loop, analyze_scalar_evolution (loop,
var));

   /* Like in PR19590, scev can return a constant function.  */
@@ -1763,16 +1759,17 @@ vr_values::adjust_range_with_scev
(value_range_equiv *vr, class loop *loop,
 }

   if (TREE_CODE (chrec) != POLYNOMIAL_CHREC)
-return;
+{
+  vr->set_varying (TREE_TYPE (var));
+  return;
+}

   init = initial_condition_in_loop_num (chrec, loop->num);
-  tem = op_with_constant_singleton_value_range (init);
-  if (tem)
-init = tem;
+  if (TREE_CODE (init) == SSA_NAME)
+query->get_value_range (init, stmt)->singleton_p ();
   step = evolution_part_in_loop_num (chrec, loop->num);
-  tem = op_with_constant_singleton_value_range (step);
-  if (tem)
-step = tem;
+  if (TREE_CODE (step) == SSA_NAME)
+query->get_value_range (step, stmt)->singleton_p ();

   /* If STEP is symbolic, we can't know whether INIT will be the
  minimum or maximum value in the range.  Also, unless INIT is
@@ -1781,7 +1778,10 @@ vr_values::adjust_range_with_scev (value_range_equiv
*vr, class loop *loop,
   if (step == NULL_TREE
   || !is_gimple_min_invariant (step)
   || !valid_value_p (init))
-return;
+{
+  vr->set_varying (TREE_TYPE (var));
+  return;
+}

   dir = scev_direction (chrec);
   if (/* Do not adjust ranges if we do not know whether the iv increases
@@ -1790,7 +1790,10 @@ vr_values::adjust_range_with_scev (value_range_equiv
*vr, class loop *loop,
   /* ... or if it may wrap.  */
   || scev_probably_wraps_p (NULL_TREE, init, step, stmt,
get_chrec_loop (chrec), true))
-return;
+{
+  vr->set_varying (TREE_TYPE (var));
+  return;
+}

   type = TREE_TYPE (var);
   if (POINTER_TYPE_P (type) || !TYPE_MIN_VALUE (type))
@@ -1807,7 +1810,7 @@ vr_values::adjust_range_with_scev (value_range_equiv
*vr, class loop *loop,
   if (TREE_CODE (step) == INTEGER_CST
   && is_gimple_val (init)
   && (TREE_CODE (init) != SSA_NAME
- || get_value_range (init, stmt)->kind () == VR_RANGE))
+ 

PING: Fwd: [PATCH 1/2] Add statement context to get_value_range.

2020-08-11 Thread Aldy Hernandez via Gcc-patches
-- Forwarded message -
From: Aldy Hernandez 
Date: Tue, Aug 4, 2020, 13:55
Subject: [PATCH 1/2] Add statement context to get_value_range.
To: 
Cc: , Aldy Hernandez 


This is in line with the statement context that we have for get_value()
in the substitute_and_fold_engine class.
---
 gcc/vr-values.c | 64 ++---
 gcc/vr-values.h | 14 +--
 2 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/gcc/vr-values.c b/gcc/vr-values.c
index 511342f2f13..9002d87c14b 100644
--- a/gcc/vr-values.c
+++ b/gcc/vr-values.c
@@ -147,7 +147,8 @@ vr_values::get_lattice_entry (const_tree var)
return NULL.  Otherwise create an empty range if none existed for VAR.
*/

 const value_range_equiv *
-vr_values::get_value_range (const_tree var)
+vr_values::get_value_range (const_tree var,
+   gimple *stmt ATTRIBUTE_UNUSED)
 {
   /* If we have no recorded ranges, then return NULL.  */
   if (!vr_value)
@@ -450,7 +451,7 @@ simplify_using_ranges::op_with_boolean_value_range_p
(tree op)

   /* ?? Errr, this should probably check for [0,0] and [1,1] as well
  as [0,1].  */
-  const value_range *vr = get_value_range (op);
+  const value_range *vr = get_value_range (op, NULL);
   return *vr == value_range (build_zero_cst (TREE_TYPE (op)),
 build_one_cst (TREE_TYPE (op)));
 }
@@ -972,12 +973,13 @@ vr_values::extract_range_from_cond_expr
(value_range_equiv *vr, gassign *stmt)

 void
 vr_values::extract_range_from_comparison (value_range_equiv *vr,
+ gimple *stmt,
  enum tree_code code,
  tree type, tree op0, tree op1)
 {
   bool sop;
   tree val
-= simplifier.vrp_evaluate_conditional_warnv_with_ops (code, op0, op1,
+= simplifier.vrp_evaluate_conditional_warnv_with_ops (stmt, code, op0,
op1,
  false, ,
NULL);
   if (val)
 {
@@ -1008,14 +1010,14 @@ check_for_binary_op_overflow (vr_values *store,
 {
   value_range vr0, vr1;
   if (TREE_CODE (op0) == SSA_NAME)
-vr0 = *store->get_value_range (op0);
+vr0 = *store->get_value_range (op0, NULL);
   else if (TREE_CODE (op0) == INTEGER_CST)
 vr0.set (op0);
   else
 vr0.set_varying (TREE_TYPE (op0));

   if (TREE_CODE (op1) == SSA_NAME)
-vr1 = *store->get_value_range (op1);
+vr1 = *store->get_value_range (op1, NULL);
   else if (TREE_CODE (op1) == INTEGER_CST)
 vr1.set (op1);
   else
@@ -1472,7 +1474,7 @@ vr_values::extract_range_from_assignment
(value_range_equiv *vr, gassign *stmt)
   else if (code == COND_EXPR)
 extract_range_from_cond_expr (vr, stmt);
   else if (TREE_CODE_CLASS (code) == tcc_comparison)
-extract_range_from_comparison (vr, gimple_assign_rhs_code (stmt),
+extract_range_from_comparison (vr, stmt, gimple_assign_rhs_code (stmt),
   gimple_expr_type (stmt),
   gimple_assign_rhs1 (stmt),
   gimple_assign_rhs2 (stmt));
@@ -1805,7 +1807,7 @@ vr_values::adjust_range_with_scev (value_range_equiv
*vr, class loop *loop,
   if (TREE_CODE (step) == INTEGER_CST
   && is_gimple_val (init)
   && (TREE_CODE (init) != SSA_NAME
- || get_value_range (init)->kind () == VR_RANGE))
+ || get_value_range (init, stmt)->kind () == VR_RANGE))
 {
   widest_int nit;

@@ -1838,7 +1840,7 @@ vr_values::adjust_range_with_scev (value_range_equiv
*vr, class loop *loop,
  value_range initvr;

  if (TREE_CODE (init) == SSA_NAME)
-   initvr = *(get_value_range (init));
+   initvr = *(get_value_range (init, stmt));
  else if (is_gimple_min_invariant (init))
initvr.set (init);
  else
@@ -2090,7 +2092,7 @@ const value_range_equiv *
 simplify_using_ranges::get_vr_for_comparison (int i, value_range_equiv
*tem)
 {
   /* Shallow-copy equiv bitmap.  */
-  const value_range_equiv *vr = get_value_range (ssa_name (i));
+  const value_range_equiv *vr = get_value_range (ssa_name (i), NULL);

   /* If name N_i does not have a valid range, use N_i as its own
  range.  This allows us to compare against names that may
@@ -2115,7 +2117,7 @@ simplify_using_ranges::compare_name_with_value
 bool *strict_overflow_p, bool use_equiv_p)
 {
   /* Get the set of equivalences for VAR.  */
-  bitmap e = get_value_range (var)->equiv ();
+  bitmap e = get_value_range (var, NULL)->equiv ();

   /* Start at -1.  Set it to 0 if we do a comparison without relying
  on overflow, or 1 if all comparisons rely on overflow.  */
@@ -2195,8 +2197,8 @@ simplify_using_ranges::compare_names (enum tree_code
comp, tree n1, tree n2,
 {
   /* Compare the ranges of every name equivalent to N1 against the
  ranges of every name 

PING: Fwd: [PATCH] Adjust tree-ssa-dom.c for irange API.

2020-08-11 Thread Aldy Hernandez via Gcc-patches
-- Forwarded message -
From: Aldy Hernandez 
Date: Tue, Aug 4, 2020, 13:39
Subject: [PATCH] Adjust tree-ssa-dom.c for irange API.
To: 
Cc: , Aldy Hernandez 


This patch removes all uses of VR_ANTI_RANGE in DOM.  It required
minor surgery in the switch handling code.

In doing so, I was able to abstract all the code handling the cases
with ranges into its own function.  Interestingly, there is an exact
copy of this function in VRP, so I was able to use that there too.
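
For anyone skimming the diff, the decision that the shared function makes for an ordinary range can be sketched in plain C as follows. This is a simplified model, not the GCC implementation: the case_label struct, the return codes and the linear scan are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* A case label covering [low, high]; a single-value case has
   high == low.  */
struct case_label { long low, high; };

enum { TAKES_DEFAULT = -1, UNKNOWN = -2 };

/* Given the known range [MIN, MAX] of the switch operand, return the
   index of the only label that can be taken, TAKES_DEFAULT if no label
   overlaps the range, or UNKNOWN if the outcome cannot be decided.  */
int
label_for_range (const struct case_label *labels, size_t n,
                 long min, long max)
{
  int overlapping = 0;
  int hit = UNKNOWN;

  assert (min <= max);
  for (size_t i = 0; i < n; i++)
    if (labels[i].low <= max && labels[i].high >= min)
      {
        overlapping++;
        hit = (int) i;
      }

  /* No label overlaps the range: the default label is taken.  */
  if (overlapping == 0)
    return TAKES_DEFAULT;

  /* Exactly one label overlaps and it contains the whole range.  */
  if (overlapping == 1
      && labels[hit].low <= min && labels[hit].high >= max)
    return hit;

  return UNKNOWN;
}
```

A label is only returned when the operand's range lies entirely within its bounds; partial overlap or multiple candidates give no answer.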

I also saw that most of simplify_stmt_for_jump_threading() is
duplicated in VRP/DOM, but I left that alone.  The amount of
duplicated code in this space is mind boggling.

OK?

gcc/ChangeLog:

* tree-ssa-dom.c (simplify_stmt_for_jump_threading): Abstract code
out to...
* tree-vrp.c (find_case_label_range): ...here.  Rewrite to use
irange API.
(simplify_stmt_for_jump_threading): Call find_case_label_range
instead of duplicating the code in simplify_stmt_for_jump_threading.
* tree-vrp.h (find_case_label_range): New prototype.
---
 gcc/tree-ssa-dom.c |  56 +++---
 gcc/tree-vrp.c | 117 +++--
 gcc/tree-vrp.h |   1 +
 3 files changed, 67 insertions(+), 107 deletions(-)

diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 69eaec345bf..de5025f3879 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -868,7 +868,11 @@ make_pass_dominator (gcc::context *ctxt)
 static class vr_values *x_vr_values;

 /* A trivial wrapper so that we can present the generic jump
-   threading code with a simple API for simplifying statements.  */
+   threading code with a simple API for simplifying statements.
+
+   ?? This should be cleaned up.  There's a virtually identical copy
+   of this function in tree-vrp.c.  */
+
 static tree
 simplify_stmt_for_jump_threading (gimple *stmt,
  gimple *within_stmt ATTRIBUTE_UNUSED,
@@ -901,55 +905,7 @@ simplify_stmt_for_jump_threading (gimple *stmt,
return NULL_TREE;

   const value_range_equiv *vr = x_vr_values->get_value_range (op);
-  if (vr->undefined_p ()
- || vr->varying_p ()
- || vr->symbolic_p ())
-   return NULL_TREE;
-
-  if (vr->kind () == VR_RANGE)
-   {
- size_t i, j;
-
- find_case_label_range (switch_stmt, vr->min (), vr->max (), &i, &j);
-
- /* Is there only one such label?  */
- if (i == j)
-   {
- tree label = gimple_switch_label (switch_stmt, i);
- tree singleton;
-
- /* The i'th label will only be taken if the value range of the
-operand is entirely within the bounds of this label.  */
- if (CASE_HIGH (label) != NULL_TREE
- ? (tree_int_cst_compare (CASE_LOW (label), vr->min ()) <= 0
-&& tree_int_cst_compare (CASE_HIGH (label), vr->max ()) >= 0)
- : (vr->singleton_p ()
-&& tree_int_cst_equal (CASE_LOW (label), singleton)))
-   return label;
-   }
-
- /* If there are no such labels, then the default label
-will be taken.  */
- if (i > j)
-   return gimple_switch_label (switch_stmt, 0);
-   }
-
-  if (vr->kind () == VR_ANTI_RANGE)
-  {
-unsigned n = gimple_switch_num_labels (switch_stmt);
-tree min_label = gimple_switch_label (switch_stmt, 1);
-tree max_label = gimple_switch_label (switch_stmt, n - 1);
-
-/* The default label will be taken only if the anti-range of the
-   operand is entirely outside the bounds of all the (non-default)
-   case labels.  */
-if (tree_int_cst_compare (vr->min (), CASE_LOW (min_label)) <= 0
-&& (CASE_HIGH (max_label) != NULL_TREE
-? tree_int_cst_compare (vr->max (), CASE_HIGH (max_label)) >= 0
-: tree_int_cst_compare (vr->max (), CASE_LOW (max_label)) >= 0))
-return gimple_switch_label (switch_stmt, 0);
-  }
-   return NULL_TREE;
+  return find_case_label_range (switch_stmt, vr);
 }

   if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index de84c1d505d..8c1a1854daa 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3802,6 +3802,61 @@ find_case_label_range (gswitch *stmt, tree min, tree max, size_t *min_idx,
 }
 }

+/* Given a SWITCH_STMT, return the case label that encompasses the
+   known possible values for the switch operand.  RANGE_OF_OP is a
+   range for the known values of the switch operand.  */
+
+tree
+find_case_label_range (gswitch *switch_stmt, const irange *range_of_op)
+{
+  if (range_of_op->undefined_p ()
+  || range_of_op->varying_p ()
+  || range_of_op->symbolic_p ())
+return NULL_TREE;
+
+  size_t i, j;
+  tree op = gimple_switch_index (switch_stmt);
+  tree type = TREE_TYPE (op);
+ 

[PATCH 2/3] vec: default exact = false in grow functions.

2020-08-11 Thread Martin Liška


>From 292532ea9e3d42ca164b9951674c1eccc86a1f11 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 10 Aug 2020 12:01:59 +0200
Subject: [PATCH 2/3] vec: default exact = false in grow functions.

gcc/ChangeLog:

	* vec.h (vec_safe_grow): Change default of exact to false.
	(vec_safe_grow_cleared): Likewise.
---
 gcc/vec.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/vec.h b/gcc/vec.h
index e6e40e2f265..a908d751ab7 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -724,7 +724,7 @@ vec_free (vec *)
 template
 inline void
 vec_safe_grow (vec *, unsigned len,
-	   bool exact CXX_MEM_STAT_INFO)
+	   bool exact = false CXX_MEM_STAT_INFO)
 {
   unsigned oldlen = vec_safe_length (v);
   gcc_checking_assert (len >= oldlen);
@@ -737,7 +737,7 @@ vec_safe_grow (vec *, unsigned len,
 template
 inline void
 vec_safe_grow_cleared (vec *, unsigned len,
-		   bool exact CXX_MEM_STAT_INFO)
+		   bool exact = false CXX_MEM_STAT_INFO)
 {
   unsigned oldlen = vec_safe_length (v);
   vec_safe_grow (v, len, exact PASS_MEM_STAT);
@@ -750,7 +750,7 @@ vec_safe_grow_cleared (vec *, unsigned len,
 template
 inline void
 vec_safe_grow_cleared (vec *,
-		   unsigned len, bool exact CXX_MEM_STAT_INFO)
+		   unsigned len, bool exact = false CXX_MEM_STAT_INFO)
 {
   v->safe_grow_cleared (len, exact PASS_MEM_STAT);
 }
@@ -1462,8 +1462,8 @@ public:
   T *safe_push (const T _MEM_STAT_INFO);
   T  (void);
   void truncate (unsigned);
-  void safe_grow (unsigned, bool CXX_MEM_STAT_INFO);
-  void safe_grow_cleared (unsigned, bool CXX_MEM_STAT_INFO);
+  void safe_grow (unsigned, bool = false CXX_MEM_STAT_INFO);
+  void safe_grow_cleared (unsigned, bool = false CXX_MEM_STAT_INFO);
   void quick_grow (unsigned);
   void quick_grow_cleared (unsigned);
   void quick_insert (unsigned, const T &);
-- 
2.28.0



[PATCH 3/3] vec: use inexact growth where possible.

2020-08-11 Thread Martin Liška


>From cc1d41a469d76f2f8e4f44bed788ace77a1c6d62 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 10 Aug 2020 12:09:19 +0200
Subject: [PATCH 3/3] vec: use inexact growth where possible.

gcc/ChangeLog:

	* cfgrtl.c (rtl_create_basic_block): Use default value for
	growth vector function.
	* gimple.c (gimple_set_bb): Likewise.
	* symbol-summary.h: Likewise.
	* tree-cfg.c (init_empty_tree_cfg_for_function): Likewise.
	(build_gimple_cfg): Likewise.
	(create_bb): Likewise.
	(move_block_to_fn): Likewise.
---
 gcc/cfgrtl.c |  8 ++--
 gcc/gimple.c |  7 +--
 gcc/symbol-summary.h | 13 +++--
 gcc/tree-cfg.c   | 27 +--
 4 files changed, 15 insertions(+), 40 deletions(-)

diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 03fa688fed6..0e65537f255 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -374,12 +374,8 @@ rtl_create_basic_block (void *headp, void *endp, basic_block after)
   /* Grow the basic block array if needed.  */
   if ((size_t) last_basic_block_for_fn (cfun)
   >= basic_block_info_for_fn (cfun)->length ())
-{
-  size_t new_size =
-	(last_basic_block_for_fn (cfun)
-	 + (last_basic_block_for_fn (cfun) + 3) / 4);
-  vec_safe_grow_cleared (basic_block_info_for_fn (cfun), new_size, true);
-}
+vec_safe_grow_cleared (basic_block_info_for_fn (cfun),
+			   last_basic_block_for_fn (cfun) + 1);
 
   n_basic_blocks_for_fn (cfun)++;
 
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 337a83a9154..a174ed48e0b 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -1689,12 +1689,7 @@ gimple_set_bb (gimple *stmt, basic_block bb)
 	vec_safe_length (label_to_block_map_for_fn (cfun));
 	  LABEL_DECL_UID (t) = uid = cfun->cfg->last_label_uid++;
 	  if (old_len <= (unsigned) uid)
-	{
-	  unsigned new_len = 3 * uid / 2 + 1;
-
-	  vec_safe_grow_cleared (label_to_block_map_for_fn (cfun),
- new_len, true);
-	}
+	vec_safe_grow_cleared (label_to_block_map_for_fn (cfun), uid + 1);
 	}
 
   (*label_to_block_map_for_fn (cfun))[uid] = bb;
diff --git a/gcc/symbol-summary.h b/gcc/symbol-summary.h
index fa1df5c8015..a38eb1db778 100644
--- a/gcc/symbol-summary.h
+++ b/gcc/symbol-summary.h
@@ -354,11 +354,8 @@ public:
   id = this->m_symtab->assign_summary_id (node);
 
 if ((unsigned int)id >= m_vector->length ())
-  {
-	int newlen = this->m_symtab->cgraph_max_summary_id;
-	vec_safe_reserve (m_vector, newlen - m_vector->length ());
-	m_vector->quick_grow_cleared (newlen);
-  }
+  vec_safe_grow_cleared (m_vector,
+			 this->m_symtab->cgraph_max_summary_id);
 
 if ((*m_vector)[id] == NULL)
   (*m_vector)[id] = this->allocate_new ();
@@ -815,11 +812,7 @@ public:
   id = this->m_symtab->assign_summary_id (edge);
 
 if ((unsigned)id >= m_vector->length ())
-  {
-	int newlen = this->m_symtab->edges_max_summary_id;
-	m_vector->reserve (newlen - m_vector->length ());
-	m_vector->quick_grow_cleared (newlen);
-  }
+  vec_safe_grow_cleared (m_vector, this->m_symtab->edges_max_summary_id);
 
 if ((*m_vector)[id] == NULL)
   (*m_vector)[id] = this->allocate_new ();
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 2bae2eeddba..b79cf6c6d4c 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -183,12 +183,12 @@ init_empty_tree_cfg_for_function (struct function *fn)
   last_basic_block_for_fn (fn) = NUM_FIXED_BLOCKS;
   vec_alloc (basic_block_info_for_fn (fn), initial_cfg_capacity);
   vec_safe_grow_cleared (basic_block_info_for_fn (fn),
-			 initial_cfg_capacity, true);
+			 initial_cfg_capacity);
 
   /* Build a mapping of labels to their associated blocks.  */
   vec_alloc (label_to_block_map_for_fn (fn), initial_cfg_capacity);
   vec_safe_grow_cleared (label_to_block_map_for_fn (fn),
-			 initial_cfg_capacity, true);
+			 initial_cfg_capacity);
 
   SET_BASIC_BLOCK_FOR_FN (fn, ENTRY_BLOCK, ENTRY_BLOCK_PTR_FOR_FN (fn));
   SET_BASIC_BLOCK_FOR_FN (fn, EXIT_BLOCK, EXIT_BLOCK_PTR_FOR_FN (fn));
@@ -232,7 +232,7 @@ build_gimple_cfg (gimple_seq seq)
   if (basic_block_info_for_fn (cfun)->length ()
   < (size_t) n_basic_blocks_for_fn (cfun))
 vec_safe_grow_cleared (basic_block_info_for_fn (cfun),
-			   n_basic_blocks_for_fn (cfun), true);
+			   n_basic_blocks_for_fn (cfun));
 
   /* To speed up statement iterator walks, we first purge dead labels.  */
   cleanup_dead_labels ();
@@ -681,12 +681,8 @@ create_bb (void *h, void *e, basic_block after)
   /* Grow the basic block array if needed.  */
   if ((size_t) last_basic_block_for_fn (cfun)
   == basic_block_info_for_fn (cfun)->length ())
-{
-  size_t new_size =
-	(last_basic_block_for_fn (cfun)
-	 + (last_basic_block_for_fn (cfun) + 3) / 4);
-  vec_safe_grow_cleared (basic_block_info_for_fn (cfun), new_size, true);
-}
+vec_safe_grow_cleared (basic_block_info_for_fn (cfun),
+			   last_basic_block_for_fn (cfun) + 1);
 
   /* Add the newly created block to the array.  */
   SET_BASIC_BLOCK_FOR_FN (cfun, 

[PATCH 1/3] vec: add exact argument for various grow functions.

2020-08-11 Thread Martin Liška

Hello.

All right, I did it in 3 steps:
1) - new exact argument is added (no default value) - I tested it on
x86_64-linux-gnu and built all cross targets.
2) set default value of exact = false
3) change places which calculate its own growth to use the default argument
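
For readers unfamiliar with the distinction, the difference between exact and
inexact growth can be sketched as below.  This is only an illustration under
assumptions: grow_capacity is a hypothetical helper, not a function from
vec.h, and the 1.5x factor is chosen for the example rather than taken from
the real allocation policy.

```cpp
#include <cstddef>

// Hypothetical helper, not part of vec.h: return the capacity to allocate
// when a vector that currently has room for CURRENT elements must hold at
// least NEEDED elements.
std::size_t
grow_capacity (std::size_t current, std::size_t needed, bool exact)
{
  // Nothing to do if the existing storage is already large enough.
  if (needed <= current)
    return current;

  // Exact growth: allocate precisely what was asked for.
  if (exact)
    return needed;

  // Inexact growth: over-allocate (assumed factor of 1.5) so that a
  // sequence of small grow requests does not reallocate every time.
  std::size_t grown = current + current / 2;
  return grown > needed ? grown : needed;
}
```

With a policy like this, callers that previously computed their own padded
size can simply pass the length they need and rely on the inexact default.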

I would like to install first 1) and then wait some time before the rest is 
installed.

Thoughts?
Martin
>From c659680bd65bdaa749d8c07fc99b45542a872786 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 10 Aug 2020 11:11:05 +0200
Subject: [PATCH 1/3] vec: add exact argument for various grow functions.

gcc/ada/ChangeLog:

	* gcc-interface/trans.c (gigi): Set exact argument of a vector
	growth function to true.
	(Attribute_to_gnu): Likewise.

gcc/ChangeLog:

	* alias.c (init_alias_analysis): Set exact argument of a vector
	growth function to true.
	* calls.c (internal_arg_pointer_based_exp_scan): Likewise.
	* cfgbuild.c (find_many_sub_basic_blocks): Likewise.
	* cfgexpand.c (expand_asm_stmt): Likewise.
	* cfgrtl.c (rtl_create_basic_block): Likewise.
	* combine.c (combine_split_insns): Likewise.
	(combine_instructions): Likewise.
	* config/aarch64/aarch64-sve-builtins.cc (function_expander::add_output_operand): Likewise.
	(function_expander::add_input_operand): Likewise.
	(function_expander::add_integer_operand): Likewise.
	(function_expander::add_address_operand): Likewise.
	(function_expander::add_fixed_operand): Likewise.
	* df-core.c (df_worklist_dataflow_doublequeue): Likewise.
	* dwarf2cfi.c (update_row_reg_save): Likewise.
	* early-remat.c (early_remat::init_block_info): Likewise.
	(early_remat::finalize_candidate_indices): Likewise.
	* except.c (sjlj_build_landing_pads): Likewise.
	* final.c (compute_alignments): Likewise.
	(grow_label_align): Likewise.
	* function.c (temp_slots_at_level): Likewise.
	* fwprop.c (build_single_def_use_links): Likewise.
	(update_uses): Likewise.
	* gcc.c (insert_wrapper): Likewise.
	* genautomata.c (create_state_ainsn_table): Likewise.
	(add_vect): Likewise.
	(output_dead_lock_vect): Likewise.
	* genmatch.c (capture_info::capture_info): Likewise.
	(parser::finish_match_operand): Likewise.
	* genrecog.c (optimize_subroutine_group): Likewise.
	(merge_pattern_info::merge_pattern_info): Likewise.
	(merge_into_decision): Likewise.
	(print_subroutine_start): Likewise.
	(main): Likewise.
	* gimple-loop-versioning.cc (loop_versioning::loop_versioning): Likewise.
	* gimple.c (gimple_set_bb): Likewise.
	* graphite-isl-ast-to-gimple.c (translate_isl_ast_node_user): Likewise.
	* haifa-sched.c (sched_extend_luids): Likewise.
	(extend_h_i_d): Likewise.
	* insn-addr.h (insn_addresses_new): Likewise.
	* ipa-cp.c (gather_context_independent_values): Likewise.
	(find_more_contexts_for_caller_subset): Likewise.
	* ipa-devirt.c (final_warning_record::grow_type_warnings): Likewise.
	(ipa_odr_read_section): Likewise.
	* ipa-fnsummary.c (evaluate_properties_for_edge): Likewise.
	(ipa_fn_summary_t::duplicate): Likewise.
	(analyze_function_body): Likewise.
	(ipa_merge_fn_summary_after_inlining): Likewise.
	(read_ipa_call_summary): Likewise.
	* ipa-icf.c (sem_function::bb_dict_test): Likewise.
	* ipa-prop.c (ipa_alloc_node_params): Likewise.
	(parm_bb_aa_status_for_bb): Likewise.
	(ipa_compute_jump_functions_for_edge): Likewise.
	(ipa_analyze_node): Likewise.
	(update_jump_functions_after_inlining): Likewise.
	(ipa_read_edge_info): Likewise.
	(read_ipcp_transformation_info): Likewise.
	(ipcp_transform_function): Likewise.
	* ipa-reference.c (ipa_reference_write_optimization_summary): Likewise.
	* ipa-split.c (execute_split_functions): Likewise.
	* ira.c (find_moveable_pseudos): Likewise.
	* lower-subreg.c (decompose_multiword_subregs): Likewise.
	* lto-streamer-in.c (input_eh_regions): Likewise.
	(input_cfg): Likewise.
	(input_struct_function_base): Likewise.
	(input_function): Likewise.
	* modulo-sched.c (set_node_sched_params): Likewise.
	(extend_node_sched_params): Likewise.
	(schedule_reg_moves): Likewise.
	* omp-general.c (omp_construct_simd_compare): Likewise.
	* passes.c (pass_manager::create_pass_tab): Likewise.
	(enable_disable_pass): Likewise.
	* predict.c (determine_unlikely_bbs): Likewise.
	* profile.c (compute_branch_probabilities): Likewise.
	* read-rtl-function.c (function_reader::parse_block): Likewise.
	* read-rtl.c (rtx_reader::read_rtx_code): Likewise.
	* reg-stack.c (stack_regs_mentioned): Likewise.
	* regrename.c (regrename_init): Likewise.
	* rtlanal.c (T>::add_single_to_queue): Likewise.
	* sched-deps.c (init_deps_data_vector): Likewise.
	* sel-sched-ir.c (sel_extend_global_bb_info): Likewise.
	(extend_region_bb_info): Likewise.
	(extend_insn_data): Likewise.
	* symtab.c (symtab_node::create_reference): Likewise.
	* tracer.c (tail_duplicate): Likewise.
	* trans-mem.c (tm_region_init): Likewise.
	(get_bb_regions_instrumented): Likewise.
	* tree-cfg.c (init_empty_tree_cfg_for_function): Likewise.
	(build_gimple_cfg): Likewise.
	(create_bb): Likewise.
	(move_block_to_fn): Likewise.
	* tree-complex.c 

Re: [PATCH] expr: Optimize noop copies [PR96539]

2020-08-11 Thread Richard Biener
On August 11, 2020 11:00:14 AM GMT+02:00, Jakub Jelinek  
wrote:
>Hi!
>
>At GIMPLE e.g. for __builtin_memmove we optimize away (to just the
>return
>value) noop copies where src == dest, but at the RTL we don't, and as
>the
>testcase shows, in some cases such copies can appear only at the RTL
>level
>e.g. from trying to copy an aggregate by value argument to the same
>location
>as it already has.  If the block move is expanded e.g. piecewise, we
>actually manage to optimize it away, as the individual memory copies
>are
>seen as noop moves, but if the target optabs are used, often the
>sequences
>stay until final.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2020-08-10  Jakub Jelinek  
>
>   PR rtl-optimization/96539
>   * expr.c (emit_block_move_hints): Don't copy anything if x and y
>   are the same and neither is MEM_VOLATILE_P.
>
>   * gcc.target/i386/pr96539.c: New test.
>
>--- gcc/expr.c.jj  2020-07-28 15:39:09.886757905 +0200
>+++ gcc/expr.c 2020-08-10 13:14:47.190328119 +0200
>@@ -1637,6 +1637,12 @@ emit_block_move_hints (rtx x, rtx y, rtx
>   x = adjust_address (x, BLKmode, 0);
>   y = adjust_address (y, BLKmode, 0);
> 
>+  /* If source and destination are the same, no need to copy anything.
> */
>+  if (rtx_equal_p (x, y)
>+  && !MEM_VOLATILE_P (x)
>+  && !MEM_VOLATILE_P (y))
>+return 0;
>+
>/* Set MEM_SIZE as appropriate for this block copy.  The main place
>this
>  can be incorrect is coming from __builtin_memcpy.  */
>   poly_int64 const_size;
>--- gcc/testsuite/gcc.target/i386/pr96539.c.jj 2020-08-10
>13:37:14.492946062 +0200
>+++ gcc/testsuite/gcc.target/i386/pr96539.c2020-08-10
>13:36:57.158183171 +0200
>@@ -0,0 +1,16 @@
>+/* PR rtl-optimization/96539 */
>+/* { dg-do compile } */
>+/* { dg-options "-Os" } */
>+/* { dg-final { scan-assembler-not "rep\[^\n\r]\*movs" } } */
>+
>+struct A { int a, b, c, d, e, f; void *g, *h, *i, *j, *k, *l, *m; };
>+
>+int bar (int a);
>+int baz (int a, int b, int c, void *p, struct A s);
>+
>+int
>+foo (int a, int b, int c, void *p, struct A s)
>+{
>+  bar (a);
>+  return baz (a, b, c, p, s);
>+}
>
>   Jakub



Re: [PATCH] tree: Fix up get_narrower [PR96549]

2020-08-11 Thread Richard Biener
On August 11, 2020 10:46:47 AM GMT+02:00, Jakub Jelinek  
wrote:
>Hi!
>
>My changes to get_narrower to support COMPOUND_EXPRs apparently
>used a wrong type for the COMPOUND_EXPRs, while e.g. the rhs
>type was unsigned short, the COMPOUND_EXPR got int type as that was the
>original type of op.  The type of COMPOUND_EXPR should be always the
>type
>of the rhs.
>
>Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
>ok for trunk/10.3?

OK. 

Richard. 

>2020-08-10  Jakub Jelinek  
>
>   PR c/96549
>   * tree.c (get_narrower): Use TREE_TYPE (ret) instead of
>   TREE_TYPE (win) for COMPOUND_EXPRs.
>
>   * gcc.c-torture/execute/pr96549.c: New test.
>
>--- gcc/tree.c.jj  2020-08-03 22:54:51.456531124 +0200
>+++ gcc/tree.c 2020-08-10 11:05:49.129685858 +0200
>@@ -8877,7 +8877,7 @@ get_narrower (tree op, int *unsignedp_pt
>   v.safe_push (op);
>   FOR_EACH_VEC_ELT_REVERSE (v, i, op)
>   ret = build2_loc (EXPR_LOCATION (op), COMPOUND_EXPR,
>-TREE_TYPE (win), TREE_OPERAND (op, 0),
>+TREE_TYPE (ret), TREE_OPERAND (op, 0),
> ret);
>   return ret;
> }
>--- gcc/testsuite/gcc.c-torture/execute/pr96549.c.jj   2020-08-10
>11:09:30.307623013 +0200
>+++ gcc/testsuite/gcc.c-torture/execute/pr96549.c  2020-08-10
>11:09:15.772824289 +0200
>@@ -0,0 +1,12 @@
>+/* PR c/96549 */
>+
>+long c = -1L;
>+long b = 0L;
>+
>+int
>+main ()
>+{
>+  if (3L > (short) ((c ^= (b = 1L)) * 3L))
>+return 0;
>+  __builtin_abort ();
>+}
>
>   Jakub



Re: [PATCH] x86_64: Use peephole2 to eliminate redundant moves.

2020-08-11 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 11, 2020 at 9:34 AM Roger Sayle  wrote:
>
>
> The recent fix for mul_widen_cost revealed an interesting
> quirk of ira/reload register allocation on x86_64.  As shown in
> https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551648.html
> for gcc.target/i386/pr71321.c we generate the following code that
> performs unnecessary register shuffling.
>
> movl    $-51, %edx
> movl    %edx, %eax
> mulb    %dil
>
> which is caused by reload generating the following instructions
> (notice the set of the first register is dead in the 2nd insn):
>
> (insn 7 4 36 2 (set (reg:QI 1 dx [94])
> (const_int -51 [0xffcd])) {*movqi_internal}
>  (expr_list:REG_EQUIV (const_int -51 [0xffcd])
> (nil)))
> (insn 36 7 8 2 (set (reg:QI 0 ax [93])
> (reg:QI 1 dx [94])) {*movqi_internal}
>  (expr_list:REG_DEAD (reg:QI 1 dx [94])
> (nil)))
>
> Various discussions in bugzilla seem to point to reload preferring
> not to load constants directly into CLASS_LIKELY_SPILLED_P registers.

This can extend the lifetime of a register over the instruction that
needs one of the CLASS_LIKELY_SPILLED_P registers. Various MUL, DIV
and even shift insns were able to choke the allocator for x86 targets,
so this is a small price to pay to avoid regalloc failure.

> Whatever the cause, one solution (workaround), that doesn't involve
> rewriting a register allocator, is to use peephole2 to spot this
> weirdness and eliminate it.  In fact, this use case is (probably)
> the reason peephole optimizers were originally developed, but it's
> a little disappointing this application of them is still required
> today.  On a positive note, this clean-up is cheap, as we're already
> traversing the instruction stream with liveness (REG_DEAD notes)
> already calculated.
>
> With this peephole2 the above three instructions (from pr71321.c)
> are replaced with:
>
> movl    $-51, %eax
> mulb    %dil
>
> This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
> and "make -k check" with no new failures.  This peephole triggers
> 1435 times during stage2 and stage3 of a bootstrap, and a further 1274
> times during "make check".  The most common case is DX_REG->AX_REG
> (as above) which occurs 421 times.  I've restricted this pattern to
> immediate constant loads into general operand registers, which fixes
> this particular problem, but broader predicates may help similar cases.
> Ok for mainline?
>
> 2020-08-11  Roger Sayle  
>
> * config/i386/i386.md (peephole2): Reduce unnecessary
> register shuffling produced by register allocation.

LGTM, but I wonder if the allocator is also too conservative with
memory operands. Perhaps x86_64_general_operand can be used here.

Uros.
>
> Thanks in advance,
> Roger
> --
> Roger Sayle
> NextMove Software
> Cambridge, UK
>


Do not combine PRED_LOOP_GUARD and PRED_LOOP_GUARD_WITH_RECURSION

2020-08-11 Thread Jan Hubicka
Hi,
this patch prevents both PRED_LOOP_GUARD and PRED_LOOP_GUARD_WITH_RECURSION from
being attached to one edge.  We have logic that prevents the same predictor from
being applied to one edge twice, but since we split LOOP_GUARD into two more
specialized cases, this no longer fires.
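
As a rough illustration of the intended rule, here is a standalone sketch.
It does not use the real predict.c data structures: the predictor enum, the
edge_predictions alias and maybe_predict are simplified stand-ins assumed for
the example.

```cpp
#include <algorithm>
#include <vector>

// Simplified stand-ins for the two loop guard predictors and any other one.
enum predictor
{
  LOOP_GUARD,
  LOOP_GUARD_WITH_RECURSION,
  OTHER
};

// Hypothetical model of the predictions attached to a single edge.
using edge_predictions = std::vector<predictor>;

static bool
has_predictor (const edge_predictions &preds, predictor p)
{
  return std::find (preds.begin (), preds.end (), p) != preds.end ();
}

// Attach P to the edge unless an equivalent predictor is already there.
// The recursion variant is treated as superior: it replaces a plain
// LOOP_GUARD, and a plain LOOP_GUARD is not added on top of it.
void
maybe_predict (edge_predictions &preds, predictor p)
{
  if (has_predictor (preds, p))
    return;
  if (p == LOOP_GUARD && has_predictor (preds, LOOP_GUARD_WITH_RECURSION))
    return;
  if (p == LOOP_GUARD_WITH_RECURSION)
    preds.erase (std::remove (preds.begin (), preds.end (), LOOP_GUARD),
		 preds.end ());
  preds.push_back (p);
}
```

Whatever order the two guards arrive in, the edge ends up with at most one of
them, which is the property the patch is after.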

Double prediction happens in the exchange benchmark and leads to unrealistically
low hitrates on some edges, which in turn leads to a bad IPA profile and
misguides ipa-cp.

Unfortunately it seems that the bad profile also leads to slightly better
performance by disabling some of the loop stuff, but that really ought to be
done in some meaningful way, not by accident.

Bootstrapped/regtested x86_64-linux, committed.

Honza

gcc/ChangeLog:

2020-08-11  Jan Hubicka  

* predict.c (not_loop_guard_equal_edge_p): New function.
(maybe_predict_edge): New function.
(predict_paths_for_bb): Use it.
(predict_paths_leading_to_edge): Use it.

gcc/testsuite/ChangeLog:

2020-08-11  Jan Hubicka  

* gcc.dg/ipa/ipa-clone-2.c: Lower threshold from 500 to 400.

diff --git a/gcc/predict.c b/gcc/predict.c
index 2164a06e083..4c4bba54939 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -3122,6 +3122,35 @@ tree_guess_outgoing_edge_probabilities (basic_block bb)
   bb_predictions = NULL;
 }
 
+/* Filter function predicate that returns true for a edge predicate P
+   if its edge is equal to DATA.  */
+
+static bool
+not_loop_guard_equal_edge_p (edge_prediction *p, void *data)
+{
+  return p->ep_edge != (edge)data || p->ep_predictor != PRED_LOOP_GUARD;
+}
+
+/* Predict edge E with PRED unless it is already predicted by some predictor
+   considered equivalent.  */
+
+static void
+maybe_predict_edge (edge e, enum br_predictor pred, enum prediction taken)
+{
+  if (edge_predicted_by_p (e, pred, taken))
+return;
+  if (pred == PRED_LOOP_GUARD
+  && edge_predicted_by_p (e, PRED_LOOP_GUARD_WITH_RECURSION, taken))
+return;
+  /* Consider PRED_LOOP_GUARD_WITH_RECURSION superior to LOOP_GUARD.  */
+  if (pred == PRED_LOOP_GUARD_WITH_RECURSION)
+{
+  edge_prediction **preds = bb_predictions->get (e->src);
+  if (preds)
+   filter_predictions (preds, not_loop_guard_equal_edge_p, e);
+}
+  predict_edge_def (e, pred, taken);
+}
 /* Predict edges to successors of CUR whose sources are not postdominated by
BB by PRED and recurse to all postdominators.  */
 
@@ -3177,10 +3206,7 @@ predict_paths_for_bb (basic_block cur, basic_block bb,
 regions that are only reachable by abnormal edges.  We simply
 prevent visiting given BB twice.  */
   if (found)
-   {
- if (!edge_predicted_by_p (e, pred, taken))
-predict_edge_def (e, pred, taken);
-   }
+   maybe_predict_edge (e, pred, taken);
   else if (bitmap_set_bit (visited, e->src->index))
predict_paths_for_bb (e->src, e->src, pred, taken, visited, in_loop);
 }
@@ -3223,7 +3249,7 @@ predict_paths_leading_to_edge (edge e, enum br_predictor pred,
   if (!has_nonloop_edge)
 predict_paths_for_bb (bb, bb, pred, taken, auto_bitmap (), in_loop);
   else
-predict_edge_def (e, pred, taken);
+maybe_predict_edge (e, pred, taken);
 }
 
 /* This is used to carry information about basic blocks.  It is
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c b/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c
index d513020ee8b..53ae25a1e24 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fdump-ipa-cp-details -fno-early-inlining --param ipa-cp-max-recursive-depth=8" } */
+/* { dg-options "-O3 -fdump-ipa-cp-details -fno-early-inlining --param ipa-cp-max-recursive-depth=8 --param=ipa-cp-eval-threshold=400" } */
 
 int fn();
 


Re: [PATCH] [PR target/96350]Force ENDBR immediate into memory to avoid fake ENDBR opcode.

2020-08-11 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 11, 2020 at 11:36 AM Hongtao Liu  wrote:
>
> On Tue, Aug 11, 2020 at 4:38 PM Uros Bizjak  wrote:
> >
> > On Tue, Aug 11, 2020 at 5:30 AM Hongtao Liu  wrote:
> > >
> > > Hi:
> > >   The issue is described in the bugzilla.
> > >   Bootstrap is ok, regression test for i386/x86-64 backend is ok.
> > >   Ok for trunk?
> > >
> > > ChangeLog
> > > gcc/
> > > PR target/96350
> > > * config/i386/i386.c (ix86_legitimate_constant_p): Return
> > > false for ENDBR immediate.
> > > (ix86_legitimate_address_p): Ditto.
> > > * config/i386/predicates.md
> > > (x86_64_immediate_operand): Exclude ENDBR immediate.
> > > (x86_64_zext_immediate_operand): Ditto.
> > > (x86_64_dwzext_immediate_operand): Ditto.
> > > (ix86_not_endbr_immediate_operand): New predicate.
> > >
> > > gcc/testsuite
> > > * gcc.target/i386/endbr_immediate.c: New test.
> >
> > +;; Return true if VALUE isn't an ENDBR opcode in immediate field.
> > +(define_predicate "ix86_not_endbr_immediate_operand"
> > +  (match_test "1")
> >
> > Please reverse the above logic to introduce
> > ix86_endbr_immediate_operand, that returns true for unwanted
> > immediate. Something like:
> >
> > (define_predicate "ix86_endbr_immediate_operand"
> >   (match_code "const_int")
> > ...
> >
> > And you will be able to use it like:
> >
> > if (ix86_endbr_immediate_operand (x, VOIDmode))
> >   return false;
> >
>
> Changed.

No, it is not.

+  if ((flag_cf_protection & CF_BRANCH)
+  && CONST_INT_P (op))

You don't need to check for const ints here.

And please rewrite the body of the function to something like (untested):

{
  unsigned HOST_WIDE_INT val = TARGET_64BIT ? 0xfa1e0ff3 : 0xfb1e0ff3;

  if (x == val)
    return 1;

  if (TARGET_64BIT)
    for (; x >= val; x >>= 8)
      if (x == val)
	return 1;

  return 0;
}

so it will at least *look* like some thoughts have been spent on this.
I don't plan to review the code where it is obvious from the first
look that it was thrown together in a hurry. Please get some internal
company signoff first. Ping me in a week for a review.
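
For reference, the idea of scanning an immediate for an embedded ENDBR
encoding can be sketched as a standalone function.  This is not the predicate
from the patch and it deliberately differs from the loop suggested above: it
masks each 4-byte window instead of comparing the whole shifted value.  The
name contains_endbr and its signature are assumptions made for illustration.
ENDBR64 encodes as the bytes F3 0F 1E FA and ENDBR32 as F3 0F 1E FB, which
read as little-endian 32-bit values give the two constants used here.

```cpp
#include <cstdint>

// Hypothetical helper: return true if any byte-aligned 4-byte window of
// the 64-bit immediate IMM equals the ENDBR encoding for the target.
bool
contains_endbr (std::uint64_t imm, bool target_64bit)
{
  // F3 0F 1E FA (ENDBR64) or F3 0F 1E FB (ENDBR32), little-endian.
  const std::uint64_t pattern = target_64bit ? 0xfa1e0ff3u : 0xfb1e0ff3u;

  // Slide a 32-bit window over the immediate one byte at a time.
  for (int shift = 0; shift <= 32; shift += 8)
    if (((imm >> shift) & 0xffffffffu) == pattern)
      return true;

  return false;
}
```

Masking each window means the pattern is found even when unrelated non-zero
bytes sit above it in the immediate.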

Uros.
>
> >/* Otherwise we handle everything else in the move patterns.  */
> > -  return true;
> > +  return ix86_not_endbr_immediate_operand (x, VOIDmode);
> >  }
> >
> > Please handle this in CASE_CONST_SCALAR_INT: part.
> >
> > +  if (disp && !ix86_not_endbr_immediate_operand (disp, VOIDmode))
> > +return false;
> >
> > And this in:
> >
> >   /* Validate displacement.  */
> >   if (disp)
> > {
> >
>
> Changed.

A better place for these new special cases is at the beginning of the
part I referred, not at the end.

Uros.


Re: [PATCH] Add debug counter for IPA bits CP.

2020-08-11 Thread Martin Liška

On 8/11/20 11:04 AM, Jan Hubicka wrote:

> Looks good to me.  Perhaps it would be more systematic to add them to
> the remaining propagators as well - bugs tends to pop up from time to
> time related to those.


Works for me.
@Martin: Can you please add them?

Thanks,
Martin


Re: [PATCH] [AVX512]For vector compare to mask register, UNSPEC is needed instead of comparison operator [PR96243]

2020-08-11 Thread Hongtao Liu via Gcc-patches
Hi:
  The issue is described in the bugzilla.
  Bootstrap is ok, regression test for i386/x86-64 backend is ok.
 Ok for trunk?

ChangeLog
gcc/
PR target/96551
* config/i386/sse.md (vec_unpacku_float_hi_v16si): For vector
compare to integer mask, don't use gen_rtx_LT, use
ix86_expand_mask_vec_cmp instead.
(vec_unpacku_float_lo_v16si): Ditto.

gcc/testsuite
* gcc.target/i386/pr96551-1.c: New test.
* gcc.target/i386/pr96551-2.c: New test.

-- 
BR,
Hongtao
From 6e8e1502591d78e14fc9e3c25e7d47c0f2c4559a Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Tue, 11 Aug 2020 11:05:40 +0800
Subject: [PATCH] Refine expander
 vec_unpacku_float_hi_v16si/vec_unpacku_float_lo_v16si

gcc/
	PR target/96551
	* config/i386/sse.md (vec_unpacku_float_hi_v16si): For vector
	compare to integer mask, don't use gen_rtx_LT, use
	ix86_expand_mask_vec_cmp instead.
	(vec_unpacku_float_lo_v16si): Ditto.

gcc/testsuite
	* gcc.target/i386/pr96551-1.c: New test.
	* gcc.target/i386/pr96551-2.c: New test.
---
 gcc/config/i386/sse.md|  4 +--
 gcc/testsuite/gcc.target/i386/pr96551-1.c | 18 +
 gcc/testsuite/gcc.target/i386/pr96551-2.c | 33 +++
 3 files changed, 53 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96551-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr96551-2.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index ad8169f6f08..a890f994ab0 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -6971,7 +6971,7 @@
 
   emit_insn (gen_vec_extract_hi_v16si (tmp[3], operands[1]));
   emit_insn (gen_floatv8siv8df2 (tmp[2], tmp[3]));
-  emit_insn (gen_rtx_SET (k, gen_rtx_LT (QImode, tmp[2], tmp[0])));
+  ix86_expand_mask_vec_cmp (k, LT, tmp[2], tmp[0]);
   emit_insn (gen_addv8df3_mask (tmp[2], tmp[2], tmp[1], tmp[2], k));
   emit_move_insn (operands[0], tmp[2]);
   DONE;
@@ -7018,7 +7018,7 @@
   k = gen_reg_rtx (QImode);
 
   emit_insn (gen_avx512f_cvtdq2pd512_2 (tmp[2], operands[1]));
-  emit_insn (gen_rtx_SET (k, gen_rtx_LT (QImode, tmp[2], tmp[0])));
+  ix86_expand_mask_vec_cmp (k, LT, tmp[2], tmp[0]);
   emit_insn (gen_addv8df3_mask (tmp[2], tmp[2], tmp[1], tmp[2], k));
   emit_move_insn (operands[0], tmp[2]);
   DONE;
diff --git a/gcc/testsuite/gcc.target/i386/pr96551-1.c b/gcc/testsuite/gcc.target/i386/pr96551-1.c
new file mode 100644
index 00000000000..598bb6e85f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr96551-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -mprefer-vector-width=512" } */
+
+unsigned int a[256];
+double b[256];
+
+void
+__attribute__ ((noipa, optimize ("tree-vectorize")))
+foo(void)
+{
+  int i;
+
+  for (i=0; i<256; ++i)
+b[i] = a[i];
+}
+
+/* { dg-final { scan-assembler "vcvtdq2pd\[^\n\]*zmm" } } */
+
diff --git a/gcc/testsuite/gcc.target/i386/pr96551-2.c b/gcc/testsuite/gcc.target/i386/pr96551-2.c
new file mode 100644
index 00000000000..722767aaf2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr96551-2.c
@@ -0,0 +1,33 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512f -mprefer-vector-width=512" } */
+/* { dg-require-effective-target avx512f } */
+
+#ifndef CHECK
+#define CHECK "avx512f-helper.h"
+#endif
+
+#include CHECK
+
+#ifndef TEST
+#define TEST test_512
+#endif
+
+#include "pr96551-1.c"
+
+static void
+TEST (void)
+{
+  double exp[256];
+  for (int i = 0; i != 256; i++)
+{
+  a[i] = i * i + 3 * i + 13;
+  exp[i] = a[i];
+  b[i] = 0;
+}
+
+  foo ();
+
+  for (int i = 0; i != 256; i++)
+if (exp[i] != b[i])
+  __builtin_abort ();
+}
-- 
2.18.1



Re: [PATCH] [PR target/96350]Force ENDBR immediate into memory to avoid fake ENDBR opcode.

2020-08-11 Thread Hongtao Liu via Gcc-patches
On Tue, Aug 11, 2020 at 4:38 PM Uros Bizjak  wrote:
>
> On Tue, Aug 11, 2020 at 5:30 AM Hongtao Liu  wrote:
> >
> > Hi:
> >   The issue is described in the bugzilla.
> >   Bootstrap is ok, regression test for i386/x86-64 backend is ok.
> >   Ok for trunk?
> >
> > ChangeLog
> > gcc/
> > PR target/96350
> > * config/i386/i386.c (ix86_legitimate_constant_p): Return
> > false for ENDBR immediate.
> > (ix86_legitimate_address_p): Ditto.
> > * config/i386/predicates.md
> > (x86_64_immediate_operand): Exclude ENDBR immediate.
> > (x86_64_zext_immediate_operand): Ditto.
> > (x86_64_dwzext_immediate_operand): Ditto.
> > (ix86_not_endbr_immediate_operand): New predicate.
> >
> > gcc/testsuite
> > * gcc.target/i386/endbr_immediate.c: New test.
>
> +;; Return true if VALUE isn't an ENDBR opcode in immediate field.
> +(define_predicate "ix86_not_endbr_immediate_operand"
> +  (match_test "1")
>
> Please reverse the above logic to introduce
> ix86_endbr_immediate_operand, that returns true for unwanted
> immediate. Something like:
>
> (define_predicate "ix86_endbr_immediate_operand"
>   (match_code "const_int")
> ...
>
> And you will be able to use it like:
>
> if (ix86_endbr_immediate_operand (x, VOIDmode))
>   return false;
>

Changed.

>/* Otherwise we handle everything else in the move patterns.  */
> -  return true;
> +  return ix86_not_endbr_immediate_operand (x, VOIDmode);
>  }
>
> Please handle this in CASE_CONST_SCALAR_INT: part.
>
> +  if (disp && !ix86_not_endbr_immediate_operand (disp, VOIDmode))
> +return false;
>
> And this in:
>
>   /* Validate displacement.  */
>   if (disp)
> {
>

Changed.

> Uros.
>
> > --
> > BR,
> > Hongtao

Update patch.

-- 
BR,
Hongtao
From eb943a5bf060f0d912979bce76b4f0c0cbaed858 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Tue, 4 Aug 2020 10:00:13 +0800
Subject: [PATCH] Force ENDBR immediate into memory.

gcc/
	PR target/96350
	* config/i386/i386.c (ix86_legitimate_constant_p): Return
	false for ENDBR immediate.
	(ix86_legitimate_address_p): Ditto.
	* config/i386/predicates.md
	(x86_64_immediate_operand): Exclude ENDBR immediate.
	(x86_64_zext_immediate_operand): Ditto.
	(x86_64_dwzext_immediate_operand): Ditto.
	(ix86_endbr_immediate_operand): New predicate.

gcc/testsuite
	* gcc.target/i386/endbr_immediate.c: New test.
---
 gcc/config/i386/i386.c|   4 +
 gcc/config/i386/predicates.md |  32 +++
 .../gcc.target/i386/endbr_immediate.c | 198 ++
 3 files changed, 234 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/endbr_immediate.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8ea6a4d7ea7..388291f1dba 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10069,6 +10069,8 @@ ix86_legitimate_constant_p (machine_mode mode, rtx x)
 	default:
 	  break;
 	}
+  if (ix86_endbr_immediate_operand (x, VOIDmode))
+	return false;
   break;
 
 case CONST_VECTOR:
@@ -10566,6 +10568,8 @@ ix86_legitimate_address_p (machine_mode, rtx addr, bool strict)
 	   && CONST_INT_P (disp)
 	   && val_signbit_known_set_p (SImode, INTVAL (disp)))
 	return false;
+  if (ix86_endbr_immediate_operand (disp, VOIDmode))
+	return false;
 }
 
   /* Everything looks valid.  */
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 07e69d555c0..47ab053dc99 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -130,10 +130,37 @@
 (define_predicate "symbol_operand"
   (match_code "symbol_ref"))
 
+;; Return true if VALUE is an ENDBR opcode in immediate field.
+(define_predicate "ix86_endbr_immediate_operand"
+  (match_code "const_int")
+{
+  if ((flag_cf_protection & CF_BRANCH)
+  && CONST_INT_P (op))
+ {
+   unsigned HOST_WIDE_INT imm = INTVAL (op);
+   if (!TARGET_64BIT || imm <= 0xffffffff)
+	 return imm == (TARGET_64BIT ? 0xfa1e0ff3 : 0xfb1e0ff3);
+
+   /* NB: Encoding is byte based.  */
+   do
+	 {
+	  if ((0xffffffff & imm) == 0xfa1e0ff3)
+	return true;
+	  imm >>= 8;
+	 }
+   while (imm > 0xffffffff);
+ }
+
+  return false;
+})
+
 ;; Return true if VALUE can be stored in a sign extended immediate field.
 (define_predicate "x86_64_immediate_operand"
   (match_code "const_int,symbol_ref,label_ref,const")
 {
+  if (ix86_endbr_immediate_operand (op, VOIDmode))
+return false;
+
   if (!TARGET_64BIT)
 return immediate_operand (op, mode);
 
@@ -260,6 +287,9 @@
 (define_predicate "x86_64_zext_immediate_operand"
   (match_code "const_int,symbol_ref,label_ref,const")
 {
+  if (ix86_endbr_immediate_operand (op, VOIDmode))
+return false;
+
   switch (GET_CODE (op))
 {
 case CONST_INT:
@@ -374,6 +404,8 @@
 (define_predicate "x86_64_dwzext_immediate_operand"
   (match_code "const_int,const_wide_int")
 {
+  if (ix86_endbr_immediate_operand (op, VOIDmode))
+return false;
   switch 
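
For readers who want to try the byte-window idea outside the compiler, here is a
minimal standalone C sketch.  It is not the predicate from the patch: the names
imm_contains_endbr, ENDBR64_IMM and ENDBR32_IMM are hypothetical, only the two
opcode constants come from the patch, and the sketch tests every 4-byte window
of a 64-bit immediate explicitly instead of mirroring the predicate's loop.

```c
#include <stdbool.h>
#include <stdint.h>

/* ENDBR64 (f3 0f 1e fa) and ENDBR32 (f3 0f 1e fb), read as little-endian
   32-bit values.  The macro names are made up for this sketch.  */
#define ENDBR64_IMM UINT64_C (0xfa1e0ff3)
#define ENDBR32_IMM UINT64_C (0xfb1e0ff3)

/* Hypothetical helper: return true if IMM, emitted as an immediate,
   would contain the byte sequence of an ENDBR instruction.  */
static bool
imm_contains_endbr (uint64_t imm, bool target_64bit)
{
  /* Assumption: a 32-bit target only ever emits 32-bit immediates.  */
  if (!target_64bit)
    return (imm & UINT64_C (0xffffffff)) == ENDBR32_IMM;

  /* Encoding is byte based, so slide a 32-bit window over the 64-bit
     value one byte at a time: shifts of 0, 8, 16, 24 and 32 bits.  */
  for (int shift = 0; shift <= 32; shift += 8)
    if (((imm >> shift) & UINT64_C (0xffffffff)) == ENDBR64_IMM)
      return true;

  return false;
}
```

A constant for which this returns true is the kind the patch wants forced into
memory rather than emitted as an immediate.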

Re: [PATCH] PR libstdc++/91620 Implement DR 526 for std::[forward_]list::remove_if/unique

2020-08-11 Thread Jonathan Wakely via Gcc-patches

On 11/08/20 10:11 +0100, Jonathan Wakely wrote:

On 27/12/19 11:57 +0100, François Dumont wrote:
Here is the patch to extend DR 526 to forward_list and list 
remove_if and unique.


As the adopted pattern is simpler I also applied it to the remove methods.

    PR libstdc++/91620
    * include/bits/forward_list.tcc (forward_list<>::remove): Collect nodes
    to destroy in an intermediate forward_list.
    (forward_list<>::remove_if, forward_list<>::unique): Likewise.
    * include/bits/list.tcc (list<>::remove, list<>::unique): Likewise.
    (list<>::remove_if): Likewise.
    * include/debug/forward_list (forward_list<>::_M_erase_after): Remove.
    (forward_list<>::erase_after): Adapt.
    (forward_list<>::remove, forward_list<>::remove_if): Collect nodes to
    destroy in an intermediate forward_list.
    (forward_list<>::unique): Likewise.
    * include/debug/list (list<>::remove, list<>::unique): Likewise.
    (list<>::remove_if): Likewise.

Tested under Linux x86_64 normal and debug modes.

Ok to commit ?

François





diff --git a/libstdc++-v3/include/bits/forward_list.tcc b/libstdc++-v3/include/bits/forward_list.tcc
index 088111e3330..70de7e75a43 100644
--- a/libstdc++-v3/include/bits/forward_list.tcc
+++ b/libstdc++-v3/include/bits/forward_list.tcc
@@ -290,30 +290,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   remove(const _Tp& __val) -> __remove_return_type
   {
 size_type __removed __attribute__((__unused__)) = 0;
-  _Node_base* __curr = &this->_M_impl._M_head;
-  _Node_base* __extra = nullptr;
+  forward_list __to_destroy(get_allocator());

-  while (_Node* __tmp = static_cast<_Node*>(__curr->_M_next))
-   {
+  auto __prev_it = cbefore_begin();
+  while (_Node* __tmp = static_cast<_Node*>(__prev_it._M_node->_M_next))
if (*__tmp->_M_valptr() == __val)
  {
- if (__tmp->_M_valptr() != std::__addressof(__val))
-   {
- this->_M_erase_after(__curr);
+   __to_destroy.splice_after(__to_destroy.cbefore_begin(),
+ *this, __prev_it);
_GLIBCXX20_ONLY( __removed++ );
- continue;
  }
else
-   __extra = __curr;
-   }
- __curr = __curr->_M_next;
-   }
+ ++__prev_it;

-  if (__extra)
-   {
- this->_M_erase_after(__extra);
- _GLIBCXX20_ONLY( __removed++ );
-   }
 return _GLIBCXX20_ONLY( __removed );
   }

@@ -324,17 +313,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 remove_if(_Pred __pred) -> __remove_return_type
 {
size_type __removed __attribute__((__unused__)) = 0;
-   _Node_base* __curr = &this->_M_impl._M_head;
-   while (_Node* __tmp = static_cast<_Node*>(__curr->_M_next))
- {
+   forward_list __to_destroy(get_allocator());
+
+   auto __prev_it = cbefore_begin();
+   while (_Node* __tmp = static_cast<_Node*>(__prev_it._M_node->_M_next))
  if (__pred(*__tmp->_M_valptr()))
{
-   this->_M_erase_after(__curr);
+ __to_destroy.splice_after(__to_destroy.cbefore_begin(),
+   *this, __prev_it);
  _GLIBCXX20_ONLY( __removed++ );
}
  else
- __curr = __curr->_M_next;
- }
+   ++__prev_it;
+
return _GLIBCXX20_ONLY( __removed );
 }

@@ -348,19 +339,23 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
iterator __last = end();
if (__first == __last)
  return _GLIBCXX20_ONLY(0);
+
+   forward_list __to_destroy(get_allocator());
size_type __removed __attribute__((__unused__)) = 0;
iterator __next = __first;
while (++__next != __last)
{
  if (__binary_pred(*__first, *__next))
{
- erase_after(__first);
+ __to_destroy.splice_after(__to_destroy.cbefore_begin(),
+   *this, __first);
  _GLIBCXX20_ONLY( __removed++ );
}
  else
__first = __next;
  __next = __first;
}
+
return _GLIBCXX20_ONLY( __removed );
 }

diff --git a/libstdc++-v3/include/bits/list.tcc b/libstdc++-v3/include/bits/list.tcc
index 0ac6654b079..5a6fd5b0824 100644
--- a/libstdc++-v3/include/bits/list.tcc
+++ b/libstdc++-v3/include/bits/list.tcc
@@ -331,10 +331,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   list<_Tp, _Alloc>::
   remove(const value_type& __value)
   {
+#if !_GLIBCXX_USE_CXX11_ABI
 size_type __removed __attribute__((__unused__)) = 0;
+#endif
+  list __to_destroy(get_allocator());
 iterator __first = begin();
 iterator __last = end();
-  iterator __extra = __last;
 while (__first != __last)
{
  iterator __next = __first;
@@ -344,22 +346,20 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  // _GLIBCXX_RESOLVE_LIB_DEFECTS
  // 526. Is it undefined 

Re: [PATCH] PR libstdc++/91620 Implement DR 526 for std::[forward_]list::remove_if/unique

2020-08-11 Thread Jonathan Wakely via Gcc-patches

On 27/12/19 11:57 +0100, François Dumont wrote:
Here is the patch to extend DR 526 to forward_list and list remove_if 
and unique.


As the adopted pattern is simpler I also applied it to the remove methods.

    PR libstdc++/91620
    * include/bits/forward_list.tcc (forward_list<>::remove): Collect nodes
    to destroy in an intermediate forward_list.
    (forward_list<>::remove_if, forward_list<>::unique): Likewise.
    * include/bits/list.tcc (list<>::remove, list<>::unique): Likewise.
    (list<>::remove_if): Likewise.
    * include/debug/forward_list (forward_list<>::_M_erase_after): Remove.
    (forward_list<>::erase_after): Adapt.
    (forward_list<>::remove, forward_list<>::remove_if): Collect nodes to
    destroy in an intermediate forward_list.
    (forward_list<>::unique): Likewise.
    * include/debug/list (list<>::remove, list<>::unique): Likewise.
    (list<>::remove_if): Likewise.

Tested under Linux x86_64 normal and debug modes.

Ok to commit ?

François
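
The "collect nodes to destroy in an intermediate list" idea in the ChangeLog
above can be illustrated with a small C sketch.  This is not the libstdc++
code: struct toy_node, toy_push_front and toy_list_remove are hypothetical
names, and a plain singly linked list stands in for forward_list.  The point
it tries to show is that the value being compared may live inside a node of
the same list, so nodes are only unlinked during the scan and freed afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical singly linked list node.  */
struct toy_node
{
  int value;
  struct toy_node *next;
};

/* Allocate a node holding VALUE and put it in front of HEAD.  */
static struct toy_node *
toy_push_front (struct toy_node *head, int value)
{
  struct toy_node *n = malloc (sizeof *n);
  assert (n != NULL);
  n->value = value;
  n->next = head;
  return n;
}

/* Remove every node equal to *VAL and return how many were removed.
   VAL may point into a node of the list itself, so matching nodes are
   moved to a private list first and only freed once the scan is done.  */
static size_t
toy_list_remove (struct toy_node **head, const int *val)
{
  struct toy_node *to_destroy = NULL;
  struct toy_node **link = head;
  size_t removed = 0;

  while (*link != NULL)
    {
      struct toy_node *cur = *link;
      if (cur->value == *val)
        {
          *link = cur->next;        /* unlink, but keep the node alive */
          cur->next = to_destroy;
          to_destroy = cur;
          removed++;
        }
      else
        link = &cur->next;
    }

  /* Nothing reads *VAL any more, so the nodes can go now.  */
  while (to_destroy != NULL)
    {
      struct toy_node *next = to_destroy->next;
      free (to_destroy);
      to_destroy = next;
    }
  return removed;
}
```

Calling toy_list_remove (&head, &head->value) is the problematic case: freeing
the first match immediately would leave VAL dangling for the rest of the scan.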





diff --git a/libstdc++-v3/include/bits/forward_list.tcc b/libstdc++-v3/include/bits/forward_list.tcc
index 088111e3330..70de7e75a43 100644
--- a/libstdc++-v3/include/bits/forward_list.tcc
+++ b/libstdc++-v3/include/bits/forward_list.tcc
@@ -290,30 +290,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
remove(const _Tp& __val) -> __remove_return_type
{
  size_type __removed __attribute__((__unused__)) = 0;
-  _Node_base* __curr = &this->_M_impl._M_head;
-  _Node_base* __extra = nullptr;
+  forward_list __to_destroy(get_allocator());

-  while (_Node* __tmp = static_cast<_Node*>(__curr->_M_next))
-   {
+  auto __prev_it = cbefore_begin();
+  while (_Node* __tmp = static_cast<_Node*>(__prev_it._M_node->_M_next))
if (*__tmp->_M_valptr() == __val)
  {
- if (__tmp->_M_valptr() != std::__addressof(__val))
-   {
- this->_M_erase_after(__curr);
+   __to_destroy.splice_after(__to_destroy.cbefore_begin(),
+ *this, __prev_it);
_GLIBCXX20_ONLY( __removed++ );
- continue;
  }
else
-   __extra = __curr;
-   }
- __curr = __curr->_M_next;
-   }
+ ++__prev_it;

-  if (__extra)
-   {
- this->_M_erase_after(__extra);
- _GLIBCXX20_ONLY( __removed++ );
-   }
  return _GLIBCXX20_ONLY( __removed );
}

@@ -324,17 +313,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  remove_if(_Pred __pred) -> __remove_return_type
  {
size_type __removed __attribute__((__unused__)) = 0;
-   _Node_base* __curr = &this->_M_impl._M_head;
-   while (_Node* __tmp = static_cast<_Node*>(__curr->_M_next))
- {
+   forward_list __to_destroy(get_allocator());
+
+   auto __prev_it = cbefore_begin();
+   while (_Node* __tmp = static_cast<_Node*>(__prev_it._M_node->_M_next))
  if (__pred(*__tmp->_M_valptr()))
{
-   this->_M_erase_after(__curr);
+ __to_destroy.splice_after(__to_destroy.cbefore_begin(),
+   *this, __prev_it);
  _GLIBCXX20_ONLY( __removed++ );
}
  else
- __curr = __curr->_M_next;
- }
+   ++__prev_it;
+
return _GLIBCXX20_ONLY( __removed );
  }

@@ -348,19 +339,23 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
iterator __last = end();
if (__first == __last)
  return _GLIBCXX20_ONLY(0);
+
+   forward_list __to_destroy(get_allocator());
size_type __removed __attribute__((__unused__)) = 0;
iterator __next = __first;
while (++__next != __last)
{
  if (__binary_pred(*__first, *__next))
{
- erase_after(__first);
+ __to_destroy.splice_after(__to_destroy.cbefore_begin(),
+   *this, __first);
  _GLIBCXX20_ONLY( __removed++ );
}
  else
__first = __next;
  __next = __first;
}
+
return _GLIBCXX20_ONLY( __removed );
  }

diff --git a/libstdc++-v3/include/bits/list.tcc b/libstdc++-v3/include/bits/list.tcc
index 0ac6654b079..5a6fd5b0824 100644
--- a/libstdc++-v3/include/bits/list.tcc
+++ b/libstdc++-v3/include/bits/list.tcc
@@ -331,10 +331,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
list<_Tp, _Alloc>::
remove(const value_type& __value)
{
+#if !_GLIBCXX_USE_CXX11_ABI
  size_type __removed __attribute__((__unused__)) = 0;
+#endif
+  list __to_destroy(get_allocator());
  iterator __first = begin();
  iterator __last = end();
-  iterator __extra = __last;
  while (__first != __last)
{
  iterator __next = __first;
@@ -344,22 +346,20 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  // _GLIBCXX_RESOLVE_LIB_DEFECTS
  // 526. Is it undefined if a function in the standard 

Re: [RS6000] PR96493, powerpc local call linkage failure

2020-08-11 Thread Alan Modra via Gcc-patches
This fixes a fail when power10 isn't supported by binutils, and
ensures the test isn't run without power10 hardware or simulation on
the off chance that power10 insns are emitted in the future for this
testcase.  Bootstrapped etc.  OK?

PR target/96525
* testsuite/gcc.target/powerpc/pr96493.c: Make it a link test
when no power10_hw.  Require power10_ok.

diff --git a/gcc/testsuite/gcc.target/powerpc/pr96493.c b/gcc/testsuite/gcc.target/powerpc/pr96493.c
index f0de0818813..1e5d43f199d 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr96493.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr96493.c
@@ -1,6 +1,8 @@
-/* { dg-do run } */
+/* { dg-do run { target { power10_hw } } } */
+/* { dg-do link { target { ! power10_hw } } } */
 /* { dg-options "-mdejagnu-cpu=power8 -O2" } */
 /* { dg-require-effective-target powerpc_elfv2 } */
+/* { dg-require-effective-target power10_ok } */
 
 /* Test local calls between pcrel and non-pcrel code.
 

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Add debug counter for IPA bits CP.

2020-08-11 Thread Jan Hubicka
> Hey.
> 
> I'm debugging PR96482 and it would be handy for me to have a debug counter
> for the problematic transformation.
> 
> Ready for master?

Looks good to me.  Perhaps it would be more systematic to add them to
the remaining propagators as well - bugs tend to pop up from time to
time related to those.

Honza
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
>   * dbgcnt.def (DEBUG_COUNTER): Add ipa_cp_bits.
>   * ipa-cp.c (ipcp_store_bits_results): Use it when we store known
>   bits for parameters.
> ---
>  gcc/dbgcnt.def |  1 +
>  gcc/ipa-cp.c   | 11 ---
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
> index 3998c9636aa..cf8775b2b66 100644
> --- a/gcc/dbgcnt.def
> +++ b/gcc/dbgcnt.def
> @@ -170,6 +170,7 @@ DEBUG_COUNTER (if_after_combine)
>  DEBUG_COUNTER (if_after_reload)
>  DEBUG_COUNTER (if_conversion)
>  DEBUG_COUNTER (if_conversion_tree)
> +DEBUG_COUNTER (ipa_cp_bits)
>  DEBUG_COUNTER (ipa_sra_params)
>  DEBUG_COUNTER (ipa_sra_retvalues)
>  DEBUG_COUNTER (ira_move)
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 10cc59509d5..945a69977f3 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -123,6 +123,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssa-ccp.h"
>  #include "stringpool.h"
>  #include "attribs.h"
> +#include "dbgcnt.h"
>  template <typename valtype> class ipcp_value;
> @@ -5788,9 +5789,13 @@ ipcp_store_bits_results (void)
> ipa_bits *jfbits;
> if (plats->bits_lattice.constant_p ())
> - jfbits
> -   = ipa_get_ipa_bits_for_value (plats->bits_lattice.get_value (),
> - plats->bits_lattice.get_mask ());
> + {
> +   jfbits
> + = ipa_get_ipa_bits_for_value (plats->bits_lattice.get_value (),
> +   plats->bits_lattice.get_mask ());
> +   if (!dbg_cnt (ipa_cp_bits))
> + jfbits = NULL;
> + }
> else
>   jfbits = NULL;
> -- 
> 2.28.0
> 


[PATCH] expr: Optimize noop copies [PR96539]

2020-08-11 Thread Jakub Jelinek via Gcc-patches
Hi!

At the GIMPLE level, e.g. for __builtin_memmove, we optimize away (to just the
return value) noop copies where src == dest, but at the RTL level we don't.
As the testcase shows, in some cases such copies can appear only at the RTL
level, e.g. from trying to copy an aggregate by-value argument to the same
location it already has.  If the block move is expanded e.g. piecewise, we
actually manage to optimize it away, as the individual memory copies are
seen as noop moves, but if the target optabs are used, the sequences often
stay until final.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-08-10  Jakub Jelinek  

PR rtl-optimization/96539
* expr.c (emit_block_move_hints): Don't copy anything if x and y
are the same and neither is MEM_VOLATILE_P.

* gcc.target/i386/pr96539.c: New test.

--- gcc/expr.c.jj   2020-07-28 15:39:09.886757905 +0200
+++ gcc/expr.c  2020-08-10 13:14:47.190328119 +0200
@@ -1637,6 +1637,12 @@ emit_block_move_hints (rtx x, rtx y, rtx
   x = adjust_address (x, BLKmode, 0);
   y = adjust_address (y, BLKmode, 0);
 
+  /* If source and destination are the same, no need to copy anything.  */
+  if (rtx_equal_p (x, y)
+  && !MEM_VOLATILE_P (x)
+  && !MEM_VOLATILE_P (y))
+return 0;
+
   /* Set MEM_SIZE as appropriate for this block copy.  The main place this
  can be incorrect is coming from __builtin_memcpy.  */
   poly_int64 const_size;
--- gcc/testsuite/gcc.target/i386/pr96539.c.jj  2020-08-10 13:37:14.492946062 +0200
+++ gcc/testsuite/gcc.target/i386/pr96539.c 2020-08-10 13:36:57.158183171 +0200
@@ -0,0 +1,16 @@
+/* PR rtl-optimization/96539 */
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+/* { dg-final { scan-assembler-not "rep\[^\n\r]\*movs" } } */
+
+struct A { int a, b, c, d, e, f; void *g, *h, *i, *j, *k, *l, *m; };
+
+int bar (int a);
+int baz (int a, int b, int c, void *p, struct A s);
+
+int
+foo (int a, int b, int c, void *p, struct A s)
+{
+  bar (a);
+  return baz (a, b, c, p, s);
+}

Jakub
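
A source-level analogue of the check added to emit_block_move_hints, as a
minimal sketch: copy_block is a hypothetical wrapper, pointer equality stands
in for rtx_equal_p, and the volatile case that the patch excludes is not
modelled at all.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: copy N bytes from SRC to DST and return the
   number of bytes actually copied.  A copy onto itself is a no-op, so
   it is skipped instead of being handed to memmove.  */
static size_t
copy_block (void *dst, const void *src, size_t n)
{
  if (dst == src)
    return 0;          /* same location: nothing to do */

  memmove (dst, src, n);
  return n;
}
```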



[PATCH] c-family: Fix ICE in get_atomic_generic_size [PR96545]

2020-08-11 Thread Jakub Jelinek via Gcc-patches
Hi!

As the testcase shows, we would ICE if the type of the first argument of
various atomic builtins was a pointer to a (non-void) incomplete type, because
we assumed that TYPE_SIZE_UNIT must be non-NULL.  This patch diagnoses it
instead.  It also changes the TREE_CODE != INTEGER_CST check to
!tree_fits_uhwi_p, as we use tree_to_uhwi after this and, at least in theory,
the INTEGER_CST could be too large to fit.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-08-10  Jakub Jelinek  

PR c/96545
* c-common.c (get_atomic_generic_size): Require that first argument's
type points to a complete type and use tree_fits_uhwi_p instead of
just INTEGER_CST TREE_CODE check for the TYPE_SIZE_UNIT.

* c-c++-common/pr96545.c: New test.

--- gcc/c-family/c-common.c.jj  2020-07-31 23:07:00.566153515 +0200
+++ gcc/c-family/c-common.c 2020-08-10 12:03:35.236841534 +0200
@@ -7017,8 +7017,15 @@ get_atomic_generic_size (location_t loc,
   return 0;
 }
 
+  if (!COMPLETE_TYPE_P (TREE_TYPE (type_0)))
+{
+  error_at (loc, "argument 1 of %qE must be a pointer to a complete type",
+   function);
+  return 0;
+}
+
   /* Types must be compile time constant sizes. */
-  if (TREE_CODE ((TYPE_SIZE_UNIT (TREE_TYPE (type_0)))) != INTEGER_CST)
+  if (!tree_fits_uhwi_p ((TYPE_SIZE_UNIT (TREE_TYPE (type_0)))))
 {
   error_at (loc, 
"argument 1 of %qE must be a pointer to a constant size type",
--- gcc/testsuite/c-c++-common/pr96545.c.jj 2020-08-10 12:28:43.296222401 +0200
+++ gcc/testsuite/c-c++-common/pr96545.c    2020-08-10 12:28:28.258428487 +0200
@@ -0,0 +1,31 @@
+/* PR c/96545 */
+/* { dg-do compile } */
+
+extern char x[], y[], z[];
+struct S;
+extern struct S s, t, u;
+int v, w;
+
+void
+foo (void)
+{
+  __atomic_exchange (&x, &y, &z, 0);   /* { dg-error "must be a pointer to a complete type" } */
+}
+
+void
+bar (void)
+{
+  __atomic_exchange (&s, &t, &u, 0);   /* { dg-error "must be a pointer to a complete type" } */
+}
+
+void
+baz (void)
+{
+  __atomic_exchange (&v, &t, &w, 0);   /* { dg-error "size mismatch in argument 2 of" } */
+}
+
+void
+qux (void)
+{
+  __atomic_exchange (&v, &w, &t, 0);   /* { dg-error "size mismatch in argument 3 of" } */
+}

Jakub
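
For contrast with the rejected calls in the testcase, here is a minimal sketch
of a use the generic builtin accepts: all three pointers refer to complete
types of the same size.  exchange_int is a hypothetical helper name, and the
sketch assumes a compiler that provides the GCC-style __atomic builtins.

```c
/* Hypothetical helper: atomically store DESIRED into *SLOT and return
   the value that was there before.  */
static int
exchange_int (int *slot, int desired)
{
  int previous;

  /* Generic form: object pointer, pointer to the new value, pointer to
     where the old value is written, then the memory order.  */
  __atomic_exchange (slot, &desired, &previous, __ATOMIC_SEQ_CST);
  return previous;
}
```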



[PATCH] tree: Fix up get_narrower [PR96549]

2020-08-11 Thread Jakub Jelinek via Gcc-patches
Hi!

My changes to get_narrower to support COMPOUND_EXPRs apparently
used the wrong type for the COMPOUND_EXPRs: while e.g. the rhs
type was unsigned short, the COMPOUND_EXPR got int type, as that was the
original type of op.  The type of a COMPOUND_EXPR should always be the type
of the rhs.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk/10.3?

2020-08-10  Jakub Jelinek  

PR c/96549
* tree.c (get_narrower): Use TREE_TYPE (ret) instead of
TREE_TYPE (win) for COMPOUND_EXPRs.

* gcc.c-torture/execute/pr96549.c: New test.

--- gcc/tree.c.jj   2020-08-03 22:54:51.456531124 +0200
+++ gcc/tree.c  2020-08-10 11:05:49.129685858 +0200
@@ -8877,7 +8877,7 @@ get_narrower (tree op, int *unsignedp_pt
v.safe_push (op);
   FOR_EACH_VEC_ELT_REVERSE (v, i, op)
ret = build2_loc (EXPR_LOCATION (op), COMPOUND_EXPR,
- TREE_TYPE (win), TREE_OPERAND (op, 0),
+ TREE_TYPE (ret), TREE_OPERAND (op, 0),
  ret);
   return ret;
 }
--- gcc/testsuite/gcc.c-torture/execute/pr96549.c.jj    2020-08-10 11:09:30.307623013 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr96549.c   2020-08-10 11:09:15.772824289 +0200
@@ -0,0 +1,12 @@
+/* PR c/96549 */
+
+long c = -1L;
+long b = 0L;
+
+int
+main ()
+{
+  if (3L > (short) ((c ^= (b = 1L)) * 3L))
+return 0;
+  __builtin_abort ();
+}

Jakub
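
The arithmetic the new execute test depends on can be traced by hand.  The
sketch below wraps the same expression in a hypothetical function,
narrowed_compare, taking pointers instead of the test's globals, with the
intermediate values for c == -1 spelled out in the comment.

```c
/* Hypothetical wrapper around the expression from pr96549.c.
   With *c == -1 on entry:
     (*b = 1L)        stores 1 in *b and yields 1
     *c ^= 1          turns -1 into -2
     -2 * 3L          is -6
     (short) -6       is still -6
     3L > -6          is true, so the result is 1.  */
static int
narrowed_compare (long *c, long *b)
{
  return 3L > (short) ((*c ^= (*b = 1L)) * 3L);
}
```

A correctly compiled test therefore takes the return 0 path and never reaches
__builtin_abort.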



Re: [PATCH] [PR target/96350]Force ENDBR immediate into memory to avoid fake ENDBR opcode.

2020-08-11 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 11, 2020 at 5:30 AM Hongtao Liu  wrote:
>
> Hi:
>   The issue is described in the bugzilla.
>   Bootstrap is ok, regression test for i386/x86-64 backend is ok.
>   Ok for trunk?
>
> ChangeLog
> gcc/
> PR target/96350
> * config/i386/i386.c (ix86_legitimate_constant_p): Return
> false for ENDBR immediate.
> (ix86_legitimate_address_p): Ditto.
> * config/i386/predicates.md
> (x86_64_immediate_operand): Exclude ENDBR immediate.
> (x86_64_zext_immediate_operand): Ditto.
> (x86_64_dwzext_immediate_operand): Ditto.
> (ix86_not_endbr_immediate_operand): New predicate.
>
> gcc/testsuite
> * gcc.target/i386/endbr_immediate.c: New test.

+;; Return true if VALUE isn't an ENDBR opcode in immediate field.
+(define_predicate "ix86_not_endbr_immediate_operand"
+  (match_test "1")

Please reverse the above logic to introduce
ix86_endbr_immediate_operand, that returns true for unwanted
immediate. Something like:

(define_predicate "ix86_endbr_immediate_operand"
  (match_code "const_int")
...

And you will be able to use it like:

if (ix86_endbr_immediate_operand (x, VOIDmode))
  return false;

   /* Otherwise we handle everything else in the move patterns.  */
-  return true;
+  return ix86_not_endbr_immediate_operand (x, VOIDmode);
 }

Please handle this in CASE_CONST_SCALAR_INT: part.

+  if (disp && !ix86_not_endbr_immediate_operand (disp, VOIDmode))
+return false;

And this in:

  /* Validate displacement.  */
  if (disp)
{

Uros.

> --
> BR,
> Hongtao


[PATCH] x86_64: Use peephole2 to eliminate redundant moves.

2020-08-11 Thread Roger Sayle

The recent fix for mul_widen_cost revealed an interesting
quirk of ira/reload register allocation on x86_64.  As shown in
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551648.html
for gcc.target/i386/pr71321.c we generate the following code that
performs unnecessary register shuffling.

        movl    $-51, %edx
        movl    %edx, %eax
        mulb    %dil

which is caused by reload generating the following instructions
(notice the set of the first register is dead in the 2nd insn):

(insn 7 4 36 2 (set (reg:QI 1 dx [94])
(const_int -51 [0xffffffffffffffcd])) {*movqi_internal}
 (expr_list:REG_EQUIV (const_int -51 [0xffffffffffffffcd])
(nil)))
(insn 36 7 8 2 (set (reg:QI 0 ax [93])
(reg:QI 1 dx [94])) {*movqi_internal}
 (expr_list:REG_DEAD (reg:QI 1 dx [94])
(nil)))

Various discussions in bugzilla seem to point to reload preferring
not to load constants directly into CLASS_LIKELY_SPILLED_P registers.
Whatever the cause, one solution (workaround) that doesn't involve
rewriting a register allocator is to use peephole2 to spot this
weirdness and eliminate it.  In fact, this use case is (probably)
the reason peephole optimizers were originally developed, but it's
a little disappointing this application of them is still required
today.  On a positive note, this clean-up is cheap, as we're already
traversing the instruction stream with liveness (REG_DEAD notes)
calculated.

With this peephole2 the above three instructions (from pr71321.c)
are replaced with:

        movl    $-51, %eax
        mulb    %dil

This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures.  This peephole triggers
1435 times during stage2 and stage3 of a bootstrap, and a further 1274
times during "make check".  The most common case is DX_REG->AX_REG
(as above) which occurs 421 times.  I've restricted this pattern to
immediate constant loads into general operand registers, which fixes
this particular problem, but broader predicates may help similar cases.
Ok for mainline?

2020-08-11  Roger Sayle  

* config/i386/i386.md (peephole2): Reduce unnecessary
register shuffling produced by register allocation.

Thanks in advance,
Roger
--
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 4e916bf..34a8946 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -18946,6 +18946,16 @@
   operands[2] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG);
   ix86_expand_clear (operands[1]);
 })
+
+;; Reload dislikes loading constants directly into class_likely_spilled
+;; hard registers.  Try to tidy things up here.
+(define_peephole2
+  [(set (match_operand:SWI 0 "general_reg_operand")
+   (match_operand:SWI 1 "x86_64_immediate_operand"))
+   (set (match_operand:SWI 2 "general_reg_operand")
+   (match_dup 0))]
+  "peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 2) (match_dup 1))])
 
 ;; Misc patterns (?)
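
To make the transformation concrete outside the machine description, here is a
toy model in C.  It is a sketch only: struct toy_insn, toy_peephole and the
src_dies flag (standing in for a REG_DEAD note) are hypothetical and have
nothing to do with GCC's real insn representation.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_op { TOY_LOAD_IMM, TOY_MOVE_REG, TOY_OTHER };

/* Hypothetical three-field instruction.  */
struct toy_insn
{
  enum toy_op op;
  int dst;        /* destination register number */
  int src;        /* source register, used by TOY_MOVE_REG */
  long imm;       /* constant, used by TOY_LOAD_IMM */
  bool src_dies;  /* stands in for a REG_DEAD note on SRC */
};

/* Rewrite "load imm -> rA; move rA -> rB, rA dead" into
   "load imm -> rB", compacting INSNS in place.  Returns the new
   number of instructions.  */
static size_t
toy_peephole (struct toy_insn *insns, size_t n)
{
  size_t out = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (i + 1 < n
          && insns[i].op == TOY_LOAD_IMM
          && insns[i + 1].op == TOY_MOVE_REG
          && insns[i + 1].src == insns[i].dst
          && insns[i + 1].src_dies)
        {
          /* Load the constant straight into the final register.  */
          struct toy_insn merged = insns[i];
          merged.dst = insns[i + 1].dst;
          insns[out++] = merged;
          i++;                      /* the move is now redundant */
        }
      else
        insns[out++] = insns[i];
    }
  return out;
}
```

Feeding it the dx/ax sequence from pr71321.c (registers 1 and 0) leaves a single
load of -51 into register 0 followed by the unrelated instruction.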
 


[PATCH] Add debug counter for IPA bits CP.

2020-08-11 Thread Martin Liška

Hey.

I'm debugging PR96482 and it would be handy for me to have a debug counter
for the problematic transformation.

Ready for master?
Thanks,
Martin

gcc/ChangeLog:

* dbgcnt.def (DEBUG_COUNTER): Add ipa_cp_bits.
* ipa-cp.c (ipcp_store_bits_results): Use it when we store known
bits for parameters.
---
 gcc/dbgcnt.def |  1 +
 gcc/ipa-cp.c   | 11 ---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index 3998c9636aa..cf8775b2b66 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -170,6 +170,7 @@ DEBUG_COUNTER (if_after_combine)
 DEBUG_COUNTER (if_after_reload)
 DEBUG_COUNTER (if_conversion)
 DEBUG_COUNTER (if_conversion_tree)
+DEBUG_COUNTER (ipa_cp_bits)
 DEBUG_COUNTER (ipa_sra_params)
 DEBUG_COUNTER (ipa_sra_retvalues)
 DEBUG_COUNTER (ira_move)
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 10cc59509d5..945a69977f3 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -123,6 +123,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-ccp.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "dbgcnt.h"
 
 template <typename valtype> class ipcp_value;
 
@@ -5788,9 +5789,13 @@ ipcp_store_bits_results (void)

  ipa_bits *jfbits;
 
 	  if (plats->bits_lattice.constant_p ())

-   jfbits
- = ipa_get_ipa_bits_for_value (plats->bits_lattice.get_value (),
-   plats->bits_lattice.get_mask ());
+   {
+ jfbits
+   = ipa_get_ipa_bits_for_value (plats->bits_lattice.get_value (),
+ plats->bits_lattice.get_mask ());
+ if (!dbg_cnt (ipa_cp_bits))
+   jfbits = NULL;
+   }
  else
jfbits = NULL;
 
--

2.28.0
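
For anyone unfamiliar with debug counters, the idea can be modelled in a few
lines of C.  This is a simplified sketch, not GCC's dbgcnt implementation:
struct toy_counter and toy_dbg_cnt are hypothetical names, and only a single
upper limit is modelled.

```c
#include <stdbool.h>

/* Hypothetical counter guarding one kind of transformation.  */
struct toy_counter
{
  unsigned count;  /* how many times the guarded spot was reached */
  unsigned limit;  /* how many transformations are allowed */
};

/* Return true while the transformation may still be applied.  Once
   the limit is passed, every later request is refused, which lets a
   bisection over LIMIT find the first harmful application.  */
static bool
toy_dbg_cnt (struct toy_counter *c)
{
  c->count++;
  return c->count <= c->limit;
}
```

In the patch, the corresponding refusal is what resets jfbits to NULL, so the
known bits are simply not stored for that parameter.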



Re: [PATCH] emit-rtl.c: Allow splitting of RTX_FRAME_RELATED_P insns?

2020-08-11 Thread Richard Sandiford
Segher Boessenkool  writes:
> On Mon, Aug 10, 2020 at 05:16:15PM +0100, Richard Sandiford wrote:
>> Senthil Kumar via Gcc-patches  writes:
>> >   The wiki suggests using post-reload splitters, so that's the
>> >   direction I took, but I ran into an issue where split_insn
>> >   bails out early if RTX_FRAME_RELATED_P is true - this means
>> >   that splits for REG_CC clobbering insns with
>> >   RTX_FRAME_RELATED_P will never execute, resulting in a
>> >   could-not-split insn ICE in the final stage.
>> >
>> >   I see that the recog.c:peep2_attempt allows splitting of a
>> >   RTX_FRAME_RELATED_P insn, provided the result of the split is a
>> >   single insn. Would it be ok to modify try_split also to
>> >   allow those kinds of insns (tentative patch attached, code
>> >   copied over from peep2_attempt, only setting old and new_insn)? Or is
>> >   there a different approach to fix this?
>> 
>> I agree there's no obvious reason why splitting to a single insn
>> should be rejected but a peephole2 to a single instruction should be OK.
>> And reusing the existing, tried-and-tested code is the way to go.
>
> The only obvious difference is that the splitters run many times, while
> peep2 runs only once, very late.  If you make this only do stuff for
> reload_completed splitters, that difference is gone as well.

Yeah, but I was talking specifically about RTX_FRAME_RELATED_P stuff,
rather than in general, and RTX_FRAME_RELATED_P insns shouldn't exist
until prologue/epilogue generation.  The reference to “single insn”
was because both passes would still reject splitting/peepholing an
RTX_FRAME_RELATED_P insn to multiple insns.

Thanks,
Richard