Re: [PATCH v2] [libatomic] Add RTEMS support

2016-04-24 Thread Sebastian Huber

Ok, what about the GCC trunk?

On 20/04/16 14:35, Sebastian Huber wrote:

Hello,

I know that I am pretty late, but is there a chance to get this into 
the GCC 6.1 release?


On 19/04/16 14:56, Sebastian Huber wrote:

v2: Do not use architecture configuration due to broken ARM libatomic
support.

gcc/

* config/rtems.h (LIB_SPEC): Add -latomic.

libatomic/

* configure.tgt (*-*-rtems*): New supported target.
* config/rtems/host-config.h: New file.
* config/rtems/lock.c: Likewise.
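
A minimal sketch of what the LIB_SPEC change amounts to (the spec string
below is hypothetical, not the committed one):

/* Hypothetical reduction of the config/rtems.h change: append -latomic
   to the link spec so atomic operations that fall back on libatomic
   link without extra user flags.  */
#undef  LIB_SPEC
#define LIB_SPEC "%{!nostdlib: -lc -latomic}"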




--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the German EHUG.



Re: [wwwdocs] Buildstat update for 4.8

2016-04-24 Thread Gerald Pfeifer
On Sat, 23 Apr 2016, Tom G. Christensen wrote:
> Latest results for 4.8.x

Thank you, Tom.  Applied.

Gerald


[Ping] Re: [ARM] Add support for overflow add, sub, and neg operations

2016-04-24 Thread Michael Collison


Ping. Previous Patch posted here:

https://gcc.gnu.org/ml/gcc-patches/2016-03/msg01472.html

--
Michael Collison
Linaro Toolchain Working Group
michael.colli...@linaro.org



Re: [PATCH][doc] Update documentation of AArch64 options

2016-04-24 Thread Sandra Loosemore

On 04/22/2016 11:11 AM, Wilco Dijkstra wrote:


[snip]

Fixed, new version below:


2016-04-22  Wilco Dijkstra  

gcc/
* gcc/doc/invoke.texi (AArch64 Options): Update.


Thanks, this looks much better.

-Sandra



Re: [PATCH][AArch64][wwwdocs] Summarise some more AArch64 changes for GCC6

2016-04-24 Thread Sandra Loosemore

On 04/22/2016 03:57 AM, James Greenhalgh wrote:

On Thu, Apr 21, 2016 at 09:15:17AM +0100, Kyrill Tkachov wrote:

Hi all,

Here's a proposed summary of the changes in the AArch64 backend for GCC 6.
If there's anything I've missed it's purely my oversight, feel free to add
entries or suggest improvements.


For me, I'm mostly happy with the wording below (I've tried to be
helpful inline). But I'm not as conscientious at checking grammar as others
in the community. So this is OK from an AArch64 target perspective with
the changes below, but wait a short while to give Gerald or Sandra a chance
to comment.


I haven't done a careful review of the whole section of existing text, 
but I did notice a few things in text not being touched by this patch:



+ 
 The new command line options -march=native,


s/command line options/command-line options/


 -mcpu=native and -mtune=native are now
 available on native AArch64 GNU/Linux systems.  Specifying
 these options will cause GCC to auto-detect the host CPU and


s/will cause/causes/


 rewrite these options to the optimal setting for that system.


s/rewrite these options to the optimal/choose the/


-   -fpic is now supported by the AArch64 target when generating
+   -fpic is now supported when generating
 code for the small code model (-mcmodel=small).  The size of
 the global offset table (GOT) is limited to 28KiB under the LP64 SysV ABI
 , and 15KiB under the ILP32 SysV ABI.


Move the comma directly after "ABI", not separated by newline and 
whitespace.


-Sandra



Re: Document OpenACC status for GCC 6

2016-04-24 Thread Sandra Loosemore

On 04/22/2016 03:26 AM, Thomas Schwinge wrote:


Thanks for the review; OK to commit as follows?  And then, should
something be added to the "News" section on the GCC homepage
itself, too?  (I don't know the policy for that.  We didn't suggest that
for GCC 5, because at that time we described the support as a
"preliminary implementation of the OpenACC 2.0a specification"; now it's
much more complete and usable.)


I think the new patch is acceptable for release notes, but TBH I don't 
know what the policy is for updating "News", either.  :-S


-Sandra



Re: [PATCH, rs6000] Add support for vector element-reversal built-ins

2016-04-24 Thread Bill Schmidt
On Sun, 2016-04-24 at 15:52 -0500, Segher Boessenkool wrote:
> On Sun, Apr 24, 2016 at 02:06:47PM -0500, Bill Schmidt wrote:
> > ISA 3.0 adds the lvxh8x, lvxb16x, stvxh8x, and stvxb16x instructions,
> 
> lxvh8x etc.  It looks like you only swapped things in this message,
> not in the actual patch :-)

D'oh!  Yes, I make that typo all the time...

> 
> > (While working on this patch, I happened to notice that the existing
> > entries in rs6000-builtin.def for STXVD2X_<mode> and STXVW4X_<mode> are
> > mapped to stxsdx instead of stxvd2x/stxvw4x.  I took the opportunity to
> > correct that as an obvious bug.)
> 
> Does that part need backporting?

Yes, we should do that.

> 
> Should the new builtins be documented?

Bah, I knew I had forgotten something.  Yes, they should.  I'll put
together a follow-on patch to update the documentation.

Thanks!

Bill

> 
> Looks fine otherwise.
> 
> 
> Segher
> 




[PATCH, i386]: Use const_0_to_3_operand predicate some more

2016-04-24 Thread Uros Bizjak
Hello!

No functional changes.
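
For illustration, the kind of source this pattern targets (a minimal
sketch; the asm in the comment is a typical choice, not guaranteed):

/* An OR with a constant that fits below the shifted bits is really an
   addition, and lea scales only by 1, 2, 4 or 8 -- hence restricting
   the shift count to 0..3.  */
unsigned int
or_to_lea (unsigned int x)
{
  return (x << 2) | 3;     /* == x * 4 + 3, e.g. leal 3(,%rdi,4), %eax */
}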

2016-04-25  Uros Bizjak  

* config/i386/i386.md (*lea_general_4): Use const_0_to_3_operand
predicate for operand 2.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 235396)
+++ i386.md (working copy)
@@ -6291,10 +6291,9 @@
(any_or:SWI12
  (ashift:SWI12
(match_operand:SWI12 1 "index_register_operand" "l")
-   (match_operand:SWI12 2 "const_int_operand" "n"))
+   (match_operand:SWI12 2 "const_0_to_3_operand" "n"))
  (match_operand:SWI12 3 "const_int_operand" "n")))]
   "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
-   && (unsigned HOST_WIDE_INT) INTVAL (operands[2]) <= 3
&& ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
< (HOST_WIDE_INT_1U << INTVAL (operands[2])))"
   "#"
@@ -6316,11 +6315,10 @@
(any_or:SWI48
  (ashift:SWI48
(match_operand:SWI48 1 "index_register_operand" "l")
-   (match_operand:SWI48 2 "const_int_operand" "n"))
+   (match_operand:SWI48 2 "const_0_to_3_operand" "n"))
  (match_operand:SWI48 3 "const_int_operand" "n")))]
-  "(unsigned HOST_WIDE_INT) INTVAL (operands[2]) <= 3
-   && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
-   < (HOST_WIDE_INT_1U << INTVAL (operands[2])))"
+  "(unsigned HOST_WIDE_INT) INTVAL (operands[3])
+   < (HOST_WIDE_INT_1U << INTVAL (operands[2]))"
   "#"
   "&& reload_completed"
   [(set (match_dup 0)


Re: [PATCH] Allow all 1s of integer as standard SSE constants

2016-04-24 Thread Uros Bizjak
Hello!

Attached patch is what I have committed to handle immediates with all
bits set as standard SSE constants.
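
For context, the two kinds of "standard" SSE constants, sketched
(instruction choices in the comments are typical, not guaranteed):

typedef int v4si __attribute__ ((vector_size (16)));

/* All-zeros: xor of a register with itself.  */
v4si zeros (void) { return (v4si) { 0, 0, 0, 0 }; }       /* pxor    */

/* All-ones: compare-equal of a register with itself; this is why
   standard_sse_constant_p now returns 2 for such immediates.  */
v4si ones (void) { return (v4si) { -1, -1, -1, -1 }; }    /* pcmpeqd */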

2016-04-24  Uros Bizjak  
H.J. Lu  

* config/i386/i386-protos.h (standard_sse_constant_p): Add
machine_mode argument.
* config/i386/i386.c (standard_sse_constant_p): Return 2 for
constm1_rtx operands.  For VOIDmode constants, get mode from
pred_mode.  Check mode size if the mode is supported by ABI.
(standard_sse_constant_opcode): Do not use standard_constant_p.
Strictly check ABI support for all-ones operands.
(ix86_legitimate_constant_p): Handle TImode, OImode and XImode
immediates. Update calls to standard_sse_constant_p.
(ix86_expand_vector_move): Update calls to standard_sse_constant_p.
(ix86_rtx_costs): Ditto.
* config/i386/i386.md (*movxi_internal_avx512f): Use
nonimmediate_or_sse_const_operand instead of vector_move_operand.
Use (v,BC) alternative instead of (v,C). Use register_operand
checks instead of MEM_P.
(*movoi_internal_avx): Use nonimmediate_or_sse_const_operand instead
of vector_move_operand.  Add (v,BC) alternative and corresponding avx2
isa attribute.  Use register_operand checks instead of MEM_P.
(*movti_internal): Use nonimmediate_or_sse_const_operand for
TARGET_SSE.  Improve TARGET_SSE insn constraint.  Add (v,BC)
alternative and corresponding sse2 isa attribute.
(*movtf_internal, *movdf_internal, *movsf_internal): Update calls
to standard_sse_constant_p.
(FP constant splitters): Ditto.
* config/i386/constraints.md (BC): Do not use standard_sse_constant_p.
(C): Ditto.
* config/i386/predicates.md (constm1_operand): Remove.
(nonimmediate_or_sse_const_operand): Rewrite using RTX.
* config/i386/sse.md (*_cvtmask2): Use
vector_all_ones_operand instead of constm1_operand.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
Index: constraints.md
===
--- constraints.md  (revision 235371)
+++ constraints.md  (working copy)
@@ -186,7 +186,10 @@
 
 (define_constraint "BC"
   "@internal SSE constant operand."
-  (match_test "standard_sse_constant_p (op)"))
+  (and (match_test "TARGET_SSE")
+   (ior (match_test "op == const0_rtx || op == constm1_rtx")
+   (match_operand 0 "const0_operand")
+   (match_operand 0 "vector_all_ones_operand"
 
 ;; Integer constant constraints.
 (define_constraint "I"
@@ -239,7 +242,9 @@
 ;; This can theoretically be any mode's CONST0_RTX.
 (define_constraint "C"
   "SSE constant zero operand."
-  (match_test "standard_sse_constant_p (op) == 1"))
+  (and (match_test "TARGET_SSE")
+   (ior (match_test "op == const0_rtx")
+   (match_operand 0 "const0_operand"
 
 ;; Constant-or-symbol-reference constraints.
 
Index: i386-protos.h
===
--- i386-protos.h   (revision 235371)
+++ i386-protos.h   (working copy)
@@ -50,7 +50,7 @@ extern bool ix86_using_red_zone (void);
 extern int standard_80387_constant_p (rtx);
 extern const char *standard_80387_constant_opcode (rtx);
 extern rtx standard_80387_constant_rtx (int);
-extern int standard_sse_constant_p (rtx);
+extern int standard_sse_constant_p (rtx, machine_mode);
 extern const char *standard_sse_constant_opcode (rtx_insn *, rtx);
 extern bool symbolic_reference_mentioned_p (rtx);
 extern bool extended_reg_mentioned_p (rtx);
Index: i386.c
===
--- i386.c  (revision 235371)
+++ i386.c  (working copy)
@@ -10762,11 +10762,11 @@ standard_80387_constant_rtx (int idx)
   XFmode);
 }
 
-/* Return 1 if X is all 0s and 2 if x is all 1s
+/* Return 1 if X is all bits 0 and 2 if X is all bits 1
in supported SSE/AVX vector mode.  */
 
 int
-standard_sse_constant_p (rtx x)
+standard_sse_constant_p (rtx x, machine_mode pred_mode)
 {
   machine_mode mode;
 
@@ -10774,34 +10774,38 @@ int
 return 0;
 
   mode = GET_MODE (x);
-  
-  if (x == const0_rtx || x == CONST0_RTX (mode))
+
+  if (x == const0_rtx || const0_operand (x, mode))
 return 1;
-  if (vector_all_ones_operand (x, mode))
-switch (mode)
-  {
-  case V16QImode:
-  case V8HImode:
-  case V4SImode:
-  case V2DImode:
-   if (TARGET_SSE2)
- return 2;
-  case V32QImode:
-  case V16HImode:
-  case V8SImode:
-  case V4DImode:
-   if (TARGET_AVX2)
- return 2;
-  case V64QImode:
-  case V32HImode:
-  case V16SImode:
-  case V8DImode:
-   if (TARGET_AVX512F)
- return 2;
-  default:
-   break;
-  }
 
+  if (x == constm1_rtx || vector_all_ones_operand (x, mode))
+{
+  /* VOIDmode integer constant, get mode from the predicate.  */
+  if (mode == VOIDmode)
+   mode = pred_mode;
+
+  switch (GET_MODE_SIZE (mode))
+  

Re: [PATCH, rs6000] Add support for vector element-reversal built-ins

2016-04-24 Thread Segher Boessenkool
On Sun, Apr 24, 2016 at 02:06:47PM -0500, Bill Schmidt wrote:
> ISA 3.0 adds the lvxh8x, lvxb16x, stvxh8x, and stvxb16x instructions,

lxvh8x etc.  It looks like you only swapped things in this message,
not in the actual patch :-)

> (While working on this patch, I happened to notice that the existing
> entries in rs6000-builtin.def for STXVD2X_<mode> and STXVW4X_<mode> are
> mapped to stxsdx instead of stxvd2x/stxvw4x.  I took the opportunity to
> correct that as an obvious bug.)

Does that part need backporting?

Should the new builtins be documented?

Looks fine otherwise.


Segher


[PATCH, rs6000] Add support for vector element-reversal built-ins

2016-04-24 Thread Bill Schmidt
Hi,

ISA 3.0 adds the lvxh8x, lvxb16x, stvxh8x, and stvxb16x instructions,
which perform vector loads in big-endian order, regardless of the target
endianness.  These join the similar lvxd2x, lvxw4x, stvxd2x, and stvxw4x
instructions introduced in ISA 2.06.  These existing instructions have been
used in several ways, but we don't yet have built-ins to allow them to
be specifically generated for little-endian.  This patch corrects that,
and adds built-ins for the new ISA 3.0 instructions as well.

Note that the behaviors of lvxd2x, lvxw4x, lvxh8x, and lxvb16x are
indistinguishable from one another in big-endian mode, and similarly for
the stores.  So we can treat these as simple moves that will generate
any applicable load or store (such as lxvx and stxvx for ISA 3.0).  For
little-endian, however, we require separate patterns for each of these
loads and stores to ensure that we get the correct element-reversal
semantics for each of them, depending on the vector mode.
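
A sketch of the intended semantics in plain C (illustration only, using
GCC vector extensions; not how the built-ins are implemented):

typedef double v2df __attribute__ ((vector_size (16)));

/* A big-endian-order ("element-reversed") load on a little-endian
   target behaves like the native load followed by an element swap.  */
v2df
load_elemrev_v2df (const v2df *p)
{
  v2df v = *p;                       /* native element order */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  v = (v2df) { v[1], v[0] };         /* reverse for BE element order */
#endif
  return v;
}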

(While working on this patch, I happened to notice that the existing
entries in rs6000-builtin.def for STXVD2X_<mode> and STXVW4X_<mode> are
mapped to stxsdx instead of stxvd2x/stxvw4x.  I took the opportunity to
correct that as an obvious bug.)

I've added four new tests to demonstrate correct behavior of the new
built-in functions.  These include variants for big- and little-endian,
and variants for -mcpu=power8 and -mcpu=power9.

Bootstrapped and tested on powerpc64-unknown-linux-gnu and
powerpc64le-unknown-linux-gnu with no regressions.  Is this ok for
trunk, following GCC 6 release?

Thanks,
Bill


[gcc]

2016-04-24  Bill Schmidt  

* config/rs6000/rs6000-builtin.def (STXVD2X_V1TI): Fix target
built-in function name.
(STXVD2X_V2DF): Likewise.
(STXVD2X_V2DI): Likewise.
(STXVW4X_V4SF): Likewise.
(STXVW4X_V4SI): Likewise.
(STXVW4X_V8HI): Likewise.
(STXVW4X_V16QI): Likewise.
(LD_ELEMREV_V2DF): New.
(LD_ELEMREV_V2DI): New.
(LD_ELEMREV_V4SF): New.
(LD_ELEMREV_V4SI): New.
(LD_ELEMREV_V8HI): New.
(LD_ELEMREV_V16QI): New.
(ST_ELEMREV_V2DF): New.
(ST_ELEMREV_V2DI): New.
(ST_ELEMREV_V4SF): New.
(ST_ELEMREV_V4SI): New.
(ST_ELEMREV_V8HI): New.
(ST_ELEMREV_V16QI): New.
* config/rs6000/rs6000.c (altivec_expand_builtin): Add handling
for VSX_BUILTIN_ST_ELEMREV_<MODE> and
VSX_BUILTIN_LD_ELEMREV_<MODE>.
(altivec_init_builtins): Likewise.
* config/rs6000/vsx.md (vsx_ld_elemrev_v2di): New define_insn.
(vsx_ld_elemrev_v2df): Likewise.
(vsx_ld_elemrev_v4sf): Likewise.
(vsx_ld_elemrev_v4si): Likewise.
(vsx_ld_elemrev_v8hi): Likewise.
(vsx_ld_elemrev_v16qi): Likewise.
(vsx_st_elemrev_v2df): Likewise.
(vsx_st_elemrev_v2di): Likewise.
(vsx_st_elemrev_v4sf): Likewise.
(vsx_st_elemrev_v4si): Likewise.
(vsx_st_elemrev_v8hi): Likewise.
(vsx_st_elemrev_v16qi): Likewise.

[gcc/testsuite]

2016-04-24  Bill Schmidt  

* gcc.target/powerpc/vsx-elemrev-1.c: New.
* gcc.target/powerpc/vsx-elemrev-2.c: New.
* gcc.target/powerpc/vsx-elemrev-3.c: New.
* gcc.target/powerpc/vsx-elemrev-4.c: New.


diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 5b82b00..aa87633 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1391,13 +1391,25 @@ BU_VSX_X (LXVW4X_V4SI,"lxvw4x_v4si",MEM)
 BU_VSX_X (LXVW4X_V8HI,"lxvw4x_v8hi",   MEM)
 BU_VSX_X (LXVW4X_V16QI,  "lxvw4x_v16qi",   MEM)
 BU_VSX_X (STXSDX,"stxsdx", MEM)
-BU_VSX_X (STXVD2X_V1TI,  "stxsdx_v1ti",MEM)
-BU_VSX_X (STXVD2X_V2DF,  "stxsdx_v2df",MEM)
-BU_VSX_X (STXVD2X_V2DI,  "stxsdx_v2di",MEM)
-BU_VSX_X (STXVW4X_V4SF,  "stxsdx_v4sf",MEM)
-BU_VSX_X (STXVW4X_V4SI,  "stxsdx_v4si",MEM)
-BU_VSX_X (STXVW4X_V8HI,  "stxsdx_v8hi",MEM)
-BU_VSX_X (STXVW4X_V16QI,  "stxsdx_v16qi",  MEM)
+BU_VSX_X (STXVD2X_V1TI,  "stxvd2x_v1ti",   MEM)
+BU_VSX_X (STXVD2X_V2DF,  "stxvd2x_v2df",   MEM)
+BU_VSX_X (STXVD2X_V2DI,  "stxvd2x_v2di",   MEM)
+BU_VSX_X (STXVW4X_V4SF,  "stxvw4x_v4sf",   MEM)
+BU_VSX_X (STXVW4X_V4SI,  "stxvw4x_v4si",   MEM)
+BU_VSX_X (STXVW4X_V8HI,  "stxvw4x_v8hi",   MEM)
+BU_VSX_X (STXVW4X_V16QI,  "stxvw4x_v16qi", MEM)
+BU_VSX_X (LD_ELEMREV_V2DF,"ld_elemrev_v2df",  MEM)
+BU_VSX_X (LD_ELEMREV_V2DI,"ld_elemrev_v2di",  MEM)
+BU_VSX_X (LD_ELEMREV_V4SF,"ld_elemrev_v4sf",  MEM)
+BU_VSX_X (LD_ELEMREV_V4SI,"ld_elemrev_v4si",  MEM)
+BU_VSX_X (LD_ELEMREV_V8HI,"ld_elemrev_v8hi",  MEM)
+BU_VSX_X (LD_ELEMREV_V16QI,   "ld_elemrev_v16qi", MEM)
+BU_VSX_X (ST_ELEMREV_V2DF,"st_elemrev_v2df",  MEM)
+BU_VSX_X (ST_ELEMREV_V2DI,"st_elemrev_v2di",  M

New Swedish PO file for 'gcc' (version 6.1-b20160131)

2016-04-24 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-6.1-b20160131.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[patch] libstdc++/70762 fix fallback implementation of nonexistent_path

2016-04-24 Thread Jonathan Wakely

This ensures that each call to __gnu_test::nonexistent_path() returns
a different path.
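
The scheme, as a standalone sketch (the testsuite helper itself is in
the diff below):

/* getpid() is constant within a single test process, so names built
   from the pid alone collide when a test calls the helper twice; a
   static counter makes every generated name unique.  */
#include <stdio.h>
#include <unistd.h>

void
make_test_name (char *buf, size_t len)
{
  static int counter;
  snprintf (buf, len, "filesystem-ts-test.%d.%lu",
            counter++, (unsigned long) getpid ());
}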

Tested x86_64-linux, x86_64-freebsd10, committed to trunk.


commit 3a632ad051d6d9ca20cd7f10b09bd39b09dae5ea
Author: Jonathan Wakely 
Date:   Sun Apr 24 18:50:26 2016 +0100

libstdc++/70762 fix fallback implementation of nonexistent_path

	PR libstdc++/70762
	* testsuite/util/testsuite_fs.h (__gnu_test::nonexistent_path): Use
	static counter to return a different path on every call.

diff --git a/libstdc++-v3/testsuite/util/testsuite_fs.h b/libstdc++-v3/testsuite/util/testsuite_fs.h
index d2b3b18..f1e0bfc 100644
--- a/libstdc++-v3/testsuite/util/testsuite_fs.h
+++ b/libstdc++-v3/testsuite/util/testsuite_fs.h
@@ -83,11 +83,13 @@ namespace __gnu_test
 p = tmp;
 #else
 char buf[64];
+static int counter;
 #if _GLIBCXX_USE_C99_STDIO
-std::snprintf(buf, 64, "filesystem-ts-test.%lu", (unsigned long)::getpid());
+std::snprintf(buf, 64,
 #else
-std::sprintf(buf, "filesystem-ts-test.%lu", (unsigned long)::getpid());
+std::sprintf(buf,
 #endif
+  "filesystem-ts-test.%d.%lu", counter++, (unsigned long) ::getpid());
 p = buf;
 #endif
 return p;


match.pd: unsigned A - B > A --> A < B

2016-04-24 Thread Marc Glisse

Hello,

the first part is something that was discussed last stage3, and Jakub
argued in favor of single_use.  The second part is probably less useful:
it notices that if we manually check for overflow using the result of
IFN_*_OVERFLOW, then we might as well read that information from the
result of that function.
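
At the source level, a sketch of both parts (the comments show the
expected folds):

/* Part one: for unsigned operands the long-form underflow test is just
   a compare, since a - b wraps exactly when b > a.  */
int
long_form (unsigned a, unsigned b)
{
  unsigned r = a - b;
  return r > a;                  /* folds to: b > a */
}

/* Part two: manually checking the builtin's result can instead reuse
   the overflow flag the builtin already computed.  */
int carry;
int
via_builtin (unsigned a, unsigned b)
{
  unsigned r;
  carry = __builtin_sub_overflow (a, b, &r);
  return r > a;                  /* folds to: carry (the overflow flag) */
}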


Bootstrap+regtest on powerpc64le-unknown-linux-gnu. (hmm, I probably 
should have done it on x86_64 instead, I don't know if the ppc backend has 
implemented the overflow functions recently)


2016-04-25  Marc Glisse  

gcc/
* match.pd (A - B > A, A + B < A): New transformations.

gcc/testsuite/
* gcc.dg/tree-ssa/overflow-2.c: New testcase.
* gcc.dg/tree-ssa/minus-ovf.c: Likewise.

--
Marc Glisse

Index: trunk-ovf2/gcc/match.pd
===
--- trunk-ovf2/gcc/match.pd (revision 235371)
+++ trunk-ovf2/gcc/match.pd (working copy)
@@ -3071,10 +3071,60 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  /* signbit(x) -> 0 if x is nonnegative.  */
  (SIGNBIT tree_expr_nonnegative_p@0)
  { integer_zero_node; })
 
 (simplify
  /* signbit(x) -> x<0 if x doesn't have signed zeros.  */
  (SIGNBIT @0)
  (if (!HONOR_SIGNED_ZEROS (@0))
   (convert (lt @0 { build_real (TREE_TYPE (@0), dconst0); }
+
+/* To detect overflow in unsigned A - B, A < B is simpler than A - B > A.
+   However, the detection logic for SUB_OVERFLOW in tree-ssa-math-opts.c
+   expects the long form, so we restrict the transformation for now.  */
+(for cmp (gt le)
+ (simplify
+  (cmp (minus@2 @0 @1) @0)
+  (if (single_use (@2)
+   && ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && TYPE_UNSIGNED (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
+   (cmp @1 @0
+(for cmp (lt ge)
+ (simplify
+  (cmp @0 (minus@2 @0 @1))
+  (if (single_use (@2)
+   && ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && TYPE_UNSIGNED (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
+   (cmp @0 @1
+
+/* Testing for overflow is unnecessary if we already know the result.  */
+/* A < A - B  */
+(for cmp (lt ge)
+ out (ne eq)
+ (simplify
+  (cmp @0 (realpart (IFN_SUB_OVERFLOW@2 @0 @1)))
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
+   (out (imagpart @2) { build_zero_cst (TREE_TYPE (@0)); }
+/* A - B > A  */
+(for cmp (gt le)
+ out (ne eq)
+ (simplify
+  (cmp (realpart (IFN_SUB_OVERFLOW@2 @0 @1)) @0)
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
+   (out (imagpart @2) { build_zero_cst (TREE_TYPE (@0)); }
+/* A + B < A  */
+(for cmp (lt ge)
+ out (ne eq)
+ (simplify
+  (cmp (realpart (IFN_ADD_OVERFLOW:c@2 @0 @1)) @0)
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
+   (out (imagpart @2) { build_zero_cst (TREE_TYPE (@0)); }
+/* A > A + B  */
+(for cmp (gt le)
+ out (ne eq)
+ (simplify
+  (cmp @0 (realpart (IFN_ADD_OVERFLOW:c@2 @0 @1)))
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
+   (out (imagpart @2) { build_zero_cst (TREE_TYPE (@0)); }
Index: trunk-ovf2/gcc/testsuite/gcc.dg/tree-ssa/minus-ovf.c
===
--- trunk-ovf2/gcc/testsuite/gcc.dg/tree-ssa/minus-ovf.c(revision 0)
+++ trunk-ovf2/gcc/testsuite/gcc.dg/tree-ssa/minus-ovf.c(working copy)
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int f(unsigned a, unsigned b) {
+  unsigned remove = a - b;
+  return remove > a;
+}
+
+int g(unsigned a, unsigned b) {
+  unsigned remove = a - b;
+  return remove <= a;
+}
+
+int h(unsigned a, unsigned b) {
+  unsigned remove = a - b;
+  return a < remove;
+}
+
+int i(unsigned a, unsigned b) {
+  unsigned remove = a - b;
+  return a >= remove;
+}
+
+/* { dg-final { scan-tree-dump-not "remove" "optimized" } } */
Index: trunk-ovf2/gcc/testsuite/gcc.dg/tree-ssa/overflow-2.c
===
--- trunk-ovf2/gcc/testsuite/gcc.dg/tree-ssa/overflow-2.c   (revision 0)
+++ trunk-ovf2/gcc/testsuite/gcc.dg/tree-ssa/overflow-2.c   (working copy)
@@ -0,0 +1,68 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+
+int carry;
+int f(unsigned a, unsigned b) {
+  unsigned r;
+  carry = __builtin_sub_overflow(a, b, &r);
+  return r > a;
+}
+int g(unsigned a, unsigned b) {
+  unsigned r;
+  carry = __builtin_sub_overflow(a, b, &r);
+  return a < r;
+}
+int h(unsigned a, unsigned b) {
+  unsigned r;
+  carry = __builtin_sub_overflow(a, b, &r);
+  return r <= a;
+}
+int i(unsigned a, unsigned b) {
+  unsigned r;
+  carry = __builtin_sub_overflow(a, b, &r);
+  return a >= r;
+}
+int j(unsigned a, unsigned b) {
+  unsigned r;
+  carry = __builtin_add_overflow(a, b, &r);
+  return r < a;
+}
+int j2(unsigned a, unsigned b) {
+  unsigned r;
+  carry = __builtin_add_overflow(a, b, &r);
+  return r < b;
+}
+int k(unsigned a, unsigned b) {
+  unsigned r;
+  carry = __builtin_add_overflow(a, b, &r);
+  return a > r;
+}
+int k2(unsigned a, unsigned b) {
+  unsig

Re: match.pd patch: u + 3 < u is u > UINT_MAX - 3

2016-04-24 Thread Marc Glisse

On Fri, 22 Apr 2016, Marc Glisse wrote:


On Fri, 22 Apr 2016, Richard Biener wrote:


On Fri, Apr 22, 2016 at 5:29 AM, Marc Glisse  wrote:

Hello,

this optimizes a common pattern for unsigned overflow detection, when one
of the arguments turns out to be a constant. There are more ways this
could look, (a + 42 <= 41) in particular, but that'll be for another
patch.
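
The transform at the source level (a minimal sketch):

#include <limits.h>

/* For unsigned u, u + 3 wraps exactly when u > UINT_MAX - 3, so the
   overflow test folds to a compare against a constant.  */
int
wraps (unsigned u)
{
  return u + 3 < u;              /* folds to: u > UINT_MAX - 3 */
}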


This case is also covered by fold_comparison which should be re-written
to match.pd patterns (and removed from fold-const.c).

fold_binary also as a few interesting/similar equality compare cases
like X +- Y CMP X to Y CMP 0 which look related.

Also your case is in fold_binary for the case of undefined overflow:


As far as I can tell, fold-const.c handles this kind of transformation 
strictly in the case of undefined overflow (or floats), while this is 
strictly in the case of unsigned with wrapping overflow. I thought it would 
be more readable to take advantage of the genmatch machinery and group the 
wrapping transforms in one place, and the undefined overflow ones in another 
place (they don't group the same way by operator, etc).


If you prefer to group by pattern shape and port the related fold-const.c bit 
at the same time, I could try that...



+/* When one argument is a constant, overflow detection can be simplified.
+   Currently restricted to single use so as not to interfere too much with
+   ADD_OVERFLOW detection in tree-ssa-math-opts.c.  */
+(for cmp (lt le ge gt)
+ out (gt gt le le)
+ (simplify
+  (cmp (plus@2 @0 integer_nonzerop@1) @0)
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
+   && TYPE_MAX_VALUE (TREE_TYPE (@0))
+   && single_use (@2))
+   (out @0 (minus { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)
+(for cmp (gt ge le lt)
+ out (gt gt le le)
+ (simplify
+  (cmp @0 (plus@2 @0 integer_nonzerop@1))
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
+   && TYPE_MAX_VALUE (TREE_TYPE (@0))
+   && single_use (@2))
+   (out @0 (minus { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)

please add a comment with the actual transform - A + CST CMP A -> A CMP' CST'


As we are relying on twos-complement wrapping you shouldn't need
TYPE_MAX_VALUE here but you can use wi::max_value (precision, sign).
I'm not sure we have sensible TYPE_MAX_VALUE for vector or complex
types - the accessor uses NUMERICAL_TYPE_CHECK and TYPE_OVERFLOW_WRAPS
checks for ANY_INTEGRAL_TYPE.  Thus I wonder if we should restrict
this to INTEGRAL_TYPE_P (making the wi::max_value route valid).


integer_nonzerop currently already restricts to INTEGER_CST or COMPLEX_CST, 
and I don't think complex can appear in a comparison. I'll go back to writing 
the more explicit INTEGER_CST in the pattern and I'll use wide_int.


Better this way?

By the way, it would be cool to be able to write:
(lt:c @0 @1)

which would expand to both
(lt @0 @1)
(gt @1 @0)

(as per swap_tree_comparison or swapped_tcc_comparison)

--
Marc Glisse

Index: trunk-ovf/gcc/match.pd
===
--- trunk-ovf/gcc/match.pd  (revision 235371)
+++ trunk-ovf/gcc/match.pd  (working copy)
@@ -3071,10 +3071,36 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  /* signbit(x) -> 0 if x is nonnegative.  */
  (SIGNBIT tree_expr_nonnegative_p@0)
  { integer_zero_node; })
 
 (simplify
  /* signbit(x) -> x<0 if x doesn't have signed zeros.  */
  (SIGNBIT @0)
  (if (!HONOR_SIGNED_ZEROS (@0))
   (convert (lt @0 { build_real (TREE_TYPE (@0), dconst0); }
+
+/* When one argument is a constant, overflow detection can be simplified.
+   Currently restricted to single use so as not to interfere too much with
+   ADD_OVERFLOW detection in tree-ssa-math-opts.c.
+   A + CST CMP A  ->  A CMP' CST' */
+(for cmp (lt le ge gt)
+ out (gt gt le le)
+ (simplify
+  (cmp (plus@2 @0 INTEGER_CST@1) @0)
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
+   && wi::ne_p (@1, 0)
+   && single_use (@2))
+   (out @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
+  (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }
+/* A CMP A + CST  ->  A CMP' CST' */
+(for cmp (gt ge le lt)
+ out (gt gt le le)
+ (simplify
+  (cmp @0 (plus@2 @0 INTEGER_CST@1))
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
+   && wi::ne_p (@1, 0)
+   && single_use (@2))
+   (out @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
+  (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }
Index: trunk-ovf/gcc/testsuite/gcc.dg/tree-ssa/overflow-1.c
===
--- trunk-ovf/gcc/testsuite/gcc.dg/tree-ssa/overflow-1.c(revision 0)
+++ trunk-ovf/gcc/testsuite/gcc.dg/tree-ssa/overflow-1.c(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int 

Move "X +- C1 CMP C2 to X CMP C2 -+ C1" to match.pd

2016-04-24 Thread Marc Glisse

Hello,

trying to move a first pattern from fold_comparison.
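
For reference, a sketch of the transform at the source level
(hypothetical constants; the non-equality case relies on signed
overflow being undefined):

/* X + C1 CMP C2  ->  X CMP C2 - C1.  */
int
fold_cmp (int x)
{
  return x + 5 < 10;             /* folds to: x < 5 */
}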

I first tried without single_use. It brought the number of 'free' in 
g++.dg/tree-ssa/pr61034.C down to 11, changed gcc.dg/sms-6.c to only 2 SMS 
(I don't think the generated code was worse, maybe even better, but I 
don't know ppc asm), broke Wstrict-overflow-18.c (the optimization moved 
from VRP to CCP if I remember correctly), and caused IVOPTS to make a mess 
in guality/pr54693-2.c (much longer code, and many <optimized away> debug 
variables). If someone wants to drop the single_use, they can work on that 
after this patch is in.


The conditions do not exactly match the ones in fold-const.c, but I guess 
they are close. The warning in the constant case was missing in 
fold_comparison, but present in VRP, so I had to add it not to regress.


I don't think we were warning much from match.pd. I can't say that I am a 
big fan of those strict overflow warnings, but I expect people would 
complain if we just dropped the existing ones when moving the transforms 
to match.pd?


I wanted to restrict the equality case to TYPE_OVERFLOW_WRAPS || 
TYPE_OVERFLOW_UNDEFINED, but that broke 20041114-1.c at -O1 (no strict 
overflow), so I went with some kind of complement we use elsewhere. Now 
that I am writing this, I don't remember why I didn't just add 
-fstrict-overflow to the testcase, or xfail it at -O1. The saturating case 
could be handled as long as the constant is not an extremum, but I don't 
think we really handle saturating integers anyway.


I split the equality case, because it was already getting ugly.

Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2016-04-25  Marc Glisse  

gcc/
* fold-const.h: Include flag-types.h.
(fold_overflow_warning): Declare.
* fold-const.c (fold_overflow_warning): Make non-static.
(fold_comparison): Move the transformation of X +- C1 CMP C2
into X CMP C2 -+ C1 ...
* match.pd: ... here.
* tree-ssa-forwprop.c (execute): Protect fold_stmt with
fold_defer_overflow_warnings.

gcc/testsuite/
* gcc.dg/tree-ssa/20040305-1.c: Adjust.


--
Marc Glisse

Index: trunk4/gcc/fold-const.c
===
--- trunk4/gcc/fold-const.c (revision 235384)
+++ trunk4/gcc/fold-const.c (working copy)
@@ -290,21 +290,21 @@ fold_undefer_and_ignore_overflow_warning
 
 bool
 fold_deferring_overflow_warnings_p (void)
 {
   return fold_deferring_overflow_warnings > 0;
 }
 
 /* This is called when we fold something based on the fact that signed
overflow is undefined.  */
 
-static void
+void
 fold_overflow_warning (const char* gmsgid, enum warn_strict_overflow_code wc)
 {
   if (fold_deferring_overflow_warnings > 0)
 {
   if (fold_deferred_overflow_warning == NULL
  || wc < fold_deferred_overflow_code)
{
  fold_deferred_overflow_warning = gmsgid;
  fold_deferred_overflow_code = wc;
}
@@ -8366,89 +8366,20 @@ fold_comparison (location_t loc, enum tr
 {
   const bool equality_code = (code == EQ_EXPR || code == NE_EXPR);
   tree arg0, arg1, tem;
 
   arg0 = op0;
   arg1 = op1;
 
   STRIP_SIGN_NOPS (arg0);
   STRIP_SIGN_NOPS (arg1);
 
-  /* Transform comparisons of the form X +- C1 CMP C2 to X CMP C2 -+ C1.  */
-  if ((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
-  && (equality_code
- || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
- && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0
-  && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
-  && !TREE_OVERFLOW (TREE_OPERAND (arg0, 1))
-  && TREE_CODE (arg1) == INTEGER_CST
-  && !TREE_OVERFLOW (arg1))
-{
-  const enum tree_code
-   reverse_op = TREE_CODE (arg0) == PLUS_EXPR ? MINUS_EXPR : PLUS_EXPR;
-  tree const1 = TREE_OPERAND (arg0, 1);
-  tree const2 = fold_convert_loc (loc, TREE_TYPE (const1), arg1);
-  tree variable = TREE_OPERAND (arg0, 0);
-  tree new_const = int_const_binop (reverse_op, const2, const1);
-
-  /* If the constant operation overflowed this can be
-simplified as a comparison against INT_MAX/INT_MIN.  */
-  if (TREE_OVERFLOW (new_const)
- && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg0)))
-   {
- int const1_sgn = tree_int_cst_sgn (const1);
- enum tree_code code2 = code;
-
- /* Get the sign of the constant on the lhs if the
-operation were VARIABLE + CONST1.  */
- if (TREE_CODE (arg0) == MINUS_EXPR)
-   const1_sgn = -const1_sgn;
-
- /* The sign of the constant determines if we overflowed
-INT_MAX (const1_sgn == -1) or INT_MIN (const1_sgn == 1).
-Canonicalize to the INT_MIN overflow by swapping the comparison
-if necessary.  */
- if (const1_sgn == -1)
-   code2 = swap_tree_comparison (code);
-
- /* We now can look at the canonicalized case
-  VARIABLE + 1  CODE2  INT_MIN
-