date:20150511

Re: Fwd: [PING 2][PATCH] libgcc: Add CFI directives to the soft floating point support code for ARM

2015-05-11 Thread Ramana Radhakrishnan




On 10/05/15 23:16, Martin Galvan wrote:

Hi Ramana! Sorry to bother, but I looked at the repository and didn't
see this committed. As I don't have write access could you please
commit this for me?

Thanks a lot!



sorry about the slow response, I was travelling for a bit and missed 
your emails. Trying your patch out gives me failures possibly because my 
mail client munged it when it received this inline.


Can you please rebase if necessary, test and send it again as an 
attachment ?


patch -p1 --dry-run  ~/Downloads/martin-patch.txt
checking file libgcc/config/arm/ieee754-df.S
Hunk #2 FAILED at 57.
Hunk #3 succeeded at 70 with fuzz 2.
Hunk #4 FAILED at 86.
Hunk #5 FAILED at 153.
Hunk #6 FAILED at 418.
Hunk #7 FAILED at 425.
Hunk #8 FAILED at 440.
Hunk #9 FAILED at 462.
Hunk #10 FAILED at 485.
Hunk #11 FAILED at 555.
Hunk #12 FAILED at 566.
Hunk #13 FAILED at 601.
Hunk #14 FAILED at 653.
Hunk #15 FAILED at 720.
Hunk #16 FAILED at 868.
Hunk #17 FAILED at 1057.
Hunk #18 FAILED at 1068.
Hunk #19 FAILED at 1082.
Hunk #20 FAILED at 1090.
Hunk #21 FAILED at 1122.
Hunk #22 FAILED at 1133.
Hunk #23 succeeded at 1145 with fuzz 2.
Hunk #24 FAILED at 1155.
Hunk #25 FAILED at 1168.
Hunk #26 FAILED at 1228.
Hunk #27 succeeded at 1236 with fuzz 2.
Hunk #28 FAILED at 1254.
Hunk #29 succeeded at 1263 with fuzz 2.
Hunk #30 FAILED at 1297.
Hunk #31 succeeded at 1306 with fuzz 2.
Hunk #32 FAILED at 1336.
Hunk #33 succeeded at 1345 with fuzz 2.
Hunk #34 FAILED at 1410.
27 out of 34 hunks FAILED
checking file libgcc/config/arm/ieee754-sf.S
Hunk #1 FAILED at 31.
Hunk #4 FAILED at 294.
Hunk #9 FAILED at 460.
Hunk #10 FAILED at 471.
Hunk #16 FAILED at 845.
Hunk #20 FAILED at 898.
6 out of 27 hunks FAILED
checking file libgcc/config/arm/lib1funcs.S

regards
Ramana


On Tue, Apr 28, 2015 at 2:07 PM, Martin Galvan
martin.gal...@tallertechnologies.com wrote:

Thanks a lot. I don't have write access to the repository, could you
commit this for me?

On Tue, Apr 28, 2015 at 1:21 PM, Ramana Radhakrishnan
ramana@googlemail.com wrote:

On Tue, Apr 28, 2015 at 4:19 PM, Martin Galvan
martin.gal...@tallertechnologies.com wrote:

This patch adds CFI directives to the soft floating point support code for ARM.

Previously, if we tried to do a backtrace from that code in a debug session we'd
get something like this:

(gdb) bt
#0  __nedf2 () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1082
#1  0x0db6 in __aeabi_cdcmple () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1158
#2  0xf5c28f5c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Now we'll get something like this:

(gdb) bt
#0  __nedf2 () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1156
#1  0x0db6 in __aeabi_cdcmple () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1263
#2  0x0dc8 in __aeabi_dcmpeq () at 
../../../../../../gcc-4.9.2/libgcc/config/arm/ieee754-df.S:1285
#3  0x0504 in main ()

I have a company-wide copyright assignment. I don't have commit access, though, 
so it would be great if anyone could commit this for me.

Thanks a lot!



this is OK , thanks. Sorry about the delay in reviewing this.

Ramana

Re: [PATCH 2/6] combine: If recog fails, try again with zero_ext{ract,end} simplified

2015-05-11 Thread Segher Boessenkool

On Sun, May 10, 2015 at 10:15:34PM -0600, Jeff Law wrote:
  (recog_for_combine): If recog fails, try again with the pattern
  modified by change_zero_ext; if that still fails, restore the
  pattern.
 I like it.  Attacking the extensions are the most obvious candidates, 
 but I wonder if there's others (like the whole ASHIFT vs MULT stuff 
 that we were recently looking at for ARM).

Yeah, I thought about that as well.  But that case, MULT instead of shift,
outside of MEM, is simply non-canonical RTL; and combine created it on
purpose.  It just shouldn't.

There certainly will be other cases though; hopefully not overlapping,
I don't want to recog an exponentially expanding number of times ;-)


Segher

Re: [PATCH][PR66013] Update address_taken after ifn_va_arg expansion

2015-05-11 Thread Richard Biener

On Fri, May 8, 2015 at 9:16 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 08-05-15 17:31, Richard Biener wrote:

 OK for trunk?

 As noted in one of the PRs I think that it is the proper time to
 re-implement the stdarg optimization on the un-lowered form which
 should also fix this.


 AFAIU, the implementation of the stdarg optimization on the un-lowered form
 should contain an independent fix for this problem, and this fix won't be
 functional any more then, so indeed, it's a patch with (hopefully) a limited
 lifespan.

 But I thought that the patch is simple and low-risk enough to accept for
 trunk now, hence the submission.

 Also, I suppose it's not bad to start developing from a situation where more
 tests are passing.

Sure - but this adds a pass over the IL plus an SSA update to the compile.  Sth
which may also enable va-unrelated optimization and thus may cause regressions
if you remove it later...

Otherwise I agree of course.

Thanks,
Richard.

 Thanks,
 - Tom

Re: [C frontend] Fix construction of TYPE_STUB_DECL

2015-05-11 Thread Richard Biener

On Sun, May 10, 2015 at 7:33 PM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,
 TREE_PUBLIC of TYPE_DECL is defined to say if the type is public:
 /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL,
nonzero means name is to be accessible from outside this translation unit.
In an IDENTIFIER_NODE, nonzero means an external declaration
accessible from outside this translation unit was previously seen
for this name in an inner scope.  */
 #define TREE_PUBLIC(NODE) ((NODE)-base.public_flag)

 This is properly honored by C++ FE but other FEs are bit random, which in turn
 confuses type_in_anonymous_namespace_p predicate that leads to flase poistives
 on type mismatch warnings.  I used to be able to get around by checking only
 C++ types at LTO time, but with type checking in lto-symtab I can not, because
 I do want to compare type compatibility cross translation units and cross 
 languages
 and we have no reliable way to say what type originated as C++ and what did 
 not.

The idea was you can walk up the context chain until you reach a
TRANSLATION_UNIT_DECL
where the source language is encoded in...  of course most FEs are
quite lazy here and
eventually end up at NULL_TREE instead.

Richard.

 This fixed TYPE_STUB_DECl construction in C frontend. I will check other
 FEs separately. I can also add way to recognize C++ types, but I think it is
 good idea to make type representation consistent across FEs.

 Bootstrapped/regtested x86_64-linux, OK?

 Honza

 * c-decl.c (pushtag): Declare type as public.
 Index: c/c-decl.c
 ===
 --- c/c-decl.c  (revision 222981)
 +++ c/c-decl.c  (working copy)
 @@ -1563,6 +1563,7 @@ pushtag (location_t loc, tree name, tree

TYPE_STUB_DECL (type) = pushdecl (build_decl (loc,
 TYPE_DECL, NULL_TREE, type));
 +  TREE_PUBLIC (TYPE_STUB_DECL (type)) = 1;

/* An approximation for now, so we can tell this is a function-scope tag.
   This will be updated in pop_scope.  */

Re: [PATCH, alpha]: Remove dead (HOST_BITS_PER_WIDE_INT 64) code

2015-05-11 Thread Uros Bizjak

On Fri, May 8, 2015 at 10:23 PM, Richard Henderson r...@redhat.com wrote:
 On 05/08/2015 09:43 AM, Uros Bizjak wrote:
 @@ -1509,8 +1509,7 @@
   (and:DI (ashift:DI (match_operand:DI 1 register_operand r)
  (match_operand:DI 2 mul8_operand I))
   (match_operand:DI 3 immediate_operand i)))]
 -  HOST_BITS_PER_WIDE_INT == 64
 -CONST_INT_P (operands[3])
 +  CONST_INT_P (operands[3])

 Just noticed that we can eliminate this test if we use const_int_operand 
 instead.

 +  operands[2] = force_reg (DImode, GEN_INT (HOST_WIDE_INT_1  63));)

 I bet we're now supposed to use HOST_WIDE_INT_1U, to avoid any ubsan shift
 silliness.

 @@ -1929,7 +1925,7 @@ alpha_emit_set_const_1 (rtx target, machine_mode m
   temp = alpha_emit_set_const (subtarget, mode, new_const, i, 
 no_output);
   if (!temp)
 {
 - new_const = (c  bits) | (((HOST_WIDE_INT) 1  bits) - 1);
 + new_const = (c  bits) | ((HOST_WIDE_INT_1  bits) - 1);

 Likewise.

Thanks for the review!

I have committed the attached patch.

Uros.
Index: config/alpha/alpha.c
===
--- config/alpha/alpha.c(revision 222913)
+++ config/alpha/alpha.c(working copy)
@@ -1771,11 +1771,9 @@ alpha_emit_set_const_1 (rtx target, machine_mode m
   rtx temp, insn;
 
   /* If this is a sign-extended 32-bit constant, we can do this in at most
- three insns, so do it if we have enough insns left.  We always have
- a sign-extended 32-bit constant when compiling on a narrow machine.  */
+ three insns, so do it if we have enough insns left.  */
 
-  if (HOST_BITS_PER_WIDE_INT != 64
-  || c  31 == -1 || c  31 == 0)
+  if (c  31 == -1 || c  31 == 0)
 {
   HOST_WIDE_INT low = ((c  0x) ^ 0x8000) - 0x8000;
   HOST_WIDE_INT tmp1 = c - low;
@@ -1917,11 +1915,9 @@ alpha_emit_set_const_1 (rtx target, machine_mode m
   /* Now try high-order zero bits.  Here we try the shifted-in bits as
 all zero and all ones.  Be careful to avoid shifting outside the
 mode and to avoid shifting outside the host wide int size.  */
-  /* On narrow hosts, don't shift a 1 into the high bit, since we'll
-confuse the recursive call and set all of the high 32 bits.  */
 
   bits = (MIN (HOST_BITS_PER_WIDE_INT, GET_MODE_SIZE (mode) * 8)
- - floor_log2 (c) - 1 - (HOST_BITS_PER_WIDE_INT  64));
+ - floor_log2 (c) - 1);
   if (bits  0)
for (; bits  0; bits--)
  {
@@ -1929,7 +1925,7 @@ alpha_emit_set_const_1 (rtx target, machine_mode m
temp = alpha_emit_set_const (subtarget, mode, new_const, i, 
no_output);
if (!temp)
  {
-   new_const = (c  bits) | (((HOST_WIDE_INT) 1  bits) - 1);
+   new_const = (c  bits) | ((HOST_WIDE_INT_1U  bits) - 1);
temp = alpha_emit_set_const (subtarget, mode, new_const,
 i, no_output);
  }
@@ -1955,7 +1951,7 @@ alpha_emit_set_const_1 (rtx target, machine_mode m
temp = alpha_emit_set_const (subtarget, mode, new_const, i, 
no_output);
if (!temp)
  {
-   new_const = (c  bits) | (((HOST_WIDE_INT) 1  bits) - 1);
+   new_const = (c  bits) | ((HOST_WIDE_INT_1U  bits) - 1);
temp = alpha_emit_set_const (subtarget, mode, new_const,
 i, no_output);
  }
@@ -1969,7 +1965,6 @@ alpha_emit_set_const_1 (rtx target, machine_mode m
  }
 }
 
-#if HOST_BITS_PER_WIDE_INT == 64
   /* Finally, see if can load a value into the target that is the same as the
  constant except that all bytes that are 0 are changed to be 0xff.  If we
  can, then we can do a ZAPNOT to obtain the desired constant.  */
@@ -1996,7 +1991,6 @@ alpha_emit_set_const_1 (rtx target, machine_mode m
   target, 0, OPTAB_WIDEN);
}
 }
-#endif
 
   return 0;
 }
@@ -2077,7 +2071,7 @@ alpha_emit_set_long_const (rtx target, HOST_WIDE_I
   HOST_WIDE_INT d1, d2, d3, d4;
 
   /* Decompose the entire word */
-#if HOST_BITS_PER_WIDE_INT = 64
+
   gcc_assert (c2 == -(c1  0));
   d1 = ((c1  0x) ^ 0x8000) - 0x8000;
   c1 -= d1;
@@ -2087,17 +2081,6 @@ alpha_emit_set_long_const (rtx target, HOST_WIDE_I
   c1 -= d3;
   d4 = ((c1  0x) ^ 0x8000) - 0x8000;
   gcc_assert (c1 == d4);
-#else
-  d1 = ((c1  0x) ^ 0x8000) - 0x8000;
-  c1 -= d1;
-  d2 = ((c1  0x) ^ 0x8000) - 0x8000;
-  gcc_assert (c1 == d2);
-  c2 += (d2  0);
-  d3 = ((c2  0x) ^ 0x8000) - 0x8000;
-  c2 -= d3;
-  d4 = ((c2  0x) ^ 0x8000) - 0x8000;
-  gcc_assert (c2 == d4);
-#endif
 
   /* Construct the high word */
   if (d4)
@@ -2138,16 +2121,11 @@ alpha_extract_integer (rtx x, HOST_WIDE_INT *p0, H
   i0 = INTVAL (x);
   i1 = -(i0  0);
 }
-  else if

RE: [patch, avr] extend part-clobbered check to AVR_TINY architecture

2015-05-11 Thread Sivanupandi, Pitchumani

 -Original Message-
 From: Denis Chertykov [mailto:cherty...@gmail.com]
 Sent: Sunday, May 10, 2015 12:55 PM
 To: Sivanupandi, Pitchumani
 Cc: Georg-Johann Lay; GCC Patches
 Subject: Re: [patch, avr] extend part-clobbered check to AVR_TINY
 architecture
 
 Sorry for delay. (I was at vacation in Kazakhstan without internet.)
 
 
 2015-05-08 8:32 GMT+03:00 Sivanupandi, Pitchumani
 pitchumani.sivanupa...@atmel.com:
  Ping!
 
  -Original Message-
  From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
  ow...@gcc.gnu.org] On Behalf Of Sivanupandi, Pitchumani
  Sent: Tuesday, April 21, 2015 8:21 PM
  To: Georg-Johann Lay; Denis Chertykov
  Cc: GCC Patches
  Subject: [patch, avr] extend part-clobbered check to AVR_TINY
  architecture
 
  Hi,
 
  When tried backporting AVR_TINY architecture support to 4.9, build
  failed in libgcc for AVR_TINY.
  Failure was due to ICE same as:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53065
 
  Fix provided for that bug checks for if the mode crosses the callee
  saved register.
  Below patch updates that check as the AVR_TINY has different set of
  callee saved registers (r18 and r19).
 
  This patch is against trunk.
 
  NOTE: ICE is re-produciable only with 4.9 + tiny patch and
  --with-dwarf2 enabled.
 
  Is this ok for trunk?
 
  diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c index
  68d5ddc..2f441e5 100644
  --- a/gcc/config/avr/avr.c
  +++ b/gcc/config/avr/avr.c
  @@ -11333,9 +11333,10 @@ avr_hard_regno_call_part_clobbered
 (unsigned
  regno, machine_mode mode)
   return 0;
 
 /* Return true if any of the following boundaries is crossed:
  - 17/18, 27/28 and 29/30.  */
  + 17/18 or 19/20 (if AVR_TINY), 27/28 and 29/30.  */
 
  -  return ((regno  18  regno + GET_MODE_SIZE (mode)  18)
  +  return ((regno = LAST_CALLEE_SAVED_REG 
  +   regno + GET_MODE_SIZE (mode)  (LAST_CALLEE_SAVED_REG +
  + 1))
 || (regno  REG_Y  regno + GET_MODE_SIZE (mode)  REG_Y)
 || (regno  REG_Z  regno + GET_MODE_SIZE (mode) 
  REG_Z));  }
 
 
 I think it's ok.

Could you please commit? I do not have commit access.

(--patch--)
diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c index 68d5ddc..2f441e5 
100644
--- a/gcc/config/avr/avr.c
+++ b/gcc/config/avr/avr.c
@@ -11333,9 +11333,10 @@ avr_hard_regno_call_part_clobbered (unsigned regno, 
machine_mode mode)
 return 0;

   /* Return true if any of the following boundaries is crossed:
- 17/18, 27/28 and 29/30.  */
+ 17/18 or 19/20 (if AVR_TINY), 27/28 and 29/30.  */

-  return ((regno  18  regno + GET_MODE_SIZE (mode)  18)
+  return ((regno = LAST_CALLEE_SAVED_REG 
+   regno + GET_MODE_SIZE (mode)  (LAST_CALLEE_SAVED_REG + 1))
   || (regno  REG_Y  regno + GET_MODE_SIZE (mode)  REG_Y)
   || (regno  REG_Z  regno + GET_MODE_SIZE (mode)  REG_Z));  }
(--patch--)

Regards,
Pitchumani

gcc/ChangeLog
2015-05-11  Pitchumani Sivanupandi  pitchuman...@atmel.com

* config/avr/avr.c (avr_hard_regno_call_part_clobbered): Use
LAST_CALLEE_SAVED_REG instead of hard-coded register number.
(Last callee saved reg is different for AVR_TINY architecture)

Re: [PATCH][PR66010] Don't take address of ap unless necessary

2015-05-11 Thread Richard Biener

On Fri, 8 May 2015, Tom de Vries wrote:

 Hi,
 
 this patch fixes PR66010.
 
 
 I.
 
 Consider this test-case, with a va_list passed from f2 to f2_1:
 ...
 #include stdarg.h
 
 inline int __attribute__((always_inline))
 f2_1 (va_list ap)
 {
   return va_arg (ap, int);
 }
 
 int
 f2 (int i, ...)
 {
   int res;
   va_list ap;
 
   va_start (ap, i);
   res = f2_1 (ap);
   va_end (ap);
 
   return res;
 }
 ...
 
 When compiling at -O2, before eline we have:
 ...
 f2_1 (struct  * apD.1832)
 {
   intD.6 _3;
 
   # .MEM_2 = VDEF .MEM_1(D)
   # USE = anything
   # CLB = anything
   _3 = VA_ARG (apD.1832, 0B);
 
   # VUSE .MEM_2
   return _3;
 }
 
 f2 (intD.6 iD.1835)
 {
   struct  apD.1839[1];
   intD.6 resD.1838;
   intD.6 _6;
 
   # .MEM_2 = VDEF .MEM_1(D)
   # USE = anything
   # CLB = anything
   __builtin_va_startD.1030 (apD.1839, 0);
 
   # .MEM_3 = VDEF .MEM_2
   # USE = anything
   # CLB = anything
   res_4 = f2_1D.1833 (apD.1839);
 
   # .MEM_5 = VDEF .MEM_3
   # USE = anything
   # CLB = anything
   __builtin_va_endD.1029 (apD.1839);
 
   _6 = res_4;
 
   # .MEM_7 = VDEF .MEM_5
   apD.1839 ={v} {CLOBBER};
 
   # VUSE .MEM_7
   return _6;
 }
 ...
 
 Because the va_list type is an array type:
 ...
   struct  apD.1839[1];
 ...
 
 we're passing the location of the initial element:
 ...
   res_4 = f2_1D.1833 (apD.1839);
 ...
 
 And the type of the parameter in f2_1 is accordingly a pointer to array
 element:
 ...
 f2_1 (struct  * apD.1832)
 ...
 
 That means the address operator here is superfluous.
 ...
   _3 = VA_ARG (apD.1832, 0B);
 ...
 Or, differently put, when we take the address of ap in va_start and va_end in
 f2, the result is a pointer to array element type.
 When we take the address of ap in f2_2, the result is a pointer to pointer to
 array element type.
 
 This extra indirection doesn't cause wrong code to be generated. The va_arg
 expansion code handles this correctly, thanks to the combination of:
 - an unconditional build_fold_indirect_ref in expand_ifn_va_arg_1 which
 removes
   an indirection, and
 - a fixup in gimplify_va_arg_internal that again adds an indirection in some
   cases.
 
 The call gets inlined, and before pass_stdarg we have:
 ...
 f2 (intD.6 iD.1835)
 {
   struct  * apD.1849;
   struct  apD.1839[1];
   intD.6 _6;
 
   # .MEM_2 = VDEF .MEM_1(D)
   # USE = nonlocal escaped
   # CLB = nonlocal escaped { D.1839 } (escaped)
   __builtin_va_startD.1030 (apD.1839, 0);
 
   # .MEM_3 = VDEF .MEM_2
   apD.1849 = apD.1839;
 
   # .MEM_7 = VDEF .MEM_3
   # USE = nonlocal null { D.1839 D.1849 } (escaped)
   # CLB = nonlocal null { D.1839 D.1849 } (escaped)
   _6 = VA_ARG (apD.1849, 0B);
 ...
 
 And after expanding ifn_va_arg, we have:
 ...
 f2 (intD.6 iD.1835)
 {
   struct  * ap.3D.1853;
   struct  apD.1839[1];
 
   # .MEM_2 = VDEF .MEM_1(D)
   # USE = nonlocal escaped
   # CLB = nonlocal escaped { D.1839 } (escaped)
   __builtin_va_startD.1030 (apD.1839, 0);
 
   ap_3 = apD.1839;
 
   ap.3_10 = ap_3;
 
   # VUSE .MEM_2
   _11 = ap.3_10-gp_offsetD.2;
   SNIP
 ...
 
 The pass_stdarg optimization fails:
 ...
 f2: va_list escapes 1, needs to save all GPR units and all FPR units.
 ...
 
 It fails on the superfluous address operator:
 ...
 va_list escapes in ap_3 = apD.1839;
 ...
 
 
 II.
 
 The patch prevents the superfluous address operator from being added.
 
 It also removes the need for the fixup in gimplify_va_arg_internal, by
 deciding in gimplify_va_arg_expr whether the build_fold_indirect_ref in
 expand_ifn_va_arg_1 is needed before passing ap on to
 gimplify_va_arg_internal. This decision is encoded as an extra argument to
 ifn_va_arg.
 
 
 III.
 
 Using the patch, before inlining we can see the address operator has been
 removed in va_arg:
 ...
 f2_1 (struct  * apD.1832)
 {
   intD.6 _4;
 
   # .MEM_3 = VDEF .MEM_1(D)
   # USE = anything
   # CLB = anything
 
   _4 = VA_ARG (ap_2(D), 0B, 0);
   # VUSE .MEM_3
   return _4;
 }
 ...
 
 And after inlining, we see that va_start and va_arg now use the same ap:
 ...
 f2 (intD.6 iD.1835)
 {
   struct  apD.1839[1];
 
   # .MEM_2 = VDEF .MEM_1(D)
   # USE = anything
   # CLB = anything
   __builtin_va_startD.1030 (apD.1839, 0);
 
   # .MEM_3 = VDEF .MEM_2
   # USE = anything
   # CLB = anything
   _8 = VA_ARG (apD.1839, 0B, 0);
 ...
 
 That allows the pass_stdarg optimization to succeed:
 ...
 f2: va_list escapes 0, needs to save 8 GPR units and 0 FPR units.
 ...
 
 Bootstrapped and reg-tested on x86_64, with and without -m32.
 
 OK for trunk?
 
 [ FWIW, I suspect this patch will make life easier for the reimplementation of
 the pass_stdarg optimization using ifn_va_arg. ]

+  if (canon_va_type != NULL)
+{
+  if (!(TREE_CODE (canon_va_type) == ARRAY_TYPE
+TREE_CODE (va_type) != ARRAY_TYPE))
+   /* In gimplify_va_arg_expr we take the address of the ap argument, 
mark
+  it addressable now.  */
+   mark_addressable (expr);

can we simplify this and ...

-   }
-
+  gcc_assert (TREE_CODE (TREE_TYPE

Fix pattern validation in genrecog

2015-05-11 Thread Richard Sandiford

Thomas pointed out that, while I'd kept the code to validate patterns
for things like missing modes, the code wasn't being used.  Fixed with
the patch below.

I ended up reinstating the original code to create a single rtx
pattern for a define_peephole2 sequence, so that it could be
passed to validate_pattern as before.  One of the reasons I got
rid of the code originally was because I didn't like the idea of
a sequence of define_peephole2 instructions being put into a
PARALLEL -- there's nothing parallel about them, and it would be
easy to confuse the result with a real parallel in a define_insn
or define_split match.  I therefore changed it to use SEQUENCE instead.
(Not that it matters: nothing actually looks at the code.)

I also threw in a check for zero-length define_peephole2s while there.
This is enforced syntactically for other define_*s, but it's possible
to write a define_peephole2 that just asks for scratch registers and
doesn't actually match anything.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* genrecog.c (match_pattern_1): Expect the pattern to be a SEQUENCE
for define_peephole2s.
(get_peephole2_pattern): New function.
(main): Use it.  Call validate_pattern.

Index: gcc/genrecog.c
===
--- gcc/genrecog.c  2015-05-10 10:45:41.750134123 +0100
+++ gcc/genrecog.c  2015-05-10 10:45:41.746134173 +0100
@@ -4080,14 +4080,14 @@ match_pattern_2 (state *s, rtx top_patte
(2) the rtx matches TOP_PATTERN and
(3) C_TEST is true.
 
-   For peephole2, TOP_PATTERN is the DEFINE_PEEPHOLE2 itself, otherwise
-   it is the rtx pattern to match (PARALLEL, SET, etc.).  */
+   For peephole2, TOP_PATTERN is a SEQUENCE of the instruction patterns
+   to match, otherwise it is a single instruction pattern.  */
 
 static void
 match_pattern_1 (state *s, rtx top_pattern, const char *c_test,
 acceptance_type acceptance)
 {
-  if (GET_CODE (top_pattern) == DEFINE_PEEPHOLE2)
+  if (acceptance.type == PEEPHOLE2)
 {
   /* Match each individual instruction.  */
   position **subpos_ptr = peep2_insn_pos_list;
@@ -4095,18 +4095,14 @@ match_pattern_1 (state *s, rtx top_patte
   for (int i = 0; i  XVECLEN (top_pattern, 0); ++i)
{
  rtx x = XVECEXP (top_pattern, 0, i);
- /* Ignore scratch register requirements.  */
- if (GET_CODE (x) != MATCH_SCRATCH  GET_CODE (x) != MATCH_DUP)
-   {
- position *subpos = next_position (subpos_ptr, root_pos,
-   POS_PEEP2_INSN, count);
- if (count  0)
-   s = add_decision (s, rtx_test::peep2_count (count + 1),
- true, false);
- s = match_pattern_2 (s, top_pattern, subpos, x);
- subpos_ptr = subpos-next;
- count += 1;
-   }
+ position *subpos = next_position (subpos_ptr, root_pos,
+   POS_PEEP2_INSN, count);
+ if (count  0)
+   s = add_decision (s, rtx_test::peep2_count (count + 1),
+ true, false);
+ s = match_pattern_2 (s, top_pattern, subpos, x);
+ subpos_ptr = subpos-next;
+ count += 1;
}
   acceptance.u.full.u.match_len = count - 1;
 }
@@ -5165,6 +5161,30 @@ add_implicit_parallel (rtvec vec)
 }
 }
 
+/* Return the rtx pattern for the list of rtxes in a define_peephole2.  */
+
+static rtx
+get_peephole2_pattern (rtvec vec)
+{
+  int i, j;
+  rtx pattern = rtx_alloc (SEQUENCE);
+  XVEC (pattern, 0) = rtvec_alloc (GET_NUM_ELEM (vec));
+  for (i = j = 0; i  GET_NUM_ELEM (vec); i++)
+{
+  rtx x = RTVEC_ELT (vec, i);
+  /* Ignore scratch register requirements.  */
+  if (GET_CODE (x) != MATCH_SCRATCH  GET_CODE (x) != MATCH_DUP)
+   {
+ XVECEXP (pattern, 0, j) = x;
+ j++;
+   }
+}
+  XVECLEN (pattern, 0) = j;
+  if (j == 0)
+error_with_line (pattern_lineno, empty define_peephole2);
+  return pattern;
+}
+
 /* Return true if *PATTERN_PTR is a PARALLEL in which at least one trailing
rtx can be added automatically by add_clobbers.  If so, update
*ACCEPTANCE_PTR so that its num_clobbers field contains the number
@@ -5231,20 +5251,20 @@ main (int argc, char **argv)
   if (desc == NULL)
break;
 
-  rtx pattern;
-
   acceptance_type acceptance;
   acceptance.partial_p = false;
   acceptance.u.full.code = next_insn_code;
 
+  rtx pattern;
   switch (GET_CODE (desc))
{
case DEFINE_INSN:
  {
/* Match the instruction in the original .md form.  */
-   pattern = add_implicit_parallel (XVEC (desc, 1));
acceptance.type = RECOG;
acceptance.u.full.u.num_clobbers = 0;
+   pattern = add_implicit_parallel (XVEC (desc, 1));
+   validate_pattern (pattern,

PR 66076: invalid vec_grow in rtx iterators

2015-05-11 Thread Richard Sandiford

The rtx iterators start out using a stack worklist but fall back to
a heap worklist if the stack one turns out to be too small (which is rare).
PR 66076 showed up a bug in the code that makes the switch from the stack
worklist to the heap worklist: the heap one might already have been
allocated if the worklist structure was shared with a previous
FOR_EACH_SUBRTX and if that FOR_EACH_SUBRTX also blew the stack worklist.
Fixed by making the vec_safe_grow conditional on a vec_safe_length check.
(Note that vec_safe_grow starts with a call to vec_safe_length,
so this is no less efficient than a check for a null pointer.)

If one FOR_EACH_SUBRTX needs to use the heap worklist, it might be slightly
more efficient for future FOR_EACH_SUBRTXes on the same worklist structure
to start out using the heap worklist, in order to avoid any future stack-to-heap
memcpy.  The problem is that that would require extra set-up code for all
FOR_EACH_SUBRTXes, and the idea is to optimise for the common case where
the stack worklist is enough.  The situation handled by the patch should
be very rare -- backed up by the fact that the ICE would always trigger
in that case, yet it didn't show up in pre-release testing.

Maybe it would be worth bumping up the size of the stack worklist to cope
with those long AVX512 constants though.

Tested on x86_64-linux-gnu.  Also tested that the testcase passes on
aarch64-elf.  OK to install?

Thanks,
Richard


gcc/
PR rtl-optimization/66076
* rtlanal.c (generic_subrtx_iterator T::add_single_to_queue):
Don't grow the heap array if it is already big enough from a
previous iteration.

gcc/testsuite/
PR rtl-optimization/66076
* gcc.dg/torture/pr66076.c: New test.

Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   2015-05-10 10:44:24.091059530 +0100
+++ gcc/rtlanal.c   2015-05-10 20:49:56.618230132 +0100
@@ -104,7 +104,10 @@ generic_subrtx_iterator T::add_single_
  return base;
}
   gcc_checking_assert (i == LOCAL_ELEMS);
-  vec_safe_grow (array.heap, i + 1);
+  /* A previous iteration might also have moved from the stack to the
+heap, in which case the heap array will already be big enough.  */
+  if (vec_safe_length (array.heap) = i)
+   vec_safe_grow (array.heap, i + 1);
   base = array.heap-address ();
   memcpy (base, array.stack, sizeof (array.stack));
   base[LOCAL_ELEMS] = x;
Index: gcc/testsuite/gcc.dg/torture/pr66076.c
===
--- /dev/null   2015-04-20 08:05:53.260830006 +0100
+++ gcc/testsuite/gcc.dg/torture/pr66076.c  2015-05-10 20:49:56.626230040 
+0100
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options  } */
+/* { dg-options -mno-prefer-avx128 -march=bdver4 { target i?86-*-* 
x86_64-*-* } } */
+
+void
+f0a (char *result, char *arg1, char *arg4, char temp_6)
+{
+  int idx = 0;
+  for (idx = 0; idx  416; idx += 1)
+result[idx] = (arg1[idx] + arg4[idx]) * temp_6;
+}

Re: PR 66076: invalid vec_grow in rtx iterators

2015-05-11 Thread Eric Botcazou

 gcc/
   PR rtl-optimization/66076
   * rtlanal.c (generic_subrtx_iterator T::add_single_to_queue):
   Don't grow the heap array if it is already big enough from a
   previous iteration.
 
 gcc/testsuite/
   PR rtl-optimization/66076
   * gcc.dg/torture/pr66076.c: New test.

OK, thanks.

-- 
Eric Botcazou

Re: PATCH] PR target/65612: Multiversioning doesn't work with DSO nor PIE

2015-05-11 Thread Jakub Jelinek

On Mon, May 11, 2015 at 11:20:15AM +0100, Szabolcs Nagy wrote:
 
 
 On 09/05/15 19:57, Szabolcs Nagy wrote:
  * H.J. Lu hjl.to...@gmail.com [2015-05-09 10:41:41 -0700]:
  There are
 
   4: 2b70   806 FUNCGLOBAL DEFAULT   12
  __cpu_indicator_init@GCC_4.8.0
  38: 002153d016 OBJECT  GLOBAL DEFAULT   25 
  __cpu_model@GCC_4.8.0
 
  and
 
  0215000  00040001 R_X86_64_64
  2b70 __cpu_indicator_init@GCC_4.8.0 + 0
  00215220  00260006 R_X86_64_GLOB_DAT
  002153d0 __cpu_model@GCC_4.8.0 + 0
 
  in libgcc_s.so.1.  Musl ld.so must be fixed to handle it.
 
 
 Rich is looking at how to do this non-intrusively, but
 it seems non-trivial (some users of musl prefer not to
 resolve such versioned symbols).
 
  
  i think it might be enough to add __cpu_indicator_init_local
  as an alias to __cpu_indicator_init in libgcc.a and then use
  the *_local symbol from the ifunc resolver, that way no new
  dependency is added to libgcc_s.so handling.
 
 i tried this approach and it seems to work: passes all
 multiversioning tests on x86_64.
 
 i think it's no worse than the symver approach.
 
 is it ok to change the current fix to this?

No.  Instead of piling hacks like this just fix it in musl.
libgcc certainly isn't the only library that uses @ symbol versions,
e.g. libstdc++ does as well, as well as many other shared libraries.

Jakub

Re: [patch 0/10] debug-early merge

2015-05-11 Thread Aldy Hernandez

Perfect. Thanks for doing this.

Aldy

On May 11, 2015 4:23 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 Hi Aldy, 

 On 08/05/15 01:30, Aldy Hernandez wrote: 
  Hi folks. 
  
  I have divided the patches into 10 pieces.  The patches are 
  interdependent and cannot be applied independently.  I am merely 
  dividing them up to aid the relevant reviewers. 
  
  As I've mentioned elsewhere, the patchset as posted has been 
  bootstrapped and GCC tested on: 
  
  x86_64-unknown-linux-gnu 
  powerpc-ibm-aix7.1.2.0 
  powerpc64-unknown-linux-gnu 
  aarch64-unknown-linux-gnu 

 I bootstrapped the branch (aldyh/debug-early from the git mirror, I hope ;)) 
 on arm-none-linux-gnueabihf --with-float=hard --with-cpu=cortex-a15 
 --with-mode=thumb --with-fpu=neon-vfpv4 --enable-languages=c,c++,fortran 
 which is my usual arm bootstrap config. 

 It completed ok. 

 Cheers, 
 Kyrill 

  
  I have also GDB tested the patchset on x86_64-linux. 
  
  Thanks for your help in this ordeal. 
  
  Bring it on! 
  Aldy

Re: [PATCH, ARM] Fix testcase for PR64616

2015-05-11 Thread Ramana Radhakrishnan

On Mon, May 11, 2015 at 10:43 AM, Thomas Preud'homme
thomas.preudho...@arm.com wrote:
 Hi,

 Testcase made for PR64616 was only passing when using a litteral pool. Rather 
 than having an alternative for systems where this is not true, this patch 
 changes the test to check that a global copy propagation occurs in cprop2. 
 This should work accross all ARM targets (it works when targetting Cortex-M0, 
 Cortex-M3 and whatever default core for ARMv7-a with vfpv3-d16 FPU).

 ChangeLog entry is as follows:

 *** gcc/testsuite/ChangeLog ***

 2015-05-04  Thomas Preud'homme  thomas.preudho...@arm.com

 * gcc.target/arm/pr64616.c: Test dump rather than assembly to work
 accross ARM targets.

 diff --git a/gcc/testsuite/gcc.target/arm/pr64616.c 
 b/gcc/testsuite/gcc.target/arm/pr64616.c
 index c686ffa..2280f21 100644
 --- a/gcc/testsuite/gcc.target/arm/pr64616.c
 +++ b/gcc/testsuite/gcc.target/arm/pr64616.c
 @@ -1,5 +1,5 @@
  /* { dg-do compile } */
 -/* { dg-options -O2 } */
 +/* { dg-options -O2 -fdump-rtl-cprop2 } */

  int f (int);
  unsigned int glob;
 @@ -11,4 +11,5 @@ g ()
glob = 5;
  }

 -/* { dg-final { scan-assembler-times ldr 2 } } */
 +/* { dg-final { scan-rtl-dump GLOBAL COPY-PROP cprop2 } } */
 +/* { dg-final { cleanup-rtl-dump cprop2 } } */


 Patch was tested by verifying that the pattern appears when targeting 
 Cortex-M0, Cortex-M3 and
 the default core for ARMv7-a with vfpv3-d16 FPU.

 Best regards,

 Thomas




OK - thanks.

Ramana

Re: [patch 0/10] debug-early merge

2015-05-11 Thread Kyrill Tkachov


Hi Aldy,

On 08/05/15 01:30, Aldy Hernandez wrote:

Hi folks.

I have divided the patches into 10 pieces.  The patches are
interdependent and cannot be applied independently.  I am merely
dividing them up to aid the relevant reviewers.

As I've mentioned elsewhere, the patchset as posted has been
bootstrapped and GCC tested on:

x86_64-unknown-linux-gnu
powerpc-ibm-aix7.1.2.0
powerpc64-unknown-linux-gnu
aarch64-unknown-linux-gnu


I bootstrapped the branch (aldyh/debug-early from the git mirror, I hope ;))
on arm-none-linux-gnueabihf --with-float=hard --with-cpu=cortex-a15 
--with-mode=thumb --with-fpu=neon-vfpv4 --enable-languages=c,c++,fortran which 
is my usual arm bootstrap config.

It completed ok.

Cheers,
Kyrill



I have also GDB tested the patchset on x86_64-linux.

Thanks for your help in this ordeal.

Bring it on!
Aldy

Re: [PATCH, ARM] attribute target (thumb,arm) [3/6] respin (4th)

2015-05-11 Thread Christian Bruel



binunvnEzkM1k.bin
Description: PGP/MIME version identification


encrypted.asc
Description: OpenPGP encrypted message

Re: [PATCH, ARM] attribute target (thumb,arm) [1/6] respin (4th)

2015-05-11 Thread Ramana Radhakrishnan

On Mon, May 11, 2015 at 10:13 AM, Christian Bruel
christian.br...@st.com wrote:


 OK with those changes.



 Ramana


 thanks, done

 following up the thumb_code cleanup, here is a missing chunk for the vxworks
 config.

 arm-vxworks build checked. ok for trunk ?

 thanks,

 Christian


OK thanks  - please post the version of p1 that you committed for
archival purposes.

Ramana

Re: [rs6000] Fix compare debug failure on AIX

2015-05-11 Thread Eric Botcazou

 I'm sorry that XCOFF debugging changes the generated code (only in the
 sense of allocating a frame), but that is a system dependency.  It's
 been this way for over 20 years.  I see no reason to produce worse
 code at -O0 when not debugging simply to make testcases happier.

You apparently read my previous message too quickly, this was reported to 
AdaCore on real code, i.e. someone opened a ticket and complained that the 
compiler was generating different code with and without -g for XCOFF.

Note that someone a little perverse could easily use this obvious defect to 
prevent the compiler from bootstrapping on AIX. ;-)

-- 
Eric Botcazou

Re: [PATCH, ARM] attribute target (thumb,arm) [1/6] respin (4th)

2015-05-11 Thread Christian Bruel




OK with those changes.



Ramana



thanks, done

following up the thumb_code cleanup, here is a missing chunk for the 
vxworks config.


arm-vxworks build checked. ok for trunk ?

thanks,

Christian
 2015-05-11  Christian Bruel  christian.br...@st.com
 
	* config/arm/arm-protos.h (thumb_code, thumb1_code): Remove.
	* config/arm/vxworks.h (thumb_code): Replace with TARGET_THUMB.

Index: gcc/config/arm/arm-protos.h
===
--- gcc/config/arm/arm-protos.h	(revision 222997)
+++ gcc/config/arm/arm-protos.h	(working copy)
@@ -462,12 +462,6 @@
 /* Nonzero if tuning for Cortex-A9.  */
 extern int arm_tune_cortex_a9;
 
-/* Nonzero if generating Thumb instructions.  */
-extern int thumb_code;
-
-/* Nonzero if generating Thumb-1 instructions.  */
-extern int thumb1_code;
-
 /* Nonzero if we should define __THUMB_INTERWORK__ in the
preprocessor.
XXX This is a bit of a hack, it's intended to help work around
Index: gcc/config/arm/vxworks.h
===
--- gcc/config/arm/vxworks.h	(revision 222997)
+++ gcc/config/arm/vxworks.h	(working copy)
@@ -40,7 +40,7 @@
   builtin_define (CPU=ARMARCH5);		\
 else if (arm_arch4)\
   {		\
-	if (thumb_code)\
+	if (TARGET_THUMB)			\
 	  builtin_define (CPU=ARMARCH4_T);	\
 	else	\
 	  builtin_define (CPU=ARMARCH4);	\

[PATCH, ARM] Fix testcase for PR64616

2015-05-11 Thread Thomas Preud'homme

Hi,

Testcase made for PR64616 was only passing when using a litteral pool. Rather 
than having an alternative for systems where this is not true, this patch 
changes the test to check that a global copy propagation occurs in cprop2. This 
should work accross all ARM targets (it works when targetting Cortex-M0, 
Cortex-M3 and whatever default core for ARMv7-a with vfpv3-d16 FPU).

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2015-05-04  Thomas Preud'homme  thomas.preudho...@arm.com

* gcc.target/arm/pr64616.c: Test dump rather than assembly to work
accross ARM targets.

diff --git a/gcc/testsuite/gcc.target/arm/pr64616.c 
b/gcc/testsuite/gcc.target/arm/pr64616.c
index c686ffa..2280f21 100644
--- a/gcc/testsuite/gcc.target/arm/pr64616.c
+++ b/gcc/testsuite/gcc.target/arm/pr64616.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options -O2 } */
+/* { dg-options -O2 -fdump-rtl-cprop2 } */
 
 int f (int);
 unsigned int glob;
@@ -11,4 +11,5 @@ g ()
   glob = 5;
 }
 
-/* { dg-final { scan-assembler-times ldr 2 } } */
+/* { dg-final { scan-rtl-dump GLOBAL COPY-PROP cprop2 } } */
+/* { dg-final { cleanup-rtl-dump cprop2 } } */


Patch was tested by verifying that the pattern appears when targeting 
Cortex-M0, Cortex-M3 and
the default core for ARMv7-a with vfpv3-d16 FPU.

Best regards,

Thomas

Re: [Patch, Fortran, 66035, v1] [5/6 Regression] gfortran ICE segfault

2015-05-11 Thread Andre Vehreschild

Hi Mikael,

  diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
  index cf607d0..402d9b9 100644
  --- a/gcc/fortran/trans-expr.c
  +++ b/gcc/fortran/trans-expr.c
  @@ -6881,6 +6881,30 @@ alloc_scalar_allocatable_for_subcomponent_assignment
  (stmtblock_t *block, TREE_TYPE (tmp), tmp,
 fold_convert (TREE_TYPE (tmp),
  size)); }
  +  else if (cm-ts.type == BT_CLASS)
  +{
  +  gcc_assert (expr2-ts.type == BT_CLASS || expr2-ts.type ==
  BT_DERIVED);
  +  if (expr2-ts.type == BT_DERIVED)
  +   {
  + tmp = gfc_get_symbol_decl (gfc_find_vtab (expr2-ts));
  + tmp = gfc_build_addr_expr (NULL_TREE, tmp);
  + size = fold_convert (size_type_node, gfc_vptr_size_get (tmp));
  +   }
 Use TYPE_SIZE_UNIT of the rhs in this case, in the same way as in the
 else branch further below.

Er, but when I get TYPE_SIZE_UNIT () correctly, then it will grab the size
needed to store the %_vptr%size component. Do you really intent that? I need to
alloc the size of the polymorphic type of expr2 here.

snipp

  @@ -7008,7 +7032,9 @@ gfc_trans_subcomponent_assign (tree dest,
  gfc_component * cm, gfc_expr * expr, gfc_add_expr_to_block (block, tmp);
   }
 else if (init  (cm-attr.allocatable
  -  || (cm-ts.type == BT_CLASS  CLASS_DATA
  (cm)-attr.allocatable)))
  +  || (cm-ts.type == BT_CLASS  CLASS_DATA (cm)-attr.allocatable
  +   (expr-ts.type != BT_CLASS
  +  || CLASS_DATA (expr)-attr.allocatable
 maybe: || !CLASS_DATA (expr)-attr.allocatable
 (with a '!')?

No, I only want to copy the rhs to new memory, when it is allocatable. For all
other cases, one should not do this, to prevent a memory leak. Furthermore, is
the data copied to freshly allocated memory anyway. Have a look at the pseudo
code generated for the testcase in the patch.

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de

Re: [C frontend] Fix construction of TYPE_STUB_DECL

2015-05-11 Thread Jan Hubicka

 On Sun, May 10, 2015 at 7:33 PM, Jan Hubicka hubi...@ucw.cz wrote:
  Hi,
  TREE_PUBLIC of TYPE_DECL is defined to say if the type is public:
  /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL,
 nonzero means name is to be accessible from outside this translation 
  unit.
 In an IDENTIFIER_NODE, nonzero means an external declaration
 accessible from outside this translation unit was previously seen
 for this name in an inner scope.  */
  #define TREE_PUBLIC(NODE) ((NODE)-base.public_flag)
 
  This is properly honored by C++ FE but other FEs are bit random, which in 
  turn
  confuses type_in_anonymous_namespace_p predicate that leads to flase 
  poistives
  on type mismatch warnings.  I used to be able to get around by checking only
  C++ types at LTO time, but with type checking in lto-symtab I can not, 
  because
  I do want to compare type compatibility cross translation units and cross 
  languages
  and we have no reliable way to say what type originated as C++ and what did 
  not.
 
 The idea was you can walk up the context chain until you reach a
 TRANSLATION_UNIT_DECL
 where the source language is encoded in...  of course most FEs are
 quite lazy here and
 eventually end up at NULL_TREE instead.

Yep, I can walk up and look for NAMESPACE_DECL without TREE_PUBLIC flag, too,
or cleanup the flag in free_lang_data when I know if we do C++ (probably).
The predicate for anonymous_namespace is called on few quite hot paths in
devirt code, but that is not my main concern either.

In general I think we ought to fix frontends to agree with each other (and
docs) on how to represent semantics instead of adding special cases and
workarounds to middle-end/LTO.  I am trying to keep the anonymous namespace
thing indepent of C++. While I am not sure we support other language with
similar notion, it seems to make sense independently of C++. After all, it
gives the type escape info for free :)

Honza
 
 Richard.

Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-05-11 Thread Kyrill Tkachov



On 01/05/15 19:51, Jeff Law wrote:

On 04/28/2015 03:54 AM, Kyrill Tkachov wrote:


On 27/04/15 21:13, Jeff Law wrote:

On 04/21/2015 11:33 AM, Kyrill Tkachov wrote:

On 21/04/15 15:09, Jeff Law wrote:

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:

   From reading config/stormy16/stormy-abi it seems to me that we
don't
pass arguments partially in stormy16, so this code would never be
called
there. That leaves pa as the potential problematic target.
I don't suppose there's an easy way to test on pa? My checkout of
binutils
doesn't seem to include a sim target for it.

No simulator, no machines in the testfarm, the box I had access to via
parisc-linux.org seems dead and my ancient PA overheats well before a
bootstrap could complete.  I often regret knowing about the backwards
way many things were done on the PA because it makes me think about
cases that only matter on dead architectures.

So what should be the action plan here? I can't add an assert on
positive result as a negative result is valid.

We want to catch the case where this would cause trouble on
pa, or change the patch until we're confident that it's fine
for pa.

That being said, reading the documentation of STACK_GROWS_UPWARD
and ARGS_GROW_DOWNWARD I'm having a hard time visualising a case
where this would cause trouble on pa.

Is the problem that in the function:

+/* Add SIZE to X and check whether it's greater than Y.
+   If it is, return the constant amount by which it's greater or
smaller.
+   If the two are not statically comparable (for example, X and Y
contain
+   different registers) return -1.  This is used in expand_push_insn to
+   figure out if reading SIZE bytes from location X will end up reading
from
+   location Y.  */
+static int
+memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
+{
+  rtx tmp = plus_constant (Pmode, x, size);
+  rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);
+
+  if (!CONST_INT_P (sub))
+return -1;
+
+  return INTVAL (sub);
+}

for ARGS_GROW_DOWNWARD we would be reading 'backwards' from x,
so the function should something like the following?

So I had to go back and compile some simple examples.

References to outgoing arguments will be SP relative. References to the
incoming arguments will be ARGP relative.  And that brings me to the
another issue.  Isn't X in this context the incoming argument slot and
the destination an outgoing argument slot?

If so, the approach of memory_load_overlap simply won't work on a target
with calling conventions like the PA.  And you might really want to
consider punting for these kind of calling conventions


Ok, thanks for the guidance.
How about this? This patch disables sibcall optimisation when
encountering a partial argument when ARGS_GROW_DOWNWARD 
!STACK_GROWS_DOWNWARD.
Hopefully this shouldn't harm codegen on parisc if, as you say, it's
rare to have
partial arguments anyway on PA due to the large number of argument regs.

I tested this on arm and bootstrapped on x86_64.
I am now going through the process of getting access to a Debian PA
machine to
give it a test there (thanks Dave!)

Ok if testing comes clean?

Thanks,
Kyrill

2015-04-28  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 PR target/65358
 * calls.c (expand_call): Cancel sibcall optimisation when encountering
 partial argument on targets with ARGS_GROW_DOWNWARD and
 !STACK_GROWS_DOWNWARD.
 * expr.c (memory_load_overlap): New function.
 (emit_push_insn): When pushing partial args to the stack would
 clobber the register part load the overlapping part into a pseudo
 and put it into the hard reg after pushing.

2015-04-28  Honggyu Kim  hong.gyu@lge.com

 PR target/65358
 * gcc.dg/pr65358.c: New test.

The more I think about this, the more I think it's an ugly can of worms and 
maybe we should just disable sibcalls for partial arguments.  I doubt it's a 
big performance issue in general.


We already have quite a bit of code in calls.c to detect cases with partial 
argument overlap for the
explicit purpose of allowing sibcalls when partial arguments occur in the 
general case. However, that
code only detects when a partial argument overlaps with other arguments in a 
call. In this PR the
partial argument overlaps with itself. It would be a shame to disable sibcalls 
for all partial arguments
when there is already infrastructure in place to handle them.




In addition to the argument/stack direction stuff, I've been pondering the stack/frame/arg pointer issues.  Your approach assumes that the incoming and outgoing areas are always referenced off the same base register.  If they aren't, then 
the routine returns no overlap.


But we'd need to consider the case where we have a reference to the arg or 
frame pointer which later gets rewritten into a stack pointer relative address.

Is it too late at the point were you do the checks to reject the sibling call?  If not, then maybe the overlap routine should return a tri-state.  No overlap, overlap, don't

Re: PATCH] PR target/65612: Multiversioning doesn't work with DSO nor PIE

2015-05-11 Thread Szabolcs Nagy



On 09/05/15 19:57, Szabolcs Nagy wrote:
 * H.J. Lu hjl.to...@gmail.com [2015-05-09 10:41:41 -0700]:
 There are

  4: 2b70   806 FUNCGLOBAL DEFAULT   12
 __cpu_indicator_init@GCC_4.8.0
 38: 002153d016 OBJECT  GLOBAL DEFAULT   25 
 __cpu_model@GCC_4.8.0

 and

 0215000  00040001 R_X86_64_64
 2b70 __cpu_indicator_init@GCC_4.8.0 + 0
 00215220  00260006 R_X86_64_GLOB_DAT
 002153d0 __cpu_model@GCC_4.8.0 + 0

 in libgcc_s.so.1.  Musl ld.so must be fixed to handle it.


Rich is looking at how to do this non-intrusively, but
it seems non-trivial (some users of musl prefer not to
resolve such versioned symbols).

 
 i think it might be enough to add __cpu_indicator_init_local
 as an alias to __cpu_indicator_init in libgcc.a and then use
 the *_local symbol from the ifunc resolver, that way no new
 dependency is added to libgcc_s.so handling.

i tried this approach and it seems to work: passes all
multiversioning tests on x86_64.

i think it's no worse than the symver approach.

is it ok to change the current fix to this?


libgcc/Changelog:

2015-05-11  Szabolcs Nagy  szabolcs.n...@arm.com

* config/i386/cpuinfo.c (__cpu_indicator_init_local): Add.
(__cpu_indicator_init@GCC_4.8.0, __cpu_model@GCC_4.8.0): Remove.

* config/i386/t-linux (HOST_LIBGCC2_CFLAGS): Remove -DUSE_ELF_SYMVER.

gcc/Changelog:

2015-05-11  Szabolcs Nagy  szabolcs.n...@arm.com

* config/i386/i386.c (ix86_expand_builtin): Make __builtin_cpu_init
call __cpu_indicator_init_local instead of __cpu_indicator_init.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7bd9ff3..7327cf3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -38398,10 +38398,10 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
 {
 case IX86_BUILTIN_CPU_INIT:
   {
-	/* Make it call __cpu_indicator_init in libgcc. */
+	/* Make it call __cpu_indicator_init_local in libgcc.a. */
 	tree call_expr, fndecl, type;
 type = build_function_type_list (integer_type_node, NULL_TREE); 
-	fndecl = build_fn_decl (__cpu_indicator_init, type);
+	fndecl = build_fn_decl (__cpu_indicator_init_local, type);
 	call_expr = build_call_expr (fndecl, 0); 
 	return expand_expr (call_expr, target, mode, EXPAND_NORMAL);
   }
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index f6f91dd..86b6d21 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -425,7 +425,7 @@ __cpu_indicator_init (void)
   return 0;
 }
 
-#if defined SHARED  defined USE_ELF_SYMVER
-__asm__ (.symver __cpu_indicator_init, __cpu_indicator_init@GCC_4.8.0);
-__asm__ (.symver __cpu_model, __cpu_model@GCC_4.8.0);
+#ifndef SHARED
+int __cpu_indicator_init_local (void)
+  __attribute__ ((weak, alias (__cpu_indicator_init)));
 #endif
diff --git a/libgcc/config/i386/t-linux b/libgcc/config/i386/t-linux
index 11bb46e..4f47f7b 100644
--- a/libgcc/config/i386/t-linux
+++ b/libgcc/config/i386/t-linux
@@ -3,4 +3,4 @@
 # t-slibgcc-elf-ver and t-linux
 SHLIB_MAPFILES = libgcc-std.ver $(srcdir)/config/i386/libgcc-glibc.ver
 
-HOST_LIBGCC2_CFLAGS += -mlong-double-80 -DUSE_ELF_SYMVER
+HOST_LIBGCC2_CFLAGS += -mlong-double-80

Re: [PATCH, ARM] committed: attribute target (thumb,arm) [1/6] respin (4th)

2015-05-11 Thread Christian Bruel


here the p1 patch committed at rev 222995

On 05/11/2015 11:56 AM, Ramana Radhakrishnan wrote:

On Mon, May 11, 2015 at 10:13 AM, Christian Bruel
christian.br...@st.com wrote:



OK with those changes.



Ramana



thanks, done

following up the thumb_code cleanup, here is a missing chunk for the vxworks
config.

arm-vxworks build checked. ok for trunk ?

thanks,

Christian



OK thanks  - please post the version of p1 that you committed for
archival purposes.

Ramana

Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 222994)
+++ gcc/ChangeLog	(revision 222995)
@@ -1,3 +1,15 @@
+2014-09-23  Christian Bruel  christian.br...@st.com
+
+	* config/arm/arm.c (arm_option_override): Reoganized and split into :
+	(arm_option_params_internal); New function.
+	(arm_option_check_internal): New function.
+	(arm_option_override_internal): New function.
+	(thumb_code, thumb1_code): Remove.
+	* config/arm/arm.h (TREE_TARGET_THUMB, TREE_TARGET_THUMB1): New macros.
+	(TREE_TARGET_THUM2, TREE_TARGET_ARM): Likewise.
+	(thumb_code, thumb1_code): Remove.
+	* config/arm/arm.md (is_thumb, is_thumb1): Check TARGET flag.
+
 2015-05-11  Uros Bizjak  ubiz...@gmail.com
 
 	* config/alpha/alpha.c (alpha_emit_set_const_1)
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c	(revision 222994)
+++ gcc/config/arm/arm.c	(revision 222995)
@@ -846,12 +846,6 @@
 /* Nonzero if tuning for Cortex-A9.  */
 int arm_tune_cortex_a9 = 0;
 
-/* Nonzero if generating Thumb instructions.  */
-int thumb_code = 0;
-
-/* Nonzero if generating Thumb-1 instructions.  */
-int thumb1_code = 0;
-
 /* Nonzero if we should define __THUMB_INTERWORK__ in the
preprocessor.
XXX This is a bit of a hack, it's intended to help work around
@@ -2669,6 +2663,150 @@
   return std_gimplify_va_arg_expr (valist, type, pre_p, post_p);
 }
 
+/* Check any incompatible options that the user has specified.  */
+static void
+arm_option_check_internal (struct gcc_options *opts)
+{
+  /* Make sure that the processor choice does not conflict with any of the
+ other command line choices.  */
+  if (TREE_TARGET_ARM (opts)  !(insn_flags  FL_NOTM))
+error (target CPU does not support ARM mode);
+
+  /* TARGET_BACKTRACE calls leaf_function_p, which causes a crash if done
+ from here where no function is being compiled currently.  */
+  if ((TARGET_TPCS_FRAME || TARGET_TPCS_LEAF_FRAME)  TREE_TARGET_ARM (opts))
+warning (0, enabling backtrace support is only meaningful when compiling for the Thumb);
+
+  if (TREE_TARGET_ARM (opts)  TARGET_CALLEE_INTERWORKING)
+warning (0, enabling callee interworking support is only meaningful when compiling for the Thumb);
+
+  /* If this target is normally configured to use APCS frames, warn if they
+ are turned off and debugging is turned on.  */
+  if (TREE_TARGET_ARM (opts)
+   write_symbols != NO_DEBUG
+   !TARGET_APCS_FRAME
+   (TARGET_DEFAULT  MASK_APCS_FRAME))
+warning (0, -g with -mno-apcs-frame may not give sensible debugging);
+
+  /* iWMMXt unsupported under Thumb mode.  */
+  if (TREE_TARGET_THUMB (opts)  TARGET_IWMMXT)
+error (iWMMXt unsupported under Thumb mode);
+
+  if (TARGET_HARD_TP  TREE_TARGET_THUMB1 (opts))
+error (can not use -mtp=cp15 with 16-bit Thumb);
+
+  if (TREE_TARGET_THUMB (opts)  TARGET_VXWORKS_RTP  flag_pic)
+{
+  error (RTP PIC is incompatible with Thumb);
+  flag_pic = 0;
+}
+
+  /* We only support -mslow-flash-data on armv7-m targets.  */
+  if (target_slow_flash_data
+   ((!(arm_arch7  !arm_arch_notm)  !arm_arch7em)
+	  || (TREE_TARGET_THUMB1 (opts) || flag_pic || TARGET_NEON)))
+error (-mslow-flash-data only supports non-pic code on armv7-m targets);
+}
+
+/* Set params depending on attributes and optimization options.  */
+static void
+arm_option_params_internal (struct gcc_options *opts)
+{
+ /* If we are not using the default (ARM mode) section anchor offset
+ ranges, then set the correct ranges now.  */
+  if (TREE_TARGET_THUMB1 (opts))
+{
+  /* Thumb-1 LDR instructions cannot have negative offsets.
+ Permissible positive offset ranges are 5-bit (for byte loads),
+ 6-bit (for halfword loads), or 7-bit (for word loads).
+ Empirical results suggest a 7-bit anchor range gives the best
+ overall code size.  */
+  targetm.min_anchor_offset = 0;
+  targetm.max_anchor_offset = 127;
+}
+  else if (TREE_TARGET_THUMB2 (opts))
+{
+  /* The minimum is set such that the total size of the block
+ for a particular anchor is 248 + 1 + 4095 bytes, which is
+ divisible by eight, ensuring natural spacing of anchors.  */
+  targetm.min_anchor_offset = -248;
+  targetm.max_anchor_offset = 4095;
+}
+  else
+{
+  targetm.min_anchor_offset = TARGET_MIN_ANCHOR_OFFSET;
+  targetm.max_anchor_offset = TARGET_MAX_ANCHOR_OFFSET;
+

Re: [PATCH][MIPS] Enable load-load/store-store bonding

2015-05-11 Thread sameera


On Tuesday 21 April 2015 12:39 AM, Matthew Fortune wrote:

Sameera Deshpande sameera.deshpa...@imgtec.com writes:

Gentle reminder!


Thanks Sameera. Just a couple of comments inline below and a question
for Catherine at the end.


- Thanks and regards,
Sameera D.

On Monday 30 March 2015 04:58 PM, sameera wrote:

Hi!

Sorry for delay in sending this patch for review.
Please find attached updated patch.

In P5600, 2 consecutive loads/stores of same type which access
contiguous memory locations are bonded together by instruction issue
unit to dispatch single load/store instruction which accesses both

locations. This allows 2X improvement in memory intensive code. This
optimization can be performed for LH, SH, LW, SW, LWC, SWC, LDC, SDC
instructions.


This patch adds peephole2 patterns to identify such loads/stores, and
put them in parallel, so that the scheduler will not split it -

thereby guaranteeing h/w level load/store bonding.


The patch is tested with dejagnu for correctness, and tested on

hardware for performance.

Ok for trunk?

Changelog:
gcc/
  * config/mips/mips.md (JOIN_MODE): New mode iterator.
  (join2_load_StoreJOIN_MODE:mode): New pattern.
  (join2_loadhi): Likewise.
  (define_peehole2): Add peephole2 patterns to join 2 HI/SI/SF/DF-

mode

  load-load and store-stores.
  * config/mips/mips.opt (mload-store-pairs): New option.
  (TARGET_LOAD_STORE_PAIRS): New macro.
  *config/mips/mips.h (ENABLE_LD_ST_PAIRS): Likewise.
  *config/mips/mips-protos.h (mips_load_store_bonding_p): New

prototype.

  *config/mips/mips.c(mips_load_store_bonding_p): New function.


I don't know if this has been corrupted by mail clients but a single
space after '*' and a space before '('.


diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index b48e04f..244eb8d 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -360,6 +360,7 @@ extern bool mips_epilogue_uses (unsigned int);
extern void mips_final_prescan_insn (rtx_insn *, rtx *, int);
extern int mips_trampoline_code_size (void);
extern void mips_function_profiler (FILE *);
+extern bool mips_load_store_bonding_p (rtx *, machine_mode, bool);

typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx);
#ifdef RTX_CODE
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 1733457..85f0591 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -18241,6 +18241,64 @@ umips_load_store_pair_p_1 (bool load_p, bool swap_p,
   return true;
}

+bool
+mips_load_store_bonding_p (rtx *operands, enum machine_mode mode, bool load_p)


Remove enum from machine_mode.


+{
+  rtx reg1, reg2, mem1, mem2, base1, base2;
+  enum reg_class rc1, rc2;
+  HOST_WIDE_INT offset1, offset2;
+
+  if (load_p)
+{
+  reg1 = operands[0];
+  reg2 = operands[2];
+  mem1 = operands[1];
+  mem2 = operands[3];
+}
+  else
+{
+  reg1 = operands[1];
+  reg2 = operands[3];
+  mem1 = operands[0];
+  mem2 = operands[2];
+}
+
+  if (mips_address_insns (XEXP (mem1, 0), mode, false) == 0
+  || mips_address_insns (XEXP (mem2, 0), mode, false) == 0)
+return false;
+
+  mips_split_plus (XEXP (mem1, 0), base1, offset1);
+  mips_split_plus (XEXP (mem2, 0), base2, offset2);
+
+  /* Base regs do not match.  */
+  if (!REG_P (base1) || !rtx_equal_p (base1, base2))
+return false;
+
+  /* Either of the loads is clobbering base register.  */
+  if (load_p
+   (REGNO (reg1) == REGNO (base1)
+ || (REGNO (reg2) == REGNO (base1
+return false;


Can you add a comment saying that this case does not get bonded by
any known hardware even though it could be valid to bond them if it
is the second load that clobbers the base.


+  /* Loading in same registers.  */
+  if (load_p
+   REGNO (reg1) == REGNO (reg2))
+return false;
+
+  /* The loads/stores are not of same type.  */
+  rc1 = REGNO_REG_CLASS (REGNO (reg1));
+  rc2 = REGNO_REG_CLASS (REGNO (reg2));
+  if (rc1 != rc2
+   !reg_class_subset_p (rc1, rc2)
+   !reg_class_subset_p (rc2, rc1))
+return false;
+
+  if (abs (offset1 - offset2) != GET_MODE_SIZE (mode))
+return false;
+
+  return true;
+}
+
/* OPERANDS describes the operands to a pair of SETs, in the order
dest1, src1, dest2, src2.  Return true if the operands can be used
in an LWP or SWP instruction; LOAD_P says which.  */
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index ec69ed5..1bd0dae 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -3147,3 +3147,7 @@ extern GTY(()) struct target_globals *mips16_globals;
#define STANDARD_STARTFILE_PREFIX_1 /lib64/
#define STANDARD_STARTFILE_PREFIX_2 /usr/lib64/
#endif
+
+#define ENABLE_LD_ST_PAIRS \
+  (TARGET_LOAD_STORE_PAIRS  TUNE_P5600 \
+!TARGET_MICROMIPS  !TARGET_FIX_24K)


I've already forgotten why these extra micromips/fix24k conditions
were required. Can you add a comment?


diff --git

Re: PATCHes to help with C++11 bootstrap

2015-05-11 Thread Markus Trippelsdorf

On 2015.05.08 at 23:30 -0500, Jason Merrill wrote:
 One C++11 compatibility issue that turns up a lot in the GCC sources is 
 that in C++98,
 
 #define BAR bar
 const char *p = fooBAR;

There was a missing fix for gcc/config/rs6000/option-defaults.h. 
This broke bootstrap on ppc64.
Fix committed as obvious.

2015-05-11  Markus Trippelsdorf  mar...@trippelsdorf.de

PR bootstrap/66105
* config/rs6000/option-defaults.h: Add space between string literal
and macro name.

diff --git a/gcc/config/rs6000/option-defaults.h 
b/gcc/config/rs6000/option-defaults.h
index 7da2c7c77b86..95a147206186 100644
--- a/gcc/config/rs6000/option-defaults.h
+++ b/gcc/config/rs6000/option-defaults.h
@@ -39,11 +39,11 @@
 #endif
 
 #if TARGET_DEFAULT  OPTION_MASK_64BIT
-#define OPT_ARCH64 !OPT_32
+#define OPT_ARCH64 ! OPT_32
 #define OPT_ARCH32 OPT_32
 #else
 #define OPT_ARCH64 OPT_64
-#define OPT_ARCH32 !OPT_64
+#define OPT_ARCH32 ! OPT_64
 #endif
 
 /* Support for a compile-time default CPU, et cetera.  The rules are:

-- 
Markus

[Patch][loop-invariant.c] Fix a couple of bugs regarding loop invariant motion discovered by spec2k6 on aarch64

2015-05-11 Thread David Sherwood

Hi,

This patch fixes a couple of issues I saw during the compilation of shell_lam.f
for 410.bwaves test in spec2006:

* create_new_invariant: We shouldn't bother attempting to calculate the address
cost for something that clearly isn't an address. Use a SCALAR_INT_MODE_P check
to eliminate the most obvious cases, such as vector modes and so on.
* get_inv_cost: Change the code so that we don't treat something as an address
if it is never actually used as an address, i.e. we didn't use it at all. This
occurs due to the way we process the innermost loop first - invariants get
pulled out of inner most loop and then get processed during next innermost loop.
Here is more detail from shell_lam.f.210r.loop2_invariant dump file that
shows what currently happens:

   4427 *ending processing of loop 27 **

   4442 (const_vector:V2DF [
   4443 (const_double:DF 4.0e+0 [0x0.8p+3])
    (const_double:DF 4.0e+0 [0x0.8p+3])
   4445 ])
   4446 
   4447 Hot cost: 8 (final)
   4448 (set (reg:V2DF 3179)
   4449 (const_vector:V2DF [
   4450 (const_double:DF 4.0e+0 [0x0.8p+3])
   4451 (const_double:DF 4.0e+0 [0x0.8p+3])
   4452 ]))
   4453 
   4454 Hot cost: 8 (final)
   4455 Set in insn 2764 is invariant (1), cost 8, depends on 

   4471 Decided to move invariant 1 -- gain 8

move_invariant_reg moves invariant (the const_vector above) into 3490 and
replaces use of 3179 with 3490, but conservatively keeps the definition of 3179,
which gets processed in outer loop, but is never used.

Debug for outer loop now follows:

   4497 *starting processing of loop 26 **

   6379 (insn 2764 2748 2766 110 (set (reg:V2DF 3490)
   6380 (const_vector:V2DF [
   6381 (const_double:DF 4.0e+0 [0x0.8p+3])
   6382 (const_double:DF 4.0e+0 [0x0.8p+3])
   6383 ])) shell_lam.f:287 819 {*aarch64_simd_movv2df}
   6384  (nil))
   6385 ;;   UD chains for insn luid 88 uid 2766

   6604 ;;   UD chains for insn luid 21 uid 3836
   6605 ;;  reg 3490 { d419(bb 110 insn 2764) }
   6606 (insn 3836 2763 2765 111 (set (reg:V2DF 3179)
   6607 (reg:V2DF 3490)) shell_lam.f:287 -1
   6608  (nil))
^^^ additional insn generated by move_invariant_reg from loop 27

   6836 ;;   UD chains for insn luid 40 uid 2790
   6837 ;;  reg 3161 { d272(bb 111 insn 2746) }
   6838 ;;  reg 3201 { d305(bb 111 insn 2788) }
   6839 ;;  reg 3490 { d419(bb 110 insn 2764) }
   6840 ;;   eq_note reg 3161 { d272(bb 111 insn 2746) }
   6841 ;;   eq_note reg 3201 { d305(bb 111 insn 2788) }
   6842 (insn 2790 2788 2791 111 (set (reg:V2DF 3203 [ vect__930.427 ])
   6843 (fma:V2DF (reg:V2DF 3161 [ MEM[base: vectp_q.371_2137, index: 
ivtmp.679_3214,
offset: 0B] ])
   6844 (neg:V2DF (reg:V2DF 3490))
   6845 (reg:V2DF 3201 [ vect__928.424 ]))) shell_lam.f:287 1219 
{fnmav2df4}
   6846  (expr_list:REG_DEAD (reg:V2DF 3201 [ vect__928.424 ])
   6847 (expr_list:REG_DEAD (reg:V2DF 3179) ...
^^^ 3179 marked as DEAD

   8366 *ending processing of loop 26 **

   8472 (const_vector:V2DF [
   8473 (const_double:DF 4.0e+0 [0x0.8p+3])
   8474 (const_double:DF 4.0e+0 [0x0.8p+3])
   8475 ])
   8476 
   8477 Hot cost: 8 (final)
   8478 (set (reg:V2DF 3490)
   8479 (const_vector:V2DF [
   8480 (const_double:DF 4.0e+0 [0x0.8p+3])
   8481 (const_double:DF 4.0e+0 [0x0.8p+3])
   8482 ]))
   8483 
   8484 Hot cost: 8 (final)
   8485 Set in insn 2764 is invariant (12), cost 8, depends on 

   8505 (set (reg:V2DF 3179)
   8506 (reg:V2DF 3490))
   8507 
   8508 Hot cost: 8 (final)
   8509 Set in insn 3836 is invariant (15), cost 8, depends on 12
^^^ Never moves 3179 invariant out, even though it's never used.

After my patch we now get this:

8586 Decided to move invariant 15 -- gain 16
8587 Decided to move dependent invariant 12

since get_inv_cost sees the number of uses is zero and the dependent invariant
(12) gets moved too. Of course, without my patch the definition of 3179 would
ultimately be eliminated as dead code in a later pass anyway.

Is this ok to go in?

Regards,
David Sherwood.

ChangeLog entry follows ...

2015-05-08  David Sherwood  david.sherw...@arm.com

* loop-invariant.c (create_new_invariant): Don't calculate address cost
if mode is not scalar integers.
(get_inv_cost): Increase computational cost for unused invariants.



loop-invariant.patch
Description: Binary data

Re: PATCH] PR target/65612: Multiversioning doesn't work with DSO nor PIE

2015-05-11 Thread Szabolcs Nagy

On 11/05/15 11:31, Jakub Jelinek wrote:
 On Mon, May 11, 2015 at 11:20:15AM +0100, Szabolcs Nagy wrote:
 i think it might be enough to add __cpu_indicator_init_local
 as an alias to __cpu_indicator_init in libgcc.a and then use
 the *_local symbol from the ifunc resolver, that way no new
 dependency is added to libgcc_s.so handling.

 i tried this approach and it seems to work: passes all
 multiversioning tests on x86_64.

 i think it's no worse than the symver approach.

 is it ok to change the current fix to this?
 
 No.  Instead of piling hacks like this just fix it in musl.
 libgcc certainly isn't the only library that uses @ symbol versions,
 e.g. libstdc++ does as well, as well as many other shared libraries.
 

can you explain how using a standard elf feature is a hack,
but the current symver asm directive is not?

fyi, musl loader loads libstdc++ just fine because it has no
relocations for symbols which only has sym@version definition
(libgcc_s.so.1 has because __cpu_indicator_init is a ctor).

musl may end up supporting @version but that's an independent
quest.

(the one big pile of hacks here is multiversioning: each function
with mv has its own ifunc resolver repeating the same logic, then
the 1.5K cpuinfo.o is static linked into every single dso that
uses mv and the separate __cpu_model structs all have to be
initialized.. including the unused one in libgcc_s.so.1 because
of the ctor.. adding more startup overhead.

i'd gladly propose a patch to remove this feature if getting rid
of piling hacks has priority.. the current design seems problematic
to me for other archs that may need to call libc functions to do
the dispatch anyway.)

Re: PATCH] PR target/65612: Multiversioning doesn't work with DSO nor PIE

2015-05-11 Thread Jakub Jelinek

On Mon, May 11, 2015 at 01:39:17PM +0100, Szabolcs Nagy wrote:
 fyi, musl loader loads libstdc++ just fine because it has no

But will it load any shared library that uses any of the 26 (if I count well
on x86_64) @ symbols from libstdc++.so.6?
readelf -Ws /lib64/libstdc++.so.6 | grep '@' | grep -v 'UND\|@@'
For libstdc++ that is primarily C++ apps and shared libraries
compiled/linked with GCC 4.0.0.

 musl may end up supporting @version but that's an independent
 quest.

It is not independent.  If musl claims to support symbol versioning, it
should support it properly, if not, then supposedly gcc configured for musl
can't be compatible with gcc configured for other linux C libraries, and
should force symbol versioning off.

Jakub

[PATCH] combine: Don't create (set (reg:CC) (compare (reg:CC) (const0)))

2015-05-11 Thread Segher Boessenkool

There already was code to just use the original reg:CC, but it was
positioned incorrectly: if the original code (that this RTL is
simplified from) did not already start with a COMPARE (or not in
the right mode), it didn't trigger.  But it is valid in that case
as well.

This then allows merging the other two arms of this conditional;
do so.

Bootstrapped and regression tested on powerpc64-linux and x86_64-linux;
no regressions.  Also built toolchains and Linux kernels for every
arch where that works; no new failures on that.

Any objections?


Segher


2015-05-11  Segher Boessenkool  seg...@kernel.crashing.org

* combine.c (simplify_set): When generating a CC set, if the
source already is in the correct mode, do not wrap it in a
compare.  Simplify the rest of that code.

---
 gcc/combine.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 896d9d2..51f78a5 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -6675,20 +6675,16 @@ simplify_set (rtx x)
   if (other_changed)
undobuf.other_insn = other_insn;
 
-  /* Otherwise, if we didn't previously have a COMPARE in the
-correct mode, we need one.  */
-  if (GET_CODE (src) != COMPARE || GET_MODE (src) != compare_mode)
-   {
- SUBST (SET_SRC (x), gen_rtx_COMPARE (compare_mode, op0, op1));
- src = SET_SRC (x);
-   }
-  else if (GET_MODE (op0) == compare_mode  op1 == const0_rtx)
+  /* Don't generate a compare of a CC with 0, just use that CC.  */
+  if (GET_MODE (op0) == compare_mode  op1 == const0_rtx)
{
  SUBST (SET_SRC (x), op0);
  src = SET_SRC (x);
}
-  /* Otherwise, update the COMPARE if needed.  */
-  else if (XEXP (src, 0) != op0 || XEXP (src, 1) != op1)
+  /* Otherwise, if we didn't previously have the same COMPARE we
+want, create it from scratch.  */
+  else if (GET_CODE (src) != COMPARE || GET_MODE (src) != compare_mode
+  || XEXP (src, 0) != op0 || XEXP (src, 1) != op1)
{
  SUBST (SET_SRC (x), gen_rtx_COMPARE (compare_mode, op0, op1));
  src = SET_SRC (x);
-- 
1.8.1.4

Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-05-11 Thread Michael Matz

Hi,

On Wed, 6 May 2015, Rich Felker wrote:

 I don't see how this case is improved unless GCC is failing to consider 
 strong definitions in the same TU as locally-binding.

Interposition of non-static non-inline non-weak symbols is supported 
independend of if they are defined in the same TU or not (if you're 
producing a shared lib, that is).  I.e. no, they are not considered 
locally-binding (for instance, they aren't automatically inlined).

 If this is the case, is there a reason for that behavior?

Because IMHO interposition is orthogonal to TU placement, and hence 
shouldn't be influenced by it.  There's visibility, inline hints or 
static-ness to achieve different effects.  (perhaps the real reason is: 
because it always worked like that :) )

 IMO it's wrong.

Why?  I think it's right.


Ciao,
Michael.

Re: [patch, libgomp] Re-factor GOMP_MAP_POINTER handling

2015-05-11 Thread Chung-Lin Tang

Ping.

On 2015/4/21 08:21 PM, Chung-Lin Tang wrote:
 Hi,
 while investigating some issues in the variable mapping code, I observed
 that the GOMP_MAP_POINTER handling is essentially duplicated under the PSET 
 case.
 This patch abstracts and unifies the handling code, basically just a cleanup
 patch. Ran libgomp tests to ensure no regressions, ok for trunk?
 
 Thanks,
 Chung-Lin
 
 2015-04-21  Chung-Lin Tang  clt...@codesourcery.com
 
 libgomp/
 * target.c (gomp_map_pointer): New function abstracting out
 GOMP_MAP_POINTER handling.
 (gomp_map_vars): Remove GOMP_MAP_POINTER handling code and use
 gomp_map_pointer().

Fix verify_type ICE on TYPE_METHOS

2015-05-11 Thread Jan Hubicka

Jason,
this patch is trying to fix the following verify_type_ice that check
that TYPE_METHODS is same for main variant and all its complete variants.

-  /* FIXME: this check triggers during libstdc++ build that is a bug.
- It affects non-LTO debug output only, because free_lang_data clears
- this anyway.  */
-  if (RECORD_OR_UNION_TYPE_P (t)  COMPLETE_TYPE_P (t)  0
-   TYPE_METHODS (t) != TYPE_METHODS (tv))

I tried to go in a way of making TYPE_METHODS to be defined only on main 
vairants
as it seems impractical to update variants each time we change TYPE_METHODS.
While doing so, I noticed that lazily_declare_fn sometimes adds new method
to variant only (that is probably source of ICE as fixup_type_variants copies 
the
TYPE_METHODS).

The attach patch however removes the copying and verifies that TYPE_MEHTODS is
NULL at all variants. It also updates middle-end uses to always check for main 
variant.
Bootstrapped/regtested ppc64-linux, does this seem to make sense?

I also noticed that TYPE_METHODS != NULL matters in some cases: it makes 
function.c
to drop register keyword and it makes us to speak of class instead of struct in
diagnostics, so I changed free_lang_data to zap the list to error_mark_node 
instead
of NULL (in past I tried to stream it for ODR warnings and I remember it added
measurable extra overhead that I considered too large for mainline).

Honza

* class.c (fixup_type_variants): Do not copy TYPE_METHODS
(one_inheriting_sig): Assert tat we always set TYPE_METHODS of main 
variant.
* semantics.c (finish_member_declaration): Likewise.
* method.c (lazily_declare_fn): Allways add method to main variant list.

* dwarf2out.c (gen_member_die): Sanity check that we access 
TYPE_MAIN_VARIANT
for TYPE_METHODS.
* function.c (use_register_for_decl): Look for TYPE_MAIN_VARIANT when 
checking
TYPE_METHODS.
* tree.c (free_lang_data_in_type): See TYPE_METHODS to error_mark_node
if non-null.
(build_distinct_type_copy): Clear TYPE_METHODS.
(verify_type_variant): Verify that TYPE_METHODS is NULL for variants.
(verify_type): Allow TYPE_METHODS to be error_mark_node.
* tree.def: Update docs of YTPE_STUB_DECL and TYPE_METHODS.
Index: cp/class.c
===
--- cp/class.c  (revision 222991)
+++ cp/class.c  (working copy)
@@ -1972,7 +1972,6 @@ fixup_type_variants (tree t)
 
   /* Copy whatever these are holding today.  */
   TYPE_VFIELD (variants) = TYPE_VFIELD (t);
-  TYPE_METHODS (variants) = TYPE_METHODS (t);
   TYPE_FIELDS (variants) = TYPE_FIELDS (t);
 }
 }
@@ -3238,6 +3237,7 @@ one_inheriting_sig (tree t, tree ctor, t
 parmlist = tree_cons (NULL_TREE, parms[i], parmlist);
   tree fn = implicitly_declare_fn (sfk_inheriting_constructor,
   t, false, ctor, parmlist);
+  gcc_assert (TYPE_MAIN_VARIANT (t) == t);
   if (add_method (t, fn, NULL_TREE))
 {
   DECL_CHAIN (fn) = TYPE_METHODS (t);
Index: cp/method.c
===
--- cp/method.c (revision 222991)
+++ cp/method.c (working copy)
@@ -2189,11 +2189,11 @@ lazily_declare_fn (special_function_kind
DECL_VIRTUAL_P (fn))
 /* The ABI requires that a virtual destructor go at the end of the
vtable.  */
-TYPE_METHODS (type) = chainon (TYPE_METHODS (type), fn);
+TYPE_METHODS (type) = chainon (TYPE_METHODS (TYPE_MAIN_VARIANT (type)), 
fn);
   else
 {
-  DECL_CHAIN (fn) = TYPE_METHODS (type);
-  TYPE_METHODS (type) = fn;
+  DECL_CHAIN (fn) = TYPE_METHODS (TYPE_MAIN_VARIANT (type));
+  TYPE_METHODS (TYPE_MAIN_VARIANT (type)) = fn;
 }
   maybe_add_class_template_decl_list (type, fn, /*friend_p=*/0);
   if (DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (fn)
Index: cp/semantics.c
===
--- cp/semantics.c  (revision 222991)
+++ cp/semantics.c  (working copy)
@@ -2913,6 +2913,7 @@ finish_member_declaration (tree decl)
 CLASSTYPE_METHOD_VEC.  */
   if (add_method (current_class_type, decl, NULL_TREE))
{
+ gcc_assert (TYPE_MAIN_VARIANT (current_class_type) == 
current_class_type);
  DECL_CHAIN (decl) = TYPE_METHODS (current_class_type);
  TYPE_METHODS (current_class_type) = decl;
 
Index: dwarf2out.c
===
--- dwarf2out.c (revision 222991)
+++ dwarf2out.c (working copy)
@@ -19945,6 +19945,8 @@ gen_member_die (tree type, dw_die_ref co
gen_decl_die (member, NULL, context_die);
 }
 
+  /* We do not keep type methods in type variants.  */
+  gcc_assert (TYPE_MAIN_VARIANT (type) == type);
   /* Now output info about the function members (if any).  */
   for (member = TYPE_METHODS (type); member; member = DECL_CHAIN (member))
 {
Index: function.c

RE: [PATCH][MIPS] Enable load-load/store-store bonding

2015-05-11 Thread Matthew Fortune

Hi Sameera,

Sameera Deshpande sameera.deshpa...@imgtec.com writes: 
 Changelog:
 gcc/
  * config/mips/mips.md (JOIN_MODE): New mode iterator.
  (join2_load_StoreJOIN_MODE:mode): New pattern.
  (join2_loadhi): Likewise.
  (define_peehole2): Add peephole2 patterns to join 2
 HI/SI/SF/DF-mode
  load-load and store-stores.
  * config/mips/mips.opt (mload-store-pairs): New option.
  (TARGET_LOAD_STORE_PAIRS): New macro.
  * config/mips/mips.h (ENABLE_LD_ST_PAIRS): Likewise.
  * config/mips/mips-protos.h (mips_load_store_bonding_p): New
 prototype.
  * config/mips/mips.c (mips_load_store_bonding_p): New function.
 
 gcc/testsuite/
  * gcc.target/mips/p5600-bonding.c : New testcase to test
 bonding.

Just 'New file.' is fine for the changelog.

diff --git a/gcc/testsuite/gcc.target/mips/p5600-bonding.c 
b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
new file mode 100644
index 000..122b9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options -dp -mtune=p5600  -mno-micromips -mno-mips16 } */
+/* { dg-skip-if Bonding needs peephole optimization. { *-*-* } { -O0 
-O1 } {  } } */
+typedef int VINT32 __attribute__ ((vector_size((16;
+
+void memory_operation_fun2_si(void * __restrict src, void * __restrict dest, 
int num)

Code style applies for testcases too, return type on line above, space
after function name, line length.

+{
+VINT32 *vsrc = (VINT32 *)src;

Indentation.

+VINT32 *vdest = (VINT32 *)dest;
+int i;
+
+for (i = 0; i  num - 1; i+=2)
+{

Indentation

+  vdest[i] = (vdest[i] + vsrc[i]);

Unnecessary brackets.

+  vdest[i + 1] = vdest[i + 1] + vsrc[i + 1];
+}
+}
+/* { dg-final { scan-assembler join2_ } }  */
+

OK with those changes.

Thanks,
Matthew

[patch, fortran] pr66100 bound simplification refactoring fallout

2015-05-11 Thread Mikael Morin

Hello,

This fixes a regression introduced by my bound simplification refactoring at
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00843.html
The assert was introduced in the change above and checks that the
refactoring didn't lose anything regarding the condition used in the
existing code.
The condition was assuming full object arrays, and fails with subobject
arrays.
The fix (attached) is obvious enough.
The testcase is Thomas' reduced one, which is independent on matmul
simplification.

I plan to commit this patch tonight.

Mikael





2015-05-11  Mikael Morin  mik...@gcc.gnu.org

* simplify.c (simplify_bound): Fix assert to accept subobject
arrays.
Index: simplify.c
===
--- simplify.c	(rÃ©vision 222979)
+++ simplify.c	(copie de travail)
@@ -3463,8 +3463,8 @@ simplify_bound (gfc_expr *array, gfc_expr *dim, gf
   gcc_assert (!as
 	  || (as-type != AS_DEFERRED
 		   array-expr_type == EXPR_VARIABLE
-		   !array-symtree-n.sym-attr.allocatable
-		   !array-symtree-n.sym-attr.pointer));
+		   !gfc_expr_attr (array).allocatable
+		   !gfc_expr_attr (array).pointer));
 
   if (dim == NULL)
 {


! { dg-do compile }
! { dg-additional-options -fdump-tree-original }
!
! PR fortran/66100
! ICE on lbound simplification
!
! Original test case by Joost VandeVondele joost.vandevond...@mat.ethz.ch
! Reduced by Thomas Koenig tkoe...@gcc.gnu.org
!
MODULE qs_integrate_potential_low
  INTEGER, PARAMETER :: dp = 8
  TYPE cell_type
REAL(KIND=8) :: h_inv(3,3)
  END TYPE
  TYPE(cell_type), POINTER :: cell
  REAL(KIND=dp), DIMENSION(3)  :: rp
  CONTAINS
SUBROUTINE integrate_general_opt()
REAL(KIND=dp) :: gp(3)
INTEGER :: ng
if (any(lbound(cell%h_inv) /= 1)) call abort
if (any(ubound(cell%h_inv) /= 3)) call abort
END SUBROUTINE integrate_general_opt
END MODULE qs_integrate_potential_low
! { dg-final { scan-tree-dump-not bound original } }
! { dg-final { scan-tree-dump-not abort original } }
! { dg-final { cleanup-tree-dump original } }

[PATCH] libiberty: cleanup Makefile.in

2015-05-11 Thread Bernhard Reutner-Fischer

* maint-tool: Refactor pic/ and noasan/ handling.
* Makefile.in: Likewise. Regenerate dependencies.

---

It's hard to see which parts of libiberty get built with which flags due
to the current way the checks for pic and noasan are performed. So push
this decision to make(1).

For the jit it would be nice to build pic variants of libbackend etc so
let's tweak the way libiberty does this before the current way gets
copied over and over.

I'd have used an order-only prerequisite for the output-dirs pic/ and
noasan/ but that's left for a possible follow-up.

Bootstrapped with

configure -v --enable-languages=c,lto,fortran,c++,go,objc,obj-c++ \
CFLAGS=-g -O2 CXXFLAGS=-g -O2 BOOT_CFLAGS=-g -O2 \
BOOT_CXXFLAGS=-g -O2 CFLAGS_FOR_TARGET=-g -O2 \
CXXFLAGS_FOR_TARGET=-g -O2 \
--prefix=/opt/x86_64/gcc-6.0.mine/ --enable-shared \
--without-system-zlib --enable-nls --without-included-gettext \
--enable-threads=posix --enable-__cxa_atexit \
--enable-libstdcxx-allocator=mt --enable-clocale=gnu \
--enable-libstdcxx-debug --enable-mpfr --disable-werror \
--enable-checking=yes --enable-debug -C \
--disable-intermodule --enable-multilib --disable-libstdcxx-pch \
--enable-bootstrap --enable-checking=yes \
--with-cpu=native --with-tune=native --enable-plugin

and regtested on x86_64-linux with no new regeressions.

In addition, manual inspection of the flags used to compile the
libiberty/{,*/}*.o indicate no behavioural change.

Ok for trunk?

(should wipe intermodule from configure and libgfortran too, just
because i see that i still have it in this build-script..)
---
 libiberty/Makefile.in | 1425 +
 libiberty/maint-tool  |   15 +-
 2 files changed, 483 insertions(+), 957 deletions(-)

diff --git a/libiberty/Makefile.in b/libiberty/Makefile.in
index f06cc69..de0c8c9 100644
--- a/libiberty/Makefile.in
+++ b/libiberty/Makefile.in
@@ -69,6 +69,22 @@ MAKEOVERRIDES =
 TARGETLIB = ./libiberty.a
 TESTLIB = ./testlib.a
 
+TARGETLIB_PIC = $(dir $(TARGETLIB))pic/$(notdir $(TARGETLIB))
+TARGETLIB_NOASAN = $(dir $(TARGETLIB))noasan/$(notdir $(TARGETLIB))
+
+TARGETLIBS := $(TARGETLIB)
+
+DEFAULT_OFILES = $(REQUIRED_OFILES) $(EXTRA_OFILES) $(LIBOBJS)
+ifneq ($(strip $(PICFLAG)),)
+TARGETLIBS += $(TARGETLIB_PIC)
+PIC_OFILES = $(patsubst ./%,$(dir $(TARGETLIB_PIC))%,$(DEFAULT_OFILES))
+endif
+
+ifneq ($(strip $(NOASANFLAG)),)
+TARGETLIBS += $(TARGETLIB_NOASAN)
+NOASAN_OFILES = $(patsubst ./%,$(dir $(TARGETLIB_NOASAN))%,$(DEFAULT_OFILES))
+endif
+
 LIBOBJS = @LIBOBJS@
 
 # A configuration can specify extra .o files that should be included,
@@ -102,7 +118,7 @@ FLAGS_TO_PASS = \
 SUBDIRS = testsuite
 
 # FIXME: add @BUILD_INFO@ once we're sure it works for everyone.
-all: stamp-picdir stamp-noasandir $(TARGETLIB) required-list all-subdir
+all: $(TARGETLIBS) required-list all-subdir
@: $(MAKE) ; $(MULTIDO) $(FLAGS_TO_PASS) multi-do DO=all
 
 .PHONY: check installcheck
@@ -113,7 +129,7 @@ installcheck: installcheck-subdir
 
 INCDIR=$(srcdir)/$(MULTISRCTOP)../include
 
-COMPILE.c = $(CC) -c @DEFS@ $(CFLAGS) $(CPPFLAGS) -I. -I$(INCDIR) $(HDEFINES) 
@ac_libiberty_warn_cflags@
+COMPILE.c = $(CC) -c @DEFS@ $(CFLAGS) $(SUB_CFLAGS) $(CPPFLAGS) -I. 
-I$(INCDIR) $(HDEFINES) @ac_libiberty_warn_cflags@
 
 # Just to make sure we don't use a built-in rule with VPATH
 .c.$(objext):
@@ -243,25 +259,18 @@ INSTALLED_HEADERS =   
  \
$(INCDIR)/splay-tree.h \
$(INCDIR)/timeval-utils.h
 
-$(TARGETLIB): $(REQUIRED_OFILES) $(EXTRA_OFILES) $(LIBOBJS)
-   -rm -f $(TARGETLIB) pic/$(TARGETLIB) noasan/$(TARGETLIB)
-   $(AR) $(AR_FLAGS) $(TARGETLIB) \
- $(REQUIRED_OFILES) $(EXTRA_OFILES) $(LIBOBJS)
-   $(RANLIB) $(TARGETLIB)
-   if [ x$(PICFLAG) != x ]; then \
- cd pic; \
- $(AR) $(AR_FLAGS) $(TARGETLIB) \
-   $(REQUIRED_OFILES) $(EXTRA_OFILES) $(LIBOBJS); \
- $(RANLIB) $(TARGETLIB); \
- cd ..; \
-   else true; fi; \
-   if [ x$(NOASANFLAG) != x ]; then \
- cd noasan; \
- $(AR) $(AR_FLAGS) $(TARGETLIB) \
-   $(REQUIRED_OFILES) $(EXTRA_OFILES) $(LIBOBJS); \
- $(RANLIB) $(TARGETLIB); \
- cd ..; \
-   else true; fi
+$(TARGETLIB): $(DEFAULT_OFILES)
+
+$(TARGETLIB_PIC): $(PIC_OFILES)
+$(PIC_OFILES): SUB_CFLAGS = $(PICFLAG)
+
+$(TARGETLIB_NOASAN): $(NOASAN_OFILES)
+$(NOASAN_OFILES): SUB_CFLAGS = $(PICFLAG) $(NOASANFLAG)
+
+$(TARGETLIB) $(TARGETLIB_PIC) $(TARGETLIB_NOASAN):
+   -rm -f $@
+   $(AR) $(AR_FLAGS) $@ $^
+   $(RANLIB) $@
 
 $(TESTLIB): $(REQUIRED_OFILES) $(CONFIGURED_OFILES)
-rm -f $(TESTLIB)
@@ -444,7 +453,7 @@ maint-deps :
 mostlyclean: mostlyclean-subdir
-rm -rf *.$(objext) pic noasan core errs \#* *.E a.out
-rm -f errors dummy config.h stamp-*
-   -rm -f $(CONFIG_H) stamp-picdir stamp-noasandir
+   -rm -f $(CONFIG_H)
-rm -f libiberty.aux libiberty.cp libiberty.cps

Re: [PATCH][MIPS] Enable load-load/store-store bonding

2015-05-11 Thread sameera


On Monday 11 May 2015 05:43 PM, Matthew Fortune wrote:

Hi Sameera,

Sameera Deshpande sameera.deshpa...@imgtec.com writes:

Changelog:
gcc/
  * config/mips/mips.md (JOIN_MODE): New mode iterator.
  (join2_load_StoreJOIN_MODE:mode): New pattern.
  (join2_loadhi): Likewise.
  (define_peehole2): Add peephole2 patterns to join 2
HI/SI/SF/DF-mode
  load-load and store-stores.
  * config/mips/mips.opt (mload-store-pairs): New option.
  (TARGET_LOAD_STORE_PAIRS): New macro.
  * config/mips/mips.h (ENABLE_LD_ST_PAIRS): Likewise.
  * config/mips/mips-protos.h (mips_load_store_bonding_p): New
prototype.
  * config/mips/mips.c (mips_load_store_bonding_p): New function.

gcc/testsuite/
  * gcc.target/mips/p5600-bonding.c : New testcase to test
bonding.


Just 'New file.' is fine for the changelog.


diff --git a/gcc/testsuite/gcc.target/mips/p5600-bonding.c 
b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
new file mode 100644
index 000..122b9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options -dp -mtune=p5600  -mno-micromips -mno-mips16 } */
+/* { dg-skip-if Bonding needs peephole optimization. { *-*-* } { -O0 -O1 } { 
 } } */
+typedef int VINT32 __attribute__ ((vector_size((16;
+
+void memory_operation_fun2_si(void * __restrict src, void * __restrict dest, 
int num)


Code style applies for testcases too, return type on line above, space
after function name, line length.


+{
+VINT32 *vsrc = (VINT32 *)src;


Indentation.


+VINT32 *vdest = (VINT32 *)dest;
+int i;
+
+for (i = 0; i  num - 1; i+=2)
+{


Indentation


+  vdest[i] = (vdest[i] + vsrc[i]);


Unnecessary brackets.


+  vdest[i + 1] = vdest[i + 1] + vsrc[i + 1];
+}
+}
+/* { dg-final { scan-assembler join2_ } }  */
+


OK with those changes.

Thanks,
Matthew


Hi Matthew,

Thanks for the comments.
Please find attached updated patch.

I do not have permissions to apply the patch in GCC.
Can you please submit the patch for me?

Changelog:
gcc/
* config/mips/mips.md (JOIN_MODE): New mode iterator.
(join2_load_StoreJOIN_MODE:mode): New pattern.
(join2_loadhi): Likewise.
(define_peehole2): Add peephole2 patterns to join 2 HI/SI/SF/DF-mode
load-load and store-stores.
* config/mips/mips.opt (mload-store-pairs): New option.
(TARGET_LOAD_STORE_PAIRS): New macro.
* config/mips/mips.h (ENABLE_LD_ST_PAIRS): Likewise.
* config/mips/mips-protos.h (mips_load_store_bonding_p): New prototype.
* config/mips/mips.c (mips_load_store_bonding_p): New function.

gcc/testsuite/
* gcc.target/mips/p5600-bonding.c : New file.
diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index b48e04f..244eb8d 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -360,6 +360,7 @@ extern bool mips_epilogue_uses (unsigned int);
 extern void mips_final_prescan_insn (rtx_insn *, rtx *, int);
 extern int mips_trampoline_code_size (void);
 extern void mips_function_profiler (FILE *);
+extern bool mips_load_store_bonding_p (rtx *, machine_mode, bool);
 
 typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx);
 #ifdef RTX_CODE
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index bf69850..4fc15c4 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -18241,6 +18241,66 @@ umips_load_store_pair_p_1 (bool load_p, bool swap_p,
   return true;
 }
 
+bool
+mips_load_store_bonding_p (rtx *operands, machine_mode mode, bool load_p)
+{
+  rtx reg1, reg2, mem1, mem2, base1, base2;
+  enum reg_class rc1, rc2;
+  HOST_WIDE_INT offset1, offset2;
+
+  if (load_p)
+{
+  reg1 = operands[0];
+  reg2 = operands[2];
+  mem1 = operands[1];
+  mem2 = operands[3];
+}
+  else
+{
+  reg1 = operands[1];
+  reg2 = operands[3];
+  mem1 = operands[0];
+  mem2 = operands[2];
+}
+
+  if (mips_address_insns (XEXP (mem1, 0), mode, false) == 0
+  || mips_address_insns (XEXP (mem2, 0), mode, false) == 0)
+return false;
+
+  mips_split_plus (XEXP (mem1, 0), base1, offset1);
+  mips_split_plus (XEXP (mem2, 0), base2, offset2);
+
+  /* Base regs do not match.  */
+  if (!REG_P (base1) || !rtx_equal_p (base1, base2))
+return false;
+
+  /* Either of the loads is clobbering base register.  It is legitimate to bond
+ loads if second load clobbers base register.  However, hardware does not
+ support such bonding.  */
+  if (load_p
+   (REGNO (reg1) == REGNO (base1)
+	  || (REGNO (reg2) == REGNO (base1
+return false;
+
+  /* Loading in same registers.  */
+  if (load_p
+   REGNO (reg1) == REGNO (reg2))
+return false;
+
+  /* The loads/stores are not of same type.  */
+  rc1 = REGNO_REG_CLASS (REGNO (reg1));
+  rc2 = REGNO_REG_CLASS (REGNO (reg2));
+  if (rc1 != rc2
+   !reg_class_subset_p (rc1, rc2)

Re: [PATCH] Fix memory leak in C++ pretty printer

2015-05-11 Thread Manuel López-Ibáñez


On 11/05/15 03:34, Patrick Palka wrote:

In gcc/cp/error.c we initialize the C++ pretty printer object twice:
first during statics initialization and later in a placement-new in
init_error().  This double-initialization causes a memory leak of about
7kb according to valgrind.  I don't see a reason to initialize the
object a second time so I elected to remove init_error().


I seem to remember there is some issue with the constructors when using static 
initialization that requires the placement-new. We also do the placement-new 
dance in the other FEs and the reason should be the same.


My preference would be to replace the static with a pointer and placement-new 
with proper new and delete, but see: 
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00910.html


If you change it here, please change it everywhere else where we use 
placement-new, such that all FEs are consistent on this.


Cheers,

Manuel.

Re: [RFC 0/6] Flags outputs for asms

2015-05-11 Thread Segher Boessenkool

On Fri, May 08, 2015 at 01:15:25PM -0700, Richard Henderson wrote:
 But it *does* try to match an intermediate pattern,
 
 (set (reg:CCGC 17 flags)
 (compare:CCGC (reg:CCGC 17 flags)
 (const_int 0 [0])))

Patch posted at http://gcc.gnu.org/ml/gcc-patches/2015-05/msg00918.html.


Segher

[PATCH 00/13] S/390 Implement support for IBM z13

2015-05-11 Thread Andreas Krebbel

The attached patchset adds support for the IBM z13 machine to the
GCC S/390 backend.

The machine has been announced recently:
http://www-03.ibm.com/press/us/en/pressrelease/45808.wss

IBM z/Architecture Principles of Operation
http://publibfi.boulder.ibm.com/epubs/pdf/dz9zr010.pdf

The required Binutils support is upstream since January:
https://sourceware.org/ml/binutils/2015-01/msg00197.html

Highlights from a toolchain perspective are:

- 32 128 bit vector registers (overlapping with the existing 16 64 bit
  floating point registers)
- vector double instructions
- vector integer instructions
- scalar vector instructions (allowing to have more floating point
  registers for scalar operations)
- vector string instructions

I would like to commit this patchset also to GCC 5 branch in order to
enable distros to pick it up more easily.

Andreas Krebbel (13):
  recog: Increased max number of alternatives.
  optabs: Fix vec_perm - V16QI middle end lowering.
  S/390 Fix secondary reload issue with store/load relative operands.
  S/390 Add -march/-mtune=z13 option.
  S/390 Vector base support.
  Vector base support - testcases
  S/390 Add vector scalar instruction support.
  S/390 zvector builtin support.
  S/390 Add zvector testcases.
  Testsuite These testcases require disabling hardware vector support
on S/390.
  Testsuite S/390 vector types are only 8 byte aligned.
  S/390 Vector ABI GNU Attribute.
  S/390 Invalid vector binary ops

 gcc/common/config/s390/s390-common.c   |3 +
 gcc/config.gcc |   26 +-
 gcc/config/s390/constraints.md |   28 +
 gcc/config/s390/predicates.md  |   12 +-
 gcc/config/s390/s390-builtin-types.def |  747 ++
 gcc/config/s390/s390-builtins.def  | 2486 
 gcc/config/s390/s390-builtins.h|  160 ++
 gcc/config/s390/s390-c.c   |  907 +++
 gcc/config/s390/s390-modes.def |   61 +
 gcc/config/s390/s390-opts.h|1 +
 gcc/config/s390/s390-protos.h  |   17 +
 gcc/config/s390/s390.c | 2314 +++---
 gcc/config/s390/s390.h |  220 +-
 gcc/config/s390/s390.md|  800 +--
 gcc/config/s390/s390.opt   |   11 +
 gcc/config/s390/s390intrin.h   |3 +
 gcc/config/s390/t-s390 |   27 +
 gcc/config/s390/vecintrin.h|  311 +++
 gcc/config/s390/vector.md  | 1228 ++
 gcc/config/s390/vx-builtins.md | 2081 
 gcc/configure  |   36 +
 gcc/configure.ac   |7 +
 gcc/optabs.c   |   18 +-
 gcc/recog.h|2 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c   |1 +
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c   |1 +
 gcc/testsuite/gcc.target/s390/s390.exp |   18 +
 gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c   |   18 +
 gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c   |   15 +
 gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c   |  101 +
 gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c   |   19 +
 .../gcc.target/s390/vector/vec-abi-align-1.c   |   48 +
 .../gcc.target/s390/vector/vec-abi-attr-1.c|   18 +
 .../gcc.target/s390/vector/vec-abi-attr-2.c|   53 +
 .../gcc.target/s390/vector/vec-abi-attr-3.c|   18 +
 .../gcc.target/s390/vector/vec-abi-attr-4.c|   17 +
 .../gcc.target/s390/vector/vec-abi-attr-5.c|   19 +
 .../gcc.target/s390/vector/vec-abi-attr-6.c|   24 +
 .../gcc.target/s390/vector/vec-abi-single-1.c  |   24 +
 .../gcc.target/s390/vector/vec-abi-single-2.c  |   12 +
 .../gcc.target/s390/vector/vec-abi-struct-1.c  |   37 +
 .../gcc.target/s390/vector/vec-abi-vararg-1.c  |   60 +
 .../gcc.target/s390/vector/vec-abi-vararg-2.c  |   18 +
 .../gcc.target/s390/vector/vec-clobber-1.c |   38 +
 gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c   |   45 +
 gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c   |   38 +
 .../s390/vector/vec-dbl-math-compile-1.c   |   48 +
 .../gcc.target/s390/vector/vec-genbytemask-1.c |   70 +
 .../gcc.target/s390/vector/vec-genbytemask-2.c |   46 +
 .../gcc.target/s390/vector/vec-genmask-1.c |   70 +
 .../gcc.target/s390/vector/vec-genmask-2.c |   46 +
 gcc/testsuite/gcc.target/s390/vector/vec-init-1.c  |   68 +
 .../s390/vector/vec-int-math-compile-1.c   |   40 +
 .../gcc.target/s390/vector/vec-scalar-cmp-1.c  |   49 +
 gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c |  108 +
 gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c   |   51 +

[PATCH 03/13] S/390 Fix secondary reload issue with store/load relative operands.

2015-05-11 Thread Andreas Krebbel

We need a scratch register for loading from or storing to a symbolic
memory reference where we cannot use the load/store relative
instructions for.  However, the check currently fails to handle
floating point modes in GPRs correctly.

gcc/
* config/s390/s390.c (s390_secondary_reload): Fix check for
  load/store relative.
---
 gcc/config/s390/s390.c |   16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 7d16048..cc37618 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -3141,17 +3141,15 @@ s390_secondary_reload (bool in_p, rtx x, reg_class_t 
rclass_i,
sri-icode = ((mode == DImode) ? CODE_FOR_reloaddi_larl_odd_addend_z10
  : CODE_FOR_reloadsi_larl_odd_addend_z10);
 
-  /* On z10 we need a scratch register when moving QI, TI or floating
-point mode values from or to a memory location with a SYMBOL_REF
-or if the symref addend of a SI or DI move is not aligned to the
-width of the access.  */
+  /* Handle all the (mem (symref)) accesses we cannot use the z10
+instructions for.  */
   if (MEM_P (x)
   s390_loadrelative_operand_p (XEXP (x, 0), NULL, NULL)
-  (mode == QImode || mode == TImode || FLOAT_MODE_P (mode)
- || (!TARGET_ZARCH  mode == DImode)
- || ((mode == HImode || mode == SImode || mode == DImode)
-  (!s390_check_symref_alignment (XEXP (x, 0),
-   GET_MODE_SIZE (mode))
+  (mode == QImode
+ || !reg_classes_intersect_p (GENERAL_REGS, rclass)
+ || GET_MODE_SIZE (mode)  UNITS_PER_WORD
+ || !s390_check_symref_alignment (XEXP (x, 0),
+  GET_MODE_SIZE (mode
{
 #define __SECONDARY_RELOAD_CASE(M,m)   \
  case M##mode: \
-- 
1.7.9.5

[PATCH 01/13] recog: Increased max number of alternatives.

2015-05-11 Thread Andreas Krebbel

With the vector facility support z13 mov patterns have more than 30
alternatives.

gcc/
* recog.h: Increase MAX_RECOG_ALTERNATIVES.
---
 gcc/recog.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/recog.h b/gcc/recog.h
index 8a38b26..4d8ca0c 100644
--- a/gcc/recog.h
+++ b/gcc/recog.h
@@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 /* Random number that should be large enough for all purposes.  Also define
a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
bit giving an invalid value that can be used to mean uninitialized.  */
-#define MAX_RECOG_ALTERNATIVES 30
+#define MAX_RECOG_ALTERNATIVES 35
 typedef unsigned int alternative_mask;
 
 /* A mask of all alternatives.  */
-- 
1.7.9.5

[PATCH 05/13] S/390 Vector base support.

2015-05-11 Thread Andreas Krebbel

gcc/
* config/s390/constraints.md (j00, jm1, jxx, jyy, v): New
constraints.
* config/s390/predicates.md (const0_operand, constm1_operand)
(constable_operand): Accept vector operands.
* config/s390/s390-modes.def: Add supported vector modes.
* config/s390/s390-protos.h (s390_cannot_change_mode_class)
(s390_function_arg_vector, s390_contiguous_bitmask_vector_p)
(s390_bytemask_vector_p, s390_expand_vec_strlen)
(s390_expand_vec_compare, s390_expand_vcond)
(s390_expand_vec_init): Add prototypes.
* config/s390/s390.c (VEC_ARG_NUM_REG): New macro.
(s390_vector_mode_supported_p): New function.
(s390_contiguous_bitmask_p): Mask out the irrelevant bits.
(s390_contiguous_bitmask_vector_p): New function.
(s390_bytemask_vector_p): New function.
(s390_split_ok_p): Vector regs don't work either.
(regclass_map): Add VEC_REGS.
(s390_legitimate_constant_p): Handle vector constants.
(s390_cannot_force_const_mem): Handle CONST_VECTOR.
(legitimate_reload_vector_constant_p): New function.
(s390_preferred_reload_class): Handle CONST_VECTOR.
(s390_reload_symref_address):  Likewise.
(s390_secondary_reload): Vector memory instructions only support
short displacements.  Rename reload*_nonoffmem* to reload*_la*.
(s390_emit_ccraw_jump): New function.
(s390_expand_vec_strlen): New function.
(s390_expand_vec_compare): New function.
(s390_expand_vcond): New function.
(s390_expand_vec_init): New function.
(s390_dwarf_frame_reg_mode): New function.
(print_operand): Handle addresses with 'O' and 'R' constraints.
(NR_C_MODES, constant_modes): Add vector modes.
(s390_output_pool_entry): Handle vector constants.
(s390_hard_regno_mode_ok): Handle vector registers.
(s390_class_max_nregs): Likewise.
(s390_cannot_change_mode_class): New function.
(s390_invalid_arg_for_unprototyped_fn): New function.
(s390_function_arg_vector): New function.
(s390_function_arg_float): Remove size variable.
(s390_pass_by_reference): Handle vector arguments.
(s390_function_arg_advance): Likewise.
(s390_function_arg): Likewise.
(s390_return_in_memory): Vector values are returned in a VR if
possible.
(s390_function_and_libcall_value): Handle vector arguments.
(s390_gimplify_va_arg): Likewise.
(s390_call_saved_register_used): Consider the arguments named.
(s390_conditional_register_usage): Disable v16-v31 for non-vec
targets.
(s390_preferred_simd_mode): New function.
(s390_support_vector_misalignment): New function.
(s390_vector_alignment): New function.
(TARGET_STRICT_ARGUMENT_NAMING, TARGET_DWARF_FRAME_REG_MODE)
(TARGET_VECTOR_MODE_SUPPORTED_P)
(TARGET_INVALID_ARG_FOR_UNPROTOTYPED_FN)
(TARGET_VECTORIZE_PREFERRED_SIMD_MODE)
(TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT)
(TARGET_VECTOR_ALIGNMENT): Define target macro.
* config/s390/s390.h (FUNCTION_ARG_PADDING): Define macro.
(FIRST_PSEUDO_REGISTER): Increase value.
(VECTOR_NOFP_REGNO_P, VECTOR_REGNO_P, VECTOR_NOFP_REG_P)
(VECTOR_REG_P): Define macros.
(FIXED_REGISTERS, CALL_USED_REGISTERS)
(CALL_REALLY_USED_REGISTERS, REG_ALLOC_ORDER)
(HARD_REGNO_CALL_PART_CLOBBERED, REG_CLASS_NAMES)
(FUNCTION_ARG_REGNO_P, FUNCTION_VALUE_REGNO_P, REGISTER_NAMES):
Add vector registers.
(CANNOT_CHANGE_MODE_CLASS): Call C function.
(enum reg_class): Add VEC_REGS, ADDR_VEC_REGS, GENERAL_VEC_REGS.
(SECONDARY_MEMORY_NEEDED): Allow SF-SI mode moves without
memory.
(DBX_REGISTER_NUMBER, FIRST_VEC_ARG_REGNO, LAST_VEC_ARG_REGNO)
(SHORT_DISP_IN_RANGE, VECTOR_STORE_FLAG_VALUE): Define macro.
* config/s390/s390.md (UNSPEC_VEC_*): New constants.
(VR*_REGNUM): New constants.
(ALL): New mode iterator.
(INTALL): Remove mode iterator.
Include vector.md.
(movti): Implement TImode moves for VRs.
Disable TImode splitter for VR targets.
Implement splitting TImode GPR-VR moves.
(reload*_tomem_z10, reload*_toreg_z10): Replace INTALL with ALL.
(reloadmode_nonoffmem_in, reloadmode_nonoffmem_out): Rename to
reloadmode_la_in, reloadmode_la_out.
(*movdi_64, *movsi_zarch, *movhi, *movqi, *movmode_64dfp)
(*movmode_64, *movmode_31): Add vector instructions.
(TD/TF mode splitter): Enable for GPRs only (formerly !FP).
(movmode SF SD): Prefer lder, lde for loading.
Add lrl and strl instructions.
Add vector instructions.
(strlenmode): Rename old strlenmode to strlen_srstmode.
Call s390_expand_vec_strlen on z13.
(*cc_to_int): Change

[PATCH 06/13] Vector base support - testcases

2015-05-11 Thread Andreas Krebbel

gcc/testsuite/
* gcc.target/s390/s390.exp
(check_effective_target_vector): New check.
* gcc.target/s390/vector/vec-abi-1.c: New test.
* gcc.target/s390/vector/vec-abi-2.c: New test.
* gcc.target/s390/vector/vec-abi-3.c: New test.
* gcc.target/s390/vector/vec-abi-4.c: New test.
* gcc.target/s390/vector/vec-abi-align-1.c: New test.
* gcc.target/s390/vector/vec-abi-single-1.c: New test.
* gcc.target/s390/vector/vec-abi-single-2.c: New test.
* gcc.target/s390/vector/vec-abi-struct-1.c: New test.
* gcc.target/s390/vector/vec-abi-vararg-1.c: New test.
* gcc.target/s390/vector/vec-abi-vararg-2.c: New test.
* gcc.target/s390/vector/vec-clobber-1.c: New test.
* gcc.target/s390/vector/vec-cmp-1.c: New test.
* gcc.target/s390/vector/vec-cmp-2.c: New test.
* gcc.target/s390/vector/vec-dbl-math-compile-1.c: New test.
* gcc.target/s390/vector/vec-genbytemask-1.c: New test.
* gcc.target/s390/vector/vec-genbytemask-2.c: New test.
* gcc.target/s390/vector/vec-genmask-1.c: New test.
* gcc.target/s390/vector/vec-genmask-2.c: New test.
* gcc.target/s390/vector/vec-init-1.c: New test.
* gcc.target/s390/vector/vec-int-math-compile-1.c: New test.
* gcc.target/s390/vector/vec-shift-1.c: New test.
* gcc.target/s390/vector/vec-sub-1.c: New test.
---
 gcc/testsuite/gcc.target/s390/s390.exp |   18 
 gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c   |   17 +++
 gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c   |   15 +++
 gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c   |  101 ++
 gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c   |   19 
 .../gcc.target/s390/vector/vec-abi-align-1.c   |   48 +
 .../gcc.target/s390/vector/vec-abi-single-1.c  |   24 +
 .../gcc.target/s390/vector/vec-abi-single-2.c  |   12 +++
 .../gcc.target/s390/vector/vec-abi-struct-1.c  |   37 +++
 .../gcc.target/s390/vector/vec-abi-vararg-1.c  |   60 +++
 .../gcc.target/s390/vector/vec-abi-vararg-2.c  |   18 
 .../gcc.target/s390/vector/vec-clobber-1.c |   38 +++
 gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c   |   45 
 gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c   |   38 +++
 .../s390/vector/vec-dbl-math-compile-1.c   |   48 +
 .../gcc.target/s390/vector/vec-genbytemask-1.c |   70 +
 .../gcc.target/s390/vector/vec-genbytemask-2.c |   46 +
 .../gcc.target/s390/vector/vec-genmask-1.c |   70 +
 .../gcc.target/s390/vector/vec-genmask-2.c |   46 +
 gcc/testsuite/gcc.target/s390/vector/vec-init-1.c  |   68 
 .../s390/vector/vec-int-math-compile-1.c   |   40 
 gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c |  108 
 gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c   |   51 +
 23 files changed, 1037 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-4.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-align-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-single-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-single-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-struct-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-vararg-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-clobber-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-cmp-2.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/vec-dbl-math-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genbytemask-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genmask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-genmask-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-init-1.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/vec-int-math-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-shift-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-sub-1.c

diff --git a/gcc/testsuite/gcc.target/s390/s390.exp 
b/gcc/testsuite/gcc.target/s390/s390.exp
index 431e2c0..eb1d73b 100644
--- a/gcc/testsuite/gcc.target/s390/s390.exp
+++ b/gcc/testsuite/gcc.target/s390/s390.exp
@@ -37,6 +37,21 @@ proc check_effective_target_htm { } {
 }] -march=zEC12 -mzarch ] } { return 0 } else { return

[PATCH 07/13] S/390 Add vector scalar instruction support.

2015-05-11 Thread Andreas Krebbel

With this patch GCC makes use of the vector instruction which are
available in single element mode.  By using these instructions scalar
double operations can use 32 registers.

gcc/
* config/s390/s390-modes.def: Add new modes CCVEQ, CCVFH, and
CCVFHE.
* config/s390/s390.c (s390_match_ccmode_set): Handle new modes.
(s390_select_ccmode): Likewise.
(s390_canonicalize_comparison): Swap operands if necessary.
(s390_expand_vec_compare_scalar): Expand DFmode compare using
single element vector instructions.
(s390_emit_compare): Call s390_expand_vec_compare_scalar.
(s390_branch_condition_mask): Generate CC masks for the new modes.
* config/s390/s390.md (v0, vf, vd): New mode attributes.
(VFCMP, asm_fcmp, insn_cmp): New mode iterator and attributes.
(*vec_cmpinsn_cmpdf_cconly, *fixuns_truncdfdi2_z13)
(*fix_truncBFP:modeGPR:mode2_bfp, *floatunsdidf2_z13)
(*floatunsGPR:modeFP:mode2, *extendsfdf2_z13)
(*extendDSF:modeBFP:mode2): New insn definition.
(fix_truncBFP:modeGPR:mode2_bfp, loatunsGPR:modeFP:mode2)
(extendDSF:modeBFP:mode2): Turn into expander.
(floatdimode2, truncdfsf2, addmode3, submode3, mulmode3)
(divmode3, *negmode2, *absmode2, *negabsmode2)
(sqrtmode2): Add vector instruction.

gcc/testsuite/
* gcc.target/s390/vector/vec-scalar-cmp-1.c: New test.
---
 gcc/config/s390/s390-modes.def |   10 +
 gcc/config/s390/s390.c |  131 -
 gcc/config/s390/s390.md|  304 ++--
 .../gcc.target/s390/vector/vec-scalar-cmp-1.c  |   49 
 4 files changed, 403 insertions(+), 91 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c

diff --git a/gcc/config/s390/s390-modes.def b/gcc/config/s390/s390-modes.def
index a40559e..26c0a81 100644
--- a/gcc/config/s390/s390-modes.def
+++ b/gcc/config/s390/s390-modes.def
@@ -84,7 +84,12 @@ Requested mode- Destination CC register mode
 CCS, CCU, CCT, CCSR, CCUR - CCZ
 CCA   - CCAP, CCAN
 
+Vector comparison modes
 
+CCVEQEQ  --   NE (VCEQ)
+
+CCVFHGT  --   UNLE   (VFCH)
+CCVFHE   GE  --   UNLT   (VFCHE)
 *** Comments ***
 
 CCAP, CCAN
@@ -182,6 +187,11 @@ CC_MODE (CCT2);
 CC_MODE (CCT3);
 CC_MODE (CCRAW);
 
+CC_MODE (CCVEQ);
+CC_MODE (CCVFH);
+CC_MODE (CCVFHE);
+
+
 /* Vector modes.  */
 
 VECTOR_MODES (INT, 2);/* V2QI */
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 11fed14..848cc0c 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -681,6 +681,9 @@ s390_match_ccmode_set (rtx set, machine_mode req_mode)
 case CCT1mode:
 case CCT2mode:
 case CCT3mode:
+case CCVEQmode:
+case CCVFHmode:
+case CCVFHEmode:
   if (req_mode != set_mode)
 return 0;
   break;
@@ -781,6 +784,29 @@ s390_tm_ccmode (rtx op1, rtx op2, bool mixed)
 machine_mode
 s390_select_ccmode (enum rtx_code code, rtx op0, rtx op1)
 {
+  if (TARGET_VX
+   register_operand (op0, DFmode)
+   register_operand (op1, DFmode))
+{
+  /* LT, LE, UNGT, UNGE require swapping OP0 and OP1.  Either
+s390_emit_compare or s390_canonicalize_comparison will take
+care of it.  */
+  switch (code)
+   {
+   case EQ:
+   case NE:
+ return CCVEQmode;
+   case GT:
+   case UNLE:
+ return CCVFHmode;
+   case GE:
+   case UNLT:
+ return CCVFHEmode;
+   default:
+ ;
+   }
+}
+
   switch (code)
 {
   case EQ:
@@ -1058,8 +1084,74 @@ s390_canonicalize_comparison (int *code, rtx *op0, rtx 
*op1,
   rtx tem = *op0; *op0 = *op1; *op1 = tem;
   *code = (int)swap_condition ((enum rtx_code)*code);
 }
+
+  /* Using the scalar variants of vector instructions for 64 bit FP
+ comparisons might require swapping the operands.  */
+  if (TARGET_VX
+   register_operand (*op0, DFmode)
+   register_operand (*op1, DFmode)
+   (*code == LT || *code == LE || *code == UNGT || *code == UNGE))
+{
+  rtx tmp;
+
+  switch (*code)
+   {
+   case LT:   *code = GT; break;
+   case LE:   *code = GE; break;
+   case UNGT: *code = UNLE; break;
+   case UNGE: *code = UNLT; break;
+   default: ;
+   }
+  tmp = *op0; *op0 = *op1; *op1 = tmp;
+}
 }
 
+/* Helper function for s390_emit_compare.  If possible emit a 64 bit
+   FP compare using the single element variant of vector instructions.
+   Replace CODE with the comparison code to be used in the CC reg
+   compare and return the condition code register RTX in CC.  */
+
+static bool
+s390_expand_vec_compare_scalar (enum rtx_code *code, rtx cmp1, rtx cmp2,
+   rtx *cc)
+{
+  machine_mode

[PATCH 02/13] optabs: Fix vec_perm - V16QI middle end lowering.

2015-05-11 Thread Andreas Krebbel

The current implementation re-uses the location of the selection
pattern to generate a new one.  This fails if the pattern resides in a
read-only location.  With the patch a new temporary register is
allocated for that purpose.

gcc/
* optabs.c (expand_vec_perm): Allocate a temp reg for the new
  select pattern.
---
 gcc/optabs.c |   18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/gcc/optabs.c b/gcc/optabs.c
index 983c8d9..8926efa 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6784,14 +6784,18 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx 
sel, rtx target)
 {
   /* Multiply each element by its byte size.  */
   machine_mode selmode = GET_MODE (sel);
+  /* We cannot re-use SEL as a temp operand since it might by in
+read-only storage.  */
+  rtx sel_reg = gen_reg_rtx (selmode);
+
   if (u == 2)
-   sel = expand_simple_binop (selmode, PLUS, sel, sel,
-  sel, 0, OPTAB_DIRECT);
+   sel_reg = expand_simple_binop (selmode, PLUS, sel, sel,
+  sel_reg, 0, OPTAB_DIRECT);
   else
-   sel = expand_simple_binop (selmode, ASHIFT, sel,
-  GEN_INT (exact_log2 (u)),
-  sel, 0, OPTAB_DIRECT);
-  gcc_assert (sel != NULL);
+   sel_reg = expand_simple_binop (selmode, ASHIFT, sel,
+  GEN_INT (exact_log2 (u)),
+  sel_reg, 0, OPTAB_DIRECT);
+  gcc_assert (sel_reg != NULL);
 
   /* Broadcast the low byte each element into each of its bytes.  */
   vec = rtvec_alloc (w);
@@ -6803,7 +6807,7 @@ expand_vec_perm (machine_mode mode, rtx v0, rtx v1, rtx 
sel, rtx target)
  RTVEC_ELT (vec, i) = GEN_INT (this_e);
}
   tmp = gen_rtx_CONST_VECTOR (qimode, vec);
-  sel = gen_lowpart (qimode, sel);
+  sel = gen_lowpart (qimode, sel_reg);
   sel = expand_vec_perm (qimode, sel, sel, tmp, NULL);
   gcc_assert (sel != NULL);
 
-- 
1.7.9.5

[RFC 12/13] S/390 Vector ABI GNU Attribute.

2015-05-11 Thread Andreas Krebbel

With this patch .gnu_attribute is used to mark binaries with a vector
ABI tag.  This is required since the z13 vector support breaks the ABI
of existing vector_size attribute generated vector types:

1. vector_size(16) and bigger vectors are aligned to 8 byte
boundaries (formerly vectors were always naturally aligned)

2. vector_size(16) or smaller vectors are passed via VR if available
or by value on the stack (formerly vector were passed on the stack by
reference).

The .gnu_attribute will be used by ld to emit a warning if binaries
with incompatible ABIs are being linked together:
https://sourceware.org/ml/binutils/2015-04/msg00316.html

And it will be used by GDB to perform inferior function calls using a
vector ABI which fits to the binary being debugged:
https://sourceware.org/ml/gdb-patches/2015-04/msg00833.html

The current implementation tries to only set the attribute if the
vector types are really used in ABI relevant contexts in order to
avoid false positives during linking.

However, this unfortunately has some limitations like in the following
case where an ABI relevant context cannot be detected properly:

typedef int __attribute__((vector_size(16))) v4si;
struct A
{
  char x;
  v4si y;
};
char a[sizeof(struct A)];

The number of elements in a depends on the ABI (24 with -mvx and 32
with -mno-vx).  However, the implementation is not able to detect this
since the struct type is not used anywhere else and consequently does
not survive until the checking code is able to see it.

Ideas about how to improve the implementation without creating too
many false postives are welcome.

In particular we do not want to set the attribute for local uses of
vector types as they would be natural for ifunc optimizations.

gcc/
* config/s390/s390.c (s390_vector_abi): New variable definition.
(s390_check_type_for_vector_abi): New function.
(TARGET_ASM_FILE_END): New macro definition.
(s390_asm_file_end): New function.
(s390_function_arg): Call s390_check_type_for_vector_abi.
(s390_gimplify_va_arg): Likewise.
* configure: Regenerate.
* configure.ac: Check for .gnu_attribute Binutils feature.

gcc/testsuite/
* gcc.target/s390/vector/vec-abi-1.c: Add gnu attribute check.
* gcc.target/s390/vector/vec-abi-attr-1.c: New test.
* gcc.target/s390/vector/vec-abi-attr-2.c: New test.
* gcc.target/s390/vector/vec-abi-attr-3.c: New test.
* gcc.target/s390/vector/vec-abi-attr-4.c: New test.
* gcc.target/s390/vector/vec-abi-attr-5.c: New test.
* gcc.target/s390/vector/vec-abi-attr-6.c: New test.
---
 gcc/config/s390/s390.c |  120 
 gcc/configure  |   36 ++
 gcc/configure.ac   |7 ++
 gcc/testsuite/gcc.target/s390/vector/vec-abi-1.c   |1 +
 .../gcc.target/s390/vector/vec-abi-attr-1.c|   18 +++
 .../gcc.target/s390/vector/vec-abi-attr-2.c|   53 +
 .../gcc.target/s390/vector/vec-abi-attr-3.c|   18 +++
 .../gcc.target/s390/vector/vec-abi-attr-4.c|   17 +++
 .../gcc.target/s390/vector/vec-abi-attr-5.c|   19 
 .../gcc.target/s390/vector/vec-abi-attr-6.c|   24 
 10 files changed, 313 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-4.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-5.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-abi-attr-6.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index e682516..e1ae1ed 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -466,6 +466,97 @@ struct GTY(()) machine_function
 #define PREDICT_DISTANCE (TARGET_Z10 ? 384 : 2048)
 
 
+/* Indicate which ABI has been used for passing vector args.
+   0 - no vector type arguments have been passed where the ABI is relevant
+   1 - the old ABI has been used
+   2 - a vector type argument has been passed either in a vector register
+   or on the stack by value  */
+static int s390_vector_abi = 0;
+
+/* Set the vector ABI marker if TYPE is subject to the vector ABI
+   switch.  The vector ABI affects only vector data types.  There are
+   two aspects of the vector ABI relevant here:
+
+   1. vectors = 16 bytes have an alignment of 8 bytes with the new
+   ABI and natural alignment with the old.
+
+   2. vector = 16 bytes are passed in VRs or by value on the stack
+   with the new ABI but by reference on the stack with the old.
+
+   If ARG_P is true TYPE is used for a function argument or return
+   value.  The ABI marker then is set for all vector data types.  If
+   ARG_P is false only type 1 vectors are being checked.  */
+

[PATCH 10/13] Testsuite These testcases require disabling hardware vector support on S/390.

2015-05-11 Thread Andreas Krebbel

gcc/testsuite/
* gcc.dg/tree-ssa/gen-vect-11b.c: Disable vector
  instructions on s390*.
* gcc.dg/tree-ssa/gen-vect-11c.c: Likewise.
---
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c |1 +
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c |1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c 
b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c
index 50dea9c..41d0e0c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11b.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
 /* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */
+/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details -mno-vx { 
target { s390*-*-* } } } */
 /* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse { 
target { i?86-*-* x86_64-*-* } } } */
 
 #include stdlib.h
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c 
b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c
index f3ada99..cf0aef7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11c.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target vect_cmdline_needed } } */
 /* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */
+/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details -mno-vx { 
target { s390*-*-* } } } */
 /* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse { 
target { i?86-*-* x86_64-*-* } } } */
 
 #include stdlib.h
-- 
1.7.9.5

[PATCH 11/13] Testsuite S/390 vector types are only 8 byte aligned.

2015-05-11 Thread Andreas Krebbel

gcc/testsuite/
* lib/target-supports.exp: Vector do not always have natural
  alignment on s390*.
---
 gcc/testsuite/lib/target-supports.exp |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index c5d0ffe..155cefa 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4347,7 +4347,8 @@ proc check_effective_target_vect_natural_alignment { } {
 } else {
 set et_vect_natural_alignment_saved 1
 if { [check_effective_target_arm_eabi]
-|| [istarget nvptx-*-*] } {
+|| [istarget nvptx-*-*]
+|| [istarget s390*-*-*] } {
 set et_vect_natural_alignment_saved 0
 }
 }
-- 
1.7.9.5

[PATCH 13/13] S/390 Invalid vector binary ops

2015-05-11 Thread Andreas Krebbel

This is a first try to implement at least some of the requirements
regarding the vector bool type documented for IBM XLC.

With this patch error messages will be issued for invalid uses of
vector bool types in binary operators.

vector bool types are being marked opaque in order to prevent the
front-end from complaining about vector bool long vs vector bool
long long combinations on 64 bit.  The opaque flag basically
suppresses any type checking. However, we still want vector bool to be
accepted only in contexts specified in the documentation (to be
published soon).  Implementing the invalid binary op hook does this
for binary operators at least.  But this is far from being complete :(

gcc/
* config/s390/s390.c (s390_vector_bool_type_p): New function.
(s390_invalid_binary_op): New function.
(TARGET_INVALID_BINARY_OP): Define macro.
---
 gcc/config/s390/s390.c |   61 
 1 file changed, 61 insertions(+)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index e1ae1ed..a64836e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -13764,6 +13764,64 @@ s390_asm_file_end (void)
 #endif
 }
 
+/* Return true if TYPE is a vector bool type.  */
+static inline bool
+s390_vector_bool_type_p (const_tree type)
+{
+  return TYPE_VECTOR_OPAQUE (type);
+}
+
+/* Return the diagnostic message string if the binary operation OP is
+   not permitted on TYPE1 and TYPE2, NULL otherwise.  */
+static const char*
+s390_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1, const_tree 
type2)
+{
+  bool bool1_p, bool2_p;
+  bool plusminus_p;
+  bool muldiv_p;
+  bool compare_p;
+  machine_mode mode1, mode2;
+
+  if (!TARGET_ZVECTOR)
+return NULL;
+
+  if (!VECTOR_TYPE_P (type1) || !VECTOR_TYPE_P (type2))
+return NULL;
+
+  bool1_p = s390_vector_bool_type_p (type1);
+  bool2_p = s390_vector_bool_type_p (type2);
+
+  /* Mixing signed and unsigned types is forbidden for all
+ operators.  */
+  if (!bool1_p  !bool2_p
+   TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
+return N_(types differ in signess);
+
+  plusminus_p = (op == PLUS_EXPR || op == MINUS_EXPR);
+  muldiv_p = (op == MULT_EXPR || op == RDIV_EXPR || op == TRUNC_DIV_EXPR
+ || op == CEIL_DIV_EXPR || op == FLOOR_DIV_EXPR
+ || op == ROUND_DIV_EXPR);
+  compare_p = (op == LT_EXPR || op == LE_EXPR || op == GT_EXPR || op == GE_EXPR
+  || op == EQ_EXPR || op == NE_EXPR);
+
+  if (bool1_p  bool2_p  (plusminus_p || muldiv_p))
+return N_(binary operator does not support two vector bool operands);
+
+  if (bool1_p != bool2_p  (muldiv_p || compare_p))
+return N_(binary operator does not support vector bool operand);
+
+  mode1 = TYPE_MODE (type1);
+  mode2 = TYPE_MODE (type2);
+
+  if (bool1_p != bool2_p  plusminus_p
+   (GET_MODE_CLASS (mode1) == MODE_VECTOR_FLOAT
+ || GET_MODE_CLASS (mode2) == MODE_VECTOR_FLOAT))
+return N_(binary operator does not support mixing vector 
+ bool with floating point vector operands);
+
+  return NULL;
+}
+
 /* Initialize GCC target structure.  */
 
 #undef  TARGET_ASM_ALIGNED_HI_OP
@@ -13981,6 +14039,9 @@ s390_asm_file_end (void)
 #undef TARGET_ASM_FILE_END
 #define TARGET_ASM_FILE_END s390_asm_file_end
 
+#undef TARGET_INVALID_BINARY_OP
+#define TARGET_INVALID_BINARY_OP s390_invalid_binary_op
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include gt-s390.h
-- 
1.7.9.5

[PATCH 09/13] S/390 Add zvector testcases.

2015-05-11 Thread Andreas Krebbel

gcc/testsuite/
* gcc.target/s390/zvector/vec-dbl-math-compile-1.c: New test.
* gcc.target/s390/zvector/vec-genbytemask-1.c: New test.
* gcc.target/s390/zvector/vec-genmask-1.c: New test.
* gcc.target/s390/zvector/vec-lcbb-1.c: New test.
* gcc.target/s390/zvector/vec-overloading-1.c: New test.
* gcc.target/s390/zvector/vec-overloading-2.c: New test.
* gcc.target/s390/zvector/vec-overloading-3.c: New test.
* gcc.target/s390/zvector/vec-overloading-4.c: New test.
* gcc.target/s390/zvector/vec-test-mask-1.c: New test.
* gcc.target/s390/zvector/vec-elem-1.c: New test.
---
 .../s390/zvector/vec-dbl-math-compile-1.c  |   67 +
 gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c |   11 +++
 .../gcc.target/s390/zvector/vec-genbytemask-1.c|   21 ++
 .../gcc.target/s390/zvector/vec-genmask-1.c|   24 ++
 gcc/testsuite/gcc.target/s390/zvector/vec-lcbb-1.c |   31 
 .../gcc.target/s390/zvector/vec-overloading-1.c|   77 
 .../gcc.target/s390/zvector/vec-overloading-2.c|   54 ++
 .../gcc.target/s390/zvector/vec-overloading-3.c|   19 +
 .../gcc.target/s390/zvector/vec-overloading-4.c|   18 +
 .../gcc.target/s390/zvector/vec-test-mask-1.c  |   25 +++
 10 files changed, 347 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec-dbl-math-compile-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-genbytemask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-genmask-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-lcbb-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-overloading-4.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-test-mask-1.c

diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-dbl-math-compile-1.c 
b/gcc/testsuite/gcc.target/s390/zvector/vec-dbl-math-compile-1.c
new file mode 100644
index 000..31b277b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-dbl-math-compile-1.c
@@ -0,0 +1,67 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options -O3 -mzarch -march=z13 -mzvector --save-temps } */
+
+/* { dg-final { scan-assembler-times vfcedb\t 1 } } */
+/* { dg-final { scan-assembler-times vfchdb\t 2 } } */
+/* { dg-final { scan-assembler-times vfchedb\t 2 } } */
+
+/* { dg-final { scan-assembler-times vfcedbs\t 2 } } */
+/* { dg-final { scan-assembler-times vfchdbs\t 2 } } */
+
+/* { dg-final { cleanup-saved-temps } } */
+
+#include vecintrin.h
+
+vector bool long long
+cmpeq (vector double a, vector double b)
+{
+  return vec_cmpeq (a, b); /* vfcedb */
+}
+
+vector bool long long
+cmpgt (vector double a, vector double b)
+{
+  return vec_cmpgt (a, b); /* vfchdb */
+}
+
+vector bool long long
+cmpge (vector double a, vector double b)
+{
+  return vec_cmpge (a, b); /* vfchedb */
+}
+
+vector bool long long
+cmplt (vector double a, vector double b)
+{
+  return vec_cmplt (a, b); /* vfchdb */
+}
+
+vector bool long long
+cmple (vector double a, vector double b)
+{
+  return vec_cmple (a, b); /* vfchedb */
+}
+
+int
+all_eq (vector double a, vector double b)
+{
+  return vec_all_eq (a, b);
+}
+
+int
+any_eq (vector double a, vector double b)
+{
+  return vec_any_eq (a, b);
+}
+
+int
+all_lt (vector double a, vector double b)
+{
+  return vec_all_lt (a, b);
+}
+
+int
+any_lt (vector double a, vector double b)
+{
+  return vec_any_lt (a, b);
+}
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c 
b/gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c
new file mode 100644
index 000..c8578bf8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-elem-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { s390*-*-* } } } */
+/* { dg-options -O3 -mzarch -march=z13 -mzvector } */
+
+/* { dg-final { scan-assembler nilf\t%r2,15 } } */
+/* { dg-final { scan-assembler vlgvb } } */
+
+signed char
+foo(unsigned char uc)
+{
+  return __builtin_s390_vec_extract((__vector signed char){ 0 }, uc);
+}
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-genbytemask-1.c 
b/gcc/testsuite/gcc.target/s390/zvector/vec-genbytemask-1.c
new file mode 100644
index 000..09471f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-genbytemask-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options -O3 -mzarch -march=z13 -mzvector } */
+
+#include vecintrin.h
+
+
+vector unsigned char a, b, c, d;
+
+int
+foo ()
+{
+  a = vec_genmask (0);
+  b = vec_genmask (65535);
+  c = vec_genmask (43605);
+  d = vec_genmask (37830);
+}
+
+/* { dg-final { scan-assembler-times vzero 1 } } */
+/* {

[PATCH 04/13] S/390 Add -march/-mtune=z13 option.

2015-05-11 Thread Andreas Krebbel

gcc/
* common/config/s390/s390-common.c (processor_flags_table): Add
z13.
* config.gcc: Add z13.
* config/s390/s390-opts.h (enum processor_type): Add
PROCESSOR_2964_Z13.
* config/s390/s390.c (s390_adjust_priority): Check for
PROCESSOR_2964_Z13.
(s390_reorg): Likewise.
(s390_sched_reorder): Likewise.
(s390_sched_variable_issue): Likewise.
(s390_loop_unroll_adjust): Likewise.
(s390_option_override): Likewise. Default to -mvx when available.
* config/s390/s390.h (enum processor_flags): Add PF_Z13 and PF_VX.
(TARGET_CPU_Z13, TARGET_CPU_VX, TARGET_Z13, TARGET_VX)
(TARGET_VX_ABI): Define macros.
macros.
(TARGET_DEFAULT): Add MASK_OPT_VX.
* config/s390/s390.md (cpu attribute): Add z13.
(cpu_facility attribute): Add vec.
* config/s390/s390.opt (processor_type): Add z13.
(mvx): New options.
---
 gcc/common/config/s390/s390-common.c |3 +++
 gcc/config.gcc   |2 +-
 gcc/config/s390/s390-opts.h  |1 +
 gcc/config/s390/s390.c   |   35 --
 gcc/config/s390/s390.h   |   19 --
 gcc/config/s390/s390.md  |8 ++--
 gcc/config/s390/s390.opt |7 +++
 7 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index 7181beb..43459c8 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -42,7 +42,10 @@ EXPORTED_CONST int processor_flags_table[] =
 /* z196 */   PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
  | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196,
 /* zEC12 */  PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
+ | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX,
+/* z13 */PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
  | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
+ | PF_Z13 | PF_VX
   };
 
 /* Change optimizations to be performed, depending on the
diff --git a/gcc/config.gcc b/gcc/config.gcc
index a1df043..a2af2a0 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4071,7 +4071,7 @@ case ${target} in
for which in arch tune; do
eval val=\$with_$which
case ${val} in
-| g5 | g6 | z900 | z990 | z9-109 | z9-ec | z10 | 
z196 | zEC12)
+| g5 | g6 | z900 | z990 | z9-109 | z9-ec | z10 | 
z196 | zEC12 | z13)
# OK
;;
*)
diff --git a/gcc/config/s390/s390-opts.h b/gcc/config/s390/s390-opts.h
index cb9ebc7..5bde333 100644
--- a/gcc/config/s390/s390-opts.h
+++ b/gcc/config/s390/s390-opts.h
@@ -35,6 +35,7 @@ enum processor_type
   PROCESSOR_2097_Z10,
   PROCESSOR_2817_Z196,
   PROCESSOR_2827_ZEC12,
+  PROCESSOR_2964_Z13,
   PROCESSOR_max
 };
 
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index cc37618..843a860 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5851,7 +5851,8 @@ s390_adjust_priority (rtx_insn *insn, int priority)
s390_tune != PROCESSOR_2094_Z9_109
s390_tune != PROCESSOR_2097_Z10
s390_tune != PROCESSOR_2817_Z196
-   s390_tune != PROCESSOR_2827_ZEC12)
+   s390_tune != PROCESSOR_2827_ZEC12
+   s390_tune != PROCESSOR_2964_Z13)
 return priority;
 
   switch (s390_safe_attr_type (insn))
@@ -11451,7 +11452,8 @@ s390_reorg (void)
   /* Walk over the insns and do some =z10 specific changes.  */
   if (s390_tune == PROCESSOR_2097_Z10
   || s390_tune == PROCESSOR_2817_Z196
-  || s390_tune == PROCESSOR_2827_ZEC12)
+  || s390_tune == PROCESSOR_2827_ZEC12
+  || s390_tune == PROCESSOR_2964_Z13)
 {
   rtx_insn *insn;
   bool insn_added_p = false;
@@ -11700,7 +11702,8 @@ s390_sched_reorder (FILE *file, int verbose,
 if (reload_completed  *nreadyp  1)
   s390_z10_prevent_earlyload_conflicts (ready, nreadyp);
 
-  if (s390_tune == PROCESSOR_2827_ZEC12
+  if ((s390_tune == PROCESSOR_2827_ZEC12
+   || s390_tune == PROCESSOR_2964_Z13)
reload_completed
*nreadyp  1)
 {
@@ -11783,7 +11786,8 @@ s390_sched_variable_issue (FILE *file, int verbose, 
rtx_insn *insn, int more)
 {
   last_scheduled_insn = insn;
 
-  if (s390_tune == PROCESSOR_2827_ZEC12
+  if ((s390_tune == PROCESSOR_2827_ZEC12
+   || s390_tune == PROCESSOR_2964_Z13)
reload_completed
recog_memoized (insn) = 0)
 {
@@ -11863,7 +11867,8 @@ s390_loop_unroll_adjust (unsigned nunroll, struct loop 
*loop)
 
   if (s390_tune != PROCESSOR_2097_Z10
s390_tune != PROCESSOR_2817_Z196
-   s390_tune != PROCESSOR_2827_ZEC12)
+   s390_tune != PROCESSOR_2827_ZEC12
+   s390_tune != PROCESSOR_2964_Z13)

Re: [patch] Perform anonymous constant propagation during inlining

2015-05-11 Thread Eric Botcazou

 Would you be OK with a slight variation of your earlier idea, i.e.
 calling fold_stmt with a specific valueizer from fold_marked_statements
 instead of the implicit no_follow_ssa_edges in the inliner?  Something
 like:
 
 tree
 follow_anonymous_single_use_edges (tree val)
 {
 
   if (TREE_CODE (val) == SSA_NAME
   
(!SSA_NAME_VAR (val) || DECL_IGNORED_P (SSA_NAME_VAR
   (var)))
has_single_use (val))
 
 return val
   
   return NULL_TREE;
 
 }
 
 Yes, that works for me as well.

But not for GCC. :-)  The propagation per se works but, since the statement is 
not folded in the end, no replacement is made at all...

So we're back to square one and anonymous constant propagation seems to be the 
only way out at -O0.  The attached patch implements it in a less hackish way 
(and enables it unconditionally when optimizing as suggested by Jan) by doing 
it just before invoking fold_stmt on the marked statements so this should make 
folding more effective in the process. 

Tested (compiler only) on x86_64-suse-linux, OK for the mainline?


2015-05-11  Eric Botcazou  ebotca...@adacore.com

* tree-inline.c: Include tree-ssa-propagate.h.
(replace_constant_uses_in): New function.
(fold_marked_statements): Call it before folding the statement.

* gimple-expr.h (is_gimple_constant): Reorder.
* tree-ssa-propagate.c (before_dom_children): Use inline accessor.


-- 
Eric BotcazouIndex: tree-ssa-propagate.c
===
--- tree-ssa-propagate.c	(revision 222673)
+++ tree-ssa-propagate.c	(working copy)
@@ -1246,9 +1246,7 @@ substitute_and_fold_dom_walker::before_d
 	   gimple_call_noreturn_p (stmt))
 	stmts_to_fixup.safe_push (stmt);
 
-	  if (is_gimple_assign (stmt)
-	   (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
-		  == GIMPLE_SINGLE_RHS))
+	  if (gimple_assign_single_p (stmt))
 	{
 	  tree rhs = gimple_assign_rhs1 (stmt);
 
Index: tree-inline.c
===
--- tree-inline.c	(revision 222673)
+++ tree-inline.c	(working copy)
@@ -82,6 +82,7 @@ along with GCC; see the file COPYING3.
 #include expr.h
 #include tree-dfa.h
 #include tree-ssa.h
+#include tree-ssa-propagate.h
 #include tree-pretty-print.h
 #include except.h
 #include debug.h
@@ -4801,6 +4802,54 @@ gimple_expand_calls_inline (basic_block
   return inlined;
 }
 
+/* Replace USE references in statement STMT with constant values.  Return
+   true if at least one reference was replaced.
+
+   We do it at -O0 for anonymous SSA names or SSA names of anonymous decls,
+   this makes it possible to generate reasonable code for operators that are
+   implemented as functions and inlined on constants.  */
+
+static bool
+replace_constant_uses_in (gimple stmt)
+{
+  bool replaced = false;
+  use_operand_p use;
+  ssa_op_iter iter;
+
+  FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
+{
+  tree tuse = USE_FROM_PTR (use);
+
+  if (TREE_CODE (tuse) != SSA_NAME || SSA_NAME_IS_DEFAULT_DEF (tuse))
+	continue;
+
+  if (!optimize
+	   SSA_NAME_VAR (tuse)
+	   !DECL_IGNORED_P (SSA_NAME_VAR (tuse)))
+	continue;
+
+  if (!gimple_assign_single_p (SSA_NAME_DEF_STMT (tuse)))
+	continue;
+
+  tree rhs = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (tuse));
+  if (!is_gimple_constant (rhs))
+	continue;
+
+  propagate_value (use, rhs);
+
+  replaced = true;
+}
+
+  if (replaced  gimple_assign_single_p (stmt))
+{
+  tree rhs = gimple_assign_rhs1 (stmt);
+
+  if (TREE_CODE (rhs) == ADDR_EXPR)
+	recompute_tree_invariant_for_addr_expr (rhs);
+}
+
+  return replaced;
+}
 
 /* Walk all basic blocks created after FIRST and try to fold every statement
in the STATEMENTS pointer set.  */
@@ -4819,7 +4868,10 @@ fold_marked_statements (int first, hash_
 	  if (statements-contains (gsi_stmt (gsi)))
 	{
 	  gimple old_stmt = gsi_stmt (gsi);
-	  tree old_decl = is_gimple_call (old_stmt) ? gimple_call_fndecl (old_stmt) : 0;
+	  tree old_decl
+		= is_gimple_call (old_stmt)
+		  ? gimple_call_fndecl (old_stmt) : NULL_TREE;
+	  bool replaced = replace_constant_uses_in (old_stmt);
 
 	  if (old_decl  DECL_BUILT_IN (old_decl))
 		{
@@ -4827,7 +4879,7 @@ fold_marked_statements (int first, hash_
 		 we need to look at all of them.  */
 		  gimple_stmt_iterator i2 = gsi;
 		  gsi_prev (i2);
-		  if (fold_stmt (gsi))
+		  if (fold_stmt (gsi) || replaced)
 		{
 		  gimple new_stmt;
 		  /* If a builtin at the end of a bb folded into nothing,
@@ -4871,7 +4923,7 @@ fold_marked_statements (int first, hash_
 			}
 		}
 		}
-	  else if (fold_stmt (gsi))
+	  else if (fold_stmt (gsi) || replaced)
 		{
 		  /* Re-read the statement from GSI as fold_stmt() may
 		 have changed it.  */
Index: gimple-expr.h
===
--- gimple-expr.h	(revision 222673)

Re: [PATCH] Fix memory leak in C++ pretty printer

2015-05-11 Thread Manuel López-Ibáñez


On 11/05/15 03:34, Patrick Palka wrote:

In gcc/cp/error.c we initialize the C++ pretty printer object twice:
first during statics initialization and later in a placement-new in
init_error().  This double-initialization causes a memory leak of about
7kb according to valgrind.  I don't see a reason to initialize the
object a second time so I elected to remove init_error().


I seem to remember there is some issue with the constructors when using static 
initialization that requires the placement-new. We also do the placement-new 
dance in the other FEs and the reason should be the same.


My preference would be to replace the static with a pointer and placement-new 
with proper new and delete, but see: 
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00910.html


If you change it here, please change it everywhere else where we use 
placement-new, such that all FEs are consistent on this.


Cheers,

Manuel.

Re: [PATCH 01/13] recog: Increased max number of alternatives.

2015-05-11 Thread Segher Boessenkool

On Mon, May 11, 2015 at 03:23:29PM +0200, Andreas Krebbel wrote:
 With the vector facility support z13 mov patterns have more than 30
 alternatives.

Wow, that is a lot!

 --- a/gcc/recog.h
 +++ b/gcc/recog.h
 @@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.  If not see
  /* Random number that should be large enough for all purposes.  Also define
 a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
 bit giving an invalid value that can be used to mean uninitialized.  */
 -#define MAX_RECOG_ALTERNATIVES 30
 +#define MAX_RECOG_ALTERNATIVES 35
  typedef unsigned int alternative_mask;

int isn't at least 36 bits.


Segher

Re: [Patch, Fortran] Simplify lbound

2015-05-11 Thread Mikael Morin

Le 11/05/2015 00:08, Thomas Koenig a écrit :
 Am 10.05.2015 um 22:43 schrieb H.J. Lu:
 
 Here is what I have committed.


 It caused:

 /export/gnu/import/git/sources/gcc/gcc/testsuite/gfortran.dg/inline_matmul_3.f90:38:39:
 Error: Variable 'c1' cannot appear in the expression at (1)^M
 
 I know that error message, I got it when developing the inline
 matmul patches with the same test cases.  I had a fix for this
 error message in one of my matmul patches, but it was removed
 in the review process because it could no longer be reproduced.
 
 So, here is the fix again.  I think it is close to obvious (since it
 fixes the problem and can obviously do no harm), but anyway:  OK for
 trunk?
 
For what it's worth, I have looked at it further, and it seems to be
gfc_current_ns not being set to the internal namespace.
A patch like this also removes the error.

Index: frontend-passes.c
===
--- frontend-passes.c   (révision 223002)
+++ frontend-passes.c   (copie de travail)
@@ -581,6 +581,9 @@ insert_block ()
   else
 ns = inserted_block-ext.block.ns;

+  /* From now on, everything will happen in the inserted block.  */
+  gfc_current_ns = ns;
+
   return ns;
 }

To be honest, both patches look fragile to me. Yours because it leaves
gfc_current_ns to its value, leaving the door open to other problems.
Mine, well, because it's playing with a global variable, with the
possible side-effects this could have.
However, without a better idea, I'm OK with either patch (or both).

Mikael

[PATCH 08/13] S/390 zvector builtin support.

2015-05-11 Thread Andreas Krebbel

With this patch GCC implements an Altivec style set of builtins to
make use of vector instructions in C/C++ code.  This is provided for
compatibility with the IBM XL compiler.

gcc/
* config.gcc: Add vecintrin.h to extra_headers.  Add s390-c.o to
c_target_objs and cxx_target_objs.  Add t-s390 to tmake_file.
* config/s390/s390-builtin-types.def: New file.
* config/s390/s390-builtins.def: New file.
* config/s390/s390-builtins.h: New file.
* config/s390/s390-c.c: New file.
* config/s390/s390-modes.def: Add modes CCVEQANY, CCVH,
CCVHANY, CCVHU, CCVHUANY, CCVFHANY, CCVFHEANY.
* config/s390/s390-protos.h (s390_expand_vec_compare_cc)
(s390_cpu_cpp_builtins, s390_register_target_pragmas): Add
prototypes.
* config/s390/s390.c (s390-builtins.h, s390-builtins.def):
Include.
(flags_builtin, flags_overloaded_builtin_var, s390_builtin_types)
(s390_builtin_fn_types, s390_builtin_decls, code_for_builtin): New
variable definitions.
(s390_const_operand_ok): New function.
(s390_expand_builtin): Rewrite.
(s390_init_builtins): New function.
(s390_handle_vectorbool_attribute): New function.
(s390_attribute_table): Add s390_vector_bool attribute.
(s390_match_ccmode_set): Handle new cc modes CCVH, CCVHU.
(s390_branch_condition_mask): Generate masks for new modes.
(s390_expand_vec_compare_cc): New function.
(s390_mangle_type): Add mangling for vector bool types.
(enum s390_builtin): Remove.
(s390_atomic_assign_expand_fenv): Rename constants for sfpc and
efpc builtins.
* config/s390/s390.h (TARGET_CPU_CPP_BUILTINS): Call
s390_cpu_cpp_builtins.
(REGISTER_TARGET_PRAGMAS): New macro.
* config/s390/s390.md: Define more UNSPEC_VEC_* constants.
(insn_cmp mode attribute): Add new CC modes.
(s390_sfpc, s390_efpc): Rename patterns to sfpc and efpc.
(lcbb): New pattern definition.
* config/s390/s390intrin.h: Include vecintrin.h.
* config/s390/t-s390: New file.
* config/s390/vecintrin.h: New file.
* config/s390/vector.md: Include vx-builtins.md.
* config/s390/vx-builtins.md: New file.S/390 zvector builtin support.
---
 gcc/config.gcc |   24 +-
 gcc/config/s390/s390-builtin-types.def |  747 ++
 gcc/config/s390/s390-builtins.def  | 2486 
 gcc/config/s390/s390-builtins.h|  160 ++
 gcc/config/s390/s390-c.c   |  907 
 gcc/config/s390/s390-modes.def |   30 +
 gcc/config/s390/s390-protos.h  |8 +
 gcc/config/s390/s390.c |  833 ---
 gcc/config/s390/s390.h |   27 +-
 gcc/config/s390/s390.md|  118 +-
 gcc/config/s390/s390.opt   |4 +
 gcc/config/s390/s390intrin.h   |3 +
 gcc/config/s390/t-s390 |   27 +
 gcc/config/s390/vecintrin.h|  311 
 gcc/config/s390/vector.md  |2 +
 gcc/config/s390/vx-builtins.md | 2081 ++
 16 files changed, 7494 insertions(+), 274 deletions(-)
 create mode 100644 gcc/config/s390/s390-builtin-types.def
 create mode 100644 gcc/config/s390/s390-builtins.def
 create mode 100644 gcc/config/s390/s390-builtins.h
 create mode 100644 gcc/config/s390/s390-c.c
 create mode 100644 gcc/config/s390/t-s390
 create mode 100644 gcc/config/s390/vecintrin.h
 create mode 100644 gcc/config/s390/vx-builtins.md


0008-S-390-zvector-builtin-support.patch.gz
Description: Binary data

Re: C PATCH for -Wshift-negative-value (PR c/66066)

2015-05-11 Thread Manuel López-Ibáñez


On 11/05/15 16:21, Marek Polacek wrote:

The -Wshift-negative-value patch caused grief since it breaks building
some programs.  The following patch should alleviate the pain a bit: mark
a left shift of a negative value as non-const only if pedantic.


Either this is not correct according to the guidelines (the flag pedantic 
should not cause generated code differences or errors, 
https://gcc.gnu.org/wiki/DiagnosticsGuidelines) or the guidelines need updating.


My take is that this should work exactly like OPT_Woverflow (see 
constant_expression_warning):


!flag_isoc99 || !constant_required should produce a warning() with 
OPT_Wshift_negative_value


flag_isoc99  constant_required should produce a pedwarn() using (pedantic ? 
OPT_Wpedantic : OPT_Wshift_negative_value).


The above means that we only produce errors with -std=c99|gnu99 
-pedantic-errors in the places that are mandated by the standard. Otherwise, 
there are no code or error differences and at most there are warnings that are 
controlled by -Wshift-negative-value for those that want to disable them.


Of course, the problem here is that we are folding this even before we know if 
a constant is actually required by the standard, no? Is there no way to move 
the warning to the point where we actually know whether a constant is required 
or not?


Cheers,

Manuel.

Re: [PATCH][PR66010] Don't take address of ap unless necessary

2015-05-11 Thread Tom de Vries


On 11-05-15 09:47, Richard Biener wrote:

Bootstrapped and reg-tested on x86_64, with and without -m32.

OK for trunk?

[ FWIW, I suspect this patch will make life easier for the reimplementation of
the pass_stdarg optimization using ifn_va_arg. ]

+  if (canon_va_type != NULL)
+{
+  if (!(TREE_CODE (canon_va_type) == ARRAY_TYPE
+TREE_CODE (va_type) != ARRAY_TYPE))
+   /* In gimplify_va_arg_expr we take the address of the ap argument,
mark
+  it addressable now.  */
+   mark_addressable (expr);

can we simplify this and ...

-   }
-
+  gcc_assert (TREE_CODE (TREE_TYPE (valist)) != ARRAY_TYPE);
gimplify_expr (valist, pre_p, post_p, is_gimple_val, fb_rvalue);

this to use [!]POINTER_TYPE_P ()?


I'm not sure.


Why do we arrive with non-array
va_type but array canon_va_type in build_va_arg?


grokdeclarator in c-decl.c:
...
/* A parameter declared as an array of T is really a pointer to T.
   One declared as a function is really a pointer to a function.  */

if (TREE_CODE (type) == ARRAY_TYPE)
  {
/* Transfer const-ness of array into that of type pointed to.  */
type = TREE_TYPE (type);
if (type_quals)
  type = c_build_qualified_type (type, type_quals);
type = c_build_pointer_type (type);
...


I suppose the
va_list argument already decayed to a pointer then


The above means that the va_list function parameter decayed to a pointer. AFAIU, 
the va_list argument to va_arg just uses the same type (for parsing, grep for 
RID_VA_ARG in c-parser.c).



(in which case
the argument should already be addressable?)?



The argument is of pointer type. That pointer-typed-argument will only be 
addressable if we take the address of it, which is precisely the thing we're 
trying to avoid in this patch.



I think the overall idea of the patch is good - I'm just worried about
special-casing of ARRAY_TYPE vs. non-pointer-type (because it's the
latter that we ultimately want...).


AFAIU, the special casing of ARRAY_TYPE in the patch is a consequence of the 
special-casing of ARRAY_TYPE as a parameter.


I don't see how [!]POINTER_TYPE_P () can help here. I've rewritten and attached 
the build_va_arg bit using POINTER_TYPE_P and expanded comments a bit to 
demonstrate.


Thanks,
- Tom



diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 378f237..95cf69b 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5918,9 +5918,45 @@ set_compound_literal_name (tree decl)
 tree
 build_va_arg (location_t loc, tree expr, tree type)
 {
-  /* In gimplify_va_arg_expr we take the address of the ap argument, mark it
- addressable now.  */
-  mark_addressable (expr);
+  tree va_type = TREE_TYPE (expr);
+  tree canon_va_type = (va_type == error_mark_node
+			? NULL_TREE
+			: targetm.canonical_va_list_type (va_type));
+
+  if (va_type == error_mark_node
+  || canon_va_type == NULL_TREE)
+/* Let's not bother about addressable or not, if expr:
+   - has parse errors, or
+   - is not an va_list type.  */
+;
+  else
+{
+  if (POINTER_TYPE_P (va_type))
+	{
+	  /* If it's a pointer type, there are two possibilities:
+	 - expr is a va_list param decl that is declared as an array type
+	   but was turned into a pointer type by grok_declarator.
+	 - expr is a va_list decl declared as pointer type.
+	 Detect the former case by looking at the canonical type.  */
+	  if (TREE_CODE (canon_va_type) == ARRAY_TYPE)
+	/* We know that the pointer is constant, so there's no need to modify
+	   it, so there's no need to pass it around using an address
+	   operator, so there's no need to mark it addressable.  */
+	;
+	  else
+	/* The type is an actual va_list. It might be modified by va_arg, so
+	   we need to pass it around using an address operator, so we need
+	   to mark it addressable.  */
+	mark_addressable (expr);
+	}
+  else
+	{
+	  /* The type is an actual va_list. It might be modified by va_arg, so
+	 we need to pass it around using an address operator, so we need to
+	 mark it addressable.  */
+	  mark_addressable (expr);
+	}
+}
 
   expr = build1 (VA_ARG_EXPR, type, expr);
   SET_EXPR_LOCATION (expr, loc);

C PATCH for -Wshift-negative-value (PR c/66066)

2015-05-11 Thread Marek Polacek

The -Wshift-negative-value patch caused grief since it breaks building
some programs.  The following patch should alleviate the pain a bit: mark
a left shift of a negative value as non-const only if pedantic.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-05-11  Marek Polacek  pola...@redhat.com

PR c/66066
* c-typeck.c (build_binary_op): Mark left shift of a negative value
as non-const only if pedantic.

* gcc.dg/c99-left-shift-2.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 3fcb7c2..05b2709 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -10702,7 +10702,7 @@ build_binary_op (location_t location, enum tree_code 
code,
{
  /* Don't reject a left shift of a negative value in a context
 where a constant expression is needed in C90.  */
- if (flag_isoc99)
+ if (pedantic  flag_isoc99)
int_const = false;
  if (c_inhibit_evaluation_warnings == 0)
warning_at (location, OPT_Wshift_negative_value,
diff --git gcc/testsuite/gcc.dg/c99-left-shift-2.c 
gcc/testsuite/gcc.dg/c99-left-shift-2.c
index e69de29..a4cd9d0 100644
--- gcc/testsuite/gcc.dg/c99-left-shift-2.c
+++ gcc/testsuite/gcc.dg/c99-left-shift-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options -std=iso9899:1999 } */
+
+enum E { A = -2  1 };
+int i = -1  0;
+
+int
+f (int i)
+{
+  switch (i)
+  case -1  0: break;
+}

Marek

Re: [PATCH] Expand PIC calls without PLT with -fno-plt

2015-05-11 Thread Rich Felker

On Mon, May 11, 2015 at 01:48:03PM +0200, Michael Matz wrote:
 Hi,
 
 On Wed, 6 May 2015, Rich Felker wrote:
 
  I don't see how this case is improved unless GCC is failing to consider 
  strong definitions in the same TU as locally-binding.
 
 Interposition of non-static non-inline non-weak symbols is supported 
 independend of if they are defined in the same TU or not (if you're 
 producing a shared lib, that is).  I.e. no, they are not considered 
 locally-binding (for instance, they aren't automatically inlined).

  If this is the case, is there a reason for that behavior?
 
 Because IMHO interposition is orthogonal to TU placement, and hence 
 shouldn't be influenced by it.  There's visibility, inline hints or 
 static-ness to achieve different effects.  (perhaps the real reason is: 
 because it always worked like that :) )
 
  IMO it's wrong.
 
 Why?  I think it's right.

I see it as an unnecessary pessimization. The ELF shared library
semantics for allowing interposition were designed to avoid behavioral
regressions versus static linking, and this is not such a case. With
an archive-type library, it's possible to cause whole TUs to be
omitted when linking as long as whatever symbol(s) may have been
needed from them are already defined elsewhere; interposition makes
the same possible with dynamic linking. But if symbols A and B were
both in the same TU, having A defined prior to searching an archive
but B undefined will cause the TU that defines both to be pulled in,
and is such a linking error (multiple definitions). So I'm not sure
why it's desirable to support this.

The it always worked like that argument may suffice if people are
depending on this behavior now (OTOH I'd rather see it break so they
fix their breakage of static linking) but I suspect the historical
reason it worked like that is that compilers were not smart enough to
process whole TUs at a time but just worked with one function at a
time and did not know that a referenced symbol was in the same TU.

BTW visibility can't really address the issue except with hacks
(hidden aliases) or protected visibility (which is hard to use because
it's broken on lots of toolchain versions).

Rich

Re: [PATCH 01/13] recog: Increased max number of alternatives.

2015-05-11 Thread Andreas Krebbel

On 05/11/2015 04:01 PM, Segher Boessenkool wrote:
 On Mon, May 11, 2015 at 03:23:29PM +0200, Andreas Krebbel wrote:
 With the vector facility support z13 mov patterns have more than 30
 alternatives.
 
 Wow, that is a lot!
 
 --- a/gcc/recog.h
 +++ b/gcc/recog.h
 @@ -23,7 +23,7 @@ along with GCC; see the file COPYING3.  If not see
  /* Random number that should be large enough for all purposes.  Also define
 a type that has at least MAX_RECOG_ALTERNATIVES + 1 bits, with the extra
 bit giving an invalid value that can be used to mean uninitialized.  */
 -#define MAX_RECOG_ALTERNATIVES 30
 +#define MAX_RECOG_ALTERNATIVES 35
  typedef unsigned int alternative_mask;
 
 int isn't at least 36 bits.

Right. That should be unsigned HOST_WIDE_INT instead.

Thanks!

-Andreas-

[wwwdocs] Update changes.html

2015-05-11 Thread Marek Polacek

Ok to commit?

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.3
diff -u -r1.3 changes.html
--- changes.html6 May 2015 09:55:29 -   1.3
+++ changes.html11 May 2015 14:44:38 -
@@ -16,15 +16,36 @@
 
 
 !-- .. --
-!-- h2 id=generalGeneral Optimizer Improvements/h2 --
-
+h2 id=generalGeneral Optimizer Improvements/h2
+  ul
+liUndefinedBehaviorSanitizer gained a new sanitization option:
+ul
+  licode-fsanitize=bounds-strict/code: enables strict checking
+ of array bounds.  In particular, it enables
+ code-fsanitize=bounds/code as well as instrumentation of
+ flexible array member-like arrays./li
+/ul
+  /ul
 
 !-- .. --
 h2 id=languagesNew Languages and Language specific improvements/h2
 
 !-- h3 id=adaAda/h3 --
 
-!-- h3 id=c-familyC family/h3 --
+h3 id=c-familyC family/h3
+  ul
+liA new command-line option code-Wshift-negative-value/code has been
+   added for the C and C++ compilers, which warns about left shifting
+   a negative value./li
+  /ul
+
+h3 id=cC/h3
+  ul
+liIt is possible to disable warnings when an initialized field of
+   a structure or an union with side effects is being overridden when
+   using designated initializers via a new warning option
+   code-Woverride-init-side-effects/code./li
+  /ul
 
 h3 id=cxxC++/h3
 

Marek

Re: PATCH] PR target/65612: Multiversioning doesn't work with DSO nor PIE

2015-05-11 Thread Rich Felker

On Mon, May 11, 2015 at 12:31:51PM +0200, Jakub Jelinek wrote:
 On Mon, May 11, 2015 at 11:20:15AM +0100, Szabolcs Nagy wrote:
  
  
  On 09/05/15 19:57, Szabolcs Nagy wrote:
   * H.J. Lu hjl.to...@gmail.com [2015-05-09 10:41:41 -0700]:
   There are
  
4: 2b70   806 FUNCGLOBAL DEFAULT   12
   __cpu_indicator_init@GCC_4.8.0
   38: 002153d016 OBJECT  GLOBAL DEFAULT   25 
   __cpu_model@GCC_4.8.0
  
   and
  
   0215000  00040001 R_X86_64_64
   2b70 __cpu_indicator_init@GCC_4.8.0 + 0
   00215220  00260006 R_X86_64_GLOB_DAT
   002153d0 __cpu_model@GCC_4.8.0 + 0
  
   in libgcc_s.so.1.  Musl ld.so must be fixed to handle it.
  
  
  Rich is looking at how to do this non-intrusively, but
  it seems non-trivial (some users of musl prefer not to
  resolve such versioned symbols).
  
   
   i think it might be enough to add __cpu_indicator_init_local
   as an alias to __cpu_indicator_init in libgcc.a and then use
   the *_local symbol from the ifunc resolver, that way no new
   dependency is added to libgcc_s.so handling.
  
  i tried this approach and it seems to work: passes all
  multiversioning tests on x86_64.
  
  i think it's no worse than the symver approach.
  
  is it ok to change the current fix to this?
 
 No.  Instead of piling hacks like this just fix it in musl.

I wouldn't call it piling hacks; it's an improvement as far as I can
tell since it remove symbolic relocations and replaces them with
relative ones.

 libgcc certainly isn't the only library that uses @ symbol versions,
 e.g. libstdc++ does as well, as well as many other shared libraries.

We haven't encountered such issues there.

Rich

False ODR violation positives on anonymous namespace types

2015-05-11 Thread Jan Hubicka

Jason,
I got my firefox tree building again and noticed that my patch to enable
ODR merging on non-class types caused false positives:

/aux/hubicka/trunk-install/include/c++/6.0.0/ext/new_allocator.h:66:26: 
warning: type ‘(anonymous namespace)::ObservationWithStack const’ violates one 
definition rule [-Wodr]
   typedef const _Tp const_reference;
  ^
/aux/hubicka/trunk-install/include/c++/6.0.0/ext/alloc_traits.h:110:53: note: 
it is defined as a pointer to different type in another translation unit
 typedef const value_type   const_reference;
 ^
/aux/hubicka/firefox8/xpcom/build/MainThreadIOLogger.cpp:28:8: note: type 
‘const struct ObservationWithStack’ should match type ‘const struct value_type’
 struct ObservationWithStack
^
/aux/hubicka/trunk-install/include/c++/6.0.0/ext/alloc_traits.h:103:53: note: 
the incompatible type is defined here
 typedef typename _Base_type::value_type value_type;
 ^

Here the obvious problem is that we try to merge type in anonymous namespace.
This is because at LTO time odr_type_p returns true and
type_in_anonymous_namespace returns false.  odr_type_p basically checks that we
computed mangled type name for it which is done in free_lang_data as follows:

{
  /* We use DECL_ASSEMBLER_NAME to hold mangled type names for One Definition 
Rule
 merging.  */
  if (flag_lto_odr_type_mering
   TREE_CODE (decl) == TYPE_DECL
   DECL_NAME (decl)
   decl == TYPE_NAME (TREE_TYPE (decl))
   !is_lang_specific (TREE_TYPE (decl))
  /* Save some work. Names of builtin types are always derived from
 properties of its main variant.  A special case are integer types
 where mangling do make differences between char/signed char/unsigned
 char etc.  Storing name for these makes e.g.
 -fno-signed-char/-fsigned-char mismatches to be handled well.

 See cp/mangle.c:write_builtin_type for details.  */
   (TREE_CODE (TREE_TYPE (decl)) != VOID_TYPE
   TREE_CODE (TREE_TYPE (decl)) != BOOLEAN_TYPE
   TREE_CODE (TREE_TYPE (decl)) != REAL_TYPE
   TREE_CODE (TREE_TYPE (decl)) != FIXED_POINT_TYPE)
   !TYPE_ARTIFICIAL (TREE_TYPE (decl))
   !variably_modified_type_p (TREE_TYPE (decl), NULL_TREE)
   !type_in_anonymous_namespace_p (TREE_TYPE (decl)))
return !DECL_ASSEMBLER_NAME_SET_P (decl);

and obviously we are not intended to mangle tyhis type and it should be
caught by type_in_anonymous_namespace_p check:

bool
type_in_anonymous_namespace_p (const_tree t)
{
  /* TREE_PUBLIC of TYPE_STUB_DECL may not be properly set for
 bulitin types; those have CONTEXT NULL.  */
  if (!TYPE_CONTEXT (t))
return false;
  return (TYPE_STUB_DECL (t)  !TREE_PUBLIC (TYPE_STUB_DECL (t)));
}

We already discussed earlier that type_in_anonymous_namespace_p is not working
on compund types, because these do not have TYPE_STUB_DECL.  I tought those are
!TYPE_NAME types.  What is reason for !TYPE_NAME type with no TYPE_STUB_DECL?
Is it always a compound type with typedef name?

One possible solution is to avoid producing mangled names so these won't
be odr_type_p at LTO time:

Index: tree.c
===
--- tree.c  (revision 222991)
+++ tree.c  (working copy)
@@ -5173,6 +5173,7 @@ need_assembler_name_p (tree decl)
   TREE_CODE (TREE_TYPE (decl)) != REAL_TYPE
   TREE_CODE (TREE_TYPE (decl)) != FIXED_POINT_TYPE)
!TYPE_ARTIFICIAL (TREE_TYPE (decl))
+   TYPE_STUB_DECL (TREE_TYPE (decl))
!variably_modified_type_p (TREE_TYPE (decl), NULL_TREE)
!type_in_anonymous_namespace_p (TREE_TYPE (decl)))
 return !DECL_ASSEMBLER_NAME_SET_P (decl);

which silence the warnings, but I think it also may cause false negatives.
Ohter option is to make type_in_anonymous_namespace_p work.  The types in
question are anonymous because they are defined within anonymous template,
would the following work in type_in_anonymous_namespace_p?

+  /* Types (such as pointer_type) defined within class types may not
+ have their own TYPE_STUB_DECL.  Look for the outer type.  */
+  while (!TYPE_STUB_DECL (t)  TYPE_NAME (t)
+ TREE_CODE (TYPE_NAME (t)) == TYPE_DECL
+ DECL_CONTEXT (TYPE_NAME (t))
+ TYPE_P (DECL_CONTEXT (TYPE_NAME (t
+t = DECL_CONTEXT (TYPE_NAME (t));

Last option would be make anonymous_namespace_p to walk to compound type bases.
I made variant of this in
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00880.html
(odr_or_derived_type_p, modifying type_in_anonymous_namespace_p to work same
way is definitly easy)

What would be a preffered fix for this?

Honza

Re: [PATCH 4/6] rs6000: Delete some now-superfluous zero_ext{end,ract} patterns

2015-05-11 Thread David Edelsohn

On Sun, May 10, 2015 at 12:13 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 After the change to combine, we no longer need all the special-case
 patterns.

 Tested on powerpc64-linux, as usual.  As mentioned with the combine
 patch, there are no differences to generated code in cc1.

 This does not delete DImode lshiftrt patterns, because those do not
 exist (yet).

 Is this okay for trunk?


 Segher

 [ Save the planet!  Delete 11% of your machine descriptions! ]


 2015-05-10  Segher Boessenkool  seg...@kernel.crashing.org

 * config/rs6000/rs6000.md (extzv): FAIL for SImode.
 (extzvsi_internal, *extzvsi_internal1, *extzvsi_internal2,
 *rotlsi3_internal7le, *rotlsi3_internal7be, *rotlsi3_internal8le,
 *rotlsi3_internal8be, *rotlsi3_internal9le, *rotlsi3_internal9be,
 *rotlsi3_internal10le, *rotlsi3_internal10be, *rotlsi3_internal11le,
 *rotlsi3_internal11be, *rotlsi3_internal12le, *rotlsi3_internal12be,
 *lshiftrt_internal1le, *lshiftrt_internal1be, *lshiftrt_internal2le,
 *lshiftrt_internal2be, *lshiftrt_internal3le, *lshiftrt_internal3be,
 *lshiftrt_internal4le, *lshiftrt_internal4be, *lshiftrt_internal5le,
 *lshiftrt_internal5be, *lshiftrt_internal5le, *lshiftrt_internal5be,
 *rotldi3_internal7le, *rotldi3_internal7be, *rotldi3_internal8le,
 *rotldi3_internal8be, *rotldi3_internal9le, *rotldi3_internal9be,
 *rotldi3_internal10le, *rotldi3_internal10be, *rotldi3_internal11le,
 *rotldi3_internal11be, *rotldi3_internal12le, *rotldi3_internal12be,
 *rotldi3_internal13le, *rotldi3_internal13be, *rotldi3_internal14le,
 *rotldi3_internal14be, *rotldi3_internal15le, *rotldi3_internal15be,
 and 30 corresponding splitters): Delete.

Okay.

Thanks, David

Re: [PATCH 5/6] rs6000: Don't use gen_rlwinm

2015-05-11 Thread David Edelsohn

On Sun, May 10, 2015 at 12:13 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 The next patch will rename the rlwinm pattern (as well as the other patterns
 that implement rlwinm, now unnamed).  The only place that uses gen_rlwinm
 (an expander) is better off expanding the separate operations separately.
 Do so.

 Okay for trunk?


 Segher


 2015-05-10  Segher Boessenkool  seg...@kernel.crashing.org

 * config/rs6000/rs6000.md (rs6000_adjust_atomic_subword): Use
 gen_ashlsi3 and gen_andsi3 instead of gen_rlwinm.

Okay.

Thanks, David

Re: PATCH] PR target/65612: Multiversioning doesn't work with DSO nor PIE

2015-05-11 Thread Szabolcs Nagy



On 11/05/15 14:05, Jakub Jelinek wrote:
 On Mon, May 11, 2015 at 01:39:17PM +0100, Szabolcs Nagy wrote:
 fyi, musl loader loads libstdc++ just fine because it has no
 
 But will it load any shared library that uses any of the 26 (if I count well
 on x86_64) @ symbols from libstdc++.so.6?

i looked, but all of those symbols have @@ variant too
so at least the libraries would load with musl.

 musl may end up supporting @version but that's an independent
 quest.
 
 It is not independent.  If musl claims to support symbol versioning, it
 should support it properly, if not, then supposedly gcc configured for musl
 can't be compatible with gcc configured for other linux C libraries, and
 should force symbol versioning off.

ok, but the current solution does not make that easy:
configuring gcc with

  --disable-gnu-indirect-function --disable-symvers

has no effect on libgcc_s.so.1 on linux.

(i can try to create a patch that removes the new
-DUSE_ELF_SYMVER from the libgcc cflags for musl, but that
seems a worse solution than the weak alias one).

(note that previously a simple spec file was enough to use an
existing gcc on linux to build things against musl...
this didnt work for c++ code that used libstdc++ headers, but
now it also fails for any build using -lgcc_s).

Re: [PATCH 6/6] rs6000: Clean up the various rlwinm patterns

2015-05-11 Thread David Edelsohn

On Sun, May 10, 2015 at 12:13 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 Some cleanups:

 * Give every define_insn a name;
 * Add missing conditions for some of the dot forms;
 * Use define_insn_and_split to reduce duplication;
 * Renumber operands so 0,1,2,3 are the actual operands of the machine
   instruction, in order;
 * Reformat some patterns.

 Is this okay for trunk?


 Segher


 2015-05-10  Segher Boessenkool  seg...@kernel.crashing.org

 * config/rs6000/rs6000.md (*rotlsi3_internal4, *rotlsi3_internal5,
 *rotlsi3_internal6, rlwinm, 5 unnamed define_insns, and 6
 define_splits): Delete.
 (*rotlsi3_mask, *rotlsi3_mask_dot, *rotlsi3_mask_dot2,
 *ashlsi3_imm_mask, *ashlsi3_imm_mask_dot, *ashlsi3_imm_mask_dot2,
 *lshrsi3_imm_mask, *lshrsi3_imm_mask_dot, *lshrsi3_imm_mask_dot2):
 New.

Okay.

Thanks, David

Re: [PATCH 3/6] rs6000: Don't use zero_extract in the bswap:HI splitter

2015-05-11 Thread David Edelsohn

On Sun, May 10, 2015 at 12:13 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 The next patch removes the zero_extract insn this splits to.  Write
 it as (and (lshiftrt ... instead.

 Okay for trunk?


 Segher


 2015-05-10  Segher Boessenkool  seg...@kernel.crashing.org

 * config/rs6000/rs6000.md (define_split for bswaphi): Don't use
 zero_extract.

Okay.

Thanks, David

[patch, MIPS, testsuite] Fix gcc.target/mips/branch-1.c

2015-05-11 Thread Steve Ellcey

The test gcc.target/mips/branch-1.c has started failing because it is trying
to verify that each of 4 functions generates and 'andi' instruction and only
finding 2 of them.  With a recent change (fixing PR 65150) GCC determined that
the f1 and f2 functions generate identical code and that the f3 and f4
functions generate identical code and so it replaced f1 with a tail call to f2
and f3 with a tail call to f4.  Thus only 2 'andi' instructions show up and
the test fails.

There is an existing bug report (PR 65534) about whether or not this
is a good optimization but I would like to fix this test so that that
optimization can not happen since that is not what this test is intended
to check for.  My solution is to pass different arguments to bar() in
each function so the code in each function is unique.

Tested with mips-mti-linux-gnu, OK to checkin?

Steve Ellcey
sell...@imgtec.com


2015-05-11  Steve Ellcey  sell...@mips.com

* gcc.target/mips/branch-1.c: Pass argument to bar().


diff --git a/gcc/testsuite/gcc.target/mips/branch-1.c 
b/gcc/testsuite/gcc.target/mips/branch-1.c
index 6ef50e8..83d7180 100644
--- a/gcc/testsuite/gcc.target/mips/branch-1.c
+++ b/gcc/testsuite/gcc.target/mips/branch-1.c
@@ -4,11 +4,11 @@
on zero.  */
 /* { dg-options forbid_cpu=octeon.* } */
 
-void bar (void);
-NOMIPS16 void f1 (int x) { if (x  4) bar (); }
-NOMIPS16 void f2 (int x) { if ((x  2)  1) bar (); }
-NOMIPS16 void f3 (unsigned int x) { if (x  0x10) bar (); }
-NOMIPS16 void f4 (unsigned int x) { if ((x  4)  1) bar (); }
+void bar (int);
+NOMIPS16 void f1 (int x) { if (x  4) bar (1); }
+NOMIPS16 void f2 (int x) { if ((x  2)  1) bar (2); }
+NOMIPS16 void f3 (unsigned int x) { if (x  0x10) bar (3); }
+NOMIPS16 void f4 (unsigned int x) { if ((x  4)  1) bar (4); }
 /* { dg-final { scan-assembler \tandi\t.*\tandi\t.*\tandi\t.*\tandi\t } } */
 /* { dg-final { scan-assembler-not \tsrl\t } } */
 /* { dg-final { scan-assembler-not \tsra\t } } */

Re: [wwwdocs] Update changes.html

2015-05-11 Thread Marek Polacek

On Mon, May 11, 2015 at 06:04:48PM +0200, Gerald Pfeifer wrote:
 Hi Marek,
 
 On Mon, 11 May 2015, Marek Polacek wrote:
  Ok to commit?
 
 as maintainer I am happy for you to commit documentation/web changes
 without approval, though I am also happy to review.

I was hoping you could glance over it before I commit the patch; and
given the below, I'm glad you did ;).
 
 Here I'm wondering whether the nested list could just be merged
 into the primary item (and I'd omit the colon before enables).

Agreed.  If we have more new options, we can make a list out of it.
 
 a union.  English can be tricky for us non-native speakers at 
 times, since usually you would use an before u, except when 
 that u is pronounced as a consonant or a syllable starting with 
 one.  And the u in union actually is prounced as you-nion.

Yeah, I'm familiar with this rule, and I usually got that right.
Not sure how I goofed it up this time.

Is the following any better?  Thanks,

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.3
diff -u -r1.3 changes.html
--- changes.html6 May 2015 09:55:29 -   1.3
+++ changes.html11 May 2015 16:16:04 -
@@ -16,15 +16,34 @@
 
 
 !-- .. --
-!-- h2 id=generalGeneral Optimizer Improvements/h2 --
-
+h2 id=generalGeneral Optimizer Improvements/h2
+  ul
+liUndefinedBehaviorSanitizer gained a new sanitization option,
+   code-fsanitize=bounds-strict/code, which enables strict checking
+   of array bounds.  In particular, it enables
+   code-fsanitize=bounds/code as well as instrumentation of
+   flexible array member-like arrays.
+  /ul
 
 !-- .. --
 h2 id=languagesNew Languages and Language specific improvements/h2
 
 !-- h3 id=adaAda/h3 --
 
-!-- h3 id=c-familyC family/h3 --
+h3 id=c-familyC family/h3
+  ul
+liA new command-line option code-Wshift-negative-value/code has been
+   added for the C and C++ compilers, which warns about left shifting
+   a negative value./li
+  /ul
+
+h3 id=cC/h3
+  ul
+liIt is possible to disable warnings when an initialized field of
+   a structure or a union with side effects is being overridden when
+   using designated initializers via a new warning option
+   code-Woverride-init-side-effects/code./li
+  /ul
 
 h3 id=cxxC++/h3
 

Marek

[PATCH, i386]: Use insn PATTERN when looking for memory_references in ix86_loop_unroll_adjust

2015-05-11 Thread Uros Bizjak

Hello!

2015-05-11  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (ix86_loop_unroll_adjust): Use PATTERN (insn)
when looking for memory references.

Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 222999)
+++ config/i386/i386.c  (working copy)
@@ -51446,7 +51446,7 @@ ix86_loop_unroll_adjust (unsigned nunroll, struct
   for (i = 0; i  loop-num_nodes; i++)
 FOR_BB_INSNS (bbs[i], insn)
   if (NONDEBUG_INSN_P (insn))
-   FOR_EACH_SUBRTX (iter, array, insn, NONCONST)
+   FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
  if (const_rtx x = *iter)
if (MEM_P (x))
  {

Re: C PATCH for -Wshift-negative-value (PR c/66066)

2015-05-11 Thread Jeff Law


On 05/11/2015 12:14 PM, Manuel López-Ibáñez wrote:

On 11 May 2015 at 17:54, Marek Polacek pola...@redhat.com wrote:

I'm sorry to sound so pedantic but we have faced this same issue in
the past (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19976#c7) and
the solution was to delay folding
(https://gcc.gnu.org/ml/gcc-patches/2008-10/msg01061.html).
And it's still the answer to many issues.  There's an effort to delay 
folding of various things in the C++ front-end and once done, that 
effort may be extended to other things.


Jeff

Re: [PATCH] Fix typo

2015-05-11 Thread Paulo Matos

On Sun, 10 May 2015 22:07:53 -0600, Jeff Law wrote:

 On 05/10/2015 03:00 PM, Paulo Matos wrote:
 Yes.  This would fall under the obvious rule and can be committed
 without waiting for approvals.
 
 jeff

Thanks. Committed.
-- 
Paulo Matos

Re: C PATCH for -Wshift-negative-value (PR c/66066)

2015-05-11 Thread Manuel López-Ibáñez

On 11 May 2015 at 17:54, Marek Polacek pola...@redhat.com wrote:
 The problem here isn't in the -Wshift-negative-value warning itself; the
 problem is with marking -1  0 as a non-constant: later on, we warn in
 a context where a constant expression is needed (initializer element is
 not a constant expression), and for e.g. int foo = -1  0 | 9; there's
 an error (initializer element is not constant).

Yes, I understand. What I'm saying is that if folding was done at the
moment that the constant is requested, this would be a case of
pedwarn-if-pedantic, which is exactly what Joseph is discussing here
for case (b):
https://gcc.gnu.org/ml/gcc/2013-11/msg00253.html

It doesn't matter that the error initializer element is not a
constant expression is not given as long as an error (e.g., left
shift of negative value is undefined according to ISO C99) is given
with -pedantic-errors.

I'm sorry to sound so pedantic but we have faced this same issue in
the past (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19976#c7) and
the solution was to delay folding
(https://gcc.gnu.org/ml/gcc-patches/2008-10/msg01061.html).

 My change means that we wouldn't complain unless -pedantic (to not upset
 too many users).  I'm not particularly fond of it, but it seems like the
 simplest solution.

-Wpedantic (which is the same as -pedantic) should not give errors.
Otherwise, it would mean that someone who uses -Wpedantic-
-Wno-pedantic will silence the error, but someone that uses
-Werror=pedantic -Wno-error=pedantic will still see an error (!).
There is also no way to tell that error is related to -Wpedantic.

But if this is the accepted solution, please at least test
global_dc-pedantic_errors instead and put a big /*  This is a
hack to reject non-conforming programs with -pedantic-errors and
accept them otherwise. See
https://gcc.gnu.org/ml/gcc-patches/2008-10/msg01061.html and
https://gcc.gnu.org/ml/gcc/2013-11/msg00253.html about to fix this
properly. */

Cheers,

Manuel.

Re: [PATCH] Add myself to MAINTAINERS

2015-05-11 Thread Paulo Matos

On Sun, 10 May 2015 21:08:04 +, Paulo Matos wrote:

 Somehow I never added myself to the MAINTAINERS file.
 Apologies for that. OK to commit?
 
 2015-05-10  Paulo Matos  pa...@matos-sorge.com
 
 * MAINTAINERS: Add myself as commit after approval.
 
 diff --git a/MAINTAINERS b/MAINTAINERS index 7dc4c8f..c5d6c99 100644 ---
 a/MAINTAINERS +++ b/MAINTAINERS @@ -483,6 +483,7 @@ David Malcolm
 dmalc...@redhat.com
  Mikhail Maltsev
 malts...@gmail.com
  Simon Martin
 simar...@users.sourceforge.net
  Ranjit Mathew  rmat...@hotmail.com
 +Paulo Matospa...@matos-sorge.com
  Michael Matz   m...@suse.de
  Greg McGaryg...@gnu.org
  Roland McGrath rol...@hack.frob.com

Committed as obvious.

-- 
Paulo Matos

Re: Fix pattern validation in genrecog

2015-05-11 Thread Jeff Law


On 05/11/2015 03:05 AM, Richard Sandiford wrote:

Thomas pointed out that, while I'd kept the code to validate patterns
for things like missing modes, the code wasn't being used.  Fixed with
the patch below.

I ended up reinstating the original code to create a single rtx
pattern for a define_peephole2 sequence, so that it could be
passed to validate_pattern as before.  One of the reasons I got
rid of the code originally was because I didn't like the idea of
a sequence of define_peephole2 instructions being put into a
PARALLEL -- there's nothing parallel about them, and it would be
easy to confuse the result with a real parallel in a define_insn
or define_split match.  I therefore changed it to use SEQUENCE instead.
(Not that it matters: nothing actually looks at the code.)

I also threw in a check for zero-length define_peephole2s while there.
This is enforced syntactically for other define_*s, but it's possible
to write a define_peephole2 that just asks for scratch registers and
doesn't actually match anything.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* genrecog.c (match_pattern_1): Expect the pattern to be a SEQUENCE
for define_peephole2s.
(get_peephole2_pattern): New function.
(main): Use it.  Call validate_pattern.

OK.
jeff

Re: [PATCH i386] Extend sibcall peepholes to allow source in %eax

2015-05-11 Thread Jan Hubicka

 On Sun, 10 May 2015, Jan Hubicka wrote:
 
   On i386, peepholes that transform memory load and register-indirect jump 
   into
   memory-indirect jump are overly restrictive in that they don't allow 
   combining
   when the jump target is loaded into %eax, and the called function returns 
   a
   value (also in %eax, so it's not dead after the call).  Fix this by 
   checking
   for same source and output register operands separately.
   
   OK?
 * config/i386/i386.md (sibcall_value_memory): Extend peepholes to
 allow memory address in %eax.
 (sibcall_value_pop_memory): Likewise.
  
  Why do we need the check for liveness after all?  There is SIBLING_CALL_P
  (peep2_next_insn (1)) so we know that the function terminates by the call
  and there are no other uses of the value.
 
 Indeed.  Uros, the peep2_reg_dead_p check was added by your patch as svn
 revision 211776, git commit e51f8b8fed.  Would you agree that the check is not
 necessary for sibcalls as Honza explains?  Would you approve a patch that
 removes it in the sibcall peepholes I modify in the patch under discussion?
  
  Don't we however need to check that operands[0] is not used by the 
  call_insn as
  parameter of the call?  I.e. something like
  
  void
  test(void (*callback ()))
  {
callback(callback);
  }
 
 You need a pointer-to-pointer-to-function to trigger the peephole.  Something
 like this:
 
   void foo()
   {
 void (**bar)(void*);
 asm(:=r(bar));
 (*bar)(*bar);
   }
 
  I think instead of peep2_reg_dead_p we want to check that the parameter is 
  not in
  CALL_INSN_FUNCTION_USAGE of the sibcall..
 
 Playing with the above testcase I can't induce failure.  It seems today GCC
 won't allocate the same register as callee address and one of the arguments.
 Do you want me to implement such a check anyway?

Hmm, only way I can trigger same register is:
void foo()
  {
void (**bar)(void*);
asm(:=r(bar));
register void (*var)(void *) asm(%eax);
var=*bar;
asm(:+r(var));
var(var);
  }

removing the second asm makes CSE to forward propagae the memory operand
to call that makes the call different from the register variable.

Still I would check for that, but this is more Uros' area.

Honza

Re: [PATCH] Fix memory leak in C++ pretty printer

2015-05-11 Thread Jason Merrill


On 05/11/2015 12:57 PM, Jason Merrill wrote:

On 05/11/2015 08:03 AM, Manuel López-Ibáñez wrote:

My preference would be to replace the static with a pointer and
placement-new with proper new and delete


Actually, on second thought, there really doesn't seem to be a need for 
that.  The patch should be OK; if it doesn't work I'd like to know why. 
 I think the existing pattern is just a holdover from the C days.


Jason

Re: [PATCH 6/6] rs6000: Clean up the various rlwinm patterns

2015-05-11 Thread Maciej W. Rozycki

On Sun, 10 May 2015, Segher Boessenkool wrote:

   This clearly renames rather than removing the `rlwinm' pattern, please 
  correctly reflect that in ChangeLog.  Some other, unnamed patterns are 
  given names rather than deleted as well, just as you've noted at the top.  
  And none of the other changes are mentioned in your ChangeLog entry.
 
 The changelog says it deletes the patterns with the old names.  Which it
 does.  And it adds the ones with the new names.  Which it does.  No one
 said changelogs are useful ;-)

 No one?  Well, I'm saying it now, then. :)

   Would you be able to split this change up further by any chance?  
 
 I could, but it would be a lot of extra work.  First, I haven't done those
 steps in order (they aren't steps at all): I take one pattern, and fix
 it up, then the next, etc.

 Yeah, I know the pain, it's usually how it happens.  You start cleaning 
up things and then you find yourself having done a number conceptually 
independent changes that overlap one another in the source modified.  
I've been through it many times.

 The thing is this extra work of untangling is worth doing as it'll help 
the maintainer and other members of the community, including those 
investigating change history months if not years from now for one reason 
or another, understand what the consequences of conceptually individual 
changes are.

 Specifically which changes are obviously harmless (e.g. formatting 
changes or the addition of debug `*' pattern names) and which are 
potential trouble makers (e.g. the reordering of operands or folding 
`define_insn' and `define_split' into `define_insn_and_split'), that e.g. 
need to be taken into account when porting changes that may rely on them.  
When all the bits are lumped together, it's very difficult to decipher 
them and easy to miss something.

 So I've done such patch splitting and rearrangement many times, using 
various techniques.  For example it's often faster and easier if you 
hand-edit the patch being split into two rather than trying to reproduce 
the intermediate stage in the source being patched.  Other people may have 
other hints and tricks.

 Of course you'll want to keep the original final version of the file 
changed around and then you can compare it to the result of applying a 
patch series, to make sure both results are identical.

 But much worse: if you do tiny pattern changes like you suggest, I can
 guarantee you the patches _will_ mis-apply.  All of the time :-(

 Well, if you get the line numbers recorded in patches right, which is how 
`diff' generates them, then it won't happen.  And as a committer you can 
verify you applied your own patches correctly, by taking a diff against 
your original patched version before pushing the changes into the repo. 
Especially when other upstream source changes have since caused line 
numbers to shift.

  change addresses a single issue only.  That would avoid problems with 
  ChangeLog and make the review easier.
 
 This isn't the first patch like this, I've cleaned up most other PowerPC
 integer patterns already.  It's enough work already.  Often the changes
 cannot be separated, and even if they can you then need them in the fixed
 order you made for them, etc.

 Of course changes can be separated (if they cannot, then they weren't 
really conceptually separate in the first place), and you do need to get 
the order right.  E.g. patches that are potentially of interest to older 
release branches need to go first, followed by ones that are meant to stay 
on trunk only.

 I'll leave it up to David to decide anyway as he's the port maintainer, 
and I'm only a member of the community who happened to poke at this port 
once or twice in the past.

  Maciej

Re: [C frontend] Fix construction of TYPE_STUB_DECL

2015-05-11 Thread Jan Hubicka

 Bootstrapped/regtested x86_64-linux, OK?
 
 Honza
 
  * c-decl.c (pushtag): Declare type as public.
 What I'm struggling with here is how do you know the stub decl is
 public?  I realize these things are a bit special, but I don't see
 the C++ front-end doing anything similar.  What am I missing?

It is used by type_in_anonymous_namespace_p. C++ FE definitely sets them 
accordingly,
but I have no idea how it is done.  Jason, can you help?

Honza
 
 jeff

Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses

2015-05-11 Thread Joseph Myers

On Mon, 11 May 2015, Mikhail Maltsev wrote:

 In general, is there a recommended set of targets that cover most
 conditionally compiled code? Also, the GCC Wiki mentions some automated

See contrib/config-list.mk (note that some of those targets may have 
pre-existing build failures, and note that you need to start with a 
current trunk native compiler so that --enable-werror-always works; don't 
try to build all those cross compilers using an older GCC).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: C PATCH for -Wshift-negative-value (PR c/66066)

2015-05-11 Thread Joseph Myers

On Mon, 11 May 2015, Marek Polacek wrote:

 The problem here isn't in the -Wshift-negative-value warning itself; the
 problem is with marking -1  0 as a non-constant: later on, we warn in
 a context where a constant expression is needed (initializer element is
 not a constant expression), and for e.g. int foo = -1  0 | 9; there's
 an error (initializer element is not constant).

For cases that are not integer constant expressions but can be folded to 
constants (e.g. where the unevaluated half of ?: contains a function call, 
which is not allowed in constant expressions), the general approach is a 
pedwarn-if-pedantic, e.g.:

  if (TREE_CODE (value) != INTEGER_CST)
{
  value = c_fully_fold (value, false, NULL);
  if (TREE_CODE (value) == INTEGER_CST)
pedwarn (loc, OPT_Wpedantic,
 enumerator value for %qE is not an integer 
 constant expression, name);
}
  if (TREE_CODE (value) != INTEGER_CST)
{
  error (enumerator value for %qE is not an integer constant,
 name);
  value = 0;
}

(and similar for various cases other than enums - it turns out the Linux 
kernel has or had lots of cases that aren't strictly integer constant 
expressions but are used in contexts requiring such).

So I think the issue here is that the above expression isn't folded to a 
constant by c_fully_fold.  If the expression involves division by zero, 
say, clearly it's appropriate not to fold; no meaningful result value can 
be assigned.  But for the above expression, a meaningful result can be 
assigned, so it would seem to make sense to fold.  That is, it would make 
sense for c_fully_fold_internal to fold inside a C_MAYBE_CONST_EXPR with 
C_MAYBE_CONST_EXPR_INT_OPERANDS set, in the cases where meaningful results 
can be assigned.

The rationale for not folding the contents of a C_MAYBE_CONST_EXPR in 
c_fully_fold is that folding already took place when the 
C_MAYBE_CONST_EXPR was created, and it's inefficient to potentially fold 
the same expression recursively many times.  But that's only the case for 
normal C_MAYBE_CONST_EXPRs, not those with 
C_MAYBE_CONST_EXPR_INT_OPERANDS set that indicate something that could 
occur an unevaluated part of an integer constant expression.  And in any 
case, if folding had already occurred, there would already be the 
potential for duplicate folding through the existing uses of 
remove_c_maybe_const_expr.

Thus I think this issue should be addressed through folding in 
c_fully_fold_internal in this case, while watching carefully to make sure 
it doesn't reduce any other cases (division by zero, negative or out of 
range shift, overflows) to pedwarns-if-pedantic if they aren't already.

There would then remain the matter of initializer element is not a 
constant expression being an unconditional pedwarn_init rather than a 
pedwarn-if-pedantic, so I think you'd still get warnings that couldn't be 
disabled for the above code (which would turn into errors with -Werror).  
The answer for that is probably to make this pedwarn_init use 
OPT_Wpedantic, consistent with the various other cases that handle things 
that aren't constant expressions but folded to constants.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: False ODR violation positives on anonymous namespace types

2015-05-11 Thread Jan Hubicka

 On 05/11/2015 01:05 PM, Jan Hubicka wrote:
 On 05/11/2015 12:46 PM, Jan Hubicka wrote:
 Well, my main motivatoin to extend from RECORD_OR_UNION_TYPE_P was to 
 handle
 enums. But other case I would like to deal with are integer types - i.e. 
 preserve
 difference between char/signed char/unsigned char/short/int/long/wchar in 
 cases
 where they structurally coincide.
 
 In what context?  Won't you get that from comparing e.g. the field
 types of two definitions of the same class?
 
 If one class define int foo; and other long foo; we currently do not 
 complain
 about ODR on 32bit targets while I think we could.
 
 We certainly should.  But that's a problem because foo is subject to
 the ODR.  I don't see why you need to treat int as an ODR type to
 get checking for foo.

What happens in LTO is that lto-symtab decide to merge the two declarations of
foo.  At this time it has chance to output the warning.  For that it needs to
be able to work out that these declarations are having different types, but
because LTO merge canonical types on structural basis, types_compatible_p
(long, int) will return true. The idea behind LTO canonical type computation is
that it needs to safely work cross-language and if the two declarations came
from C and fortran, the mismatch in type name is useless
(see gimple_canonical_types_compatible_p)

So what I want is to have odr_or_derived_type_p to return true on those types
(because it sees they do have mangled name attached) and then call
odr_types_equivalent_p which is the same comparer I use to output ODR waring
about types and it will complain about mangled type names being different.
This is already implemented in
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00870.html

Honza
 
 Jason

Fix RTL checking failure in emit_pattern_{after|before}_setloc

2015-05-11 Thread Eric Botcazou

This fixes an RTL checking failure I ran into while working on a change and it 
is latent in the pristine compiler.  emit_pattern_{after|before}_setloc have:

  if (active_insn_p (after)  !INSN_LOCATION (after))

  if (active_insn_p (first)  !INSN_LOCATION (first))

Now active_insn_p still has the FIXME:

int
active_insn_p (const_rtx insn)
{
  return (CALL_P (insn) || JUMP_P (insn)
  || JUMP_TABLE_DATA_P (insn) /* FIXME */
  || (NONJUMP_INSN_P (insn)
   (! reload_completed
  || (GET_CODE (PATTERN (insn)) != USE
   GET_CODE (PATTERN (insn)) != CLOBBER;
}

so if AFTER or FIRST are JUMP_TABLE_DATA_P, INSN_LOCATION is invoked on them 
and this triggers the RTL checking failure.

Fixed thusly, tested on x86_64-suse-linux, applied on the mainline.


2015-05-11  Eric Botcazou  ebotca...@adacore.com

* emit-rtl.c (emit_pattern_after_setloc): Add missing guard.
(emit_pattern_before_setloc): Likewise.


-- 
Eric BotcazouIndex: emit-rtl.c
===
--- emit-rtl.c	(revision 222673)
+++ emit-rtl.c	(working copy)
@@ -4680,7 +4680,9 @@ emit_pattern_after_setloc (rtx pattern,
   after = NEXT_INSN (after);
   while (1)
 {
-  if (active_insn_p (after)  !INSN_LOCATION (after))
+  if (active_insn_p (after)
+	   !JUMP_TABLE_DATA_P (after) /* FIXME */
+	   !INSN_LOCATION (after))
 	INSN_LOCATION (after) = loc;
   if (after == last)
 	break;
@@ -4791,7 +4793,9 @@ emit_pattern_before_setloc (rtx pattern,
 first = NEXT_INSN (first);
   while (1)
 {
-  if (active_insn_p (first)  !INSN_LOCATION (first))
+  if (active_insn_p (first)
+	   !JUMP_TABLE_DATA_P (first) /* FIXME */
+	   !INSN_LOCATION (first))
 	INSN_LOCATION (first) = loc;
   if (first == last)
 	break;

Re: [PATCH] combine: Don't create (set (reg:CC) (compare (reg:CC) (const0)))

2015-05-11 Thread Segher Boessenkool

On Mon, May 11, 2015 at 11:23:47AM -0600, Jeff Law wrote:
  * combine.c (simplify_set): When generating a CC set, if the
  source already is in the correct mode, do not wrap it in a
  compare.  Simplify the rest of that code.
 Seems reasonable.  Might not hurt to do a little testing on a cc0 target 
 though.

I tested on mn10300, cris, and m68k.  On m68k it triggers while
building libgcc (for 040).  I verified the transform is correct.

I don't have a setup to actually bootstrap or regression check
any cc0 target.  A recurring theme :-(


Segher

Re: [PATCH] combine: Don't create (set (reg:CC) (compare (reg:CC) (const0)))

2015-05-11 Thread Jeff Law


On 05/11/2015 03:11 PM, Segher Boessenkool wrote:

On Mon, May 11, 2015 at 11:23:47AM -0600, Jeff Law wrote:

* combine.c (simplify_set): When generating a CC set, if the
source already is in the correct mode, do not wrap it in a
compare.  Simplify the rest of that code.

Seems reasonable.  Might not hurt to do a little testing on a cc0 target
though.


I tested on mn10300, cris, and m68k.  On m68k it triggers while
building libgcc (for 040).  I verified the transform is correct.

Seems reasonable to me.



I don't have a setup to actually bootstrap or regression check
any cc0 target.  A recurring theme :-(
Understood.  As I mentioned in IRC and in another thread, it may be 
useful to set up an aranym instance for m68k bootstrap testing in the 
compile farm.


For other targets, building foo-elf in a single tree build with 
binutils, newlib and the simulator, then running the gcc testsuite with 
the simulator is often a good alternative.  Probably wise to run through 
this once or twice just to get familiar with it, then pull it out when 
really needed ;)


jeff

Re: False ODR violation positives on anonymous namespace types

2015-05-11 Thread Jason Merrill


On 05/11/2015 01:05 PM, Jan Hubicka wrote:

On 05/11/2015 12:46 PM, Jan Hubicka wrote:

Well, my main motivatoin to extend from RECORD_OR_UNION_TYPE_P was to handle
enums. But other case I would like to deal with are integer types - i.e. 
preserve
difference between char/signed char/unsigned char/short/int/long/wchar in cases
where they structurally coincide.


In what context?  Won't you get that from comparing e.g. the field
types of two definitions of the same class?


If one class define int foo; and other long foo; we currently do not 
complain
about ODR on 32bit targets while I think we could.


We certainly should.  But that's a problem because foo is subject to the 
ODR.  I don't see why you need to treat int as an ODR type to get 
checking for foo.


Jason

Re: [C frontend] Fix construction of TYPE_STUB_DECL

2015-05-11 Thread Jason Merrill


On 05/10/2015 12:33 PM, Jan Hubicka wrote:

This is properly honored by C++ FE but other FEs are bit random, which in turn
confuses type_in_anonymous_namespace_p predicate that leads to flase poistives
on type mismatch warnings.  I used to be able to get around by checking only
C++ types at LTO time, but with type checking in lto-symtab I can not, because
I do want to compare type compatibility cross translation units and cross 
languages
and we have no reliable way to say what type originated as C++ and what did not.


I think we should, as only C++ declarations are subject to the ODR.  C 
has different (structural) compatibility rules, and I think they should 
apply when comparing C and C++ types.


Since C struct names have no linkage, I don't think it's right to set 
TREE_PUBLIC on them.


Jason

Re: [C frontend] Fix construction of TYPE_STUB_DECL

2015-05-11 Thread Jan Hubicka

 On 05/10/2015 12:33 PM, Jan Hubicka wrote:
 This is properly honored by C++ FE but other FEs are bit random, which in 
 turn
 confuses type_in_anonymous_namespace_p predicate that leads to flase 
 poistives
 on type mismatch warnings.  I used to be able to get around by checking only
 C++ types at LTO time, but with type checking in lto-symtab I can not, 
 because
 I do want to compare type compatibility cross translation units and cross 
 languages
 and we have no reliable way to say what type originated as C++ and what did 
 not.
 
 I think we should, as only C++ declarations are subject to the ODR.
 C has different (structural) compatibility rules, and I think they
 should apply when comparing C and C++ types.
 
 Since C struct names have no linkage, I don't think it's right to
 set TREE_PUBLIC on them.

Yes, I need safe way to tell what type is subject to ODR and what is not.
I am not quite sure what is the best approach here. This is what the code is 
doing
currently:

To detect ODR types at LTO time I use combination of presence of mangled name
and type_in_anonymous_namespace_p check. (The idea is that we do not really need
to mangle anonymous type as we do not need to deal with cross-module merging.):

odr_type_p (const_tree t)
{
  if (type_in_anonymous_namespace_p (t))
return true;
  /* We do not have this information when not in LTO, but we do not need
 to care, since it is used only for type merging.  */
  gcc_assert (in_lto_p || flag_lto);

  return (TYPE_NAME (t)
   (DECL_ASSEMBLER_NAME_SET_P (TYPE_NAME (t;
}
bool
type_in_anonymous_namespace_p (const_tree t)
{
  /* TREE_PUBLIC of TYPE_STUB_DECL may not be properly set for
 bulitin types; those have CONTEXT NULL.  */
  if (!TYPE_CONTEXT (t))
return false;
  return (TYPE_STUB_DECL (t)  !TREE_PUBLIC (TYPE_STUB_DECL (t)));
}

the catch is that type_in_anonymous_namespace_p returns true for some C types.
The TYPE_CONTEXT test is already hack as I run into cases of pre-streamed LTO
types to get injected into C++ classes (i.e. if you refere to int, LTO
streaming will replace C++ int with TREE_PUBLIC (TYPE_STUB_DECL (t)) by its own
int that is !TREE_PUBLIC (TYPE_STUB_DECL (t)).  The hack to avoid builtin types
made type_in_anonymous_namespace_p to work well in cases I needed for class
types. (I believe it is because C builds record with
 TREE_PUBLIC (TYPE_STUB_DECL (t))=1 but it is a while I checked this)

We could certainly just add a flag TYPE_ODR_P that says this type is
controlled by odr rule.  I considered that but it is generally not so nice
to introduce new flags and it seemed to me that because C
standard allows to match all types cross-module on structural basis, it makes
sense to consider them all as having public linkage because they are
accessible from outside this translation unit. as tree.h defines the flag.

I would be happy with TYPE_ODR_P or any other solution that looks better.

Honza

 
 Jason

Re: False ODR violation positives on anonymous namespace types

2015-05-11 Thread Jan Hubicka

 On 05/11/2015 12:46 PM, Jan Hubicka wrote:
 Well, my main motivatoin to extend from RECORD_OR_UNION_TYPE_P was to handle
 enums. But other case I would like to deal with are integer types - i.e. 
 preserve
 difference between char/signed char/unsigned char/short/int/long/wchar in 
 cases
 where they structurally coincide.
 
 In what context?  Won't you get that from comparing e.g. the field
 types of two definitions of the same class?

If one class define int foo; and other long foo; we currently do not 
complain
about ODR on 32bit targets while I think we could.
Other case was the ODR violations dragged by the signed/unsigned:

char switch:
$ cat t.C
char a;
$ ./xgcc -B ./ -O2 t.C -o t1.o -fno-signed-char -c -flto
$ ./xgcc -B ./ -O2 t.C -o t2.o -fsigned-char -c -flto
$ ./xgcc -B ./ -O2 t1.o t2.o -flto -fno-signed-char  -flto
built-in: warning: type ïcharï violates one definition rule [-Wodr]
built-in: note: a type with different signedness is defined in another 
translation unit
t.C:1:6: warning: type of ïaï does not match original declaration
 char a;
  ^
t.C:1:6: note: previously declared here
 char a;
  ^

I also want to use ODR for more fine grained TBAA on LTO and there the 
differences between
integer types matter more.

Honza
 
 Jason

Re: [RFC]: Remove Mem/address type assumption in combiner

2015-05-11 Thread Segher Boessenkool

Hi Steve,

On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote:
 This patch broke a number of MIPS tests, specifically mips32r6 tests
 that look for the lsa instruction (load scaled address) which shifts one
 register and then adds it to a second register.  I am not sure if this
 needs to be addressed in combine.c or if we need to add a peephole
 optimization to mips.md to handle the new instruction sequence.  What do
 you think?  Is the change here what you would expect to see from your
 patch?

Yes, this is as expected.  AFAICS the only change you need in the
MIPS backend is to change the GPR:dlsa pattern to match a shift
instead of a mult (and change the const_immlsa_operand predicate
to just match 1..4 instead of the powers).


Segher

Re: [Patch][loop-invariant.c] Fix a couple of bugs regarding loop invariant motion discovered by spec2k6 on aarch64

2015-05-11 Thread Jeff Law


On 05/11/2015 06:02 AM, David Sherwood wrote:

Hi,

This patch fixes a couple of issues I saw during the compilation of shell_lam.f
for 410.bwaves test in spec2006:

* create_new_invariant: We shouldn't bother attempting to calculate the address
cost for something that clearly isn't an address. Use a SCALAR_INT_MODE_P check
to eliminate the most obvious cases, such as vector modes and so on.
* get_inv_cost: Change the code so that we don't treat something as an address
if it is never actually used as an address, i.e. we didn't use it at all. This
occurs due to the way we process the innermost loop first - invariants get
pulled out of inner most loop and then get processed during next innermost loop.
Here is more detail from shell_lam.f.210r.loop2_invariant dump file that
shows what currently happens:

4427 *ending processing of loop 27 **

4442 (const_vector:V2DF [
4443 (const_double:DF 4.0e+0 [0x0.8p+3])
 (const_double:DF 4.0e+0 [0x0.8p+3])
4445 ])
4446
4447 Hot cost: 8 (final)
4448 (set (reg:V2DF 3179)
4449 (const_vector:V2DF [
4450 (const_double:DF 4.0e+0 [0x0.8p+3])
4451 (const_double:DF 4.0e+0 [0x0.8p+3])
4452 ]))
4453
4454 Hot cost: 8 (final)
4455 Set in insn 2764 is invariant (1), cost 8, depends on

4471 Decided to move invariant 1 -- gain 8

move_invariant_reg moves invariant (the const_vector above) into 3490 and
replaces use of 3179 with 3490, but conservatively keeps the definition of 3179,
which gets processed in outer loop, but is never used.

Debug for outer loop now follows:

4497 *starting processing of loop 26 **

6379 (insn 2764 2748 2766 110 (set (reg:V2DF 3490)
6380 (const_vector:V2DF [
6381 (const_double:DF 4.0e+0 [0x0.8p+3])
6382 (const_double:DF 4.0e+0 [0x0.8p+3])
6383 ])) shell_lam.f:287 819 {*aarch64_simd_movv2df}
6384  (nil))
6385 ;;   UD chains for insn luid 88 uid 2766

6604 ;;   UD chains for insn luid 21 uid 3836
6605 ;;  reg 3490 { d419(bb 110 insn 2764) }
6606 (insn 3836 2763 2765 111 (set (reg:V2DF 3179)
6607 (reg:V2DF 3490)) shell_lam.f:287 -1
6608  (nil))
^^^ additional insn generated by move_invariant_reg from loop 27

6836 ;;   UD chains for insn luid 40 uid 2790
6837 ;;  reg 3161 { d272(bb 111 insn 2746) }
6838 ;;  reg 3201 { d305(bb 111 insn 2788) }
6839 ;;  reg 3490 { d419(bb 110 insn 2764) }
6840 ;;   eq_note reg 3161 { d272(bb 111 insn 2746) }
6841 ;;   eq_note reg 3201 { d305(bb 111 insn 2788) }
6842 (insn 2790 2788 2791 111 (set (reg:V2DF 3203 [ vect__930.427 ])
6843 (fma:V2DF (reg:V2DF 3161 [ MEM[base: vectp_q.371_2137, index: 
ivtmp.679_3214,
offset: 0B] ])
6844 (neg:V2DF (reg:V2DF 3490))
6845 (reg:V2DF 3201 [ vect__928.424 ]))) shell_lam.f:287 1219 
{fnmav2df4}
6846  (expr_list:REG_DEAD (reg:V2DF 3201 [ vect__928.424 ])
6847 (expr_list:REG_DEAD (reg:V2DF 3179) ...
^^^ 3179 marked as DEAD

8366 *ending processing of loop 26 **

8472 (const_vector:V2DF [
8473 (const_double:DF 4.0e+0 [0x0.8p+3])
8474 (const_double:DF 4.0e+0 [0x0.8p+3])
8475 ])
8476
8477 Hot cost: 8 (final)
8478 (set (reg:V2DF 3490)
8479 (const_vector:V2DF [
8480 (const_double:DF 4.0e+0 [0x0.8p+3])
8481 (const_double:DF 4.0e+0 [0x0.8p+3])
8482 ]))
8483
8484 Hot cost: 8 (final)
8485 Set in insn 2764 is invariant (12), cost 8, depends on

8505 (set (reg:V2DF 3179)
8506 (reg:V2DF 3490))
8507
8508 Hot cost: 8 (final)
8509 Set in insn 3836 is invariant (15), cost 8, depends on 12
^^^ Never moves 3179 invariant out, even though it's never used.

After my patch we now get this:

8586 Decided to move invariant 15 -- gain 16
8587 Decided to move dependent invariant 12

since get_inv_cost sees the number of uses is zero and the dependent invariant
(12) gets moved too. Of course, without my patch the definition of 3179 would
ultimately be eliminated as dead code in a later pass anyway.

Is this ok to go in?

Regards,
David Sherwood.

ChangeLog entry follows ...

 2015-05-08  David Sherwood  david.sherw...@arm.com

 * loop-invariant.c (create_new_invariant): Don't calculate address cost
 if mode is not scalar integers.
 (get_inv_cost): Increase computational cost for unused invariants.
So you'd need to bootstrap and regression test this change for it to be 
fully ready for review.


It would also be good to include a test for the testsuite where you can 
show the improvements in behaviour due to this change.  I'm not offhand 
sure of the license on bwaves, so you may or may not be able to reduce 
it and use the reduced code in the GCC testsuite.


With some

Re: [PATCH i386] Extend sibcall peepholes to allow source in %eax

2015-05-11 Thread Jan Hubicka

 On 05/11/2015 01:46 PM, Uros Bizjak wrote:
 On Mon, May 11, 2015 at 8:00 PM, Jan Hubicka hubi...@ucw.cz wrote:
 On Sun, 10 May 2015, Jan Hubicka wrote:
 
 On i386, peepholes that transform memory load and register-indirect jump 
 into
 memory-indirect jump are overly restrictive in that they don't allow 
 combining
 when the jump target is loaded into %eax, and the called function 
 returns a
 value (also in %eax, so it's not dead after the call).  Fix this by 
 checking
 for same source and output register operands separately.
 
 OK?
* config/i386/i386.md (sibcall_value_memory): Extend peepholes to
allow memory address in %eax.
(sibcall_value_pop_memory): Likewise.
 
 Why do we need the check for liveness after all?  There is SIBLING_CALL_P
 (peep2_next_insn (1)) so we know that the function terminates by the call
 and there are no other uses of the value.
 
 Indeed.  Uros, the peep2_reg_dead_p check was added by your patch as svn
 revision 211776, git commit e51f8b8fed.  Would you agree that the check is 
 not
 necessary for sibcalls as Honza explains?  Would you approve a patch that
 removes it in the sibcall peepholes I modify in the patch under discussion?
 
 Don't we however need to check that operands[0] is not used by the 
 call_insn as
 parameter of the call?  I.e. something like
 
 void
 test(void (*callback ()))
 {
callback(callback);
 }
 
 You need a pointer-to-pointer-to-function to trigger the peephole.  
 Something
 like this:
 
void foo()
{
  void (**bar)(void*);
  asm(:=r(bar));
  (*bar)(*bar);
}
 
 I think instead of peep2_reg_dead_p we want to check that the parameter 
 is not in
 CALL_INSN_FUNCTION_USAGE of the sibcall..
 
 Playing with the above testcase I can't induce failure.  It seems today GCC
 won't allocate the same register as callee address and one of the 
 arguments.
 Do you want me to implement such a check anyway?
 
 Hmm, only way I can trigger same register is:
 void foo()
{
  void (**bar)(void*);
  asm(:=r(bar));
  register void (*var)(void *) asm(%eax);
  var=*bar;
  asm(:+r(var));
  var(var);
}
 
 removing the second asm makes CSE to forward propagae the memory operand
 to call that makes the call different from the register variable.
 
 Still I would check for that, but this is more Uros' area.
 
 This is from [1], and reading this reference, it looks to me that the
 check was introduced due to:
 
 - Adds check that eliminated register is really dead after the call
 (maybe an overkill, but some hard-to-debug problems surfaced due to
 missing liveness checks in the past)
 
 Going down that memory lane, it looks like a safety check for
 something that *might* happen. Looking at the comment, I'd say we can
 remove the check, but we should look for possible fallout.
 I'd tend to agree.

Yes, to make my original email clear, I think we are safe to remove
peep2_reg_dead_p.

I would however introduce a check that the call target is not also among
parameters of the function. In this case the peephole would remove the load
and make the parameter unefined.

While current mainline don't seem to be able to translate the testcase above
that way, perhaps future improvements to LRA/postreload gcse may make it happen
and generally RTL patterns are better to be safe by definition not
only for the actual RTL we are able to generate. I suppose reg_mentioned_p
on call usage is enough.

Honza
 
 jeff

RE: [RFC]: Remove Mem/address type assumption in combiner

2015-05-11 Thread Matthew Fortune

Jeff Law l...@redhat.com writes:
 On 05/11/2015 01:46 PM, Jeff Law wrote:
  On 05/11/2015 01:44 PM, Steve Ellcey wrote:
  On Mon, 2015-05-11 at 13:22 -0500, Segher Boessenkool wrote:
  Hi Steve,
 
  On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote:
  This patch broke a number of MIPS tests, specifically mips32r6
  tests that look for the lsa instruction (load scaled address) which
  shifts one register and then adds it to a second register.  I am
  not sure if this needs to be addressed in combine.c or if we need
  to add a peephole optimization to mips.md to handle the new
  instruction sequence.
  What do
  you think?  Is the change here what you would expect to see from
  your patch?
 
  Yes, this is as expected.  AFAICS the only change you need in the
  MIPS backend is to change the GPR:dlsa pattern to match a shift
  instead of a mult (and change the const_immlsa_operand predicate
  to just match 1..4 instead of the powers).
 
 
  Segher
 
  Hm, I thought it was going to be more complicated than that, but it
  seems to be working.  I will do a complete test run and then submit a
  patch.
  Yea, it really should be that easy.
 
  I'm pretty sure the sh[123]add insns in the PA need to be updated in a
  similar manner.
 Oh, and just to be clear, I'm not expecting you to do this Steve.  It'd
 be great if you did, but not required.

Does this patch effectively change the canonicalization rules? The following
Still exists in md.texi:

@item
Within address computations (i.e., inside @code{mem}), a left shift is
converted into the appropriate multiplication by a power of two.

Thanks,
Matthew

[PATCH] FreeBSD add functionality to build PIE executables.

2015-05-11 Thread Andreas Tobler


All,

this patch adds the ability to build PIE executables for FreeBSD. The 
core is since a longer time in the code base of FreeBSD itself and is 
working fine.


This patch makes it available for all FreeBSD targets.
Tested on x86_64-*-freebsd11.0 and armv6/hf-*-freebsd11.0, 
i386-*-freebsd11.0 is progress.


In the same turn I removed the STARTFILE/ENDFILE_SPEC from 
config/i386/freebsd.h and use the ones from config/freebsd-spec.h.


Here the results before the patch:

https://gcc.gnu.org/ml/gcc-testresults/2015-05/msg01267.html

and with the patch:

https://gcc.gnu.org/ml/gcc-testresults/2015-05/msg01324.html

Is this ok for trunk and for 5.1X?

Thanks,
Andreas

2015-05-11  Andreas Tobler  andre...@gcc.gnu.org

* config/freebsd-spec.h (FBSD_STARTFILE_SPEC): Add the bits to build
pie executables.
(FBSD_ENDFILE_SPEC): Likewise.
* config/i386/freebsd.h (STARTFILE_SPEC): Remove and use the one from
config/freebsd-spec.h.
(ENDFILE_SPEC): Likewise.

2015-05-11  Andreas Tobler  andre...@gcc.gnu.org

* lib/target-supports.exp (check_effective_target_pie): Add *-*-freebsd*
to the familiy of pie capable targets.
Index: gcc/config/freebsd-spec.h
===
--- gcc/config/freebsd-spec.h   (revision 223017)
+++ gcc/config/freebsd-spec.h   (working copy)
@@ -66,8 +66,9 @@
   %{!shared: \
  %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \
   %{!p:%{profile:gcrt1.o%s} \
-%{!profile:crt1.o%s \
-   crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}
+%{!profile: \
+%{pie: Scrt1.o%s;:crt1.o%s} \
+   crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}
 
 /* Provide a ENDFILE_SPEC appropriate for FreeBSD.  Here we tack on
the magical crtend.o file (see crtstuff.c) which provides part of 
@@ -76,7 +77,7 @@
`crtn.o'.  */
 
 #define FBSD_ENDFILE_SPEC \
-  %{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s
+  %{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s
 
 /* Provide a LIB_SPEC appropriate for FreeBSD as configured and as
required by the user-land thread model.  Before __FreeBSD_version
Index: gcc/config/i386/freebsd.h
===
--- gcc/config/i386/freebsd.h   (revision 223017)
+++ gcc/config/i386/freebsd.h   (working copy)
@@ -59,29 +59,16 @@
 #define SUBTARGET_EXTRA_SPECS \
   { fbsd_dynamic_linker, FBSD_DYNAMIC_LINKER }
 
-/* Provide a STARTFILE_SPEC appropriate for FreeBSD.  Here we add
-   the magical crtbegin.o file (see crtstuff.c) which provides part 
-   of the support for getting C++ file-scope static object constructed 
-   before entering `main'.  */
-   
-#undef STARTFILE_SPEC
-#define STARTFILE_SPEC \
-  %{!shared: \
- %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \
-  %{!p:%{profile:gcrt1.o%s} \
-%{!profile:crt1.o%s \
-   crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}
+/* Use the STARTFILE_SPEC from config/freebsd-spec.h.  */
 
-/* Provide a ENDFILE_SPEC appropriate for FreeBSD.  Here we tack on
-   the magical crtend.o file (see crtstuff.c) which provides part of 
-   the support for getting C++ file-scope static object constructed 
-   before entering `main', followed by a normal finalizer file, 
-   `crtn.o'.  */
+#undef  STARTFILE_SPEC
+#define STARTFILE_SPEC FBSD_STARTFILE_SPEC
 
-#undef ENDFILE_SPEC
-#define ENDFILE_SPEC \
-  %{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s
+/* Use the ENDFILE_SPEC from config/freebsd-spec.h.  */
 
+#undef  ENDFILE_SPEC
+#define ENDFILE_SPEC FBSD_ENDFILE_SPEC
+
 /* Provide a LINK_SPEC appropriate for FreeBSD.  Here we provide support
for the special GCC options -static and -shared, which allow us to
link things in one of these three modes by applying the appropriate
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   (revision 223017)
+++ gcc/testsuite/lib/target-supports.exp   (working copy)
@@ -954,6 +954,7 @@
 
 proc check_effective_target_pie { } {
 if { [istarget *-*-darwin\[912\]*]
+|| [istarget *-*-freebsd*]
 || [istarget *-*-linux*]
 || [istarget *-*-gnu*] } {
return 1;

Re: [PATCH] libiberty: cleanup Makefile.in

2015-05-11 Thread Joseph Myers

On Mon, 11 May 2015, Bernhard Reutner-Fischer wrote:

 +TARGETLIB_PIC = $(dir $(TARGETLIB))pic/$(notdir $(TARGETLIB))
 +TARGETLIB_NOASAN = $(dir $(TARGETLIB))noasan/$(notdir $(TARGETLIB))

Is it OK to use GNU make features in libiberty now?  That is, do binutils 
and GDB now require GNU make?

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Re: Fwd: [PING 2][PATCH] libgcc: Add CFI directives to the soft floating point support code for ARM

2015-05-11 Thread Martin Galvan

On Mon, May 11, 2015 at 4:45 AM, Ramana Radhakrishnan 
ramana.radhakrish...@arm.com wrote:
 sorry about the slow response, I was travelling for a bit and missed your
 emails. Trying your patch out gives me failures possibly because my mail
 client munged it when it received this inline.

 Can you please rebase if necessary, test and send it again as an attachment?

I checked it again and it seems to be a whitespace issue.
I re-generated the patch and this one seems to apply just fine; try it out
again.

---
libgcc/config/arm/ieee754-df.S | 177 +---
libgcc/config/arm/ieee754-sf.S | 101 ++--
libgcc/config/arm/lib1funcs.S  |  10 ++
3 files changed, 261 insertions(+), 27 deletions(-)

diff --git a/libgcc/config/arm/ieee754-df.S b/libgcc/config/arm/ieee754-df.S
index f75630b..d1f9066 100644
--- a/libgcc/config/arm/ieee754-df.S
+++ b/libgcc/config/arm/ieee754-df.S
@@ -33,8 +33,12 @@
  * Only the default rounding mode is intended for best performances.
  * Exceptions aren't supported yet, but that can be added quite easily
  * if necessary without impacting performances.
+ *
+ * In the CFI related comments, 'previousOffset' refers to the previous offset
+ * from sp used to compute the CFA.
  */

+.cfi_sections .debug_frame

 #ifndef __ARMEB__
 #define xl r0
@@ -53,11 +57,13 @@

 ARM_FUNC_START negdf2
 ARM_FUNC_ALIAS aeabi_dneg negdf2
+CFI_START_FUNCTION

 @ flip sign bit
 eor xh, xh, #0x8000
 RET

+CFI_END_FUNCTION
 FUNC_END aeabi_dneg
 FUNC_END negdf2

@@ -66,6 +72,7 @@ ARM_FUNC_ALIAS aeabi_dneg negdf2
 #ifdef L_arm_addsubdf3

 ARM_FUNC_START aeabi_drsub
+CFI_START_FUNCTION

 eor xh, xh, #0x8000 @ flip sign bit of first arg
 b   1f
@@ -81,7 +88,11 @@ ARM_FUNC_ALIAS aeabi_dsub subdf3
 ARM_FUNC_START adddf3
 ARM_FUNC_ALIAS aeabi_dadd adddf3

-1:  do_push {r4, r5, lr}
+1:  do_push {r4, r5, lr}@ sp -= 12
+.cfi_adjust_cfa_offset 12   @ CFA is now sp + previousOffset + 12
+.cfi_rel_offset r4, 0   @ Registers are saved from sp to sp + 8
+.cfi_rel_offset r5, 4
+.cfi_rel_offset lr, 8

 @ Look for zeroes, equal values, INF, or NAN.
 shift1  lsl, r4, xh, #1
@@ -148,6 +159,11 @@ ARM_FUNC_ALIAS aeabi_dadd adddf3
 @ Since this is not common case, rescale them off line.
 teq r4, r5
 beq LSYM(Lad_d)
+
+@ CFI note: we're lucky that the branches to Lad_* that appear after this 
function
+@ have a CFI state that's exactly the same as the one we're in at this
+@ point. Otherwise the CFI would change to a different state after the branch,
+@ which would be disastrous for backtracing.
 LSYM(Lad_x):

 @ Compensate for the exponent overlapping the mantissa MSB added later
@@ -413,6 +429,7 @@ LSYM(Lad_i):
 orrne   xh, xh, #0x0008 @ quiet NAN
 RETLDM  r4, r5

+CFI_END_FUNCTION
 FUNC_END aeabi_dsub
 FUNC_END subdf3
 FUNC_END aeabi_dadd
@@ -420,12 +437,19 @@ LSYM(Lad_i):

 ARM_FUNC_START floatunsidf
 ARM_FUNC_ALIAS aeabi_ui2d floatunsidf
+CFI_START_FUNCTION

 teq r0, #0
 do_it   eq, t
 moveq   r1, #0
 RETc(eq)
-do_push {r4, r5, lr}
+
+do_push {r4, r5, lr}@ sp -= 12
+.cfi_adjust_cfa_offset 12   @ CFA is now sp + previousOffset + 12
+.cfi_rel_offset r4, 0   @ Registers are saved from sp + 0 to sp + 8.
+.cfi_rel_offset r5, 4
+.cfi_rel_offset lr, 8
+
 mov r4, #0x400  @ initial exponent
 add r4, r4, #(52-1 - 1)
 mov r5, #0  @ sign bit is 0
@@ -435,17 +459,25 @@ ARM_FUNC_ALIAS aeabi_ui2d floatunsidf
 mov xh, #0
 b   LSYM(Lad_l)

+CFI_END_FUNCTION
 FUNC_END aeabi_ui2d
 FUNC_END floatunsidf

 ARM_FUNC_START floatsidf
 ARM_FUNC_ALIAS aeabi_i2d floatsidf
+CFI_START_FUNCTION

 teq r0, #0
 do_it   eq, t
 moveq   r1, #0
 RETc(eq)
-do_push {r4, r5, lr}
+
+do_push {r4, r5, lr}@ sp -= 12
+.cfi_adjust_cfa_offset 12   @ CFA is now sp + previousOffset + 12
+.cfi_rel_offset r4, 0   @ Registers are saved from sp + 0 to sp + 8.
+.cfi_rel_offset r5, 4
+.cfi_rel_offset lr, 8
+
 mov r4, #0x400  @ initial exponent
 add r4, r4, #(52-1 - 1)
 andsr5, r0, #0x8000 @ sign bit in r5
@@ -457,11 +489,13 @@ ARM_FUNC_ALIAS aeabi_i2d floatsidf
 mov xh, #0
 b   LSYM(Lad_l)

+CFI_END_FUNCTION
 FUNC_END aeabi_i2d
 FUNC_END floatsidf

 ARM_FUNC_START extendsfdf2
 ARM_FUNC_ALIAS aeabi_f2d extendsfdf2
+CFI_START_FUNCTION

 movsr2, r0, lsl #1  @ toss sign bit
 mov xh, r2, asr #3  @ stretch exponent
@@ -480,34 +514,54 @@ ARM_FUNC_ALIAS aeabi_f2d extendsfdf2

 @ value was denormalized.  We can normalize it now.
 do_push {r4, r5, lr}
+.cfi_adjust_cfa_offset 12   @ CFA is now sp + previousOffset + 12
+.cfi_rel_offset r4, 0   @ Registers are saved from sp + 0 to sp + 8.
+.cfi_rel_offset r5, 4
+.cfi_rel_offset lr, 8
+
 mov r4, #0x380  @ setup corresponding exponent

Re: [PATCH i386] Extend sibcall peepholes to allow source in %eax

2015-05-11 Thread Jeff Law


On 05/11/2015 01:46 PM, Uros Bizjak wrote:

On Mon, May 11, 2015 at 8:00 PM, Jan Hubicka hubi...@ucw.cz wrote:

On Sun, 10 May 2015, Jan Hubicka wrote:


On i386, peepholes that transform memory load and register-indirect jump into
memory-indirect jump are overly restrictive in that they don't allow combining
when the jump target is loaded into %eax, and the called function returns a
value (also in %eax, so it's not dead after the call).  Fix this by checking
for same source and output register operands separately.

OK?
   * config/i386/i386.md (sibcall_value_memory): Extend peepholes to
   allow memory address in %eax.
   (sibcall_value_pop_memory): Likewise.


Why do we need the check for liveness after all?  There is SIBLING_CALL_P
(peep2_next_insn (1)) so we know that the function terminates by the call
and there are no other uses of the value.


Indeed.  Uros, the peep2_reg_dead_p check was added by your patch as svn
revision 211776, git commit e51f8b8fed.  Would you agree that the check is not
necessary for sibcalls as Honza explains?  Would you approve a patch that
removes it in the sibcall peepholes I modify in the patch under discussion?


Don't we however need to check that operands[0] is not used by the call_insn as
parameter of the call?  I.e. something like

void
test(void (*callback ()))
{
   callback(callback);
}


You need a pointer-to-pointer-to-function to trigger the peephole.  Something
like this:

   void foo()
   {
 void (**bar)(void*);
 asm(:=r(bar));
 (*bar)(*bar);
   }


I think instead of peep2_reg_dead_p we want to check that the parameter is not 
in
CALL_INSN_FUNCTION_USAGE of the sibcall..


Playing with the above testcase I can't induce failure.  It seems today GCC
won't allocate the same register as callee address and one of the arguments.
Do you want me to implement such a check anyway?


Hmm, only way I can trigger same register is:
void foo()
   {
 void (**bar)(void*);
 asm(:=r(bar));
 register void (*var)(void *) asm(%eax);
 var=*bar;
 asm(:+r(var));
 var(var);
   }

removing the second asm makes CSE to forward propagae the memory operand
to call that makes the call different from the register variable.

Still I would check for that, but this is more Uros' area.


This is from [1], and reading this reference, it looks to me that the
check was introduced due to:

- Adds check that eliminated register is really dead after the call
(maybe an overkill, but some hard-to-debug problems surfaced due to
missing liveness checks in the past)

Going down that memory lane, it looks like a safety check for
something that *might* happen. Looking at the comment, I'd say we can
remove the check, but we should look for possible fallout.

I'd tend to agree.

jeff

1 2 >

1 - 100 of 148 matches

Mail list logo