date:20161121

Re: [PATCH] Fix gimple store merging (PR tree-optimization/78436)

2016-11-21 Thread Richard Biener

On Mon, 21 Nov 2016, Jakub Jelinek wrote:

> Hi!
> 
> The
>if (!BYTES_BIG_ENDIAN)
> -shift_bytes_in_array (tmpbuf, byte_size, shift_amnt);
> +{
> +  shift_bytes_in_array (tmpbuf, byte_size, shift_amnt);
> +  if (shift_amnt == 0)
> +   byte_size--;
> +}
> hunk below is the actual fix for the PR, where we originally store:
> 8-bit 0 at offset 24-bits followed by 24-bit negative value at offset 0,
> little endian.  encode_tree_to_bitpos actually allocates 1 extra byte in the
> buffer and byte_size is also 1 byte longer, for the case where the
> bits need to be shifted (it only cares about shifts within bytes, so 0 to
> BITS_PER_UNIT - 1).  If no shifting is done and there is no padding, we are
> also fine, because native_encode_expr will only actually write the size of
> TYPE_MODE bytes.  But in this case padding is 1 byte, so native_encode_expr
> writes 4 bytes (the last one is 0xff), byte_size is initially 5, as padding
> is 1, it is decremented to 4.  But we actually want to store just 3 bytes,
> not 4; when we store 4, we overwrite the earlier value of the following
> byte.
> 
> The rest of the patch are just cleanups.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Richard.
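
To make the failure mode described in the quoted explanation concrete, here is a minimal standalone sketch. It is not GCC code: the function name `merge_two_stores` and the fixed 4-byte buffer are assumptions made for illustration, and only the byte counts (3 versus 4) come from the description.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustrative model only (not GCC code): merge two little-endian stores
// into one 4-byte buffer.  Store A is an 8-bit value at bit offset 24
// (byte 3); store B is a 24-bit negative value at bit offset 0 whose
// 32-bit encoding is sign-extended, so its fourth byte is 0xff.
std::array<unsigned char, 4>
merge_two_stores (unsigned char store_a, std::size_t bytes_copied_for_b)
{
  assert (bytes_copied_for_b <= 4);

  std::array<unsigned char, 4> buf{};
  buf[3] = store_a;

  // Encoding of the 24-bit value -2 in a 32-bit mode, little endian.
  const unsigned char encoded_b[4] = { 0xfe, 0xff, 0xff, 0xff };

  // Copying 4 bytes clobbers byte 3; copying 3 bytes leaves it alone.
  std::memcpy (buf.data (), encoded_b, bytes_copied_for_b);
  return buf;
}
```

Calling `merge_two_stores (0, 4)` models the pre-fix behaviour, where the earlier zero byte is overwritten; `merge_two_stores (0, 3)` models the intended result.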

> 2016-11-21  Jakub Jelinek  
> 
>   PR tree-optimization/78436
>   * gimple-ssa-store-merging.c (zero_char_buf): Removed.
>   (shift_bytes_in_array, shift_bytes_in_array_right,
>   merged_store_group::apply_stores): Formatting fixes.
>   (clear_bit_region): Likewise.  Use memset.
>   (encode_tree_to_bitpos): Formatting fixes.  Fix comment typos - EPXR
>   instead of EXPR and inerted instead of inserted.  Use memset instead
>   of zero_char_buf.  For !BYTES_BIG_ENDIAN decrease byte_size by 1
>   if shift_amnt is 0.
> 
>   * gcc.c-torture/execute/pr78436.c: New test.
> 
> --- gcc/gimple-ssa-store-merging.c.jj 2016-11-09 15:22:36.0 +0100
> +++ gcc/gimple-ssa-store-merging.c 2016-11-21 10:54:51.746090238 +0100
> @@ -199,17 +199,6 @@ dump_char_array (FILE *fd, unsigned char
>fprintf (fd, "\n");
>  }
>  
> -/* Fill a byte array PTR of SZ elements with zeroes.  This is to be used by
> -   encode_tree_to_bitpos to zero-initialize most likely small arrays but
> -   with a non-compile-time-constant size.  */
> -
> -static inline void
> -zero_char_buf (unsigned char *ptr, unsigned int sz)
> -{
> -  for (unsigned int i = 0; i < sz; i++)
> -ptr[i] = 0;
> -}
> -
>  /* Shift left the bytes in PTR of SZ elements by AMNT bits, carrying over the
> bits between adjacent elements.  AMNT should be within
> [0, BITS_PER_UNIT).
> @@ -224,14 +213,13 @@ shift_bytes_in_array (unsigned char *ptr
>  return;
>  
>unsigned char carry_over = 0U;
> -  unsigned char carry_mask = (~0U) << ((unsigned char)(BITS_PER_UNIT - amnt));
> +  unsigned char carry_mask = (~0U) << (unsigned char) (BITS_PER_UNIT - amnt);
>unsigned char clear_mask = (~0U) << amnt;
>  
>for (unsigned int i = 0; i < sz; i++)
>  {
>unsigned prev_carry_over = carry_over;
> -  carry_over
> - = (ptr[i] & carry_mask) >> (BITS_PER_UNIT - amnt);
> +  carry_over = (ptr[i] & carry_mask) >> (BITS_PER_UNIT - amnt);
>  
>ptr[i] <<= amnt;
>if (i != 0)
> @@ -263,10 +251,9 @@ shift_bytes_in_array_right (unsigned cha
>for (unsigned int i = 0; i < sz; i++)
>  {
>unsigned prev_carry_over = carry_over;
> -  carry_over
> - = (ptr[i] & carry_mask);
> +  carry_over = ptr[i] & carry_mask;
>  
> - carry_over <<= ((unsigned char)BITS_PER_UNIT - amnt);
> + carry_over <<= (unsigned char) BITS_PER_UNIT - amnt;
>   ptr[i] >>= amnt;
>   ptr[i] |= prev_carry_over;
>  }
> @@ -327,7 +314,7 @@ clear_bit_region (unsigned char *ptr, un
>/* Second base case.  */
>else if ((start + len) <= BITS_PER_UNIT)
>  {
> -  unsigned char mask = (~0U) << ((unsigned char)(BITS_PER_UNIT - len));
> +  unsigned char mask = (~0U) << (unsigned char) (BITS_PER_UNIT - len);
>mask >>= BITS_PER_UNIT - (start + len);
>  
>ptr[0] &= ~mask;
> @@ -346,8 +333,7 @@ clear_bit_region (unsigned char *ptr, un
>unsigned int nbytes = len / BITS_PER_UNIT;
>/* We could recurse on each byte but do the loop here to avoid
>recursing too deep.  */
> -  for (unsigned int i = 0; i < nbytes; i++)
> - ptr[i] = 0U;
> +  memset (ptr, '\0', nbytes);
>/* Clear the remaining sub-byte region if there is one.  */
>if (len % BITS_PER_UNIT != 0)
>   clear_bit_region (ptr + nbytes, 0, len % BITS_PER_UNIT);
> @@ -362,7 +348,7 @@ clear_bit_region (unsigned char *ptr, un
>  
>  static bool
>  encode_tree_to_bitpos (tree expr, unsigned char *ptr, int bitlen, int bitpos,
> - unsigned int total_bytes)
> +unsigned int total_bytes)
>  {
>unsigned int first_byte = bitpos / BITS_PER_UNIT;
>tree tmp_int = expr;
> @@ -370,8 +356,8 @
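
The shift_bytes_in_array hunk in the diff above carries bits between adjacent array elements. The following is a simplified sketch of that idea only, not the function's actual body; `shift_left_bytes` is a hypothetical name and `CHAR_BIT` stands in for BITS_PER_UNIT.

```cpp
#include <cassert>
#include <climits>

// Sketch of a carry-propagating left shift over a byte array in which
// index 0 is the least significant byte.  AMNT must be in [0, CHAR_BIT).
void
shift_left_bytes (unsigned char *ptr, unsigned int sz, unsigned int amnt)
{
  assert (amnt < CHAR_BIT);
  if (amnt == 0)
    return;

  unsigned char carry = 0;
  for (unsigned int i = 0; i < sz; i++)
    {
      // Bits shifted out of the top of this byte move into the next one.
      unsigned char next_carry
        = static_cast<unsigned char> (ptr[i] >> (CHAR_BIT - amnt));
      ptr[i] = static_cast<unsigned char> ((ptr[i] << amnt) | carry);
      carry = next_carry;
    }
}
```

Because the shift is always smaller than one byte, at most one extra byte of room is ever needed for the bits carried out of the last element, which matches the "1 extra byte" that the explanation attributes to encode_tree_to_bitpos.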

Re: [PATCH] Fix ICE with -Wuninitialized (PR tree-optimization/78455)

2016-11-21 Thread Jakub Jelinek

On Mon, Nov 21, 2016 at 04:02:40PM -0800, Marek Polacek wrote:
> What seems like a typo caused an ICE here.  We've got a vector of vectors here
> and we're trying to walk all the elements, so the second loop oughta use 'j'.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2016-11-21  Marek Polacek  
> 
>   PR tree-optimization/78455
>   * tree-ssa-uninit.c (can_chain_union_be_invalidated_p): Fix typo.
> 
>   * gcc.dg/uninit-23.c: New.

I'd say this is even obvious.  Ok.

Jakub

[PATCH] Propagate cv qualifications in variant_alternative

2016-11-21 Thread Tim Shen

Tested on x86_64-linux-gnu.

Thanks!


-- 
Regards,
Tim Shen
commit 69c72d9bb802fd5e4f2704f0fe8a041f8b26d8bd
Author: Tim Shen 
Date:   Mon Nov 21 21:29:13 2016 -0800

2016-11-22  Tim Shen  

	PR libstdc++/78441
	* include/std/variant: Propagate cv qualifications to types returned
	by variant_alternative.
	* testsuite/20_util/variant/compile.cc: Tests.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 7d93575..34ad3fd 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -85,6 +85,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 using variant_alternative_t =
   typename variant_alternative<_Np, _Variant>::type;
 
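+  template<size_t _Np, typename _Variant>
+    struct variant_alternative<_Np, const _Variant>
+    { using type = add_const_t<variant_alternative_t<_Np, _Variant>>; };
+
+  template<size_t _Np, typename _Variant>
+    struct variant_alternative<_Np, volatile _Variant>
+    { using type = add_volatile_t<variant_alternative_t<_Np, _Variant>>; };
+
+  template<size_t _Np, typename _Variant>
+    struct variant_alternative<_Np, const volatile _Variant>
+    { using type = add_cv_t<variant_alternative_t<_Np, _Variant>>; };
+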
+
   constexpr size_t variant_npos = -1;
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/testsuite/20_util/variant/compile.cc b/libstdc++-v3/testsuite/20_util/variant/compile.cc
index 2470bcc..e3330be 100644
--- a/libstdc++-v3/testsuite/20_util/variant/compile.cc
+++ b/libstdc++-v3/testsuite/20_util/variant/compile.cc
@@ -330,3 +330,12 @@ void test_adl()
variant v8{allocator_arg, a, in_place_type, il, x};
variant v9{allocator_arg, a, in_place_type, 1};
 }
+
+void test_variant_alternative() {
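+  static_assert(is_same_v<variant_alternative_t<0, variant<int, string>>, int>, "");
+  static_assert(is_same_v<variant_alternative_t<1, variant<int, string>>, string>, "");
+
+  static_assert(is_same_v<variant_alternative_t<0, const variant<int>>, const int>, "");
+  static_assert(is_same_v<variant_alternative_t<0, volatile variant<int>>, volatile int>, "");
+  static_assert(is_same_v<variant_alternative_t<0, const volatile variant<int>>, const volatile int>, "");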
+}
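
As a usage-level illustration of what the patch is meant to enable, the sketch below checks the same property through the public `std::` names. It assumes a C++17 compiler whose library implements cv propagation for `std::variant_alternative`; the alias `V` and the helper `alternative_is` are hypothetical names introduced here.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <variant>

using V = std::variant<int, std::string>;

// Hypothetical helper: true if alternative N of Variant has type Expected.
template <std::size_t N, typename Variant, typename Expected>
constexpr bool alternative_is
  = std::is_same_v<std::variant_alternative_t<N, Variant>, Expected>;

// Unqualified variant: the alternative type is returned unchanged.
static_assert (alternative_is<0, V, int>, "");
static_assert (alternative_is<1, V, std::string>, "");

// cv-qualified variant: the qualifiers carry over to the alternative.
static_assert (alternative_is<0, const V, const int>, "");
static_assert (alternative_is<1, volatile V, volatile std::string>, "");
static_assert (alternative_is<0, const volatile V, const volatile int>, "");
```

Without the three partial specializations, instantiating `variant_alternative` with a cv-qualified variant type would find no matching definition, so the qualified cases would fail to compile rather than yield the wrong type.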

Re: [PATCH v3] cpp/c: Add -Wexpansion-to-defined

2016-11-21 Thread Paolo Bonzini

> It's not obvious to me whether this belongs in -Wextra.  After all, this
> is a perfectly reasonable and useful GNU C feature, or at least some cases
> of it are (like "#define FOO (BAR || defined something)").  Is the
> argument that there are too many details of it that differ between
> implementations, as discussed in section 3.2 of
> ?

Yes, and in general it fits the group of "often annoying warnings, that
people may nevertheless appreciate" that are already in -Wextra, for
example -Wunused-parameter, -Wmissing-field-initializers or
-Wshift-negative-value.

Thanks,

Paolo
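
For readers who have not met the construct under discussion, here is a small sketch. `BAR`, `HAVE_THING`, `HAVE_THING_DEFINED` and `foo_enabled` are hypothetical names; the non-portable form appears only inside a comment so the snippet itself does not depend on implementation-specific behaviour.

```cpp
// Hypothetical feature macros, for illustration only.
#define BAR 0
#define HAVE_THING 1

// Form under discussion: "defined" is produced by macro expansion, which
// the C and C++ standards leave undefined and implementations handle
// differently:
//
//   #define FOO (BAR || defined HAVE_THING)
//   #if FOO
//   ...
//   #endif

// Portable equivalent: evaluate "defined" directly in an #if and record
// the result in an ordinary macro.
#if defined HAVE_THING
# define HAVE_THING_DEFINED 1
#else
# define HAVE_THING_DEFINED 0
#endif
#define FOO (BAR || HAVE_THING_DEFINED)

#if FOO
constexpr bool foo_enabled = true;
#else
constexpr bool foo_enabled = false;
#endif
```

The portable form costs a few extra lines per feature macro, which is presumably why the commented-out form remains common in existing code.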

Ping: Re: [patch, avr] Add flash size to device info and make wrap around default

2016-11-21 Thread Pitchumani Sivanupandi


Ping!

On Monday 14 November 2016 07:03 PM, Pitchumani Sivanupandi wrote:

Ping!

On Thursday 10 November 2016 01:53 PM, Pitchumani Sivanupandi wrote:

On Wednesday 09 November 2016 08:05 PM, Georg-Johann Lay wrote:

On 09.11.2016 10:14, Pitchumani Sivanupandi wrote:

On Tuesday 08 November 2016 02:57 PM, Georg-Johann Lay wrote:

On 08.11.2016 08:08, Pitchumani Sivanupandi wrote:
I have updated the patch to include the flash size as well. Took that info from
device headers (it was fed into crt's device information note section also).

The new option would render -mn-flash superfluous, but we should keep it for
backward compatibility.

Ok.

Shouldn't link_pmem_wrap then be removed from link_relax, i.e. from
LINK_RELAX_SPEC?  And what happens if relaxation is off?

Yes. Removed link_pmem_wrap from link_relax.
Disabling relaxation doesn't change -mpmem-wrap-around behavior.

flashsize-and-wrap-around.patch


diff --git a/gcc/config/avr/avr-mcus.def b/gcc/config/avr/avr-mcus.def

index 6bcc6ff..9d4aa1a 100644



 /*



 /* Classic, > 8K, <= 64K.  */
-AVR_MCU ("avr3", ARCH_AVR3, AVR_ISA_NONE, NULL, 0x0060, 0x0, 1)
-AVR_MCU ("at43usb355", ARCH_AVR3, AVR_ISA_NONE, "__AVR_AT43USB355__", 0x0060, 0x0, 1)
-AVR_MCU ("at76c711", ARCH_AVR3, AVR_ISA_NONE, "__AVR_AT76C711__", 0x0060, 0x0, 1)
+AVR_MCU ("avr3", ARCH_AVR3, AVR_ISA_NONE, NULL, 0x0060, 0x0, 1, 0x6000)
+AVR_MCU ("at43usb355", ARCH_AVR3, AVR_ISA_NONE, "__AVR_AT43USB355__", 0x0060, 0x0, 1, 0x6000)
+AVR_MCU ("at76c711", ARCH_AVR3, AVR_ISA_NONE, "__AVR_AT76C711__", 0x0060, 0x0, 1, 0x4000)
+AVR_MCU ("at43usb320", ARCH_AVR3, AVR_ISA_NONE, "__AVR_AT43USB320__", 0x0060, 0x0, 1, 0x1)

 /* Classic, == 128K.  */
-AVR_MCU ("avr31", ARCH_AVR31, AVR_ERRATA_SKIP, NULL, 0x0060, 0x0, 2)
-AVR_MCU ("atmega103", ARCH_AVR31, AVR_ERRATA_SKIP, "__AVR_ATmega103__", 0x0060, 0x0, 2)
-AVR_MCU ("at43usb320", ARCH_AVR31, AVR_ISA_NONE, "__AVR_AT43USB320__", 0x0060, 0x0, 2)
+AVR_MCU ("avr31", ARCH_AVR31, AVR_ERRATA_SKIP, NULL, 0x0060, 0x0, 2, 0x2)
+AVR_MCU ("atmega103", ARCH_AVR31, AVR_ERRATA_SKIP, "__AVR_ATmega103__", 0x0060, 0x0, 2, 0x2)

 /* Classic + MOVW + JMP/CALL.  */


If at43usb320 is in the wrong multilib, then this should be handled 
as separate issue / patch together with its own PR. Sorry for the 
confusion.  I just noticed that some fields don't match...


It is not even clear to me from the data sheet if avr3 is the 
correct multilib or perhaps avr35 (if it supports MOVW) or even avr5 
(if it also has MUL) as there is no reference to the exact 
instruction set -- Atmochip will know.


Moreover, such a change should be sync'ed with avr-libc as all 
multilib stuff is hand-wired there: no use of --print-foo meta 
information retrieval by avr-libc :-((


I filed PR78275 and https://savannah.nongnu.org/bugs/index.php?49565 
for this one.


That's better. I've attached the updated patch. If OK, could someone
commit please?


I'll see if I can find some more info for AT43USB320.

Regards,
Pitchumani

Re: [PATCH v3] cpp/c: Add -Wexpansion-to-defined

2016-11-21 Thread Joseph Myers

It's not obvious to me whether this belongs in -Wextra.  After all, this 
is a perfectly reasonable and useful GNU C feature, or at least some cases 
of it are (like "#define FOO (BAR || defined something)").  Is the 
argument that there are too many details of it that differ between 
implementations, as discussed in section 3.2 of 
?

-- 
Joseph S. Myers
jos...@codesourcery.com

Ping 3 [PATCH] enhance buffer overflow warnings (and c/53562)

2016-11-21 Thread Martin Sebor


Ping: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00896.html

On 11/16/2016 08:58 AM, Martin Sebor wrote:

I'm still looking for a review of the patch below, first posted
on 10/28 and last updated/pinged last Wednesday:

  https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00896.html

Thanks

On 11/09/2016 03:49 PM, Martin Sebor wrote:

The attached minor update to the patch also resolves bug 77784 that
points out that -Wformat-length issues a warning also issued during
the expansion of some of the __builtin___sprintf_chk intrinsics.

Martin

On 11/04/2016 02:16 PM, Martin Sebor wrote:

Attached is an update to the patch that takes into consideration
the feedback I got.  It goes back to adding just one option,
-Wstringop-overflow, as in the original, while keeping the Object
Size type as an argument.  It uses type-1 as the default setting
for string functions (strcpy et al.) and, unconditionally, type-0
for raw memory functions (memcpy, etc.)

I retested Binutils 2.27 and the Linux kernel again with this patch
and also added Glibc, and it doesn't complain about anything (both
Binutils and the kernel also build cleanly with an unpatched GCC
with _FORTIFY_SOURCE=2 or its rough equivalent for the kernel).
The emit-rtl.c warning (bug 78174) has also been suppressed by
the change to bos type-0 for memcpy.

While the patch doesn't trigger any false positives (AFAIK) it is
subject to a fair number of false negatives due to the limitations
of the tree-object-size pass, and due to transformations done by
other passes that prevent it from detecting some otherwise obvious
overflows.  Although unfortunate, I believe the warnings that are
emitted are useful as the first line of defense in software that
doesn't use _FORTIFY_SOURCE (such as GCC itself).   And this can
of course be improved if some of the limitations are removed over
time.

Martin
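
The difference between Object Size type-0 and type-1 mentioned above can be seen with a small sketch. It assumes GCC or a compatible compiler that provides `__builtin_object_size`; the struct `Packet` and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct Packet
{
  char header[4];
  char payload[8];
};

static Packet packet;

// Type 0: upper bound on the bytes remaining in the whole enclosing object.
std::size_t
whole_object_bytes ()
{
  return __builtin_object_size (packet.header, 0);
}

// Type 1: upper bound on the bytes remaining in the closest enclosing
// subobject, here the header member alone.
std::size_t
subobject_bytes ()
{
  return __builtin_object_size (packet.header, 1);
}
```

When the compiler can determine both sizes, the type-0 result covers the rest of `packet` while the type-1 result covers only `header`; when it cannot, the builtin returns `(size_t) -1` for these two types. Using type-0 for memcpy therefore accepts a copy that deliberately spans adjacent members, while type-1 for strcpy flags a write that runs past the named member.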

[PATCH] Fix ICE with -Wuninitialized (PR tree-optimization/78455)

2016-11-21 Thread Marek Polacek

What seems like a typo caused an ICE here.  We've got a vector of vectors here
and we're trying to walk all the elements, so the second loop oughta use 'j'.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-11-21  Marek Polacek  

PR tree-optimization/78455
* tree-ssa-uninit.c (can_chain_union_be_invalidated_p): Fix typo.

* gcc.dg/uninit-23.c: New.

diff --git gcc/testsuite/gcc.dg/uninit-23.c gcc/testsuite/gcc.dg/uninit-23.c
index e69de29..b38e1d0 100644
--- gcc/testsuite/gcc.dg/uninit-23.c
+++ gcc/testsuite/gcc.dg/uninit-23.c
@@ -0,0 +1,27 @@
+/* PR tree-optimization/78455 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wuninitialized" } */
+
+int ij;
+
+void
+ql (void)
+{
+  int m5 = 0;
+
+  for (;;)
+  {
+if (0)
+  for (;;)
+  {
+int *go;
+int *t4 = go;
+
+ l1:
+*t4 = (*t4 != 0) ? 0 : 2; /* { dg-warning "may be used uninitialized" } */
+  }
+
+if (ij != 0)
+  goto l1;
+  }
+}
diff --git gcc/tree-ssa-uninit.c gcc/tree-ssa-uninit.c
index 68dcf15..4557403 100644
--- gcc/tree-ssa-uninit.c
+++ gcc/tree-ssa-uninit.c
@@ -2192,7 +2192,7 @@ can_chain_union_be_invalidated_p (pred_chain_union use_preds,
   pred_chain c = use_preds[i];
   bool entire_pred_chain_invalidated = false;
   for (size_t j = 0; j < c.length (); ++j)
-   if (can_one_predicate_be_invalidated_p (c[i], worklist))
+   if (can_one_predicate_be_invalidated_p (c[j], worklist))
  {
entire_pred_chain_invalidated = true;
break;

Marek
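
The index mix-up fixed by this patch is easy to reproduce outside GCC. The sketch below uses `std::vector` stand-ins for the predicate chains; the function name, the callback and the early `return false` are assumptions made for illustration, since the surrounding GCC code is not shown in the hunk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using pred_chain = std::vector<int>;              // stand-in for pred_chain
using pred_chain_union = std::vector<pred_chain>; // a vector of vectors

// Returns true only if every chain contains at least one matching element.
bool
every_chain_has_match (const pred_chain_union &chains, bool (*matches) (int))
{
  for (std::size_t i = 0; i < chains.size (); ++i)
    {
      const pred_chain &c = chains[i];
      bool found = false;
      for (std::size_t j = 0; j < c.size (); ++j)
        if (matches (c[j]))   // c[i] would index by the outer counter
          {
            found = true;
            break;
          }
      if (!found)
        return false;
    }
  return true;
}
```

With the typo, a union such as `{{1, 2}, {4}}` reads `c[1]` from the one-element second chain, which is out of bounds; in GCC's checked vectors that kind of access is what turns into an ICE.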

PING 2 [PATCH] enable -Wformat-length for dynamically allocated buffers (pr 78245)

2016-11-21 Thread Martin Sebor


Ping.  Still looking for a review of the patch below:

On 11/16/2016 10:33 AM, Martin Sebor wrote:

I'm looking for a review of the patch below:

  https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00779.html

Thanks

On 11/08/2016 05:09 PM, Martin Sebor wrote:

The -Wformat-length checker relies on the compute_builtin_object_size
function to determine the size of the buffer it checks for overflow.
The function returns either a size computed by the tree-object-size
pass for objects referenced by the __builtin_object_size intrinsic
(if it's used in the program) or it tries to compute it for a small
subset of expressions otherwise.  This subset doesn't include objects
allocated by either malloc or alloca, and so for those the function
returns "unknown" or (size_t)-1 in the case of -Wformat-length.  As
a consequence, -Wformat-length is unable to detect overflows
involving such objects.

The attached patch adds a new function, compute_object_size, that
uses the existing algorithms to compute and return the sizes of
allocated objects as well, as if they were referenced by
__builtin_object_size in the program source, enabling the
-Wformat-length checker to detect more buffer overflows.

Martin

PS The function makes use of the init_function_sizes API that is
otherwise unused outside the tree-object-size pass to initialize
the internal structures, but then calls fini_object_sizes to
release them before returning.  That seems wasteful because
the size of the same object or one related to it might need
to be computed again in the context of the same function.  I
experimented with allocating and releasing the structures only
when current_function_decl changes but that led to crashes.
I suspect I'm missing something about the management of memory
allocated for these structures.  Does anyone have any suggestions
how to make this work?  (Do I perhaps need to allocate them using
a special allocator so they don't get garbage collected?)
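
For context, this is the kind of code the extended checker is aimed at: formatted output into a buffer obtained from malloc. The sketch below is hypothetical (`format_id` is not from the patch) and shows the bounded form, with the unbounded call that could overflow left as a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>

// Formats "id=<id>" into a heap buffer of CAP bytes.  Returns true and
// stores the text in OUT only if the whole string, including the
// terminating NUL, fits.
bool
format_id (int id, std::size_t cap, std::string &out)
{
  char *buf = static_cast<char *> (std::malloc (cap));
  if (buf == nullptr)
    return false;

  // An unbounded sprintf (buf, "id=%d", id) would overflow whenever CAP
  // is too small; that is the class of bug the checker targets.
  int needed = std::snprintf (buf, cap, "id=%d", id);
  bool fits = needed >= 0 && static_cast<std::size_t> (needed) < cap;
  if (fits)
    out = buf;

  std::free (buf);
  return fits;
}
```

The checker can only compare the formatted length against the buffer when it knows the allocation size, which is why teaching the size computation about malloc and alloca matters.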

PING [PATCH] have __builtin_object_size handle POINTER_PLUS with non-const offset (pr 77608)

2016-11-21 Thread Martin Sebor


Richard,

Attached is a lightly updated patch mostly with just clarifying
comments and a small bug fix.  I'd appreciate your input (please
see my reply and questions below).  I'm hoping to finalize this
patch based on your feedback so it can be committed soon.

  https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01127.html

Thanks
Martin

On 11/11/2016 08:56 AM, Martin Sebor wrote:

Thanks for the review and comments!



@@ -158,14 +170,149 @@ compute_object_offset (const_tree expr, const_tree var)
   return size_binop (code, base, off);
 }

+static bool
+operand_unsigned_p (tree op)
+{
+  if (TREE_CODE (op) == SSA_NAME)

new functions need a comment.  But maybe you want to use
tree_expr_nonnegative_p
to also allow signed but known positive ones?


Let me add a comment.

operand_unsigned_p returns true if the type of the original offset
before conversion to sizetype is unsigned.  tree_expr_nonnegative_p
returns true if the argument's type is unsigned, which is always
the case here.



+/* Fill the 2-element OFFRANGE array with the range of values OFF
+   is known to be in.  Postcodition: OFFRANGE[0] <= OFFRANGE[1].  */
+
+static bool
+get_offset_range (tree off, HOST_WIDE_INT offrange[2])
+{
+  STRIP_NOPS (off);

why strip nops (even sign changes!) here?


That might be a leftover from an earlier/failed attempt to simplify
things that I forgot to remove.  Let me do that in a followup patch.
Unless I misunderstand your comment there are no sign changes (AFAIK)
because the offset is always represented as sizetype.


Why below convert things
via to_uhwi when offrange is of type signed HOST_WIDE_INT[2]?


The offset is always represented as sizetype but the code treats
it as signed because in reality it can be negative.  That said,
I don't find dealing with ranges very intuitive so I could very
well be missing something and there may be a better way to code
this.  I welcome suggestions.



+ gimple *def = SSA_NAME_DEF_STMT (off);
+ if (is_gimple_assign (def))
+   {
+ tree_code code = gimple_assign_rhs_code (def);
+ if (code == PLUS_EXPR)
+   {
+ /* Handle offset in the form VAR + CST where VAR's type
+is unsigned so the offset must be the greater of
+OFFRANGE[0] and CST.  This assumes the PLUS_EXPR
+is in a canonical form with CST second.  */
+ tree rhs2 = gimple_assign_rhs2 (def);

err, what?  What about overflow?  Aren't you just trying to decompose
'off' into a variable and a constant part here and somehow extracting a
range for the variable part?  So why not just do that?


Sorry, what about overflow?

The purpose of this code is to handle cases of the form

   & PTR [range (MIN, MAX)] + CST

where CST is unsigned implying that the lower bound of the offset
is the greater of CST and MIN.  For instance, in the following it
determines that bos(p, 0) is 4 (and if the 3 were greater than 7
and overflowed the addition the result would be zero).  I'm not
sure I understand what you suggest I do differently to make this
work.

   char d[7];

   #define bos(p, t) __builtin_object_size (p, t)

   long f (unsigned i)
   {
 if (2 < i) i = 2;

 char *p = &d[i] + 3;

 return bos (p, 0);
   }
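
The arithmetic behind the "bos(p, 0) is 4" claim can be written out as a small helper. This is a sketch of the reasoning only, not the patch's implementation; `max_remaining` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Largest number of bytes that can remain in an object of OBJSIZE bytes
// when the offset into it is at least OFF_MIN + CST.  Saturates at zero
// once the minimum offset reaches or passes the end of the object, or
// when the addition wraps around.
std::size_t
max_remaining (std::size_t objsize, std::size_t off_min, std::size_t cst)
{
  std::size_t lower = off_min + cst;
  if (lower < off_min || lower >= objsize)
    return 0;
  return objsize - lower;
}
```

For the example above, `d` has 7 bytes, `i` lies in [0, 2] and the constant is 3, so the smallest possible offset is 3 and at most 7 - 3 = 4 bytes remain; with a constant greater than 7 the result would be 0.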


+  else if (range_type == VR_ANTI_RANGE)
+   {
+ offrange[0] = max.to_uhwi () + 1;
+ offrange[1] = min.to_uhwi () - 1;
+ return true;
+   }

first of all, how do you know it fits uhwi?  Second, from ~[5, 9] you get
[10, 4] !?  That looks bogus (and contrary to the function comment
postcondition)


I admit I have some trouble working with anti-ranges.  It's also
difficult to fully exercise them in this pass because it only runs
after EVRP but not after VRP1 (except with -g), so only limited
range information is available. (I'm hoping to eventually change
it but moving the passes broke a test in a way that seemed too
complex to fix for this project).

The code above is based on the observation that an anti-range is
often used to represent the full subrange of a narrower signed type
like signed char (as ~[128, -129]).  I haven't been able to create
an anti-range like ~[5, 9]. When/how would a range like that come
about (so I can test it and implement the above correctly)?
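
One conservative way to honour the OFFRANGE[0] <= OFFRANGE[1] postcondition for an anti-range is to return the smallest single interval that covers everything outside it. The sketch below illustrates that idea for an unsigned type; it is not what the patch does, and `offset_range` and `cover_anti_range` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct offset_range
{
  std::uint64_t lo;
  std::uint64_t hi;   // invariant: lo <= hi
};

// Smallest single interval covering every value of an unsigned type with
// maximum TYPE_MAX that lies outside the anti-range ~[MIN, MAX].
offset_range
cover_anti_range (std::uint64_t min, std::uint64_t max,
                  std::uint64_t type_max)
{
  assert (min <= max && max <= type_max);
  if (min == 0 && max < type_max)
    return { max + 1, type_max };   // only the upper piece exists
  if (max == type_max && min > 0)
    return { 0, min - 1 };          // only the lower piece exists
  return { 0, type_max };           // two pieces (or none): give up precision
}
```

Under this scheme ~[5, 9] on an 8-bit type becomes [0, 255] rather than the inverted [10, 4], at the cost of losing the information that 5 through 9 are excluded.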



+  else if (range_type == VR_VARYING)
+   {
+ gimple *def = SSA_NAME_DEF_STMT (off);
+ if (is_gimple_assign (def))
+   {
+ tree_code code = gimple_assign_rhs_code (def);
+ if (code == NOP_EXPR)
+   {

please trust range-info instead of doing your own little VRP here.
VR_VARYING -> return false.


I would prefer to rely on the range information and not have to work
around it like I do here but, unfortunately, it doesn't always appear
to be available.  For example, in the following test case:

   char d[130];

   #define bos(p, t) __builtin_object_size (p, t)

   void f (void*);

   void g (signed char i)

Re: [PATCH v2, rs6000] Add built-in support for vector compare

2016-11-21 Thread Segher Boessenkool

On Mon, Nov 21, 2016 at 02:42:53PM -0800, Carl E. Love wrote:
> The Power ABI document lists a number of built-ins that it is supposed
> to support.  There are still some missing.  This patch adds the built-in
> support for the following built-ins:

[ snip ]

So, which are still missing after this?  Or were these the last ones?

> Is this ok for trunk?

Yes, thanks!


Segher

Re: [RFA] [PR target/68538] Fix ICE on cris port

2016-11-21 Thread Jeff Law


On 11/21/2016 03:33 PM, Hans-Peter Nilsson wrote:

On Mon, 21 Nov 2016, Jeff Law wrote:



The CRIS port seems to have made a minor goof in a conditional guarding a call
to copy_to_mode_reg.

copy_to_mode_reg always allocates a new pseudo, so calling it when
!can_create_pseudo_p is going to result in an ICE.

The attached patch fixes the ICE.


Ok and thanks and sorry for not keeping up on this.
No worries.  I just happened to stumble across it while doing some BZ 
cleanup.


Jeff

Re: [PATCH] Delete GCJ

2016-11-21 Thread Matthias Klose

On 21.11.2016 18:16, Rainer Orth wrote:
> Hi Matthias,
> 
>> ahh, didn't see that :-/ Now fixed, is this clearer now?
>>
>> The options @option{--with-target-bdw-gc-include} and
>> @option{--with-target-bdw-gc-lib} must always specified together for
>^ be 

Thanks to all for sorting out the documentation issues. Now attaching the updated
diff. Ok to commit?

Matthias



2016-11-19  Matthias Klose  

* Makefile.def: Remove reference to boehm-gc target module.
* configure.ac: Include pkg.m4, check for --with-target-bdw-gc
options and for the bdw-gc pkg-config module.
* configure: Regenerate.
* Makefile.in: Regenerate.

gcc/

2016-11-19  Matthias Klose  

* doc/install.texi: Document configure options --enable-objc-gc
and --with-target-bdw-gc.

config/

2016-11-19  Matthias Klose  

* pkg.m4: New file.

libobjc/

2016-11-19  Matthias Klose  

* configure.ac (--enable-objc-gc): Allow to configure with a
system provided boehm-gc.
* configure: Regenerate.
* Makefile.in (OBJC_BOEHM_GC_LIBS): Get value from configure.
* gc.c: Include system bdw-gc headers.
* memory.c: Likewise
* objects.c: Likewise

boehm-gc/

2016-11-19  Matthias Klose  

Remove



Index: Makefile.def
===
--- Makefile.def	(revision 242681)
+++ Makefile.def	(working copy)
@@ -166,7 +166,6 @@
 target_modules = { module= libgloss; no_check=true; };
 target_modules = { module= libffi; no_install=true; };
 target_modules = { module= zlib; };
-target_modules = { module= boehm-gc; };
 target_modules = { module= rda; };
 target_modules = { module= libada; };
 target_modules = { module= libgomp; bootstrap= true; lib_path=.libs; };
@@ -543,7 +542,6 @@
 // a dependency on libgcc for native targets to configure.
 lang_env_dependencies = { module=libiberty; no_c=true; };
 
-dependencies = { module=configure-target-boehm-gc; on=all-target-libstdc++-v3; };
 dependencies = { module=configure-target-fastjar; on=configure-target-zlib; };
 dependencies = { module=all-target-fastjar; on=all-target-zlib; };
 dependencies = { module=configure-target-libgo; on=configure-target-libffi; };
@@ -551,8 +549,6 @@
 dependencies = { module=all-target-libgo; on=all-target-libbacktrace; };
 dependencies = { module=all-target-libgo; on=all-target-libffi; };
 dependencies = { module=all-target-libgo; on=all-target-libatomic; };
-dependencies = { module=configure-target-libobjc; on=configure-target-boehm-gc; };
-dependencies = { module=all-target-libobjc; on=all-target-boehm-gc; };
 dependencies = { module=configure-target-libstdc++-v3; on=configure-target-libgomp; };
 dependencies = { module=configure-target-liboffloadmic; on=configure-target-libgomp; };
 dependencies = { module=configure-target-libsanitizer; on=all-target-libstdc++-v3; };
Index: config/pkg.m4
===
--- config/pkg.m4	(nonexistent)
+++ config/pkg.m4	(working copy)
@@ -0,0 +1,825 @@
+dnl pkg.m4 - Macros to locate and utilise pkg-config.   -*- Autoconf -*-
+dnl serial 11 (pkg-config-0.29)
+dnl
+dnl Copyright © 2004 Scott James Remnant .
+dnl Copyright © 2012-2015 Dan Nicholson 
+dnl
+dnl This program is free software; you can redistribute it and/or modify
+dnl it under the terms of the GNU General Public License as published by
+dnl the Free Software Foundation; either version 2 of the License, or
+dnl (at your option) any later version.
+dnl
+dnl This program is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of
+dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+dnl General Public License for more details.
+dnl
+dnl You should have received a copy of the GNU General Public License
+dnl along with this program; if not, write to the Free Software
+dnl Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+dnl 02111-1307, USA.
+dnl
+dnl As a special exception to the GNU General Public License, if you
+dnl distribute this file as part of a program tha

[PATCH v2, rs6000] Add built-in support for vector compare

2016-11-21 Thread Carl E. Love

Segher:

I realized over the weekend that I forgot to update the built-in documentation
file, doc/extend.texi.  I have updated the patch with these additions and fixed
the issues you mentioned before.

The Power ABI document lists a number of built-ins that it is supposed
to support.  There are still some missing.  This patch adds the built-in
support for the following built-ins:

vector bool char       vec_cmpeq  vector bool char       vector bool char
vector bool int        vec_cmpeq  vector bool int        vector bool int
vector bool long long  vec_cmpeq  vector bool long long  vector bool long long
vector bool short      vec_cmpeq  vector bool short      vector bool short
vector bool char       vec_cmpne  vector bool char       vector bool char
vector bool int        vec_cmpne  vector bool int        vector bool int
vector bool long long  vec_cmpne  vector bool long long  vector bool long long
vector bool short      vec_cmpne  vector bool short      vector bool short

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions. 

Is this ok for trunk?

Carl Love
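
As a portable illustration of what the new overloads compute, the following scalar model mimics the four-lane "vector bool int" case. It does not use the AltiVec intrinsics; `bool_v4si`, `model_cmpeq` and `model_cmpne` are hypothetical stand-ins.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using bool_v4si = std::array<std::uint32_t, 4>;   // models "vector bool int"

// Each result lane is all ones if the input lanes are equal, else zero.
bool_v4si
model_cmpeq (const bool_v4si &a, const bool_v4si &b)
{
  bool_v4si r{};
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = (a[i] == b[i]) ? 0xffffffffu : 0u;
  return r;
}

// Lane-wise complement of model_cmpeq.
bool_v4si
model_cmpne (const bool_v4si &a, const bool_v4si &b)
{
  bool_v4si r = model_cmpeq (a, b);
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = ~r[i];
  return r;
}
```

Since the inputs are themselves masks, comparing two "vector bool" values for equality behaves like a lane-wise XNOR, and comparing them for inequality like a lane-wise XOR.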


gcc/ChangeLog:

2016-11-21  Carl Love  

* config/rs6000/rs6000-c.c: Add built-in support for vector compare
equal and vector compare not equal.  The vector compares take two
arguments of type vector bool char, vector bool short, vector bool int,
vector bool long long with the same return type.
* doc/extend.texi: Update built-in documentation file for the new
powerpc built-ins.

gcc/testsuite/ChangeLog:

2016-11-21  Carl Love  

* gcc.target/powerpc/builtins-3.c: New file to test the new
built-ins for vector compare equal and vector compare not equal.
---
 gcc/config/rs6000/rs6000-c.c  | 17 ++-
 gcc/doc/extend.texi   | 10 
 gcc/testsuite/gcc.target/powerpc/builtins-3.c | 68 +++
 3 files changed, 94 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-3.c

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 4bba293..4f332d7 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -1107,15 +1107,23 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 RS6000_BTI_bool_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQUB,
 RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQUB,
+RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQUH,
+RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, RS6000_BTI_bool_V8HI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQUH,
 RS6000_BTI_bool_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQUH,
 RS6000_BTI_bool_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQUW,
+RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQUW,
 RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, ALTIVEC_BUILTIN_VCMPEQUW,
 RS6000_BTI_bool_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, P8V_BUILTIN_VCMPEQUD,
+RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPEQ, P8V_BUILTIN_VCMPEQUD,
 RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPEQ, P8V_BUILTIN_VCMPEQUD,
 RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
@@ -4486,6 +4494,9 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 RS6000_BTI_bool_V16QI, RS6000_BTI_V16QI,
 RS6000_BTI_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNEB,
+RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI,
+RS6000_BTI_bool_V16QI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNEB,
 RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI,
 RS6000_BTI_unsigned_V16QI, 0 },
 
@@ -4508,7 +4519,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNEW,
 RS6000_BTI_bool_V4SI, RS6000_BTI_unsigned_V4SI,
 RS6000_BTI_unsigned_V4SI, 0 },
-
+  { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNEB,
+RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V4SI,
+RS6000_BTI_bool_V4SI, 0 },
+  { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNED,
+RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNEF,
 RS6000_BTI_bool_V4SI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
   { ALTIVEC_BUILTIN_VEC_CMPNE, P9V_BUILTIN_CMPNED,
diff --git a/gcc/doc/extend.texi

Re: [RFA] [PR target/68538] Fix ICE on cris port

2016-11-21 Thread Hans-Peter Nilsson

On Mon, 21 Nov 2016, Jeff Law wrote:

>
> The CRIS port seems to have made a minor goof in a conditional guarding a call
> to copy_to_mode_reg.
>
> copy_to_mode_reg always allocates a new pseudo, so calling it when
> !can_create_pseudo_p is going to result in an ICE.
>
> The attached patch fixes the ICE.

Ok and thanks and sorry for not keeping up on this.

>  But I don't know enough about the CRIS port
> to know if other adjustments are necessary.

While there's no substitute for running the test-suite to
know that, the patch reasonably shouldn't be able to
mess things up (that aren't already messed up).  Famous last
words, I know...

Also, SVN (specifically r130971) helped me remember and pointed
out that fallout (when no pseudos can be created) is supposed to be
handled by a pair of define_insn_and_splits.

brgds, H-P

Re: [PATCH], Tweak PowerPC movdi constraints

2016-11-21 Thread Michael Meissner

On Mon, Nov 21, 2016 at 12:51:38PM -0600, Segher Boessenkool wrote:
> 
> Okay, if you change the changelog to say what the patch actually does ;-)
> And please watch for fallout.

This is the ChangeLog entry I checked in.

2016-11-21  Michael Meissner  

* config/rs6000/rs6000.md (movdi_internal32): Change constraints
so that DImode can be allocated to FP/vector registers in more
cases, and we can avoid direct move operations.  If the register
needs reloading, prefer GPRs over FP/vector registers.  In the
case of FPR vs. Altivec registers, prefer FPR registers unless we
have the ISA 3.0 reg+offset scalar instructions.
(movdi_internal64): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH] rs6000: rl[wd]imi without shift/rotate (PR68803)

2016-11-21 Thread David Edelsohn

On Mon, Nov 21, 2016 at 3:53 PM, Segher Boessenkool
 wrote:
> We didn't have patterns yet for rl[wd]imi insns that do a rotate by 0.
> This fixes it.  Tested on powerpc64-linux {-m32,-m64}.  With a further
> patch (to generic code) now all my rl*imi tests work (I still need to
> make those tests usable in the GCC testsuite, they currently take
> hours to run).
>
> Is this okay for trunk?
>
>
> Segher
>
>
> 2016-11-21  Segher Boessenkool  
>
> PR target/68803
> * config/rs6000/rs6000.md (*rotlsi3_insert_5, *rotldi3_insert_6,
> *rotldi3_insert_7): New define_insns.

Okay.

Thanks, David

Re: [PATCH PR68030/PR69710][RFC]Introduce a simple local CSE interface and use it in vectorizer

2016-11-21 Thread Doug Gilmore

I haven't seen any followups to the discussion of Bin's patch for
PR68030 and PR69710.  The patch submission:
http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02000.html

Discussion:
http://gcc.gnu.org/ml/gcc-patches/2016-07/msg00761.html
http://gcc.gnu.org/ml/gcc-patches/2016-06/msg01551.html
http://gcc.gnu.org/ml/gcc-patches/2016-06/msg00372.html
http://gcc.gnu.org/ml/gcc-patches/2016-06/msg01550.html
http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02162.html
http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02155.html
http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02154.html

so I did some investigation to get a better understanding of the
issues involved.

On 07/13/2016 01:59 PM, Jeff Law wrote:
> On 05/25/2016 05:22 AM, Bin Cheng wrote:
>> Hi, As analyzed in PR68303 and PR69710, vectorizer generates
>> duplicated computations in loop's pre-header basic block when
>> creating base address for vector reference to the same memory object.
> Not a huge surprise.  Loop optimizations generally have a tendency
> to create and/or expose CSE opportunities.  Unrolling is a common
> culprit, there's certainly the possibility for header duplication,
> code motions and IV rewriting to also expose/create redundant code.
>
> ...
> 
>  But, 1) It
>> doesn't fix all the problem on x86_64.  Root cause is computation for
>> base address of the first reference is somehow moved outside of
>> loop's pre-header, local CSE can't help in this case.
> That's a bit odd -- have you investigated why this is outside the loop header?
> ...
I didn't look at this issue per se, but I did try running DOM between
autovectorization and IVOPTS.  Just running DOM had little effect; what
was crucial was adding the change Bin mentioned in his original
message:

Besides CSE issue, this patch also re-associates address
expressions in vect_create_addr_base_for_vector_ref, specifically,
it splits constant offset and adds it back near the expression
root in IR.  This is necessary because GCC only handles
re-association for commutative operators in CSE.

I attached a patch for these changes only.  These are the important
modifications that address some of the IVOPTS-related issues exposed
by PR68030.  I found that adding the CSE change (or calling DOM between
autovectorization and IVOPTS) is not needed and, from what I have
seen, actually makes the code worse.
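
To make the re-association idea concrete, here is a minimal sketch in C.
It is only an illustration, not the vectorizer code: the function names
and the element size of 4 are assumptions made for this example.  Both
forms compute the same address, but when the constant is added last, two
references that differ only in the constant share the subexpression
base + idx * 4.

```c
#include <assert.h>
#include <stdint.h>

/* Constant folded into the base first: two references with different
   constants look like they have two unrelated bases.  */
static uintptr_t addr_const_inner(uintptr_t base, uintptr_t idx, uintptr_t c)
{
    return (base + c) + idx * 4;
}

/* Constant split off and added back near the expression root: the part
   base + idx * 4 is now common to all references to the same object.  */
static uintptr_t addr_const_outer(uintptr_t base, uintptr_t idx, uintptr_t c)
{
    uintptr_t common = base + idx * 4;
    return common + c;
}
```

With the second form, two references differ only by the trailing
constant, which is the shape an addressing mode with an immediate offset
can absorb.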

When only the modifications to
vect_create_addr_base_for_vector_ref are applied, additional
simplifications are done when induction variables are found (function
find_induction_variables).  These simplifications are indicated by the
appearance of lines like:

Applying pattern match.pd:1056, generic-match.c:11865

in the IVOPTS dump file.  IVOPTS now transforms the code so that
constants appear in the computation of the effective addresses for
the memory ops.  However, the code generated by IVOPTS still uses a
separate base register for each memory reference.  Later, DOM3
transforms the code to use just one base register, which is the form
the code needs to be in for the preliminary phase of IVOPTS where
"IV uses" associated with memory ops are placed into groups.  At the
time of this grouping, checks are done to ensure that for each member
of a group the constant offsets don't overflow the immediate fields in
actual machine instructions (for more on this, see * below).

Currently it appears that an IV is generated for each memory
reference.  Instead of generating a new IV for each memory reference,
we could try to detect that the value of the new IV is just a constant
offset from an existing IV and generate a new temp reflecting that.
I haven't worked through what needs to be done to implement that in
general, but for the issue in PR69710 (saxpy example where the same IV
should be used for a load and a store) it is straightforward to
implement, since work has already been done during data dependence
analysis to detect this situation.  I attached a patch for PR69710 that
was bootstrapped and tested on x86_64 without errors.  It does appear
that it needs more testing, since I noticed that SPEC 2006 h264ref
produces different results with the patch applied, which I still need
to investigate.

Doug

* Note that when IV uses are grouped, only positive constant offset
constraints are considered.  That negative offsets can be used is
reflected in the costs of using a different IV than the IV associated
with a particular group.  Thus once the optimal IV set is found, a
different IV may be chosen, which causes negative constant offsets to
be used.

From 3ea70edb3bf68057c955d2b22204f17bb670f65a Mon Sep 17 00:00:00 2001
From: Doug Gilmore 
Date: Fri, 4 Nov 2016 18:49:58 -0700
Subject: [PATCH] [PR68030/PR69710] vect_create_addr_base_for_vector_ref
 changes only fix.

This patch includes the changes noted in Bin's first patch message:
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02000.html

Besides CSE issue, this patch also re-associates address expressions
in vect_create_addr_base_for_vector_ref, specifically, it splits

RE: [PATCH, testsuite] MIPS: Add isa>=2 option to interrupt_handler-bug-1.c.

2016-11-21 Thread Moore, Catherine



> -Original Message-
> From: Toma Tabacu [mailto:toma.tab...@imgtec.com]
> Sent: Monday, November 21, 2016 8:53 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Matthew Fortune ; Moore,
> Catherine 
> Subject: [PATCH, testsuite] MIPS: Add isa>=2 option to
> interrupt_handler-bug-1.c.
> 
> Hi,
> 
> Currently, the interrupt_handler-bug-1.c test will fail on pre-R2 targets
> because the "interrupt" function attribute requires at least an R2 target
> and the test does not enforce this requirement.
> 
> This patch fixes this by adding the isa_rev>=2 option to the test's
> dg-options.
> 
> Tested with mips-mti-elf.
> 
> Regards,
> Toma Tabacu
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-11-21  Toma Tabacu  
> 
>   * gcc.target/mips/interrupt_handler-bug-1.c (dg-options): Add
>   isa_rev>=2.
> 

Yes, this is okay to commit.

[PATCH] rs6000: rl[wd]imi without shift/rotate (PR68803)

2016-11-21 Thread Segher Boessenkool

We didn't have patterns yet for rl[wd]imi insns that do a rotate by 0.
This fixes it.  Tested on powerpc64-linux {-m32,-m64}.  With a further
patch (to generic code) now all my rl*imi tests work (I still need to
make those tests usable in the GCC testsuite, they currently take
hours to run).
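
As a rough model of what the new patterns match, the following C sketch
shows an insert under mask with no rotate.  It is an illustration only,
not the GCC implementation: the function names are hypothetical, and
64-bit unsigned arithmetic stands in for UINTVAL.  The check
m2 + m4 + 1 == 0 from the insn conditions holds exactly when the two
masks are bitwise complements of each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the pattern condition m2 + m4 + 1 == 0.  In modular unsigned
   arithmetic this is true exactly when m4 == ~m2.  */
static bool masks_are_complementary(uint64_t m2, uint64_t m4)
{
    return m2 + m4 + 1 == 0;
}

/* (a & m2) | (b & m4) with complementary masks: keep the bits of A
   selected by M2 and take every other bit from B, with no rotation.  */
static uint64_t insert_under_mask(uint64_t a, uint64_t m2,
                                  uint64_t b, uint64_t m4)
{
    assert(masks_are_complementary(m2, m4));
    return (a & m2) | (b & m4);
}
```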

Is this okay for trunk?


Segher


2016-11-21  Segher Boessenkool  

PR target/68803
* config/rs6000/rs6000.md (*rotlsi3_insert_5, *rotldi3_insert_6,
*rotldi3_insert_7): New define_insns.

---
 gcc/config/rs6000/rs6000.md | 44 
 1 file changed, 44 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index a779f5c..52f07e8 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3853,6 +3853,50 @@ (define_insn "*rotl3_insert_4"
 }
   [(set_attr "type" "insert")])
 
+(define_insn "*rotlsi3_insert_5"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
+   (ior:SI (and:SI (match_operand:SI 1 "gpc_reg_operand" "0,r")
+   (match_operand:SI 2 "const_int_operand" "n,n"))
+   (and:SI (match_operand:SI 3 "gpc_reg_operand" "r,0")
+   (match_operand:SI 4 "const_int_operand" "n,n"))))]
+  "rs6000_is_valid_mask (operands[2], NULL, NULL, SImode)
+   && UINTVAL (operands[2]) != 0 && UINTVAL (operands[4]) != 0
+   && UINTVAL (operands[2]) + UINTVAL (operands[4]) + 1 == 0"
+  "@
+   rlwimi %0,%3,0,%4
+   rlwimi %0,%1,0,%2"
+  [(set_attr "type" "insert")])
+
+(define_insn "*rotldi3_insert_6"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+   (ior:DI (and:DI (match_operand:DI 1 "gpc_reg_operand" "0")
+   (match_operand:DI 2 "const_int_operand" "n"))
+   (and:DI (match_operand:DI 3 "gpc_reg_operand" "r")
+   (match_operand:DI 4 "const_int_operand" "n"))))]
+  "exact_log2 (-UINTVAL (operands[2])) > 0
+   && UINTVAL (operands[2]) + UINTVAL (operands[4]) + 1 == 0"
+{
+  operands[5] = GEN_INT (64 - exact_log2 (-UINTVAL (operands[2])));
+  return "rldimi %0,%3,0,%5";
+}
+  [(set_attr "type" "insert")
+   (set_attr "size" "64")])
+
+(define_insn "*rotldi3_insert_7"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+   (ior:DI (and:DI (match_operand:DI 3 "gpc_reg_operand" "r")
+   (match_operand:DI 4 "const_int_operand" "n"))
+   (and:DI (match_operand:DI 1 "gpc_reg_operand" "0")
+   (match_operand:DI 2 "const_int_operand" "n"))))]
+  "exact_log2 (-UINTVAL (operands[2])) > 0
+   && UINTVAL (operands[2]) + UINTVAL (operands[4]) + 1 == 0"
+{
+  operands[5] = GEN_INT (64 - exact_log2 (-UINTVAL (operands[2])));
+  return "rldimi %0,%3,0,%5";
+}
+  [(set_attr "type" "insert")
+   (set_attr "size" "64")])
+
 
 ; This handles the important case of multiple-precision shifts.  There is
 ; no canonicalization rule for ASHIFT vs. LSHIFTRT, so two patterns.
-- 
1.9.3

Re: [PATCH, ARM] Enable ldrd/strd peephole rules unconditionally

2016-11-21 Thread Christophe Lyon

On 18 November 2016 at 16:50, Bernd Edlinger  wrote:
> On 11/18/16 12:58, Christophe Lyon wrote:
>> On 17 November 2016 at 10:23, Kyrill Tkachov
>>  wrote:
>>>
>>> On 09/11/16 12:58, Bernd Edlinger wrote:

>>>> Hi!
>>>>
>>>>
>>>> This patch enables the ldrd/strd peephole rules unconditionally.
>>>>
>>>> It is meant to fix cases, where the patch to reduce the sha512
>>>> stack usage splits ldrd/strd instructions into separate ldr/str insns,
>>>> but is technically independent from the other patch:
>>>>
>>>> See https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00523.html
>>>>
>>>> It was necessary to change check_effective_target_arm_prefer_ldrd_strd
>>>> to retain the true prefer_ldrd_strd tuning flag.
>>>>
>>>>
>>>> Bootstrapped and reg-tested on arm-linux-gnueabihf.
>>>> Is it OK for trunk?
>>>
>>>
>>> This is ok.
>>> Thanks,
>>> Kyrill
>>>
>>
>> Hi Bernd,
>>
>> Since you committed this patch (r242549), I'm seeing the new test
>> failing on some arm*-linux-gnueabihf configurations:
>>
>> FAIL:  gcc.target/arm/pr53447-5.c scan-assembler-times ldrd 10
>> FAIL:  gcc.target/arm/pr53447-5.c scan-assembler-times strd 9
>>
>> See 
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242549/report-build-info.html
>> for a map of failures.
>>
>> Am I missing something?
>
> Hi Christophe,
>
> as always many thanks for your testing...
>
> I have apparently only looked at the case -mfloat-abi=soft here, which
> is what my other patch is going to address.  But all targets with
> -mfpu=neon -mfloat-abi=hard can also use vldr.64 instead of ldrd
> and vstr.64 instead of strd, which should be accepted as well.
>
> So the attached patch should fix at least most of the fallout.
>

I've tested it, and indeed it fixes the failures I've reported.

Thanks

> Is it OK for trunk?
>
>
> Thanks
> Bernd.

[RFA] [PR target/68538] Fix ICE on cris port

2016-11-21 Thread Jeff Law



The CRIS port seems to have made a minor goof in a conditional guarding 
a call to copy_to_mode_reg.


copy_to_mode_reg always allocates a new pseudo, so calling it when 
!can_create_pseudo_p is going to result in an ICE.


The attached patch fixes the ICE.  But I don't know enough about the 
CRIS port to know if other adjustments are necessary.
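
The difference between the old and new guard can be sketched with plain
booleans.  This is a simplified model, not the cris.md code: the function
names are made up for illustration, and the MEM_P and const0_rtx parts of
the condition, which the patch leaves unchanged, are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Old guard: on non-V32 targets the copy is requested even when no new
   pseudo register can be created, which is the case that ICEs.  */
static bool old_wants_copy(bool target_v32, bool src_is_reg,
                           bool can_create_pseudo)
{
    return !target_v32 || (!src_is_reg && can_create_pseudo);
}

/* New guard: the copy is never requested unless a pseudo can be
   created; the V32/REG_P distinction only matters after that.  */
static bool new_wants_copy(bool target_v32, bool src_is_reg,
                           bool can_create_pseudo)
{
    return can_create_pseudo && (!target_v32 || !src_is_reg);
}
```

Note that for non-V32 targets the old guard ignores
can_create_pseudo_p entirely, so the two guards differ whenever pseudos
are unavailable there.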


Jeff
diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index 59a3862..13279b5 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -499,7 +499,8 @@
 {
   if (MEM_P (operands[0])
   && operands[1] != const0_rtx
-  && (!TARGET_V32 || (!REG_P (operands[1]) && can_create_pseudo_p ())))
+  && can_create_pseudo_p ()
+  && (!TARGET_V32 || !REG_P (operands[1])))
 operands[1] = copy_to_mode_reg (DImode, operands[1]);
 
   /* Some other ports (as of 2001-09-10 for example mcore and romp) also
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr68538.c b/gcc/testsuite/gcc.c-torture/compile/pr68538.c
new file mode 100644
index 000..2822cdb
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr68538.c
@@ -0,0 +1,52 @@
+struct percpu_counter {
+   signed long long count;
+};
+struct blkg_rwstat {
+   struct percpu_counter cpu_cnt[4];
+};
+struct cfq_group {
+   struct blkg_rwstat service_time;
+};
+struct cfq_queue {
+   struct cfq_group *group;
+};
+struct request {
+   struct cfq_queue *active_queue;
+   unsigned long long cmd_flags;
+   void *priv;
+};
+static void blkg_rwstat_add(struct blkg_rwstat *rwstat, int rw, unsigned long long val)
+{
+   struct percpu_counter *cnt;
+   if (rw & 1)
+   cnt = &rwstat->cpu_cnt[1];
+   else
+   cnt = &rwstat->cpu_cnt[0];
+   cnt->count += val;
+   if (rw & 2)
+   cnt = &rwstat->cpu_cnt[2];
+   else
+   cnt = &rwstat->cpu_cnt[3];
+   cnt->count += val;
+}
+extern unsigned long long rq_start_time_ns(void);
+extern unsigned long long rq_io_start_time_ns(void);
+extern int rq_is_sync(void);
+extern void cfq_arm_slice_timer(void);
+void cfq_completed_request(struct request *rq)
+{
+   struct cfq_queue *queue = rq->priv;
+   int sync = rq_is_sync();
+   struct cfq_group *group = queue->group;
+   long long start_time = rq_start_time_ns();
+   long long io_start_time = rq_io_start_time_ns();
+   int rw = rq->cmd_flags;
+
+   if (io_start_time < 1)
+   blkg_rwstat_add(&group->service_time, rw, 1 - io_start_time);
+   blkg_rwstat_add(0, rw, io_start_time - start_time);
+
+   if (rq->active_queue == queue && sync)
+   cfq_arm_slice_timer();
+}
+

[PATCH] Fix ICE with masked stores (PR tree-optimization/78445)

2016-11-21 Thread Jakub Jelinek

On Wed, Nov 16, 2016 at 09:14:57PM -0600, Bill Schmidt wrote:
> 2016-11-16  Bill Schmidt  
> Richard Biener  
> 
>   PR tree-optimization/77848
>   * tree-if-conv.c (tree_if_conversion): Always version loops unless
>   the user specified -ftree-loop-if-convert.

This broke the attached testcase.

> --- gcc/tree-if-conv.c(revision 242521)
> +++ gcc/tree-if-conv.c(working copy)
> @@ -2803,10 +2803,12 @@ tree_if_conversion (struct loop *loop)
> || loop->dont_vectorize))
>  goto cleanup;
>  
> -  /* Either version this loop, or if the pattern is right for outer-loop
> - vectorization, version the outer loop.  In the latter case we will
> - still if-convert the original inner loop.  */
> -  if ((any_pred_load_store || any_complicated_phi)
> +  /* Since we have no cost model, always version loops unless the user
> + specified -ftree-loop-if-convert.  Either version this loop, or if
> + the pattern is right for outer-loop vectorization, version the
> + outer loop.  In the latter case we will still if-convert the
> + original inner loop.  */
> +  if (flag_tree_loop_if_convert != 1
>&& !version_loop_for_if_conversion
>(versionable_outer_loop_p (loop_outer (loop))
> ? loop_outer (loop) : loop))

If there are masked loads/stores (and I assume also the complicated phi
stuff, but haven't verified), then it isn't just some kind of optimization
to version the loop based on LOOP_VECTORIZED ifn, it is a requirement
- MASK_LOAD/MASK_STORE aren't supported for scalar code, so they can only
appear in the vectorized version.  Fixed by reverting that - if
-ftree-loop-if-convert we'll do what we used to do before, without it
we do what you've added, i.e. version always.

The rest is just a formatting fix: the argument is too large, which
forces the call's ( onto the next line and even leaves it misindented;
IMHO it is much cleaner if a temporary is used.
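
As a small sketch of the decision being changed (a model only, with
hypothetical function names; the real code lives in tree_if_conversion),
the versioning condition before and after this patch can be written as:

```c
#include <assert.h>
#include <stdbool.h>

/* Before this patch: versioning was skipped whenever the user passed
   -ftree-loop-if-convert (flag == 1), even if masked loads/stores were
   going to be emitted.  */
static bool old_versions_loop(bool any_pred_load_store,
                              bool any_complicated_phi,
                              int flag_tree_loop_if_convert)
{
    (void) any_pred_load_store;
    (void) any_complicated_phi;
    return flag_tree_loop_if_convert != 1;
}

/* After this patch: versioning is mandatory when masked loads/stores or
   complicated PHIs are present, and otherwise still the default unless
   the user passed -ftree-loop-if-convert.  */
static bool new_versions_loop(bool any_pred_load_store,
                              bool any_complicated_phi,
                              int flag_tree_loop_if_convert)
{
    return any_pred_load_store
           || any_complicated_phi
           || flag_tree_loop_if_convert != 1;
}
```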

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-21  Jakub Jelinek  

PR tree-optimization/78445
* tree-if-conv.c (tree_if_conversion): If any_pred_load_store or
any_complicated_phi, version loop even if flag_tree_loop_if_convert is
1.  Formatting fix.

* gcc.dg/pr78445.c: New test.

--- gcc/tree-if-conv.c.jj   2016-11-17 18:08:12.0 +0100
+++ gcc/tree-if-conv.c  2016-11-21 17:28:30.807242395 +0100
@@ -2804,15 +2804,20 @@ tree_if_conversion (struct loop *loop)
 goto cleanup;
 
   /* Since we have no cost model, always version loops unless the user
- specified -ftree-loop-if-convert.  Either version this loop, or if
- the pattern is right for outer-loop vectorization, version the
- outer loop.  In the latter case we will still if-convert the
- original inner loop.  */
-  if (flag_tree_loop_if_convert != 1
-  && !version_loop_for_if_conversion
-  (versionable_outer_loop_p (loop_outer (loop))
-   ? loop_outer (loop) : loop))
-goto cleanup;
+ specified -ftree-loop-if-convert or unless versioning is required.
+ Either version this loop, or if the pattern is right for outer-loop
+ vectorization, version the outer loop.  In the latter case we will
+ still if-convert the original inner loop.  */
+  if (any_pred_load_store
+  || any_complicated_phi
+  || flag_tree_loop_if_convert != 1)
+{
+  struct loop *vloop
+   = (versionable_outer_loop_p (loop_outer (loop))
+  ? loop_outer (loop) : loop);
+  if (!version_loop_for_if_conversion (vloop))
+   goto cleanup;
+}
 
   /* Now all statements are if-convertible.  Combine all the basic
  blocks into one huge basic block doing the if-conversion
--- gcc/testsuite/gcc.dg/pr78445.c.jj   2016-11-21 17:30:58.534400256 +0100
+++ gcc/testsuite/gcc.dg/pr78445.c  2016-11-21 17:30:41.0 +0100
@@ -0,0 +1,19 @@
+/* PR tree-optimization/78445 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-if-convert -ftree-vectorize" } */
+/* { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } } */
+
+int a;
+
+void
+foo (int x, int *y)
+{
+  while (a != 0)
+if (x != 0)
+  {
+   *y = a;
+   x = *y;
+  }
+else
+  x = a;
+}


Jakub

[committed] Fix simd clone creation in functions with unnamed arguments (PR middle-end/67335)

2016-11-21 Thread Jakub Jelinek

Hi!

If some argument is nameless, then it can only be of the vector kind,
because it is impossible to use clauses on such an argument.  It doesn't
make much sense to use that (it is inefficient), but we shouldn't ICE on it.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
so far.

2016-11-21  Jakub Jelinek  

PR middle-end/67335
* omp-simd-clone.c (simd_clone_adjust_argument_types): Use NULL prefix
for tmp simd array if DECL_NAME (parm) is NULL.

* g++.dg/vect/simd-clone-7.cc: New test.

--- gcc/omp-simd-clone.c.jj 2016-11-18 20:04:31.0 +0100
+++ gcc/omp-simd-clone.c2016-11-21 11:36:40.897643271 +0100
@@ -630,8 +630,9 @@ simd_clone_adjust_argument_types (struct
 
  if (node->definition)
sc->args[i].simd_array
- = create_tmp_simd_array (IDENTIFIER_POINTER (DECL_NAME (parm)),
-  parm_type, sc->simdlen);
+ = create_tmp_simd_array (DECL_NAME (parm)
+  ? IDENTIFIER_POINTER (DECL_NAME (parm))
+  : NULL, parm_type, sc->simdlen);
}
   adjustments.safe_push (adj);
 }
--- gcc/testsuite/g++.dg/vect/simd-clone-7.cc.jj   2016-11-21 11:49:12.810219423 +0100
+++ gcc/testsuite/g++.dg/vect/simd-clone-7.cc   2016-11-21 11:49:25.980054363 +0100
@@ -0,0 +1,10 @@
+// PR middle-end/67335
+// { dg-do compile }
+// { dg-additional-options "-fopenmp-simd" }
+
+#pragma omp declare simd notinbranch uniform(y)
+float
+bar (float x, float *y, int)
+{
+  return y[0] + y[1] * x;
+}

Jakub

[PATCH] Fix gimple store merging (PR tree-optimization/78436)

2016-11-21 Thread Jakub Jelinek

Hi!

The
   if (!BYTES_BIG_ENDIAN)
-shift_bytes_in_array (tmpbuf, byte_size, shift_amnt);
+{
+  shift_bytes_in_array (tmpbuf, byte_size, shift_amnt);
+  if (shift_amnt == 0)
+   byte_size--;
+}
hunk below is the actual fix for the PR, where we originally store:
8-bit 0 at offset 24-bits followed by 24-bit negative value at offset 0,
little endian.  encode_tree_to_bitpos actually allocates 1 extra byte in the
buffer and byte_size is also 1 byte longer, for the case where the
bits need to be shifted (it only cares about shifts within bytes, so 0 to
BITS_PER_UNIT - 1).  If no shifting is done and there is no padding, we are
also fine, because native_encode_expr will only actually write the size of
TYPE_MODE bytes.  But in this case padding is 1 byte, so native_encode_expr
writes 4 bytes (the last one is 0xff), byte_size is initially 5, as padding
is 1, it is decremented to 4.  But we actually want to store just 3 bytes,
not 4; when we store 4, we overwrite the earlier value of the following
byte.
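
The overwrite described above can be reproduced with a deliberately
simplified model in C.  This is not the store-merging code (which works
on bit regions rather than with memcpy); the helper names and the 0xaa
fill pattern are assumptions for illustration.  A 24-bit negative value
encoded in a 4-byte mode carries 0xff in its top byte, so copying 4 bytes
instead of 3 clobbers the byte stored earlier at bit offset 24.

```c
#include <assert.h>
#include <string.h>

/* Copy the first NBYTES of a little-endian encoded value into DST.  */
static void store_bytes(unsigned char *dst, const unsigned char *encoded,
                        size_t nbytes)
{
    memcpy(dst, encoded, nbytes);
}

/* Return byte 3 of a 4-byte region after storing an 8-bit 0 at bit
   offset 24 and then a 24-bit negative value at offset 0, writing
   NBYTES bytes for the second store.  */
static unsigned char byte3_after_merge(size_t nbytes)
{
    unsigned char dst[4] = { 0xaa, 0xaa, 0xaa, 0xaa };
    /* 24-bit -2 encoded in a 4-byte mode: sign extension fills the
       fourth byte with 0xff.  */
    const unsigned char encoded[4] = { 0xfe, 0xff, 0xff, 0xff };

    assert(nbytes <= sizeof dst);
    dst[3] = 0x00;                     /* the earlier 8-bit store */
    store_bytes(dst, encoded, nbytes); /* the 24-bit store */
    return dst[3];
}
```

Calling byte3_after_merge (4) models the behaviour before the fix, and
byte3_after_merge (3) the behaviour after byte_size is decremented.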

The rest of the patch is just cleanups.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-21  Jakub Jelinek  

PR tree-optimization/78436
* gimple-ssa-store-merging.c (zero_char_buf): Removed.
(shift_bytes_in_array, shift_bytes_in_array_right,
merged_store_group::apply_stores): Formatting fixes.
(clear_bit_region): Likewise.  Use memset.
(encode_tree_to_bitpos): Formatting fixes.  Fix comment typos - EPXR
instead of EXPR and inerted instead of inserted.  Use memset instead
of zero_char_buf.  For !BYTES_BIG_ENDIAN decrease byte_size by 1
if shift_amnt is 0.

* gcc.c-torture/execute/pr78436.c: New test.

--- gcc/gimple-ssa-store-merging.c.jj   2016-11-09 15:22:36.0 +0100
+++ gcc/gimple-ssa-store-merging.c  2016-11-21 10:54:51.746090238 +0100
@@ -199,17 +199,6 @@ dump_char_array (FILE *fd, unsigned char
   fprintf (fd, "\n");
 }
 
-/* Fill a byte array PTR of SZ elements with zeroes.  This is to be used by
-   encode_tree_to_bitpos to zero-initialize most likely small arrays but
-   with a non-compile-time-constant size.  */
-
-static inline void
-zero_char_buf (unsigned char *ptr, unsigned int sz)
-{
-  for (unsigned int i = 0; i < sz; i++)
-ptr[i] = 0;
-}
-
 /* Shift left the bytes in PTR of SZ elements by AMNT bits, carrying over the
bits between adjacent elements.  AMNT should be within
[0, BITS_PER_UNIT).
@@ -224,14 +213,13 @@ shift_bytes_in_array (unsigned char *ptr
 return;
 
   unsigned char carry_over = 0U;
-  unsigned char carry_mask = (~0U) << ((unsigned char)(BITS_PER_UNIT - amnt));
+  unsigned char carry_mask = (~0U) << (unsigned char) (BITS_PER_UNIT - amnt);
   unsigned char clear_mask = (~0U) << amnt;
 
   for (unsigned int i = 0; i < sz; i++)
 {
   unsigned prev_carry_over = carry_over;
-  carry_over
-   = (ptr[i] & carry_mask) >> (BITS_PER_UNIT - amnt);
+  carry_over = (ptr[i] & carry_mask) >> (BITS_PER_UNIT - amnt);
 
   ptr[i] <<= amnt;
   if (i != 0)
@@ -263,10 +251,9 @@ shift_bytes_in_array_right (unsigned cha
   for (unsigned int i = 0; i < sz; i++)
 {
   unsigned prev_carry_over = carry_over;
-  carry_over
-   = (ptr[i] & carry_mask);
+  carry_over = ptr[i] & carry_mask;
 
- carry_over <<= ((unsigned char)BITS_PER_UNIT - amnt);
+ carry_over <<= (unsigned char) BITS_PER_UNIT - amnt;
  ptr[i] >>= amnt;
  ptr[i] |= prev_carry_over;
 }
@@ -327,7 +314,7 @@ clear_bit_region (unsigned char *ptr, un
   /* Second base case.  */
   else if ((start + len) <= BITS_PER_UNIT)
 {
-  unsigned char mask = (~0U) << ((unsigned char)(BITS_PER_UNIT - len));
+  unsigned char mask = (~0U) << (unsigned char) (BITS_PER_UNIT - len);
   mask >>= BITS_PER_UNIT - (start + len);
 
   ptr[0] &= ~mask;
@@ -346,8 +333,7 @@ clear_bit_region (unsigned char *ptr, un
   unsigned int nbytes = len / BITS_PER_UNIT;
   /* We could recurse on each byte but do the loop here to avoid
 recursing too deep.  */
-  for (unsigned int i = 0; i < nbytes; i++)
-   ptr[i] = 0U;
+  memset (ptr, '\0', nbytes);
   /* Clear the remaining sub-byte region if there is one.  */
   if (len % BITS_PER_UNIT != 0)
clear_bit_region (ptr + nbytes, 0, len % BITS_PER_UNIT);
@@ -362,7 +348,7 @@ clear_bit_region (unsigned char *ptr, un
 
 static bool
 encode_tree_to_bitpos (tree expr, unsigned char *ptr, int bitlen, int bitpos,
-   unsigned int total_bytes)
+  unsigned int total_bytes)
 {
   unsigned int first_byte = bitpos / BITS_PER_UNIT;
   tree tmp_int = expr;
@@ -370,8 +356,8 @@ encode_tree_to_bitpos (tree expr, unsign
|| mode_for_size (bitlen, MODE_INT, 0) == BLKmode;
 
   if (!sub_byte_op_p)
-return native_encode_expr (tmp_int, ptr + first_byte, total_bytes, 0)
-  != 0;
+return (native_encode

[PATCH] Fix divmod expansion (PR middle-end/78416, take 2)

2016-11-21 Thread Jakub Jelinek

Hi!

On Fri, Nov 18, 2016 at 11:10:58PM +0100, Richard Biener wrote:
> I wonder if transforming the const-int to wide int makes this all easier to 
> read?

Here is an updated patch that does that.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-21  Jakub Jelinek  

PR middle-end/78416
* expmed.c (expand_divmod): Use wide_int for computation of
op1_is_pow2.  Don't set it if op1 is 0.  Formatting fixes.
Use size <= HOST_BITS_PER_WIDE_INT instead of
HOST_BITS_PER_WIDE_INT >= size.

* gcc.dg/torture/pr78416.c: New test.
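
To illustrate why the ChangeLog says "Don't set it if op1 is 0", here is
a hedged sketch on plain 64-bit integers rather than wide_int.  The
function names are hypothetical, and pow2_or_zero only mimics the kind of
check the macro name EXACT_POWER_OF_2_OR_ZERO_P suggests.  A
population-count test rejects zero, while the x & (x - 1) test accepts it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True for exact powers of two and also for zero.  */
static bool pow2_or_zero(uint64_t x)
{
    return (x & (x - 1)) == 0;
}

/* Portable population count: clear the lowest set bit until none remain.  */
static int popcount64(uint64_t x)
{
    int n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* True only for exact powers of two; zero has no bits set.  */
static bool is_pow2(uint64_t x)
{
    return popcount64(x) == 1;
}
```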

--- gcc/expmed.c.jj 2016-11-19 18:02:45.431380371 +0100
+++ gcc/expmed.c2016-11-21 16:13:32.980271174 +0100
@@ -3994,11 +3994,10 @@ expand_divmod (int rem_flag, enum tree_c
   op1_is_constant = CONST_INT_P (op1);
   if (op1_is_constant)
 {
-  unsigned HOST_WIDE_INT ext_op1 = UINTVAL (op1);
-  if (unsignedp)
-   ext_op1 &= GET_MODE_MASK (mode);
-  op1_is_pow2 = ((EXACT_POWER_OF_2_OR_ZERO_P (ext_op1)
-|| (! unsignedp && EXACT_POWER_OF_2_OR_ZERO_P (-ext_op1))));
+  wide_int ext_op1 = rtx_mode_t (op1, mode);
+  op1_is_pow2 = (wi::popcount (ext_op1) == 1
+|| (! unsignedp
+&& wi::popcount (wi::neg (ext_op1)) == 1));
 }
 
   /*
@@ -4079,11 +4078,10 @@ expand_divmod (int rem_flag, enum tree_c
  not straightforward to generalize this.  Maybe we should make an array
  of possible modes in init_expmed?  Save this for GCC 2.7.  */
 
-  optab1 = ((op1_is_pow2 && op1 != const0_rtx)
+  optab1 = (op1_is_pow2
? (unsignedp ? lshr_optab : ashr_optab)
: (unsignedp ? udiv_optab : sdiv_optab));
-  optab2 = ((op1_is_pow2 && op1 != const0_rtx)
-   ? optab1
+  optab2 = (op1_is_pow2 ? optab1
: (unsignedp ? udivmod_optab : sdivmod_optab));
 
   for (compute_mode = mode; compute_mode != VOIDmode;
@@ -4139,10 +4137,15 @@ expand_divmod (int rem_flag, enum tree_c
   /* convert_modes may have placed op1 into a register, so we
 must recompute the following.  */
   op1_is_constant = CONST_INT_P (op1);
-  op1_is_pow2 = (op1_is_constant
-&& ((EXACT_POWER_OF_2_OR_ZERO_P (INTVAL (op1))
- || (! unsignedp
- && EXACT_POWER_OF_2_OR_ZERO_P (-UINTVAL (op1))))));
+  if (op1_is_constant)
+   {
+ wide_int ext_op1 = rtx_mode_t (op1, compute_mode);
+ op1_is_pow2 = (wi::popcount (ext_op1) == 1
+|| (! unsignedp
+&& wi::popcount (wi::neg (ext_op1)) == 1));
+   }
+  else
+   op1_is_pow2 = 0;
 }
 
   /* If one of the operands is a volatile MEM, copy it into a register.  */
@@ -4182,10 +4185,10 @@ expand_divmod (int rem_flag, enum tree_c
unsigned HOST_WIDE_INT mh, ml;
int pre_shift, post_shift;
int dummy;
-   unsigned HOST_WIDE_INT d = (INTVAL (op1)
-   & GET_MODE_MASK (compute_mode));
+   wide_int wd = rtx_mode_t (op1, compute_mode);
+   unsigned HOST_WIDE_INT d = wd.to_uhwi ();
 
-   if (EXACT_POWER_OF_2_OR_ZERO_P (d))
+   if (wi::popcount (wd) == 1)
  {
pre_shift = floor_log2 (d);
if (rem_flag)
@@ -4325,7 +4328,7 @@ expand_divmod (int rem_flag, enum tree_c
else if (d == -1)
  quotient = expand_unop (compute_mode, neg_optab, op0,
  tquotient, 0);
-   else if (HOST_BITS_PER_WIDE_INT >= size
+   else if (size <= HOST_BITS_PER_WIDE_INT
 && abs_d == HOST_WIDE_INT_1U << (size - 1))
  {
/* This case is not handled correctly below.  */
@@ -4335,6 +4338,7 @@ expand_divmod (int rem_flag, enum tree_c
  goto fail1;
  }
else if (EXACT_POWER_OF_2_OR_ZERO_P (d)
+&& (size <= HOST_BITS_PER_WIDE_INT || d >= 0)
 && (rem_flag
 ? smod_pow2_cheap (speed, compute_mode)
 : sdiv_pow2_cheap (speed, compute_mode))
@@ -4348,7 +4352,9 @@ expand_divmod (int rem_flag, enum tree_c
compute_mode)
 != CODE_FOR_nothing)))
  ;
-   else if (EXACT_POWER_OF_2_OR_ZERO_P (abs_d))
+   else if (EXACT_POWER_OF_2_OR_ZERO_P (abs_d)
+&& (size <= HOST_BITS_PER_WIDE_INT
+|| abs_d != (unsigned HOST_WIDE_INT) d))
  {
if (rem_flag)
  {
@@ -4483,7 +4489,7 @@ expand_divmod (int rem_flag, enum tree_c
   case FLOOR_DIV_EXPR:
   case FLOOR_MOD_EXPR:

Re: [PATCHv2, C++] Warn on redefinition of builtin functions (PR c++/71973)

2016-11-21 Thread Jakub Jelinek

On Sat, Nov 19, 2016 at 11:11:18AM +, Bernd Edlinger wrote:
> 2016-11-19  Bernd Edlinger  
> 
>   PR c++/71973
>   * doc/invoke.texi (-Wno-builtin-declaration-mismatch): Document the
>   new default-enabled warning..
>   * builtin-types.def (BT_CONST_TM_PTR): New primitive type.
>   (BT_PTR_CONST_STRING): Updated.
>   (BT_FN_SIZE_STRING_SIZE_CONST_STRING_CONST_PTR): Removed.
>   (BT_FN_SIZE_STRING_SIZE_CONST_STRING_CONST_TM_PTR): New function type.
>   * builtins.def (DEF_TM_BUILTIN): Disable BOTH_P for TM builtins.
>   (strftime): Update builtin function.
>   * tree-core.h (TI_CONST_TM_PTR_TYPE): New enum value.
>   * tree.h (const_tm_ptr_type_node): New type node.
>   * tree.c (free_lang_data, build_common_tree_nodes): Initialize
>   const_tm_ptr_type_node.
...

This broke 2 tests on i686-linux, I've committed this as obvious to fix it:

2016-11-21  Jakub Jelinek  

PR c++/71973
* g++.dg/torture/pr53321.C (size_t): Use __SIZE_TYPE__ instead of
long unsigned int.
* g++.dg/torture/pr63512.C (::strlen): Use __SIZE_TYPE__ instead of
unsigned long.
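
For readers unfamiliar with the idiom, a minimal sketch follows.  It
assumes a GCC-compatible compiler, since __SIZE_TYPE__ is a
compiler-predefined macro rather than standard C, and the names
my_size_t and my_strlen are made up for the example.  The macro expands
to whatever type the target uses for size_t (unsigned int on i686-linux,
unsigned long on x86_64-linux), so a declaration written this way matches
the builtin's prototype on both.

```c
#include <assert.h>

/* Target-independent spelling of size_t without including any header.  */
typedef __SIZE_TYPE__ my_size_t;

/* A strlen-like function whose return type follows the target's size_t.  */
static my_size_t my_strlen(const char *s)
{
    my_size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```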

--- gcc/testsuite/g++.dg/torture/pr53321.C.jj   2012-07-16 14:38:22.514585151 +0200
+++ gcc/testsuite/g++.dg/torture/pr53321.C  2016-11-21 19:52:00.561899801 +0100
@@ -2,7 +2,7 @@
 // { dg-require-profiling "-fprofile-generate" }
 // { dg-options "-fprofile-generate" }
 
-typedef long unsigned int size_t;
+typedef __SIZE_TYPE__ size_t;
 
 extern "C"
 {
--- gcc/testsuite/g++.dg/torture/pr63512.C.jj   2014-10-15 12:28:16.417303928 +0200
+++ gcc/testsuite/g++.dg/torture/pr63512.C  2016-11-21 19:52:45.006330942 +0100
@@ -2,7 +2,7 @@
 
 extern "C" {
 void __assert_fail ();
-unsigned long strlen (const char *);
+__SIZE_TYPE__ strlen (const char *);
 }
 class A
 {


Jakub

Re: [PATCH], Tweak PowerPC movdi constraints

2016-11-21 Thread Segher Boessenkool

On Mon, Nov 21, 2016 at 01:27:59PM -0500, Michael Meissner wrote:
> On Fri, Nov 18, 2016 at 05:07:21PM -0600, Segher Boessenkool wrote:
> > On Fri, Nov 18, 2016 at 05:52:12PM -0500, Michael Meissner wrote:
> > > On Fri, Nov 18, 2016 at 04:43:40PM -0600, Segher Boessenkool wrote:
> > > > Could you also test with reload please?  Just LE is enough I guess.
> > > > We'd like to keep reload working for GCC 7 at least, and these cost
> > > > prefixes tend to break mov patterns :-/
> > > 
> > > Argh, I guess you are right, but then if reload doesn't work, I will likely
> > > submit the patch where there are three different movdi's (one for 32-bit
> > > without the change, one for 64-bit with reload, and one for 64-bit with lra).
> > > I would prefer not to do that.
> > 
> > Let's hope it just works :-)
> 
> I did test it over the weekend.
> 
> 29 of the 30 spec 2006 benchmarks currently build with reload (gamess fails).
> The same 29 build and run with the new patch.  Like the patch under LRA, there
> are no regressions in performance, and one FP benchmark is faster.
> 
> Under LRA, sphinx3 is 2.5% faster (compared to LRA without the patch).
> 
> Under reload, sphinx3 is roughly the same performance, but calculix is 3.8%
> faster.

Great, thanks for testing.

> Can I apply the patch?

Okay, if you change the changelog to say what the patch actually does ;-)
And please watch for fallout.


Segher

Re: [PATCH] Enable Intel AVX512_4FMAPS and AVX512_4VNNIW instructions

2016-11-21 Thread Jakub Jelinek

On Mon, Nov 21, 2016 at 08:40:37PM +0300, Andrew Senkevich wrote:
> > FWIW, I came across the same error in my own testing and raised
> > bug 78451.
> 
> Can we fix it with the following patch? Regtesting in progress.
> 
> PR target/78451
> * gcc/config/i386/avx5124fmapsintrin.h: Avoid call to
> _mm512_setzero_ps.
> * gcc/config/i386/avx5124vnniwintrin.h: Ditto.

That is just a workaround, we want to fix the real bug.  I'll have a look
tomorrow.

Jakub

Re: [PATCH, ARM] Enable ldrd/strd peephole rules unconditionally

2016-11-21 Thread Bernd Edlinger

On 11/21/16 18:50, Bin.Cheng wrote:
> Hi Bernd,
> Any update on the other patch you mentioned?  This one breaks
> bootstrap of arm-linux-gnueabihf with certain options like
> "--with-arch=armv7-a --with-fpu=neon --with-float=hard".
> I created PR78453 for tracking.
>
> Thanks,
> bin

Oh, sorry.

This should not depend on the other patch at all.
Unfortunately I used --with-fpu=vfpv3-d16 where
the bootstrap seems to work.

I will try your configuration immediately.

Bernd.

Re: [PATCH], Tweak PowerPC movdi constraints

2016-11-21 Thread Michael Meissner

On Fri, Nov 18, 2016 at 05:07:21PM -0600, Segher Boessenkool wrote:
> On Fri, Nov 18, 2016 at 05:52:12PM -0500, Michael Meissner wrote:
> > On Fri, Nov 18, 2016 at 04:43:40PM -0600, Segher Boessenkool wrote:
> > > Could you also test with reload please?  Just LE is enough I guess.
> > > We'd like to keep reload working for GCC 7 at least, and these cost
> > > prefixes tend to break mov patterns :-/
> > 
> > Argh, I guess you are right, but then if reload doesn't work, I will likely
> > submit the patch where there are three different movdi's (one for 32-bit
> > without the change, one for 64-bit with reload, and one for 64-bit with lra).
> > I would prefer not to do that.
> 
> Let's hope it just works :-)

I did test it over the weekend.

29 of the 30 spec 2006 benchmarks currently build with reload (gamess fails).
The same 29 build and run with the new patch.  Like the patch under LRA, there
are no regressions in performance, and one FP benchmark is faster.

Under LRA, sphinx3 is 2.5% faster (compared to LRA without the patch).

Under reload, sphinx3 is roughly the same performance, but calculix is 3.8%
faster.

Can I apply the patch?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[PR target/25128] Improve comparisons against 65536 on m68k

2016-11-21 Thread Jeff Law


One more gift to the retro-computing folks.

GCC for the m68k generates poor code for some relational comparisons 
against 65536.  Swapping the upper/lower halves of the register and 
doing a word sized tst is faster on the 68000/68010 and smaller on all 
variants.  This doesn't work for all comparisons against 65536.
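
To make the trick concrete: once the comparison is canonicalized against
65535, an unsigned "greater than" holds exactly when the upper 16 bits are
nonzero, and a signed one holds exactly when the upper half, read as a signed
16-bit value, is positive.  A small C sketch of that equivalence (all function
names are made up for illustration; this is not the code GCC emits):

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two 16-bit halves of a 32-bit value, as a rotate by 16 does.  */
static uint32_t
swap_halves (uint32_t x)
{
  return (x << 16) | (x >> 16);
}

/* Unsigned a > 65535, decided from the low word after the swap.  */
static int
gtu_65535_via_swap (uint32_t a)
{
  return (uint16_t) swap_halves (a) != 0;
}

/* Signed a > 65535, decided from the low word after the swap.  The word
   is reinterpreted as signed by hand to avoid an out-of-range cast.  */
static int
gt_65535_via_swap (int32_t a)
{
  uint16_t hi = (uint16_t) swap_halves ((uint32_t) a);
  int32_t hi_signed = hi >= 0x8000u ? (int32_t) hi - 0x10000 : (int32_t) hi;
  return hi_signed > 0;
}

/* Spot-check both forms against the plain comparisons.  */
void
check_swap_compare (void)
{
  static const int32_t samples[] =
    { 0, 1, 65535, 65536, 65537, 0x7fffffff,
      -1, -65536, -65537, -0x7fffffff - 1 };

  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      int32_t s = samples[i];
      uint32_t u = (uint32_t) s;
      assert (gtu_65535_via_swap (u) == (u > 65535u));
      assert (gt_65535_via_swap (s) == (s > 65535));
    }
}
```

The "less or equal" forms are simply the negations of these two tests.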


Anyway, it's easy to implement, so here we go.  Tested with m68k.md as 
well as by building newlib using a version which unconditionally made 
the transformation when it was valid (the committed version only fires 
with -Os or when tuning for a m68000/m68010).


You might question if any of that testing is worthwhile -- it actually 
found a case where I thought I'd changed the in-progress peephole, but 
had instead changed something totally unrelated causing the compiler to 
hang!


Installing on the trunk.  I'm probably done with m68k hacking for 2016, 
barring problems with the yearly bootstrap.


Jeff





commit bdc7c1bb83d9baa342c0101049f7586330e406da
Author: Jeff Law 
Date:   Mon Nov 21 11:18:34 2016 -0700

PR target/25128
* config/m68k/predicates.md (swap_peephole_relational_operator): New
predicate.
* config/m68k/m68k.md (relational tests against 65535/65536): New
peephole2.

PR target/25128
* gcc.target/m68k/pr25128.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b30319e..d546162 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2016-11-20  Jeff Law  
+
+   PR target/25128
+   * config/m68k/predicates.md (swap_peephole_relational_operator): New
+   predicate.
+   * config/m68k/m68k.md (relational tests against 65535/65536): New
+   peephole2.
+
 2016-11-21  Kyrylo Tkachov  
 
* tree-ssa-loop-prefetch.c: Delete FIXME after the includes.
diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index 2085619..c6130f1 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -7786,7 +7786,6 @@
   operands[6] = GEN_INT (exact_log2 (INTVAL (operands[1]) + 1));
   operands[7] = operands[2];
 }")
-
 (define_peephole2
   [(set (cc0) (compare (match_operand:SI 0 "register_operand" "")
   (match_operand:SI 1 "pow2_m1_operand" "")))
@@ -7804,3 +7803,23 @@
   (match_dup 2) (match_dup 3)))]
   "{ operands[4] = GEN_INT (exact_log2 (INTVAL (operands[1]) + 1)); }")
 
+;; When optimizing for size or for the original 68000 or 68010, we can
+;; improve some relational tests against 65536 (which get canonicalized
+;; internally against 65535).
+;; The rotate in the output pattern will turn into a swap.
+(define_peephole2
+  [(set (cc0) (compare (match_operand:SI 0 "register_operand" "")
+  (const_int 65535)))
+   (set (pc) (if_then_else (match_operator 1 "swap_peephole_relational_operator"
+[(cc0) (const_int 0)])
+  (match_operand 2 "pc_or_label_operand")
+  (match_operand 3 "pc_or_label_operand")))]
+  "peep2_reg_dead_p (1, operands[0])
+   && (operands[2] == pc_rtx || operands[3] == pc_rtx)
+   && (optimize_size || TUNE_68000_10)
+   && DATA_REG_P (operands[0])"
+  [(set (match_dup 0) (rotate:SI (match_dup 0) (const_int 16)))
+   (set (cc0) (compare (subreg:HI (match_dup 0) 2) (const_int 0)))
+   (set (pc) (if_then_else (match_op_dup 1 [(cc0) (const_int 0)])
+  (match_dup 2) (match_dup 3)))]
+  "")
diff --git a/gcc/config/m68k/predicates.md b/gcc/config/m68k/predicates.md
index bfb548a..be32ef6 100644
--- a/gcc/config/m68k/predicates.md
+++ b/gcc/config/m68k/predicates.md
@@ -279,3 +279,6 @@
 ;; Used to detect (pc) or (label_ref) in some jumping patterns to cut down
 (define_predicate "pc_or_label_operand"
   (match_code "pc,label_ref"))
+
+(define_predicate "swap_peephole_relational_operator"
+  (match_code "gtu,leu,gt,le"))
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 828c941..7dbfcaa 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2016-11-20  Jeff Law  
+
+   PR target/25128
+   * gcc.target/m68k/pr25128.c: New test.
+
 2016-11-21  Richard Sandiford  
 
* gcc.dg/tree-ssa/tailcall-7.c: New test.
diff --git a/gcc/testsuite/gcc.target/m68k/pr25128.c b/gcc/testsuite/gcc.target/m68k/pr25128.c
new file mode 100644
index 0000000..f99f817
--- /dev/null
+++ b/gcc/testsuite/gcc.target/m68k/pr25128.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+
+/* { dg-final { scan-assembler-times "swap" 4 } } */
+/* { dg-final { scan-assembler-times "tst.w" 4 } } */
+/* { dg-final { scan-assembler-not "cmp.l" } } */
+
+
+unsigned int bar (void);
+void
+foo1 (void)
+{
+  unsigned int a = bar ();
+  if (0x10000 <= a)
+bar ();
+}
+
+
+void
+foo2 (void)
+{
+  unsigned int a = bar ();
+  if (0x10000 > a)
+bar ();
+}
+
+
+void
+foo3 (void)
+{
+  int a = bar ();
+  if (0x10000 <= a)
+bar ();
+}
+
+
+void
+foo4 (void)
+{

Re: [PATCH, GCC/ARM, ping] Optional -mthumb for Thumb only targets

2016-11-21 Thread Thomas Preudhomme




On 21/11/16 15:21, Christophe Lyon wrote:

On 21 November 2016 at 15:16, Thomas Preudhomme
 wrote:

On 21/11/16 08:51, Christophe Lyon wrote:


Hi Thomas,



Hi Christophe,





On 18 November 2016 at 17:51, Thomas Preudhomme
 wrote:


On 11/11/16 14:35, Kyrill Tkachov wrote:




On 08/11/16 13:36, Thomas Preudhomme wrote:



Ping?

Best regards,

Thomas

On 25/10/16 18:07, Thomas Preudhomme wrote:



Hi,

Currently when a user compiles for a thumb-only target (such as Cortex-M
processors) without specifying the -mthumb option GCC throws the error "target
CPU does not support ARM mode". This is suboptimal from a usability point of
view: the -mthumb could be deduced from the -march or -mcpu option when there is
no ambiguity.

This patch implements this behavior by extending the DRIVER_SELF_SPECS to
automatically append -mthumb to the command line for thumb-only targets. It does
so by checking the last -march option if any is given or the last -mcpu option
otherwise. There is no ordering issue because conflicting -mcpu and -march is
already handled.
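
The selection rule described above (last -march if any, otherwise last -mcpu)
can be sketched in plain C.  This is only an illustration under stated
assumptions: the name lists are hard-coded examples, the function names are
hypothetical, an explicit -marm or -mthumb is not considered, and the real
patch works through a driver spec function rather than code like this:

```c
#include <assert.h>
#include <string.h>

/* Example names only; hard-coded here purely for illustration.  */
static const char *const thumb_only_archs[] =
  { "armv6-m", "armv7-m", "armv7e-m" };
static const char *const thumb_only_cpus[] =
  { "cortex-m0", "cortex-m3", "cortex-m4" };

/* Return 1 if NAME occurs in the N-element LIST.  */
static int
in_list (const char *name, const char *const *list, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (name, list[i]) == 0)
      return 1;
  return 0;
}

/* Hypothetical helper: decide whether -mthumb should be appended.  The
   last -march wins if one is given, otherwise the last -mcpu.  */
int
should_add_mthumb (int argc, const char *const *argv)
{
  const char *arch = NULL;
  const char *cpu = NULL;

  for (int i = 0; i < argc; i++)
    {
      if (strncmp (argv[i], "-march=", 7) == 0)
        arch = argv[i] + 7;   /* a later option overrides an earlier one */
      else if (strncmp (argv[i], "-mcpu=", 6) == 0)
        cpu = argv[i] + 6;
    }

  if (arch)
    return in_list (arch, thumb_only_archs,
                    sizeof thumb_only_archs / sizeof thumb_only_archs[0]);
  if (cpu)
    return in_list (cpu, thumb_only_cpus,
                    sizeof thumb_only_cpus / sizeof thumb_only_cpus[0]);
  return 0;
}

/* A few command lines and the expected decision.  */
void
check_should_add_mthumb (void)
{
  const char *m0[] = { "gcc", "-mcpu=cortex-m0", "t.c" };
  const char *a8[] = { "gcc", "-mcpu=cortex-a8", "t.c" };
  const char *both[] = { "gcc", "-mcpu=cortex-m0", "-march=armv7-a", "t.c" };

  assert (should_add_mthumb (3, m0) == 1);
  assert (should_add_mthumb (3, a8) == 0);
  assert (should_add_mthumb (4, both) == 0);
}
```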

Note that the logic cannot be implemented in function arm_option_override
because we need to provide the modified command line to the GCC driver for
finding the right multilib path and the function arm_option_override is executed
too late for that effect.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2016-10-18  Terry Guo  
Thomas Preud'homme 

PR target/64802
* common/config/arm/arm-common.c (arm_target_thumb_only): New
function.
* config/arm/arm-opts.h: Include arm-flags.h.
(struct arm_arch_core_flag): Define.
(arm_arch_core_flags): Define.
* config/arm/arm-protos.h: Include arm-flags.h.
(FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M, FL_MODE26, FL_MODE32,
FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED, FL_STRONG,
FL_ARCH5E,
FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF, FL_ARCH6K, FL_THUMB2,
FL_NOTM,
FL_THUMB_DIV, FL_VFPV3, FL_NEON, FL_ARCH7EM, FL_ARCH7,
FL_ARM_DIV,
FL_ARCH8, FL_CRC32, FL_SMALLMUL, FL_NO_VOLATILE_CE, FL_IWMMXT,
FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1, FL2_ARCH8_2,
FL2_FP16INST,
FL_TUNE, FL_FOR_ARCH2, FL_FOR_ARCH3, FL_FOR_ARCH3M,
FL_FOR_ARCH4,
FL_FOR_ARCH4T, FL_FOR_ARCH5, FL_FOR_ARCH5T, FL_FOR_ARCH5E,
FL_FOR_ARCH5TE, FL_FOR_ARCH5TEJ, FL_FOR_ARCH6, FL_FOR_ARCH6J,
FL_FOR_ARCH6K, FL_FOR_ARCH6Z, FL_FOR_ARCH6ZK, FL_FOR_ARCH6KZ,
FL_FOR_ARCH6T2, FL_FOR_ARCH6M, FL_FOR_ARCH7, FL_FOR_ARCH7A,
FL_FOR_ARCH7VE, FL_FOR_ARCH7R, FL_FOR_ARCH7M, FL_FOR_ARCH7EM,
FL_FOR_ARCH8A, FL2_FOR_ARCH8_1A, FL2_FOR_ARCH8_2A,
FL_FOR_ARCH8M_BASE,
FL_FOR_ARCH8M_MAIN, arm_feature_set, ARM_FSET_MAKE,
ARM_FSET_MAKE_CPU1, ARM_FSET_MAKE_CPU2, ARM_FSET_CPU1,
ARM_FSET_CPU2,
ARM_FSET_EMPTY, ARM_FSET_ANY, ARM_FSET_HAS_CPU1,
ARM_FSET_HAS_CPU2,
ARM_FSET_HAS_CPU, ARM_FSET_ADD_CPU1, ARM_FSET_ADD_CPU2,
ARM_FSET_DEL_CPU1, ARM_FSET_DEL_CPU2, ARM_FSET_UNION,
ARM_FSET_INTER,
ARM_FSET_XOR, ARM_FSET_EXCLUDE, ARM_FSET_IS_EMPTY,
ARM_FSET_CPU_SUBSET): Move to ...
* config/arm/arm-flags.h: This new file.
* config/arm/arm.h (TARGET_MODE_SPEC_FUNCTIONS): Define.
(EXTRA_SPEC_FUNCTIONS): Add TARGET_MODE_SPEC_FUNCTIONS to its
value.
(TARGET_MODE_SPECS): Define.
(DRIVER_SELF_SPECS): Add TARGET_MODE_SPECS to its value.


*** gcc/testsuite/ChangeLog ***

2016-10-11  Thomas Preud'homme 

PR target/64802
* gcc.target/arm/optional_thumb-1.c: New test.
* gcc.target/arm/optional_thumb-2.c: New test.
* gcc.target/arm/optional_thumb-3.c: New test.


No regression when running the testsuite for -mcpu=cortex-m0 -mthumb,
-mcpu=cortex-m0 -marm and -mcpu=cortex-a8 -marm

Is this ok for trunk?



This looks like a useful usability improvement.
This is ok after a bootstrap on an arm-none-linux-gnueabihf target.

Sorry for the delay,
Kyrill




I've rebased the patch on top of the arm_feature_set type consistency fix
[1] and committed it. The committed patch is in attachment for reference.

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01680.html



Since this commit (242597), I've noticed that:
- the 2 new tests optional_thumb-1.c and optional_thumb-2.c fail
if GCC was configured --with-mode=arm. The error message is:
cc1: error: target CPU does not support ARM mode



I need to skip the test when GCC is built with --with-mode= but we do not
have a directive for that. I'll see if I can add one.


Since the test adds -march=armv6m, shouldn't your patch implicitly add -mthumb?


Precisely not, since the purpose is to test that GCC deduces the -mthumb
automatically when the target is Thumb-only. The thing is the test should only
be run when no -mthumb or -marm is passed and --with-mode=* is equivalent to
passing -mthumb or -marm.






- on armeb --with-mode=arm, gcc.dg/vect/pr64252.c fails at execution

See:
http://people.linaro.org/~christoph

Re: [PATCH, ARM] Enable ldrd/strd peephole rules unconditionally

2016-11-21 Thread Bin.Cheng

On Fri, Nov 18, 2016 at 3:50 PM, Bernd Edlinger
 wrote:
> On 11/18/16 12:58, Christophe Lyon wrote:
>> On 17 November 2016 at 10:23, Kyrill Tkachov
>>  wrote:
>>>
>>> On 09/11/16 12:58, Bernd Edlinger wrote:

 Hi!


 This patch enables the ldrd/strd peephole rules unconditionally.

 It is meant to fix cases, where the patch to reduce the sha512
 stack usage splits ldrd/strd instructions into separate ldr/str insns,
 but is technically independent from the other patch:

 See https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00523.html

 It was necessary to change check_effective_target_arm_prefer_ldrd_strd
 to retain the true prefer_ldrd_strd tuning flag.


 Bootstrapped and reg-tested on arm-linux-gnueabihf.
 Is it OK for trunk?
Hi Bernd,
Any update on the other patch you mentioned?  This one breaks
bootstrap of arm-linux-gnueabihf with certain options like
"--with-arch=armv7-a --with-fpu=neon --with-float=hard".
I created PR78453 for tracking.

Thanks,
bin
>>>
>>>
>>> This is ok.
>>> Thanks,
>>> Kyrill
>>>
>>
>> Hi Bernd,
>>
>> Since you committed this patch (r242549), I'm seeing the new test
>> failing on some arm*-linux-gnueabihf configurations:
>>
>> FAIL:  gcc.target/arm/pr53447-5.c scan-assembler-times ldrd 10
>> FAIL:  gcc.target/arm/pr53447-5.c scan-assembler-times strd 9
>>
>> See 
>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242549/report-build-info.html
>> for a map of failures.
>>
>> Am I missing something?
>
> Hi Christophe,
>
> as always many thanks for your testing...
>
> I have apparently only looked at the case -mfloat-abi=soft here, which
> is what my other patch is going to address.  But all targets with
> -mfpu=neon -mfloat-abi=hard can also use vldr.64 instead of ldrd
> and vstr.64 instead of strd, which should be accepted as well.
>
> So the attached patch should fix at least most of the fallout.
>
> Is it OK for trunk?
>
>
> Thanks
> Bernd.

Re: [PATCH] Enable Intel AVX512_4FMAPS and AVX512_4VNNIW instructions

2016-11-21 Thread Andrew Senkevich

2016-11-21 20:12 GMT+03:00 Martin Sebor :
> On 11/20/2016 11:16 AM, Uros Bizjak wrote:
>>
>> On Sat, Nov 19, 2016 at 7:52 PM, Uros Bizjak  wrote:
>>>
>>> On Sat, Nov 19, 2016 at 6:24 PM, Jakub Jelinek  wrote:

 On Sat, Nov 19, 2016 at 12:28:22PM +0100, Jakub Jelinek wrote:
>
> On x86_64-linux with the 3 patches I'm not seeing any new FAILs
> compared to before r242569, on i686-linux there is still:
> +FAIL: gcc.target/i386/pr57756.c  (test for errors, line 6)
> +FAIL: gcc.target/i386/pr57756.c  (test for warnings, line 14)
> compared to pre-r242569 (so some further fix is needed).


 And finally here is yet another patch that fixes pr57756 on i686-linux.
 Ok for trunk together with the other 3 patches?
>>>
>>>
>>> OK for the whole patch series.
>>
>>
>> Hm, I still see (both, 32bit and 64bit targets):
>>
>> In file included from /ssd/uros/gcc-build/gcc/include/immintrin.h:45:0,
>>  from
>> /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/sse-22.c:223,
>>  from
>> /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/sse-22a.c:7:
>> /ssd/uros/gcc-build/gcc/include/avx5124fmapsintrin.h: In function
>> '_mm512_maskz_4fmadd_ps':
>> /ssd/uros/gcc-build/gcc/include/avx512fintrin.h:244:1: error: inlining
>> failed in call to always_inline '_mm512_setzero_ps': target specific
>> option mismatch
>> In file included from /ssd/uros/gcc-build/gcc/include/immintrin.h:71:0,
>>  from
>> /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/sse-22.c:223,
>>  from
>> /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/sse-22a.c:7:
>> /ssd/uros/gcc-build/gcc/include/avx5124fmapsintrin.h:77:17: note:
>> called from here
>> compiler exited with status 1
>> FAIL: gcc.target/i386/sse-22a.c (test for excess errors)
>> Excess errors:
>> /ssd/uros/gcc-build/gcc/include/avx512fintrin.h:244:1: error: inlining
>> failed in call to always_inline '_mm512_setzero_ps': target specific
>> option mismatch
>
>
> FWIW, I came across the same error in my own testing and raised
> bug 78451.

Can we fix it with the following patch? Regtesting in progress.

PR target/78451
* gcc/config/i386/avx5124fmapsintrin.h: Avoid call to
_mm512_setzero_ps.
* gcc/config/i386/avx5124vnniwintrin.h: Ditto.
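
The idea, sketched with GCC's generic vector extension rather than the real
intrinsics: a brace-initialized vector literal gives the same all-zero value
but involves no call to an always_inline helper, so the target option
mismatch cannot arise.  The type and function names below are hypothetical
and the snippet assumes a compiler supporting the vector_size attribute:

```c
#include <assert.h>

/* Hypothetical 16 x float vector type, similar in shape to __v16sf.  */
typedef float v16sf_sketch __attribute__ ((vector_size (64)));

/* Build the zero vector from a literal instead of calling a helper and
   report whether every lane really is zero.  */
int
zero_literal_is_all_zero (void)
{
  v16sf_sketch z = (v16sf_sketch) {0, 0, 0, 0,
                                   0, 0, 0, 0, 0, 0, 0, 0,
                                   0, 0, 0, 0};

  for (int i = 0; i < 16; i++)
    if (z[i] != 0.0f)
      return 0;
  return 1;
}

void
check_zero_literal (void)
{
  assert (zero_literal_is_all_zero ());
}
```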

diff --git a/gcc/config/i386/avx5124fmapsintrin.h
b/gcc/config/i386/avx5124fmapsintrin.h
index 6113ee9..dd9a322
--- a/gcc/config/i386/avx5124fmapsintrin.h
+++ b/gcc/config/i386/avx5124fmapsintrin.h
@@ -74,7 +74,9 @@ _mm512_maskz_4fmadd_ps (__mmask16 __U,
  (__v16sf) __E,
  (__v16sf) __A,
  (const __v4sf *) __F,
- (__v16sf) _mm512_setzero_ps (),
+ (__v16sf) {0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0},
  (__mmask16) __U);
 }

@@ -161,7 +163,9 @@ _mm512_maskz_4fnmadd_ps (__mmask16 __U,
  (__v16sf) __E,
  (__v16sf) __A,
  (const __v4sf *) __F,
- (__v16sf) _mm512_setzero_ps (),
+ (__v16sf) {0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0},
  (__mmask16) __U);
 }

diff --git a/gcc/config/i386/avx5124vnniwintrin.h
b/gcc/config/i386/avx5124vnniwintrin.h
index 392c6a5..a4faa24
--- a/gcc/config/i386/avx5124vnniwintrin.h
+++ b/gcc/config/i386/avx5124vnniwintrin.h
@@ -75,7 +75,9 @@ _mm512_maskz_4dpwssd_epi32 (__mmask16 __U, __m512i
__A, __m512i __B,
   (__v16si) __E,
   (__v16si) __A,
   (const __v4si *) __F,
-  (__v16si) _mm512_setzero_ps (),
+  (__v16si) {0, 0, 0, 0,
+  0, 0, 0, 0, 0, 0, 0, 0,
+  0, 0, 0, 0},
   (__mmask16) __U);
 }

@@ -120,7 +122,9 @@ _mm512_maskz_4dpwssds_epi32 (__mmask16 __U,
__m512i __A, __m512i __B,
(__v16si) __E,
(__v16si) __A,
(const __v4si *) __F,
-   (__v16si) _mm512_setzero_ps (),
+   (__v16si) {0, 0, 0, 0,
+   0, 0, 0, 0, 0, 0, 0, 0,
+   0, 0, 0, 0},
(__mmask16) __U);
 }


--
WBR,
Andrew


sse-22a-fix.patch
Description: Binary data

Re: [PATCH v3] cpp/c: Add -Wexpansion-to-defined

2016-11-21 Thread Paolo Bonzini



On 10/08/2016 15:53, Paolo Bonzini wrote:
> From: Paolo Bonzini 
> 
> clang recently added a new warning -Wexpansion-to-defined, which
> warns when `defined' is used outside a #if expression (including the
> case of a macro that is then used in a #if expression).
> 
> While I disagree with their inclusion of the warning to -Wall, I think
> it is a good addition overall.  First, it is a logical extension of the
> existing warning for breaking defined across a macro and its caller.
> Second, it is good to make these warnings for `defined' available with
> a command-line option other than -pedantic.
> 
> This patch adds -Wexpansion-to-defined, and enables it automatically
> for both -pedantic (as before) and -Wextra.  It also changes the warning
> from a warning to a pedwarn; this is mostly for documentation sake
> since the EnabledBy machinery would turn -pedantic-errors into
> -Werror=expansion-to-defined anyway.
> 
> Bootstrapped and regression tested x86_64-pc-linux-gnu, ok?
> 
> Paolo

Ping?  I would like to commit this now as "posted early enough during
Stage 1 and not yet fully reviewed".

Thanks,

Paolo
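
For anyone who has not run into this: the construct being diagnosed is a
macro whose expansion contains "defined" and which is then used in #if.  A
sketch showing the non-portable form (in a comment only, so nothing here
warns) next to a portable rewrite; FEATURE_A, FEATURE_B, HAVE_ANY_FEATURE and
the two functions are made-up names:

```c
#include <assert.h>

#define FEATURE_A 1

/* Non-portable form, which the warning is meant to catch:

     #define HAVE_ANY_FEATURE (defined (FEATURE_A) || defined (FEATURE_B))
     #if HAVE_ANY_FEATURE
     ...
     #endif

   The C standard leaves "defined" produced by macro expansion undefined.  */

/* Portable form: evaluate "defined" directly in #if and record the
   result in an ordinary macro.  */
#if defined (FEATURE_A) || defined (FEATURE_B)
# define HAVE_ANY_FEATURE 1
#else
# define HAVE_ANY_FEATURE 0
#endif

int
have_any_feature (void)
{
  return HAVE_ANY_FEATURE;
}

void
check_have_any_feature (void)
{
  assert (have_any_feature () == 1);
}
```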

> libcpp:
> 2016-08-09  Paolo Bonzini  
> 
> * include/cpplib.h (struct cpp_options): Add new member
> warn_expansion_to_defined.
> (CPP_W_EXPANSION_TO_DEFINED): New enum member.
> * expr.c (parse_defined): Warn for all uses of "defined"
> in macros, and tie warning to CPP_W_EXPANSION_TO_DEFINED.
> Make it a pedwarning instead of a warning.
> * system.h (HAVE_DESIGNATED_INITIALIZERS): Do not use
> "defined" in macros.
> 
> gcc:
> 2016-08-09  Paolo Bonzini  
> 
> * system.h (HAVE_DESIGNATED_INITIALIZERS,
> HAVE_DESIGNATED_UNION_INITIALIZERS): Do not use
> "defined" in macros.
> 
> gcc/c-family:
> 2016-08-09  Paolo Bonzini  
> 
> * c.opt (Wexpansion-to-defined): New.
> 
> gcc/doc:
> 2016-08-09  Paolo Bonzini  
> 
> * cpp.texi (Defined): Mention -Wexpansion-to-defined.
> * cppopts.texi (Invocation): Document -Wexpansion-to-defined.
> * invoke.texi (Warning Options): Document -Wexpansion-to-defined.
> 
> gcc/testsuite:
> 2016-08-09  Paolo Bonzini  
> 
> * gcc.dg/cpp/defined.c: Mark newly introduced warnings and
> adjust for warning->pedwarn change.
> * gcc.dg/cpp/defined-Wexpansion-to-defined.c,
> gcc.dg/cpp/defined-Wextra-Wno-expansion-to-defined.c,
> gcc.dg/cpp/defined-Wextra.c,
> gcc.dg/cpp/defined-Wno-expansion-to-defined.c: New testcases.
> 
> Index: gcc/c-family/c.opt
> ===
> --- gcc/c-family/c.opt(revision 239276)
> +++ gcc/c-family/c.opt(working copy)
> @@ -506,6 +506,10 @@
>  C ObjC C++ ObjC++ Var(warn_double_promotion) Warning
>  Warn about implicit conversions from \"float\" to \"double\".
>  
> +Wexpansion-to-defined
> +C ObjC C++ ObjC++ CPP(warn_expansion_to_defined) CppReason(CPP_W_EXPANSION_TO_DEFINED) Var(cpp_warn_expansion_to_defined) Init(0) Warning EnabledBy(Wextra || Wpedantic)
> +Warn if \"defined\" is used outside #if.
> +
>  Wimplicit-function-declaration
>  C ObjC Var(warn_implicit_function_declaration) Init(-1) Warning LangEnabledBy(C ObjC,Wimplicit)
>  Warn about implicit function declarations.
> Index: gcc/doc/cpp.texi
> ===
> --- gcc/doc/cpp.texi  (revision 239276)
> +++ gcc/doc/cpp.texi  (working copy)
> @@ -3342,7 +3342,9 @@
>  the C standard says the behavior is undefined.  GNU cpp treats it as a
>  genuine @code{defined} operator and evaluates it normally.  It will warn
>  wherever your code uses this feature if you use the command-line option
> -@option{-pedantic}, since other compilers may handle it differently.
> +@option{-Wpedantic}, since other compilers may handle it differently.  The
> +warning is also enabled by @option{-Wextra}, and can also be enabled
> +individually with @option{-Wexpansion-to-defined}.
>  
>  @node Else
>  @subsection Else
> Index: gcc/doc/cppopts.texi
> ===
> --- gcc/doc/cppopts.texi  (revision 239276)
> +++ gcc/doc/cppopts.texi  (working copy)
> @@ -120,6 +120,12 @@
>  @samp{#if} directive, outside of @samp{defined}.  Such identifiers are
>  replaced with zero.
>  
> +@item -Wexpansion-to-defined
> +@opindex Wexpansion-to-defined
> +Warn whenever @samp{defined} is encountered in the expansion of a macro
> +(including the case where the macro is expanded by an @samp{#if} directive).
> +Such usage is not portable.
> +
>  @item -Wunused-macros
>  @opindex Wunused-macros
>  Warn about macros defined in the main file that are unused.  A macro
> Index: gcc/doc/invoke.texi
> ===
> --- gcc/doc/invoke.texi   (revision 239276)
> +++ gcc/doc/invoke.texi   (working copy)
> @@ -4914,6 +4914,11 @@
>

[PATCH][committed] Remove dead FIXME

2016-11-21 Thread Kyrill Tkachov


Hi all,

tree-ssa-loop-prefetch.c used to include expr.h so that optabs would work but
since the optabs headers were cleaned up in the last (couple of?) releases it no
longer needs to do that and expr.h is no longer included, but the stale FIXME
comment is still there.
This patch removes it.

Committing to trunk as obvious.

Thanks,
Kyrill

2016-11-21  Kyrylo Tkachov  

* tree-ssa-loop-prefetch.c: Delete FIXME after the includes.
commit 153c0f1b4c5057482acdd1447239e0cb29261625
Author: Kyrylo Tkachov 
Date:   Mon Nov 21 14:12:42 2016 +

[obvious] Remove dead FIXME

diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c
index 43ee85a..0a2ee5e 100644
--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -48,10 +48,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-inline.h"
 #include "tree-data-ref.h"
 
-
-/* FIXME: Needed for optabs, but this should all be moved to a TBD interface
-   between the GIMPLE and RTL worlds.  */
-
 /* This pass inserts prefetch instructions to optimize cache usage during
accesses to arrays in loops.  It processes loops sequentially and:

Re: [PATCH] Delete GCJ

2016-11-21 Thread Rainer Orth

Hi Matthias,

> ahh, didn't see that :-/ Now fixed, is this clearer now?
>
> The options @option{--with-target-bdw-gc-include} and
> @option{--with-target-bdw-gc-lib} must always specified together for
   ^ be 
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Delete GCJ

2016-11-21 Thread Peter Bergner


On 11/21/16 10:40 AM, Matthias Klose wrote:

On 21.11.2016 17:23, Sandra Loosemore wrote:

On 11/21/2016 05:57 AM, Matthias Klose wrote:
I'm sure you didn't mean exactly the same --with-foo in all 3 places, but I'm a
dummy about what you really intended to say here.


ahh, didn't see that :-/ Now fixed, is this clearer now?

The options @option{--with-target-bdw-gc-include} and
@option{--with-target-bdw-gc-lib} must always specified together for
each multilib variant and take precedence over
@option{--with-target-bdw-gc}.  If none of these options are
specified, the values are taken from the @command{pkg-config}
@samp{bdw-gc} module.


s/must always specified/must always be specified/

and possibly:

s/variant and take precedence/variant and they take precedence/ ???


Peter

Re: [PATCH] Enable Intel AVX512_4FMAPS and AVX512_4VNNIW instructions

2016-11-21 Thread Martin Sebor


On 11/20/2016 11:16 AM, Uros Bizjak wrote:

On Sat, Nov 19, 2016 at 7:52 PM, Uros Bizjak  wrote:

On Sat, Nov 19, 2016 at 6:24 PM, Jakub Jelinek  wrote:

On Sat, Nov 19, 2016 at 12:28:22PM +0100, Jakub Jelinek wrote:

On x86_64-linux with the 3 patches I'm not seeing any new FAILs
compared to before r242569, on i686-linux there is still:
+FAIL: gcc.target/i386/pr57756.c  (test for errors, line 6)
+FAIL: gcc.target/i386/pr57756.c  (test for warnings, line 14)
compared to pre-r242569 (so some further fix is needed).


And finally here is yet another patch that fixes pr57756 on i686-linux.
Ok for trunk together with the other 3 patches?


OK for the whole patch series.


Hm, I still see (both, 32bit and 64bit targets):

In file included from /ssd/uros/gcc-build/gcc/include/immintrin.h:45:0,
 from
/home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/sse-22.c:223,
 from
/home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/sse-22a.c:7:
/ssd/uros/gcc-build/gcc/include/avx5124fmapsintrin.h: In function
'_mm512_maskz_4fmadd_ps':
/ssd/uros/gcc-build/gcc/include/avx512fintrin.h:244:1: error: inlining
failed in call to always_inline '_mm512_setzero_ps': target specific
option mismatch
In file included from /ssd/uros/gcc-build/gcc/include/immintrin.h:71:0,
 from
/home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/sse-22.c:223,
 from
/home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/sse-22a.c:7:
/ssd/uros/gcc-build/gcc/include/avx5124fmapsintrin.h:77:17: note:
called from here
compiler exited with status 1
FAIL: gcc.target/i386/sse-22a.c (test for excess errors)
Excess errors:
/ssd/uros/gcc-build/gcc/include/avx512fintrin.h:244:1: error: inlining
failed in call to always_inline '_mm512_setzero_ps': target specific
option mismatch


FWIW, I came across the same error in my own testing and raised
bug 78451.

Martin

[avr,committed] Use more C++ for-loop declarations.

2016-11-21 Thread Georg-Johann Lay


https://gcc.gnu.org/r242672

Committed this obvious code clean-up to avr.c and avr-c.c.

Johann

gcc/
* config/avr/avr-c.c (avr_register_target_pragmas): Use C++
for-loop declaration of loop variable.
(avr_register_target_pragmas, avr_cpu_cpp_builtins): Same.
* config/avr/avr.c (avr_popcount_each_byte)
(avr_init_expanders, avr_regs_to_save, sequent_regs_live)
(get_sequence_length, avr_prologue_setup_frame, avr_map_metric)
(avr_expand_epilogue, avr_function_arg_advance)
(avr_out_compare, avr_out_plus_1, avr_out_bitop, avr_out_fract)
(avr_rotate_bytes, _reg_unused_after, avr_assemble_integer)
(avr_adjust_reg_alloc_order, output_reload_in_const)
(avr_conditional_register_usage, avr_find_unused_d_reg)
(avr_map_decompose, avr_fold_builtin): Same.
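
The change is purely mechanical.  As a sketch of the pattern (written in C99,
which accepts the same loop-scoped declaration as C++), modelled loosely on
get_sequence_length with a hypothetical struct insn_sketch standing in for
rtx_insn:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn chain.  */
struct insn_sketch
{
  int length;
  struct insn_sketch *next;
};

/* Before: the loop variable lives at function scope.  */
int
sequence_length_old (struct insn_sketch *insns)
{
  struct insn_sketch *insn;
  int length;

  for (insn = insns, length = 0; insn; insn = insn->next)
    length += insn->length;

  return length;
}

/* After: the loop variable is declared in the for statement and is not
   visible outside the loop.  */
int
sequence_length_new (struct insn_sketch *insns)
{
  int length = 0;

  for (struct insn_sketch *insn = insns; insn; insn = insn->next)
    length += insn->length;

  return length;
}

/* Both versions compute the same sum over a three-element chain.  */
void
check_sequence_length (void)
{
  struct insn_sketch c = { 4, NULL };
  struct insn_sketch b = { 2, &c };
  struct insn_sketch a = { 1, &b };

  assert (sequence_length_old (&a) == 7);
  assert (sequence_length_new (&a) == 7);
  assert (sequence_length_new (NULL) == 0);
}
```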

Index: config/avr/avr-c.c
===
--- config/avr/avr-c.c	(revision 242671)
+++ config/avr/avr-c.c	(working copy)
@@ -249,8 +249,6 @@ avr_resolve_overloaded_builtin (unsigned
 void
 avr_register_target_pragmas (void)
 {
-  int i;
-
   gcc_assert (ADDR_SPACE_GENERIC == ADDR_SPACE_RAM);
 
   /* Register address spaces.  The order must be the same as in the respective
@@ -259,7 +257,7 @@ avr_register_target_pragmas (void)
  sense for some targets.  Diagnose for non-supported spaces will be
  emit by TARGET_ADDR_SPACE_DIAGNOSE_USAGE.  */
 
-  for (i = 0; i < ADDR_SPACE_COUNT; i++)
+  for (int i = 0; i < ADDR_SPACE_COUNT; i++)
 {
   gcc_assert (i == avr_addrspace[i].id);
 
@@ -292,8 +290,6 @@ avr_toupper (char *up, const char *lo)
 void
 avr_cpu_cpp_builtins (struct cpp_reader *pfile)
 {
-  int i;
-
   builtin_define_std ("AVR");
 
   /* __AVR_DEVICE_NAME__ and  avr_mcu_types[].macro like __AVR_ATmega8__
@@ -391,7 +387,7 @@ /* Define builtin macros so that the use
 
   if (lang_GNU_C ())
 {
-  for (i = 0; i < ADDR_SPACE_COUNT; i++)
+  for (int i = 0; i < ADDR_SPACE_COUNT; i++)
 if (!ADDR_SPACE_GENERIC_P (i)
 /* Only supply __FLASH macro if the address space is reasonable
for this target.  The address space qualifier itself is still
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 242671)
+++ config/avr/avr.c	(working copy)
@@ -251,14 +251,12 @@ avr_tolower (char *lo, const char *up)
 bool
 avr_popcount_each_byte (rtx xval, int n_bytes, int pop_mask)
 {
-  int i;
-
   machine_mode mode = GET_MODE (xval);
 
   if (VOIDmode == mode)
 mode = SImode;
 
-  for (i = 0; i < n_bytes; i++)
+  for (int i = 0; i < n_bytes; i++)
 {
   rtx xval8 = simplify_gen_subreg (QImode, xval, mode, i);
   unsigned int val8 = UINTVAL (xval8) & GET_MODE_MASK (QImode);
@@ -812,9 +810,7 @@ avr_init_machine_status (void)
 void
 avr_init_expanders (void)
 {
-  int regno;
-
-  for (regno = 0; regno < 32; regno ++)
+  for (int regno = 0; regno < 32; regno ++)
 all_regs_rtx[regno] = gen_rtx_REG (QImode, regno);
 
   lpm_reg_rtx  = all_regs_rtx[LPM_REGNO];
@@ -1138,7 +1134,7 @@ avr_starting_frame_offset (void)
 static int
 avr_regs_to_save (HARD_REG_SET *set)
 {
-  int reg, count;
+  int count;
   int int_or_sig_p = cfun->machine->is_interrupt || cfun->machine->is_signal;
 
   if (set)
@@ -1153,7 +1149,7 @@ avr_regs_to_save (HARD_REG_SET *set)
   || cfun->machine->is_OS_main)
 return 0;
 
-  for (reg = 0; reg < 32; reg++)
+  for (int reg = 0; reg < 32; reg++)
 {
   /* Do not push/pop __tmp_reg__, __zero_reg__, as well as
  any global register variables.  */
@@ -1340,11 +1336,10 @@ avr_simple_epilogue (void)
 static int
 sequent_regs_live (void)
 {
-  int reg;
   int live_seq = 0;
   int cur_seq = 0;
 
-  for (reg = 0; reg <= LAST_CALLEE_SAVED_REG; ++reg)
+  for (int reg = 0; reg <= LAST_CALLEE_SAVED_REG; ++reg)
 {
   if (fixed_regs[reg])
 {
@@ -1400,10 +1395,9 @@ sequent_regs_live (void)
 int
 get_sequence_length (rtx_insn *insns)
 {
-  rtx_insn *insn;
-  int length;
+  int length = 0;
 
-  for (insn = insns, length = 0; insn; insn = NEXT_INSN (insn))
+  for (rtx_insn *insn = insns; insn; insn = NEXT_INSN (insn))
 length += get_attr_length (insn);
 
   return length;
@@ -1539,9 +1533,7 @@ avr_prologue_setup_frame (HOST_WIDE_INT
 }
   else /* !minimize */
 {
-  int reg;
-
-  for (reg = 0; reg < 32; ++reg)
+  for (int reg = 0; reg < 32; ++reg)
 if (TEST_HARD_REG_BIT (set, reg))
   emit_push_byte (reg, true);
 
@@ -1884,7 +1876,6 @@ emit_pop_byte (unsigned regno)
 void
 avr_expand_epilogue (bool sibcall_p)
 {
-  int reg;
   int live_seq;
   HARD_REG_SET set;
   int minimize;
@@ -2026,7 +2017,7 @@ avr_expand_epilogue (bool sibcall_p)
 
   /* Restore used registers.  */
 
-  for (reg = 31; reg >= 0; --reg)
+  for (int reg = 31; reg >= 0; --reg)
 if (TEST_HARD_REG_BIT (set, reg))
   emit_pop_byte (reg);
 
@@ -3218,9 +3

Re: [PATCH] Delete GCJ

2016-11-21 Thread Matthias Klose

On 21.11.2016 17:23, Sandra Loosemore wrote:
> On 11/21/2016 05:57 AM, Matthias Klose wrote:
>>
>> --with-target-bdw-gc=/opt/bdw-gc,32=/opt/bdw-gc32
>>
>> sets the include and lib dirs by appending include and lib to the paths. If you
>> have options --with-target-bdw-gc-include= and --with-target-bdw-gc-lib= as
>> well, it overrides the settings done in --with-target-bdw-gc=. This is copied
>> from the setting of the gmp/mpfr options.
>>
>> Any of these options override the automatic discovery using pkg-config.
>>
>> Please suggest a better wording; I thought that was clear enough (and better
>> than the undocumented --enable-libobjc-gc anyway ;)
> 
> FAOD, my complaint about your patch is that it essentially said
> 
> "The options --with-foo and --with-foo must also be specified together and
> override --with-foo."
> 
> I'm sure you didn't mean exactly the same --with-foo in all 3 places, but I'm a
> dummy about what you really intended to say here.

ahh, didn't see that :-/ Now fixed, is this clearer now?

The options @option{--with-target-bdw-gc-include} and
@option{--with-target-bdw-gc-lib} must always be specified together for
each multilib variant and take precedence over
@option{--with-target-bdw-gc}.  If none of these options are
specified, the values are taken from the @command{pkg-config}
@samp{bdw-gc} module.

Matthias

Re: [PATCH] Delete GCJ

2016-11-21 Thread Sandra Loosemore


On 11/21/2016 05:57 AM, Matthias Klose wrote:


--with-target-bdw-gc=/opt/bdw-gc,32=/opt/bdw-gc32

sets the include and lib dirs by appending include and lib to the paths. If you
have options --with-target-bdw-gc-include= and --with-target-bdw-gc-lib= as
well, it overrides the settings done in --with-target-bdw-gc=. This is copied
from the setting of the gmp/mpfr options.

Any of these options override the automatic discovery using pkg-config.

Please suggest a better wording; I thought that was clear enough (and better
than the undocumented --enable-libobjc-gc anyway ;)


FAOD, my complaint about your patch is that it essentially said

"The options --with-foo and --with-foo must also be specified together 
and override --with-foo."


I'm sure you didn't mean exactly the same --with-foo in all 3 places, 
but I'm a dummy about what you really intended to say here.


-Sandra

Re: PR61409: -Wmaybe-uninitialized false-positive with -O2

2016-11-21 Thread Aldy Hernandez




Since this commit (r242639), I've noticed regressions on arm targets:

  - PASS now FAIL [PASS => FAIL]:

  gcc.dg/uninit-pred-6_a.c warning (test for warnings, line 36)
  gcc.dg/uninit-pred-6_b.c warning (test for warnings, line 42)
  gcc.dg/uninit-pred-7_c.c (test for excess errors)
  gcc.dg/uninit-pred-7_c.c warning (test for warnings, line 29)

  - FAIL appears  [ => FAIL]:

  gcc.dg/uninit-pred-7_c.c (internal compiler error)

This ICE says:
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/uninit-pred-7_c.c:
In function 'foo':
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/uninit-pred-7_c.c:9:5:
internal compiler error: in operator[], at vec.h:732


I'll look into this.

[patch,avr,committed] Use popcount_hwi instead of home-brew code.

2016-11-21 Thread Georg-Johann Lay


http://gcc.gnu.org/r242670

Committed as obvious code clean-up.

Johann

gcc/
* config/avr/avr.c (avr_popcount): Remove static function.
(avr_popcount_each_byte, avr_out_bitop): Use popcount_hwi instead.

Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 242660)
+++ config/avr/avr.c	(working copy)
@@ -243,23 +243,6 @@ avr_tolower (char *lo, const char *up)
 }
 
 
-/* Custom function to count number of set bits.  */
-
-static inline int
-avr_popcount (unsigned int val)
-{
-  int pop = 0;
-
-  while (val)
-{
-  val &= val-1;
-  pop++;
-}
-
-  return pop;
-}
-
-
 /* Constraint helper function.  XVAL is a CONST_INT or a CONST_DOUBLE.
Return true if the least significant N_BYTES bytes of XVAL all have a
popcount in POP_MASK and false, otherwise.  POP_MASK represents a subset
@@ -280,7 +263,7 @@ avr_popcount_each_byte (rtx xval, int n_
   rtx xval8 = simplify_gen_subreg (QImode, xval, mode, i);
   unsigned int val8 = UINTVAL (xval8) & GET_MODE_MASK (QImode);
 
-  if (0 == (pop_mask & (1 << avr_popcount (val8))))
+  if (0 == (pop_mask & (1 << popcount_hwi (val8))))
 return false;
 }
 
@@ -8135,7 +8118,7 @@ avr_out_bitop (rtx insn, rtx *xop, int *
   unsigned int val8 = UINTVAL (xval8) & GET_MODE_MASK (QImode);
 
   /* Number of bits set in the current byte of the constant.  */
-  int pop8 = avr_popcount (val8);
+  int pop8 = popcount_hwi (val8);
 
   /* Registers R16..R31 can operate with immediate.  */
   bool ld_reg_p = test_hard_reg_class (LD_REGS, reg8);
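
As a rough illustration of why this clean-up is safe, the sketch below reproduces the removed loop and compares it with a compiler builtin. It is only a sketch: popcount_hwi lives inside GCC and is not available to standalone code, so __builtin_popcount is used here as an assumed stand-in, and the names kernighan_popcount, builtin_popcount and popcounts_agree_for_bytes are made up for this example.

```c
#include <assert.h>
#include <limits.h>

/* The removed helper, as in the patch: each iteration clears the lowest
   set bit, so the loop runs once per set bit.  */
static int
kernighan_popcount (unsigned int val)
{
  int pop = 0;

  while (val)
    {
      val &= val - 1;
      pop++;
    }

  return pop;
}

/* Assumed stand-in for GCC's internal popcount_hwi; this uses the
   GCC/Clang builtin rather than the function named in the patch.  */
static int
builtin_popcount (unsigned int val)
{
  return __builtin_popcount (val);
}

/* Return 1 if both versions agree for every 8-bit value, which is the
   range the AVR callers pass in (val8 is masked to QImode).  */
static int
popcounts_agree_for_bytes (void)
{
  for (unsigned int val8 = 0; val8 <= UCHAR_MAX; val8++)
    {
      assert (kernighan_popcount (val8) <= CHAR_BIT);
      if (kernighan_popcount (val8) != builtin_popcount (val8))
        return 0;
    }

  return 1;
}
```

Both callers in the patch only pass byte-sized values, so agreement over 0..255 covers the inputs that matter there.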

[committed] substring_loc info needs default track-macro-expansion (PR preprocessor/78324)

2016-11-21 Thread David Malcolm

On Fri, 2016-11-18 at 15:47 -0700, Martin Sebor wrote:
> On 11/18/2016 03:57 PM, David Malcolm wrote:
> > On Fri, 2016-11-18 at 09:51 -0700, Martin Sebor wrote:
> > > > Martin: are the changes to your test cases OK by you, or is
> > > > there
> > > > a better way to rewrite them?
> > > 
> > > Thanks for looking into it!
> > > 
> > > Since the purpose of the test_sprintf_note function in the test
> > > is
> > > to verify the location of the caret within the warnings I think
> > > we
> > > should keep it if it's possible.  Would either removing the P
> > > macro
> > > or moving the function to a different file that doesn't use the
> > > -ftrack-macro-expansion=0 option work?
> > 
> > To get substring locations with the proposed patch, that code will
> > need to
> > be in a file without -ftrack-macro-expansion=0.
> > 
> > builtin-sprintf-warn-4.c seems to fit the bill, and has caret
> > -printing
> > enabled too, so here's another attempt at the patch, just covering
> > the
> > affected test cases, which moves test_sprintf_note to that file,
> > and
> > drops "P".
> > 
> > The carets/underlines from the warnings look sane, and the test
> > case
> > verifies that via dg-begin/end-multiline-output directives.  The
> > test
> > > case also verifies the carets/underlines from the *notes*.
> > 
> > [FWIW, I'm less convinced by the carets/underlines from the notes:
> > they all underline the whole of the __builtin_sprintf expression,
> > though looking at gimple-ssa-sprintf.c, I see:
> > 
> >   location_t callloc = gimple_location (info.callstmt);
> > 
> > which is used for the "inform" calls in question.  Hence I think
> > it's faithfully printing what that code is asking it to.  I'd
> > prefer
> > to not touch that location in this patch, since it seems
> > orthogonal to fixing the PR preprocessor/78324; perhaps something
> > to address as part of PR middle-end/77696 ?].
> > 
> > Martin: how does this look to you?  Any objections to this change
> > as part of my fix for PR preprocessor/78324?
> 
> Not at all.  It looks great -- using the multiline output is even
> better than the original.  You also noticed the comment about the
> caret limitation being out of date and removed it.  Thanks for
> that too!
> 
> I agree that the underlining in the notes could stand to be
> improved at some point.  I'll see if I can get to it sometime
> after I'm done with all my pending patches.
> 
> Thanks again!
> Martin
> 
> PS If there's something I can help with while you're working on
> the rest of the bug let me know.

Thanks.

The updated patch successfully bootstrapped&regrtested
(on x86_64-pc-linux-gnu), so I've committed it to trunk (r242667).

I also verified the problematic test case under valgrind, and
confirmed that it's fixed (gcc.dg/tree-ssa/builtin-sprintf-2.c),
so I'll close out the bug.

Patch follows, for reference.

gcc/ChangeLog:
PR preprocessor/78324
* input.c (get_substring_ranges_for_loc): Fail gracefully if
-ftrack-macro-expansion has a value other than 2.

gcc/testsuite/ChangeLog:
PR preprocessor/78324
* gcc.dg/plugin/diagnostic-test-string-literals-1.c
(test_multitoken_macro): New function.
* gcc.dg/plugin/diagnostic-test-string-literals-3.c: New test
case.
* gcc.dg/plugin/diagnostic-test-string-literals-4.c: New test
case.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the new test
cases.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c (test_sprintf_note):
Move to...
* gcc.dg/tree-ssa/builtin-sprintf-warn-4.c: ...here.  Drop
-ftrack-macro-expansion=0.
(test_sprintf_note): Remove "P" macro.  Add
dg-begin/end-multiline-output directives.
(LINE, buffer, ptr): Copy from builtin-sprintf-warn-1.c.
---
 gcc/input.c|  9 +++
 .../plugin/diagnostic-test-string-literals-1.c | 16 +
 .../plugin/diagnostic-test-string-literals-3.c | 43 
 .../plugin/diagnostic-test-string-literals-4.c | 43 
 gcc/testsuite/gcc.dg/plugin/plugin.exp |  4 +-
 .../gcc.dg/tree-ssa/builtin-sprintf-warn-1.c   | 29 
 .../gcc.dg/tree-ssa/builtin-sprintf-warn-4.c   | 80 +-
 7 files changed, 193 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-3.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-string-literals-4.c

diff --git a/gcc/input.c b/gcc/input.c
index 728f4dd..611e18b 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -1322,6 +1322,15 @@ get_substring_ranges_for_loc (cpp_reader *pfile,
   if (strloc == UNKNOWN_LOCATION)
 return "unknown location";
 
+  /* Reparsing the strings requires accurate location information.
+ If -ftrack-macro-expansion has been overridden from its default
+ of 2, then we might have a location of a macro expansion point,
+ rather than the location of th

[arm] Delete unimplemented option -mapcs-float

2016-11-21 Thread Richard Earnshaw (lists)

The option -m[no-]apcs-float was never implemented in GCC.  Since it
referred specifically to the FPA floating point co-processor, which is
no longer supported in GCC, the option is now irrelevant.  This patch
cleans up the hunks that were left behind.

* arm.opt (mapcs-float): Delete option
* arm.c (arm_option_override): Remove hunk relating to
TARGET_APCS_FLOAT.
* doc/invoke.texi (arm options): Remove documentation for
-mapcs-float.

Applied to trunk.

R.
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 242668)
+++ gcc/config/arm/arm.c(working copy)
@@ -3187,9 +3187,6 @@
   if (TARGET_APCS_REENT)
 warning (0, "APCS reentrant code not supported.  Ignored");
 
-  if (TARGET_APCS_FLOAT)
-warning (0, "passing floating point arguments in fp regs not yet supported");
-
   /* Initialize boolean versions of the flags, for use in the arm.md file.  */
   arm_arch3m = ARM_FSET_HAS_CPU1 (insn_flags, FL_ARCH3M);
   arm_arch4 = ARM_FSET_HAS_CPU1 (insn_flags, FL_ARCH4);
Index: gcc/config/arm/arm.opt
===
--- gcc/config/arm/arm.opt  (revision 242668)
+++ gcc/config/arm/arm.opt  (working copy)
@@ -61,10 +61,6 @@
 mapcs
 Target RejectNegative Mask(APCS_FRAME) Undocumented
 
-mapcs-float
-Target Report Mask(APCS_FLOAT)
-Pass FP arguments in FP registers.
-
 mapcs-frame
 Target Report Mask(APCS_FRAME)
 Generate APCS conformant stack frames.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 242668)
+++ gcc/doc/invoke.texi (working copy)
@@ -621,7 +621,6 @@
 @gccoptlist{-mapcs-frame  -mno-apcs-frame @gol
 -mabi=@var{name} @gol
 -mapcs-stack-check  -mno-apcs-stack-check @gol
--mapcs-float  -mno-apcs-float @gol
 -mapcs-reentrant  -mno-apcs-reentrant @gol
 -msched-prolog  -mno-sched-prolog @gol
 -mlittle-endian  -mbig-endian @gol
@@ -14892,16 +14891,6 @@
 @option{-mno-apcs-stack-check}, since this produces smaller code.
 
 @c not currently implemented
-@item -mapcs-float
-@opindex mapcs-float
-Pass floating-point arguments using the floating-point registers.  This is
-one of the variants of the APCS@.  This option is recommended if the
-target hardware has a floating-point unit or if a lot of floating-point
-arithmetic is going to be performed by the code.  The default is
-@option{-mno-apcs-float}, since the size of integer-only code is 
-slightly increased if @option{-mapcs-float} is used.
-
-@c not currently implemented
 @item -mapcs-reentrant
 @opindex mapcs-reentrant
 Generate reentrant, position-independent code.  The default is

Re: [fixincludes, v3] Don't define libstdc++-internal macros in Solaris 10+

2016-11-21 Thread Bruce Korb

I missed the patch because the thread got too long.  Also, I trust you
after all these years. :)

On Mon, Nov 21, 2016 at 1:48 AM, Rainer Orth
 wrote:
> Hi Jonathan,
>
>>>Ok for mainline now and the backports after some soak time?
>>
>> Yes, the libstdc++ parts are OK, thanks.
>
> I assume Bruce is ok with the change to the hpux11_fabsf fix given that it
> was suggested by the HP-UX maintainer and fixes fixincludes make check ;-)
>
> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Add sem_item::m_hash_set (PR ipa/78309) (v2)

2016-11-21 Thread Jan Hubicka

> On 11/15/2016 05:46 PM, Jan Hubicka wrote:
> > Yep, zero is definitly valid hash value:0
> > 
> > Patch is OK. We may consider backporting it to release branches.
> > Honza
> 
> Thanks, sending v2 as I found an error in the previous version.
> Changes from last version:
> - comments for ctors are just in header file, not duplicated in ipa-icf.c
> - hash argument has been removed from ctors
> - ctors GNU coding style has been fixed and unified
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin

> >From 5be0ca49cfa67ca848002d6fe008ef4c2885bd40 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Fri, 11 Nov 2016 16:15:20 +0100
> Subject: [PATCH] Add sem_item::m_hash_set (PR ipa/78309)
> 
> gcc/ChangeLog:
> 
> 2016-11-16  Martin Liska  
> 
>   PR ipa/78309
>   * ipa-icf.c (void sem_item::set_hash): Update m_hash_set.
>   (sem_function::get_hash): Use the new field.
>   (sem_function::parse): Remove an argument from ctor.
>   (sem_variable::parse): Likewise.
>   (sem_variable::get_hash): Use the new field.
>   (sem_item_optimizer::read_section): Use new ctor and set hash.
>   * ipa-icf.h: _hash is removed from sem_item::sem_item,
>   sem_variable::sem_variable, sem_function::sem_function.
OK,
thanks!
Honza
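
The point about zero being a valid hash can be sketched in a few lines of C. This is a hypothetical analogue, not the ipa-icf code: struct item, compute_hash, get_hash_sentinel, get_hash_flag and computations_for_two_lookups are invented names, and the hash function is rigged to return 0 so the difference between the two caching schemes shows up.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int hashval_t;

/* Hypothetical stand-in for sem_item: a cached hash, a validity flag,
   and a counter recording how often the hash was recomputed.  */
struct item
{
  hashval_t hash;
  bool hash_set;
  int compute_calls;
};

static void
item_init (struct item *it)
{
  it->hash = 0;
  it->hash_set = false;
  it->compute_calls = 0;
}

/* Pretend hash computation that legitimately yields 0.  */
static hashval_t
compute_hash (struct item *it)
{
  it->compute_calls++;
  return 0;
}

/* Sentinel scheme (as before the patch): hash == 0 means "not computed",
   so a real hash of 0 is never treated as cached.  */
static hashval_t
get_hash_sentinel (struct item *it)
{
  if (!it->hash)
    it->hash = compute_hash (it);
  return it->hash;
}

/* Flag scheme (as in the patch): validity is tracked separately, so any
   value, including 0, is cached after the first call.  */
static hashval_t
get_hash_flag (struct item *it)
{
  if (!it->hash_set)
    {
      it->hash = compute_hash (it);
      it->hash_set = true;
    }
  return it->hash;
}

/* Look the hash up twice on a fresh item through GETTER and return how
   many times it had to be computed.  */
static int
computations_for_two_lookups (hashval_t (*getter) (struct item *))
{
  struct item it;
  item_init (&it);

  hashval_t first = getter (&it);
  hashval_t second = getter (&it);
  assert (first == second);

  return it.compute_calls;
}
```

With the sentinel scheme the two lookups compute the hash twice; with the separate flag they compute it once.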
> ---
>  gcc/ipa-icf.c | 64 
> ---
>  gcc/ipa-icf.h | 17 
>  2 files changed, 35 insertions(+), 46 deletions(-)
> 
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> index 1ab67f3..212e406 100644
> --- a/gcc/ipa-icf.c
> +++ b/gcc/ipa-icf.c
> @@ -131,27 +131,20 @@ symbol_compare_collection::symbol_compare_collection 
> (symtab_node *node)
>  
>  /* Constructor for key value pair, where _ITEM is key and _INDEX is a 
> target.  */
>  
> -sem_usage_pair::sem_usage_pair (sem_item *_item, unsigned int _index):
> -  item (_item), index (_index)
> +sem_usage_pair::sem_usage_pair (sem_item *_item, unsigned int _index)
> +: item (_item), index (_index)
>  {
>  }
>  
> -/* Semantic item constructor for a node of _TYPE, where STACK is used
> -   for bitmap memory allocation.  */
> -
> -sem_item::sem_item (sem_item_type _type,
> - bitmap_obstack *stack): type (_type), m_hash (0)
> +sem_item::sem_item (sem_item_type _type, bitmap_obstack *stack)
> +: type (_type), m_hash (-1), m_hash_set (false)
>  {
>setup (stack);
>  }
>  
> -/* Semantic item constructor for a node of _TYPE, where STACK is used
> -   for bitmap memory allocation. The item is based on symtab node _NODE
> -   with computed _HASH.  */
> -
>  sem_item::sem_item (sem_item_type _type, symtab_node *_node,
> - hashval_t _hash, bitmap_obstack *stack): type(_type),
> -  node (_node), m_hash (_hash)
> + bitmap_obstack *stack)
> +: type (_type), node (_node), m_hash (-1), m_hash_set (false)
>  {
>decl = node->decl;
>setup (stack);
> @@ -230,23 +223,20 @@ sem_item::target_supports_symbol_aliases_p (void)
>  void sem_item::set_hash (hashval_t hash)
>  {
>m_hash = hash;
> +  m_hash_set = true;
>  }
>  
>  /* Semantic function constructor that uses STACK as bitmap memory stack.  */
>  
> -sem_function::sem_function (bitmap_obstack *stack): sem_item (FUNC, stack),
> -  m_checker (NULL), m_compared_func (NULL)
> +sem_function::sem_function (bitmap_obstack *stack)
> +: sem_item (FUNC, stack), m_checker (NULL), m_compared_func (NULL)
>  {
>bb_sizes.create (0);
>bb_sorted.create (0);
>  }
>  
> -/*  Constructor based on callgraph node _NODE with computed hash _HASH.
> -Bitmap STACK is used for memory allocation.  */
> -sem_function::sem_function (cgraph_node *node, hashval_t hash,
> - bitmap_obstack *stack):
> -  sem_item (FUNC, node, hash, stack),
> -  m_checker (NULL), m_compared_func (NULL)
> +sem_function::sem_function (cgraph_node *node, bitmap_obstack *stack)
> +: sem_item (FUNC, node, stack), m_checker (NULL), m_compared_func (NULL)
>  {
>bb_sizes.create (0);
>bb_sorted.create (0);
> @@ -279,7 +269,7 @@ sem_function::get_bb_hash (const sem_bb *basic_block)
>  hashval_t
>  sem_function::get_hash (void)
>  {
> -  if (!m_hash)
> +  if (!m_hash_set)
>  {
>inchash::hash hstate;
>hstate.add_int (177454); /* Random number for function type.  */
> @@ -1704,7 +1694,7 @@ sem_function::parse (cgraph_node *node, bitmap_obstack 
> *stack)
>|| DECL_STATIC_DESTRUCTOR (node->decl))
>  return NULL;
>  
> -  sem_function *f = new sem_function (node, 0, stack);
> +  sem_function *f = new sem_function (node, stack);
>  
>f->init ();
>  
> @@ -1807,19 +1797,12 @@ sem_function::bb_dict_test (vec *bb_dict, int 
> source, int target)
>  return (*bb_dict)[source] == target;
>  }
>  
> -
> -/* Semantic variable constructor that uses STACK as bitmap memory stack.  */
> -
>  sem_variable::sem_variable (bitmap_obstack *stack): sem_item (VAR, stack)
>  {
>  }
>  
> -/*  Constructor base

[C++ Patch/RFC] PR 77545 ("[7 Regression] ICE on valid C++11 code: in potential_constant_expression_1..")

2016-11-21 Thread Paolo Carlini


Hi,

I have been spending some time on this regression, where we ICE in
potential_constant_expression_1 because CLEANUP_STMT is unhandled.
Apparently the ICE started with
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=238559, where we
started calling maybe_constant_value from cp_fully_fold, but it now happens
in the same way with that commit reverted too. The context of the ICE is
the following: from store_init_value we call maybe_constant_init for a
__for_range VAR_DECL as decl and a COMPOUND_EXPR as value:


 
needs-constructing type_4 BLK
size 
unit size 
align 32 symtab 0 alias set -1 canonical type 
0x769d9f18 domain 
pointer_to_this  
reference_to_this >

private unsigned DI
size 
unit size 
align 64 symtab 0 alias set -1 canonical type 0x769de690>
side-effects
arg 0 >
side-effects head 0x769d3e28 tail 0x769d3e40 stmts 
0x769d5e60 0x76855bd0


stmt 0x7687d000 void>

side-effects tree_1
arg 0 0x7687d000 void>

side-effects
arg 0 0x769d93f0 A>

side-effects
arg 0 0x769d93f0 A>

arg 0 
arg 1 > 
arg 1 >

77545.C:10:35 start: 77545.C:10:35 finish: 77545.C:10:35>
77545.C:10:35 start: 77545.C:10:35 finish: 77545.C:10:35>
stmt 0x7687d000 void>

side-effects static tree_1
arg 0 0x7687d000 void>

head (nil) tail (nil) stmts
>
arg 1 0x7687d000 void>

side-effects nothrow
fn 0x769de0a8>
constant arg 0 __comp_dtor >>
arg 0 0x769d95e8>


arg 0 0x769d95e8>

arg 0 >>>
77545.C:10:35 start: 77545.C:10:35 finish: 77545.C:10:35>>
arg 1 

arg 0 
arg 0 >>>

then, obviously, a bit later potential_constant_expression_1 stumbles 
into the CLEANUP_STMT among the statements in the STATEMENT_LIST we pass 
as ARG 0 of the COMPOUND_EXPR.


I have been investigating how we build and handle CLEANUP_STMTs in 
constexpr.c (see in particular the comment at the beginning of 
build_data_member_initialization) and wondering if simply returning true 
for it from potential_constant_expression_1 wouldn't be correct... 
Certainly passes testing.


Thanks for any feedback!

Paolo.

///




Index: cp/constexpr.c
===
--- cp/constexpr.c  (revision 242657)
+++ cp/constexpr.c  (working copy)
@@ -4915,6 +4915,7 @@ potential_constant_expression_1 (tree t, bool want
 case CONTINUE_STMT:
 case REQUIRES_EXPR:
 case STATIC_ASSERT:
+case CLEANUP_STMT:
   return true;
 
 case AGGR_INIT_EXPR:
Index: testsuite/g++.dg/cpp0x/range-for32.C
===
--- testsuite/g++.dg/cpp0x/range-for32.C(revision 0)
+++ testsuite/g++.dg/cpp0x/range-for32.C(working copy)
@@ -0,0 +1,16 @@
+// PR c++/77545
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+template < typename T > struct A
+{ 
+  A ();
+  ~A ();
+  T t;
+};
+
+void f (A < int > a)
+{ 
+  for (auto x : (A < int >[]) { a })
+;
+}

[testsuite,avr,committed] Treat reduced avr tiny devices as "tiny".

2016-11-21 Thread Georg-Johann Lay

Committed this change in order to reduce the FAILs for AVR_TINY from 
~3000 to ~2000.  Rationale is to turn FAILs because of "relocation 
truncated to fit" to UNSUPPORTED.


Johann


gcc/testsuite/
* lib/target-supports.exp (check_effective_target_tiny) [avr]:
Return 1 for AVR_TINY.


Index: lib/target-supports.exp
===
--- lib/target-supports.exp (revision 242665)
+++ lib/target-supports.exp (working copy)
@@ -7864,7 +7864,7 @@ proc check_effective_target_fenv_excepti
 proc check_effective_target_tiny {} {
 global et_target_tiny_saved

-if [info exists et_target_tine_saved] {
+if [info exists et_target_tiny_saved] {
   verbose "check_effective_target_tiny: using cached result" 2
 } else {
set et_target_tiny_saved 0
@@ -7872,6 +7872,10 @@ proc check_effective_target_tiny {} {
  && [check_effective_target_aarch64_tiny] } {
  set et_target_tiny_saved 1
}
+   if { [istarget avr-*-*]
+ && [check_effective_target_avr_tiny] } {
+ set et_target_tiny_saved 1
+   }
 }

 return $et_target_tiny_saved

Re: [PATCH v2][PR libgfortran/78314] Fix ieee_support_halting

2016-11-21 Thread Szabolcs Nagy

On 21/11/16 14:16, FX wrote:
>> it seems this broke ieee_8.f90 which tests compile time vs runtime value of 
>> ieee_support_halting
>> if fortran needs this, then support_halting should be always false on arm 
>> and aarch64.
>> but i'm not familiar enough with fortran to tell if there is some better 
>> workaround.
> 
> Can you XFAIL the test on your platform, open a PR and assign it to me?
> This is a corner of the Fortran standard that is not entirely clear, and 
> there have been some corrigenda on the topic, so I need to review it.
> 

i opened PR 78449

will do the xfail later.

Re: [PATCH, GCC/ARM, ping] Optional -mthumb for Thumb only targets

2016-11-21 Thread Christophe Lyon

On 21 November 2016 at 15:16, Thomas Preudhomme
 wrote:
> On 21/11/16 08:51, Christophe Lyon wrote:
>>
>> Hi Thomas,
>
>
> Hi Christophe,
>
>
>>
>>
>> On 18 November 2016 at 17:51, Thomas Preudhomme
>>  wrote:
>>>
>>> On 11/11/16 14:35, Kyrill Tkachov wrote:



 On 08/11/16 13:36, Thomas Preudhomme wrote:
>
>
> Ping?
>
> Best regards,
>
> Thomas
>
> On 25/10/16 18:07, Thomas Preudhomme wrote:
>>
>>
>> Hi,
>>
>> Currently when a user compiles for a thumb-only target (such as
>> Cortex-M
>> processors) without specifying the -mthumb option GCC throws the error
>> "target
>> CPU does not support ARM mode". This is suboptimal from a usability
>> point of
>> view: the -mthumb could be deduced from the -march or -mcpu option
>> when
>> there is
>> no ambiguity.
>>
>> This patch implements this behavior by extending the DRIVER_SELF_SPECS
>> to
>> automatically append -mthumb to the command line for thumb-only
>> targets.
>> It does
>> so by checking the last -march option if any is given or the last
>> -mcpu
>> option
>> otherwise. There is no ordering issue because conflicting -mcpu and
>> -march is
>> already handled.
>>
>> Note that the logic cannot be implemented in function
>> arm_option_override
>> because we need to provide the modified command line to the GCC driver
>> for
>> finding the right multilib path and the function arm_option_override
>> is
>> executed
>> too late for that effect.
>>
>> ChangeLog entries are as follows:
>>
>> *** gcc/ChangeLog ***
>>
>> 2016-10-18  Terry Guo  
>> Thomas Preud'homme 
>>
>> PR target/64802
>> * common/config/arm/arm-common.c (arm_target_thumb_only): New
>> function.
>> * config/arm/arm-opts.h: Include arm-flags.h.
>> (struct arm_arch_core_flag): Define.
>> (arm_arch_core_flags): Define.
>> * config/arm/arm-protos.h: Include arm-flags.h.
>> (FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M, FL_MODE26, FL_MODE32,
>> FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED, FL_STRONG,
>> FL_ARCH5E,
>> FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF, FL_ARCH6K, FL_THUMB2,
>> FL_NOTM,
>> FL_THUMB_DIV, FL_VFPV3, FL_NEON, FL_ARCH7EM, FL_ARCH7,
>> FL_ARM_DIV,
>> FL_ARCH8, FL_CRC32, FL_SMALLMUL, FL_NO_VOLATILE_CE, FL_IWMMXT,
>> FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1, FL2_ARCH8_2,
>> FL2_FP16INST,
>> FL_TUNE, FL_FOR_ARCH2, FL_FOR_ARCH3, FL_FOR_ARCH3M,
>> FL_FOR_ARCH4,
>> FL_FOR_ARCH4T, FL_FOR_ARCH5, FL_FOR_ARCH5T, FL_FOR_ARCH5E,
>> FL_FOR_ARCH5TE, FL_FOR_ARCH5TEJ, FL_FOR_ARCH6, FL_FOR_ARCH6J,
>> FL_FOR_ARCH6K, FL_FOR_ARCH6Z, FL_FOR_ARCH6ZK, FL_FOR_ARCH6KZ,
>> FL_FOR_ARCH6T2, FL_FOR_ARCH6M, FL_FOR_ARCH7, FL_FOR_ARCH7A,
>> FL_FOR_ARCH7VE, FL_FOR_ARCH7R, FL_FOR_ARCH7M, FL_FOR_ARCH7EM,
>> FL_FOR_ARCH8A, FL2_FOR_ARCH8_1A, FL2_FOR_ARCH8_2A,
>> FL_FOR_ARCH8M_BASE,
>> FL_FOR_ARCH8M_MAIN, arm_feature_set, ARM_FSET_MAKE,
>> ARM_FSET_MAKE_CPU1, ARM_FSET_MAKE_CPU2, ARM_FSET_CPU1,
>> ARM_FSET_CPU2,
>> ARM_FSET_EMPTY, ARM_FSET_ANY, ARM_FSET_HAS_CPU1,
>> ARM_FSET_HAS_CPU2,
>> ARM_FSET_HAS_CPU, ARM_FSET_ADD_CPU1, ARM_FSET_ADD_CPU2,
>> ARM_FSET_DEL_CPU1, ARM_FSET_DEL_CPU2, ARM_FSET_UNION,
>> ARM_FSET_INTER,
>> ARM_FSET_XOR, ARM_FSET_EXCLUDE, ARM_FSET_IS_EMPTY,
>> ARM_FSET_CPU_SUBSET): Move to ...
>> * config/arm/arm-flags.h: This new file.
>> * config/arm/arm.h (TARGET_MODE_SPEC_FUNCTIONS): Define.
>> (EXTRA_SPEC_FUNCTIONS): Add TARGET_MODE_SPEC_FUNCTIONS to its
>> value.
>> (TARGET_MODE_SPECS): Define.
>> (DRIVER_SELF_SPECS): Add TARGET_MODE_SPECS to its value.
>>
>>
>> *** gcc/testsuite/ChangeLog ***
>>
>> 2016-10-11  Thomas Preud'homme 
>>
>> PR target/64802
>> * gcc.target/arm/optional_thumb-1.c: New test.
>> * gcc.target/arm/optional_thumb-2.c: New test.
>> * gcc.target/arm/optional_thumb-3.c: New test.
>>
>>
>> No regression when running the testsuite for -mcpu=cortex-m0 -mthumb,
>> -mcpu=cortex-m0 -marm and -mcpu=cortex-a8 -marm
>>
>> Is this ok for trunk?
>>

 This looks like a useful usability improvement.
 This is ok after a bootstrap on an arm-none-linux-gnueabihf target.

 Sorry for the delay,
 Kyrill
>>>
>>>
>>>
>>> I've rebased the patch on top of the arm_feature_set type consistency fix
>>> [1] and committed it. The committed patch is in attachment for reference.
>>>
>>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01680.html
>>>
>>
>> Since this co

Re: [PATCH v3] bb-reorder: Improve compgotos pass (PR71785)

2016-11-21 Thread Segher Boessenkool

On Fri, Nov 18, 2016 at 06:46:21AM -0600, Segher Boessenkool wrote:
> 2016-11-18  Segher Boessenkool  
> 
> gcc/testsuite/
>   PR rtl-optimization/71785
>   * gcc.target/powerpc/pr71785.c: New file.

I have committed this now.


Segher

Re: [PATCH v2][PR libgfortran/78314] Fix ieee_support_halting

2016-11-21 Thread FX

Dear Nagy,

> it seems this broke ieee_8.f90 which tests compile time vs runtime value of 
> ieee_support_halting
> if fortran needs this, then support_halting should be always false on arm and 
> aarch64.
> but i'm not familiar enough with fortran to tell if there is some better 
> workaround.

Can you XFAIL the test on your platform, open a PR and assign it to me?
This is a corner of the Fortran standard that is not entirely clear, and there 
have been some corrigenda on the topic, so I need to review it.

Thanks,
FX

Re: [PATCH, GCC/ARM, ping] Optional -mthumb for Thumb only targets

2016-11-21 Thread Thomas Preudhomme


On 21/11/16 08:51, Christophe Lyon wrote:

Hi Thomas,


Hi Christophe,




On 18 November 2016 at 17:51, Thomas Preudhomme
 wrote:

On 11/11/16 14:35, Kyrill Tkachov wrote:



On 08/11/16 13:36, Thomas Preudhomme wrote:


Ping?

Best regards,

Thomas

On 25/10/16 18:07, Thomas Preudhomme wrote:


Hi,

Currently when a user compiles for a thumb-only target (such as Cortex-M
processors) without specifying the -mthumb option GCC throws the error
"target
CPU does not support ARM mode". This is suboptimal from a usability
point of
view: the -mthumb could be deduced from the -march or -mcpu option when
there is
no ambiguity.

This patch implements this behavior by extending the DRIVER_SELF_SPECS
to
automatically append -mthumb to the command line for thumb-only targets.
It does
so by checking the last -march option if any is given or the last -mcpu
option
otherwise. There is no ordering issue because conflicting -mcpu and
-march is
already handled.

Note that the logic cannot be implemented in function
arm_option_override
because we need to provide the modified command line to the GCC driver
for
finding the right multilib path and the function arm_option_override is
executed
too late for that effect.
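
The selection rule described above (use the last -march if one is given, otherwise the last -mcpu) can be sketched as follows. This is an illustration only, not the arm_target_thumb_only spec function from the patch: the helper names and the short table of Thumb-only names are assumptions made for this example, and the table is deliberately incomplete.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative, incomplete table of Thumb-only architecture and CPU
   names (M-profile); the real driver derives this from feature flags.  */
static const char *const thumb_only_names[] = {
  "armv6-m", "armv7-m", "armv7e-m",
  "cortex-m0", "cortex-m3", "cortex-m4",
};

static bool
is_thumb_only (const char *name)
{
  size_t n = sizeof thumb_only_names / sizeof thumb_only_names[0];

  for (size_t i = 0; i < n; i++)
    if (strcmp (name, thumb_only_names[i]) == 0)
      return true;

  return false;
}

/* Return true if -mthumb could be implied by the command line: the last
   -march= value decides if any was given, otherwise the last -mcpu=.  */
static bool
implies_thumb (int argc, const char *const *argv)
{
  const char *march = NULL;
  const char *mcpu = NULL;

  assert (argc == 0 || argv != NULL);

  for (int i = 0; i < argc; i++)
    {
      if (strncmp (argv[i], "-march=", 7) == 0)
        march = argv[i] + 7;
      else if (strncmp (argv[i], "-mcpu=", 6) == 0)
        mcpu = argv[i] + 6;
    }

  if (march)
    return is_thumb_only (march);
  if (mcpu)
    return is_thumb_only (mcpu);
  return false;
}
```

For example, a command line containing only -mcpu=cortex-m0 would imply -mthumb in this sketch, while -mcpu=cortex-m3 -march=armv7-a would not, because the -march value takes priority.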

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2016-10-18  Terry Guo  
Thomas Preud'homme 

PR target/64802
* common/config/arm/arm-common.c (arm_target_thumb_only): New
function.
* config/arm/arm-opts.h: Include arm-flags.h.
(struct arm_arch_core_flag): Define.
(arm_arch_core_flags): Define.
* config/arm/arm-protos.h: Include arm-flags.h.
(FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M, FL_MODE26, FL_MODE32,
FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED, FL_STRONG, FL_ARCH5E,
FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF, FL_ARCH6K, FL_THUMB2,
FL_NOTM,
FL_THUMB_DIV, FL_VFPV3, FL_NEON, FL_ARCH7EM, FL_ARCH7,
FL_ARM_DIV,
FL_ARCH8, FL_CRC32, FL_SMALLMUL, FL_NO_VOLATILE_CE, FL_IWMMXT,
FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1, FL2_ARCH8_2, FL2_FP16INST,
FL_TUNE, FL_FOR_ARCH2, FL_FOR_ARCH3, FL_FOR_ARCH3M,
FL_FOR_ARCH4,
FL_FOR_ARCH4T, FL_FOR_ARCH5, FL_FOR_ARCH5T, FL_FOR_ARCH5E,
FL_FOR_ARCH5TE, FL_FOR_ARCH5TEJ, FL_FOR_ARCH6, FL_FOR_ARCH6J,
FL_FOR_ARCH6K, FL_FOR_ARCH6Z, FL_FOR_ARCH6ZK, FL_FOR_ARCH6KZ,
FL_FOR_ARCH6T2, FL_FOR_ARCH6M, FL_FOR_ARCH7, FL_FOR_ARCH7A,
FL_FOR_ARCH7VE, FL_FOR_ARCH7R, FL_FOR_ARCH7M, FL_FOR_ARCH7EM,
FL_FOR_ARCH8A, FL2_FOR_ARCH8_1A, FL2_FOR_ARCH8_2A,
FL_FOR_ARCH8M_BASE,
FL_FOR_ARCH8M_MAIN, arm_feature_set, ARM_FSET_MAKE,
ARM_FSET_MAKE_CPU1, ARM_FSET_MAKE_CPU2, ARM_FSET_CPU1,
ARM_FSET_CPU2,
ARM_FSET_EMPTY, ARM_FSET_ANY, ARM_FSET_HAS_CPU1,
ARM_FSET_HAS_CPU2,
ARM_FSET_HAS_CPU, ARM_FSET_ADD_CPU1, ARM_FSET_ADD_CPU2,
ARM_FSET_DEL_CPU1, ARM_FSET_DEL_CPU2, ARM_FSET_UNION,
ARM_FSET_INTER,
ARM_FSET_XOR, ARM_FSET_EXCLUDE, ARM_FSET_IS_EMPTY,
ARM_FSET_CPU_SUBSET): Move to ...
* config/arm/arm-flags.h: This new file.
* config/arm/arm.h (TARGET_MODE_SPEC_FUNCTIONS): Define.
(EXTRA_SPEC_FUNCTIONS): Add TARGET_MODE_SPEC_FUNCTIONS to its
value.
(TARGET_MODE_SPECS): Define.
(DRIVER_SELF_SPECS): Add TARGET_MODE_SPECS to its value.


*** gcc/testsuite/ChangeLog ***

2016-10-11  Thomas Preud'homme 

PR target/64802
* gcc.target/arm/optional_thumb-1.c: New test.
* gcc.target/arm/optional_thumb-2.c: New test.
* gcc.target/arm/optional_thumb-3.c: New test.


No regression when running the testsuite for -mcpu=cortex-m0 -mthumb,
-mcpu=cortex-m0 -marm and -mcpu=cortex-a8 -marm.

Is this ok for trunk?



This looks like a useful usability improvement.
This is ok after a bootstrap on an arm-none-linux-gnueabihf target.

Sorry for the delay,
Kyrill



I've rebased the patch on top of the arm_feature_set type consistency fix
[1] and committed it. The committed patch is in attachment for reference.

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01680.html



Since this commit (242597), I've noticed that:
- the 2 new tests optional_thumb-1.c and optional_thumb-2.c fail
if GCC was configured --with-mode=arm. The error message is:
cc1: error: target CPU does not support ARM mode


I need to skip the test when GCC is built with --with-mode= but we do not have a 
directive for that. I'll see if I can add one.




- on armeb --with-mode=arm, gcc.dg/vect/pr64252.c fails at execution

See: 
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242597/report-build-info.html


I cannot reproduce that last issue. I've built r242658 for armeb-none-eabi and 
it passes fine in qemu-armeb. I'll try building a linux toolchain.


Best regards,

Thomas

Re: [www-patch] Document new -Wshadow= variants in gcc-7/changes.html

2016-11-21 Thread Gerald Pfeifer

On Mon, 21 Nov 2016, Mark Wielaard wrote:
> If this just isn't something that should be documented in changes.html
> please let me know and I'll stop pinging.

Not at all!  Apologies for not getting back to you earlier, I 
am simply swamped with (non-GCC) stuff right now, and missed
this on my weekend.

This is great and important to have documented, let me just add
an observation or two:

Index: htdocs/gcc-7/changes.html
===
+The -Wshadow warning has been split into 3

I believe for small numbers one usually spells them out ("three"
instead of "3").

+type is compatible (in C++ compatible means that the type of the
+shadowing variable can be converted to that of the shadowed variable).
+
+The following example shows the different kinds of shadow
+warnings:

Take care, what looks like a paragraph between the two above will
just show up as a blank when rendered as HTML.  If you want to
retain the paragraph, use 

Kudos for coming up with such a short, innocent-looking example! :-)

Gerald

Re: [www-patch] Document new -Wshadow= variants in gcc-7/changes.html

2016-11-21 Thread Mark Wielaard

On Sun, 2016-11-13 at 18:45 +0100, Mark Wielaard wrote:
> On Sat, Nov 05, 2016 at 10:50:57PM +0100, Mark Wielaard wrote:
> > The attached patch adds an explanation of the new
> > -Wshadow=(global|local|compatible-local) to gcc-7/changes.html.
> > 
> > OK to commit?
> 
> Ping?

If this just isn't something that should be documented in changes.html
please let me know and I'll stop pinging.

> > Index: htdocs/gcc-7/changes.html
> > ===
> > RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
> > retrieving revision 1.21
> > diff -u -r1.21 changes.html
> > --- htdocs/gcc-7/changes.html   26 Oct 2016 19:08:10 -  1.21
> > +++ htdocs/gcc-7/changes.html   5 Nov 2016 20:41:35 -
> > @@ -119,6 +119,60 @@
> >~^  
> >   ~
> >%d
> >  
> > +
> > +The -Wshadow warning has been split into 3
> > +variants. -Wshadow=global warns for any shadowing.  This
> > +is the default when using -Wshadow without any
> > +argument.  -Wshadow=local only warns for a local variable
> > +shadowing another local variable or
> > +parameter. -Wshadow=compatible-local only warns for a
> > +local variable shadowing another local variable or parameter whose
> > +type is compatible (in C++ compatible means that the type of the
> > +shadowing variable can be converted to that of the shadowed variable).
> > +
> > +The following example shows the different kinds of shadow
> > +warnings:
> > +enum operation { add, count };
> > +struct container { int nr; };
> > +
> > +int
> > +container_count (struct container c, int count)
> > +{
> > +  int r = 0;
> > +  for (int count = 0; count > 0; count--)
> > +{
> > +  struct container count = c;
> > +  r += count.nr;
> > +}
> > +  return r;
> > +}
> > +
> > +-Wshadow=compatible-local will warn for the parameter being
> > +shadowed with the same type:
> > +warn-test.c:8:12: warning: declaration of 'count' shadows a parameter [-Wshadow=compatible-local]
> > +   for (int count = 0; count > 0; count--)
> > +^
> > +warn-test.c:5:42: note: shadowed declaration is here
> > + container_count (struct container c, int count)
> > +   ^
> > +
> > +-Wshadow=local will warn for the above and for the shadowed
> > +declaration with incompatible type:
> > +warn-test.c:10:24: warning: declaration of 'count' shadows a previous local [-Wshadow=local]
> > +   struct container count = c;
> > +^
> > +warn-test.c:8:12: note: shadowed declaration is here
> > +   for (int count = 0; count > 0; count--)
> > +^
> > +
> > +-Wshadow=global will warn for all of the above and the shadowing
> > +of the global declaration:
> > +warn-test.c:5:42: warning: declaration of 'count' shadows a global declaration [-Wshadow]
> > + container_count (struct container c, int count)
> > +   ^
> > +warn-test.c:1:23: note: shadowed declaration is here
> > + enum operation { add, count };
> > +^
> >  
> >  
> >  C
>

[PATCH, testsuite] MIPS: Add isa>=2 option to interrupt_handler-bug-1.c.

2016-11-21 Thread Toma Tabacu

Hi,

Currently, the interrupt_handler-bug-1.c test will fail on pre-R2 targets
because the "interrupt" function attribute requires at least an R2 target and
the test does not enforce this requirement.

This patch fixes this by adding the isa_rev>=2 option to the test's dg-options.

Tested with mips-mti-elf.

Regards,
Toma Tabacu

gcc/testsuite/ChangeLog:

2016-11-21  Toma Tabacu  

* gcc.target/mips/interrupt_handler-bug-1.c (dg-options): Add
isa_rev>=2.

diff --git a/gcc/testsuite/gcc.target/mips/interrupt_handler-bug-1.c b/gcc/testsuite/gcc.target/mips/interrupt_handler-bug-1.c
index 2784705..083e152 100644
--- a/gcc/testsuite/gcc.target/mips/interrupt_handler-bug-1.c
+++ b/gcc/testsuite/gcc.target/mips/interrupt_handler-bug-1.c
@@ -1,4 +1,4 @@
-/* { dg-options "-funroll-loops" } */
+/* { dg-options "-funroll-loops isa_rev>=2" } */
 int foo;
 int bar;

Re: [PATCH] Delete GCJ

2016-11-21 Thread Matthias Klose

On 21.11.2016 11:23, Iain Sandoe wrote:
> 
>> On 20 Nov 2016, at 20:42, Matthias Klose  wrote:
>>
>> On 10.10.2016 09:58, Iain Sandoe wrote:
>>>
> 
>>> The point here was to simplify the dependent configury so that it only 
>>> needs to test something that the configuring user specifies (i.e. if they 
>>> specify objc-gc, then they need also to specify the place that the gc lib 
>>> can be found).
>>
>> So here is the next proposal, I hope the added documentation in install.texi
>> makes the usage clear.
> 
> thanks for working on this!
> 
>>
>> 
>>
>> 2016-11-19  Matthias Klose  
>>
>>  * Makefile.def: Remove reference to boehm-gc target module.
>>  * configure.ac: Include pkg.m4, check for --with-target-bdw-gc
>>  options and for the bdw-gc pkg-config module.
>>  * configure: Regenerate.
>>  * Makefile.in: Regenerate.
> 
> 
> +AC_ARG_ENABLE(objc-gc,
> +[AS_HELP_STRING([--enable-objc-gc],
> + [enable use of Boehm's garbage collector with the
> +  GNU Objective-C runtime])])
> +AC_ARG_WITH([target-bdw-gc],
> +[AS_HELP_STRING([--with-target-bdw-gc=PATHLIST],
> + [specify prefix directory for installed bdw-gc package.
> +  Equivalent to --with-bdw-gc-include=PATH/include
> +  plus --with-bdw-gc-lib=PATH/lib])])
> 
> missing “target” in the --with-bdw-gc-*

thanks, fixed.

>> gcc/
>>
>> 2016-11-19  Matthias Klose  
>>
>>  * doc/install.texi: Document configure options --enable-objc-gc
>>  and --with-target-bdw-gc.
> 
> As per Sandra’s comment,  should we understand the priority of options is
> 
> --with-target-bdw-gc-*
> 
> which overrides…
> 
> --with-target-bdw-gc=
> 
> which overrides automatic discovery using pkg_config?

--with-target-bdw-gc=/opt/bdw-gc,32=/opt/bdw-gc32

sets the include and lib dirs by appending include and lib to the paths. If you
have options --with-target-bdw-gc-include= and --with-target-bdw-gc-lib= as
well, it overrides the settings done in --with-target-bdw-gc=. This is copied
from the setting of the gmp/mpfr options.

Any of these options override the automatic discovery using pkg-config.

Please suggest a better wording; I thought that was clear enough (and better
than the undocumented --enable-libobjc-gc anyway ;)

Matthias

Re: Ping: Re: [PATCH 1/2] gcc: Remove unneeded global flag.

2016-11-21 Thread Christophe Lyon

On 20 November 2016 at 18:27, Mike Stump  wrote:
> On Nov 19, 2016, at 1:59 PM, Andrew Burgess  
> wrote:
>>> So, your new test fails on arm* targets:
>>
>> After a little digging I think the problem might be that
>> -freorder-blocks-and-partition is not supported on arm.
>>
>> This should be detected as the new tests include:
>>
>>/* { dg-require-effective-target freorder } */
>>
>> however this test passed on arm as -freorder-blocks-and-partition does
>> not issue any warning unless -fprofile-use is also passed.
>>
>> The patch below extends check_effective_target_freorder to check using
>> -fprofile-use.  With this change in place the tests are skipped on
>> arm.
>
>> All feedback welcome,
>
> Seems reasonable, unless a -freorder-blocks-and-partition/-fprofile-use 
> person thinks this is the wrong solution.
>

Hi,

As promised, I tested this patch: it makes
gcc.dg/tree-prof/section-attr-[123].c
unsupported on arm*, and thus they are not failing anymore :-)

However, it also makes other tests unsupported, while they used to pass:

  gcc.dg/pr33648.c
  gcc.dg/pr46685.c
  gcc.dg/tree-prof/20041218-1.c
  gcc.dg/tree-prof/bb-reorg.c
  gcc.dg/tree-prof/cold_partition_label.c
  gcc.dg/tree-prof/comp-goto-1.c
  gcc.dg/tree-prof/pr34999.c
  gcc.dg/tree-prof/pr45354.c
  gcc.dg/tree-prof/pr50907.c
  gcc.dg/tree-prof/pr52027.c
  gcc.dg/tree-prof/va-arg-pack-1.c

and failures are now unsupported:
  gcc.dg/tree-prof/cold_partition_label.c
  gcc.dg/tree-prof/section-attr-1.c
  gcc.dg/tree-prof/section-attr-2.c
  gcc.dg/tree-prof/section-attr-3.c

So, maybe this patch is too strong?

Christophe

Re: [PATCH v2] Support ASan ODR indicators at compiler side.

2016-11-21 Thread Maxim Ostapenko


On 21/11/16 15:17, Jakub Jelinek wrote:

On Mon, Nov 21, 2016 at 12:09:30PM +, Yuri Gribov wrote:

On Mon, Nov 21, 2016 at 11:50 AM, Jakub Jelinek  wrote:

On Mon, Nov 21, 2016 at 11:43:56AM +, Yuri Gribov wrote:

This is just weird.  DECL_NAME in theory could be NULL, or can be a symbol
much longer than 100 bytes, at which point you have strlen (tmp_name) == 99
and ASM_GENERATE_INTERNAL_LABEL will just misbehave.
I fail to see why you need tmp_name at all, I'd go just with
   char sym_name[40];
   ASM_GENERATE_INTERNAL_LABEL (sym_name, "LASANODR", ++lasan_odr_ind_cnt);
or so.

Given that feature is quite new and hasn't been tested too much (it's
off by default in Clang), having a descriptive name may aid with
debugging bug reports.

What would those symbols help with in debugging bug reports?
You need to have a source reproducer anyway, then anybody can try it
himself.

Well, in case of some weird symbol error at startup we can at least
understand which library/symbol to blame.


Even from just pure *.s file, you can look up where the
.LASANODR1234 is used and from there find the corresponding symbol.
Plus, as I said, with 95 chars or longer symbols (approx.) you get a buffer
overflow.  We don't use descriptive symbols in other internal asan, dwarf2
etc. labels.

Note that indicators need to have default visibility so simple scheme
like this will cause runtime collisions.

But then why do you use ASM_GENERATE_INTERNAL_LABEL?  That is for internal
labels.  If the indicators are visible outside of TUs, then their mangling
is an ABI thing.  In that case you shouldn't add any kind of counter
to them, but you should use something like __asan_odr. or something
similar, and if . is not supported in symbol names, use $ instead and if
neither, then just disable the odr indicators.

Or how exactly are these odr indicators supposed to work?


ODR indicators act as visible "delegates" of globals protected by ASan
(of their private aliases, actually). We introduce them to catch cross-DSO
symbol clashes at runtime.
Of course, some people intentionally use ELF interposition, and ASan
would raise a false alarm there, but a) we can use suppressions to
avoid such false alarms and b) I believe that in most cases such an alarm
would indicate a real bug in the user's code.


-Maxim



Jakub

Re: [PATCH] Do not simplify "(and (reg) (const bit))" to if_then_else.

2016-11-21 Thread Dominik Vogt

On Fri, Nov 11, 2016 at 12:10:28PM +0100, Dominik Vogt wrote:
> On Mon, Nov 07, 2016 at 09:29:26PM +0100, Bernd Schmidt wrote:
> > On 10/31/2016 08:56 PM, Dominik Vogt wrote:
> > 
> > >combine_simplify_rtx() tries to replace rtx expressions with just two
> > >possible values with an expression that uses if_then_else:
> > >
> > >  (if_then_else (condition) (value1) (value2))
> > >
> > >If the original expression is e.g.
> > >
> > >  (and (reg) (const_int 2))
> > 
> > I'm not convinced that if_then_else_cond is the right place to do
> > this. That function is designed to answer the question of whether an
> > rtx has exactly one of two values and under which condition; I feel
> > it should continue to work this way.
> > 
> > Maybe simplify_ternary_expression needs to be taught to deal with this case?
> 
> But simplify_ternary_expression isn't called with the following
> test program (only tried it on s390x):
> 
>   void bar(int, int); 
>   int foo(int a, int *b) 
>   { 
> if (a) 
>   bar(0, *b & 2); 
> return *b; 
>   } 
> 
> combine_simplify_rtx() is called with 
> 
>   (sign_extend:DI (and:SI (reg:SI 61) (const_int 2)))
> 
> In the switch it calls simplify_unary_operation(), which returns
> NULL.  The next thing it does is call if_then_else_cond(), and
> that calls itself with the sign_extend peeled off:
> 
>   (and:SI (reg:SI 61) (const_int 2))
> 
> takes the "BINARY_P (x)" path and returns false.  The problem
> exists only if the (and ...) is wrapped in ..._extend, i.e. the
> condition dealing with (and ...) directly can be removed from the
> patch.
> 
> So, all recursive calls to if_then_else_cond() return false, and
> finally the condition in
> 
> else if (HWI_COMPUTABLE_MODE_P (mode) 
>&& pow2p_hwi (nz = nonzero_bits (x, mode))
> 
> is true.
> 
> Thus, if if_then_else_cond should remain unchanged, the only place
> to fix this would be after the call to if_then_else_cond() in
> combine_simplify_rtx().  Actually, there already is some special
> case handling to override the return code of if_then_else_cond():
> 
>   cond = if_then_else_cond (x, &true_rtx, &false_rtx); 
>   if (cond != 0 
>   /* If everything is a comparison, what we have is highly unlikely 
>  to be simpler, so don't use it.  */ 
> --->  && ! (COMPARISON_P (x) 
> && (COMPARISON_P (true_rtx) || COMPARISON_P (false_rtx))))
> { 
>   rtx cop1 = const0_rtx; 
>   enum rtx_code cond_code = simplify_comparison (NE, &cond, &cop1); 
>  
> --->  if (cond_code == NE && COMPARISON_P (cond)) 
> return x; 
>   ...
> 
> Should be easy to duplicate the test in the if-body, if that is
> what you prefer:
> 
>   ...
>   if (HWI_COMPUTABLE_MODE_P (GET_MODE (x)) 
>   && pow2p_hwi (nz = nonzero_bits (x, GET_MODE (x))) 
>   && ! ((code == SIGN_EXTEND || code == ZERO_EXTEND) 
> && GET_CODE (XEXP (x, 0)) == AND 
> && CONST_INT_P (XEXP (XEXP (x, 0), 0)) 
> && UINTVAL (XEXP (XEXP (x, 0), 0)) == nz)) 
> return x; 
> 
> (untested)

Updated and tested version of the patch attached.  The extra logic
is now in combine_simplify_rtx.
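
As an aside for readers following the thread, here is a minimal plain-C sketch of why the rewrite is not a simplification. It is not GCC code: is_single_bit is a hypothetical stand-in for the pow2p_hwi test quoted above, and and_form / ite_form are illustrative names for the two shapes of the expression, assuming the mask 2 (bit 1, counted from the least significant bit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical analogue of the pow2p_hwi check: true iff exactly one
   bit of X is set.  */
static bool
is_single_bit (uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* The original shape: (and (reg) (const_int 2)).  */
static uint32_t
and_form (uint32_t reg)
{
  return reg & 2u;
}

/* The if_then_else shape: extract bit 1, compare it against zero,
   then select between the two possible values 2 and 0.  */
static uint32_t
ite_form (uint32_t reg)
{
  return ((reg >> 1) & 1u) != 0 ? 2u : 0u;
}
```

Both functions compute the same value for every input, but the second needs an extract, a comparison and a selection where the first needs a single AND, which is the extra complexity the patch tries to avoid.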

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* combine.c (combine_simplify_rtx):  Suppress replacement of
"(and (reg) (const_int bit))" with "if_then_else".
From 2ebe692928b4ebee3fa6dc02136980801a04b33d Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Mon, 31 Oct 2016 09:00:31 +0100
Subject: [PATCH] Do not simplify "(and (reg) (const bit))" to if_then_else.

combine_simplify_rtx() tries to replace rtx expressions with just two
possible values with an expression that uses if_then_else:

  (if_then_else (condition) (value1) (value2))

If the original expression is e.g.

  (and (reg) (const_int 2))

where the constant is the mask for a single bit, the replacement results
in a more complex expression than before:

  (if_then_else (ne (zero_extract (reg) (1) (31))) (2) (0))

Similar replacements are done for

  (signextend (and ...))
  (zeroextend (and ...))

Suppress the replacement for this special case in if_then_else_cond().
---
 gcc/combine.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index b22a274..457fe8a 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5575,10 +5575,23 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
{
  rtx cop1 = const0_rtx;
  enum rtx_code cond_code = simplify_comparison (NE, &cond, &cop1);
+ unsigned HOST_WIDE_INT nz;
 
  if (cond_code == NE && COMPARISON_P (cond))
return x;
 
+ /* If the operation is an AND wrapped in a SIGN_EXTEND or ZERO_EXTEND
+with either operand being just a constant single bit value, do
+nothing since IF_THEN_ELSE is likely to increase the expression's
+com

Re: Handle sibcalls with aggregate returns

2016-11-21 Thread Richard Biener

On Mon, Nov 21, 2016 at 11:34 AM, Richard Sandiford
 wrote:
> We treated this g as a sibling call to f:
>
>   int f (int);
>   int g (void) { return f (1); }
>
> but not this one:
>
>   struct s { int i; };
>   struct s f (int);
>   struct s g (void) { return f (1); }
>
> We treated them both as sibcalls on x86 before the first patch for PR36326,
> so I suppose this is a regression of sorts from 4.3.
>
> The patch allows function returns to be local aggregate variables as well
> as gimple registers.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Ok.

Richard.

> Thanks,
> Richard
>
>
> gcc/
> * tree-tailcall.c (process_assignment): Simplify the check for
> a valid copy, allowing the source to be a local variable as
> well as an SSA name.
> (find_tail_calls): Allow copies between local variables to follow
> the call.  Allow the result to be stored in any local variable,
> even if it's an aggregate.
> (eliminate_tail_call): Check whether the result is an SSA name
> before updating its SSA_NAME_DEF_STMT.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/tailcall-7.c: New test.
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/tailcall-7.c b/gcc/testsuite/gcc.dg/tree-ssa/tailcall-7.c
> new file mode 100644
> index 000..eabf1a8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/tailcall-7.c
> @@ -0,0 +1,89 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-tailc-details" } */
> +
> +struct s { int x; };
> +struct s f (int);
> +struct s global;
> +void callit (void (*) (void));
> +
> +/* Tail call.  */
> +void
> +g1 (void)
> +{
> +  f (1);
> +}
> +
> +/* Not a tail call.  */
> +void
> +g2 (void)
> +{
> +  global = f (2);
> +}
> +
> +/* Not a tail call.  */
> +void
> +g3 (struct s *ptr)
> +{
> +  *ptr = f (3);
> +}
> +
> +/* Tail call.  */
> +struct s
> +g4 (struct s param)
> +{
> +  param = f (4);
> +  return param;
> +}
> +
> +/* Tail call.  */
> +struct s
> +g5 (void)
> +{
> +  struct s local = f (5);
> +  return local;
> +}
> +
> +/* Tail call.  */
> +struct s
> +g6 (void)
> +{
> +  return f (6);
> +}
> +
> +/* Not a tail call.  */
> +struct s
> +g7 (void)
> +{
> +  struct s local = f (7);
> +  global = local;
> +  return local;
> +}
> +
> +/* Not a tail call.  */
> +struct s
> +g8 (struct s *ptr)
> +{
> +  struct s local = f (8);
> +  *ptr = local;
> +  return local;
> +}
> +
> +/* Not a tail call.  */
> +int
> +g9 (struct s param)
> +{
> +  void inner (void) { param = f (9); }
> +  callit (inner);
> +  return 9;
> +}
> +
> +/* Tail call.  */
> +int
> +g10 (int param)
> +{
> +  void inner (void) { f (param); }
> +  callit (inner);
> +  return 10;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Found tail call" 5 "tailc" } } */
> diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
> index 0436f0f..f97541d 100644
> --- a/gcc/tree-tailcall.c
> +++ b/gcc/tree-tailcall.c
> @@ -269,7 +269,7 @@ process_assignment (gassign *stmt, gimple_stmt_iterator call, tree *m,
>   conversions that can never produce extra code between the function
>   call and the function return.  */
>if ((rhs_class == GIMPLE_SINGLE_RHS || gimple_assign_cast_p (stmt))
> -  && (TREE_CODE (src_var) == SSA_NAME))
> +  && src_var == *ass_var)
>  {
>/* Reject a tailcall if the type conversion might need
>  additional code.  */
> @@ -287,9 +287,6 @@ process_assignment (gassign *stmt, gimple_stmt_iterator call, tree *m,
> return false;
> }
>
> -  if (src_var != *ass_var)
> -   return false;
> -
>*ass_var = dest;
>return true;
>  }
> @@ -428,6 +425,13 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>   break;
> }
>
> +  /* Allow simple copies between local variables, even if they're
> +aggregates.  */
> +  if (is_gimple_assign (stmt)
> + && auto_var_in_fn_p (gimple_assign_lhs (stmt), cfun->decl)
> + && auto_var_in_fn_p (gimple_assign_rhs1 (stmt), cfun->decl))
> +   continue;
> +
>/* If the statement references memory or volatile operands, fail.  */
>if (gimple_references_memory_p (stmt)
>   || gimple_has_volatile_ops (stmt))
> @@ -444,18 +448,20 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>return;
>  }
>
> -  /* If the LHS of our call is not just a simple register, we can't
> - transform this into a tail or sibling call.  This situation happens,
> - in (e.g.) "*p = foo()" where foo returns a struct.  In this case
> - we won't have a temporary here, but we need to carry out the side
> - effect anyway, so tailcall is impossible.
> +  /* If the LHS of our call is not just a simple register or local
> + variable, we can't transform this into a tail or sibling call.
> + This situation happens, in (e.g.) "*p = foo()" where foo returns a
> + struct.  In this case we won't have a temporary here, but we need
> + to

Re: [PATCH v2] Support ASan ODR indicators at compiler side.

2016-11-21 Thread Maxim Ostapenko


On 21/11/16 15:17, Jakub Jelinek wrote:

On Mon, Nov 21, 2016 at 12:09:30PM +, Yuri Gribov wrote:

On Mon, Nov 21, 2016 at 11:50 AM, Jakub Jelinek  wrote:

On Mon, Nov 21, 2016 at 11:43:56AM +, Yuri Gribov wrote:

This is just weird.  DECL_NAME in theory could be NULL, or can be a symbol
much longer than 100 bytes, at which point you have strlen (tmp_name) == 99
and ASM_GENERATE_INTERNAL_LABEL will just misbehave.
I fail to see why you need tmp_name at all, I'd go just with
   char sym_name[40];
   ASM_GENERATE_INTERNAL_LABEL (sym_name, "LASANODR", ++lasan_odr_ind_cnt);
or so.

Given that feature is quite new and hasn't been tested too much (it's
off by default in Clang), having a descriptive name may aid with
debugging bug reports.

What would those symbols help with in debugging bug reports?
You need to have a source reproducer anyway, then anybody can try it
himself.

Well, in case of some weird symbol error at startup we can at least
understand which library/symbol to blame.


Even from just pure *.s file, you can look up where the
.LASANODR1234 is used and from there find the corresponding symbol.
Plus, as I said, with 95 chars or longer symbols (approx.) you get a buffer
overflow.  We don't use descriptive symbols in other internal asan, dwarf2
etc. labels.

Note that indicators need to have default visibility so simple scheme
like this will cause runtime collisions.

But then why do you use ASM_GENERATE_INTERNAL_LABEL?  That is for internal
labels.  If the indicators are visible outside of TUs, then their mangling
is an ABI thing.  In that case you shouldn't add any kind of counter
to them, but you should use something like __asan_odr. or something
similar, and if . is not supported in symbol names, use $ instead and if
neither, then just disable the odr indicators.

Or how exactly are these odr indicators supposed to work?


Yes, you just caught an error, __asan_odr. is the right thing to do
here. I'm sorry about this.




Jakub

Re: [PATCH v2] Support ASan ODR indicators at compiler side.

2016-11-21 Thread Jakub Jelinek

On Mon, Nov 21, 2016 at 12:09:30PM +, Yuri Gribov wrote:
> On Mon, Nov 21, 2016 at 11:50 AM, Jakub Jelinek  wrote:
> > On Mon, Nov 21, 2016 at 11:43:56AM +, Yuri Gribov wrote:
> >> > This is just weird.  DECL_NAME in theory could be NULL, or can be a symbol
> >> > much longer than 100 bytes, at which point you have strlen (tmp_name) == 99
> >> > and ASM_GENERATE_INTERNAL_LABEL will just misbehave.
> >> > I fail to see why you need tmp_name at all, I'd go just with
> >> >   char sym_name[40];
> >> >   ASM_GENERATE_INTERNAL_LABEL (sym_name, "LASANODR", ++lasan_odr_ind_cnt);
> >> > or so.
> >>
> >> Given that feature is quite new and hasn't been tested too much (it's
> >> off by default in Clang), having a descriptive name may aid with
> >> debugging bug reports.
> >
> > What would those symbols help with in debugging bug reports?
> > You need to have a source reproducer anyway, then anybody can try it
> > himself.
> 
> Well, in case of some weird symbol error at startup we can at least
> understand which library/symbol to blame.
> 
> > Even from just pure *.s file, you can look up where the
> > .LASANODR1234 is used and from there find the corresponding symbol.
> > Plus, as I said, with 95 chars or longer symbols (approx.) you get a buffer
> > overflow.  We don't use descriptive symbols in other internal asan, dwarf2
> > etc. labels.
> 
> Note that indicators need to have default visibility so simple scheme
> like this will cause runtime collisions.

But then why do you use ASM_GENERATE_INTERNAL_LABEL?  That is for internal
labels.  If the indicators are visible outside of TUs, then their mangling
is an ABI thing.  In that case you shouldn't add any kind of counter
to them, but you should use something like __asan_odr. or something
similar, and if . is not supported in symbol names, use $ instead and if
neither, then just disable the odr indicators.

Or how exactly are these odr indicators supposed to work?

Jakub

Re: [PATCH v2] Support ASan ODR indicators at compiler side.

2016-11-21 Thread Yuri Gribov

On Mon, Nov 21, 2016 at 11:50 AM, Jakub Jelinek  wrote:
> On Mon, Nov 21, 2016 at 11:43:56AM +, Yuri Gribov wrote:
>> > This is just weird.  DECL_NAME in theory could be NULL, or can be a symbol
>> > much longer than 100 bytes, at which point you have strlen (tmp_name) == 99
>> > and ASM_GENERATE_INTERNAL_LABEL will just misbehave.
>> > I fail to see why you need tmp_name at all, I'd go just with
>> >   char sym_name[40];
>> >   ASM_GENERATE_INTERNAL_LABEL (sym_name, "LASANODR", ++lasan_odr_ind_cnt);
>> > or so.
>>
>> Given that feature is quite new and hasn't been tested too much (it's
>> off by default in Clang), having a descriptive name may aid with
>> debugging bug reports.
>
> What would those symbols help with in debugging bug reports?
> You need to have a source reproducer anyway, then anybody can try it
> himself.

Well, in case of some weird symbol error at startup we can at least
understand which library/symbol to blame.

> Even from just pure *.s file, you can look up where the
> .LASANODR1234 is used and from there find the corresponding symbol.
> Plus, as I said, with 95 chars or longer symbols (approx.) you get a buffer
> overflow.  We don't use descriptive symbols in other internal asan, dwarf2
> etc. labels.

Note that indicators need to have default visibility so simple scheme
like this will cause runtime collisions.

-Iurii

Re: [PATCH v2] Support ASan ODR indicators at compiler side.

2016-11-21 Thread Jakub Jelinek

On Mon, Nov 21, 2016 at 11:43:56AM +, Yuri Gribov wrote:
> > This is just weird.  DECL_NAME in theory could be NULL, or can be a symbol
> > much longer than 100 bytes, at which point you have strlen (tmp_name) == 99
> > and ASM_GENERATE_INTERNAL_LABEL will just misbehave.
> > I fail to see why you need tmp_name at all, I'd go just with
> >   char sym_name[40];
> >   ASM_GENERATE_INTERNAL_LABEL (sym_name, "LASANODR", ++lasan_odr_ind_cnt);
> > or so.
> 
> Given that feature is quite new and hasn't been tested too much (it's
> off by default in Clang), having a descriptive name may aid with
> debugging bug reports.

What would those symbols help with in debugging bug reports?
You need to have a source reproducer anyway, then anybody can try it
himself.  Even from just pure *.s file, you can look up where the
.LASANODR1234 is used and from there find the corresponding symbol.
Plus, as I said, with 95 chars or longer symbols (approx.) you get a buffer
overflow.  We don't use descriptive symbols in other internal asan, dwarf2
etc. labels.
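
To make the truncation concern concrete, here is a minimal standalone C sketch. It is not the asan.c code: format_prefix and prefix_truncated_p are hypothetical helpers that only mimic the snprintf call from the patch, and the "anon" fallback for a missing name is an assumption made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the tmp_name formatting in the patch:
   writes "__odr_asan_<name>_" into BUF (SIZE bytes) and returns the
   length snprintf wanted to write, which may exceed SIZE - 1.  */
static int
format_prefix (char *buf, size_t size, const char *name)
{
  assert (buf != NULL && size > 0);
  /* Model a missing DECL_NAME with a NULL pointer and substitute a
     placeholder rather than passing NULL to %s (an assumption, not
     what the patch does).  */
  if (name == NULL)
    name = "anon";
  return snprintf (buf, size, "__odr_asan_%s_", name);
}

/* Return 1 if the formatted prefix did not fit into BUF.  */
static int
prefix_truncated_p (char *buf, size_t size, const char *name)
{
  int wanted = format_prefix (buf, size, name);
  return wanted < 0 || (size_t) wanted >= size;
}
```

With a 100-byte buffer and a 200-character name, snprintf stores exactly 99 characters plus the terminating NUL, so the prefix alone already fills the buffer before any label decoration and counter are added on top of it.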

Jakub

Re: [PATCH v2] Support ASan ODR indicators at compiler side.

2016-11-21 Thread Yuri Gribov

On Mon, Nov 21, 2016 at 11:38 AM, Jakub Jelinek  wrote:
> On Mon, Nov 14, 2016 at 11:44:26AM +0300, Maxim Ostapenko wrote:
>> this is the second attempt to support ASan odr indicators in GCC. I've fixed
>> issues with several flags (e.g.TREE_ADDRESSABLE) and introduced new "asan
>> odr indicator" attribute to distinguish indicators from other symbols.
>> Looks better now?
>>
>> Tested and ASan bootstrapped on x86_64-unknown-linux-gnu.
>>
>> -Maxim
>
>> config/
>>
>>   * bootstrap-asan.mk: Replace LSAN_OPTIONS=detect_leaks=0 with
>>   ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1.
>>
>> gcc/
>>
>>   * asan.c (asan_global_struct): Refactor.
>>   (create_odr_indicator): New function.
>>   (asan_needs_odr_indicator_p): Likewise.
>>   (is_odr_indicator): Likewise.
>>   (asan_add_global): Introduce odr_indicator_ptr. Pass it into global's
>>   constructor.
>>   (asan_protect_global): Do not protect odr indicators.
>>
>> gcc/c-family/
>>
>>   * c-attribs.c (asan odr indicator): New attribute.
>>   (handle_asan_odr_indicator_attribute): New function.
>>
>> gcc/testsuite/
>>
>>   * c-c++-common/asan/no-redundant-odr-indicators-1.c: New test.
>>
>> diff --git a/config/ChangeLog b/config/ChangeLog
>> index 3b0092b..0c75185 100644
>> --- a/config/ChangeLog
>> +++ b/config/ChangeLog
>> @@ -1,3 +1,8 @@
>> +2016-11-14  Maxim Ostapenko  
>> +
>> + * bootstrap-asan.mk: Replace LSAN_OPTIONS=detect_leaks=0 with
>> + ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1.
>> +
>>  2016-06-21  Trevor Saunders  
>>
>>   * elf.m4: Remove interix support.
>> diff --git a/config/bootstrap-asan.mk b/config/bootstrap-asan.mk
>> index 70baaf9..e73d4c2 100644
>> --- a/config/bootstrap-asan.mk
>> +++ b/config/bootstrap-asan.mk
>> @@ -1,7 +1,7 @@
>>  # This option enables -fsanitize=address for stage2 and stage3.
>>
>>  # Suppress LeakSanitizer in bootstrap.
>> -export LSAN_OPTIONS="detect_leaks=0"
>> +export ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1
>>
>>  STAGE2_CFLAGS += -fsanitize=address
>>  STAGE3_CFLAGS += -fsanitize=address
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index a76e3e8..64744b9 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,13 @@
>> +2016-11-14  Maxim Ostapenko  
>> +
>> + * asan.c (asan_global_struct): Refactor.
>> + (create_odr_indicator): New function.
>> + (asan_needs_odr_indicator_p): Likewise.
>> + (is_odr_indicator): Likewise.
>> + (asan_add_global): Introduce odr_indicator_ptr. Pass it into global's
>> + constructor.
>> + (asan_protect_global): Do not protect odr indicators.
>> +
>>  2016-11-09  Kugan Vivekanandarajah  
>>
>>   * ipa-cp.c (ipa_get_jf_pass_through_result): Handle unary expressions.
>> diff --git a/gcc/asan.c b/gcc/asan.c
>> index 6e93ea3..1191ebe 100644
>> --- a/gcc/asan.c
>> +++ b/gcc/asan.c
>> @@ -1388,6 +1388,16 @@ asan_needs_local_alias (tree decl)
>>return DECL_WEAK (decl) || !targetm.binds_local_p (decl);
>>  }
>>
>> +/* Return true if DECL, a global var, is an artificial ODR indicator symbol
>> +   therefore doesn't need protection.  */
>> +
>> +static bool
>> +is_odr_indicator (tree decl)
>> +{
>> +  return DECL_ARTIFICIAL (decl)
>> +  && lookup_attribute ("asan odr indicator", DECL_ATTRIBUTES (decl));
>
> Better use
>   return (DECL_ARTIFICIAL (decl)
>   && lookup_attribute ("asan odr indicator", DECL_ATTRIBUTES (decl)));
> at least emacs users most likely need that.
>
>> - "__name", "__module_name", "__has_dynamic_init", "__location", 
>> "__odr_indicator"};
>> -  tree fields[8], ret;
>> -  int i;
>> + "__name", "__module_name", "__has_dynamic_init", "__location",
>> + "__odr_indicator"};
>
> Please put space before };.
>
>> +/* Create and return odr indicator symbol for DECL.
>> +   TYPE is __asan_global struct type as returned by asan_global_struct.  */
>> +
>> +static tree
>> +create_odr_indicator (tree decl, tree type)
>> +{
>> +  char sym_name[100], tmp_name[100];
>> +  static int lasan_odr_ind_cnt = 0;
>> +  tree uptr = TREE_TYPE (DECL_CHAIN (TYPE_FIELDS (type)));
>> +
>> +  snprintf (tmp_name, sizeof (tmp_name), "__odr_asan_%s_",
>> + IDENTIFIER_POINTER (DECL_NAME (decl)));
>> +  ASM_GENERATE_INTERNAL_LABEL (sym_name, tmp_name, ++lasan_odr_ind_cnt);
>
> This is just weird.  DECL_NAME in theory could be NULL, or can be a symbol
> much longer than 100 bytes, at which point you have strlen (tmp_name) == 99
> and ASM_GENERATE_INTERNAL_LABEL will just misbehave.
> I fail to see why you need tmp_name at all, I'd go just with
>   char sym_name[40];
>   ASM_GENERATE_INTERNAL_LABEL (sym_name, "LASANODR", ++lasan_odr_ind_cnt);
> or so.

Given that the feature is quite new and hasn't been tested much (it's
off by default in Clang), having a descriptive name may aid with
debugging bug reports.

>> +  char *asterisk = sym_name;
>> +  while ((asterisk = strchr (asterisk, '*')))
>> +*asterisk = '_';
>
> Can't * be just at the beginning?

[PATCH] Fix PR78396

2016-11-21 Thread Richard Biener


The following patch deals with the testsuite fallout of the patch
forcing LOOP_VECTORIZED () versioning in GIMPLE if-conversion.  We
no longer see the if-converted body when doing BB vectorization.

While the real fix would be to teach BB vectorization about
conditions (and thus if-conversion) itself, a quick workaround to
restore the previous behavior is to simply run BB vectorization
for if-converted loop bodies from inside the loop vectorizer itself.
That fixes the testcase and restores the behavior for all
if-converted (but not also unrolled) loops.

Bootstrap & regtest in progress on x86_64-unknown-linux-gnu.

Richard.

2016-11-21  Richard Biener  

PR tree-optimization/78396
* tree-vectorizer.c (vectorize_loops): If an innermost loop didn't
vectorize try vectorizing an if-converted body using BB vectorization.

* gcc.dg/vect/bb-slp-cond-1.c: Adjust.

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
index 35811bd..ddad853 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_condition } */
+/* { dg-additional-options "-fdump-tree-vect-details" } */
 
 #include "tree-vect.h"
 
@@ -41,5 +42,10 @@ int main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" { 
target vect_element_align } } } */
+/* Basic blocks of if-converted loops are vectorized from within the loop
+   vectorizer pass.  In this case it is really a deficiency in loop
+   vectorization data dependence analysis that causes us to require
+   basic block vectorization in the first place.  */
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "vect" { 
target vect_element_align } } } */
 
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 35d7a3e..b390664 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -540,6 +540,7 @@ vectorize_loops (void)
 || loop->force_vectorize)
   {
loop_vec_info loop_vinfo, orig_loop_vinfo = NULL;
+   gimple *loop_vectorized_call = vect_loop_vectorized_call (loop);
 vectorize_epilogue:
vect_location = find_loop_location (loop);
 if (LOCATION_LOCUS (vect_location) != UNKNOWN_LOCATION
@@ -558,6 +559,33 @@ vectorize_epilogue:
if (loop_constraint_set_p (loop, LOOP_C_FINITE))
  vect_free_loop_info_assumptions (loop);
 
+   /* If we applied if-conversion then try to vectorize the
+  BB of innermost loops.
+  ???  Ideally BB vectorization would learn to vectorize
+  control flow by applying if-conversion on-the-fly, the
+  following retains the if-converted loop body even when
+  only non-if-converted parts took part in BB vectorization.  */
+   if (flag_tree_slp_vectorize != 0
+   && loop_vectorized_call
+   && ! loop->inner)
+ {
+   basic_block bb = loop->header;
+   for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+!gsi_end_p (gsi); gsi_next (&gsi))
+ {
+   gimple *stmt = gsi_stmt (gsi);
+   gimple_set_uid (stmt, -1);
+   gimple_set_visited (stmt, false);
+ }
+   if (vect_slp_bb (bb))
+ {
+   dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
+"basic block vectorized\n");
+   fold_loop_vectorized_call (loop_vectorized_call,
+  boolean_true_node);
+   ret |= TODO_cleanup_cfg;
+ }
+ }
continue;
  }
 
@@ -575,7 +603,6 @@ vectorize_epilogue:
break;
  }
 
-   gimple *loop_vectorized_call = vect_loop_vectorized_call (loop);
if (loop_vectorized_call)
  set_uid_loop_bbs (loop_vinfo, loop_vectorized_call);
 if (LOCATION_LOCUS (vect_location) != UNKNOWN_LOCATION

[arm-embedded] [PATCH, GCC/ARM, ping] Optional -mthumb for Thumb only targets

2016-11-21 Thread Thomas Preudhomme


Hi,

We have decided to backport this patch to make -mthumb optional for Thumb-only 
targets to our embedded-6-branch.


*** gcc/ChangeLog.arm ***

2016-11-18  Thomas Preud'homme  

Backport from mainline
2016-11-18  Terry Guo  
Thomas Preud'homme  

* common/config/arm/arm-common.c (arm_target_thumb_only): New function.
* config/arm/arm-opts.h: Include arm-flags.h.
(struct arm_arch_core_flag): Define.
(arm_arch_core_flags): Define.
* config/arm/arm-protos.h: Include arm-flags.h.
(FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M, FL_MODE26, FL_MODE32,
FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED, FL_STRONG, FL_ARCH5E,
FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF, FL_ARCH6K, FL_THUMB2, FL_NOTM,
FL_THUMB_DIV, FL_VFPV3, FL_NEON, FL_ARCH7EM, FL_ARCH7, FL_ARM_DIV,
FL_ARCH8, FL_CRC32, FL_SMALLMUL, FL_NO_VOLATILE_CE, FL_IWMMXT,
FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1, FL2_ARCH8_2, FL2_FP16INST,
FL_TUNE, FL_FOR_ARCH2, FL_FOR_ARCH3, FL_FOR_ARCH3M, FL_FOR_ARCH4,
FL_FOR_ARCH4T, FL_FOR_ARCH5, FL_FOR_ARCH5T, FL_FOR_ARCH5E,
FL_FOR_ARCH5TE, FL_FOR_ARCH5TEJ, FL_FOR_ARCH6, FL_FOR_ARCH6J,
FL_FOR_ARCH6K, FL_FOR_ARCH6Z, FL_FOR_ARCH6ZK, FL_FOR_ARCH6KZ,
FL_FOR_ARCH6T2, FL_FOR_ARCH6M, FL_FOR_ARCH7, FL_FOR_ARCH7A,
FL_FOR_ARCH7VE, FL_FOR_ARCH7R, FL_FOR_ARCH7M, FL_FOR_ARCH7EM,
FL_FOR_ARCH8A, FL2_FOR_ARCH8_1A, FL2_FOR_ARCH8_2A, FL_FOR_ARCH8M_BASE,
FL_FOR_ARCH8M_MAIN, arm_feature_set, ARM_FSET_MAKE,
ARM_FSET_MAKE_CPU1, ARM_FSET_MAKE_CPU2, ARM_FSET_CPU1, ARM_FSET_CPU2,
ARM_FSET_EMPTY, ARM_FSET_ANY, ARM_FSET_HAS_CPU1, ARM_FSET_HAS_CPU2,
ARM_FSET_HAS_CPU, ARM_FSET_ADD_CPU1, ARM_FSET_ADD_CPU2,
ARM_FSET_DEL_CPU1, ARM_FSET_DEL_CPU2, ARM_FSET_UNION, ARM_FSET_INTER,
ARM_FSET_XOR, ARM_FSET_EXCLUDE, ARM_FSET_IS_EMPTY,
ARM_FSET_CPU_SUBSET): Move to ...
* config/arm/arm-flags.h: This new file.
* config/arm/arm.h (TARGET_MODE_SPEC_FUNCTIONS): Define.
(EXTRA_SPEC_FUNCTIONS): Add TARGET_MODE_SPEC_FUNCTIONS to its value.
(TARGET_MODE_SPECS): Define.
(DRIVER_SELF_SPECS): Add TARGET_MODE_SPECS to its value.


*** gcc/testsuite/ChangeLog.arm ***

2016-11-18  Thomas Preud'homme  

Backport from mainline
2016-11-18  Thomas Preud'homme  

* gcc.target/arm/optional_thumb-1.c: New test.
* gcc.target/arm/optional_thumb-2.c: New test.
* gcc.target/arm/optional_thumb-3.c: New test.
--- Begin Message ---

On 11/11/16 14:35, Kyrill Tkachov wrote:


On 08/11/16 13:36, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 25/10/16 18:07, Thomas Preudhomme wrote:

Hi,

Currently when a user compiles for a thumb-only target (such as Cortex-M
processors) without specifying the -mthumb option GCC throws the error "target
CPU does not support ARM mode". This is suboptimal from a usability point of
view: the -mthumb could be deduced from the -march or -mcpu option when there is
no ambiguity.

This patch implements this behavior by extending the DRIVER_SELF_SPECS to
automatically append -mthumb to the command line for thumb-only targets. It does
so by checking the last -march option if any is given or the last -mcpu option
otherwise. There is no ordering issue because conflicting -mcpu and -march is
already handled.

Note that the logic cannot be implemented in function arm_option_override
because we need to provide the modified command line to the GCC driver for
finding the right multilib path and the function arm_option_override is executed
too late for that effect.
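
The selection rule described above (use the last -march if one is given, otherwise the last -mcpu, and add -mthumb only when that target has no ARM mode) can be sketched in standalone C. This is only an illustration under assumptions, not the code from the patch: the names example_target, example_targets, example_last_value and example_needs_mthumb are hypothetical, and the four-entry table stands in for whatever data the real spec function consults.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: NAME is an -march or -mcpu value and
   THUMB_ONLY is nonzero when that target has no ARM execution mode.  */
struct example_target
{
  const char *name;
  int thumb_only;
};

/* Tiny illustrative table; the real patch uses its own data.  */
static const struct example_target example_targets[] = {
  { "armv7-m", 1 },
  { "cortex-m3", 1 },
  { "armv7-a", 0 },
  { "cortex-a9", 0 },
};

/* Return the value of the last option in ARGV that starts with PREFIX,
   or NULL if there is none.  */
static const char *
example_last_value (int argc, const char *const *argv, const char *prefix)
{
  size_t len = strlen (prefix);
  const char *value = NULL;

  for (int i = 0; i < argc; i++)
    if (strncmp (argv[i], prefix, len) == 0)
      value = argv[i] + len;
  return value;
}

/* Return 1 if -mthumb could be deduced for this command line: prefer the
   last -march, fall back to the last -mcpu, and answer "no" for unknown
   or missing targets.  */
static int
example_needs_mthumb (int argc, const char *const *argv)
{
  assert (argc >= 0);

  const char *name = example_last_value (argc, argv, "-march=");
  if (name == NULL)
    name = example_last_value (argc, argv, "-mcpu=");
  if (name == NULL)
    return 0;

  size_t n = sizeof example_targets / sizeof example_targets[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (name, example_targets[i].name) == 0)
      return example_targets[i].thumb_only;
  return 0;
}
```

With this sketch, a command line containing only -mcpu=cortex-m3 yields 1, while adding -march=armv7-a after it yields 0 because -march takes precedence.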

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2016-10-18  Terry Guo  
Thomas Preud'homme 

PR target/64802
* common/config/arm/arm-common.c (arm_target_thumb_only): New function.
* config/arm/arm-opts.h: Include arm-flags.h.
(struct arm_arch_core_flag): Define.
(arm_arch_core_flags): Define.
* config/arm/arm-protos.h: Include arm-flags.h.
(FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M, FL_MODE26, FL_MODE32,
FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED, FL_STRONG, FL_ARCH5E,
FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF, FL_ARCH6K, FL_THUMB2, FL_NOTM,
FL_THUMB_DIV, FL_VFPV3, FL_NEON, FL_ARCH7EM, FL_ARCH7, FL_ARM_DIV,
FL_ARCH8, FL_CRC32, FL_SMALLMUL, FL_NO_VOLATILE_CE, FL_IWMMXT,
FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1, FL2_ARCH8_2, FL2_FP16INST,
FL_TUNE, FL_FOR_ARCH2, FL_FOR_ARCH3, FL_FOR_ARCH3M, FL_FOR_ARCH4,
FL_FOR_ARCH4T, FL_FOR_ARCH5, FL_FOR_ARCH5T, FL_FOR_ARCH5E,
FL_FOR_ARCH5TE, FL_FOR_ARCH5TEJ, FL_FOR_ARCH6, FL_FOR_ARCH6J,
FL_FOR_ARCH6K, FL_FOR_ARCH6Z, FL_FOR_ARCH6ZK, FL_FOR_ARCH6KZ,
FL_FOR_ARCH6T2, FL_FOR_ARCH6M, FL_FOR_ARCH7, FL_FOR_ARCH7A,
FL_FOR_ARCH7VE, FL_FOR_ARCH7R, FL_FOR_ARCH7M, FL_FOR_ARCH7EM,
FL_FOR_ARCH8A, FL2_FOR_ARCH8_1A, FL2_FOR_ARCH8_2A, FL_FOR

Re: [PATCH v2] Support ASan ODR indicators at compiler side.

2016-11-21 Thread Jakub Jelinek

On Mon, Nov 14, 2016 at 11:44:26AM +0300, Maxim Ostapenko wrote:
> this is the second attempt to support ASan odr indicators in GCC. I've fixed
> issues with several flags (e.g.TREE_ADDRESSABLE) and introduced new "asan
> odr indicator" attribute to distinguish indicators from other symbols.
> Looks better now?
> 
> Tested and ASan bootstrapped on x86_64-unknown-linux-gnu.
> 
> -Maxim

> config/
> 
>   * bootstrap-asan.mk: Replace LSAN_OPTIONS=detect_leaks=0 with
>   ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1.
> 
> gcc/
> 
>   * asan.c (asan_global_struct): Refactor.
>   (create_odr_indicator): New function.
>   (asan_needs_odr_indicator_p): Likewise.
>   (is_odr_indicator): Likewise.
>   (asan_add_global): Introduce odr_indicator_ptr. Pass it into global's
>   constructor.
>   (asan_protect_global): Do not protect odr indicators.
> 
> gcc/c-family/
> 
>   * c-attribs.c (asan odr indicator): New attribute.
>   (handle_asan_odr_indicator_attribute): New function.
> 
> gcc/testsuite/
> 
>   * c-c++-common/asan/no-redundant-odr-indicators-1.c: New test.
> 
> diff --git a/config/ChangeLog b/config/ChangeLog
> index 3b0092b..0c75185 100644
> --- a/config/ChangeLog
> +++ b/config/ChangeLog
> @@ -1,3 +1,8 @@
> +2016-11-14  Maxim Ostapenko  
> +
> + * bootstrap-asan.mk: Replace LSAN_OPTIONS=detect_leaks=0 with
> + ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1.
> +
>  2016-06-21  Trevor Saunders  
>  
>   * elf.m4: Remove interix support.
> diff --git a/config/bootstrap-asan.mk b/config/bootstrap-asan.mk
> index 70baaf9..e73d4c2 100644
> --- a/config/bootstrap-asan.mk
> +++ b/config/bootstrap-asan.mk
> @@ -1,7 +1,7 @@
>  # This option enables -fsanitize=address for stage2 and stage3.
>  
>  # Suppress LeakSanitizer in bootstrap.
> -export LSAN_OPTIONS="detect_leaks=0"
> +export ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1
>  
>  STAGE2_CFLAGS += -fsanitize=address
>  STAGE3_CFLAGS += -fsanitize=address
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index a76e3e8..64744b9 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,13 @@
> +2016-11-14  Maxim Ostapenko  
> +
> + * asan.c (asan_global_struct): Refactor.
> + (create_odr_indicator): New function.
> + (asan_needs_odr_indicator_p): Likewise.
> + (is_odr_indicator): Likewise.
> + (asan_add_global): Introduce odr_indicator_ptr. Pass it into global's
> + constructor.
> + (asan_protect_global): Do not protect odr indicators.
> +
>  2016-11-09  Kugan Vivekanandarajah  
>  
>   * ipa-cp.c (ipa_get_jf_pass_through_result): Handle unary expressions.
> diff --git a/gcc/asan.c b/gcc/asan.c
> index 6e93ea3..1191ebe 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -1388,6 +1388,16 @@ asan_needs_local_alias (tree decl)
>return DECL_WEAK (decl) || !targetm.binds_local_p (decl);
>  }
>  
> +/* Return true if DECL, a global var, is an artificial ODR indicator symbol
> +   therefore doesn't need protection.  */
> +
> +static bool
> +is_odr_indicator (tree decl)
> +{
> +  return DECL_ARTIFICIAL (decl)
> +  && lookup_attribute ("asan odr indicator", DECL_ATTRIBUTES (decl));

Better use
  return (DECL_ARTIFICIAL (decl)
  && lookup_attribute ("asan odr indicator", DECL_ATTRIBUTES (decl)));
at least emacs users most likely need that.

> - "__name", "__module_name", "__has_dynamic_init", "__location", 
> "__odr_indicator"};
> -  tree fields[8], ret;
> -  int i;
> + "__name", "__module_name", "__has_dynamic_init", "__location",
> + "__odr_indicator"};

Please put space before };.

> +/* Create and return odr indicator symbol for DECL.
> +   TYPE is __asan_global struct type as returned by asan_global_struct.  */
> +
> +static tree
> +create_odr_indicator (tree decl, tree type)
> +{
> +  char sym_name[100], tmp_name[100];
> +  static int lasan_odr_ind_cnt = 0;
> +  tree uptr = TREE_TYPE (DECL_CHAIN (TYPE_FIELDS (type)));
> +
> +  snprintf (tmp_name, sizeof (tmp_name), "__odr_asan_%s_",
> + IDENTIFIER_POINTER (DECL_NAME (decl)));
> +  ASM_GENERATE_INTERNAL_LABEL (sym_name, tmp_name, ++lasan_odr_ind_cnt);

This is just weird.  DECL_NAME in theory could be NULL, or can be a symbol
much longer than 100 bytes, at which point you have strlen (tmp_name) == 99
and ASM_GENERATE_INTERNAL_LABEL will just misbehave.
I fail to see why you need tmp_name at all, I'd go just with
  char sym_name[40];
  ASM_GENERATE_INTERNAL_LABEL (sym_name, "LASANODR", ++lasan_odr_ind_cnt);
or so.
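
The truncation concern can be reproduced outside of GCC with plain snprintf. The sketch below is an illustration only: the helper name example_make_tmp_name is hypothetical, and it models just the formatting step, not ASM_GENERATE_INTERNAL_LABEL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper mirroring the snprintf call in the patch: format
   "__odr_asan_<name>_" into a fixed-size buffer.  Returns the length
   snprintf wanted to write, which is larger than the stored string
   whenever the output was truncated.  */
static int
example_make_tmp_name (char *buf, size_t size, const char *name)
{
  /* Guard added for this sketch; a NULL DECL_NAME is the other case the
     review points out.  */
  assert (name != NULL);
  return snprintf (buf, size, "__odr_asan_%s_", name);
}
```

For a 200-character name and a 100-byte buffer, the return value is 212 while the stored string is cut to 99 characters, so the trailing "_" and most of the name are lost.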

> +  char *asterisk = sym_name;
> +  while ((asterisk = strchr (asterisk, '*')))
> +*asterisk = '_';

Can't * be just at the beginning?  And other ASM_GENERATE_INTERNAL_LABEL +
build_decl with get_identifier spots in asan.c certainly don't strip any.

> @@ -2335,6 +2397,9 @@ asan_add_global (tree decl, tree type, 
> vec *v)
>assemble_alias (refdecl, DECL_ASSEMBLER_NAME (decl));
>  }
>  
> +  tree odr_indicator_ptr = asan_needs_odr_indicator_p (decl)
> +

[arm-embedded] [PATCH, GCC/ARM] Make arm_feature_set agree with type of FL_* macros

2016-11-21 Thread Thomas Preudhomme


Hi,

We have decided to backport this patch to fix a type inconsistency for 
arm_feature_set to our embedded-6-branch.


*** gcc/ChangeLog.arm ***

2016-11-18  Thomas Preud'homme  

Backport from mainline
2016-11-18  Thomas Preud'homme  

* config/arm/arm-protos.h (FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M,
FL_MODE26, FL_MODE32, FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED,
FL_STRONG, FL_ARCH5E, FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF,
FL_ARCH6K, FL_THUMB2, FL_NOTM, FL_THUMB_DIV, FL_VFPV3, FL_NEON,
FL_ARCH7EM, FL_ARCH7, FL_ARM_DIV, FL_ARCH8, FL_CRC32, FL_SMALLMUL,
FL_NO_VOLATILE_CE, FL_IWMMXT, FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1,
FL2_ARCH8_2, FL2_FP16INST): Reindent comment, add final dot when
missing and make value unsigned.
(arm_feature_set): Use unsigned entries instead of unsigned long.
--- Begin Message ---

Hi,

I've rebased the patch to make arm_feature_set agree with type of FL_* macros on 
top of trunk rather than on top of the optional -mthumb patch. That involved 
doing the changes to gcc/config/arm/arm-protos.h rather than 
gcc/config/arm/arm-flags.h. I also took advantage of the fact that each line is 
changed to change the indentation to tabs and add dots in comments missing one.


For reference, please find below the original patch description:

Currently arm_feature_set is defined in gcc/config/arm/arm-flags as an array of 
2 unsigned long. However, the flags stored in these two entries are (signed) 
int, being combinations of bits set via expressions of the form 1 << bitno. This 
creates 3 issues:

1) undefined behavior when setting the msb (1 << 31)
2) undefined behavior when storing a flag with msb set (negative int) into one 
of the unsigned array entries (positive int)
3) waste of space since the top 32 bits of each entry are not used

This patch changes the definition of the FL_* macros to be unsigned int by using the 
form 1U << bitno instead and changes the definition of arm_feature_set to be an 
array of 2 unsigned (int) entries.
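
A minimal standalone sketch of the idea follows. The names FL_EXAMPLE_MSB, example_feature_set and example_has_flag are made up for illustration, the struct layout is an assumption rather than a copy of the GCC definition, and unsigned int is assumed to be at least 32 bits wide.

```c
#include <assert.h>

/* Illustrative flag in the style of the patch.  With 1U the shift into
   bit 31 is well defined, unlike 1 << 31 on a 32-bit int.  */
#define FL_EXAMPLE_MSB (1U << 31)

/* Two unsigned entries, matching the description of arm_feature_set in
   the patch; the exact layout here is an assumption.  */
typedef struct
{
  unsigned cpu[2];
} example_feature_set;

/* Return nonzero if FLAG is set in entry IDX of SET.  */
static int
example_has_flag (const example_feature_set *set, unsigned idx,
		  unsigned flag)
{
  assert (idx < 2);
  return (set->cpu[idx] & flag) != 0;
}
```

Since the flag and the array entries now share one unsigned type, setting and testing the top bit involves no signed intermediate value.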


*** gcc/ChangeLog ***

2016-10-15  Thomas Preud'homme  

* config/arm/arm-protos.h (FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M,
FL_MODE26, FL_MODE32, FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED,
FL_STRONG, FL_ARCH5E, FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF,
FL_ARCH6K, FL_THUMB2, FL_NOTM, FL_THUMB_DIV, FL_VFPV3, FL_NEON,
FL_ARCH7EM, FL_ARCH7, FL_ARM_DIV, FL_ARCH8, FL_CRC32, FL_SMALLMUL,
FL_NO_VOLATILE_CE, FL_IWMMXT, FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1,
FL2_ARCH8_2, FL2_FP16INST): Reindent comment, add final dot when
missing and make value unsigned.
(arm_feature_set): Use unsigned entries instead of unsigned long.


Bootstrapped on arm-linux-gnueabihf targeting Thumb-2 state.

Is this ok for trunk?

Best regards,

Thomas

On 14/11/16 18:56, Thomas Preudhomme wrote:

My apologies, I realized when trying to apply the patch that I wrote it on top
of the optional -mthumb patch instead of the reverse. I'll rebase it to not
screw up bisect.

Best regards,

Thomas

On 14/11/16 14:47, Kyrill Tkachov wrote:


On 14/11/16 14:07, Thomas Preudhomme wrote:

Hi,

Currently arm_feature_set is defined in gcc/config/arm/arm-flags as an array
of 2 unsigned long. However, the flags stored in these two entries are
(signed) int, being combinations of bits set via expressions of the form 1 <<
bitno. This creates 3 issues:

1) undefined behavior when setting the msb (1 << 31)
2) undefined behavior when storing a flag with msb set (negative int) into one
of the unsigned array entries (positive int)
3) waste of space since the top 32 bits of each entry are not used

This patch changes the definition of the FL_* macros to be unsigned int by using
the form 1U << bitno instead and changes the definition of arm_feature_set to
be an array of 2 unsigned (int) entries.

Bootstrapped on arm-linux-gnueabihf targeting Thumb-2 state.

Is this ok for trunk?



Ok.
Thanks,
Kyrill


Best regards,

Thomas


diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 95bae5ef57ba4c433c0cce8e0c197959abdf887b..5cee7718554886982f535da2e9baa5015da609e4 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -351,50 +351,51 @@ extern bool arm_is_constant_pool_ref (rtx);
 /* Flags used to identify the presence of processor capabilities.  */
 
 /* Bit values used to identify processor capabilities.  */
-#define FL_NONE	  (0)	  /* No flags.  */
-#define FL_ANY	  (0x)/* All flags.  */
-#define FL_CO_PROC(1 << 0)/* Has external co-processor bus */
-#define FL_ARCH3M (1 << 1)/* Extended multiply */
-#define FL_MODE26 (1 << 2)/* 26-bit mode support */
-#define FL_MODE32 (1 << 3)/* 32-bit mode support */
-#define FL_ARCH4  (1 << 4)/* Architecture rel 4 */
-#define FL_ARCH5  (1 << 5)/* Architecture rel 5 */
-#define FL_THUMB  (1 << 6)/

[arm-embedded] [PATCH, GCC/ARM] Fix PR77933: stack corruption on ARM when using high registers and lr

2016-11-21 Thread Thomas Preudhomme


Hi,

We have decided to backport this patch to fix a stack corruption when using lr 
and high registers to our embedded-6-branch.


*** gcc/ChangeLog.arm ***

2016-11-08  Thomas Preud'homme  

Backport from mainline
2016-11-08  Thomas Preud'homme  

PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
being live in the function and lr needing to be saved.  Distinguish
between already saved pushable registers and registers to push.
Check for LR being an available pushable register.


*** gcc/testsuite/ChangeLog.arm ***

2016-11-08  Thomas Preud'homme  

Backport from mainline
2016-11-08  Thomas Preud'homme  

PR target/77933
* gcc.target/arm/pr77933-1.c: New test.
* gcc.target/arm/pr77933-2.c: Likewise.
--- Begin Message ---

Hi Kyrill,

I've committed the following updated patch, in which the test is restricted to 
Thumb execution mode and skipped when that is not possible, since 
-mtpcs-leaf-frame is only available in Thumb mode. I considered the change obvious.


*** gcc/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
being live in the function and lr needing to be saved.  Distinguish
between already saved pushable registers and registers to push.
Check for LR being an available pushable register.


*** gcc/testsuite/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* gcc.target/arm/pr77933-1.c: New test.
* gcc.target/arm/pr77933-2.c: Likewise.

Best regards,

Thomas

On 17/11/16 10:04, Kyrill Tkachov wrote:


On 09/11/16 16:41, Thomas Preudhomme wrote:

I've reworked the patch following comments from Wilco [1] (sorry could not
find it in my MUA for some reason).

[1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00317.html


== Context ==

When saving registers, function thumb1_expand_prologue () aims at minimizing
the number of push instructions. One of the optimizations it does is to push LR
alongside high register(s) (after having moved them to low register(s)) when
there is no low register to save. The way this is implemented is to add LR to
the pushable_regs mask if it is live just before pushing the registers in that
mask. The mask of live pushable registers which is used to detect whether LR
needs to be saved is then cleared to ensure LR is only saved once.


== Problem ==

However, beyond deciding what register to push, pushable_regs is used to track
what pushable register can be used to move a high register before being
pushed, hence the name. That mask is cleared when all high registers have been
assigned a low register, but the clearing assumes the high registers were
assigned to the registers with the biggest number in that mask. This is not
the case because LR is not considered when looking for a register in that
mask. Furthermore, LR might have been saved in the TARGET_BACKTRACE path above,
yet the mask of live pushable registers is not cleared in that case.


== Solution ==

This patch changes the loop to iterate over registers LR to r0 so as to both
fix the stack corruption reported in PR77933 and reuse lr to push some high
register when possible. This patch also introduces a new variable
lr_needs_saving to record whether LR (still) needs to be saved at a given
point in code and sets the variable accordingly throughout the code, thus
fixing the second issue. Finally, this patch creates a new push_mask variable
to distinguish between the mask of registers to push and the mask of live
pushable registers.
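
The "iterate from LR down to r0" part can be pictured with a small standalone sketch. It is not the code from thumb1_expand_prologue: the names EXAMPLE_LR_REGNUM and example_next_pushable are hypothetical, and the mask is a plain bit set with bit N standing for register rN.

```c
#include <assert.h>

#define EXAMPLE_LR_REGNUM 14	/* LR is r14 on ARM.  */

/* Return the highest-numbered register at or below START whose bit is
   set in MASK, or -1 if there is none.  Starting the scan at LR means
   LR is considered like any other pushable register.  */
static int
example_next_pushable (unsigned long mask, int start)
{
  assert (start >= 0 && start <= EXAMPLE_LR_REGNUM);

  for (int regno = start; regno >= 0; regno--)
    if (mask & (1UL << regno))
      return regno;
  return -1;
}
```

With a mask holding LR, r3 and r0, a scan starting at LR returns 14 first, and restarting at 13 then returns 3, so LR is picked up before any low register.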


== Note ==

Other bits could have been improved but have been left out to allow the patch
to be backported to a stable branch:

(1) using argument registers that are not holding an argument
(2) using push_mask consistently instead of l_mask (in TARGET_BACKTRACE), mask
(low register push) and push_mask
(3) the !l_mask case improved in TARGET_BACKTRACE since offset == 0
(4) rename l_mask to a more appropriate name (live_pushable_regs_mask?)

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* config/arm/arm.c (thumb1_expand_prologue): Distinguish between lr
being live in the function and lr needing to be saved. Distinguish
between already saved pushable registers and registers to push.
Check for LR being an available pushable register.


*** gcc/testsuite/ChangeLog ***

2016-11-08  Thomas Preud'homme  

PR target/77933
* gcc.target/arm/pr77933-1.c: New test.
* gcc.target/arm/pr77933-2.c: Likewise.


Testing: no regression on arm-none-eabi GCC cross-compiler targeting Cortex-M0

Is this ok for trunk?



Ok.
Thanks,
Kyrill


Best regards,

Thomas

On 02/11/16 17:08, Thomas Preudhomme wrote:

Hi,

When saving registers, function thumb1_expand_prologue () aims at minimizing the
number of push instructions. One

[arm-embedded] [PATCH, GCC/ARM] Fix ICE when compiling empty FIQ interrupt handler in ARM mode

2016-11-21 Thread Thomas Preudhomme


Hi,

We have decided to backport this patch to fix a testsuite failure in newly added 
test empty_fiq_handler to our embedded-6-branch.


*** gcc/testsuite/ChangeLog.arm ***

2016-11-21  Thomas Preud'homme  

Backport from mainline
2016-11-21  Thomas Preud'homme  

* gcc.target/arm/empty_fiq_handler.c: Skip if -mthumb is passed in and
target is Thumb-only.
--- Begin Message ---

On 17/11/16 20:04, Thomas Preudhomme wrote:

Hi Christophe,

On 17/11/16 13:36, Christophe Lyon wrote:

On 16 November 2016 at 10:39, Kyrill Tkachov
 wrote:


On 09/11/16 16:19, Thomas Preudhomme wrote:


Hi,

This patch fixes the following ICE when compiling an empty
FIQ interrupt handler in ARM mode:

empty_fiq_handler.c:5:1: error: insn does not satisfy its constraints:
 }
 ^

(insn/f 13 12 14 (set (reg/f:SI 13 sp)
(plus:SI (reg/f:SI 11 fp)
(const_int 4 [0x4]))) irq.c:5 4 {*arm_addsi3}
 (expr_list:REG_CFA_ADJUST_CFA (set (reg/f:SI 13 sp)
(plus:SI (reg/f:SI 11 fp)
(const_int 4 [0x4])))
(nil)))

The ICE was caused by a missing alternative to reflect that ARM mode
can do an add of a general register into sp, which is unpredictable in the
Thumb mode add immediate.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2016-11-04  Thomas Preud'homme  

* config/arm/arm.md (arm_addsi3): Add alternative for addition of
general register with general register or ARM constant into SP
register.


*** gcc/testsuite/ChangeLog ***

2016-11-04  Thomas Preud'homme  

* gcc.target/arm/empty_fiq_handler.c: New test.


Testing: bootstrapped on ARMv7-A ARM mode & testsuite shows no regression.

Is this ok for trunk?



I see that "r" does not include the stack pointer (STACK_REG is separate
from GENERAL_REGs) so we are indeed missing
that constraint.

Ok for trunk.
Thanks,
Kyrill


Best regards,

Thomas





Hi Thomas,

The new test fails when compiled with -mthumb -march=armv5t:
gcc.target/arm/empty_fiq_handler.c: In function 'fiq_handler':
gcc.target/arm/empty_fiq_handler.c:11:1: error: interrupt Service
Routines cannot be coded in Thumb mode


Right, interrupt handlers can only be compiled in the mode in which the CPU
boots. So for non Thumb-only targets it should be compiled with -marm. I'll
push a patch tomorrow.


I've committed the following patch as obvious:

Interrupt handlers on ARM can only be compiled in the execution mode in which 
the processor boots. That is -mthumb for Thumb-only devices, -marm otherwise. 
This changes the empty_fiq_handler test to be skipped when -mthumb is passed 
but the processor boots in ARM mode.


ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2016-11-17  Thomas Preud'homme  

* gcc.target/arm/empty_fiq_handler.c: Skip if -mthumb is passed in and
target is Thumb-only.

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
index bbcfd0e32f9d0cc60c8a013fd1bb584b21aaad16..8313f2199122be153a737946e817a5e3bee60372 100644
--- a/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
+++ b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "" { ! arm_cortex_m } { "-mthumb" } } */
 
 /* Below code used to trigger an ICE due to missing constraints for
sp = fp + cst pattern.  */
--- End Message ---

[arm-embedded][PATCH, GCC/ARM] Fix ICE when compiling empty FIQ interrupt handler in ARM mode

2016-11-21 Thread Thomas Preudhomme


Hi,

We have decided to backport this patch to fix an internal compiler error when 
compiling an empty FIQ interrupt handler to our embedded-6-branch.


*** gcc/ChangeLog.arm ***

2016-11-17  Thomas Preud'homme  

Backport from mainline
2016-11-16  Thomas Preud'homme  

* config/arm/arm.md (arm_addsi3): Add alternative for addition of
general register with general register or ARM constant into SP
register.


*** gcc/testsuite/ChangeLog.arm ***

2016-11-17  Thomas Preud'homme  

Backport from mainline
2016-11-16  Thomas Preud'homme  

* gcc.target/arm/empty_fiq_handler.c: New test.
--- Begin Message ---

Hi,

This patch fixes the following ICE when compiling an empty FIQ 
interrupt handler in ARM mode:


empty_fiq_handler.c:5:1: error: insn does not satisfy its constraints:
 }
 ^

(insn/f 13 12 14 (set (reg/f:SI 13 sp)
(plus:SI (reg/f:SI 11 fp)
(const_int 4 [0x4]))) irq.c:5 4 {*arm_addsi3}
 (expr_list:REG_CFA_ADJUST_CFA (set (reg/f:SI 13 sp)
(plus:SI (reg/f:SI 11 fp)
(const_int 4 [0x4])))
(nil)))

The ICE was caused by a missing alternative to reflect that ARM mode can do 
an add of a general register into sp, which is unpredictable in the Thumb mode 
add immediate.


ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2016-11-04  Thomas Preud'homme  

* config/arm/arm.md (arm_addsi3): Add alternative for addition of
general register with general register or ARM constant into SP
register.


*** gcc/testsuite/ChangeLog ***

2016-11-04  Thomas Preud'homme  

* gcc.target/arm/empty_fiq_handler.c: New test.


Testing: bootstrapped on ARMv7-A ARM mode & testsuite shows no regression.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 8393f65bcf4c9c3e61b91e5adcd5f59ff7c6ec3f..70cd31f6cb176fe29efc1fbbf692bfc270ef5a1b 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -609,9 +609,9 @@
 ;;  (plus (reg rN) (reg sp)) into (reg rN).  In this case reload will
 ;; put the duplicated register first, and not try the commutative version.
 (define_insn_and_split "*arm_addsi3"
-  [(set (match_operand:SI  0 "s_register_operand" "=rk,l,l ,l ,r ,k ,r,r ,k ,r ,k,k,r ,k ,r")
-(plus:SI (match_operand:SI 1 "s_register_operand" "%0 ,l,0 ,l ,rk,k ,r,rk,k ,rk,k,r,rk,k ,rk")
- (match_operand:SI 2 "reg_or_int_operand" "rk ,l,Py,Pd,rI,rI,k,Pj,Pj,L ,L,L,PJ,PJ,?n")))]
+  [(set (match_operand:SI  0 "s_register_operand" "=rk,l,l ,l ,r ,k ,r,k ,r ,k ,r ,k,k,r ,k ,r")
+	(plus:SI (match_operand:SI 1 "s_register_operand" "%0 ,l,0 ,l ,rk,k ,r,r ,rk,k ,rk,k,r,rk,k ,rk")
+		 (match_operand:SI 2 "reg_or_int_operand" "rk ,l,Py,Pd,rI,rI,k,rI,Pj,Pj,L ,L,L,PJ,PJ,?n")))]
   "TARGET_32BIT"
   "@
add%?\\t%0, %0, %2
@@ -621,6 +621,7 @@
add%?\\t%0, %1, %2
add%?\\t%0, %1, %2
add%?\\t%0, %2, %1
+   add%?\\t%0, %1, %2
addw%?\\t%0, %1, %2
addw%?\\t%0, %1, %2
sub%?\\t%0, %1, #%n2
@@ -640,10 +641,10 @@
 		  operands[1], 0);
   DONE;
   "
-  [(set_attr "length" "2,4,4,4,4,4,4,4,4,4,4,4,4,4,16")
+  [(set_attr "length" "2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,16")
(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "yes,yes,yes,yes,no,no,no,no,no,no,no,no,no,no,no")
-   (set_attr "arch" "t2,t2,t2,t2,*,*,*,t2,t2,*,*,a,t2,t2,*")
+   (set_attr "predicable_short_it" "yes,yes,yes,yes,no,no,no,no,no,no,no,no,no,no,no,no")
+   (set_attr "arch" "t2,t2,t2,t2,*,*,*,a,t2,t2,*,*,a,t2,t2,*")
(set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
 		  (const_string "alu_imm")
 		  (const_string "alu_sreg")))
diff --git a/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
new file mode 100644
index ..bbcfd0e32f9d0cc60c8a013fd1bb584b21aaad16
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+
+/* Below code used to trigger an ICE due to missing constraints for
+   sp = fp + cst pattern.  */
+
+void fiq_handler (void) __attribute__((interrupt ("FIQ")));
+
+void
+fiq_handler (void)
+{
+}
--- End Message ---

[PATCH, PING] Support ASan ODR indicators at compiler side.

2016-11-21 Thread Maxim Ostapenko


Hi,

I would like to ping the following patch: 
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01300.html.


-Maxim

Re: [PATCH 2/2 v3] PR77822

2016-11-21 Thread Dominik Vogt

On Thu, Nov 17, 2016 at 04:54:17PM +0100, Dominik Vogt wrote:
> On Thu, Nov 17, 2016 at 04:53:03PM +0100, Dominik Vogt wrote:
> > The following two patches fix PR 77822 on s390x for gcc-7.  As the
> > macro doing the argument range checks can be used on other targets
> > as well, I've put it in system.h (couldn't think of a better
> > place; maybe rtl.h?).
> > 
> > Bootstrapped on s390x biarch, regression tested on s390x biarch
> > and s390, all on a zEC12 with -march=zEC12.
> > 
> > Please check the commit messages for details.
> 
> S390 backend patch.

New version of the patch attached.

Changes since the first version of the patchset:

v2: 
Add s390 test cases. 
Support .cxx tests in s390.exp. 
Put all arguments of SIZE_POS_IN_RANGE in parentheses. 
Rewrite SIZE_POS_IN_RANGE macro to handle wrapping SIZE + POS. 
 
v3: 
Rename macro argument from UPPER back to RANGE. 


Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

PR target/77822
* config/s390/s390.md ("extzv")
("*extzv")
("*extzvdi_lshiftrt")
("*_ior_and_sr_ze")
("*extract1bitdi")
("*insv", "*insv_rnsbg_noshift")
("*insv_rnsbg_srl", "*insv_mem_reg")
("*insvdi_mem_reghigh", "*insvdi_reg_imm"): Use SIZE_POS_IN_RANGE to
validate the arguments of zero_extract and sign_extract.
gcc/testsuite/ChangeLog

PR target/77822
* gcc.target/s390/s390.exp: Support .cxx tests.
* gcc.target/s390/pr77822-2.c: New test.
* gcc.target/s390/pr77822-1.cxx: New test.
>From 7cdadeef8e47baf66d938f80950dd364cc9371e8 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Thu, 17 Nov 2016 14:49:48 +0100
Subject: [PATCH 2/2] PR target/77822: S390: Validate argument range of
 {zero,sign}_extract.

With some undefined code, combine generates patterns where the arguments to
*_extract are out of range, e.g. a negative bit position.  If the s390 backend
accepts these, they lead to not just undefined behaviour but invalid assembly
instructions (argument out of the allowed range).  So this patch makes sure
that the rtl expressions with out of range arguments are rejected.
---
 gcc/config/s390/s390.md |  20 +-
 gcc/testsuite/gcc.target/s390/pr77822-1.cxx |  21 ++
 gcc/testsuite/gcc.target/s390/pr77822-2.c   | 307 
 gcc/testsuite/gcc.target/s390/s390.exp  |   8 +-
 4 files changed, 349 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr77822-1.cxx
 create mode 100644 gcc/testsuite/gcc.target/s390/pr77822-2.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index a449b03..d0664a8 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -3741,6 +3741,8 @@
  (clobber (reg:CC CC_REGNUM))])]
   "TARGET_Z10"
 {
+  if (! SIZE_POS_IN_RANGE (INTVAL (operands[2]), INTVAL (operands[3]), 64))
+FAIL;
   /* Starting with zEC12 there is risbgn not clobbering CC.  */
   if (TARGET_ZEC12)
 {
@@ -3760,7 +3762,9 @@
 (match_operand 2 "const_int_operand" "")   ; size
 (match_operand 3 "const_int_operand" ""))) ; start
   ]
-  ""
+  "
+   && SIZE_POS_IN_RANGE (INTVAL (operands[2]), INTVAL (operands[3]),
+GET_MODE_BITSIZE (mode))"
   "\t%0,%1,64-%2,128+63,%3+%2" ; dst, src, start, end, shift
   [(set_attr "op_type" "RIE")
(set_attr "z10prop" "z10_super_E1")])
@@ -3773,6 +3777,7 @@
(lshiftrt:DI (match_operand:DI 3 "register_operand" "d")
 (match_operand:DI 4 "nonzero_shift_count_operand" "")))]
   "
+   && SIZE_POS_IN_RANGE (INTVAL (operands[1]), INTVAL (operands[2]), 64)
&& 64 - UINTVAL (operands[4]) >= UINTVAL (operands[1])"
   "\t%0,%3,%2,%2+%1-1,128-%2-%1-%4"
   [(set_attr "op_type" "RIE")
@@ -3791,6 +3796,7 @@
  (match_operand 5 "const_int_operand" "")) ; start
 4)))]
   "
+   && SIZE_POS_IN_RANGE (INTVAL (operands[4]), INTVAL (operands[5]), 64)
&& UINTVAL (operands[2]) == (~(0ULL) << UINTVAL (operands[4]))"
   "\t%0,%3,64-%4,63,%4+%5"
   [(set_attr "op_type" "RIE")
@@ -3804,7 +3810,8 @@
(const_int 1)  ; size
(match_operand 2 "const_int_operand" "")) ; start
   (const_int 0)))]
-  ""
+  "
+   && SIZE_POS_IN_RANGE (1, INTVAL (operands[2]), 64)"
   "\t%0,%1,64-1,128+63,%2+1" ; dst, src, start, end, shift
   [(set_attr "op_type" "RIE")
(set_attr "z10prop" "z10_super_E1")])
@@ -3919,6 +3926,8 @@
  (match_operand 2 "const_int_operand""I")) ; pos
(match_operand:GPR 3 "nonimmediate_operand" "d"))]
   "
+   && SIZE_POS_IN_RANGE (INTVAL (operands[1]), INTVAL (operands[2]),
+GET_MODE_BITSIZE (mode))
&& (INTVAL (operands[1]) + INTVAL (operands[2])) <= "
   "\t%0,%3,%2,%2+%1-1,-%2-%1"
   [(set_attr "op_type" "RIE")
@@ -4214,6 +4223,7 @@
  (match_operand:DI 3 "nonimmediate_operand" "d")))
(clobber (reg

Re: [PATCH 1/2 v3] PR77822

2016-11-21 Thread Dominik Vogt

On Fri, Nov 18, 2016 at 04:29:18PM +0100, Dominik Vogt wrote:
> On Fri, Nov 18, 2016 at 08:02:08AM -0600, Segher Boessenkool wrote:
> > On Fri, Nov 18, 2016 at 01:09:24PM +0100, Dominik Vogt wrote:
> > > IN_RANGE(POS...) makes sure that POS is a non-negative number
> > > smaller than UPPER, so (UPPER) - (POS) does not wrap.  Or is there
> > > some case that the new macro does not handle?
> > 
> > I think it works fine.
> > 
> > With a name like UPPER, you may want to have SIZE_POS_IN_RANGE work like
> > IN_RANGE, i.e. UPPER is inclusive then.  Dunno.
> 
> Yeah, maybe rather call it RANGE to avoid too much similarity.
> Some name that is so vague that one has to read the documentation.
> I'll post an updated patch later.

Third version of the patch attached.  The only difference is that
the macro argument name has been changed back to RANGE.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

PR target/77822
* system.h (SIZE_POS_IN_RANGE): New.
>From 8e02352c70d478c74155d5d127560da31363dd8a Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Thu, 17 Nov 2016 14:49:18 +0100
Subject: [PATCH 1/2] PR target/77822: Add helper macro SIZE_POS_IN_RANGE to
 system.h.

The macro can be used to validate the arguments of zero_extract and
sign_extract to fix this problem:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77822
---
 gcc/system.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/system.h b/gcc/system.h
index 8c6127c..6c1228d 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -316,6 +316,15 @@ extern int errno;
   ((unsigned HOST_WIDE_INT) (VALUE) - (unsigned HOST_WIDE_INT) (LOWER) \
<= (unsigned HOST_WIDE_INT) (UPPER) - (unsigned HOST_WIDE_INT) (LOWER))
 
+/* A convenience macro to determine whether SIZE lies inclusively
+   within [1, RANGE], POS lies inclusively within
+   [0, RANGE - 1] and the sum lies inclusively within [1, RANGE].
+   RANGE must be >= 1, but SIZE and POS may be negative.  */
+#define SIZE_POS_IN_RANGE(SIZE, POS, RANGE) \
+  (IN_RANGE ((POS), 0, (unsigned HOST_WIDE_INT) (RANGE) - 1) \
+   && IN_RANGE ((SIZE), 1, (unsigned HOST_WIDE_INT) (RANGE) \
+  - (unsigned HOST_WIDE_INT)(POS)))
+
 /* Infrastructure for defining missing _MAX and _MIN macros.  Note that
macros defined with these cannot be used in #if.  */
 
-- 
2.3.0
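
The range check added by this patch can be tried outside of GCC. Below is a minimal standalone sketch, assuming a 64-bit `long long` as a stand-in for HOST_WIDE_INT; the typedefs `hwi`/`uhwi` and the wrapper `extract_args_ok` are hypothetical names used only for illustration, while the two macros mirror the ones shown in the diff.

```c
#include <stdbool.h>

/* Stand-ins for GCC's HOST_WIDE_INT types; assumed to be 64 bits wide.  */
typedef long long hwi;
typedef unsigned long long uhwi;

/* Same unsigned-wraparound trick as IN_RANGE in system.h: true iff
   LOWER <= VALUE <= UPPER, provided LOWER <= UPPER.  */
#define IN_RANGE(VALUE, LOWER, UPPER) \
  ((uhwi) (VALUE) - (uhwi) (LOWER) <= (uhwi) (UPPER) - (uhwi) (LOWER))

/* Mirrors the SIZE_POS_IN_RANGE macro from the patch.  POS is checked
   first, so RANGE - POS cannot wrap when the second test is evaluated.  */
#define SIZE_POS_IN_RANGE(SIZE, POS, RANGE) \
  (IN_RANGE ((POS), 0, (uhwi) (RANGE) - 1) \
   && IN_RANGE ((SIZE), 1, (uhwi) (RANGE) - (uhwi) (POS)))

/* Hypothetical wrapper, not part of the patch: validate the size and
   position arguments of an extraction from a RANGE-bit value.  */
bool
extract_args_ok (hwi size, hwi pos, hwi range)
{
  return SIZE_POS_IN_RANGE (size, pos, range);
}
```

With RANGE = 64, a 1-bit field at position 63 is accepted, a 2-bit field at the same position is rejected, and negative SIZE or POS values are rejected because they become very large numbers after the unsigned conversion.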

Re: [PATCH, GCC/ARM] Fix ICE when compiling empty FIQ interrupt handler in ARM mode

2016-11-21 Thread Thomas Preudhomme


On 17/11/16 20:04, Thomas Preudhomme wrote:

Hi Christophe,

On 17/11/16 13:36, Christophe Lyon wrote:

On 16 November 2016 at 10:39, Kyrill Tkachov
 wrote:


On 09/11/16 16:19, Thomas Preudhomme wrote:


Hi,

This patch fixes the following ICE when compiling an empty
FIQ interrupt handler in ARM mode:

empty_fiq_handler.c:5:1: error: insn does not satisfy its constraints:
 }
 ^

(insn/f 13 12 14 (set (reg/f:SI 13 sp)
(plus:SI (reg/f:SI 11 fp)
(const_int 4 [0x4]))) irq.c:5 4 {*arm_addsi3}
 (expr_list:REG_CFA_ADJUST_CFA (set (reg/f:SI 13 sp)
(plus:SI (reg/f:SI 11 fp)
(const_int 4 [0x4])))
(nil)))

The ICE was caused by a missing alternative reflecting that ARM mode
can do an add of a general register into sp, which is unpredictable with
the Thumb mode add immediate.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2016-11-04  Thomas Preud'homme  

* config/arm/arm.md (arm_addsi3): Add alternative for addition of
general register with general register or ARM constant into SP
register.


*** gcc/testsuite/ChangeLog ***

2016-11-04  Thomas Preud'homme  

* gcc.target/arm/empty_fiq_handler.c: New test.


Testing: bootstrapped on ARMv7-A ARM mode & testsuite shows no regression.

Is this ok for trunk?



I see that "r" does not include the stack pointer (STACK_REG is separate
from GENERAL_REGs) so we are indeed missing
that constraint.

Ok for trunk.
Thanks,
Kyrill


Best regards,

Thomas





Hi Thomas,

The new test fails when compiled with -mthumb -march=armv5t:
gcc.target/arm/empty_fiq_handler.c: In function 'fiq_handler':
gcc.target/arm/empty_fiq_handler.c:11:1: error: interrupt Service
Routines cannot be coded in Thumb mode


Right, interrupt handlers can only be compiled in the mode where the CPU boots.
So for targets that are not Thumb-only it should be compiled with -marm. I'll
push a patch tomorrow.


I've committed the following patch as obvious:

Interrupt handlers on ARM can only be compiled in the execution mode where the
processor boots. That is -mthumb for Thumb-only devices, -marm otherwise. This
changes the empty_fiq_handler test to be skipped when -mthumb is passed but the
processor boots in ARM mode.


ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2016-11-17  Thomas Preud'homme  

* gcc.target/arm/empty_fiq_handler.c: Skip if -mthumb is passed in and
target is not Thumb-only.

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
index bbcfd0e32f9d0cc60c8a013fd1bb584b21aaad16..8313f2199122be153a737946e817a5e3bee60372 100644
--- a/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
+++ b/gcc/testsuite/gcc.target/arm/empty_fiq_handler.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-skip-if "" { ! arm_cortex_m } { "-mthumb" } } */
 
 /* Below code used to trigger an ICE due to missing constraints for
sp = fp + cst pattern.  */

Re: PR61409: -Wmaybe-uninitialized false-positive with -O2

2016-11-21 Thread Andreas Schwab

On Nov 21 2016, Christophe Lyon  wrote:

> Since this commit (r242639), I've noticed regressions on arm targets:
>
>   - PASS now FAIL [PASS => FAIL]:
>
>   gcc.dg/uninit-pred-6_a.c warning (test for warnings, line 36)
>   gcc.dg/uninit-pred-6_b.c warning (test for warnings, line 42)
>   gcc.dg/uninit-pred-7_c.c (test for excess errors)
>   gcc.dg/uninit-pred-7_c.c warning (test for warnings, line 29)
>
>   - FAIL appears  [ => FAIL]:
>
>   gcc.dg/uninit-pred-7_c.c (internal compiler error)

Same failures on m68k.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

Handle sibcalls with aggregate returns

2016-11-21 Thread Richard Sandiford

We treated this g as a sibling call to f:

  int f (int);
  int g (void) { return f (1); }

but not this one:

  struct s { int i; };
  struct s f (int);
  struct s g (void) { return f (1); }

We treated them both as sibcalls on x86 before the first patch for PR36326,
so I suppose this is a regression of sorts from 4.3.

The patch allows function returns to be local aggregate variables as well
as gimple registers.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* tree-tailcall.c (process_assignment): Simplify the check for
a valid copy, allowing the source to be a local variable as
well as an SSA name.
(find_tail_calls): Allow copies between local variables to follow
the call.  Allow the result to be stored in any local variable,
even if it's an aggregate.
(eliminate_tail_call): Check whether the result is an SSA name
before updating its SSA_NAME_DEF_STMT.

gcc/testsuite/
* gcc.dg/tree-ssa/tailcall-7.c: New test.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/tailcall-7.c b/gcc/testsuite/gcc.dg/tree-ssa/tailcall-7.c
new file mode 100644
index 000..eabf1a8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/tailcall-7.c
@@ -0,0 +1,89 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-tailc-details" } */
+
+struct s { int x; };
+struct s f (int);
+struct s global;
+void callit (void (*) (void));
+
+/* Tail call.  */
+void
+g1 (void)
+{
+  f (1);
+}
+
+/* Not a tail call.  */
+void
+g2 (void)
+{
+  global = f (2);
+}
+
+/* Not a tail call.  */
+void
+g3 (struct s *ptr)
+{
+  *ptr = f (3);
+}
+
+/* Tail call.  */
+struct s
+g4 (struct s param)
+{
+  param = f (4);
+  return param;
+}
+
+/* Tail call.  */
+struct s
+g5 (void)
+{
+  struct s local = f (5);
+  return local;
+}
+
+/* Tail call.  */
+struct s
+g6 (void)
+{
+  return f (6);
+}
+
+/* Not a tail call.  */
+struct s
+g7 (void)
+{
+  struct s local = f (7);
+  global = local;
+  return local;
+}
+
+/* Not a tail call.  */
+struct s
+g8 (struct s *ptr)
+{
+  struct s local = f (8);
+  *ptr = local;
+  return local;
+}
+
+/* Not a tail call.  */
+int
+g9 (struct s param)
+{
+  void inner (void) { param = f (9); }
+  callit (inner);
+  return 9;
+}
+
+/* Tail call.  */
+int
+g10 (int param)
+{
+  void inner (void) { f (param); }
+  callit (inner);
+  return 10;
+}
+
+/* { dg-final { scan-tree-dump-times "Found tail call" 5 "tailc" } } */
diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
index 0436f0f..f97541d 100644
--- a/gcc/tree-tailcall.c
+++ b/gcc/tree-tailcall.c
@@ -269,7 +269,7 @@ process_assignment (gassign *stmt, gimple_stmt_iterator call, tree *m,
  conversions that can never produce extra code between the function
  call and the function return.  */
   if ((rhs_class == GIMPLE_SINGLE_RHS || gimple_assign_cast_p (stmt))
-  && (TREE_CODE (src_var) == SSA_NAME))
+  && src_var == *ass_var)
 {
   /* Reject a tailcall if the type conversion might need
 additional code.  */
@@ -287,9 +287,6 @@ process_assignment (gassign *stmt, gimple_stmt_iterator call, tree *m,
return false;
}
 
-  if (src_var != *ass_var)
-   return false;
-
   *ass_var = dest;
   return true;
 }
@@ -428,6 +425,13 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
  break;
}
 
+  /* Allow simple copies between local variables, even if they're
+aggregates.  */
+  if (is_gimple_assign (stmt)
+ && auto_var_in_fn_p (gimple_assign_lhs (stmt), cfun->decl)
+ && auto_var_in_fn_p (gimple_assign_rhs1 (stmt), cfun->decl))
+   continue;
+
   /* If the statement references memory or volatile operands, fail.  */
   if (gimple_references_memory_p (stmt)
  || gimple_has_volatile_ops (stmt))
@@ -444,18 +448,20 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
   return;
 }
 
-  /* If the LHS of our call is not just a simple register, we can't
- transform this into a tail or sibling call.  This situation happens,
- in (e.g.) "*p = foo()" where foo returns a struct.  In this case
- we won't have a temporary here, but we need to carry out the side
- effect anyway, so tailcall is impossible.
+  /* If the LHS of our call is not just a simple register or local
+ variable, we can't transform this into a tail or sibling call.
+ This situation happens, in (e.g.) "*p = foo()" where foo returns a
+ struct.  In this case we won't have a temporary here, but we need
+ to carry out the side effect anyway, so tailcall is impossible.
 
  ??? In some situations (when the struct is returned in memory via
  invisible argument) we could deal with this, e.g. by passing 'p'
  itself as that argument to foo, but it's too early to do this here,
  and expand_call() will not handle it anyway.  If it ever can, then
  we need to revisit this here, to allow that situation.  */
-  if (as

Re: [PATCH] Delete GCJ

2016-11-21 Thread Iain Sandoe


> On 20 Nov 2016, at 20:42, Matthias Klose  wrote:
> 
> On 10.10.2016 09:58, Iain Sandoe wrote:
>> 

>> The point here was to simplify the dependent configury so that it only needs 
>> to test something that the configuring user specifies (i.e. if they specify 
>> objc-gc, then they need also to specify the place that the gc lib can be 
>> found).
> 
> So here is the next proposal, I hope the added documentation in install.texi
> makes the usage clear.

thanks for working on this!

> 
> 
> 
> 2016-11-19  Matthias Klose  
> 
>   * Makefile.def: Remove reference to boehm-gc target module.
>   * configure.ac: Include pkg.m4, check for --with-target-bdw-gc
>   options and for the bdw-gc pkg-config module.
>   * configure: Regenerate.
>   * Makefile.in: Regenerate.


+AC_ARG_ENABLE(objc-gc,
+[AS_HELP_STRING([--enable-objc-gc],
+   [enable use of Boehm's garbage collector with the
+GNU Objective-C runtime])])
+AC_ARG_WITH([target-bdw-gc],
+[AS_HELP_STRING([--with-target-bdw-gc=PATHLIST],
+   [specify prefix directory for installed bdw-gc package.
+Equivalent to --with-bdw-gc-include=PATH/include
+plus --with-bdw-gc-lib=PATH/lib])])

missing “target” in the --with-bdw-gc-*

> 
> gcc/
> 
> 2016-11-19  Matthias Klose  
> 
>   * doc/install.texi: Document configure options --enable-objc-gc
>   and --with-target-bdw-gc.

As per Sandra’s comment, should we understand that the priority of options is

--with-target-bdw-gc-*

which overrides…

--with-target-bdw-gc=

which overrides automatic discovery using pkg_config?

Iain

Re: PR61409: -Wmaybe-uninitialized false-positive with -O2

2016-11-21 Thread Christophe Lyon

 Hi,


On 20 November 2016 at 17:36, Aldy Hernandez  wrote:
> On 11/16/2016 03:57 PM, Jeff Law wrote:
>>
>> On 11/02/2016 11:16 AM, Aldy Hernandez wrote:
>>>
>>> Hi Jeff.
>>>
>>> As discussed in the PR, here is a patch exploring your idea of ignoring
>>> unguarded uses if we can prove that the guards for such uses are
>>> invalidated by the uninitialized operand paths being executed.
>>>
>>> This is an updated patch from my suggestion in the PR.  It bootstraps
>>> with no regression on x86-64 Linux, and fixes the PR in question.
>>>
>>> As the "NOTE:" in the code states, we could be much smarter when
>>> invalidating predicates, but for now let's do straight negation which
>>> works for the simple case.  We could expand on this in the future.
>>>
>>> OK for trunk?
>>>
>>>
>>> curr
>>>
>>>
>>> commit 8375d7e28c1a798dd0cc0f487d7fa1068d9eb124
>>> Author: Aldy Hernandez 
>>> Date:   Thu Aug 25 10:44:29 2016 -0400
>>>
>>> PR middle-end/61409
>>> * tree-ssa-uninit.c (use_pred_not_overlap_with_undef_path_pred):
>>> Remove reference to missing NUM_PREDS in function comment.
>>> (can_one_predicate_be_invalidated_p): New.
>>> (can_chain_union_be_invalidated_p): New.
>>> (flatten_out_predicate_chains): New.
>>> (uninit_ops_invalidate_phi_use): New.
>>> (is_use_properly_guarded): Call uninit_ops_invalidate_phi_use.
>>
>> [ snip ]
>>
>>>
>>> +static bool
>>> +can_one_predicate_be_invalidated_p (pred_info predicate,
>>> +vec worklist)
>>> +{
>>> +  for (size_t i = 0; i < worklist.length (); ++i)
>>> +{
>>> +  pred_info *p = worklist[i];
>>> +
>>> +  /* NOTE: This is a very simple check, and only understands an
>>> + exact opposite.  So, [i == 0] is currently only invalidated
>>> + by [.NOT. i == 0] or [i != 0].  Ideally we should also
>>> + invalidate with say [i > 5] or [i == 8].  There is certainly
>>> + room for improvement here.  */
>>> +  if (pred_neg_p (predicate, *p))
>>
>> It's good enough for now.  I saw some other routines that might allow us
>> to handle more cases.  I'm OK with faulting those in if/when we see such
>> cases in real code.
>
>
> Ok.
>
>>
>>
>>> +
>>> +/* Return TRUE if executing the path to some uninitialized operands in
>>> +   a PHI will invalidate the use of the PHI result later on.
>>> +
>>> +   UNINIT_OPNDS is a bit vector specifying which PHI arguments have
>>> +   arguments which are considered uninitialized.
>>> +
>>> +   USE_PREDS is the pred_chain_union specifying the guard conditions
>>> +   for the use of the PHI result.
>>> +
>>> +   What we want to do is disprove each of the guards in the factors of
>>> +   the USE_PREDS.  So if we have:
>>> +
>>> +   # USE_PREDS guards of:
>>> +   #1. i > 5 && i < 100
>>> +   #2. j > 10 && j < 88
>>
>> Are USE_PREDS joined by an AND or IOR?  I guess based on their type it
>> must be IOR.   Thus to get to a use  #1 or #2 must be true.  So to prove
>> we can't reach a use, we have to prove that #1 and #2 are both not true.
>>  Right?
>
>
> IOR, so yes.
>
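
The IOR/AND reasoning above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the types `toy_pred` and `toy_chain` and all function names are hypothetical and are not GCC's pred_info or pred_chain_union, and a predicate counts as invalidated only by its exact negation, as in the NOTE quoted earlier.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not GCC's pred_info.  A predicate compares a variable
   (identified by VAR) with the constant RHS.  */
enum cmp_op { OP_EQ, OP_NE, OP_LT, OP_GE, OP_GT, OP_LE };
struct toy_pred { int var; enum cmp_op op; long rhs; };

/* One AND-joined chain of predicates.  */
struct toy_chain { const struct toy_pred *preds; size_t n; };

/* Return the comparison that is true exactly when OP is false.  */
static enum cmp_op
negate_op (enum cmp_op op)
{
  switch (op)
    {
    case OP_EQ: return OP_NE;
    case OP_NE: return OP_EQ;
    case OP_LT: return OP_GE;
    case OP_GE: return OP_LT;
    case OP_GT: return OP_LE;
    default:    return OP_GT;   /* OP_LE */
    }
}

/* True if B is the exact negation of A (same variable and constant).  */
static bool
is_exact_negation (struct toy_pred a, struct toy_pred b)
{
  return a.var == b.var && a.rhs == b.rhs && b.op == negate_op (a.op);
}

/* An AND chain is false as soon as one member is negated by a known fact.  */
bool
chain_invalidated (struct toy_chain chain,
                   const struct toy_pred *known, size_t n_known)
{
  for (size_t i = 0; i < chain.n; i++)
    for (size_t j = 0; j < n_known; j++)
      if (is_exact_negation (chain.preds[i], known[j]))
        return true;
  return false;
}

/* The guards are IOR-joined, so every chain must be invalidated.  */
bool
union_invalidated (const struct toy_chain *chains, size_t n_chains,
                   const struct toy_pred *known, size_t n_known)
{
  for (size_t i = 0; i < n_chains; i++)
    if (!chain_invalidated (chains[i], known, n_known))
      return false;
  return true;
}
```

For the two guards in the comment (i > 5 && i < 100, j > 10 && j < 88), knowing i <= 5 and j >= 88 invalidates the whole union, while knowing only i <= 5 leaves the second guard reachable.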
>>
>>
>>> +
>>> +static bool
>>> +uninit_ops_invalidate_phi_use (gphi *phi, unsigned uninit_opnds,
>>> +   pred_chain_union use_preds)
>>> +{
>>> +  /* Look for the control dependencies of all the uninitialized
>>> + operands and build predicates describing them.  */
>>> +  unsigned i;
>>> +  pred_chain_union uninit_preds[32];
>>> +  memset (uninit_preds, 0, sizeof (pred_chain_union) * 32);
>>> +  for (i = 0; i < MIN (32, gimple_phi_num_args (phi)); i++)
>>
>> Can you replace the magic "32" with a file scoped const or #define?  I
>> believe there's 2 existing uses of a magic "32" elsewhere in
>> tree-ssa-uninit.c as well.
>
>
> Done.
>
>>
>>> +
>>> +  /* Build the control dependency chain for `i'...  */
>>> +  if (compute_control_dep_chain (find_dom (e->src),
>>> + e->src,
>>> + dep_chains,
>>> + &num_chains,
>>> + &cur_chain,
>>> + &num_calls))
>>
>> Does this miss the control dependency carried by E itself?
>>
>> ie, if e->src ends in a conditional, shouldn't that conditional be
>> reflected in the control dependency chain as well?  I guess we'd have to
>> have the notion of computing the control dependency for an edge rather
>> than a block.  It doesn't look like compute_control_dep_chain is ready
>> for that.  I'm willing to put that into a "future work" bucket.
>
>
> Hmmm, probably not.  So yeah, let's put that in the future work bucket :).
>
>>
>> So I think just confirming my question about how USE_PREDS are joined at
>> the call to uninit_ops_invalidate_phi_use and fixing the magic 32 to be
>> a file scoped const or a #define and this is good to go on the trunk.
>
>
> I've retested with no regressions on x86-64 Linux.
>
> OK for trunk?
>
>

Since this commit (r242639), I've noticed regressions on arm targets:

  - PASS now FAIL [

Re: Fix PR78154

2016-11-21 Thread Richard Biener

On Fri, 18 Nov 2016, Prathamesh Kulkarni wrote:

> On 17 November 2016 at 15:24, Richard Biener  wrote:
> > On Thu, 17 Nov 2016, Prathamesh Kulkarni wrote:
> >
> >> On 17 November 2016 at 14:21, Richard Biener  wrote:
> >> > On Thu, 17 Nov 2016, Prathamesh Kulkarni wrote:
> >> >
> >> >> Hi Richard,
> >> >> Following your suggestion in PR78154, the patch checks if stmt
> >> >> contains call to memmove (and friends) in gimple_stmt_nonzero_warnv_p
> >> >> and returns true in that case.
> >> >>
> >> >> Bootstrapped+tested on x86_64-unknown-linux-gnu.
> >> >> Cross-testing on arm*-*-*, aarch64*-*-* in progress.
> >> >> Would it be OK to commit this patch in stage-3 ?
> >> >
> >> > As people noted we have returns_nonnull for this and that is already
> >> > checked.  So please make sure the builtins get this attribute instead.
> >> OK thanks, I will add the returns_nonnull attribute to the required
> >> string builtins.
> >> I noticed some of the string builtins don't have RET1 in builtins.def:
> >> strcat, strncpy, strncat have ATTR_NOTHROW_NONNULL_LEAF.
> >> Should they instead be having ATTR_RET1_NOTHROW_NONNULL_LEAF similar
> >> to entries for memmove, strcpy ?
> >
> > Yes, I think so.
> Hi,
> In the attached patch I added the returns_nonnull attribute to
> ATTR_RET1_NOTHROW_NONNULL_LEAF,
> and changed a few builtins like strcat, strncpy, strncat and the
> corresponding _chk builtins to use ATTR_RET1_NOTHROW_NONNULL_LEAF.
> Does the patch look correct?

Hmm, the fact that you only change ATTR_RET1_NOTHROW_NONNULL_LEAF means that
the gimple_stmt_nonzero_warnv_p code is incomplete -- it should
infer returns_nonnull itself from RET1 (which is fnspec("1") basically)
and the nonnull attribute on the argument.  So

  unsigned rf = gimple_call_return_flags (stmt);
  if (rf & ERF_RETURNS_ARG)
   {
 tree arg = gimple_call_arg (stmt, rf & ERF_RETURN_ARG_MASK);
 if (range of arg is ! VARYING)
   use range of arg;
 else if (infer_nonnull_range_by_attribute (stmt, arg))
... nonnull ...
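
As an illustration of the kind of source this inference targets, here is a small test-style sketch. It is not taken from the patch, and `append_checked` is a hypothetical name. It relies only on the C library guarantee that strcat returns its first argument; whether a given GCC version actually removes the check is an assumption, not something this thread confirms.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical test-style function, not taken from the patch.  strcat
   returns its first argument, and DEST must be non-null for the call to
   be valid, so the null check below can never fire.  A compiler that
   combines "returns argument 1" with "argument 1 is nonnull" could drop
   the check and the call to abort.  */
char *
append_checked (char *dest, const char *src)
{
  char *result = strcat (dest, src);
  if (result == NULL)
    abort ();
  return result;
}
```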

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH PR78114]Refine gfortran.dg/vect/fast-math-mgrid-resid.f

2016-11-21 Thread Richard Biener

On Fri, Nov 18, 2016 at 5:57 PM, Bin.Cheng  wrote:
> On Fri, Nov 18, 2016 at 4:52 PM, Michael Matz  wrote:
>> Hi,
>>
>> On Thu, 17 Nov 2016, Bin.Cheng wrote:
>>
>>> B) Depending on ilp, I think the test strings below have been failing for a
>>> long time with haswell:
>>> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
>>> without unrolling" 1 "pcom" { target lp64 } } }
>>> ! { dg-final { scan-tree-dump-times "Executing predictive commoning
>>> without unrolling" 2 "pcom" { target ia32 } } }
>>> Because the vectorizer chooses vf==4 in this case, and there are no
>>> predictive commoning opportunities at all.
>>> Also the newly added test string fails in this case too because the
>>> peeled prolog iterates more than once.
>>
>> Btw, this probably means that on haswell (or other archs with vf==4) mgrid
>> is slower than necessary.  On mgrid you really really want predictive
>> commoning to happen.  Vectorization isn't that interesting there.
> Interesting, I will check if there is a difference between vf 2 and 4.  We
> do have cases where a smaller vf is better and should be chosen, though
> for different reasons.

At some time in the past we had predictive commoning done before
vectorization (GCC 4.3 at least).

Patch is ok meanwhile.

Richard.

> Thanks,
> bin
>>
>>
>> Ciao,
>> Michael.

Re: [fixincludes] Fix macOS 10.12 and (PR sanitizer/78267)

2016-11-21 Thread Rainer Orth

Hi Bruce,

> On Fri, Nov 18, 2016 at 9:42 AM, Mike Stump  wrote:
>> On Nov 18, 2016, at 2:45 AM, Rainer Orth  
>> wrote:
>>> So the current suggestion is to combine my fixincludes patch and Jack's
>>> patch to disable  use if !__BLOCKS__.
>>
>>> I guess this is ok for mainline now to restore bootstrap?
>>
>> I think we are down to everyone likes this, Ok.  The big question is,
>> does this need a back port?
>
> My thinking on that subject is that since include fixes do not directly affect
> the compiler and, instead, facilitate its use on various platforms, it
> is therefore
> reasonably safe to back port.  Especially if adequate guards (selection tests)
> are included in the fix to prevent it from taking action on the wrong files.
> But I also think it is really a "steering committee" decision.
>
> For me, I think it is safe enough.

Agreed: especially for the WIP 10.10 and 10.11 fixes I've found
instances where people had stumbled across incompatibilities in contexts
other than libsanitizer.  So the fixes would improve their user
experience regardless :-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [fixincludes] Fix macOS 10.12 and (PR sanitizer/78267)

2016-11-21 Thread Rainer Orth

Hi Mike,

> On Nov 18, 2016, at 2:45 AM, Rainer Orth  
> wrote:
>> So the current suggestion is to combine my fixincludes patch and Jack's
>> patch to disable  use if !__BLOCKS__.
>
>> I guess this is ok for mainline now to restore bootstrap?
>
> I think we are down to everyone likes this, Ok.  The big question is, does
> this need a back port?

while they are not necessary to fix the libsanitizer bootstrap failure,
they fix genuine header problems, so I think they're desirable.  My plan
has been to look into the problems discovered in 10.10 and 10.11 headers
while developing this one, get them into mainline as time permits and
backport the whole bunch afterwards.

> If you fix includes with virtual members or data members of C/C++ classes,
> just be careful about disappearing content, as that can change the layout of
> the structure or class.

Understood.  So far, the fixes have just removed attributes not
supported by GCC, or types using such extensions and function
declarations using them.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions

2016-11-21 Thread James Greenhalgh

On Fri, Nov 18, 2016 at 07:41:58AM +, Michael Collison wrote:
> James,
> 
> I incorporated all your suggestions, and successfully bootstrapped and re-ran
> the testsuite.
> 
> Okay for trunk?
> 
> 2016-11-18  Michael Collison  
> 
>   * config/aarch64/aarch64-protos.h
>   (aarch64_and_split_imm1, aarch64_and_split_imm2)
>   (aarch64_and_bitmask_imm): New prototypes
>   * config/aarch64/aarch64.c (aarch64_and_split_imm1):
>   New overloaded function to create bit mask covering the
>   lowest to highest bits set.
>   (aarch64_and_split_imm2): New overloaded functions to create bit
>   mask of zeros between first and last bit set.
>   (aarch64_and_bitmask_imm): New function to determine if an integer
>   is a valid two instruction "and" operation.
>   * config/aarch64/aarch64.md:(and3): New define_insn and _split
>   allowing wider range of constants with "and" operations.
>   * (ior3, xor3): Use new LOGICAL2 iterator to prevent
>   "and" operator from matching restricted constant range used for
>   ior and xor operators.
>   * config/aarch64/constraints.md (UsO constraint): New SImode constraint
>   for constants in "and" operations.
>   (UsP constraint): New DImode constraint for constants in "and" operations.
>   * config/aarch64/iterators.md (lconst2): New mode iterator.
>   (LOGICAL2): New code iterator.
>   * config/aarch64/predicates.md (aarch64_logical_and_immediate): New
>   predicate
>   (aarch64_logical_and_operand): New predicate allowing extended constants
>   for "and" operations.
>   * testsuite/gcc.target/aarch64/and_const.c: New test to verify
>   additional constants are recognized and fewer instructions generated.
>   * testsuite/gcc.target/aarch64/and_const2.c: New test to verify
>   additional constants are recognized and fewer instructions generated.
> 

> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index 3cdd69b..7ef8cdf 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params;
>  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
>  int aarch64_get_condition_code (rtx);
>  bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
> +unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
> +bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
>  int aarch64_branch_cost (bool, bool);
>  enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
>  bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 3e663eb..8e33c40 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3600,6 +3600,43 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
>return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26];
>  }
>  
> +/* Create mask of ones, covering the lowest to highest bits set in VAL_IN.  */
> +
> +unsigned HOST_WIDE_INT
> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in)
> +{
> +  int lowest_bit_set = ctz_hwi (val_in);
> +  int highest_bit_set = floor_log2 (val_in);
> +  gcc_assert (val_in != 0);

The comment above the function should make this precondition clear. Or you
could pick a well defined behaviour for when val_in == 0 (return 0?), if
that fits the rest of the algorithm.
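
To illustrate the second option, here is a minimal standalone sketch of a mask builder that returns 0 for a zero input instead of asserting. It is not the code from the patch: the name and_split_imm1_sketch is hypothetical, uint64_t stands in for HOST_WIDE_INT, and plain loops stand in for ctz_hwi and floor_log2.

```c
#include <stdint.h>

/* Hypothetical sketch, not the patch's implementation.  Returns a mask
   of ones covering the lowest through the highest bit set in VAL, and
   returns 0 when VAL is 0 so the behaviour is defined for every input.  */
uint64_t
and_split_imm1_sketch (uint64_t val)
{
  if (val == 0)
    return 0;

  /* Index of the lowest set bit (stand-in for ctz_hwi).  */
  int lowest = 0;
  while (((val >> lowest) & 1) == 0)
    lowest++;

  /* Index of the highest set bit (stand-in for floor_log2).  */
  int highest = 63;
  while (((val >> highest) & 1) == 0)
    highest--;

  /* 2 << highest wraps to 0 when highest is 63; unsigned arithmetic
     makes the subtraction still yield the intended mask.  */
  return ((uint64_t) 2 << highest) - ((uint64_t) 1 << lowest);
}
```

Whether returning 0 is the right choice depends on how the callers use the result, which is why documenting the precondition is the other reasonable route.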

Otherwise, this patch looks OK to me.

Thanks,
James

Re: [PATCH] Fix PR78413

2016-11-21 Thread Richard Biener

On Fri, Nov 18, 2016 at 5:27 PM, Bill Schmidt
 wrote:
> Hi,
>
> The if-conversion patch for PR77848 missed a case where an outer loop
> should not be versioned for vectorization; this was caught by an assert
> in tests recorded in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78413.
> This patch fixes the problem by ensuring that both inner and outer loop
> latches have a single predecessor before versioning an outer loop.
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions.  Is this ok for trunk?

Ok (with the suggested testcase adjustment).

Maybe as a followup we can factor out a common predicate usable by the
vectorizer and if-conversion.  I'll see if I can do that.
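
A rough sketch of what the latch part of such a shared predicate could look like is below. It uses toy types (toy_block, toy_loop) and a hypothetical function name rather than GCC's basic_block, struct loop and single_pred_p, so it only shows the shape of the check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GCC's CFG types; all names here are hypothetical.  */
struct toy_block
{
  unsigned n_preds;		/* Number of predecessor edges.  */
};

struct toy_loop
{
  struct toy_block *latch;	/* Latch block of this loop.  */
  struct toy_loop *inner;	/* First inner loop, or NULL.  */
};

/* Return true if OUTER has an inner loop and both the outer and the
   inner loop latches have exactly one predecessor.  */
bool
toy_latches_have_single_pred (const struct toy_loop *outer)
{
  if (outer == NULL || outer->inner == NULL)
    return false;
  if (outer->latch == NULL || outer->inner->latch == NULL)
    return false;
  return outer->latch->n_preds == 1 && outer->inner->latch->n_preds == 1;
}
```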

Thanks,
Richard.

> Thanks,
> Bill
>
>
> [gcc]
>
> 2016-11-18  Bill Schmidt  
>
> PR tree-optimization/78413
> * tree-if-conv.c (versionable_outer_loop_p): Require that both
> inner and outer loop latches have single predecessors.
>
> [gcc/testsuite]
>
> 2016-11-18  Bill Schmidt  
>
> PR tree-optimization/78413
> * gcc.dg/tree-ssa/pr78413.c: New test.
>
>
> Index: gcc/testsuite/gcc.dg/tree-ssa/pr78413.c
> ===
> --- gcc/testsuite/gcc.dg/tree-ssa/pr78413.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr78413.c (working copy)
> @@ -0,0 +1,35 @@
> +/* PR78413.  These previously failed in tree if-conversion due to a loop
> +   latch with multiple predecessors that the code did not anticipate.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -ffast-math" } */
> +
> +extern long long int llrint(double x);
> +int a;
> +double b;
> +__attribute__((cold)) void decode_init() {
> +  int c, d = 0;
> +  for (; d < 12; d++) {
> +if (d)
> +  b = 0;
> +c = 0;
> +for (; c < 6; c++)
> +  a = b ? llrint(b) : 0;
> +  }
> +}
> +
> +struct S {
> +  _Bool bo;
> +};
> +int a, bb, c, d;
> +void fn1() {
> +  do
> +do
> +  do {
> +   struct S *e = (struct S *)1;
> +   do
> + bb = a / (e->bo ? 2 : 1);
> +   while (bb);
> +  } while (0);
> +while (d);
> +  while (c);
> +}
> Index: gcc/tree-if-conv.c
> ===
> --- gcc/tree-if-conv.c  (revision 242551)
> +++ gcc/tree-if-conv.c  (working copy)
> @@ -2575,6 +2575,8 @@ version_loop_for_if_conversion (struct loop *loop)
>  - The loop has a single exit.
>  - The loop header has a single successor, which is the inner
>loop header.
> +- Each of the inner and outer loop latches have a single
> +  predecessor.
>  - The loop exit block has a single predecessor, which is the
>inner loop's exit block.  */
>
> @@ -2586,7 +2588,9 @@ versionable_outer_loop_p (struct loop *loop)
>|| loop->inner->next
>|| !single_exit (loop)
>|| !single_succ_p (loop->header)
> -  || single_succ (loop->header) != loop->inner->header)
> +  || single_succ (loop->header) != loop->inner->header
> +  || !single_pred_p (loop->latch)
> +  || !single_pred_p (loop->inner->latch))
>  return false;
>
>basic_block outer_exit = single_pred (loop->latch);
>

Re: [fixincludes, v3] Don't define libstdc++-internal macros in Solaris 10+

2016-11-21 Thread Rainer Orth

Hi Jonathan,

>>Ok for mainline now and the backports after some soak time?
>
> Yes, the libstdc++ parts are OK, thanks.

I assume Bruce is ok with the change to the hpux11_fabsf fix given that it
was suggested by the HP-UX maintainer and fixes fixincludes make check ;-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [wwwdocs][PATCH][ARM] Add -mpure-code to changes.html

2016-11-21 Thread Kyrill Tkachov



On 21/11/16 09:40, Andre Vieira (lists) wrote:

Hi,

I added the description of the new ARM -mpure-code option to changes.html.

Is this OK?

Cheers,
Andre


Ok.
Thanks,
Kyrill

[wwwdocs][PATCH][ARM] Add -mpure-code to changes.html

2016-11-21 Thread Andre Vieira (lists)

Hi,

I added the description of the new ARM -mpure-code option to changes.html.

Is this OK?

Cheers,
Andre
Index: htdocs/gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.24
diff -u -r1.24 changes.html
--- htdocs/gcc-7/changes.html	9 Nov 2016 14:28:59 -	1.24
+++ htdocs/gcc-7/changes.html	17 Nov 2016 09:31:22 -
@@ -330,6 +330,12 @@
-mcpu=cortex-m33 and -mtune=cortex-m33
options.
  
+ 
+   A new command line option -mpure-code has been added.
+   It does not allow constant data to be placed in code sections.
+   This option is only available when generating non-pic code for ARMv7-M
+   targets.
+ 

 
 AVR

Re: PR78153

2016-11-21 Thread Richard Biener

On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:

> Hi,
> As suggested by Martin in PR78153 strlen's return value cannot exceed
> PTRDIFF_MAX.
> So I set its range to [0, PTRDIFF_MAX - 1] in extract_range_basic()
> in the attached patch.
> 
> However it regressed strlenopt-3.c:
> 
> Consider fn1() from strlenopt-3.c:
> 
> __attribute__((noinline, noclone)) size_t
> fn1 (char *p, char *q)
> {
>   size_t s = strlen (q);
>   strcpy (p, q);
>   return s - strlen (p);
> }
> 
> The optimized dump shows the following:
> 
> __attribute__((noclone, noinline))
> fn1 (char * p, char * q)
> {
>   size_t s;
>   size_t _7;
>   long unsigned int _9;
> 
>   :
>   s_4 = strlen (q_3(D));
>   _9 = s_4 + 1;
>   __builtin_memcpy (p_5(D), q_3(D), _9);
>   _7 = 0;
>   return _7;
> 
> }
> 
> which introduces the regression, because the test expects "return 0;" in fn1().
> 
> The issue seems to be in vrp2:
> 
> Before the patch:
> Visiting statement:
> s_4 = strlen (q_3(D));
> Found new range for s_4: VARYING
> 
> Visiting statement:
> _1 = s_4;
> Found new range for _1: [s_4, s_4]
> marking stmt to be not simulated again
> 
> Visiting statement:
> _7 = s_4 - _1;
> Applying pattern match.pd:111, gimple-match.c:27997
> Match-and-simplified s_4 - _1 to 0
> Intersecting
>   [0, 0]
> and
>   [0, +INF]
> to
>   [0, 0]
> Found new range for _7: [0, 0]
> 
> __attribute__((noclone, noinline))
> fn1 (char * p, char * q)
> {
>   size_t s;
>   long unsigned int _1;
>   long unsigned int _9;
> 
>   :
>   s_4 = strlen (q_3(D));
>   _9 = s_4 + 1;
>   __builtin_memcpy (p_5(D), q_3(D), _9);
>   _1 = s_4;
>   return 0;
> 
> }
> 
> 
> After the patch:
> Visiting statement:
> s_4 = strlen (q_3(D));
> Intersecting
>   [0, 9223372036854775806]
> and
>   [0, 9223372036854775806]
> to
>   [0, 9223372036854775806]
> Found new range for s_4: [0, 9223372036854775806]
> marking stmt to be not simulated again
> 
> Visiting statement:
> _1 = s_4;
> Intersecting
>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
> and
>   [0, 9223372036854775806]
> to
>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
> Found new range for _1: [0, 9223372036854775806]
> marking stmt to be not simulated again
> 
> Visiting statement:
> _7 = s_4 - _1;
> Intersecting
>   ~[9223372036854775807, 9223372036854775809]
> and
>   ~[9223372036854775807, 9223372036854775809]
> to
>   ~[9223372036854775807, 9223372036854775809]
> Found new range for _7: ~[9223372036854775807, 9223372036854775809]
> marking stmt to be not simulated again
> 
> __attribute__((noclone, noinline))
> fn1 (char * p, char * q)
> {
>   size_t s;
>   long unsigned int _1;
>   size_t _7;
>   long unsigned int _9;
> 
>   :
>   s_4 = strlen (q_3(D));
>   _9 = s_4 + 1;
>   __builtin_memcpy (p_5(D), q_3(D), _9);
>   _1 = s_4;
>   _7 = s_4 - _1;
>   return _7;
> 
> }
> 
> Then forwprop4 turns
> _1 = s_4
> _7 = s_4 - _1
> into
> _7 = 0
> 
> and we end up with:
> _7 = 0
> return _7
> in optimized dump.
> 
> Running ccp again after forwprop4 trivially solves the issue, however
> I am not sure if we want to run ccp again ?
> 
> The issue is probably with extract_range_from_ssa_name():
> For _1 = s_4
> 
> Before patch:
> VR for s_4 is set to varying.
> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.
> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,
> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using
> match.pd pattern x - x -> 0).
> 
> After patch:
> VR for s_4 is set to [0, PTRDIFF_MAX - 1]
> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]
> so IIUC, we then lose the information that _1 is equal to s_4,

We don't lose it, it's in its set of equivalencies.

> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.
> forwprop4 does that because it sees that s_4 and _1 are equivalent.
> Does this sound correct ?

Yes.  So the issue is really that vrp_visit_assignment_or_call calls
gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when
we do not have a singleton VR_RANGE does not fall back to looking
at equivalences (there's not a good cheap way to do that currently because
VRP doesn't keep a proper copy lattice but simply IORs equivalences
from all equivalences).  In theory simply using the first set bit
might work.  Thus sth like

@@ -7057,6 +7030,12 @@ vrp_valueize (tree name)
  || is_gimple_min_invariant (vr->min))
  && vrp_operand_equal_p (vr->min, vr->max))
return vr->min;
+  else if (vr->equiv && ! bitmap_empty_p (vr->equiv))
+   {
+ unsigned num = bitmap_first_set_bit (vr->equiv);
+ if (num < SSA_NAME_VERSION (name))
+   return ssa_name (num);
+   }
 }
   return name;
 }

might work with the idea of simply doing canonicalization to one of
the equivalences.  But as we don't allow copies in the SSA def stmt
(via vrp_valueize_1) I'm not sure that's good enough canonicalization.
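
The canonicalization idea can be shown in isolation with a toy model. In the sketch below a 64-bit word stands in for the vr->equiv bitmap and plain integers stand in for SSA name versions; canonical_version is a hypothetical name, not a GCC function.

```c
#include <stdint.h>

/* Toy model of "valueize to the first set bit of the equivalence set".
   EQUIV has bit N set when SSA version N is known to be equivalent to
   VERSION.  Only versions 0..63 are representable in this sketch.  */
unsigned
canonical_version (unsigned version, uint64_t equiv)
{
  if (equiv == 0)
    return version;		/* No equivalences recorded.  */

  /* Stand-in for bitmap_first_set_bit.  */
  unsigned first = 0;
  while (((equiv >> first) & 1) == 0)
    first++;

  /* Only move towards lower-numbered names, so that every member of
     the set picks the same representative.  */
  return first < version ? first : version;
}
```

With s_4 and _1 equivalent, the name with the higher version maps to the lower one, so both operands of the subtraction become the same name and x - x can fold.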

Richard.
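
As an aside on the quoted vrp2 dump: the anti-range ~[9223372036854775807, 9223372036854775809] computed for _7 follows from plain unsigned arithmetic on two operands that both lie in [0, PTRDIFF_MAX - 1]. The sketch below checks that by hand. It assumes a 64-bit size_t and ptrdiff_t as in the dump, and the names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound of the strlen range from the dump: PTRDIFF_MAX - 1 on a
   target with 64-bit ptrdiff_t (an assumption of this sketch).  */
#define STRLEN_MAX_SKETCH ((uint64_t) INT64_MAX - 1)

/* Return true if D can be the wrapped 64-bit result of a - b for some
   a and b in [0, STRLEN_MAX_SKETCH].  Non-negative differences give
   [0, M]; negative ones wrap around to [2^64 - M, 2^64 - 1].  */
bool
difference_is_possible (uint64_t d)
{
  return d <= STRLEN_MAX_SKETCH
	 || d >= (uint64_t) 0 - STRLEN_MAX_SKETCH;
}
```

The only values for which this returns false are 2^63 - 1 through 2^63 + 1, which is exactly the excluded interval in the dump. Range arithmetic alone therefore cannot prove the result is 0; that needs the equivalence between s_4 and _1.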

Re: [PATCH][wwwdocs] Document new arm/aarch64 CPU support in a more compact way

2016-11-21 Thread James Greenhalgh

On Fri, Nov 18, 2016 at 10:27:10AM +, Kyrill Tkachov wrote:
> Hi all,
> 
> This patch brings the new CPU support announcements in line with the format
> used in the GCC 6 notes.  That is, rather than have a separate "The
>  is now supported via the..." entry for each new core just list
> them and give a use example with the -mcpu,-mtune options.
> 
> Ok to commit?

Looks good to me, for what it is worth.

Thanks,
James

> Index: htdocs/gcc-7/changes.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
> retrieving revision 1.24
> diff -U 3 -r1.24 changes.html
> --- htdocs/gcc-7/changes.html 9 Nov 2016 14:28:59 -   1.24
> +++ htdocs/gcc-7/changes.html 16 Nov 2016 10:23:04 -
> @@ -287,14 +287,14 @@
> processing floating-point instructions.
>   
>   
> -   The ARM Cortex-A73 processor is now supported via the
> -   -mcpu=cortex-a73 and -mtune=cortex-a73
> -   options as well as the equivalent target attributes and pragmas.
> - 
> - 
> -   The Broadcom Vulcan processor is now supported via the
> -   -mcpu=vulcan and -mtune=vulcan options as
> -   well as the equivalent target attributes and pragmas.
> +   Support has been added for the following processors
> +   (GCC identifiers in parentheses): ARM Cortex-A73
> +   (cortex-a73) and Broadcom Vulcan (vulcan).
> +   The GCC identifiers can be used
> +   as arguments to the -mcpu or -mtune options,
> +   for example: -mcpu=cortex-a73 or
> +   -mtune=vulcan or as arguments to the equivalent target
> +   attributes and pragmas.
>   
> 
>  
> @@ -316,19 +316,14 @@
> armv8-m.main+dsp options.
>   
>   
> -   The ARM Cortex-A73 processor is now supported via the
> -   -mcpu=cortex-a73 and -mtune=cortex-a73
> -   options.
> - 
> - 
> -   The ARM Cortex-M23 processor is now supported via the
> -   -mcpu=cortex-m23 and -mtune=cortex-m23
> -   options.
> - 
> - 
> -   The ARM Cortex-M33 processor is now supported via the
> -   -mcpu=cortex-m33 and -mtune=cortex-m33
> -   options.
> +   Support has been added for the following processors
> +   (GCC identifiers in parentheses): ARM Cortex-A73
> +   (cortex-a73), ARM Cortex-M23 (cortex-m23) and
> +   ARM Cortex-M33 (cortex-m33).
> +   The GCC identifiers can be used
> +   as arguments to the -mcpu or -mtune options,
> +   for example: -mcpu=cortex-a73 or
> +   -mtune=cortex-m33.
>   
> 
>

Re: [PATCH] shrink-wrap: Fix problem with DF checking (PR78400)

2016-11-21 Thread Richard Biener

On Sat, Nov 19, 2016 at 12:49 PM, Segher Boessenkool
 wrote:
> With my previous patch the compiler ICEs if you use --enable-checking=df.
> This patch fixes it, by calling df_update_entry_exit_and_calls instead of
> df_update_entry_block_defs and df_update_exit_block_uses.
>
> Bootstrapped and checked on powerpc64-linux (also with --enable-checking=df).
> Is this okay for trunk?  Thanks,

Ok.

Richard.

>
> Segher
>
>
> 2016-11-19  Segher Boessenkool  
>
> PR rtl-optimization/78400
> * shrink-wrap.c (try_shrink_wrapping_separate): Call
> df_update_entry_exit_and_calls instead of df_update_entry_block_defs
> and df_update_exit_block_uses.
>
> ---
>  gcc/shrink-wrap.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
> index f838696..59feca1 100644
> --- a/gcc/shrink-wrap.c
> +++ b/gcc/shrink-wrap.c
> @@ -1798,8 +1798,7 @@ try_shrink_wrapping_separate (basic_block first_bb)
>   the register for that component is in the IN or GEN or KILL set for
>   that block.  */
>df_scan->local_flags |= DF_SCAN_EMPTY_ENTRY_EXIT;
> -  df_update_entry_block_defs ();
> -  df_update_exit_block_uses ();
> +  df_update_entry_exit_and_calls ();
>df_live_add_problem ();
>df_live_set_all_dirty ();
>df_analyze ();
> @@ -1867,8 +1866,7 @@ try_shrink_wrapping_separate (basic_block first_bb)
>
>/* All done.  */
>df_scan->local_flags &= ~DF_SCAN_EMPTY_ENTRY_EXIT;
> -  df_update_entry_block_defs ();
> -  df_update_exit_block_uses ();
> +  df_update_entry_exit_and_calls ();
>df_live_set_all_dirty ();
>df_analyze ();
>  }
> --
> 1.9.3
>

Re: [Patch v4 0/17] Add support for _Float16 to AArch64 and ARM

2016-11-21 Thread Kyrill Tkachov



On 18/11/16 18:19, James Greenhalgh wrote:

On Fri, Nov 11, 2016 at 03:37:17PM +, James Greenhalgh wrote:

Hi,

This patch set enables the _Float16 type specified in ISO/IEC TS 18661-3
for AArch64 and ARM. The patch set has been posted over the past two months,
with many of the target-independent changes approved. I'm reposting it in
entirety in the form I hope to commit it to trunk.

The patch set can be roughly split in three; first, hookization of
TARGET_FLT_EVAL_METHOD, and changes to the excess precision logic in the
compiler to handle the new values for FLT_EVAL_METHOD defined in
ISO/IEC TS-18661-3. Second, the AArch64 changes required to enable _Float16,
and finally the ARM changes required to enable _Float16.

The broad goals and an outline of each patch in the patch set were
described in https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02383.html .
As compared to the original submission, the patch set has grown an ARM
port, and has had several rounds of technical review on the target
independent aspects.

This has resulted in many of the patches already being approved; a full
summary of the status of each ticket is immediately below.

Clearly the focus for review of this patch set now needs to be the AArch64
and ARM ports; I hope the appropriate maintainers will be able to do so in
time for the patch set to be accepted for GCC 7.

I've built and tested the full patch set on ARM (cross and native),
AArch64 (cross and native) and x86_64 (native) with no identified issues.

All the target independent changes are now approved, and all the ARM patches
have been approved (two are conditional on minor changes).

I'd like to apply the target independent and ARM changes to trunk, while I
wait for approval of the AArch64 changes. The changes for the two ports should
be independent. Would that be acceptable, or would you prefer me to wait
for review of the AArch64 changes?


That's fine with me (I'd like to start getting the new features in to trunk so
they can start getting the stage 3 testing).

Kyrill


I will then send a follow-up patch for doc/extend.texi detailing the
availability of _Float16 on ARM (I'm holding off on doing this until I know
which order the ARM and AArch64 parts will go in).

Thanks,
James


--
Target independent changes

10 patches, 9 previously approved, 1 New implementing testsuite
changes to enable _Float16 tests in more circumstances on ARM.
--

[Patch 1/17] Add a new target hook for describing excess precision intentions

   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00781.html

[Patch 2/17] Implement TARGET_C_EXCESS_PRECISION for i386

   Blanket approved by Jeff in:
 https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02402.html

[Patch 3/17] Implement TARGET_C_EXCESS_PRECISION for s390

   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01554.html

[Patch 4/17] Implement TARGET_C_EXCESS_PRECISION for m68k

   Blanket approved by Jeff in:
 https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02402.html
   And by Andreas: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02414.html

   There was a typo in the original patch, fixed in:
 https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01173.html
   which I would apply as an "obvious" fix to the original patch.

[Patch 5/17] Add -fpermitted-flt-eval-methods=[c11|ts-18661-3]

   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02405.html

   Joseph had a comment in
   https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00335.html that the tests
   should check FLT_EVAL_METHOD from <float.h> rather than
   __FLT_EVAL_METHOD__. Rather than implement that suggestion, I added tests
   to patch 6 which tested the <float.h> macro, and left the tests in this
   patch testing the internal macro.

[Patch 6/17] Migrate excess precision logic to use TARGET_EXCESS_PRECISION

   Approved (after removing a rebase bug):
   https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00231.html

[Patch 7/17] Delete TARGET_FLT_EVAL_METHOD and poison it.

   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02401.html

[Patch 8/17] Make _Float16 available if HFmode is available

   Approved: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02403.html

[Patch libgcc 9/17] Update soft-fp from glibc

   Self approved under policy that we can update libraries which GCC mirrors
   without further approval.

[Patch testsuite patch 10/17] Add options for floatN when checking effective target for support

   NEW!


AArch64 changes

3 patches, none reviewed


[Patch AArch64 11/17] Add floatdihf2 and floatunsdihf2 patterns

   Not reviewed, last pinged (^6):
   https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00584.html

[Patch libgcc AArch64 12/17] Enable hfmode soft-float conversions and truncations

   Not reviewed:
   https://gcc.gnu.org/ml/gcc-patches/2016-09/msg02395.html

[Patch AArch64 13/17] Enable _Float16 for AArch64

   Not reviewed:
   https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01176.html

---

Re: [PATCH, GCC/ARM, ping] Optional -mthumb for Thumb only targets

2016-11-21 Thread Christophe Lyon

Hi Thomas,

On 18 November 2016 at 17:51, Thomas Preudhomme
 wrote:
> On 11/11/16 14:35, Kyrill Tkachov wrote:
>>
>>
>> On 08/11/16 13:36, Thomas Preudhomme wrote:
>>>
>>> Ping?
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>> On 25/10/16 18:07, Thomas Preudhomme wrote:

 Hi,

 Currently when a user compiles for a thumb-only target (such as Cortex-M
 processors) without specifying the -mthumb option GCC throws the error
 "target
 CPU does not support ARM mode". This is suboptimal from a usability
 point of
 view: the -mthumb could be deduced from the -march or -mcpu option when
 there is
 no ambiguity.

 This patch implements this behavior by extending the DRIVER_SELF_SPECS
 to
 automatically append -mthumb to the command line for thumb-only targets.
 It does
 so by checking the last -march option if any is given or the last -mcpu
 option
 otherwise. There is no ordering issue because conflicting -mcpu and
 -march is
 already handled.

 Note that the logic cannot be implemented in function
 arm_option_override
 because we need to provide the modified command line to the GCC driver
 for
 finding the right multilib path and the function arm_option_override is
 executed
 too late for that effect.
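
 The selection rule described above (use the last -march if one is given,
 otherwise the last -mcpu) can be sketched outside the driver as follows.
 This is only an illustration: the real patch works through spec functions
 and the arm_arch_core_flags table, whereas this sketch uses a small
 hard-coded name list and a hypothetical function name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A few Thumb-only architecture and CPU names, hard-coded for the
   sketch; the real driver derives this from its feature tables.  */
static const char *const thumb_only_names[] = {
  "armv6-m", "armv7-m", "armv7e-m", "cortex-m0", "cortex-m3", "cortex-m4"
};

static bool
is_thumb_only_name (const char *name)
{
  size_t n = sizeof thumb_only_names / sizeof thumb_only_names[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (name, thumb_only_names[i]) == 0)
      return true;
  return false;
}

/* Hypothetical helper: return true if -mthumb should be appended to
   the command line.  The last -march= wins if any is present;
   otherwise the last -mcpu= is consulted.  */
bool
needs_implicit_mthumb (int argc, const char *const *argv)
{
  const char *last_march = NULL;
  const char *last_mcpu = NULL;

  for (int i = 0; i < argc; i++)
    {
      if (strncmp (argv[i], "-march=", 7) == 0)
	last_march = argv[i] + 7;
      else if (strncmp (argv[i], "-mcpu=", 6) == 0)
	last_mcpu = argv[i] + 6;
    }

  const char *selected = last_march != NULL ? last_march : last_mcpu;
  return selected != NULL && is_thumb_only_name (selected);
}
```

 The sketch ignores an explicit -marm or -mthumb on the command line and
 any "+extension" suffixes, both of which a real implementation would have
 to consider.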

 ChangeLog entries are as follows:

 *** gcc/ChangeLog ***

 2016-10-18  Terry Guo  
 Thomas Preud'homme 

 PR target/64802
 * common/config/arm/arm-common.c (arm_target_thumb_only): New
 function.
 * config/arm/arm-opts.h: Include arm-flags.h.
 (struct arm_arch_core_flag): Define.
 (arm_arch_core_flags): Define.
 * config/arm/arm-protos.h: Include arm-flags.h.
 (FL_NONE, FL_ANY, FL_CO_PROC, FL_ARCH3M, FL_MODE26, FL_MODE32,
 FL_ARCH4, FL_ARCH5, FL_THUMB, FL_LDSCHED, FL_STRONG, FL_ARCH5E,
 FL_XSCALE, FL_ARCH6, FL_VFPV2, FL_WBUF, FL_ARCH6K, FL_THUMB2,
 FL_NOTM,
 FL_THUMB_DIV, FL_VFPV3, FL_NEON, FL_ARCH7EM, FL_ARCH7,
 FL_ARM_DIV,
 FL_ARCH8, FL_CRC32, FL_SMALLMUL, FL_NO_VOLATILE_CE, FL_IWMMXT,
 FL_IWMMXT2, FL_ARCH6KZ, FL2_ARCH8_1, FL2_ARCH8_2, FL2_FP16INST,
 FL_TUNE, FL_FOR_ARCH2, FL_FOR_ARCH3, FL_FOR_ARCH3M,
 FL_FOR_ARCH4,
 FL_FOR_ARCH4T, FL_FOR_ARCH5, FL_FOR_ARCH5T, FL_FOR_ARCH5E,
 FL_FOR_ARCH5TE, FL_FOR_ARCH5TEJ, FL_FOR_ARCH6, FL_FOR_ARCH6J,
 FL_FOR_ARCH6K, FL_FOR_ARCH6Z, FL_FOR_ARCH6ZK, FL_FOR_ARCH6KZ,
 FL_FOR_ARCH6T2, FL_FOR_ARCH6M, FL_FOR_ARCH7, FL_FOR_ARCH7A,
 FL_FOR_ARCH7VE, FL_FOR_ARCH7R, FL_FOR_ARCH7M, FL_FOR_ARCH7EM,
 FL_FOR_ARCH8A, FL2_FOR_ARCH8_1A, FL2_FOR_ARCH8_2A,
 FL_FOR_ARCH8M_BASE,
 FL_FOR_ARCH8M_MAIN, arm_feature_set, ARM_FSET_MAKE,
 ARM_FSET_MAKE_CPU1, ARM_FSET_MAKE_CPU2, ARM_FSET_CPU1,
 ARM_FSET_CPU2,
 ARM_FSET_EMPTY, ARM_FSET_ANY, ARM_FSET_HAS_CPU1,
 ARM_FSET_HAS_CPU2,
 ARM_FSET_HAS_CPU, ARM_FSET_ADD_CPU1, ARM_FSET_ADD_CPU2,
 ARM_FSET_DEL_CPU1, ARM_FSET_DEL_CPU2, ARM_FSET_UNION,
 ARM_FSET_INTER,
 ARM_FSET_XOR, ARM_FSET_EXCLUDE, ARM_FSET_IS_EMPTY,
 ARM_FSET_CPU_SUBSET): Move to ...
 * config/arm/arm-flags.h: This new file.
 * config/arm/arm.h (TARGET_MODE_SPEC_FUNCTIONS): Define.
 (EXTRA_SPEC_FUNCTIONS): Add TARGET_MODE_SPEC_FUNCTIONS to its
 value.
 (TARGET_MODE_SPECS): Define.
 (DRIVER_SELF_SPECS): Add TARGET_MODE_SPECS to its value.

 *** gcc/testsuite/ChangeLog ***

 2016-10-11  Thomas Preud'homme 

 PR target/64802
 * gcc.target/arm/optional_thumb-1.c: New test.
 * gcc.target/arm/optional_thumb-2.c: New test.
 * gcc.target/arm/optional_thumb-3.c: New test.

 No regression when running the testsuite for -mcpu=cortex-m0 -mthumb,
 -mcpu=cortex-m0 -marm and -mcpu=cortex-a8 -marm

 Is this ok for trunk?

>>
>> This looks like a useful usability improvement.
>> This is ok after a bootstrap on an arm-none-linux-gnueabihf target.
>>
>> Sorry for the delay,
>> Kyrill
>
>
> I've rebased the patch on top of the arm_feature_set type consistency fix
> [1] and committed it. The committed patch is in attachment for reference.
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01680.html
>

Since this commit (242597), I've noticed that:
- the 2 new tests optional_thumb-1.c and optional_thumb-2.c fail
if GCC was configured --with-mode=arm. The error message is:
cc1: error: target CPU does not support ARM mode

- on armeb --with-mode=arm, gcc.dg/vect/pr64252.c fails at execution

See: 
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242597/report-build-info.html

Christophe

> Best regards,
>
> Thomas

Re: [PATCH 0/6][ARM] Implement support for ACLE Coprocessor Intrinsics

2016-11-21 Thread Christophe Lyon

Hi,


On 17 November 2016 at 11:45, Kyrill Tkachov
 wrote:
>
> On 17/11/16 10:31, Andre Vieira (lists) wrote:
>>
>> Hi Kyrill,
>>
>> On 17/11/16 10:11, Kyrill Tkachov wrote:
>>>
>>> Hi Andre,
>>>
>>> On 09/11/16 10:00, Andre Vieira (lists) wrote:

 Tested the series by bootstrapping arm-none-linux-gnuabihf and found no
 regressions, also did a normal build for arm-none-eabi and ran the
 acle.exp tests for a Cortex-M3.
>>>
>>> Can you please also do a full testsuite run on arm-none-linux-gnueabihf.
>>> Patches have to be tested by the whole testsuite.
>>
>> That's what I have done and meant to say with "Tested the series by
>> bootstrapping arm-none-linux-gnuabihf and found no regressions". I
>> compared gcc/g++/libstdc++ tests on a bootstrap with and without the
>> patches.
>
>
> Ah ok, great.
>
>>
>> I'm happy to rerun the tests after a rebase when the patches get approved.
>
FWIW, I ran a validation with the 6 patches applied, and saw no regression.
Given the large number of new tests, I didn't check the full details.

If you want to check that each configuration has the PASSes you expect,
you can have a look at:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/242581-acle/report-build-info.html

Thanks,

Christophe


>
> Thanks,
> Kyrill
>
>>
>> Cheers,
>> Andre
>
>
