Re: [PATCH] Fix parloops ICE (PR tree-optimization/81578)

2017-07-27 Thread Richard Biener
On July 27, 2017 9:10:48 PM GMT+02:00, Jakub Jelinek  wrote:
>Hi!
>
>Not all vectorizable reductions are valid OpenMP standard reductions
>(we could create user defined reductions from that, but that would be
>quite a lot of work).
>
>This patch bails out for unsupported reductions.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

>2017-07-27  Jakub Jelinek  
>
>   PR tree-optimization/81578
>   * tree-parloops.c (build_new_reduction): Bail out if
>   reduction_code isn't one of the standard OpenMP reductions.
>   Move the details printing after that decision.
>
>   * gcc.dg/pr81578.c: New test.
>
>--- gcc/tree-parloops.c.jj 2017-07-19 14:01:24.0 +0200
>+++ gcc/tree-parloops.c	2017-07-27 14:27:13.966749227 +0200
>@@ -2475,23 +2475,39 @@ build_new_reduction (reduction_info_tabl
> 
>   gcc_assert (reduc_stmt);
> 
>-  if (dump_file && (dump_flags & TDF_DETAILS))
>-{
>-  fprintf (dump_file,
>- "Detected reduction. reduction stmt is:\n");
>-  print_gimple_stmt (dump_file, reduc_stmt, 0);
>-  fprintf (dump_file, "\n");
>-}
>-
>   if (gimple_code (reduc_stmt) == GIMPLE_PHI)
> {
>   tree op1 = PHI_ARG_DEF (reduc_stmt, 0);
>   gimple *def1 = SSA_NAME_DEF_STMT (op1);
>   reduction_code = gimple_assign_rhs_code (def1);
> }
>-
>   else
> reduction_code = gimple_assign_rhs_code (reduc_stmt);
>+  /* Check for OpenMP supported reduction.  */
>+  switch (reduction_code)
>+{
>+case PLUS_EXPR:
>+case MULT_EXPR:
>+case MAX_EXPR:
>+case MIN_EXPR:
>+case BIT_IOR_EXPR:
>+case BIT_XOR_EXPR:
>+case BIT_AND_EXPR:
>+case TRUTH_OR_EXPR:
>+case TRUTH_XOR_EXPR:
>+case TRUTH_AND_EXPR:
>+  break;
>+default:
>+  return;
>+}
>+
>+  if (dump_file && (dump_flags & TDF_DETAILS))
>+{
>+  fprintf (dump_file,
>+ "Detected reduction. reduction stmt is:\n");
>+  print_gimple_stmt (dump_file, reduc_stmt, 0);
>+  fprintf (dump_file, "\n");
>+}
> 
>   new_reduction = XCNEW (struct reduction_info);
> 
>--- gcc/testsuite/gcc.dg/pr81578.c.jj  2017-07-27 14:48:16.426441581
>+0200
>+++ gcc/testsuite/gcc.dg/pr81578.c 2017-07-27 14:48:01.0 +0200
>@@ -0,0 +1,12 @@
>+/* PR tree-optimization/81578 */
>+/* { dg-do compile { target pthread } } */
>+/* { dg-options "-O2 -ftree-parallelize-loops=2" } */
>+
>+int
>+foo (int *x)
>+{
>+  int i, r = 1;
>+  for (i = 0; i != 1024; i++)
>+r *= x[i] < 0;
>+  return r;
>+}
>
>   Jakub
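
For readers without the GCC sources at hand, the check added to
build_new_reduction can be pictured with the standalone sketch below.  The
toy_code enum and toy_openmp_reduction_p are hypothetical stand-ins for GCC's
tree codes and are not part of the patch; only the list of accepted
operations is taken from it.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a handful of GCC tree codes.  These are NOT
   the real enumerators; they only mirror the names used in the patch.  */
enum toy_code
{
  TOY_PLUS_EXPR, TOY_MULT_EXPR, TOY_MAX_EXPR, TOY_MIN_EXPR,
  TOY_BIT_IOR_EXPR, TOY_BIT_XOR_EXPR, TOY_BIT_AND_EXPR,
  TOY_TRUTH_OR_EXPR, TOY_TRUTH_XOR_EXPR, TOY_TRUTH_AND_EXPR,
  TOY_COND_EXPR, TOY_RSHIFT_EXPR
};

/* Return true if CODE is one of the operations the patch accepts as a
   reduction; anything else makes the caller bail out.  */
bool
toy_openmp_reduction_p (enum toy_code code)
{
  switch (code)
    {
    case TOY_PLUS_EXPR:
    case TOY_MULT_EXPR:
    case TOY_MAX_EXPR:
    case TOY_MIN_EXPR:
    case TOY_BIT_IOR_EXPR:
    case TOY_BIT_XOR_EXPR:
    case TOY_BIT_AND_EXPR:
    case TOY_TRUTH_OR_EXPR:
    case TOY_TRUTH_XOR_EXPR:
    case TOY_TRUTH_AND_EXPR:
      return true;
    default:
      return false;
    }
}
```

As in the patch, the decision comes first, so a rejected reduction would
never reach the "Detected reduction" dump output.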



Re: [PATCH 00/17] RFC: New source-location representation; Language Server Protocol

2017-07-27 Thread Alexandre Oliva
On Jul 26, 2017, Jim Wilson  wrote:

> On 07/24/2017 01:04 PM, David Malcolm wrote:
>> * The LSP implementation is just a proof-of-concept, to further
>> motivate capturing the extra data.  Turning it into a "proper" LSP
>> server implementation would be a *lot* more work, and I'm unlikely to
>> actually do that (but maybe someone on the list wants to take this on?)

> Apparently Alexandre Oliva has ideas on how to implement LSP by using
> gdb.  You two may want to compare notes.

*nod*

I thought GDB would be a better place for the server proper, because it
already knows how to deal with multiple translation units, how to
navigate the symbol tables in the presence of multiple contexts, how to
perform context-aware name completion and whatnot.

Current debug info is not enough to implement everything that's expected
of a language server, so we'd probably need a compile mode not entirely
unlike syntax-check (i.e., no actual code generation), but dumping
symbolic information about definitions (an extended subset of existing
debug info, since there's no object code to refer to, but there are
e.g. symbolic templates that don't appear at all nowadays), and either a
detailed summary of where tokens appear in sources, or something a bit
like preprocessed output, that contains even declarations brought from
header files, but perhaps without resolving preprocessor conditionals,
expanding macros or otherwise mangling actual sources.

I'm afraid this is as far as I got in the "design" that Richard Stallman
asked of me.  Lucky I knew what it was about because David had
introduced me to LSP back in March ;-)

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


[PATCH][PR target/81535] Fix tests on Power

2017-07-27 Thread Yury Gribov

Hi all,

This patch fixes issues reported in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81535


I removed the call to g in pr79439.c because GCC was duplicating the basic
block containing the call depending on compiler flags (so the
scan-assembler-times pattern wasn't reliable anymore).  I also added an
alias to prevent the inlining introduced by the recent PR56727 patch.


I added a Power-specific pattern to the pr56727-2.c testcase and disabled
the output check on Power in pr56727-1.c.


Tested on powerpc64-unknown-linux-gnu.  Ok for trunk?

-Y
2017-07-28  Yury Gribov  

PR target/81535
* gcc.dg/pr56727-1.c: Do not check output on Power.
* gcc.dg/pr56727-2.c: Fix pattern for Power.
* gcc.target/powerpc/pr79439.c: Prevent inlining.

diff -rupN gcc/gcc/testsuite/gcc.dg/pr56727-1.c 
gcc-81535/gcc/testsuite/gcc.dg/pr56727-1.c
--- gcc/gcc/testsuite/gcc.dg/pr56727-1.c2017-07-28 02:39:54.770046466 
+
+++ gcc-81535/gcc/testsuite/gcc.dg/pr56727-1.c  2017-07-28 04:25:04.805648587 
+
@@ -1,6 +1,6 @@
 /* { dg-do compile { target fpic } } */
 /* { dg-options "-O2 -fPIC" } */
-/* { dg-final { scan-assembler-not "@(PLT|plt)" { target i?86-*-* x86_64-*-* 
powerpc*-*-* } } } */
+/* { dg-final { scan-assembler-not "@(PLT|plt)" { target i?86-*-* x86_64-*-* } 
} } */
 
 #define define_func(type) \
   void f_ ## type (type b) { f_ ## type (0); } \
diff -rupN gcc/gcc/testsuite/gcc.dg/pr56727-2.c 
gcc-81535/gcc/testsuite/gcc.dg/pr56727-2.c
--- gcc/gcc/testsuite/gcc.dg/pr56727-2.c2017-07-28 02:39:54.770046466 
+
+++ gcc-81535/gcc/testsuite/gcc.dg/pr56727-2.c  2017-07-28 04:21:19.195215187 
+
@@ -1,10 +1,10 @@
 /* { dg-do compile { target fpic } } */
 /* { dg-options "-O2 -fPIC" } */
-/* { dg-final { scan-assembler "@(PLT|plt)" { target i?86-*-* x86_64-*-* 
powerpc*-*-linux* } } } */
 
 __attribute__((noinline, noclone))
 void f (short b)
 {
+  __builtin_setjmp (0);  /* Prevent tailcall */
   f (0);
 }
 
@@ -14,3 +14,5 @@ void h ()
 {
   g (0);
 }
+/* { dg-final { scan-assembler "@(PLT|plt)" { target i?86-*-* x86_64-*-* } } } 
*/
+/* { dg-final { scan-assembler "bl f\n\[ \t\]*nop" { target powerpc*-*-linux* 
} } } */
diff -rupN gcc/gcc/testsuite/gcc.target/powerpc/pr79439.c 
gcc-81535/gcc/testsuite/gcc.target/powerpc/pr79439.c
--- gcc/gcc/testsuite/gcc.target/powerpc/pr79439.c  2017-07-28 
02:39:55.750048426 +
+++ gcc-81535/gcc/testsuite/gcc.target/powerpc/pr79439.c2017-07-28 
04:13:47.834177237 +
@@ -8,22 +8,17 @@
 
 int f (void);
 
-void
-g (void)
-{
-}
-
 int
 rec (int a)
 {
   int ret = 0;
   if (a > 10 && f ())
 ret += rec (a - 1);
-  g ();
   return a + ret;
 }
 
+void rec_alias (short) __attribute__ ((alias ("rec")));
+
 /* { dg-final { scan-assembler-times {\mbl f\M}   1 } } */
-/* { dg-final { scan-assembler-times {\mbl g\M}   1 } } */
 /* { dg-final { scan-assembler-times {\mbl rec\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mnop\M}3 } } */
+/* { dg-final { scan-assembler-times {\mnop\M}2 } } */


[PATCHv2][PING][PR 59521] Respect __builtin_expect in switch statements

2017-07-27 Thread Yury Gribov

Hi all,

This is a ping for
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01275.html . I've fixed the
attachment type; hopefully it's easier to read now.


This patch adds support for __builtin_expect in switch statements at
the tree level (the RTL part will be reviewed/committed separately).  It's
an update of https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01016.html ,
rebased and retested.
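
As an illustration of the source pattern in question, a switch whose
controlling expression is wrapped in __builtin_expect might look like the
sketch below.  The function name, case values and return values are invented
for the example; it relies on the GCC/Clang __builtin_expect extension and
makes no claim about the code GCC generates for it.

```c
/* Hypothetical dispatcher: the hint says OP is expected to be 5, so the
   edge to "case 5" is the single likely edge of the switch.  */
int
dispatch (int op)
{
  switch (__builtin_expect (op, 5))
    {
    case 1:
      return 10;
    case 5:
      return 50;
    case 9:
      return 90;
    default:
      return -1;
    }
}
```

The hint only affects branch prediction data; the function returns the same
values with or without it.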

Ok for trunk?

-Y
2017-07-21  Yury Gribov  
Martin Liska  

PR middle-end/59521

gcc/
* predict.c (set_even_probabilities): Handle case of a single
likely edge.
(combine_predictions_for_bb): Ditto.
(tree_predict_by_opcode): Handle switch statements.
* stmt.c (balance_case_nodes): Select pivot value based on
probabilities.

gcc/testsuite/
* gcc.dg/predict-15.c: New test.

diff -rupN gcc/gcc/predict.c gcc-59521/gcc/predict.c
--- gcc/gcc/predict.c   2017-07-18 22:21:16.0 +0200
+++ gcc-59521/gcc/predict.c 2017-07-19 08:26:57.0 +0200
@@ -815,9 +815,12 @@ unlikely_executed_bb_p (basic_block bb)
 
 static void
 set_even_probabilities (basic_block bb,
-   hash_set *unlikely_edges = NULL)
+   hash_set *unlikely_edges = NULL,
+   hash_set *likely_edges = NULL)
 {
   unsigned nedges = 0, unlikely_count = 0;
+  unsigned likely_count
+= likely_edges ? likely_edges->elements () : 0;
   edge e = NULL;
   edge_iterator ei;
   profile_probability all = profile_probability::always ();
@@ -827,7 +830,7 @@ set_even_probabilities (basic_block bb,
   all -= e->probability;
 else if (!unlikely_executed_edge_p (e))
   {
-nedges ++;
+nedges++;
 if (unlikely_edges != NULL && unlikely_edges->contains (e))
  {
all -= profile_probability::very_unlikely ();
@@ -844,18 +847,44 @@ set_even_probabilities (basic_block bb,
 
   unsigned c = nedges - unlikely_count;
 
-  FOR_EACH_EDGE (e, ei, bb->succs)
-if (e->probability.initialized_p ())
-  ;
-else if (!unlikely_executed_edge_p (e))
-  {
-   if (unlikely_edges != NULL && unlikely_edges->contains (e))
- e->probability = profile_probability::very_unlikely ();
-   else
- e->probability = all.apply_scale (1, c).guessed ();
-  }
-else
-  e->probability = profile_probability::never ();
+  /* If we have one likely edge, then use its probability and
+ distribute remaining probabilities as even.  */
+  if (likely_count == 1)
+{
+  edge_prediction *prediction = *likely_edges->begin ();
+  int p = prediction->ep_probability;
+  profile_probability likely_prob
+   = all.apply_scale (p, REG_BR_PROB_BASE).guessed ();
+  profile_probability remainder = all - likely_prob;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+   if (e->probability.initialized_p ())
+ ;
+   else if (!unlikely_executed_edge_p (e))
+ {
+   if (prediction->ep_edge == e)
+ e->probability = likely_prob;
+   else
+ e->probability = remainder.apply_scale (1, nedges - 1);
+ }
+   else
+ e->probability = profile_probability::never ();
+}
+  else
+{
+  FOR_EACH_EDGE (e, ei, bb->succs)
+  if (e->probability.initialized_p ())
+   ;
+  else if (!unlikely_executed_edge_p (e))
+   {
+ if (unlikely_edges != NULL && unlikely_edges->contains (e))
+   e->probability = profile_probability::very_unlikely ();
+ else
+   e->probability = all.apply_scale (1, c).guessed ();
+   }
+  else
+   e->probability = profile_probability::never ();
+}
 }
 
 /* Add REG_BR_PROB note to JUMP with PROB.  */
@@ -1151,6 +1180,7 @@ combine_predictions_for_bb (basic_block 
   if (nedges != 2)
 {
   hash_set unlikely_edges (4);
+  hash_set likely_edges (4);
 
   /* Identify all edges that have a probability close to very unlikely.
 Doing the approach for very unlikely doesn't worth for doing as
@@ -1158,16 +1188,31 @@ combine_predictions_for_bb (basic_block 
   edge_prediction **preds = bb_predictions->get (bb);
   if (preds)
for (pred = *preds; pred; pred = pred->ep_next)
- if (pred->ep_probability <= PROB_VERY_UNLIKELY)
-   unlikely_edges.add (pred->ep_edge);
+ {
+   if (pred->ep_probability <= PROB_VERY_UNLIKELY)
+ unlikely_edges.add (pred->ep_edge);
+   if (pred->ep_probability >= PROB_VERY_LIKELY
+   || pred->ep_predictor == PRED_BUILTIN_EXPECT)
+ likely_edges.add (pred);
+ }
 
   if (!dry_run)
-   set_even_probabilities (bb, &unlikely_edges);
+   set_even_probabilities (bb, &unlikely_edges, &likely_edges);
   clear_bb_predictions (bb);
   if (dump_file)
{
  fprintf (dump_file, "Predictions for bb %i\n", bb->index);
- if (unlikely_edges.elements () == 0)
+ if (likely_edges.

Re: [PATCHv4][PING][PR 57371] Remove useless floating point casts in comparisons

2017-07-27 Thread Yuri Gribov
On Tue, Jul 25, 2017 at 9:32 PM, Jeff Law  wrote:
> On 07/25/2017 08:10 AM, Richard Biener wrote:
>> On Mon, Jul 17, 2017 at 9:29 AM, Yuri Gribov  wrote:
>>> Hi all,
>>>
>>> This is an updated version of patch in
>>> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00409.html . It prevents
>>> optimization in presence of sNaNs (and qNaNs when comparison operator
>>> is > >= < <=) to preserve FP exceptions.
>>>
>>> Note that I had to use -fsignaling-nans in pr57371-5.c test because by
>>> default this option is off and some existing patterns in match.pd
>>> happily optimize NaN comparisons, even with sNaNs (!).
>>>
>>> Bootstrapped and regtested on x64. Ok for trunk?
>>
>> + {
>> +   tree itype = TREE_TYPE (@0);
>> +   gcc_assert (INTEGRAL_TYPE_P (itype));
>>
>> no need to spell out this assert.
> Right.  I think Yuri added this in response to a comment from me.
> However, I think the subsequent discussion made it clear that we don't
> actually need to check that @0 is an integral type.

I was initially scared to rely on particular semantics of
verify_gimple_assign_unary so left the assert in place, especially
given that it puzzled others as well.  I'll remove it.

-Y
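
As background on why such casts need care (an illustration that is not taken
from the patch): converting an int to float can round, so a comparison after
the cast need not agree with the same comparison on the integer.  The sketch
below assumes IEEE 754 binary32 float and the default round-to-nearest mode;
the function names are made up.

```c
#include <stdbool.h>

/* Compare after converting to float.  The volatile store forces the value
   to be rounded to float precision before the comparison.  */
bool
cast_compare (int i)
{
  volatile float f = (float) i;
  return f == 16777216.0f;	/* 2^24 */
}

/* The same comparison with the cast dropped.  */
bool
int_compare (int i)
{
  return i == 16777216;
}
```

Under those assumptions 16777217 (2^24 + 1) rounds to 2^24 when converted,
so cast_compare accepts it while int_compare does not; for values that fit
exactly in the float mantissa the two always agree.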


Re: [PATCH] Improve extraction of changed file in contrib/mklog

2017-07-27 Thread Yuri Gribov
On Wed, Jul 26, 2017 at 6:11 PM, Jeff Law  wrote:
> On 07/09/2017 01:03 PM, Yuri Gribov wrote:
>> Hi,
>>
>> Currently mklog will fail to analyze lines like this in patches:
>> diff -rupN gcc/gcc/testsuite/lib/profopt.exp
>> gcc-compare-checks/gcc/testsuite/lib/profopt.exp
>> (it fails with "Error: failed to parse diff for ... and ...").
>>
>> This patch fixes it. Ok for trunk?
>>
>> -Y
>>
>>
>> mklog-filename-fix-1.patch
>>
>>
>> 2017-07-09  Yury Gribov  
>>
>> contrib/
>> * mklog: Fix extraction of changed file name.
> One could argue that given general directions python would be a better
> match than perl, but I don't think it's reasonable to require a rewrite
> to move forward.

Jeff,

Is this to make the script more accessible for hacking by others?
Otherwise, from a technical standpoint, there probably isn't much
difference.

I'm fine with rewriting it in Python once I get some time later this month.

> OK.
>
> jeff


Re: [PATCH] Fix PR middle-end/81564: ICE in group_case_labels_stmt()

2017-07-27 Thread Peter Bergner
On 7/27/17 12:21 PM, Steven Bosscher wrote:
> On Wed, Jul 26, 2017 at 9:35 PM, Peter Bergner wrote:
>> The test case for PR81564 exposes an issue where the case labels for a
>> switch statement point to blocks that have already been removed by an
>> earlier call to cleanup_tree_cfg().  In that case, the code in
>> group_case_labels_stmt() that does:
> 
> How can a basic block be removed (apparently as unreachable) if there
> are still case labels leading to it?
> 
> Apparently there is enough information to make CASE_LABEL be set to
> NULL. Why is the case label not just removed (or redirected to the
> default, or ...)?

My bad above.  The block is actually deleted during the process of grouping
the labels.  The switch statement we have entering group_case_labels_stmt() is:

  switch (...)
  {
case 3:
case 7:
  __builtin_unreachable();
  }

We first handle case "3" and we notice that it leads to an "unreachable" block,
so we delete the edge to that block and then the block itself:

  /* Discard cases that have an unreachable destination block.  */
  if (EDGE_COUNT (base_bb->succs) == 0
  && gimple_seq_unreachable_p (bb_seq (base_bb)))
{
  edge base_edge = find_edge (gimple_bb (stmt), base_bb);
  if (base_edge != NULL)
remove_edge_and_dominated_blocks (base_edge);
  i = next_index;
  continue;
}

Next time through the loop, we handle case "7", which pointed to the same
block as case "3".  Since we just deleted that block,
'base_bb = label_to_block (CASE_LABEL (base_case));' now returns a NULL
basic block.  So the incongruity you point out is just a temporary
artifact of the process of cleaning up the unreachable blocks.  The NULL
basic block is just a sign that we deleted that case's block in an earlier
loop iteration, so we're correct in just removing it like the patch does.
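
The sequence described above can be modelled with a small self-contained
toy.  This is only a sketch of the idea, not GCC code: struct toy_block,
struct toy_case and prune_unreachable_cases are hypothetical, and a NULL
dest plays the role of label_to_block returning NULL.

```c
#include <stddef.h>

/* Toy basic block: only records whether it is an "unreachable" block.  */
struct toy_block
{
  int unreachable;
};

/* Toy case label: a value plus its destination block (NULL once the
   block has been deleted).  */
struct toy_case
{
  int value;
  struct toy_block *dest;
};

/* Compact LABELS in place, dropping cases whose destination is unreachable
   or has already been deleted.  Returns the number of labels kept.  */
size_t
prune_unreachable_cases (struct toy_case *labels, size_t n)
{
  size_t kept = 0;

  for (size_t i = 0; i < n; i++)
    {
      struct toy_block *bb = labels[i].dest;

      /* Destination was deleted in an earlier iteration: drop the case.  */
      if (bb == NULL)
	continue;

      if (bb->unreachable)
	{
	  /* "Delete" the block: this label and every later label that
	     pointed at it now see a NULL destination.  */
	  for (size_t j = i; j < n; j++)
	    if (labels[j].dest == bb)
	      labels[j].dest = NULL;
	  continue;
	}

      labels[kept++] = labels[i];
    }

  return kept;
}
```

With cases 3 and 7 sharing one unreachable block, the iteration for case 3
deletes the block and the iteration for case 7 then takes the NULL path,
which is the situation the patch has to tolerate.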

Sorry for the poor initial description!

Peter



Re: [PATCH,AIX] Enable Go for AIX

2017-07-27 Thread Ian Lance Taylor
On Wed, Jul 26, 2017 at 5:48 AM, REIX, Tony  wrote:
> Description:
>  * This patch enables Go on AIX.
>
> Tests:
>  * Fedora25/x86_64 + GCC trunk : Configure/Build: SUCCESS
>- build made by means of gmake.
>
> ChangeLog:
>  * configure.ac, configure: Enable Go for AIX
>  * contrib/config-list.mk: Enable Go for AIX

This patch is fine with me.  Leaving for David as AIX maintainer.

Thanks.

Ian


Re: [PATCH,AIX] Manage .go_export section for AIX

2017-07-27 Thread Ian Lance Taylor
On Wed, Jul 26, 2017 at 3:09 AM, REIX, Tony  wrote:
> Description:
>  * This patch manages the .go_export section as an EXCLUDE section on AIX.
>
> Tests:
>  * Fedora25/x86_64 + GCC trunk : Configure/Build: SUCCESS
>- build made by means of gmake.
>
> ChangeLog:
>  * go-backend.c (go_write_export_data): Use EXCLUDE section for AIX.

Thanks.  Testing _AIX here is clearly wrong, as we need to test for a
target property, not a host property.  I committed this patch as
appended.

Ian


2017-07-27  Tony Reix  

* go-backend.c (go_write_export_data): Use EXCLUDE section for
AIX.
Index: go-backend.c
===
--- go-backend.c	(revision 250406)
+++ go-backend.c	(working copy)
@@ -45,6 +45,10 @@ along with GCC; see the file COPYING3.
 #define GO_EXPORT_SECTION_NAME ".go_export"
 #endif
 
+#ifndef TARGET_AIX
+#define TARGET_AIX 0
+#endif
+
 /* This file holds all the cases where the Go frontend needs
information from gcc's backend.  */
 
@@ -101,7 +105,9 @@ go_write_export_data (const char *bytes,
   if (sec == NULL)
 {
   gcc_assert (targetm_common.have_named_sections);
-  sec = get_section (GO_EXPORT_SECTION_NAME, SECTION_DEBUG, NULL);
+  sec = get_section (GO_EXPORT_SECTION_NAME,
+TARGET_AIX ? SECTION_EXCLUDE : SECTION_DEBUG,
+NULL);
 }
 
   switch_to_section (sec);


[PATCH] fix the handling of string precision in pretty printer (PR 81586)

2017-07-27 Thread Martin Sebor

The pretty printer treats the precision in %s directives as a request
to print exactly that many characters from the string argument, when
what precision normally means (in C) is the maximum number of
characters to read from the string.  It does not mean reading
past the terminating NUL.
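
The intended semantics can be sketched as a standalone helper.  bounded_len
is a hypothetical name, not a function in pretty-print.c; it only mirrors
the clamping logic of the patch, assuming s is a valid NUL-terminated
string.

```c
#include <limits.h>
#include <string.h>

/* Return how many characters a "%.Ns" directive should take from S:
   the lesser of the precision N and strlen (S).  A negative precision
   is treated as if it had been omitted.  */
size_t
bounded_len (const char *s, int n)
{
  if (n < 0)
    n = INT_MAX;

  size_t len = strlen (s);
  if ((unsigned) n < len)
    len = (size_t) n;

  return len;
}
```

With this rule a directive such as %.10s applied to "abc" consumes three
characters, which matches what printf does in C, instead of reading seven
bytes past the terminating NUL.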

The attached patch fixes that.  Tested on x86_64-linux.

Martin
PR c++/81586 - valgrind error in output_buffer_append_r with -Wall

gcc/ChangeLog:

	PR c++/81586
	* pretty-print.c (pp_format): Correct the handling of %s precision.

diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index 570dec7..a79191b 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -667,7 +667,17 @@ pp_format (pretty_printer *pp, text_info *text)
 	  }
 
 	s = va_arg (*text->args_ptr, const char *);
-	pp_append_text (pp, s, s + n);
+
+	/* Negative precision is treated as if it were omitted.  */
+	if (n < 0)
+	  n = INT_MAX;
+
+	/* Append the lesser of precision and strlen (s) characters.  */
+	size_t len = strlen (s);
+	if ((unsigned) n < len)
+	  len = n;
+
+	pp_append_text (pp, s, s + len);
 	  }
 	  break;
 


[PATCH], PR target/81593, Optimize PowerPC vector sets coming from a vector extracts

2017-07-27 Thread Michael Meissner
This patch optimizes the PowerPC vector set operation for 64-bit doubles and
longs where the elements in the vector set may have been extracted from another
vector (PR target/81593):

Here is an example:

vector double
test_vpasted (vector double high, vector double low)
{
  vector double res;
  res[1] = high[1];
  res[0] = low[0];
  return res;
}

Previously it would generate:

xxpermdi 12,34,34,2
vspltisw 2,0
xxlor 0,35,35
xxpermdi 34,34,12,0
xxpermdi 34,0,34,1

and with these patches, it now generates:

xxpermdi 34,35,34,1

I have tested it on a little endian power8 system and a big endian power7
system with the usual bootstrap and make checks with no regressions.  Can I
check this into the trunk?

I also built Spec 2006 with the compiler, and saw no changes in the code
generated.  This isn't surprising because it isn't something that auto
vectorization might generate by default.
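
The immediate selection performed by the new rs6000_emit_xxpermdi helper in
the patch can be mirrored in plain C as a rough sketch.  plan_xxpermdi,
struct xxpermdi_plan and NO_ELEMENT are hypothetical; plain ints replace the
rtx operands, with NO_ELEMENT standing in for a NULL element.

```c
#include <stdbool.h>

/* Stand-in for a NULL ELEMENT1/ELEMENT2 argument (operand used whole).  */
#define NO_ELEMENT (-1)

/* Result of the selection: the XXPERMDI immediate, and whether the two
   input registers are emitted in swapped order.  */
struct xxpermdi_plan
{
  int imm;
  bool swap_inputs;
};

/* ELEMENT1 and ELEMENT2 are NO_ELEMENT or 0/1, naming the double word
   taken from each input, as in the helper's comment in the patch.  */
struct xxpermdi_plan
plan_xxpermdi (int element1, int element2, bool big_endian)
{
  int op1_dword = (element1 == NO_ELEMENT) ? 0 : element1;
  int op2_dword = (element2 == NO_ELEMENT) ? 0 : element2;
  struct xxpermdi_plan plan;

  if (big_endian)
    {
      plan.imm = 2 * op1_dword + op2_dword;
      plan.swap_inputs = false;
    }
  else
    {
      /* Little endian: flip explicitly selected double words and emit
	 the inputs in the opposite order.  */
      if (element1 != NO_ELEMENT)
	op1_dword = 1 - op1_dword;
      if (element2 != NO_ELEMENT)
	op2_dword = 1 - op2_dword;

      plan.imm = op1_dword + 2 * op2_dword;
      plan.swap_inputs = true;
    }

  return plan;
}
```

The sketch leaves out the gcc_assert range check and the construction of
the output template; it is meant only to make the endian-dependent
arithmetic easier to follow.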

[gcc]
2017-07-27  Michael Meissner  

PR target/81593
* config/rs6000/rs6000-protos.h (rs6000_emit_xxpermdi): New
declaration.
* config/rs6000/rs6000.c (rs6000_emit_xxpermdi): New function to
emit XXPERMDI accessing either double word in either vector
register inputs.
* config/rs6000/vsx.md (vsx_concat_, VSX_D iterator):
Rewrite VEC_CONCAT insn to call rs6000_emit_xxpermdi.  Simplify
the constraints with the removal of the -mupper-regs-* switches.
(vsx_concat__1): New combiner insns to optimize CONCATs
where either register might have come from VEC_SELECT.
(vsx_concat__2): Likewise.
(vsx_concat__3): Likewise.
(vsx_set_, VSX_D iterator): Rewrite insn to generate a
VEC_CONCAT rather than use an UNSPEC to specify the option.

[gcc/testsuite]
2017-07-27  Michael Meissner  

PR target/81593
* gcc.target/powerpc/vsx-extract-6.c: New test.
* gcc.target/powerpc/vsx-extract-7.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   
(svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000/rs6000-protos.h)
(revision 250577)
+++ gcc/config/rs6000/rs6000-protos.h   (.../gcc/config/rs6000/rs6000-protos.h) 
(working copy)
@@ -233,6 +233,7 @@ extern void rs6000_asm_output_dwarf_pcre
   const char *label);
 extern void rs6000_asm_output_dwarf_datarel (FILE *file, int size,
 const char *label);
+extern const char *rs6000_emit_xxpermdi (rtx[], rtx, rtx);
 
 /* Declare functions in rs6000-c.c */
 
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  
(svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000/rs6000.c)   
(revision 250577)
+++ gcc/config/rs6000/rs6000.c  (.../gcc/config/rs6000/rs6000.c)
(working copy)
@@ -39167,6 +39167,38 @@ rs6000_optab_supported_p (int op, machin
   return true;
 }
 }
+
+
+/* Emit a XXPERMDI instruction that can extract from either double word of the
+   two arguments.  ELEMENT1 and ELEMENT2 are either NULL or they are 0/1 giving
+   which double word to be used for the operand.  */
+
+const char *
+rs6000_emit_xxpermdi (rtx operands[], rtx element1, rtx element2)
+{
+  int op1_dword = (!element1) ? 0 : INTVAL (element1);
+  int op2_dword = (!element2) ? 0 : INTVAL (element2);
+
+  gcc_assert (IN_RANGE (op1_dword | op2_dword, 0, 1));
+
+  if (BYTES_BIG_ENDIAN)
+{
+  operands[3] = GEN_INT (2*op1_dword + op2_dword);
+  return "xxpermdi %x0,%x1,%x2,%3";
+}
+  else
+{
+  if (element1)
+   op1_dword = 1 - op1_dword;
+
+  if (element2)
+   op2_dword = 1 - op2_dword;
+
+  operands[3] = GEN_INT (op1_dword + 2*op2_dword);
+  return "xxpermdi %x0,%x2,%x1,%3";
+}
+}
+
 
 struct gcc_target targetm = TARGET_INITIALIZER;
 
Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md
(svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000/vsx.md) 
(revision 250577)
+++ gcc/config/rs6000/vsx.md(.../gcc/config/rs6000/vsx.md)  (working copy)
@@ -2366,19 +2366,17 @@ (define_insn "*vsx_float_fix_v2df2"
 
 ;; Build a V2DF/V2DI vector from two scalars
 (define_insn "vsx_concat_"
-  [(set (match_operand:VSX_D 0 "gpc_reg_operand" "=,we")
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa,we")
(vec_concat:VSX_D
-(match_operand: 1 "gpc_reg_operand" ",b")
-(match_operand: 2 "gpc_reg_operand" ",b")))]
+(match_operand: 1 "gpc_reg_operand" "wa,b")
+(match

Re: [PATCH] C++: fix ordering of missing std #include suggestion (PR c++/81514)

2017-07-27 Thread Martin Sebor

I'm not sure why Solaris' decl of std::sprintf doesn't hit the
reject path above.

I was able to reproduce the behavior seen on Solaris on my Fedora
box by using this:

  namespace std
  {
extern int sprintf (char *dst, const char *format, ...);
  }


This is how C library symbols were intended to be declared
in the C++  headers (plus extern "C" that GCC doesn't
implement).  Very few systems went to the trouble to make
those changes.  Solaris was one of them.

Martin


Re: tls-dialect gnu2

2017-07-27 Thread H.J. Lu
On Thu, Jul 27, 2017 at 3:07 PM, Nathan Sidwell  wrote:
> Jan,
> I notice the x86 default -mtls-dialect is GNU not GNU2.  The latter was
> added in 2006.  Is there a reason not to default to it now?  (perhaps
> because of the ldso dependency?)

As you may have noticed, there are zero tests for -mtls-dialect=gnu2 on x86
in GCC, and I only added one -mtls-dialect=gnu2 test to glibc last year when
I fixed a GNU2 TLS bug in glibc:

https://sourceware.org/bugzilla/show_bug.cgi?id=20309

So effectively, there is very little testing of -mtls-dialect=gnu2 on x86.


-- 
H.J.


[Patch] Testsuite fixes for failures caused by patch for PR 80925 - loop peeling and alignment

2017-07-27 Thread Steve Ellcey
I was looking at the latest aarch64 failures and noticed PR 80925.  There
seems to be a consensus to change the tests to reflect the current loop
peeling behaviour so I have created a patch to do that.  There are three
issues with this patch that might need fixing before it can be checked in.

One, I fixed this for aarch64 but not power8.  I don't have a power system
and I am not sure how to specify it in check_effective_target_vect_peel_align.
Maybe someone on the power side can update and test the patch to address
that.

Two, I tried to include a change to gcc.dg/vect/vect-93.c which is one of 
the tests that started failing and I could never get it to pass cleanly,
no matter how many times I tweaked the dg-final statements, so I gave up
and left that test out.

Three, I was a little concerned about the test g++.dg/vect/slp-pr56812.cc,
it looks different than the others and does not mention peeling, but it
was affected by the same checkins as the others.

Any comments from the power and/or vectorizer folks?

Steve Ellcey
sell...@cavium.com


2017-07-27  Steve Ellcey  

PR tree-optimization/80925
* gcc.dg/vect/no-section-anchors-vect-69.c: Add vect_peel_align target
and xfail.
* g++.dg/vect/slp-pr56812.cc: Add vect_peel_align target.
* gcc.dg/vect/section-anchors-vect-69.c: Ditto.
* gcc.dg/vect/vect-28.c: Ditto.
* gcc.dg/vect/vect-33-big-array.c: Ditto.
* gcc.dg/vect/vect-70.c: Ditto.
* gcc.dg/vect/vect-87.c: Ditto.
* gcc.dg/vect/vect-91.c: Ditto.
* lib/target-supports.exp (check_effective_target_vect_peel_align):
New.

diff --git a/gcc/testsuite/g++.dg/vect/slp-pr56812.cc b/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
index 80bdcdd..3040341 100644
--- a/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
+++ b/gcc/testsuite/g++.dg/vect/slp-pr56812.cc
@@ -17,4 +17,4 @@ void mydata::Set (float x)
 data[i] = x;
 }
 
-/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" } } */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" { target { vect_peel_align } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
index fe968de..63f3bc4 100644
--- a/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
+++ b/gcc/testsuite/gcc.dg/vect/no-section-anchors-vect-69.c
@@ -114,7 +114,7 @@ int main (void)
 } 
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { {! vector_alignment_reachable} || { vect_sizes_32B_16B} } } } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" { target { vect_peel_align } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { xfail { vect_peel_align || { {! vector_alignment_reachable} || { vect_sizes_32B_16B} } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { target { vect_peel_align || { {! vector_alignment_reachable} && {! vect_hw_misalign} } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
index 8c88e5f..873af82 100644
--- a/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
+++ b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-69.c
@@ -112,8 +112,8 @@ int main (void)
 } 
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" { target vect_int } } } */
-/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 0 "vect"  { target vect_peel_align } } } */
 /* Alignment forced using versioning until the pass that increases alignment
   is extended to handle structs.  */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target {vect_int && vector_alignment_reachable } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 2 "vect" { target { { vect_int && vect_peel_align } && vector_alignment_reachable } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 4 "vect" { target {vect_int && {! vector_alignment_reachable} } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-28.c b/gcc/testsuite/gcc.dg/vect/vect-28.c
index b28fbd9..0e5aaa5 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-28.c
+++ b/gcc/testsuite/gcc.dg/vect/v

tls-dialect gnu2

2017-07-27 Thread Nathan Sidwell

Jan,
I notice the x86 default -mtls-dialect is GNU not GNU2.  The latter was 
added in 2006.  Is there a reason not to default to it now?  (perhaps 
because of the ldso dependency?)


nathan
--
Nathan Sidwell


RFA: Backport fix for PR80769

2017-07-27 Thread Richard Sandiford
This is a minimal-ish backport of the fix for PR80769.  The trunk version
also replaced open-coded instances of get_next_strinfo with calls to the
new function.  It also added asserts in various other places to try to
ensure that related strinfos were consistently delayed or not delayed.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK for gcc-7-branch?
(And OK for gcc-6-branch if the same patch passes testing there?)

Richard


2017-07-27  Richard Sandiford  

gcc/
PR tree-optimization/80769
* tree-ssa-strlen.c (strinfo): Document that "stmt" is also used
for malloc and calloc.  Document the new invariant that all related
strinfos have delayed lengths or none do.
(get_next_strinfo): New function.
(verify_related_strinfos): Move earlier in file.
(set_endptr_and_length): New function, split out from...
(get_string_length): ...here.  Also set the lengths of related
strinfos.

gcc/testsuite/
PR tree-optimization/80769
* gcc.dg/strlenopt-31.c: New test.
* gcc.dg/strlenopt-31g.c: Likewise.

Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c   2017-05-10 18:04:24.514775477 +0100
+++ gcc/tree-ssa-strlen.c   2017-07-27 18:21:20.308966958 +0100
@@ -61,7 +61,13 @@ struct strinfo
   tree length;
   /* Any of the corresponding pointers for querying alias oracle.  */
   tree ptr;
-  /* Statement for delayed length computation.  */
+  /* This is used for two things:
+
+ - To record the statement that should be used for delayed length
+   computations.  We maintain the invariant that all related strinfos
+   have delayed lengths or none do.
+
+ - To record the malloc or calloc call that produced this result.  */
   gimple *stmt;
   /* Pointer to '\0' if known, if NULL, it can be computed as
  ptr + length.  */
@@ -156,6 +162,19 @@ get_strinfo (int idx)
   return (*stridx_to_strinfo)[idx];
 }
 
+/* Get the next strinfo in the chain after SI, or null if none.  */
+
+static inline strinfo *
+get_next_strinfo (strinfo *si)
+{
+  if (si->next == 0)
+return NULL;
+  strinfo *nextsi = get_strinfo (si->next);
+  if (nextsi == NULL || nextsi->first != si->first || nextsi->prev != si->idx)
+return NULL;
+  return nextsi;
+}
+
 /* Helper function for get_stridx.  */
 
 static int
@@ -438,6 +457,45 @@ set_strinfo (int idx, strinfo *si)
   (*stridx_to_strinfo)[idx] = si;
 }
 
+/* Return the first strinfo in the related strinfo chain
+   if all strinfos in between belong to the chain, otherwise NULL.  */
+
+static strinfo *
+verify_related_strinfos (strinfo *origsi)
+{
+  strinfo *si = origsi, *psi;
+
+  if (origsi->first == 0)
+return NULL;
+  for (; si->prev; si = psi)
+{
+  if (si->first != origsi->first)
+   return NULL;
+  psi = get_strinfo (si->prev);
+  if (psi == NULL)
+   return NULL;
+  if (psi->next != si->idx)
+   return NULL;
+}
+  if (si->idx != si->first)
+return NULL;
+  return si;
+}
+
+/* Set SI's endptr to ENDPTR and compute its length based on SI->ptr.
+   Use LOC for folding.  */
+
+static void
+set_endptr_and_length (location_t loc, strinfo *si, tree endptr)
+{
+  si->endptr = endptr;
+  si->stmt = NULL;
+  tree start_as_size = fold_convert_loc (loc, size_type_node, si->ptr);
+  tree end_as_size = fold_convert_loc (loc, size_type_node, endptr);
+  si->length = fold_build2_loc (loc, MINUS_EXPR, size_type_node,
+   end_as_size, start_as_size);
+}
+
 /* Return string length, or NULL if it can't be computed.  */
 
 static tree
@@ -533,12 +591,12 @@ get_string_length (strinfo *si)
case BUILT_IN_STPCPY_CHK_CHKP:
  gcc_assert (lhs != NULL_TREE);
  loc = gimple_location (stmt);
- si->endptr = lhs;
- si->stmt = NULL;
- lhs = fold_convert_loc (loc, size_type_node, lhs);
- si->length = fold_convert_loc (loc, size_type_node, si->ptr);
- si->length = fold_build2_loc (loc, MINUS_EXPR, size_type_node,
-   lhs, si->length);
+ set_endptr_and_length (loc, si, lhs);
+ for (strinfo *chainsi = verify_related_strinfos (si);
+  chainsi != NULL;
+  chainsi = get_next_strinfo (chainsi))
+   if (chainsi->length == NULL)
+ set_endptr_and_length (loc, chainsi, lhs);
  break;
case BUILT_IN_MALLOC:
  break;
@@ -607,32 +665,6 @@ unshare_strinfo (strinfo *si)
   return nsi;
 }
 
-/* Return first strinfo in the related strinfo chain
-   if all strinfos in between belong to the chain, otherwise
-   NULL.  */
-
-static strinfo *
-verify_related_strinfos (strinfo *origsi)
-{
-  strinfo *si = origsi, *psi;
-
-  if (origsi->first == 0)
-return NULL;
-  for (; si->prev; si = psi)
-{
-  if (si->first != origsi->first)
-   return NULL;
-  psi = get_strinfo (si->prev);
-  i

Re: [PATCH, rs6000] Add support for the vec_xl_be builtin

2017-07-27 Thread Segher Boessenkool
Hi Carl,

On Thu, Jul 27, 2017 at 01:48:41PM -0700, Carl Love wrote:
> +  pat = GEN_FCN (icode) (target, addr);
> +  if (! pat)
> +return 0;

No space after "!".

> +  /*  Reverse element order of elements if in LE mode */

Single space after "/*"; sentences end with dot space space.

> +  /* LX_BE  We initialized them to always load in big endian order.  */

XL_BE.

> +default:
> +   break;
> +  /* Fall through.  */
> +}

"break" is indented one space too many I think?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c
> @@ -0,0 +1,321 @@
> +/* { dg-do run { target { powerpc64*-*-* } } } */

powerpc64*-*-* is pretty much never correct (you can use
powerpc64-linux-gcc -m32 as well as powerpc-linux-gcc -m64).  If you
need to restrict a testcase to 64-bit, use an lp64 test.  But this
test works everywhere I think?  So just powerpc*-*-*, or you can
leave out even that, this is in gcc.target/powerpc.

Looks fine otherwise.  Please fix those trivialities and then
commit, thanks!


Segher


[PATCH] C++: fix ordering of missing std #include suggestion (PR c++/81514)

2017-07-27 Thread David Malcolm
PR c++/81514 reports a problem where
  g++.dg/lookup/missing-std-include-2.C
fails on Solaris, offering the suggestion:

  error: 'string' is not a member of 'std'
  note: suggested alternative: 'sprintf'

instead of the expected:

  error: 'string' is not a member of 'std'
  note: 'std::string' is defined in header '<string>'; did you forget to '#include <string>'?

This is after a:
  #include <stdio.h>

suggest_alternative_in_explicit_scope currently works in two phases:

(a) it attempts to look for misspellings within the explicitly-given
namespace and suggests the best it finds

(b) failing that, it then looks for well-known "std::"
names and suggests a missing header

This now seems the wrong way round to me; if the user has
typed "std::string", a missing #include <string> seems more helpful
as a suggestion than attempting to look for misspellings.

This patch reverses the ordering of (a) and (b) above, so that
missing header hints for well-known std:: names are offered first,
only then falling back to misspelling hints.
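
As a rough, self-contained illustration of the new ordering (this is not
the code in the patch): header_hint is a hypothetical stand-in for
get_std_name_hint, a plain Levenshtein distance stands in for the
best_match machinery, and the hint table and message strings are made up
for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for get_std_name_hint: maps a few well-known
// std:: names to the header that declares them.
static const char *
header_hint (const std::string &name)
{
  static const std::map<std::string, const char *> hints = {
    {"string", "<string>"}, {"vector", "<vector>"}, {"cout", "<iostream>"}
  };
  auto it = hints.find (name);
  return it == hints.end () ? nullptr : it->second;
}

// Plain Levenshtein distance, used here instead of GCC's best_match.
static std::size_t
edit_distance (const std::string &a, const std::string &b)
{
  std::vector<std::size_t> row (b.size () + 1);
  for (std::size_t j = 0; j <= b.size (); ++j)
    row[j] = j;
  for (std::size_t i = 1; i <= a.size (); ++i)
    {
      std::size_t diag = row[0];
      row[0] = i;
      for (std::size_t j = 1; j <= b.size (); ++j)
        {
          std::size_t next_diag = row[j];
          std::size_t subst = diag + (a[i - 1] != b[j - 1]);
          row[j] = std::min ({row[j] + 1, row[j - 1] + 1, subst});
          diag = next_diag;
        }
    }
  return row[b.size ()];
}

// Phase (b) runs first: a missing-header hint wins if one exists.
// Only then does phase (a) look for a misspelling among VISIBLE.
std::string
suggest_in_std (const std::string &name,
                const std::vector<std::string> &visible)
{
  assert (!name.empty ());
  if (const char *hdr = header_hint (name))
    return "did you forget to '#include " + std::string (hdr) + "'?";

  const std::string *best = nullptr;
  // Reject candidates that are no closer than rewriting the whole name.
  std::size_t best_dist = name.size ();
  for (const std::string &cand : visible)
    {
      std::size_t d = edit_distance (name, cand);
      if (d < best_dist)
        {
          best = &cand;
          best_dist = d;
        }
    }
  if (best)
    return "suggested alternative: '" + *best + "'";
  return "";
}
```

With this ordering a lookup of "string" gets the header hint even when
"sprintf" is visible in the namespace; the spelling fallback is only
reached for names that have no entry in the hint table.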

The problem doesn't show up on my x86_64-pc-linux-gnu box, as
the pertinent part of the #include <stdio.h> appears to be
equivalent to:

  extern int sprintf (char *dst, const char *format, ...);
  namespace std
  {
    using ::sprintf;
  }

The "std::sprintf" thus examined within consider_binding_level
is the same tree node as ::sprintf, and is rejected by:

  /* Skip anticipated decls of builtin functions.  */
  if (TREE_CODE (d) == FUNCTION_DECL
      && DECL_BUILT_IN (d)
      && DECL_ANTICIPATED (d))
    continue;

and so the name "sprintf" is never considered as a spell-correction
for std::"string".

Hence we're not issuing spelling corrections for aliases
within a namespace for builtins from the global namespace;
these are pre-created by cxx_builtin_function, which has:

  /* All builtins that don't begin with an '_' should additionally
     go in the 'std' namespace.  */
  if (name[0] != '_')
    {
      tree decl2 = copy_node(decl);
      push_namespace (std_identifier);
      builtin_function_1 (decl2, std_node, false);
      pop_namespace ();
    }

I'm not sure why Solaris' decl of std::sprintf doesn't hit the
reject path above.

I was able to reproduce the behavior seen on Solaris on my Fedora
box by using this:

  namespace std
  {
    extern int sprintf (char *dst, const char *format, ...);
  }

which isn't rejected by the "Skip anticipated decls of builtin
functions" test above, and hence sprintf is erroneously offered
 as a suggestion.

The patch reworks the test case to work in the above way,
to trigger the problem on Linux, and then fixes it by
changing the order that the suggestions are tried in
name-lookup.c.  It introduces an "empty.h" since the testcase
is also to verify that we suggest a good location for new #include
directives relative to pre-existing #include directives.

Successfully bootstrapped&regtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
PR c++/81514
* name-lookup.c (maybe_suggest_missing_header): Convert return
type from void to bool; return true iff a suggestion was offered.
(suggest_alternative_in_explicit_scope): Move call to
maybe_suggest_missing_header to before use of best_match, and
return true if the former offers a suggestion.

gcc/testsuite/ChangeLog:
PR c++/81514
* g++.dg/lookup/empty.h: New file.
* g++.dg/lookup/missing-std-include-2.C: Replace include of
stdio.h with empty.h and a declaration of a "std::sprintf" not based
on a built-in.
---
 gcc/cp/name-lookup.c   | 39 +++---
 gcc/testsuite/g++.dg/lookup/empty.h|  1 +
 .../g++.dg/lookup/missing-std-include-2.C  | 11 --
 3 files changed, 29 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lookup/empty.h

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index cd7428a..49c4dea 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -4838,34 +4838,34 @@ get_std_name_hint (const char *name)
   return NULL;
 }
 
-/* Subroutine of suggest_alternative_in_explicit_scope, for use when we have no
-   suggestions to offer.
-   If SCOPE is the "std" namespace, then suggest pertinent header
-   files for NAME.  */
+/* If SCOPE is the "std" namespace, then suggest pertinent header
+   files for NAME at LOCATION.
+   Return true iff a suggestion was offered.  */
 
-static void
+static bool
 maybe_suggest_missing_header (location_t location, tree name, tree scope)
 {
   if (scope == NULL_TREE)
-return;
+return false;
   if (TREE_CODE (scope) != NAMESPACE_DECL)
-return;
+return false;
   /* We only offer suggestions for the "std" namespace.  */
   if (scope != std_node)
-return;
+return false;
   gcc_assert (TREE_CODE (name) == IDENTIFIER_NODE);
 
   const char *name_str = IDENTIFIER_POINTER (name);
   const c

[PATCH, rs6000] Add support for the vec_xl_be builtin

2017-07-27 Thread Carl Love
GCC Maintainers:

The following patch adds support for the vec_xl_be builtins.  The builtin
always loads data in BE order.

The patch has been tested on powerpc64le-unknown-linux-gnu (Power 8 LE)
and on powerpc64-unknown-linux-gnu (Power 7 BE).


Please let me know if the following patch is acceptable.  Thanks.

Carl Love

--

gcc/ChangeLog:

2017-07-27  Carl Love  

* config/rs6000/rs6000-c.c: Add support for built-in functions
vector signed char vec_xl_be (signed long long, signed char *);
vector unsigned char vec_xl_be (signed long long, unsigned char *);
vector signed int vec_xl_be (signed long long, signed int *);
vector unsigned int vec_xl_be (signed long long, unsigned int *);
vector signed long long vec_xl_be (signed long long, signed long long *);
vector unsigned long long vec_xl_be (signed long long, unsigned long long *);
vector signed short vec_xl_be (signed long long, signed short *);
vector unsigned short vec_xl_be (signed long long, unsigned short *);
vector double vec_xl_be (signed long long, double *);
vector float vec_xl_be (signed long long, float *);
* config/rs6000/altivec.h (vec_xl_be): Add #define.
* config/rs6000/rs6000-builtin.def (XL_BE_V16QI, XL_BE_V8HI, XL_BE_V4SI,
XL_BE_V2DI, XL_BE_V4SF, XL_BE_V2DF, XL_BE): Add definitions for the
builtins.
* config/rs6000/rs6000.c (altivec_expand_xl_be_builtin): Add function.
(altivec_expand_builtin): Add switch statement to call
altivec_expand_xl_be_builtin for each builtin.
(altivec_init_builtins): Add def_builtin for __builtin_vsx_le_be_v8hi,
__builtin_vsx_le_be_v4si, __builtin_vsx_le_be_v2di,
__builtin_vsx_le_be_v4sf, __builtin_vsx_le_be_v2df,
__builtin_vsx_le_be_v16qi.
* doc/extend.texi: Update the built-in documentation file for the
new built-in functions.

gcc/testsuite/ChangeLog:

2017-07-27  Carl Love  

* gcc.target/powerpc/builtins-4-runnable.c: Add test cases for the
new builtins.
---
 gcc/config/rs6000/altivec.h|   1 +
 gcc/config/rs6000/rs6000-builtin.def   |   9 +
 gcc/config/rs6000/rs6000-c.c   |  20 ++
 gcc/config/rs6000/rs6000.c | 111 +++
 gcc/doc/extend.texi|  13 +
 .../gcc.target/powerpc/builtins-4-runnable.c   | 321 +
 6 files changed, 475 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-4-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 4d34a97..c8e508c 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -355,6 +355,7 @@
 #define vec_vsx_ld __builtin_vec_vsx_ld
 #define vec_vsx_st __builtin_vec_vsx_st
 #define vec_xl __builtin_vec_vsx_ld
+#define vec_xl_be __builtin_vec_xl_be
 #define vec_xst __builtin_vec_vsx_st
 
 /* Note, xxsldi and xxpermdi were added as __builtin_vsx_ functions
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index a043e70..850164a 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1735,6 +1735,14 @@ BU_VSX_X (LXVW4X_V4SF, "lxvw4x_v4sf",MEM)
 BU_VSX_X (LXVW4X_V4SI,"lxvw4x_v4si",   MEM)
 BU_VSX_X (LXVW4X_V8HI,"lxvw4x_v8hi",   MEM)
 BU_VSX_X (LXVW4X_V16QI,  "lxvw4x_v16qi",   MEM)
+
+BU_VSX_X (XL_BE_V16QI, "xl_be_v16qi", MEM)
+BU_VSX_X (XL_BE_V8HI, "xl_be_v8hi", MEM)
+BU_VSX_X (XL_BE_V4SI, "xl_be_v4si", MEM)
+BU_VSX_X (XL_BE_V2DI, "xl_be_v2di", MEM)
+BU_VSX_X (XL_BE_V4SF, "xl_be_v4sf", MEM)
+BU_VSX_X (XL_BE_V2DF, "xl_be_v2df", MEM)
+
 BU_VSX_X (STXSDX,"stxsdx", MEM)
 BU_VSX_X (STXVD2X_V1TI,  "stxvd2x_v1ti",   MEM)
 BU_VSX_X (STXVD2X_V2DF,  "stxvd2x_v2df",   MEM)
@@ -1835,6 +1843,7 @@ BU_VSX_OVERLOAD_1 (VUNSIGNEDO,  "vunsignedo")
 BU_VSX_OVERLOAD_X (LD,  "ld")
 BU_VSX_OVERLOAD_X (ST,  "st")
 BU_VSX_OVERLOAD_X (XL,  "xl")
+BU_VSX_OVERLOAD_X (XL_BE,"xl_be")
 BU_VSX_OVERLOAD_X (XST, "xst")
 
 /* 2 argument CMPB instructions added in ISA 2.05. */
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 1359099..7ffb3fd 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -3077,6 +3077,26 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 ~RS6000_BTI_unsigned_V16QI, 0 },
   { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V16QI,
 RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 },
+  { VSX_BUILTIN_VEC_XL_BE, VSX_BUILTIN_XL_BE_V16QI,
+RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 },
+  { VSX_BUILTIN_VEC_XL_BE, VSX_BUILTIN_XL_BE_V16QI,
+RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI,

Re: [PATCH 2/2] Remove reload_in_progress and other cleanups.

2017-07-27 Thread Peter Bergner
On 7/27/17 2:29 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jul 27, 2017 at 10:44:43AM -0500, Peter Bergner wrote:
>> This patch removes reload specific code from the rs6000 port made possible
>> by the elimination of the usage of the -mno-lra option.
> 
>>  * config/rs6000/predicates.md (volatile_mem_operand) Remove code
>>  related to reload_in_progress.
> 
> Missing colon here.
> 
>>  * config/rs6000/rs6000-protos.h (rs6000_secondary_memory_needed_rtx)
>>  Delete prototype.
> 
> And here.

Oops, thanks for catching those.  Fixed.


> Okay with those trivial changelog fixes.  Nice cleanup :-)

Committed as revision 250638, thanks!

Peter




Re: [PATCH 1/2] Eliminate -mno-lra from the rs6000 port.

2017-07-27 Thread Peter Bergner
On 7/27/17 11:47 AM, Segher Boessenkool wrote:
> On Thu, Jul 27, 2017 at 10:43:44AM -0500, Peter Bergner wrote:
>> This patch makes the -mlra option a nop while disallowing -mno-lra.
>> It also removes the target bit mask and its usage.
>> Finally, this patch updates the testsuite by removing all usage of
>> the -mlra and -mno-lra options.
> 
> So you left the -mno-lra testcases (but without -mno-lra) because it
> gives some more test coverage this way.  Okay.
> 
> Looks fine, please commit.  Thanks!

Committed as revision 250637, thanks!

Peter



Re: [PATCH 04/17] C frontend: capture BLT information

2017-07-27 Thread Martin Sebor

On 07/24/2017 02:05 PM, David Malcolm wrote:

This patch extends the C frontend so that it optionally builds a BLT tree,
by using an auto_blt_node class within the recursive descent through the
parser, so that its ctor/dtors build the blt_node hierarchy; this is
rapidly (I hope) rejected in the no -fblt case, so that by default, no
blt nodes are created and it's (I hope) close to a no-op.


...

@@ -4627,6 +4629,8 @@ start_decl (struct c_declarator *declarator, struct c_declspecs *declspecs,
 deprecated_state);
   if (!decl || decl == error_mark_node)
 return NULL_TREE;
+  if (declarator->bltnode)
+declarator->bltnode->set_tree (decl);


FWIW, if this checking were to become widespread, an alternative
to introducing conditionals like this is to set declarator->bltnode
to an object of a dummy type that doesn't do anything if -fblt isn't
specified, and to an object of the real "workhorse" type that does
the work with -fblt.  The object can either be polymorphic if the
cost of the virtual call isn't a concern, or it can be a static
wrapper around a polymorphic interface, where the (inline member
functions of the) static wrapper do the checking under the hood
so the users don't have to worry about it.  (A more C-like
alternative is to simply hide the check in an inline function.)
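
A minimal sketch of the static-wrapper variant, under stated
assumptions: blt_node_sketch and blt_handle are hypothetical names, and
the string payload merely stands in for the tree that set_tree records
in the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical "workhorse" type; not the blt_node from the patch.
struct blt_node_sketch
{
  std::string tree_name;

  void set_tree (const std::string &t)
  {
    assert (!t.empty ());
    tree_name = t;
  }
};

// Static wrapper: callers always call set_tree, and the "is there a
// node at all?" check is hidden inside the inline member function.
class blt_handle
{
public:
  explicit blt_handle (blt_node_sketch *node = nullptr) : m_node (node) {}

  void set_tree (const std::string &t)
  {
    if (m_node)
      m_node->set_tree (t);
  }

  bool active () const { return m_node != nullptr; }

private:
  blt_node_sketch *m_node;
};
```

With something like this, the call site would shrink to an unconditional
set_tree call on the handle, and a default-constructed handle (the
no -fblt case) silently does nothing.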



+/* A RAII-style class for optionally building a concrete parse tree (or
+   at least, something close to it) as the recursive descent parser runs.
+
+   Close to a no-op if -fblt is not selected.  */
+
+class auto_blt_node
+{
+public:
+  auto_blt_node (c_parser* parser, enum blt_kind kind);
+  ~auto_blt_node ();


Unless the class is safely copyable and assignable (I can't tell
from the dtor definition) I would suggest to define its copy ctor
and assignment operator private to avoid introducing subtle bugs
by inadvertently making copies.
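
For illustration only, here is how that might look on a toy RAII class.
auto_node_sketch is a hypothetical stand-in that pushes and pops an
integer on a vector; it is not auto_blt_node.  The sketch uses the C++11
"= delete" spelling; in C++98 the same effect comes from declaring both
members private and leaving them undefined.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an RAII parser helper: pushes a node kind
// on construction and pops it again on destruction.
class auto_node_sketch
{
public:
  auto_node_sketch (std::vector<int> *stack, int kind)
    : m_stack (stack)
  {
    m_stack->push_back (kind);
  }

  ~auto_node_sketch ()
  {
    assert (!m_stack->empty ());
    m_stack->pop_back ();
  }

  // A copy would pop the same entry a second time when it is destroyed,
  // so copying and assignment are disabled.
  auto_node_sketch (const auto_node_sketch &) = delete;
  auto_node_sketch &operator= (const auto_node_sketch &) = delete;

private:
  std::vector<int> *m_stack;
};
```

Any accidental copy (for example passing the object by value) then
fails to compile instead of unbalancing the stack at run time.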

Martin


Re: [PATCH] [PowerPC/RTEMS] Add 64-bit support using ELFv2 ABI

2017-07-27 Thread Segher Boessenkool
On Thu, Jul 27, 2017 at 07:28:30AM +0200, Sebastian Huber wrote:
> >This deletes eabi.h and I don't see you add all its definitions to
> >rtems.h directly (NAME__MAIN etc.)  Is this on purpose?
> 
> Yes, I always wondered why GCC added the __eabi() call to main() out of 
> thin air. In general, there is no main() function in RTEMS. Instead, you 
> can statically configure initialization threads. We call __eabi() in the 
> low-level startup code, e.g.
> 
> https://git.rtems.org/rtems/tree/c/src/lib/libbsp/powerpc/qoriq/start/start.S#n144

Heh, I always thought the EABI must require it, but it seems to be a GCC
invention.

Patch looks fine to me then.  You can approve it yourself of course :-)


Segher


Re: [PATCH 00/17] RFC: New source-location representation; Language Server Protocol

2017-07-27 Thread Martin Sebor

I think this is great.  My overall question is: is the new BLT
representation available in the middle-end?  If not, do you
have plans to make it available?  (I think it might be especially
useful there, to either accurately highlight the source of
a problem when it's far removed from the problem site, or to
track the flow of data from its source to the sink.)


=
(b) Examples of usage
=

Patches 6-10 in the kit update various diagnostics to use
the improved location information where available:

* C and C++: highlighting the pertinent parameter of a function
  decl when there's a mismatched type in a call


Very nice!



* C and C++: highlighting the return type in the function defn
  when complaining about mismatch in the body (e.g. highlighting
  the "void" when we see a "return EXPR;" within a void function).

* C++: add a fix-it hint to -Wsuggest-override

I have plenty of ideas for other uses of this infrastructure
(but which aren't implemented yet), e.g.:

* C++: highlight the "const" token (or suggest a fix-it hint)
  when you have a missing "const" on the *definition* of a member
  function that was declared as "const" (I make this mistake
  all the time).

* C++: add a fix-it hint to -Wsuggest-final-methods


To answer your call for other ideas below: There are other
suggestions that GCC could offer to help improve code, including

 *  to add the const qualifier to function pointer and reference
arguments (including member functions)

 *  to add restrict where appropriate (especially to const pointer
and reference arguments)

 *  to delete or default the default copy ctor or copy assignment
operator if/when it would benefit

 *  to help highlight how to optimize data layout (with a new
hypothetical feature that offered suggestions to reorder
struct or class members for space efficiency or data
locality)



* highlight bogus attributes


This would be very nice.  The diagnostics in this area are weak
to put it mildly, and the highlighting is completely bogus.  It
would be great to also be able to highlight attribute operands.



* add fix-it hints suggesting missing attributes

...etc, plus those "cousins of a compiler" ideas mentioned above.

Any other ideas?


This may be outside the scope of your work but when a declaration
is found to conflict in some way with one seen earlier on in a file,
it would be great to be able to point to the actual source of the
conflict rather than to the immediately preceding declaration as
a whole.  As in:

  int __attribute__ ((noinline)) foo (int);

  int foo (int);

  int __attribute ((always_inline)) foo (int);

  x.c:5:35: warning: declaration of ‘int foo(int)’ with attribute 
‘always_inline’ follows declaration with attribute ‘noinline’ [-Wattributes]

   int __attribute ((always_inline)) foo (int);
 ^~~

Rather than printing a note like this:

  x.c:3:5: note: previous declaration of ‘int foo(int)’ was here
   int foo (int);
   ^~~

print this:

  x.c:1:5: note: previous declaration of ‘int foo(int)’ was here
  int __attribute__ ((noinline)) foo (int);
 ^~~

(preferably with the attribute underlined).

I'm sure there are many others.

Martin


[PATCH] PR debug/81570: dwarf2cfi.c: Update cfa.offset in create_pseudo_cfg

2017-07-27 Thread H.J. Lu
execute_dwarf2_frame is called for each function.  But create_cie_data
is called only once to initialize cie_cfi_row for all functions.  Since
INCOMING_FRAME_SP_OFFSET may be different for each function, we can't
use the same INCOMING_FRAME_SP_OFFSET in cie_cfi_row for all functions.
This patch sets cie_cfi_row->cfa.offset to INCOMING_FRAME_SP_OFFSET in
create_pseudo_cfg which is called for each function.
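
The failure mode can be modelled in a few lines.  This is only a toy
model of the description above: cfa_sketch, create_cie_data_sketch and
begin_function_sketch are hypothetical names, and the offsets are
arbitrary numbers rather than real INCOMING_FRAME_SP_OFFSET values.

```cpp
#include <cassert>

// Toy model of the shared row that is set up once for all functions.
struct cfa_sketch
{
  long offset;
};

static cfa_sketch cie_row;
static bool cie_row_initialized = false;

// Only the first call records the offset, mirroring a one-time
// initialization that sees just the first function.
static void
create_cie_data_sketch (long incoming_offset)
{
  if (cie_row_initialized)
    return;
  cie_row.offset = incoming_offset;
  cie_row_initialized = true;
}

// Per-function entry point.  With REFRESH_OFFSET false the shared row
// keeps whatever the first function stored; with it true the offset is
// reset for the current function, which is the idea behind the fix.
long
begin_function_sketch (long incoming_offset, bool refresh_offset)
{
  create_cie_data_sketch (incoming_offset);
  assert (cie_row_initialized);
  if (refresh_offset)
    cie_row.offset = incoming_offset;
  return cie_row.offset;
}
```

In this model a second function with a different incoming offset sees
the first function's value unless the per-function refresh is enabled.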

Tested on x86-64.  OK for trunk?

Thanks.


H.J.
PR debug/81570
* dwarf2cfi.c (create_pseudo_cfg): Set cie_cfi_row->cfa.offset
to INCOMING_FRAME_SP_OFFSET.
---
 gcc/dwarf2cfi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index a5f9832fc4a..c40f31d2f20 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -2831,6 +2831,9 @@ create_pseudo_cfg (void)
   memset (&ti, 0, sizeof (ti));
   ti.head = get_insns ();
   ti.beg_row = cie_cfi_row;
+  /* Set cfa.offset to INCOMING_FRAME_SP_OFFSET here since it may be
+ different for each function.  */
+  cie_cfi_row->cfa.offset = INCOMING_FRAME_SP_OFFSET;
   ti.cfa_store = cie_cfi_row->cfa;
   ti.cfa_temp.reg = INVALID_REGNUM;
   trace_info.quick_push (ti);
-- 
2.13.3



Re: [PATCH 2/2] Remove reload_in_progress and other cleanups.

2017-07-27 Thread Segher Boessenkool
Hi!

On Thu, Jul 27, 2017 at 10:44:43AM -0500, Peter Bergner wrote:
> This patch removes reload specific code from the rs6000 port made possible
> by the elimination of the usage of the -mno-lra option.

>   * config/rs6000/predicates.md (volatile_mem_operand) Remove code
>   related to reload_in_progress.

Missing colon here.

>   * config/rs6000/rs6000-protos.h (rs6000_secondary_memory_needed_rtx)
>   Delete prototype.

And here.

Okay with those trivial changelog fixes.  Nice cleanup :-)


Segher


[committed] Fix C omp for verification (PR c/45784)

2017-07-27 Thread Jakub Jelinek
Hi!

Apparently for C sizeof on VLA the FE tends to emit something that
is folded into a COMPOUND_EXPR with the VLA decl on lhs and the actual
condition on rhs.  In the bar routine in the testcase I'm actually
testing a case where there are multiple such COMPOUND_EXPRs.

This patch accepts those and moves those to the non-decl operand of
the comparison, which is really the only spot where it could be actually
used anyway.
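
The shape of that transformation can be shown on a toy expression tree.
This is a sketch under stated assumptions, not the c-omp.c code: expr,
leaf, node, hoist_into_rhs and show are hypothetical helpers, COMMA
plays the role of COMPOUND_EXPR and LESS the role of the comparison.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy expression tree with three node kinds.
struct expr
{
  enum kind_t { LEAF, COMMA, LESS } kind;
  std::string name;                // used by LEAF
  std::shared_ptr<expr> op0, op1;  // used by COMMA and LESS
};
typedef std::shared_ptr<expr> expr_p;

expr_p
leaf (const std::string &n)
{
  return std::make_shared<expr> (expr{expr::LEAF, n, nullptr, nullptr});
}

expr_p
node (expr::kind_t k, expr_p a, expr_p b)
{
  return std::make_shared<expr> (expr{k, "", a, b});
}

// Turn (a, (b, p < x)) into p < (a, (b, x)): skip the COMMA wrappers
// to find the comparison, then rebuild the wrappers around its
// right-hand operand.
expr_p
hoist_into_rhs (expr_p top)
{
  expr_p cond = top;
  while (cond->kind == expr::COMMA)
    cond = cond->op1;
  if (cond == top)
    return top;
  assert (cond->kind == expr::LESS);

  expr_p rebuilt;
  expr_p *tail = &rebuilt;
  for (expr_p c = top; c != cond; c = c->op1)
    {
      // Each new COMMA starts with the comparison's rhs as its second
      // operand; the next iteration overwrites it with a deeper COMMA.
      *tail = node (expr::COMMA, c->op0, cond->op1);
      tail = &(*tail)->op1;
    }
  cond->op1 = rebuilt;
  return cond;
}

// Print the tree so the result can be checked.
std::string
show (const expr_p &e)
{
  switch (e->kind)
    {
    case expr::LEAF:
      return e->name;
    case expr::COMMA:
      return "(" + show (e->op0) + ", " + show (e->op1) + ")";
    case expr::LESS:
      return show (e->op0) + " < " + show (e->op1);
    }
  return "";
}
```

The comparison ends up at the top of the tree, where the loop
verification can inspect it, while the side effects are still evaluated
before the value they guard.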

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk,
queued for backporting to release branches.

2017-07-27  Jakub Jelinek  

PR c/45784
* c-omp.c (c_finish_omp_for): If the condition is wrapped in
rhs of COMPOUND_EXPR(s), skip them and readd their lhs into
new COMPOUND_EXPRs around the rhs of the comparison.

* testsuite/libgomp.c/pr45784.c: New test.
* testsuite/libgomp.c++/pr45784.C: New test.

--- gcc/c-family/c-omp.c.jj 2017-01-01 12:45:46.0 +0100
+++ gcc/c-family/c-omp.c2017-07-27 18:33:40.764274882 +0200
@@ -531,6 +531,12 @@ c_finish_omp_for (location_t locus, enum
{
  bool cond_ok = false;
 
+ /* E.g. C sizeof (vla) could add COMPOUND_EXPRs with
+evaluation of the vla VAR_DECL.  We need to readd
+them to the non-decl operand.  See PR45784.  */
+ while (TREE_CODE (cond) == COMPOUND_EXPR)
+   cond = TREE_OPERAND (cond, 1);
+
  if (EXPR_HAS_LOCATION (cond))
elocus = EXPR_LOCATION (cond);
 
@@ -605,6 +611,21 @@ c_finish_omp_for (location_t locus, enum
  else if (code != CILK_SIMD && code != CILK_FOR)
cond_ok = false;
}
+
+ if (cond_ok && TREE_VEC_ELT (condv, i) != cond)
+   {
+ tree ce = NULL_TREE, *pce = &ce;
+ tree type = TREE_TYPE (TREE_OPERAND (cond, 1));
+ for (tree c = TREE_VEC_ELT (condv, i); c != cond;
+  c = TREE_OPERAND (c, 1))
+   {
+ *pce = build2 (COMPOUND_EXPR, type, TREE_OPERAND (c, 0),
+TREE_OPERAND (cond, 1));
+ pce = &TREE_OPERAND (*pce, 1);
+   }
+ TREE_OPERAND (cond, 1) = ce;
+ TREE_VEC_ELT (condv, i) = cond;
+   }
}
 
  if (!cond_ok)
--- libgomp/testsuite/libgomp.c/pr45784.c.jj    2017-07-27 18:50:53.230801321 +0200
+++ libgomp/testsuite/libgomp.c/pr45784.c   2017-07-27 18:51:08.436617805 +0200
@@ -0,0 +1,41 @@
+/* PR c/45784 */
+/* { dg-do run } */
+
+void
+foo (int n)
+{
+  char *p, vla[2 * n];
+  int i;
+  #pragma omp parallel for
+  for (p = vla; p < vla + (sizeof (vla) / sizeof (vla[0])); p++)
+*p = ' ';
+  #pragma omp parallel for
+  for (i = 0; i < 2 * n; i++)
+if (vla[i] != ' ')
+  __builtin_abort ();
+}
+
+void
+bar (int n)
+{
+  char *p, vla1[n], vla2[n * 2], vla3[n * 3], vla4[n * 4];
+  int i;
+  __builtin_memset (vla4, ' ', n * 4);
+  #pragma omp parallel for
+  for (p = vla4 + sizeof (vla1); p < vla4 + sizeof (vla3) - sizeof (vla2) + sizeof (vla1); p += sizeof (vla4) / sizeof (vla4))
+p[0] = '!';
+  #pragma omp parallel for
+  for (i = 0; i < n * 4; i++)
+if (vla4[i] != ((i >= n && i < 2 * n) ? '!' : ' '))
+  __builtin_abort ();
+}
+
+int
+main ()
+{
+  volatile int n;
+  n = 128;
+  foo (n);
+  bar (n);
+  return 0;
+}
--- libgomp/testsuite/libgomp.c++/pr45784.C.jj  2017-07-27 18:51:38.545254431 +0200
+++ libgomp/testsuite/libgomp.c++/pr45784.C 2017-07-27 18:51:32.404328545 +0200
@@ -0,0 +1,5 @@
+// PR c/45784
+// { dg-do run }
+
+#include "../libgomp.c/pr45784.c"
+

Jakub


Re: [PATCH] New C++ warning option '-Wduplicated-access-specifiers'

2017-07-27 Thread Richard Sandiford
Martin Sebor  writes:
> On 07/23/2017 02:42 PM, Volker Reichelt wrote:
>> On 21 Jul, Martin Sebor wrote:
>>> On 07/20/2017 10:35 AM, Volker Reichelt wrote:
>>>> Hi,
>>>>
>>>> the following patch introduces a new C++ warning option
>>>> -Wduplicated-access-specifiers that warns about redundant
>>>> access-specifiers in classes, e.g.
>>>>
>>>>   class B
>>>>   {
>>>>     public:
>>>>       B();
>>>>
>>>>     private:
>>>>       void foo();
>>>>     private:
>>>>       int i;
>>>>   };
>>>
>>> I'm very fond of warnings that help find real bugs, or even that
>>> provide an opportunity to review code for potential problems or
>>> inefficiencies and suggest a possibility to improve it in some
>>> way (make it clearer, or easier for humans or compilers to find
>>> real bugs in, or faster, etc.), even if the code isn't strictly
>>> incorrect.
>>>
>>> In this case I'm having trouble coming up with an example where
>>> the warning would have this effect.  What do you have in mind?
>>> (Duplicate access specifiers tend to crop up in large classes
>>> and/or as a result of preprocessor conditionals.)
>>
>> This warning fights the "tend to crop up" effect that clutters the
>> code. After some time these redundant access specifiers creep in
>> and make your code harder to read. If I see something like
>>
>>   ...
>> void foo() const;
>>
>>   private:
>> void bar();
>>   ...
>>
>> on the screen I tend to think that 'foo' has a different access
>> level than bar. If that's not the case because the access-specifier
>> is redundant, then that's just confusing and distracting.
>> The warning helps to maintain readability of your code.
>>
>> The benefit might not be big, but the warning comes at relatively
>> low cost. It passes a location around through the class stack and
>> checks less than a handful of tokens.
>>
>> My personal motivation to implement this warning was the fact that
>> I changed a big Qt application suite from Qt's legacy SIGNAL-SLOT
>> mechanism to the new function pointer syntax of Qt5. In the old
>> version you had to mark slots in the following fashion:
>>
>>   public slots:
>> void foo();
>> void bar();
>>
>> But now you can use any function making the 'slots' macro obsolete.
>> Therefore I ended up with hundreds of redundant access-specifiers
>> which this warning helped to clean up. Doing this sort of thing in the
>> compiler with a proper parser is way safer than to write some script
>> to achieve this.
>
> Okay, thanks for clarifying that.  I think what's distracting to
> one could be helpful to another.  For example, it's not uncommon
> for classes with many members to use redundant access specifiers
> to group blocks of related declarations.  Or, in a class with many
> lines of comments (e.g., Doxygen), repeating the access specifier
> every so often could be seen as helpful because otherwise there
> would be no way to tell what its access is without scrolling up
> or down.  It's debatable what approach to dealing with this is
> best.  Java, for instance, requires every non-public member to
> be declared with its own access specifier.  Some other languages
> (I think D) do as well.  An argument could be made that that's
> a far clearer style than using the specifier only when changing
> access.  It seems to me that the most suitable approach will be
> different from one project to another, if not from one person to
> another.  A diagnostic that recommends a particular style (or that
> helps with a specific kind of a project like the one you did for
> Qt) might be a good candidate for a plugin, but enshrining any
> one style (or a solution to a project-specific problem) in GCC
> as a general-purpose warning doesn't seem appropriate or in line
> with the definition of warnings in GCC:
>
>constructions that are not inherently erroneous but that are
>risky or suggest there may have been an error

I think there are some circumstances in which the warning would count,
especially if you're working to a coding convention that requires all
public members followed by all protected members followed by all private
members.  Having a duplicated specifier in that context might then
indicate that you've got a cut-&-paste error.

I think both that scenario and the ones Volker gave are enough
justification for the warning to be useful, but not enough for
including it in Wall or Wextra (which isn't being proposed).

> PS There are other redundancies that some might say unnecessarily
> clutter code.  For instance, declaring a symbol static in
> an unnamed namespace, or explicitly declaring a member function
> inline that's also defined within the body of a class, or
> explicitly declaring a function virtual that overrides one
> declared in a base class.  None of these is diagnosed, and I'd
> say for good reason: they are all benign and do not suggest any
> sort of a coding mistake or present an opportunity for improvement.
> In fact, warning for some of them (especially the virtual function

[PATCH] Fix parloops ICE (PR tree-optimization/81578)

2017-07-27 Thread Jakub Jelinek
Hi!

Not all vectorizable reductions are valid OpenMP standard reductions
(we could create user defined reductions from that, but that would be
quite a lot of work).

This patch bails out for unsupported reductions.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-07-27  Jakub Jelinek  

PR tree-optimization/81578
* tree-parloops.c (build_new_reduction): Bail out if
reduction_code isn't one of the standard OpenMP reductions.
Move the details printing after that decision.

* gcc.dg/pr81578.c: New test.

--- gcc/tree-parloops.c.jj  2017-07-19 14:01:24.0 +0200
+++ gcc/tree-parloops.c 2017-07-27 14:27:13.966749227 +0200
@@ -2475,23 +2475,39 @@ build_new_reduction (reduction_info_tabl
 
   gcc_assert (reduc_stmt);
 
-  if (dump_file && (dump_flags & TDF_DETAILS))
-{
-  fprintf (dump_file,
-  "Detected reduction. reduction stmt is:\n");
-  print_gimple_stmt (dump_file, reduc_stmt, 0);
-  fprintf (dump_file, "\n");
-}
-
   if (gimple_code (reduc_stmt) == GIMPLE_PHI)
 {
   tree op1 = PHI_ARG_DEF (reduc_stmt, 0);
   gimple *def1 = SSA_NAME_DEF_STMT (op1);
   reduction_code = gimple_assign_rhs_code (def1);
 }
-
   else
 reduction_code = gimple_assign_rhs_code (reduc_stmt);
+  /* Check for OpenMP supported reduction.  */
+  switch (reduction_code)
+{
+case PLUS_EXPR:
+case MULT_EXPR:
+case MAX_EXPR:
+case MIN_EXPR:
+case BIT_IOR_EXPR:
+case BIT_XOR_EXPR:
+case BIT_AND_EXPR:
+case TRUTH_OR_EXPR:
+case TRUTH_XOR_EXPR:
+case TRUTH_AND_EXPR:
+  break;
+default:
+  return;
+}
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file,
+  "Detected reduction. reduction stmt is:\n");
+  print_gimple_stmt (dump_file, reduc_stmt, 0);
+  fprintf (dump_file, "\n");
+}
 
   new_reduction = XCNEW (struct reduction_info);
 
--- gcc/testsuite/gcc.dg/pr81578.c.jj   2017-07-27 14:48:16.426441581 +0200
+++ gcc/testsuite/gcc.dg/pr81578.c  2017-07-27 14:48:01.0 +0200
@@ -0,0 +1,12 @@
+/* PR tree-optimization/81578 */
+/* { dg-do compile { target pthread } } */
+/* { dg-options "-O2 -ftree-parallelize-loops=2" } */
+
+int
+foo (int *x)
+{
+  int i, r = 1;
+  for (i = 0; i != 1024; i++)
+r *= x[i] < 0;
+  return r;
+}

Jakub
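
For readers following along outside the GCC tree, the bail-out added to build_new_reduction can be sketched as a standalone predicate. This is only an illustration: the enum below is a hypothetical stand-in for GCC's real tree codes, and COND_EXPR and LSHIFT_EXPR are included merely as examples of codes that are not in the patch's list.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's tree codes; only the names mirror
   the patch, the numeric values are arbitrary.  */
enum tree_code_sketch
{
  PLUS_EXPR, MULT_EXPR, MAX_EXPR, MIN_EXPR,
  BIT_IOR_EXPR, BIT_XOR_EXPR, BIT_AND_EXPR,
  TRUTH_OR_EXPR, TRUTH_XOR_EXPR, TRUTH_AND_EXPR,
  COND_EXPR, LSHIFT_EXPR   /* examples of codes outside the list */
};

/* Return true if CODE is one of the reduction codes the patch accepts;
   anything else corresponds to the patch's early "return".  */
static bool
openmp_standard_reduction_p (enum tree_code_sketch code)
{
  switch (code)
    {
    case PLUS_EXPR:
    case MULT_EXPR:
    case MAX_EXPR:
    case MIN_EXPR:
    case BIT_IOR_EXPR:
    case BIT_XOR_EXPR:
    case BIT_AND_EXPR:
    case TRUTH_OR_EXPR:
    case TRUTH_XOR_EXPR:
    case TRUTH_AND_EXPR:
      return true;
    default:
      return false;
    }
}
```

In the real patch the default case simply returns from build_new_reduction, so no reduction_info entry is created and nothing is written to the dump file for the rejected statement.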


Re: [Patch AArch64 2/2] Fix memory sizes to load/store patterns

2017-07-27 Thread James Greenhalgh
On Mon, Jul 03, 2017 at 11:46:58AM +0100, James Greenhalgh wrote:
> On Wed, Jun 21, 2017 at 11:50:08AM +0100, James Greenhalgh wrote:
> > *ping*
> 
> Ping*2

Ping*3

Thanks,
James

> 
> Thanks,
> James
> 
> > On Mon, Jun 12, 2017 at 02:54:00PM +0100, James Greenhalgh wrote:
> > > 
> > > Hi,
> > > 
> > > There seems to be a partial misconception in the AArch64 backend that
> > > load1/load2 referred to the number of registers to load, rather than the
> > > number of words to load. This patch fixes that using the new "number of
> > > byte" types added in the previous patch.
> > > 
> > > That means using the load_16 and store_16 types that were defined in the
> > > previous patch for the first time in the AArch64 backend. To ensure
> > > continuity for scheduling models, I've just split this out from load_8.
> > > Please update your models if this is very wrong!
> > > 
> > > Bootstrapped on aarch64-none-linux-gnu with no issue.
> > > 
> > > OK?
> > > 
> > > Thanks,
> > > James
> > > 
> > > ---
> > > 2017-06-12  James Greenhalgh  
> > > 
> > >   * config/aarch64/aarch64.md (movdi_aarch64): Set load/store
> > >   types correctly.
> > >   (movti_aarch64): Likewise.
> > >   (movdf_aarch64): Likewise.
> > >   (movtf_aarch64): Likewise.
> > >   (load_pairdi): Likewise.
> > >   (store_pairdi): Likewise.
> > >   (load_pairdf): Likewise.
> > >   (store_pairdf): Likewise.
> > >   (loadwb_pair_): Likewise.
> > >   (storewb_pair_): Likewise.
> > >   (ldr_got_small_): Likewise.
> > >   (ldr_got_small_28k_): Likewise.
> > >   (ldr_got_tiny): Likewise.
> > >   * config/aarch64/iterators.md (ldst_sz): New.
> > >   (ldpstp_sz): Likewise.
> > >   * config/aarch64/thunderx.md (thunderx_storepair): Split store_8
> > >   to store_16.
> > >   (thunderx_load): Split load_8 to load_16.
> > >   * config/aarch64/thunderx2t99.md (thunderx2t99_loadpair): Split
> > >   load_8 to load_16.
> > >   (thunderx2t99_storepair_basic): Split store_8 to store_16.
> > >   * config/arm/xgene1.md (xgene1_load_pair): Split load_8 to load_16.
> > >   (xgene1_store_pair): Split store_8 to store_16.
> > > 
> > 
> > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> > > index 11295a6..a1385e3 100644
> > > --- a/gcc/config/aarch64/aarch64.md
> > > +++ b/gcc/config/aarch64/aarch64.md
> > > @@ -981,7 +981,7 @@
> > > DONE;
> > >  }"
> > >[(set_attr "type" "mov_reg,mov_reg,mov_reg,mov_imm,mov_imm,\
> > > - load_4,load_4,store_4,store_4,\
> > > + load_8,load_8,store_8,store_8,\
> > >   adr,adr,f_mcr,f_mrc,fmov,neon_move")
> > > (set_attr "fp" "*,*,*,*,*,*,yes,*,yes,*,*,yes,yes,yes,*")
> > > (set_attr "simd" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,yes")]
> > > @@ -1026,7 +1026,8 @@
> > > ldr\\t%q0, %1
> > > str\\t%q1, %0"
> > >[(set_attr "type" "multiple,f_mcr,f_mrc,neon_logic_q, \
> > > -  load_8,store_8,store_8,f_loadd,f_stored")
> > > +  load_16,store_16,store_16,\
> > > + load_16,store_16")
> > > (set_attr "length" "8,8,8,4,4,4,4,4,4")
> > > (set_attr "simd" "*,*,*,yes,*,*,*,*,*")
> > > (set_attr "fp" "*,*,*,*,*,*,*,yes,yes")]
> > > @@ -1121,7 +1122,7 @@
> > > str\\t%x1, %0
> > > mov\\t%x0, %x1"
> > >[(set_attr "type" "neon_move,f_mcr,f_mrc,fmov,fconstd,\
> > > - f_loadd,f_stored,load_4,store_4,mov_reg")
> > > + f_loadd,f_stored,load_8,store_8,mov_reg")
> > > (set_attr "simd" "yes,*,*,*,*,*,*,*,*,*")]
> > >  )
> > >  
> > > @@ -1145,7 +1146,7 @@
> > > stp\\t%1, %H1, %0
> > > stp\\txzr, xzr, %0"
> > >[(set_attr "type" "logic_reg,multiple,f_mcr,f_mrc,neon_move_q,f_mcr,\
> > > - f_loadd,f_stored,load_8,store_8,store_8")
> > > + f_loadd,f_stored,load_16,store_16,store_16")
> > > (set_attr "length" "4,8,8,8,4,4,4,4,4,4,4")
> > > (set_attr "simd" "yes,*,*,*,yes,*,*,*,*,*,*")]
> > >  )
> > > @@ -1209,7 +1210,7 @@
> > >"@
> > > ldp\\t%x0, %x2, %1
> > > ldp\\t%d0, %d2, %1"
> > > -  [(set_attr "type" "load_8,neon_load1_2reg")
> > > +  [(set_attr "type" "load_16,neon_load1_2reg")
> > > (set_attr "fp" "*,yes")]
> > >  )
> > >  
> > > @@ -1244,7 +1245,7 @@
> > >"@
> > > stp\\t%x1, %x3, %0
> > > stp\\t%d1, %d3, %0"
> > > -  [(set_attr "type" "store_8,neon_store1_2reg")
> > > +  [(set_attr "type" "store_16,neon_store1_2reg")
> > > (set_attr "fp" "*,yes")]
> > >  )
> > >  
> > > @@ -1278,7 +1279,7 @@
> > >"@
> > > ldp\\t%d0, %d2, %1
> > > ldp\\t%x0, %x2, %1"
> > > -  [(set_attr "type" "neon_load1_2reg,load_8")
> > > +  [(set_attr "type" "neon_load1_2reg,load_16")
> > > (set_attr "fp" "yes,*")]
> > >  )
> > >  
> > > @@ -1312,7 +1313,7 @@
> > >"@
> > > stp\\t%d1, %d3, %0
> > > stp\\t%x1, %x3, %0"
> > > -  [(set_attr "type" "neon_store1_2reg,store_8")
> > > +  [(set_attr "type" "neon_store1_2reg,store_16")
> > >  

Re: [Mechanical Patch ARM/AArch64 1/2] Rename load/store scheduling types to encode data size

2017-07-27 Thread James Greenhalgh
On Wed, Jun 21, 2017 at 11:49:47AM +0100, James Greenhalgh wrote:
> On Mon, Jun 12, 2017 at 03:28:52PM +0100, Kyrill Tkachov wrote:

*ping ^2*

Thanks,
James


> > 
> > On 12/06/17 14:53, James Greenhalgh wrote:
> > >Hi,
> > >
> > >In the AArch64 backend and scheduling models there is some confusion as to
> > >what the load1/load2 etc. scheduling types refer to. This leads to us using
> > >load1/load2 in two contexts - for a variety of 32-bit, 64-bit and 128-bit
> > >loads in AArch32 and 128-bit loads in AArch64. That leads to an undesirable
> > >confusion in scheduling.
> > >
> > >Fixing it is easy, but mechanical and boring. Essentially,
> > >
> > >   s/load1/load_4/
> > >   s/load2/load_8/
> > >   s/load3/load_12/
> > >   s/load4/load_16/
> > >   s/store1/store_4/
> > >   s/store2/store_8/
> > >   s/store3/store_12/
> > >   s/store4/store_16/
> > 
> > So the number now is the number of bytes being loaded?
> > 
> > >Across all sorts of pipeline models, and the two backends.
> > >
> > >I have intentionally not modified any of the patterns which now look
> > >obviously incorrect. I'll be doing a second pass over the AArch64 back-end
> > >in patch 2/2 which will fix these bugs. The AArch32 back-end looked to me
> > >to get this correct.
> > >
> > >Bootstrapped on AArch64 and ARM without issue - there's no functional
> > >change here.
> > >
> > >OK?
> > 
> > Ok from an arm perspective.
> 
> *Ping* for the AArch64 maintainers.
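
The rename quoted above follows a simple rule: the old suffix counted 32-bit words, the new one counts bytes. A minimal sketch of that mapping, assuming 4 bytes per word (as implied by load1 becoming load_4); the helper name and the constant are made up for illustration and are not part of the patch:

```c
#include <stdio.h>

/* Assumption: one "word" in the old type names is 4 bytes.  */
enum { BYTES_PER_WORD_SKETCH = 4 };

/* Write the new-style scheduling type name for KIND ("load" or "store")
   and an old-style word count into BUF.  Returns snprintf's result.  */
static int
renamed_sched_type (char *buf, size_t len, const char *kind, int words)
{
  return snprintf (buf, len, "%s_%d", kind, words * BYTES_PER_WORD_SKETCH);
}
```

For example, renamed_sched_type (buf, sizeof buf, "store", 3) produces "store_12", matching the s/store3/store_12/ line in the quoted list.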




Re: [Patch AArch64] Stop generating BSL for simple integer code

2017-07-27 Thread James Greenhalgh

On Mon, Jun 12, 2017 at 02:44:40PM +0100, James Greenhalgh wrote:
> [Sorry for the re-send. I spotted that the attributes were not right for the
>  new pattern I was adding. The change between this and the first version was:
>
>   +  [(set_attr "type" "neon_bsl,neon_bsl,neon_bsl,multiple")
>   +   (set_attr "length" "4,4,4,12")]
> ]
>
> ---
>
> Hi,
>
> In this testcase, all argument registers and the return register
> will be general purpose registers:
>
>   long long
>   foo (long long a, long long b, long long c)
>   {
> return ((a ^ b) & c) ^ b;
>   }
>
> However, due to the implementation of aarch64_simd_bsl_internal
> we'll match that pattern and emit a BSL, necessitating moving all those
> arguments and results to the Advanced SIMD registers:
>
>   fmov d2, x0
>   fmov d0, x2
>   fmov d1, x1
>   bsl v0.8b, v2.8b, v1.8b
>   fmov x0, d0
>
> To fix this, we turn aarch64_simd_bsldi_internal into an insn_and_split that
> knows to split back to integer operations if the register allocation
> falls that way.
>
> We could have used an unspec, but then we lose some of the nice
> simplifications that can be made from explicitly spelling out the semantics
> of BSL.

Off list, Richard and I found considerable issues with this patch. From
idioms failing to match in the testcase, to subtle register allocation bugs,
to potential for suboptimal code generation.

That's not good!

This spin of the patch corrects those issues by adding a second split
pattern, allocating a temporary register if we're permitted to, or
properly using tied output operands if we're not, and by generally playing
things a bit safer around the possibility of register overlaps.

The testcase is expanded to an execute test, hopefully giving a little
more assurance that we're doing the right thing - now testing the BSL idiom
generation in both general-purpose and floating-point registers and comparing
the results. Hopefully we're now a bit more robust!
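
As a side note for readers, the identity behind the idiom can be checked in plain C. The sketch below is not taken from the patch or its testcases; the function names are hypothetical. It compares the xor form used in the quoted testcase with the conventional bit-select form, where each result bit comes from a when the corresponding bit of c is set and from b otherwise.

```c
#include <stdint.h>

/* The form from the quoted testcase: ((a ^ b) & c) ^ b.  */
static uint64_t
bsl_xor_form (uint64_t a, uint64_t b, uint64_t c)
{
  return ((a ^ b) & c) ^ b;
}

/* The conventional bit-select form: take A where C is 1, B where C is 0.  */
static uint64_t
bsl_select_form (uint64_t a, uint64_t b, uint64_t c)
{
  return (a & c) | (b & ~c);
}
```

Per bit: when the bit in c is 1, the xor form gives (a ^ b) ^ b, which is a; when it is 0, it gives 0 ^ b, which is b.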

Bootstrapped and tested on aarch64-none-linux-gnu and cross-tested on
aarch64-none-elf with no issues.

OK for trunk?

Thanks,
James

---
gcc/

2017-07-27  James Greenhalgh  

* config/aarch64/aarch64-simd.md
(aarch64_simd_bsl_internal): Remove DImode.
(*aarch64_simd_bsl_alt): Likewise.
(aarch64_simd_bsldi_internal): New.
(aarch64_simd_bsldi_alt): Likewise.

gcc/testsuite/

2017-07-27  James Greenhalgh  

* gcc.target/aarch64/bsl-idiom.c: New.
* gcc.target/aarch64/copysign-bsl.c: New.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 011fcec0..a186eae 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2280,13 +2280,13 @@
 ;; in *aarch64_simd_bsl_alt.
 
 (define_insn "aarch64_simd_bsl_internal"
-  [(set (match_operand:VSDQ_I_DI 0 "register_operand" "=w,w,w")
-	(xor:VSDQ_I_DI
-	   (and:VSDQ_I_DI
-	 (xor:VSDQ_I_DI
+  [(set (match_operand:VDQ_I 0 "register_operand" "=w,w,w")
+	(xor:VDQ_I
+	   (and:VDQ_I
+	 (xor:VDQ_I
 	   (match_operand: 3 "register_operand" "w,0,w")
-	   (match_operand:VSDQ_I_DI 2 "register_operand" "w,w,0"))
-	 (match_operand:VSDQ_I_DI 1 "register_operand" "0,w,w"))
+	   (match_operand:VDQ_I 2 "register_operand" "w,w,0"))
+	 (match_operand:VDQ_I 1 "register_operand" "0,w,w"))
 	  (match_dup: 3)
 	))]
   "TARGET_SIMD"
@@ -2304,14 +2304,14 @@
 ;; permutations of commutative operations, we have to have a separate pattern.
 
 (define_insn "*aarch64_simd_bsl_alt"
-  [(set (match_operand:VSDQ_I_DI 0 "register_operand" "=w,w,w")
-	(xor:VSDQ_I_DI
-	   (and:VSDQ_I_DI
-	 (xor:VSDQ_I_DI
-	   (match_operand:VSDQ_I_DI 3 "register_operand" "w,w,0")
-	   (match_operand:VSDQ_I_DI 2 "register_operand" "w,0,w"))
-	  (match_operand:VSDQ_I_DI 1 "register_operand" "0,w,w"))
-	  (match_dup:VSDQ_I_DI 2)))]
+  [(set (match_operand:VDQ_I 0 "register_operand" "=w,w,w")
+	(xor:VDQ_I
+	   (and:VDQ_I
+	 (xor:VDQ_I
+	   (match_operand:VDQ_I 3 "register_operand" "w,w,0")
+	   (match_operand:VDQ_I 2 "register_operand" "w,0,w"))
+	  (match_operand:VDQ_I 1 "register_operand" "0,w,w"))
+	  (match_dup:VDQ_I 2)))]
   "TARGET_SIMD"
   "@
   bsl\\t%0., %3., %2.
@@ -2320,6 +2320,100 @@
   [(set_attr "type" "neon_bsl")]
 )
 
+;; DImode is special, we want to avoid computing operations which are
+;; more naturally computed in general purpose registers in the vector
+;; registers.  If we do that, we need to move all three operands from general
+;; purpose registers to vector registers, then back again.  However, we
+;; don't want to make this pattern an UNSPEC as we'd lose scope for
+;; optimizations based on the component operations of a BSL.
+;;
+;; That means we need a splitter back to the individual operations, if they
+;; would be better calculated on the integer side.
+
+(define_insn_and_split "aarch64_simd_bsldi_internal"
+  [(set (match_operand:DI 0 "register_operand" "=w,w,w,&r

Re: [PATCH] Fix PR middle-end/81564: ICE in group_case_labels_stmt()

2017-07-27 Thread Steven Bosscher
On Wed, Jul 26, 2017 at 9:35 PM, Peter Bergner wrote:
> The test case for PR81564 exposes an issue where the case labels for a
> switch statement point to blocks that have already been removed by an
> earlier call to cleanup_tree_cfg().  In that case, the code in
> group_case_labels_stmt() that does:

How can a basic block be removed (apparently as unreachable) if there
are still case labels leading to it?

Apparently there is enough information to make CASE_LABEL be set to
NULL. Why is the case label not just removed (or redirected to the
default, or ...)?

The patch feels like it's papering over another issue.
group_case_labels is an optional thing to do, basically just a
simplification. The compiler should run even if you never group the
case labels...

Ciao!
Steven


Re: [PATCH] [RISCV] Add RTEMS support

2017-07-27 Thread Palmer Dabbelt
It appears to work for me: I can generate a simple no-op executable and link it
with multilib.  I don't know anything about RTEMS, so I'm just going to trust
it'll actually work :).  We're not going to have bandwidth to test this, but if
you're interested there's some support for running the GCC test suite in our
super-repo

  https://github.com/riscv/riscv-gnu-toolchain

I committed the patch as

  https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=250632

Thanks a bunch!

On Thu, 27 Jul 2017 07:12:42 PDT (-0700), kito.ch...@gmail.com wrote:
> Hi Sebastian:
> LGTM, I've tested that riscv32-rtems-gcc is buildable.
>
> Thanks for your patch :)
>
> Hi Palmer:
> Could you help commit this patch?
>
> Thanks.
>
> On Thu, Jul 27, 2017 at 7:05 PM, Sebastian Huber
>  wrote:
>> gcc/
>> * config.gcc (riscv*-*-elf*): Add (riscv*-*-rtems*).
>> * config/riscv/rtems.h: New file.
>> ---
>>  gcc/config.gcc   |  7 ++-
>>  gcc/config/riscv/rtems.h | 31 +++
>>  2 files changed, 37 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/config/riscv/rtems.h
>>
>> diff --git a/gcc/config.gcc b/gcc/config.gcc
>> index aab7f65c1df..f28164646c3 100644
>> --- a/gcc/config.gcc
>> +++ b/gcc/config.gcc
>> @@ -2040,7 +2040,7 @@ riscv*-*-linux*)
>> # automatically detect that GAS supports it, yet we require it.
>> gcc_cv_initfini_array=yes
>> ;;
>> -riscv*-*-elf*)
>> +riscv*-*-elf* | riscv*-*-rtems*)
>> tm_file="elfos.h newlib-stdint.h ${tm_file} riscv/elf.h"
>> case "x${enable_multilib}" in
>> xno) ;;
>> @@ -2053,6 +2053,11 @@ riscv*-*-elf*)
>> # Force .init_array support.  The configure script cannot always
>> # automatically detect that GAS supports it, yet we require it.
>> gcc_cv_initfini_array=yes
>> +   case ${target} in
>> +   riscv*-*-rtems*)
>> + tm_file="${tm_file} rtems.h riscv/rtems.h"
>> + ;;
>> +   esac
>> ;;
>>  mips*-*-netbsd*)   # NetBSD/mips, either endian.
>> target_cpu_default="MASK_ABICALLS"
>> diff --git a/gcc/config/riscv/rtems.h b/gcc/config/riscv/rtems.h
>> new file mode 100644
>> index 000..221e2f69815
>> --- /dev/null
>> +++ b/gcc/config/riscv/rtems.h
>> @@ -0,0 +1,31 @@
>> +/* Definitions for RISC-V RTEMS systems with ELF format.
>> +   Copyright (C) 2017 Free Software Foundation, Inc.
>> +
>> +   This file is part of GCC.
>> +
>> +   GCC is free software; you can redistribute it and/or modify it
>> +   under the terms of the GNU General Public License as published
>> +   by the Free Software Foundation; either version 3, or (at your
>> +   option) any later version.
>> +
>> +   GCC is distributed in the hope that it will be useful, but WITHOUT
>> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>> +   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> +   License for more details.
>> +
>> +   Under Section 7 of GPL version 3, you are granted additional
>> +   permissions described in the GCC Runtime Library Exception, version
>> +   3.1, as published by the Free Software Foundation.
>> +
>> +   You should have received a copy of the GNU General Public License and
>> +   a copy of the GCC Runtime Library Exception along with this program;
>> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>> +   .  */
>> +
>> +#undef TARGET_OS_CPP_BUILTINS
>> +#define TARGET_OS_CPP_BUILTINS()   \
>> +do {   \
>> +   builtin_define ("__rtems__");   \
>> +   builtin_define ("__USE_INIT_FINI__");   \
>> +   builtin_assert ("system=rtems");\
>> +} while (0)
>> --
>> 2.12.3
>>


Re: [PATCH] Improve alloca alignment

2017-07-27 Thread Jeff Law
On 07/26/2017 05:29 PM, Wilco Dijkstra wrote:
> Jeff Law wrote:
> 
>> +  if (required_align > MAX_SUPPORTED_STACK_ALIGNMENT)
>> +{
>> +  extra = (required_align - MAX_SUPPORTED_STACK_ALIGNMENT)
>> + / BITS_PER_UNIT;
>> +  size = plus_constant (Pmode, size, extra);
>> +  size = force_operand (size, NULL_RTX);
>>   
>> -  if (extra && size_align > BITS_PER_UNIT)
>> -size_align = BITS_PER_UNIT;
>> +  if (flag_stack_usage_info && pstack_usage_size)
>> + *pstack_usage_size += extra;
>> +}
> 
>> So it's unclear to me why you increase the size iff REQUIRED_ALIGN is
>> greater than MAX_SUPPORTED_STACK_ALIGNMENT.
>>
>> ISTM the safe assumption we can make in this code is that the stack is
>> aligned to STACK_BOUNDARY.  If so, isn't the right test
>>
>> (required_align > STACK_BOUNDARY)?
>>
>> And once we decide to make an adjustment, isn't the right adjustment to
>> make (required_align - STACK_BOUNDARY) / BITS_PER_UNIT?
>>
>> I feel like I must be missing something really important here.
> 
> Yes I think you're right that if STACK_BOUNDARY is the guaranteed stack
> alignment, it should check against STACK_BOUNDARY indeed.
Seems that way to me.

> 
> But then the check size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0
> seems wrong too given that round_push uses a different alignment to align to. 
I had started to dig into the history of this code, but just didn't have
the time to do so fully before needing to leave for the day.  To some
degree I was hoping you knew the rationale behind the test against
MAX_SUPPORTED_STACK_ALIGNMENT and I wouldn't have to do a ton of digging :-)

Jeff
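
To make the arithmetic under discussion concrete, here is a small sketch of the adjustment as it would look with the test against STACK_BOUNDARY suggested above. It is an assumption-laden illustration, not the code in explow.c: the function name is made up, alignments are passed in bits, and BITS_PER_UNIT is assumed to be 8.

```c
/* Assumption: 8 bits per addressable unit.  */
enum { BITS_PER_UNIT_SKETCH = 8 };

/* Extra bytes to add to an alloca size so the result can be realigned
   to REQUIRED_ALIGN_BITS when the stack is only known to be aligned to
   STACK_BOUNDARY_BITS.  Returns 0 if no extra space is needed.  */
static unsigned
extra_alloca_bytes (unsigned required_align_bits,
                    unsigned stack_boundary_bits)
{
  if (required_align_bits > stack_boundary_bits)
    return (required_align_bits - stack_boundary_bits)
           / BITS_PER_UNIT_SKETCH;
  return 0;
}
```

With a 128-bit stack boundary and a 256-bit required alignment this yields 16 extra bytes; with required alignment at or below the boundary it yields none.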


Re: [PATCH 1/2] Eliminate -mno-lra from the rs6000 port.

2017-07-27 Thread Segher Boessenkool
Hi!

On Thu, Jul 27, 2017 at 10:43:44AM -0500, Peter Bergner wrote:
> This patch makes the -mlra option a nop while disallowing -mno-lra.
> It also removes the target bit mask and its usage.
> Finally, this patch updates the testsuite by removing all usage of
> the -mlra and -mno-lra options.

So you left the -mno-lra testcases (but without -mno-lra) because it
gives some more test coverage this way.  Okay.

Looks fine, please commit.  Thanks!


Segher


Re: [PATCH][AArch64] Fix missing optimization for CMP+AND

2017-07-27 Thread James Greenhalgh
On Wed, Mar 29, 2017 at 11:17:20AM +0100, Sudi Das wrote:
> 
> Hi all
> 
> During combine GCC tries to merge CMP (with zero) and AND into a TST.
> However, in cases where an ANDS operand is not compatible, this was being
> missed. Adding a define_split where this operand was moved to a register
> seems to help out. 
> 
> For example for a test :
> 
> int
> f (unsigned char *p)
> {
>   return p[0] == 50 || p[0] == 52;
> }
> 
> int
> g (unsigned char *p)
> {
>   return (p[0] >> 4 & 0xfd) == 0;
> }
> 
> we are now generating
> 
> f:
>   ldrb w0, [x0]
>   mov w1, 253
>   sub w0, w0, #50
>   tst w0, w1
>   cset w0, eq
>   ret
>   .size   f, .-f
>   .align  2
>   .p2align 3,,7
>   .global g
>   .type   g, %function
> g:
>   ldrb w1, [x0]
>   mov w0, 13
>   tst w0, w1, lsr 4
>   cset w0, eq
>   ret
> 
> instead of
> 
> f:
>   ldrb w0, [x0]
>   sub w0, w0, #50
>   and w0, w0, -3
>   and w0, w0, 255
>   cmp w0, 0
>   cset w0, eq
>   ret
>   .size   f, .-f
>   .align  2
>   .p2align 3,,7
>   .global g
>   .type   g, %function
> g:
>   ldrb w0, [x0]
>   lsr w0, w0, 4
>   and w0, w0, -3
>   cmp w0, 0
>   cset w0, eq
>   ret
> 
> Added this new test and checked for regressions on bootstrapped
> aarch64-none-linux-gnu Ok for stage 1?

Sorry to have let this slip for so long.

I've committed this on your behalf as revision 250631.

Thanks,
James
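
For anyone wanting to convince themselves that the new TST sequences compute the same thing as the source, the sketch below models both in C and compares them over every byte value. The *_tst functions are a hand translation of the quoted "now generating" assembly, so they are an interpretation rather than compiler output, and all names are hypothetical.

```c
/* Source-level forms from the quoted test.  */
static int f_ref (unsigned char x) { return x == 50 || x == 52; }
static int g_ref (unsigned char x) { return (x >> 4 & 0xfd) == 0; }

/* Model of: sub w0, w0, #50; mov w1, 253; tst w0, w1; cset w0, eq.  */
static int
f_tst (unsigned char x)
{
  unsigned w = x;        /* ldrb zero-extends the byte */
  w -= 50;               /* 32-bit wrapping subtract */
  return (w & 253u) == 0;
}

/* Model of: mov w0, 13; tst w0, w1, lsr 4; cset w0, eq.  */
static int
g_tst (unsigned char x)
{
  unsigned w = x;
  return (13u & (w >> 4)) == 0;
}

/* Return 1 if both pairs agree for all 256 byte values, else 0.  */
static int
tst_forms_agree (void)
{
  for (unsigned v = 0; v < 256; v++)
    if (f_ref ((unsigned char) v) != f_tst ((unsigned char) v)
        || g_ref ((unsigned char) v) != g_tst ((unsigned char) v))
      return 0;
  return 1;
}
```

The mask 253 ignores bit 1 of the low byte, so the low byte of (x - 50) must be 0 or 2, which is exactly x == 50 or x == 52. For g, the shifted value fits in four bits, so masking with 0xfd and with 13 are equivalent.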

> 
> Thanks 
> Sudi
> 
> 2017-03-17 Kyrylo Tkachov  
>  Sudakshina Das  
> 
>   * config/aarch64/aarch64.md (define_split for and3nr_compare0):
>   Move non aarch64_logical_operand to a register.
>   (define_split for and_3nr_compare0): Move non register
>   immediate operand to a register.
> 
>   * config/aarch64/predicates.md (aarch64_mov_imm_operand): New.
> 
> 2017-03-17  Kyrylo Tkachov  
>   Sudakshina Das  
> 
>   * gcc.target/aarch64/tst_imm_split_1.c: New Test.



Re: [PATCH][GCC][AArch64] optimize float immediate moves (2 /4) - HF/DF/SF mode.

2017-07-27 Thread James Greenhalgh
On Mon, Jun 26, 2017 at 11:50:51AM +0100, Tamar Christina wrote:
> Hi all,
> 
> Here's the re-spun patch.
> Aside from the grouping of the split patterns, it now also uses an h register
> for the fmov for HF when available; otherwise it forces a literal load.
> 
> Regression tested on aarch64-none-linux-gnu and no regressions.
> 
> OK for trunk?

OK.

Thanks,
James

> 
> Thanks,
> Tamar
> 
> 
> gcc/
> 2017-06-26  Tamar Christina  
> Richard Sandiford 
> 
> * config/aarch64/aarch64.md (mov): Generalize.
> (*movhf_aarch64, *movsf_aarch64, *movdf_aarch64):
> Add integer and movi cases.
> (movi-split-hf-df-sf split, fp16): New.
> (enabled): Added TARGET_FP_F16INST.
> * config/aarch64/iterators.md (GPF_HF): New.



Re: [PATCH][2/2] Fix PR81502

2017-07-27 Thread Andrew Pinski
On Thu, Jul 27, 2017 at 6:50 AM, Richard Biener  wrote:
>
> I am testing the following additional pattern for match.pd to fix
> PR81502 resulting in the desired optimization to
>
> bar:
> .LFB526:
> .cfi_startproc
> movl %edi, %eax
> ret
>
> the pattern optimizes a BIT_FIELD_REF on a BIT_INSERT_EXPR by
> either extracting from the destination or the inserted value.

Note this optimization pattern was on my list to implement for
bit-field optimizations after lowering.

Thanks,
Andrew Pinski
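
The two conditions in the quoted match.pd pattern can be modelled on plain 64-bit integers. The sketch below is only an illustration under that assumption: the helper names are hypothetical, positions and sizes are in bits, and it ignores everything match.pd does with types. It classifies an extraction of RSIZE bits at RPOS from a value that had ISIZE bits inserted at IPOS.

```c
#include <stdint.h>

enum bfr_source_sketch
{
  BFR_FROM_INSERTED,   /* extraction lies wholly inside the inserted bits */
  BFR_FROM_ORIGINAL,   /* extraction does not touch the inserted bits */
  BFR_NO_SIMPLIFY      /* partial overlap: leave as is */
};

static uint64_t
mask_bits (unsigned width)
{
  return width >= 64 ? ~(uint64_t) 0 : (((uint64_t) 1 << width) - 1);
}

/* Extract WIDTH bits starting at bit POS (POS < 64).  */
static uint64_t
extract_bits (uint64_t v, unsigned pos, unsigned width)
{
  return (v >> pos) & mask_bits (width);
}

/* Insert the low WIDTH bits of VAL into DEST at bit POS (POS < 64).  */
static uint64_t
insert_bits (uint64_t dest, uint64_t val, unsigned pos, unsigned width)
{
  uint64_t m = mask_bits (width) << pos;
  return (dest & ~m) | ((val << pos) & m);
}

/* Mirror the two "if" arms of the quoted pattern.  On success *NEW_POS
   is the position to extract from in the chosen operand.  */
static enum bfr_source_sketch
classify_extract (unsigned ipos, unsigned isize,
                  unsigned rpos, unsigned rsize, unsigned *new_pos)
{
  if (ipos <= rpos && rpos + rsize <= ipos + isize)
    {
      *new_pos = rpos - ipos;
      return BFR_FROM_INSERTED;
    }
  if (ipos >= rpos + rsize || rpos >= ipos + isize)
    {
      *new_pos = rpos;
      return BFR_FROM_ORIGINAL;
    }
  return BFR_NO_SIMPLIFY;
}
```

For instance, after inserting an 8-bit value at bit 8, extracting 4 bits at bit 12 can be answered from the inserted value at bit 4, while extracting 8 bits at bit 16 can be answered from the original value unchanged.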

>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2017-07-27  Richard Biener  
>
> PR tree-optimization/81502
> * match.pd: Add pattern combining BIT_INSERT_EXPR with
> BIT_FIELD_REF.
>
> * gcc.target/i386/pr81502.c: New testcase.
>
> Index: gcc/match.pd
> ===
> *** gcc/match.pd(revision 250620)
> --- gcc/match.pd(working copy)
> *** DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> *** 4178,4180 
> --- 4178,4195 
>  { CONSTRUCTOR_ELT (ctor, idx / k)->value; })
> (BIT_FIELD_REF { CONSTRUCTOR_ELT (ctor, idx / k)->value; }
>@1 { bitsize_int ((idx % k) * width); })
> +
> + /* Simplify a bit extraction from a bit insertion for the cases with
> +the inserted element fully covering the extraction or the insertion
> +not touching the extraction.  */
> + (simplify
> +  (BIT_FIELD_REF (bit_insert @0 @1 @ipos) @rsize @rpos)
> +  (switch
> +   (if (wi::leu_p (@ipos, @rpos)
> +&& wi::leu_p (wi::add (@rpos, @rsize),
> +  wi::add (@ipos, TYPE_PRECISION (TREE_TYPE (@1)
> +(BIT_FIELD_REF @1 @rsize { wide_int_to_tree (bitsizetype,
> + wi::sub (@rpos, @ipos)); }))
> +   (if (wi::geu_p (@ipos, wi::add (@rpos, @rsize))
> +|| wi::geu_p (@rpos, wi::add (@ipos, TYPE_PRECISION (TREE_TYPE 
> (@1)
> +(BIT_FIELD_REF @0 @rsize @rpos
> Index: gcc/testsuite/gcc.target/i386/pr81502.c
> ===
> *** gcc/testsuite/gcc.target/i386/pr81502.c (nonexistent)
> --- gcc/testsuite/gcc.target/i386/pr81502.c (working copy)
> ***
> *** 0 
> --- 1,34 
> + /* { dg-do compile { target lp64 } } */
> + /* { dg-options "-O2 -msse2" } */
> +
> + #include 
> +
> + #define SIZE (sizeof (void *))
> +
> + static int foo(unsigned char (*foo)[SIZE])
> + {
> +   __m128i acc = _mm_set_epi32(0, 0, 0, 0);
> +   size_t i = 0;
> +   for(; i + sizeof(__m128i) <= SIZE; i += sizeof(__m128i)) {
> +   __m128i word;
> +   __builtin_memcpy(&word, foo + i, sizeof(__m128i));
> +   acc = _mm_add_epi32(word, acc);
> +   }
> +   if (i != SIZE) {
> +   __m128i word = _mm_set_epi32(0, 0, 0, 0);
> +   __builtin_memcpy(&word, foo + i, SIZE - i); // (1)
> +   acc = _mm_add_epi32(word, acc);
> +   }
> +   int res;
> +   __builtin_memcpy(&res, &acc, sizeof(res));
> +   return res;
> + }
> +
> + int bar(void *ptr)
> + {
> +   unsigned char buf[SIZE];
> +   __builtin_memcpy(buf, &ptr, SIZE);
> +   return foo((unsigned char(*)[SIZE])buf);
> + }
> +
> + /* { dg-final { scan-assembler-times "mov" 1 } } */


RE:[PATCH,AIX] Changes for linking gotools on AIX.

2017-07-27 Thread REIX, Tony
Hi Ian, David,

On AIX, that is more complicated...

We have to use -static-libgo when building the libgo tests, because AIX
does not work like Linux does, and because the libgo tests are built by
duplicating several .go files of libgo packages that already appear in
the libgo.a (libgo.so) library.
On AIX, without -static-libgo, when building and running the libgo tests,
Go variables are defined in one place and used in another, BUT with
different memory addresses, which leads to serious problems.

However, when building and linking real Go application code outside of
the GCC Go compiler, and thus with no libgo internal code compiled twice,
we have a different issue: we need to load libgo.a first, otherwise other
problems appear.

In short, this patch is the first step of a global fix we have found for
AIX that covers both cases: building and running the libgo internal
tests, and building real customer code that does not include libgo
internals. And it works fine.
I'll provide the second step later.


Regards,

Tony Reix

Bull - ATOS
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net


From: Ian Lance Taylor [i...@golang.org]
Sent: Wednesday, July 26, 2017 20:06
To: David Edelsohn
Cc: REIX, Tony; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH,AIX] Changes for linking gotools on AIX.

On Wed, Jul 26, 2017 at 9:58 AM, David Edelsohn  wrote:
> On Wed, Jul 26, 2017 at 12:41 PM, REIX, Tony  wrote:
>> Description:
>>  * This patch adds linker options for gotools for AIX.
>>
>> Tests:
>>  * Fedora25/x86_64 + GCC trunk : Configure/Build: SUCCESS
>>- build remade by means of gmake.
>>- some test redone in libgo (gmake check)
>>  * AIX + GCC 7.1.0 :
>>- build remade by means of gmake.
>>- some test redone in libgo (gmake check)
>>
>> ChangeLog:
>>  * Makefile.am (AM_LDFLAGS & GOLINK): Changes for linking on AIX.
>>  * Makefile.in: Rebuild.
>
> If this is trying to fix AIX search paths, a better solution would
> seem to be the equivalent of -static-libstdc++ -static-libgcc.  The Go
> tools should be linked statically and not depend on Go shared
> libraries.

On GNU/Linux I used to use -static-libgo, but I changed it because of
https://gcc.gnu.org/PR64738.  Of course on AIX we can do as you
prefer.

Ian


Re: [patch 2/2,avr] PR49847: Add hook to place read-only lookup-tables in named address-space

2017-07-27 Thread Denis Chertykov
2017-07-27 16:50 GMT+04:00 Georg-Johann Lay :
> On 27.07.2017 14:29, Georg-Johann Lay wrote:
>>
>> For some targets, the best place to put read-only lookup tables as
>> generated by -ftree-switch-conversion is not the generic address space
>> but some target specific address space.
>>
>> This is the case for AVR, where for most devices .rodata must be
>> located in RAM.
>>
>> Part #1 adds a new, optional target hook that queries the back-end
>> about the desired address space.  There is currently only one user:
>> tree-switch-conversion.c::build_one_array() which re-builds value_type
>> and array_type if the address space returned by the backend is not
>> the generic one.
>>
>> Part #2 is the AVR part that implements the new hook and adds some
>> sugar around it.
>
>
> This is the AVR part.
>
> It implements the new hook which returns a convenient flash address
> space for devices where .rodata is located in RAM:  The 16-bit __flash
> for devices with <= 64 KiB flash and 24-bit __memx for > 64 KiB flash.
>
> It adds a new option -maddr-space-for-lookup= which allows picking a
> specific address space.
>
> Some new insns and a combine split support the best code generation by
> exploiting the knowledge that the 24-bit addresses will never point to
> RAM, so that the expensive run-time decision whether LD or LPM has to be
> used can be avoided.
>
> Passed without new regressions on atmega128.
>
> Ok for trunk provided the gcc part 1/2 is approved?

It's cool. Thank you.
Please apply.


>
> Johann
>
>
> Implement TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA.
>
> PR target/49857
> * config/avr/avr-opts.h: New file.
> * config/avr/avr.opt: Include it.
> (-maddr-space-for-lookup=): New option and...
> (avr_opt_addr_space_for_lookup): ...associated Var.
> (avr_aspace_for_lookup): New option enums used by above.
> * config/avr/avr-protos.h (avr_out_load_flashx): New proto.
> * config/avr/avr.c (avr_out_load_flashx): New function.
> * avr_adjust_insn_length [ADJUST_LEN_LOAD_FLASHX]: Handle it.
> * avr_rtx_costs_1 [ZERO_EXTEND, SIGN_EXTEND]: Handle
> shift-and-extend-by-1 of HI -> PSI.
> [ASHIFT,PSImode]: Describe cost of extend-and-shift-by-1.
> (TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA): Define to...
> (avr_addr_space_for_artificial_rodata): ...this new static function.
> * config/avr/avr.md (unspec): Add UNSPEC_LOAD_FLASHX.
> (adjust_len): Add load_flashx.
> (*ashiftpsi.1_sign_extend.hi, *ashiftpsi.1_zero_extend.hi)
> (*extendpsi.ashift.1.uqi, *load-flashx): New insns.
> (*split_xload-cswtch): New insn-and-split.
> * doc/invoke.texi (AVR Options) <-maddr-space-for-lookup=>:
> Document.
>


[PATCH 2/2] Remove reload_in_progress and other cleanups.

2017-07-27 Thread Peter Bergner
This patch removes reload specific code from the rs6000 port made possible
by the elimination of the usage of the -mno-lra option.

This passed bootstrap and regtesting with no regressions, ok for trunk?

Peter

* config/rs6000/predicates.md (volatile_mem_operand): Remove code
related to reload_in_progress.
(splat_input_operand): Likewise.
* config/rs6000/rs6000-protos.h (rs6000_secondary_memory_needed_rtx):
Delete prototype.
* config/rs6000/rs6000.c (machine_function): Remove sdmode_stack_slot
field.
(TARGET_EXPAND_TO_RTL_HOOK): Delete.
(TARGET_INSTANTIATE_DECLS): Likewise.
(legitimate_indexed_address_p): Delete reload_in_progress code.
(rs6000_debug_legitimate_address_p): Likewise.
(rs6000_eliminate_indexed_memrefs): Likewise.
(rs6000_emit_le_vsx_store): Likewise.
(rs6000_emit_move_si_sf_subreg): Likewise.
(rs6000_emit_move): Likewise.
(register_to_reg_type): Likewise.
(rs6000_pre_atomic_barrier): Likewise.
(rs6000_machopic_legitimize_pic_address): Likewise.
(rs6000_allocate_stack_temp): Likewise.
(rs6000_address_for_fpconvert): Likewise.
(rs6000_address_for_altivec): Likewise.
(rs6000_secondary_memory_needed_rtx): Delete function.
(rs6000_check_sdmode): Likewise.
(rs6000_alloc_sdmode_stack_slot): Likewise.
(rs6000_instantiate_decls): Likewise.
* config/rs6000/rs6000.h (SECONDARY_MEMORY_NEEDED_RTX): Delete.
* config/rs6000/rs6000.md (splitter for *movsi_got_internal):
Delete reload_in_progress.
(*vec_reload_and_plus_): Likewise.
* config/rs6000/vsx.md (vsx_mul_v2di): Likewise.
(vsx_div_v2di): Likewise.
(vsx_udiv_v2di): Likewise.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 250587)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -783,10 +783,8 @@ (define_predicate "volatile_mem_operand"
   (and (and (match_code "mem")
(match_test "MEM_VOLATILE_P (op)"))
(if_then_else (match_test "reload_completed")
- (match_operand 0 "memory_operand")
- (if_then_else (match_test "reload_in_progress")
-  (match_test "strict_memory_address_p (mode, XEXP (op, 0))")
-  (match_test "memory_address_p (mode, XEXP (op, 0))")
+(match_operand 0 "memory_operand")
+(match_test "memory_address_p (mode, XEXP (op, 0))"
 
 ;; Return 1 if the operand is an offsettable memory operand.
 (define_predicate "offsettable_mem_operand"
@@ -1142,7 +1140,7 @@ (define_predicate "splat_input_operand"
   if (! volatile_ok && MEM_VOLATILE_P (op))
return 0;
 
-  if (reload_in_progress || lra_in_progress || reload_completed)
+  if (lra_in_progress || reload_completed)
return indexed_or_indirect_address (addr, vmode);
   else
return memory_address_addr_space_p (vmode, addr, MEM_ADDR_SPACE (op));
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   (revision 250587)
+++ gcc/config/rs6000/rs6000-protos.h   (working copy)
@@ -154,7 +154,6 @@ extern void rs6000_split_multireg_move (
 extern void rs6000_emit_le_vsx_move (rtx, rtx, machine_mode);
 extern bool valid_sf_si_move (rtx, rtx, machine_mode);
 extern void rs6000_emit_move (rtx, rtx, machine_mode);
-extern rtx rs6000_secondary_memory_needed_rtx (machine_mode);
 extern machine_mode rs6000_secondary_memory_needed_mode (machine_mode);
 extern rtx (*rs6000_legitimize_reload_address_ptr) (rtx, machine_mode,
int, int, int, int *);
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 250587)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -143,10 +143,6 @@ typedef struct GTY(()) machine_function
   /* Offset from virtual_stack_vars_rtx to the start of the ABI_V4
  varargs save area.  */
   HOST_WIDE_INT varargs_save_offset;
-  /* Temporary stack slot to use for SDmode copies.  This slot is
- 64-bits wide and is allocated early enough so that the offset
- does not overflow the 16-bit load/store offset field.  */
-  rtx sdmode_stack_slot;
   /* Alternative internal arg pointer for -fsplit-stack.  */
   rtx split_stack_arg_pointer;
   bool split_stack_argp_used;
@@ -1872,12 +1868,6 @@ static const struct attribute_spec rs600
 #undef TARGET_BUILTIN_RECIPROCAL
 #define TARGET_BUILTIN_RECIPROCAL rs6000_builtin_reciprocal
 
-#undef TARGET_EXPAND_TO_RTL_HOOK
-#define TARGET_EXPAND_TO_RTL_HOOK rs6000_alloc_sdmode_stack_slot
-
-#undef TARGET_INSTANTIATE_DECLS
-#define TARGET_INSTANTIATE_DECLS rs6000_instantiate_decls
-
 #undef TARGET_SECONDARY_RELOAD
 #define TARGET_SECONDARY_RELOAD rs600

[PATCH 1/2] Eliminate -mno-lra from the rs6000 port.

2017-07-27 Thread Peter Bergner
This patch makes the -mlra option a nop while disallowing -mno-lra.
It also removes the target bit mask and its usage.
Finally, this patch updates the testsuite by removing all usage of
the -mlra and -mno-lra options.

This passed bootstrap and regtesting with no regressions, ok for trunk?

Peter


gcc/

* config/rs6000/rs6000.opt (mlra): Replace with stub.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Delete OPTION_MASK_LRA.
* config/rs6000/rs6000.c (TARGET_LRA_P): Delete.
(rs6000_debug_reg_global): Delete print of LRA status.
(rs6000_option_override_internal): Delete dead LRA related code.
(rs6000_lra_p): Delete function.
* doc/invoke.texi (RS/6000 and PowerPC Options): Delete -mlra.

gcc/testsuite/

* g++.dg/pr69667.C: Remove option -mlra.
* gcc.target/powerpc/dform-1.c: Likewise.
* gcc.target/powerpc/dform-2.c: Likewise.
* gcc.target/powerpc/dform-3.c: Likewise.
* gcc.target/powerpc/p8vector-int128-1.c: Likewise.
* gcc.target/powerpc/p9-vparity.c: Likewise.
* gcc.target/powerpc/pr63491.c: Likewise.
* gcc.target/powerpc/pr67808.c: Likewise.
* gcc.target/powerpc/pr68805.c: Likewise.
* gcc.target/powerpc/pr69461.c: Likewise.
* gcc.target/powerpc/pr71680.c: Likewise.
* gcc.target/powerpc/pr77289.c: Likewise.
* gcc.target/powerpc/pr78458.c: Likewise.
* gcc.target/powerpc/pr78543.c: Likewise.
* g++.dg/pr71294.C: Remove option -mno-lra.
* gcc.target/powerpc/pr71656-1.c: Likewise.
* gcc.target/powerpc/pr71656-2.c: Likewise.
* gcc.target/powerpc/pr71698.c: Likewise.

Index: gcc/config/rs6000/rs6000.opt
===
--- gcc/config/rs6000/rs6000.opt(revision 250587)
+++ gcc/config/rs6000/rs6000.opt(working copy)
@@ -430,9 +430,9 @@ mlong-double-
 Target RejectNegative Joined UInteger Var(rs6000_long_double_type_size) Save
 -mlong-double-  Specify size of long double (64 or 128 bits).
 
+; This option existed in the past, but now is always on.
 mlra
-Target Report Mask(LRA) Var(rs6000_isa_flags)
-Enable Local Register Allocation.
+Target RejectNegative Undocumented Ignore
 
 msched-costly-dep=
 Target RejectNegative Joined Var(rs6000_sched_costly_dep_str)
Index: gcc/config/rs6000/rs6000-cpus.def
===
--- gcc/config/rs6000/rs6000-cpus.def   (revision 250587)
+++ gcc/config/rs6000/rs6000-cpus.def   (working copy)
@@ -126,7 +126,6 @@
 | OPTION_MASK_FPRND\
 | OPTION_MASK_HTM  \
 | OPTION_MASK_ISEL \
-| OPTION_MASK_LRA  \
 | OPTION_MASK_MFCRF\
 | OPTION_MASK_MFPGPR   \
 | OPTION_MASK_MODULO   \
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 250587)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1887,9 +1887,6 @@ static const struct attribute_spec rs600
 #undef TARGET_MODE_DEPENDENT_ADDRESS_P
 #define TARGET_MODE_DEPENDENT_ADDRESS_P rs6000_mode_dependent_address_p
 
-#undef TARGET_LRA_P
-#define TARGET_LRA_P rs6000_lra_p
-
 #undef TARGET_COMPUTE_PRESSURE_CLASSES
 #define TARGET_COMPUTE_PRESSURE_CLASSES rs6000_compute_pressure_classes
 
@@ -2793,8 +2790,6 @@ rs6000_debug_reg_global (void)
   if (TARGET_LINK_STACK)
 fprintf (stderr, DEBUG_FMT_S, "link_stack", "true");
 
-  fprintf (stderr, DEBUG_FMT_S, "lra", TARGET_LRA ? "true" : "false");
-
   if (TARGET_P8_FUSION)
 {
   char options[80];
@@ -4562,35 +4557,10 @@ rs6000_option_override_internal (bool gl
}
 }
 
-  /* Enable LRA by default.  */
-  if ((rs6000_isa_flags_explicit & OPTION_MASK_LRA) == 0)
-rs6000_isa_flags |= OPTION_MASK_LRA;
-
-  /* There have been bugs with -mvsx-timode that don't show up with -mlra,
- but do show up with -mno-lra.  Given -mlra will become the default once
- PR 69847 is fixed, turn off the options with problems by default if
- -mno-lra was used, and warn if the user explicitly asked for the option.
-
- Enable -mpower9-dform-vector by default if LRA and other power9 options.
- Enable -mvsx-timode by default if LRA and VSX.  */
-  if (!TARGET_LRA)
-{
-  if (TARGET_VSX_TIMODE)
-   {
- if ((rs6000_isa_flags_explicit & OPTION_MASK_VSX_TIMODE) != 0)
-   warning (0, "-mvsx-timode might need -mlra");
-
- else
-   rs6000_isa_flags &= ~OPTION_MASK_VSX_TIMODE;
-   }
-}
-
-  else
-{
-  if (TARGET_VSX && !TARGET_VSX_TIMODE
- && (rs6000_isa_flags_explicit & 

Re: [PATCH][GCC][AArch64] optimize float immediate moves (1 /4) - infrastructure.

2017-07-27 Thread James Greenhalgh
On Wed, Jul 26, 2017 at 05:00:05PM +0100, Tamar Christina wrote:
> Hi James,
> 
> I have updated the patch and have responded to your question below.
> 
> Ok for trunk?

Ok, with a few small changes.

> > >  static bool
> > > @@ -5857,12 +5955,6 @@ aarch64_preferred_reload_class (rtx x,
> > reg_class_t regclass)
> > >return NO_REGS;
> > >  }
> > >
> > > -  /* If it's an integer immediate that MOVI can't handle, then
> > > - FP_REGS is not an option, so we return NO_REGS instead.  */
> > > -  if (CONST_INT_P (x) && reg_class_subset_p (regclass, FP_REGS)
> > > -  && !aarch64_simd_imm_scalar_p (x, GET_MODE (x)))
> > > -return NO_REGS;
> > > -
> > 
> > I don't understand this relaxation. Could you explain what this achieves and
> > why it is safe in this patch?
> 
> Because this should be left up to the pattern to decide what should happen
> and not reload.  Leaving the check here also means you'll do a reasonably
> expensive check twice for each constant
> you can emit a move for.
> 
> Removing extra restriction on the constant classes leaves it up to
> aarch64_legitimate_constant_p to decide if the constant can be emitted as
> a move or should be forced to memory.
> aarch64_legitimate_constant_p also calls aarch64_cannot_force_const_mem.
> 
> The documentation for TARGET_PREFERRED_RELOAD_CLASS also states: 
> 
> "One case where TARGET_PREFERRED_RELOAD_CLASS must not return rclass is if x
> is a legitimate constant which cannot be loaded into some register class. By
> returning NO_REGS you can force x into a memory location. For example, rs6000
> can load immediate values into general-purpose registers, but does not have
> an instruction for loading an immediate value into a floating-point register,
> so TARGET_PREFERRED_RELOAD_CLASS returns NO_REGS when x is a floating-point
> constant. If the constant can't be loaded into any kind of register, code
> generation will be better if TARGET_LEGITIMATE_CONSTANT_P makes the constant
> illegitimate instead of using TARGET_PREFERRED_RELOAD_CLASS. "
> 
> So it seems that not only did the original constraint not add anything, we
> may also generate better code now.

Thanks for the explanation.

> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 
> e397ff4afa73cfbc7e192fd5686b1beff9bbbadf..fd20576d23cfdc48761f65e41762b2a5a71f3e61
>  100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -326,6 +326,8 @@ bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
>  void aarch64_expand_call (rtx, rtx, bool);
>  bool aarch64_expand_movmem (rtx *);
>  bool aarch64_float_const_zero_rtx_p (rtx);
> +bool aarch64_float_const_rtx_p (rtx);
> +bool aarch64_can_const_movi_rtx_p (rtx x, machine_mode mode);

This list should be alphabetical, first by type, then by name. So I'd
expect this to fit in just above aarch64_const_vec_all_same_int_p.

>  bool aarch64_function_arg_regno_p (unsigned);
>  bool aarch64_fusion_enabled_p (enum aarch64_fusion_pairs);
>  bool aarch64_gen_movmemqi (rtx *);
> @@ -353,7 +355,6 @@ bool aarch64_regno_ok_for_base_p (int, bool);
>  bool aarch64_regno_ok_for_index_p (int, bool);
>  bool aarch64_simd_check_vect_par_cnst_half (rtx op, machine_mode mode,
>   bool high);
> -bool aarch64_simd_imm_scalar_p (rtx x, machine_mode mode);
>  bool aarch64_simd_imm_zero_p (rtx, machine_mode);
>  bool aarch64_simd_scalar_immediate_valid_for_move (rtx, machine_mode);
>  bool aarch64_simd_shift_imm_p (rtx, machine_mode, bool);
> @@ -488,4 +489,6 @@ std::string aarch64_get_extension_string_for_isa_flags 
> (unsigned long,
>  
>  rtl_opt_pass *make_pass_fma_steering (gcc::context *ctxt);
>  
> +bool aarch64_reinterpret_float_as_int (rtx value, unsigned HOST_WIDE_INT 
> *fail);

This isn't defined in common/config/aarch64-common.c so shouldn't be in
the section for functions which will be defined there. It should be in
the list above between aarch64_regno_ok_for_index_p and
aarch64_simd_check_vect_par_cnst_half.

> +
>  #endif /* GCC_AARCH64_PROTOS_H */
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> d753666ef67fc524260c41f36743df3649a0a98a..b1ddd77823e50e63439e497f695f3fad9bd9efc9
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -147,6 +147,8 @@ static bool aarch64_builtin_support_vector_misalignment 
> (machine_mode mode,
>const_tree type,
>int misalignment,
>bool is_packed);
> +static machine_mode
> +aarch64_simd_container_mode (machine_mode mode, unsigned width);
>  
>  /* Major revision number of the ARM Architecture implemented by the target.  
> */
>  unsigned aarch64_architecture_version;
> @@ -4723,6 +4725,62 @@ aarch64_legitimize_address_displacement (rtx *disp, 
> rtx *off, ma

[PATCH 0/2] Force usage of LRA for all rs6000 port compiles.

2017-07-27 Thread Peter Bergner
In GCC 7, the rs6000 port made the switch to using LRA over reload by
default.  We kept the ability of using reload in case there was an LRA
issue.  We have squashed all known rs6000 specific LRA bugs and now is
the time to remove the ability to use reload from GCC 8/rs6000.

The first patch replaces the -mlra option with a dummy stub and disallows
using the -mno-lra option.  It also removes the target bit mask and its usage.
Finally, it updates the testsuite by removing all usage of the -mlra and
-mno-lra options.

The second patch removes the now dead reload_in_progress usage and any
code dependent on that.

Peter



RE:[PATCH,AIX] Don't leak a file descriptor if an archive is malformed.

2017-07-27 Thread REIX, Tony
Better with the patch file...
Sorry, the resend did not include the attachment I had added to the first
message (which was in HTML format and was refused).
Hope it's OK now.
Tony
Index: libbacktrace/ChangeLog
===
--- libbacktrace/ChangeLog	(revision 250609)
+++ libbacktrace/ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2017-07-27  Tony Reix  
+
+	* xcoff.c: Don't leak a file descriptor if an archive is malformed.
+
 2017-07-26  Tony Reix  
 
 	* configure.ac: Check for XCOFF32/XCOFF64.  Check for loadquery.
Index: libbacktrace/xcoff.c
===
--- libbacktrace/xcoff.c	(revision 250609)
+++ libbacktrace/xcoff.c	(working copy)
@@ -1288,7 +1288,7 @@ xcoff_armem_add (struct backtrace_state *state, in
 
   if (!backtrace_get_view (state, descriptor, 0, sizeof (b_ar_fl_hdr),
 			   error_callback, data, &view))
-return 0;
+goto fail;
 
   memcpy (&fl_hdr, view.data, sizeof (b_ar_fl_hdr));
 
@@ -1295,13 +1295,13 @@ xcoff_armem_add (struct backtrace_state *state, in
   backtrace_release_view (state, &view, error_callback, data);
 
   if (memcmp (fl_hdr.fl_magic, AIAMAGBIG, 8) != 0)
-return 0;
+goto fail;
 
   memlen = strlen (member);
 
   /* Read offset of first archive member.  */
   if (!xcoff_parse_decimal (fl_hdr.fl_fstmoff, sizeof fl_hdr.fl_fstmoff, &off))
-return 0;
+goto fail;
   while (off != 0)
 {
   /* Map archive member header and member name.  */
@@ -1309,7 +1309,7 @@ xcoff_armem_add (struct backtrace_state *state, in
   if (!backtrace_get_view (state, descriptor, off,
 			   sizeof (b_ar_hdr) + memlen,
 			   error_callback, data, &view))
-	return 0;
+	break;
 
   ar_hdr = (const b_ar_hdr *) view.data;
 
@@ -1345,6 +1345,7 @@ xcoff_armem_add (struct backtrace_state *state, in
   backtrace_release_view (state, &view, error_callback, data);
 }
 
+ fail:
   /* No matching member found.  */
   backtrace_close (descriptor, error_callback, data);
   return 0;

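For readers less familiar with the idiom, the single-exit cleanup that the
patch relies on can be sketched in isolation.  This is only a minimal
standalone illustration, not libbacktrace code: check_magic and its
parameters are hypothetical, and it uses plain POSIX open/read/close where
xcoff.c uses backtrace_get_view and backtrace_close.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if the file at PATH starts with the
   LEN-byte MAGIC string, 0 otherwise.  Every failure after open ()
   jumps to the single exit label, so the descriptor is always closed.  */
int
check_magic (const char *path, const char *magic, size_t len)
{
  char buf[16];
  int ok = 0;
  int fd;

  if (len > sizeof buf)
    return 0;			/* Nothing opened yet; a plain return is fine.  */

  fd = open (path, O_RDONLY);
  if (fd < 0)
    return 0;

  if (read (fd, buf, len) != (ssize_t) len)
    goto done;			/* Short read: truncated or malformed file.  */

  if (memcmp (buf, magic, len) != 0)
    goto done;			/* Wrong magic: not the expected format.  */

  ok = 1;

 done:
  close (fd);			/* Reached on success and on every failure.  */
  return ok;
}
```

The point of the sketch is that an early "return 0" placed after open ()
would skip the close () call, which is the kind of leak the patch removes.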

[PATCH,AIX] Don't leak a file descriptor if an archive is malformed.

2017-07-27 Thread REIX, Tony
(Damn! Plain-text format required!)

Description:
 * This patch fixes a possible leak of a file descriptor if an archive is 
malformed.

Tests:
 * Fedora25/x86_64 + GCC trunk : SUCCESS
   - gmake in libbacktrace directory
 * AIX :
   - gmake in libbacktrace directory

ChangeLog:
 * xcoff.c: Don't leak a file descriptor if an archive is malformed.

Regards,

Tony Reix


Bull - ATOS
IBM Coop Architect & Technical Leader

Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net


Re: c-family PATCH to improve -Wsign-compare (PR c/81417)

2017-07-27 Thread David Malcolm
On Thu, 2017-07-27 at 16:42 +0200, Marek Polacek wrote:
> On Tue, Jul 25, 2017 at 11:03:12AM -0400, David Malcolm wrote:
> > Thanks for updating the patch.
> > 
> > There's still an issue with ordering in the updated patch.
> > 
> > There are three orderings:
> > 
> >   (a) the order of the expressions in the source code (LHS CMP RHS)
> > 
> >   (b) the order of kinds of signedness in the messages (currently
> > hardcoded as "signed and unsigned", which doesn't respect (a))
> > 
> >   (c) the order of the types that are reported (currently done
> > as
> > orig_op0 vs orig_op1, which if I'm reading the code is LHS vs RHS).
> > 
> > So, as written (a) and (c) have the same order, but (b)'s order is
> > hardcoded, and so there could be a mismatch.
> > 
> > All of the examples in the testcase are of the form
> >   signed LHS with unsigned RHS.
> > 
> > What happens if the LHS is unsigned, and the RHS is signed?  e.g.
> > 
> >   int
> >   fn5 (unsigned int a, signed int b)
> >   {
> > return a < b;
> >   }
> > 
> > Presumably this case still merits a warning, but as written, the
> > warning would read:
> > 
> >   warning "comparison between signed and unsigned integer
> > expressions:   'unsigned int' and 'signed int'
> > 
> > This seems rather awkward to me; in a less trivial example, I can
> > imagine the user having to take a moment to figure out which side
> > of the expression has which signedness.
> > 
> > I think that any time we're reporting on the types of two sides of
> > an expression like this, we should follow the ordering in the
> > user's code, i.e. (a) above.   The patch has (c) doing this, but
> > the text (b) is problematic.
> > 
> > I can see two ways of fixing this:
> > 
> > (i) rework the text of the message so that it changes based on
> > which side has which signedness, e.g.:
> > 
> >   "comparison between signed and unsigned integer expressions"
> > vs
> >   "comparison between unsigned and signed integer expressions"
> > 
> > or,
> > 
> > (ii) change the text of the message to not have an ordering.  Clang
> > has "comparison of integers of different signs" - though I think
> > this should say "signedness", not "signs"; surely an instance of an
> > int has a sign (e.g. "-3" is negative), but an integer *type* has a
> > signedness (e.g. "unsigned short").  So I'd change it to say:
> > "comparison of integer expressions of different signedness"
> > 
> > Please also add a testcase that covers this case (e.g. fn5 above).
> 
> I went with (ii).  I've also added the new test:
> 
> Bootstrapped/regtested on x86_64-linux and ppc64le-linux, ok for
> trunk?

[...]

Thanks for updating it.  LGTM.

Dave
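
As background for why the fn5 case deserves a warning at all, here is a small
sketch of what the comparison actually computes.  fn5 is taken from the thread;
fn5_by_value is a hypothetical name showing one possible way to compare by
mathematical value, not something proposed in the patch.

```c
/* Same shape as fn5 in the thread: unsigned LHS, signed RHS.  B is
   converted to unsigned int before the comparison, so a negative B
   becomes a very large value.  */
int
fn5 (unsigned int a, signed int b)
{
  return a < b;
}

/* Hypothetical alternative that compares by value: a negative B can
   never be greater than an unsigned A, so handle that case first.  */
int
fn5_by_value (unsigned int a, signed int b)
{
  return b >= 0 && a < (unsigned int) b;
}
```

With a == 1 and b == -1, fn5 reports that a is smaller, while fn5_by_value
does not; for non-negative b the two agree.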


c-family PATCH to improve and simplify -Wmultistatement-macros (PR c/81448, c/81306)

2017-07-27 Thread Marek Polacek
To recap, -Wmultistatement-macros warns for code like

#define FOO y++; z++

  if (i)
 FOO;

if FOO expands to multiple statements not wrapped in {}.  It tracks the
location of the guard (if), the location of the body of the conditional
(y++) and the location of the following statement (z++).  This warning
only warns if BODY and NEXT come from the same macro expansion, and the
guard doesn't come from that expansion.  But it can be fooled with code
like

#define IF if(1)

#define BAR \
void bar (void) \
{   \
  IF\
baz (); \
  return;   \
}

where BODY and NEXT come from the same expansion (BAR), the guard doesn't, yet
we shouldn't warn.  To stave off the bogus warnings, my fix is to keep
unwinding the IF macro, and if any expansion is the same as the expansion of
BODY and LOC, don't warn.  So basically avoid the warning in this scenario:

          BAR
           |
     +-----+-----+
     |           |
     IF    baz (); return;
     |
     if
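
As a rough illustration of the unwinding described above, the chain of
expansions can be modelled as a linked list and walked outward from the guard.
This is a toy model only: struct expansion, its parent field, and
guard_reaches_body are hypothetical names and do not correspond to the libcpp
line-map API used in the patch.

```c
#include <stddef.h>

/* Toy stand-in for a macro expansion: PARENT is the expansion in which
   this macro was itself expanded, or NULL at the top level.  */
struct expansion
{
  const struct expansion *parent;
};

/* Return 1 if GUARD's expansion, or any expansion enclosing it, is the
   same expansion that BODY comes from.  In that case no warning is
   wanted.  */
int
guard_reaches_body (const struct expansion *guard,
		    const struct expansion *body)
{
  for (; guard != NULL; guard = guard->parent)
    if (guard == body)
      return 1;
  return 0;
}
```

In the BAR example, the guard's own expansion (IF) has BAR as its parent, and
BAR is also where BODY comes from, so the walk finds a match and the warning
is suppressed.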

While working on the fix, I noticed I can simplify the code a lot: the
MACRO_MAP_LOCATIONS walk is not necessary, which should also fix the
ugly PR81306 - yay!
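
For anyone who has not run into the underlying hazard, here is a small sketch
of what the warning is meant to catch and the conventional fix.  The macro and
function names are made up for illustration and are not taken from the
testsuite.

```c
/* Unsafe: expands to two statements, and only the first one ends up
   guarded by an enclosing "if".  */
#define BUMP_BOTH_UNSAFE(y, z) (y)++; (z)++

/* Conventional fix: wrap the statements in do { } while (0) so the
   expansion is a single statement.  */
#define BUMP_BOTH_SAFE(y, z) do { (y)++; (z)++; } while (0)

/* Returns 10 * y + z after conditionally "bumping both".  */
int
bump_unsafe (int cond)
{
  int y = 0, z = 0;
  if (cond)
    BUMP_BOTH_UNSAFE (y, z);	/* z++ runs even when COND is 0.  */
  return 10 * y + z;
}

int
bump_safe (int cond)
{
  int y = 0, z = 0;
  if (cond)
    BUMP_BOTH_SAFE (y, z);
  return 10 * y + z;
}
```

With cond == 0, bump_unsafe still increments z, while bump_safe leaves both
variables untouched.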

Bootstrapped/regtested on x86_64-linux and ppc64le-linux, ok for trunk?

2017-07-27  Marek Polacek  

PR c/81448
PR c/81306
* c-warn.c (warn_for_multistatement_macros): Prevent bogus
warnings.  Avoid walking MACRO_MAP_LOCATIONS.   
 

* c-c++-common/Wmultistatement-macros-13.c: New test.

diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
index a8b38c1b98d..0bfb24994d9 100644
--- gcc/c-family/c-warn.c
+++ gcc/c-family/c-warn.c
@@ -2456,34 +2456,44 @@ warn_for_multistatement_macros (location_t body_loc, 
location_t next_loc,
   || body_loc_exp == next_loc_exp)
 return;
 
-  /* Find the macro map for the macro expansion BODY_LOC.  */
-  const line_map *map = linemap_lookup (line_table, body_loc);
-  const line_map_macro *macro_map = linemap_check_macro (map);
-
-  /* Now see if the following token is coming from the same macro
- expansion.  If it is, it's a problem, because it should've been
- parsed at this point.  We only look at odd-numbered indexes
- within the MACRO_MAP_LOCATIONS array, i.e. the spelling locations
- of the tokens.  */
-  bool found_guard = false;
-  bool found_next = false;
-  for (unsigned int i = 1;
-   i < 2 * MACRO_MAP_NUM_MACRO_TOKENS (macro_map);
-   i += 2)
-{
-  if (MACRO_MAP_LOCATIONS (macro_map)[i] == next_loc_exp)
-   found_next = true;
-  if (MACRO_MAP_LOCATIONS (macro_map)[i] == guard_loc_exp)
-   found_guard = true;
-}
+  /* Find the macro maps for the macro expansions.  */
+  const line_map *body_map = linemap_lookup (line_table, body_loc);
+  const line_map *next_map = linemap_lookup (line_table, next_loc);
+  const line_map *guard_map = linemap_lookup (line_table, guard_loc);
+
+  /* Now see if the following token (after the body) is coming from the
+ same macro expansion.  If it is, it might be a problem.  */
+  if (body_map != next_map)
+return;
 
   /* The conditional itself must not come from the same expansion, because
  we don't want to warn about
  #define IF if (x) x++; y++
  and similar.  */
-  if (!found_next || found_guard)
+  if (guard_map == body_map)
 return;
 
+  /* Handle the case where NEXT and BODY come from the same expansion while
+ GUARD doesn't, yet we shouldn't warn.  E.g.
+
+   #define GUARD if (...)
+   #define GUARD2 GUARD
+
+ and in the definition of another macro:
+
+   GUARD2
+   foo ();
+   return 1;
+   */
+  while (linemap_macro_expansion_map_p (guard_map))
+{
+  const line_map_macro *mm = linemap_check_macro (guard_map);
+  guard_loc_exp = MACRO_MAP_EXPANSION_POINT_LOCATION (mm);
+  guard_map = linemap_lookup (line_table, guard_loc_exp);
+  if (guard_map == body_map)
+   return;
+}
+
   if (warning_at (body_loc, OPT_Wmultistatement_macros,
  "macro expands to multiple statements"))
 inform (guard_loc, "some parts of macro expansion are not guarded by "
diff --git gcc/testsuite/c-c++-common/Wmultistatement-macros-13.c 
gcc/testsuite/c-c++-common/Wmultistatement-macros-13.c
index e69de29bb2d..9f42e268d9f 100644
--- gcc/testsuite/c-c++-common/Wmultistatement-macros-13.c
+++ gcc/testsuite/c-c++-common/Wmultistatement-macros-13.c
@@ -0,0 +1,104 @@
+/* PR c/81448 */
+/* { dg-do compile } */
+/* { dg-options "-Wmultistatement-macros" } */
+
+extern int i;
+
+#define BAD4 i++; i++ /* { dg-warning "macro expands to multiple statements" } 
*/
+#define BAD5 i++; i++ /* { dg-warning "macro expands to multiple statements" } 
*/
+#define BAD6 i++; i++ /* { dg-warning "macro expands to multip

Re: c-family PATCH to improve -Wsign-compare (PR c/81417)

2017-07-27 Thread Marek Polacek
On Tue, Jul 25, 2017 at 11:03:12AM -0400, David Malcolm wrote:
> Thanks for updating the patch.
> 
> There's still an issue with ordering in the updated patch.
> 
> There are three orderings:
> 
>   (a) the order of the expressions in the source code (LHS CMP RHS)
> 
>   (b) the order of kinds of signedness in the messages (currently
> hardcoded as "signed and unsigned", which doesn't respect (a))
> 
>   (c) the order of the types that are reported (currently done as
> orig_op0 vs orig_op1, which if I'm reading the code is LHS vs RHS).
> 
> So, as written (a) and (c) have the same order, but (b)'s order is
> hardcoded, and so there could be a mismatch.
> 
> All of the examples in the testcase are of the form
>   signed LHS with unsigned RHS.
> 
> What happens if the LHS is unsigned, and the RHS is signed?  e.g.
> 
>   int
>   fn5 (unsigned int a, signed int b)
>   {
> return a < b;
>   }
> 
> Presumably this case still merits a warning, but as written, the
> warning would read:
> 
>   warning "comparison between signed and unsigned integer expressions:   
> 'unsigned int' and 'signed int'
> 
> This seems rather awkward to me; in a less trivial example, I can imagine the 
> user having to take a moment to figure out which side of the expression has 
> which signedness.
> 
> I think that any time we're reporting on the types of two sides of an 
> expression like this, we should follow the ordering in the user's code, i.e. 
> (a) above.   The patch has (c) doing this, but the text (b) is problematic.
> 
> I can see two ways of fixing this:
> 
> (i) rework the text of the message so that it changes based on which side has 
> which signedness, e.g.:
> 
>   "comparison between signed and unsigned integer expressions"
> vs
>   "comparison between unsigned and signed integer expressions"
> 
> or,
> 
> (ii) change the text of the message to not have an ordering.  Clang has 
> "comparison of integers of different signs" - though I think this should say 
> "signedness", not "signs"; surely an instance of an int has a sign (e.g. "-3" 
> is negative), but an integer *type* has a signedness (e.g. "unsigned short").
> So I'd change it to say:
> "comparison of integer expressions of different signedness"
> 
> Please also add a testcase that covers this case (e.g. fn5 above).

I went with (ii).  I've also added the new test:

Bootstrapped/regtested on x86_64-linux and ppc64le-linux, ok for trunk?

2017-07-27  Marek Polacek  

PR c/81417
* c-warn.c (warn_for_sign_compare): Tweak the warning message.  Print
the types.

* c-c++-common/Wsign-compare-1.c: New test.
* g++.dg/warn/Wsign-compare-2.C: Update dg-warning.
* g++.dg/warn/Wsign-compare-4.C: Likewise.
* g++.dg/warn/Wsign-compare-6.C: Likewise.
* g++.dg/warn/compare1.C: Likewise.
* gcc.dg/compare1.c: Likewise.
* gcc.dg/compare2.c: Likewise.
* gcc.dg/compare4.c: Likewise.
* gcc.dg/compare5.c: Likewise.
* gcc.dg/pr35430.c: Likewise.
* gcc.dg/pr60087.c: Likewise.

diff --git gcc/c-family/c-warn.c gcc/c-family/c-warn.c
index a8b38c1b98d..505070e5586 100644
--- gcc/c-family/c-warn.c
+++ gcc/c-family/c-warn.c
@@ -1891,9 +1891,10 @@ warn_for_sign_compare (location_t location,
   c_common_signed_type (base_type)))
/* OK */;
   else
-   warning_at (location,
-   OPT_Wsign_compare,
-   "comparison between signed and unsigned integer 
expressions");
+   warning_at (location, OPT_Wsign_compare,
+   "comparison of integer expressions of different "
+   "signedness: %qT and %qT", TREE_TYPE (orig_op0),
+   TREE_TYPE (orig_op1));
 }
 
   /* Warn if two unsigned values are being compared in a size larger
diff --git gcc/testsuite/c-c++-common/Wsign-compare-1.c 
gcc/testsuite/c-c++-common/Wsign-compare-1.c
index e69de29bb2d..b9b17a99280 100644
--- gcc/testsuite/c-c++-common/Wsign-compare-1.c
+++ gcc/testsuite/c-c++-common/Wsign-compare-1.c
@@ -0,0 +1,33 @@
+/* PR c/81417 */
+/* { dg-do compile } */
+/* { dg-options "-Wsign-compare" } */
+
+int
+fn1 (signed int a, unsigned int b)
+{
+  return a < b; /* { dg-warning "comparison of integer expressions of 
different signedness: 'int' and 'unsigned int'" } */
+}
+
+int
+fn2 (signed int a, unsigned int b)
+{
+  return b < a; /* { dg-warning "comparison of integer expressions of 
different signedness: 'unsigned int' and 'int'" } */
+}
+
+int
+fn3 (signed long int a, unsigned long int b)
+{
+  return b < a; /* { dg-warning "comparison of integer expressions of 
different signedness: 'long unsigned int' and 'long int'" } */
+}
+
+int
+fn4 (signed short int a, unsigned int b)
+{
+  return b < a; /* { dg-warning "comparison of integer expressions of 
different signedness: 'unsigned int' and 'short int'" } */
+}
+
+int
+fn5 (unsigned int a, signed int b)
+{
+  return a < b; /* { dg-warning "compa

[PATCH] Dump BB number when dumping a BB with label.

2017-07-27 Thread Martin Liška
Hi.

The following simple patch dumps the BB number when a BB contains a
label. This makes debugging easier, as one can find the destination of
an edge in the dump file.

Sample, before:

foo (int a)
{
  int D.1821;
  int _1;
  int _4;
  int _5;

   [0.00%] [count: INV]:
  switch (a_2(D))  [INV] [count: INV], case 0:  [INV] [count: 
INV], case 1:  [INV] [count: INV]>

 [0.00%] [count: INV]:
  a_3 = a_2(D) + 2;

 [0.00%] [count: INV]:
  _4 = 2;
  goto  (); [INV] [count: INV]

 [0.00%] [count: INV]:
  _5 = 123;

  # _1 = PHI <_4(4), _5(5)>
 [0.00%] [count: INV]:
  return _1;

}

After:

foo (int a)
{
  int D.1821;
  int _1;
  int _4;
  int _5;

   [0.00%] [count: INV]:
  switch (a_2(D))  [INV] [count: INV], case 0:  [INV] [count: 
INV], case 1:  [INV] [count: INV]>

 () [0.00%] [count: INV]:
  a_3 = a_2(D) + 2;

 () [0.00%] [count: INV]:
  _4 = 2;
  goto  (); [INV] [count: INV]

 () [0.00%] [count: INV]:
  _5 = 123;

  # _1 = PHI <_4(4), _5(5)>
 () [0.00%] [count: INV]:
  return _1;

}

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Thoughts?
Martin

gcc/testsuite/ChangeLog:

2017-07-27  Martin Liska  

* gcc.dg/builtin-unreachable-6.c: Update scanned pattern.
* gcc.dg/tree-ssa/attr-hotcold-2.c: Likewise.
* gcc.dg/tree-ssa/ssa-ccp-18.c: Likewise.

gcc/ChangeLog:

2017-07-27  Martin Liska  

* gimple-pretty-print.c (dump_gimple_label): Dump BB number.
---
 gcc/gimple-pretty-print.c  | 6 +-
 gcc/testsuite/gcc.dg/builtin-unreachable-6.c   | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c | 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-18.c | 3 +--
 4 files changed, 9 insertions(+), 6 deletions(-)


diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index c8eb9c4a7bf..6b272286714 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1122,7 +1122,11 @@ dump_gimple_label (pretty_printer *buffer, glabel *gs, int spc,
   dump_generic_node (buffer, label, spc, flags, false);
   basic_block bb = gimple_bb (gs);
   if (bb && !(flags & TDF_GIMPLE))
-	pp_scalar (buffer, " %s", dump_profile (bb->frequency, bb->count));
+	{
+	  if (gimple_bb (gs))
+	pp_scalar (buffer, " ()", gimple_bb (gs)->index);
+	  pp_scalar (buffer, " %s", dump_profile (bb->frequency, bb->count));
+	}
   pp_colon (buffer);
 }
   if (flags & TDF_GIMPLE)
diff --git a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c
index d2596e95c3f..040917f29b0 100644
--- a/gcc/testsuite/gcc.dg/builtin-unreachable-6.c
+++ b/gcc/testsuite/gcc.dg/builtin-unreachable-6.c
@@ -16,5 +16,5 @@ lab2:
   goto *x;
 }
 
-/* { dg-final { scan-tree-dump-times "lab \\\[\[0-9.\]+%\\\]" 1 "fab1" } } */
+/* { dg-final { scan-tree-dump-times "lab \\\(\\\) \\\[\[0-9.\]+%\\\]" 1 "fab1" } } */
 /* { dg-final { scan-tree-dump-times "__builtin_unreachable" 1 "fab1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c b/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c
index 184dd10ddae..67eb9163684 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c
@@ -20,9 +20,9 @@ void f(int x, int y)
 
 /* { dg-final { scan-tree-dump-times "hot label heuristics" 1 "profile_estimate" } } */
 /* { dg-final { scan-tree-dump-times "cold label heuristics" 1 "profile_estimate" } } */
-/* { dg-final { scan-tree-dump "A \\\[0\\\..*\\\]" "profile_estimate" } } */
+/* { dg-final { scan-tree-dump "A \\\(\\\) \\\[0\\\..*\\\]" "profile_estimate" } } */
 
 /* Note: we're attempting to match some number > 6000, i.e. > 60%.
The exact number ought to be tweekable without having to juggle
the testcase around too much.  */
-/* { dg-final { scan-tree-dump "B \\\[\[6-9\]\[0-9\]\\\..*\\\]" "profile_estimate" } } */
+/* { dg-final { scan-tree-dump "B \\\(\\\) \\\[\[6-9\]\[0-9\]\\\..*\\\]" "profile_estimate" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-18.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-18.c
index 2ab12626088..78d53520395 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-18.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-18.c
@@ -15,5 +15,4 @@ void func2(int* val)
   d: d(val);
 }
 
-/* { dg-final { scan-tree-dump-not "a \\\(" "ccp1" } } */
-/* { dg-final { scan-tree-dump-not "b \\\(" "ccp1" } } */
+/* { dg-final { scan-tree-dump-not "goto" "ccp1" } } */



[PATCH] Add -nolibc option

2017-07-27 Thread Tristan Gingold

Hello,

this patch adds a new option -nolibc to suppress -lc in the link command.
This refines -nostdlib/-nostartfiles/-nodefaultlibs, so that it is
possible to link with libgcc but without libc.


Our main use case is for embedded targets when we use the GNAT compiler 
without an installed libc.  Of course, in that case the gnat library has 
to provide its own memcpy/memset/memmove/memcmp if needed.
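
To give an idea of what "provide its own" means in practice, here is a minimal
sketch of such routines in plain C.  The rt_ prefix is hypothetical and only
there to avoid clashing with a hosted libc; a real runtime would have to export
the standard names, since GCC may emit calls to memcpy, memmove, memset and
memcmp on its own.  These are simple byte-wise versions, not tuned ones, and
not the GNAT runtime's actual code.

```c
#include <stddef.h>
#include <stdint.h>

void *
rt_memcpy (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n--)
    *d++ = *s++;
  return dst;
}

/* Like rt_memcpy, but safe for overlapping regions: copy forwards when
   DST is below SRC, backwards otherwise.  */
void *
rt_memmove (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  if ((uintptr_t) d < (uintptr_t) s)
    while (n--)
      *d++ = *s++;
  else
    while (n--)
      d[n] = s[n];
  return dst;
}

void *
rt_memset (void *dst, int c, size_t n)
{
  unsigned char *d = dst;
  while (n--)
    *d++ = (unsigned char) c;
  return dst;
}

/* Compare as unsigned bytes; return <0, 0 or >0 like memcmp.  */
int
rt_memcmp (const void *a, const void *b, size_t n)
{
  const unsigned char *x = a, *y = b;
  for (; n--; x++, y++)
    if (*x != *y)
      return *x < *y ? -1 : 1;
  return 0;
}
```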


No regressions on x86_64-linux-gnu.
Ok to commit ?

Tristan.

2017-07-27  gingold  

* common.opt (nolibc): New option.
* doc/invoke.texi (Link Options): Document it.
* gcc.c (LINK_GCC_C_SEQUENCE_SPEC): Consider nolibc.
* config/arm/unknown-elf.h (LINK_GCC_C_SEQUENCE_SPEC): Likewise.

Index: gcc/common.opt
===
--- gcc/common.opt  (revision 250563)
+++ gcc/common.opt  (working copy)
@@ -2956,6 +2956,10 @@
 nostdlib
 Driver

+nolibc
+Driver
+Do not link with libc
+
 o
 Common Driver Joined Separate Var(asm_file_name) MissingArgError(missing filename after %qs)

 -o Place output into .
Index: gcc/config/arm/unknown-elf.h
===
--- gcc/config/arm/unknown-elf.h(revision 250563)
+++ gcc/config/arm/unknown-elf.h(working copy)
@@ -91,6 +91,6 @@
 /* The libgcc udivmod functions may throw exceptions.  If newlib is
configured to support long longs in I/O, then printf will depend on
udivmoddi4, which will depend on the exception unwind routines,
-   which will depend on abort, which is defined in libc.  */
+   which will depend on abort, which is defined in libc.  */
 #undef LINK_GCC_C_SEQUENCE_SPEC
-#define LINK_GCC_C_SEQUENCE_SPEC "--start-group %G %L --end-group"
+#define LINK_GCC_C_SEQUENCE_SPEC "--start-group %G %{!nolibc:%L} --end-group"

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 250563)
+++ gcc/doc/invoke.texi (working copy)
@@ -495,8 +495,8 @@
 @item Linker Options
 @xref{Link Options,,Options for Linking}.
 @gccoptlist{@var{object-file-name}  -fuse-ld=@var{linker} -l@var{library} @gol

--nostartfiles  -nodefaultlibs  -nostdlib  -pie  -pthread  -rdynamic @gol
--s  -static  -static-libgcc  -static-libstdc++ @gol
+-nostartfiles  -nodefaultlibs  -nostdlib  -nolibc  -pie  -pthread @gol
+-rdynamic  -s  -static  -static-libgcc  -static-libstdc++ @gol
 -static-libasan  -static-libtsan  -static-liblsan  -static-libubsan @gol
 -static-libmpx  -static-libmpxwrappers @gol
 -shared  -shared-libgcc  -symbolic @gol
@@ -11760,6 +11760,12 @@
 constructors are called; @pxref{Collect2,,@code{collect2}, gccint,
 GNU Compiler Collection (GCC) Internals}.)

+@item -nolibc
+@opindex nolibc
+Do not use the standard C library when linking, but still link with
+start files and @file{libgcc.a}.  This is useful mainly on bare-board
+targets in the case there is no C library available.
+
 @item -pie
 @opindex pie
 Produce a position independent executable on targets that support it.
Index: gcc/gcc.c
===
--- gcc/gcc.c   (revision 250563)
+++ gcc/gcc.c   (working copy)
@@ -863,7 +863,7 @@
-lgcc and -lc order specially, yet not require them to override all
of LINK_COMMAND_SPEC.  */
 #ifndef LINK_GCC_C_SEQUENCE_SPEC
-#define LINK_GCC_C_SEQUENCE_SPEC "%G %L %G"
+#define LINK_GCC_C_SEQUENCE_SPEC "%G %{!nolibc:%L %G}"
 #endif

 #ifndef LINK_SSP_SPEC


Re: [Patch (preapproved)] Guard Copy Header pass on flag_tree_loop_vectorize

2017-07-27 Thread James Greenhalgh
On Thu, Jul 27, 2017 at 02:26:03PM +0200, Richard Biener wrote:
> On Thu, Jul 27, 2017 at 2:08 PM, Jakub Jelinek  wrote:
> > On Thu, Jul 27, 2017 at 01:54:21PM +0200, Richard Biener wrote:
> >> --- gcc/common.opt  (revision 250619)
> >> +++ gcc/common.opt  (working copy)
> >>  ftree-vectorize
> >> -Common Report Var(flag_tree_vectorize) Optimization
> >> +Common Report Optimization
> >>  Enable vectorization on trees.
> >>
> >>  ftree-vectorizer-verbose=
> >>
> >> which shows a few other uses of flag_tree_vectorize:
> >>
> >> int
> >> omp_max_vf (void)
> >> {
> >>   if (!optimize
> >>   || optimize_debug
> >>   || !flag_tree_loop_optimize
> >>   || (!flag_tree_loop_vectorize
> >>   && (global_options_set.x_flag_tree_loop_vectorize
> >>   || global_options_set.x_flag_tree_vectorize)))
> >> return 1;
> >>
> >> not sure what that was supposed to test.  Jakub?  Similar
> >> use in expand_omp_simd.
> >
> > The intent is/was to check if the vectorizer pass will be invoked,
> > otherwise it makes no sense to generate the arrays.
> > So, for -O0/-Og or -fno-tree-loop-optimize, we know that the pass
> > isn't even in the pipeline.
> > And otherwise the intent was that we try to optimize, unless
> > user asked explicitly -fno-tree-loop-vectorize or -fno-tree-vectorize
> > not to optimize.  Because the vect pass is enabled if:
> > flag_tree_loop_vectorize || fun->has_force_vectorize_loops
> > but returning non-zero from omp_max_vf and the other omp spot means
> > there will be cfun->has_force_vectorize_loops.
> 
> I see.  So it would be good to try if adding EnabledBy(ftree-vectorize) to
> ftree-loop-vectorize/ftree-slp-vectorize would add those to global_options_set
> iff -ftree-vectorize is enabled (the opts.c hunk setting the flags is then
> redundant as well I guess).

This looks like it works.

I'll prepare the patch and put it through a full bootstrap cycle.

Thanks,
James



Re: [PATCH] [RISCV] Add RTEMS support

2017-07-27 Thread Kito Cheng
Hi Sebastian:
LGTM, I've tested that riscv32-rtems-gcc is buildable.

Thanks for your patch :)

Hi Palmer:
Could you help to commit this patch?

Thanks.

On Thu, Jul 27, 2017 at 7:05 PM, Sebastian Huber
 wrote:
> gcc/
> * config.gcc (riscv*-*-elf*): Add (riscv*-*-rtems*).
> * config/riscv/rtems.h: New file.
> ---
>  gcc/config.gcc   |  7 ++-
>  gcc/config/riscv/rtems.h | 31 +++
>  2 files changed, 37 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/config/riscv/rtems.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index aab7f65c1df..f28164646c3 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -2040,7 +2040,7 @@ riscv*-*-linux*)
> # automatically detect that GAS supports it, yet we require it.
> gcc_cv_initfini_array=yes
> ;;
> -riscv*-*-elf*)
> +riscv*-*-elf* | riscv*-*-rtems*)
> tm_file="elfos.h newlib-stdint.h ${tm_file} riscv/elf.h"
> case "x${enable_multilib}" in
> xno) ;;
> @@ -2053,6 +2053,11 @@ riscv*-*-elf*)
> # Force .init_array support.  The configure script cannot always
> # automatically detect that GAS supports it, yet we require it.
> gcc_cv_initfini_array=yes
> +   case ${target} in
> +   riscv*-*-rtems*)
> + tm_file="${tm_file} rtems.h riscv/rtems.h"
> + ;;
> +   esac
> ;;
>  mips*-*-netbsd*)   # NetBSD/mips, either endian.
> target_cpu_default="MASK_ABICALLS"
> diff --git a/gcc/config/riscv/rtems.h b/gcc/config/riscv/rtems.h
> new file mode 100644
> index 000..221e2f69815
> --- /dev/null
> +++ b/gcc/config/riscv/rtems.h
> @@ -0,0 +1,31 @@
> +/* Definitions for RISC-V RTEMS systems with ELF format.
> +   Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published
> +   by the Free Software Foundation; either version 3, or (at your
> +   option) any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but WITHOUT
> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +   License for more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   .  */
> +
> +#undef TARGET_OS_CPP_BUILTINS
> +#define TARGET_OS_CPP_BUILTINS()   \
> +do {   \
> +   builtin_define ("__rtems__");   \
> +   builtin_define ("__USE_INIT_FINI__");   \
> +   builtin_assert ("system=rtems");\
> +} while (0)
> --
> 2.12.3
>


Re: [PATCH] Fix PR middle-end/81564: ICE in group_case_labels_stmt()

2017-07-27 Thread Peter Bergner
On 7/27/17 2:48 AM, Richard Biener wrote:
> On Wed, Jul 26, 2017 at 9:35 PM, Peter Bergner  wrote:
>> The fix here is to just treat case labels that point to blocks that have
>> already been deleted similarly to case labels that point to the default
>> case statement, by removing them.
>>
>> This passed bootstrap and regtesting on powerpc64le-linux with no 
>> regressions.
>> Ok for trunk?
> 
> Ok.

Thanks, committed as revision 250628.

Peter



[PATCH][2/2] Fix PR81502

2017-07-27 Thread Richard Biener

I am testing the following additional pattern for match.pd to fix
PR81502 resulting in the desired optimization to

bar:
.LFB526:
.cfi_startproc
movl%edi, %eax
ret

The pattern optimizes a BIT_FIELD_REF of a BIT_INSERT_EXPR by
extracting either from the destination or from the inserted value.
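
The two cases handled by the pattern can be sketched in plain C as follows.
This is only an illustration of the range checks, not GCC code: the helper
name, the enum and the use of plain unsigned bit positions are assumptions
made for the example.

```c
enum extract_source { FROM_INSERTED, FROM_DESTINATION, NO_SIMPLIFICATION };

/* Hypothetical helper mirroring the two conditions of the pattern.
   IPOS/ISIZE describe the inserted bits, RPOS/RSIZE the extracted bits.
   *NEW_POS receives the bit position to extract from in the chosen
   operand; it is left untouched when nothing can be simplified.  */
enum extract_source
classify_extract (unsigned ipos, unsigned isize,
                  unsigned rpos, unsigned rsize, unsigned *new_pos)
{
  /* The inserted value fully covers the extracted range.  */
  if (ipos <= rpos && rpos + rsize <= ipos + isize)
    {
      *new_pos = rpos - ipos;
      return FROM_INSERTED;
    }
  /* The insertion does not touch the extracted range at all.  */
  if (ipos >= rpos + rsize || rpos >= ipos + isize)
    {
      *new_pos = rpos;
      return FROM_DESTINATION;
    }
  /* Partial overlap: leave the expression alone.  */
  return NO_SIMPLIFICATION;
}
```

For instance, with a 32-bit value inserted at bit 0, extracting 32 bits at
bit 0 reads from the inserted value, extracting 32 bits at bit 64 reads from
the destination, and extracting 32 bits at bit 16 is left unsimplified.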

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-07-27  Richard Biener  

PR tree-optimization/81502
* match.pd: Add pattern combining BIT_INSERT_EXPR with
BIT_FIELD_REF.

* gcc.target/i386/pr81502.c: New testcase.

Index: gcc/match.pd
===
*** gcc/match.pd(revision 250620)
--- gcc/match.pd(working copy)
*** DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
*** 4178,4180 
--- 4178,4195 
 { CONSTRUCTOR_ELT (ctor, idx / k)->value; })
(BIT_FIELD_REF { CONSTRUCTOR_ELT (ctor, idx / k)->value; }
   @1 { bitsize_int ((idx % k) * width); })
+ 
+ /* Simplify a bit extraction from a bit insertion for the cases with
+the inserted element fully covering the extraction or the insertion
+not touching the extraction.  */
+ (simplify
+  (BIT_FIELD_REF (bit_insert @0 @1 @ipos) @rsize @rpos)
+  (switch
+   (if (wi::leu_p (@ipos, @rpos)
+&& wi::leu_p (wi::add (@rpos, @rsize),
+  wi::add (@ipos, TYPE_PRECISION (TREE_TYPE (@1)
+(BIT_FIELD_REF @1 @rsize { wide_int_to_tree (bitsizetype,
+ wi::sub (@rpos, @ipos)); }))
+   (if (wi::geu_p (@ipos, wi::add (@rpos, @rsize))
+|| wi::geu_p (@rpos, wi::add (@ipos, TYPE_PRECISION (TREE_TYPE (@1)
+(BIT_FIELD_REF @0 @rsize @rpos
Index: gcc/testsuite/gcc.target/i386/pr81502.c
===
*** gcc/testsuite/gcc.target/i386/pr81502.c (nonexistent)
--- gcc/testsuite/gcc.target/i386/pr81502.c (working copy)
***
*** 0 
--- 1,34 
+ /* { dg-do compile { target lp64 } } */
+ /* { dg-options "-O2 -msse2" } */
+ 
+ #include 
+ 
+ #define SIZE (sizeof (void *))
+ 
+ static int foo(unsigned char (*foo)[SIZE])
+ {
+   __m128i acc = _mm_set_epi32(0, 0, 0, 0);
+   size_t i = 0;
+   for(; i + sizeof(__m128i) <= SIZE; i += sizeof(__m128i)) {
+   __m128i word;
+   __builtin_memcpy(&word, foo + i, sizeof(__m128i));
+   acc = _mm_add_epi32(word, acc);
+   }
+   if (i != SIZE) {
+   __m128i word = _mm_set_epi32(0, 0, 0, 0);
+   __builtin_memcpy(&word, foo + i, SIZE - i); // (1)
+   acc = _mm_add_epi32(word, acc);
+   }
+   int res;
+   __builtin_memcpy(&res, &acc, sizeof(res));
+   return res;
+ }
+ 
+ int bar(void *ptr)
+ {
+   unsigned char buf[SIZE];
+   __builtin_memcpy(buf, &ptr, SIZE);
+   return foo((unsigned char(*)[SIZE])buf);
+ }
+ 
+ /* { dg-final { scan-assembler-times "mov" 1 } } */


[PATCH] Fix PR81571

2017-07-27 Thread Richard Biener

The following fixes PR81571.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-07-27  Richard Biener  

PR tree-optimization/81571
* tree-vect-slp.c (vect_build_slp_tree): Properly verify reduction
PHIs.

* gcc.dg/torture/pr81571.c: New testcase.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 250607)
+++ gcc/tree-vect-slp.c (working copy)
@@ -947,11 +948,27 @@ vect_build_slp_tree (vec_info *vinfo,
  the recursion.  */
   if (gimple_code (stmt) == GIMPLE_PHI)
 {
+  vect_def_type def_type = STMT_VINFO_DEF_TYPE (vinfo_for_stmt (stmt));
   /* Induction from different IVs is not supported.  */
-  if (STMT_VINFO_DEF_TYPE (vinfo_for_stmt (stmt)) == vect_induction_def)
-   FOR_EACH_VEC_ELT (stmts, i, stmt)
- if (stmt != stmts[0])
-   return NULL;
+  if (def_type == vect_induction_def)
+   {
+ FOR_EACH_VEC_ELT (stmts, i, stmt)
+   if (stmt != stmts[0])
+ return NULL;
+   }
+  else
+   {
+ /* Else def types have to match.  */
+ FOR_EACH_VEC_ELT (stmts, i, stmt)
+   {
+ /* But for reduction chains only check on the first stmt.  */
+ if (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt))
+ && GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) != stmt)
+   continue;
+ if (STMT_VINFO_DEF_TYPE (vinfo_for_stmt (stmt)) != def_type)
+   return NULL;
+   }
+   }
   node = vect_create_new_slp_node (stmts);
   return node;
 }
Index: gcc/testsuite/gcc.dg/torture/pr81571.c
===
--- gcc/testsuite/gcc.dg/torture/pr81571.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr81571.c  (working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+
+int a, b, c, d;
+short fn1(int p1, int p2) { return p1; }
+
+int fn2(int p1) {}
+
+int main()
+{
+  for (; c; c++)
+a |= fn1(1, a) | fn2(b |= d);
+  return 0;
+}


[PATCH] Fix PR81573

2017-07-27 Thread Richard Biener

The following fixes PR81573.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-07-27  Richard Biener  

PR tree-optimization/81573
PR tree-optimization/81494
* tree-vect-loop.c (vect_create_epilog_for_reduction): Handle
multi defuse cycle case.

* gcc.dg/torture/pr81573.c: New testcase.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 250607)
+++ gcc/tree-vect-loop.c(working copy)
@@ -4787,20 +4800,17 @@ vect_create_epilog_for_reduction (vec 1)
+{
+  gcc_assert (new_phis.length () == 1);
+  tree first_vect = PHI_RESULT (new_phis[0]);
+  gassign *new_vec_stmt = NULL;
+  vec_dest = vect_create_destination_var (scalar_dest, vectype);
+  gimple *next_phi = new_phis[0];
+  for (int k = 1; k < ncopies; ++k)
+   {
+ next_phi = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (next_phi));
+ tree second_vect = PHI_RESULT (next_phi);
+  tree tem = make_ssa_name (vec_dest, new_vec_stmt);
+  new_vec_stmt = gimple_build_assign (tem, code,
+ first_vect, second_vect);
+  gsi_insert_before (&exit_gsi, new_vec_stmt, GSI_SAME_STMT);
+ first_vect = tem;
+   }
+  new_phi_result = first_vect;
+  new_phis.truncate (0);
+  new_phis.safe_push (new_vec_stmt);
+}
   else
 new_phi_result = PHI_RESULT (new_phis[0]);
 
Index: gcc/testsuite/gcc.dg/torture/pr81573.c
===
--- gcc/testsuite/gcc.dg/torture/pr81573.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr81573.c  (working copy)
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+
+int a = 1, *c = &a, d;
+char b;
+
+int main ()
+{
+  for (; b > -27; b--)
+{
+  *c ^= b;
+  *c ^= 1;
+}
+  while (a > 1)
+;
+  return 0; 
+}


Re: [PATCH 1/2] Introduce testsuite support to run Python tests

2017-07-27 Thread David Malcolm
On Thu, 2017-07-27 at 10:49 +0200, Pierre-Marie de Rodat wrote:
> On 07/26/2017 06:48 PM, David Malcolm wrote:
> > IIRC RHEL 6 has Python 2.6 as its /usr/bin/python (but Python 2.7
> > is
> > available as a "software collection" add-on).
> > 
> > I don't know if gcc as a project would want to support 2.6+ or
> > simply
> > 2.7 for Python 2.
> 
> I don’t know either: let’s wait for further feedback, then. If needed
> I’ll turn all the .format into % operations.

Note that str.format was introduced in Python 2.6, so even if we do
support that old version, str.format is still good; no need to rewrite
those ops.

(sorry, I used to maintain Python for RHEL, so I'm perhaps
over-sensitive to this kind of thing).

Dave


Re: [patch 0/2] PR49847: Add hook to place read-only lookup-tables in named address-space

2017-07-27 Thread Georg-Johann Lay

On 27.07.2017 14:34, Richard Biener wrote:

> On Thu, Jul 27, 2017 at 2:29 PM, Georg-Johann Lay  wrote:
>> For some targets, the best place to put read-only lookup tables as
>> generated by -ftree-switch-conversion is not the generic address space
>> but some target specific address space.
>>
>> This is the case for AVR, where for most devices .rodata must be
>> located in RAM.
>>
>> Part #1 adds a new, optional target hook that queries the back-end
>> about the desired address space.  There is currently only one user:
>> tree-switch-conversion.c::build_one_array() which re-builds value_type
>> and array_type if the address space returned by the backend is not
>> the generic one.
>>
>> Part #2 is the AVR part that implements the new hook and adds some
>> sugar around it.
>
> Given that switch-conversion just creates a constant initializer doesn't AVR
> benefit from handling those uniformly (at RTL expansion?).  Not sure but
> I think it goes through the regular constant pool handling.
>
> Richard.


avr doesn't use constant pools because they are not profitable for
several reasons.

Moreover, it's not sufficient to just patch the pool's section, you'd
also have to patch *all* accesses to also use the exact same address
space so that they use the correct instruction (LPM instead of LD).
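
To illustrate why the table and every access to it must agree on the address
space, here is a small sketch of a hand-written lookup table. It assumes
avr-gcc's named address space __flash, which is only usable when the compiler
defines the __FLASH macro; on any other compiler the TABLE_AS macro (a name
made up for this example) expands to nothing so the code still builds.

```c
/* avr-gcc defines __FLASH when the __flash named address space is
   available; elsewhere the qualifier expands to nothing.  */
#ifdef __FLASH
#define TABLE_AS __flash
#else
#define TABLE_AS
#endif

/* The table lives in the chosen address space ...  */
static const TABLE_AS unsigned char squares[] = { 0, 1, 4, 9, 16, 25 };

unsigned char
square_of (unsigned char i)
{
  /* ... and the pointer used to read it carries the same qualifier,
     so the compiler can pick the matching load instruction.  */
  const TABLE_AS unsigned char *p = squares;
  return i < sizeof squares ? p[i] : 0;
}
```

If the pointer type dropped the qualifier while the table kept it, the access
would use the wrong kind of load, which is the consistency problem described
above for tables that are relocated after the fact.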

Loop optimization, for example, may move addr-space pointers out of
loops and actually does this for some CSWTCH tables.  I didn't look
into pool handling, but I don't expect that it allows all accesses
to be patched consistently after the fact.

*If* that's possible, then it should also be possible to patch vtables
and all of their accesses, aka.

https://gcc.gnu.org/PR43745

e.g. in a target-specific tree pass?

With vtables it's basically the same problem: you'd have to patch *all*
accesses to match the vtable's address space.  C++ need not expose
address spaces to the application; handling address spaces internally
would be sufficient.

Johann


Re: [PATCH v2] [SPARC] Add -mfsmuld option

2017-07-27 Thread Eric Botcazou
> Thanks for your quick review. I am really glad that we can now use the
> upcoming GCC 7.2 release.

You're welcome.  I just realized that FSMULD would pop up out of nowhere in 
the log displayed by -mdebug=options so I have installed the attached fixlet.


2017-07-27  Eric Botcazou  

* config/sparc/sparc.c (sparc_option_override): Set MASK_FSMULD flag
earlier and only if MASK_FPU is set.  Adjust formatting.
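
The "default unless the user said otherwise" idiom used by the fixlet can be
sketched in isolation like this. The mask values and the function name are
made up for the example; only the shape of the condition follows the patch.

```c
/* Hypothetical bit assignments, for illustration only.  */
#define MASK_FPU    (1u << 0)
#define MASK_FSMULD (1u << 1)

/* FLAGS holds the current option bits, EXPLICIT_FLAGS the bits the user
   set or cleared on the command line.  The default is applied only when
   the FPU is enabled and the user did not mention the option.  */
unsigned
apply_fsmuld_default (unsigned flags, unsigned explicit_flags)
{
  if ((flags & MASK_FPU) && !(explicit_flags & MASK_FSMULD))
    flags |= MASK_FSMULD;
  return flags;
}
```

Doing this before the initial flags are dumped means the debug log shows the
bit from the start instead of having it appear between two dumps.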


-- 
Eric Botcazou
Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 250609)
+++ config/sparc/sparc.c	(working copy)
@@ -1449,8 +1449,7 @@ sparc_option_override (void)
   MASK_V9|MASK_POPC|MASK_VIS4|MASK_FMAF|MASK_CBCOND|MASK_SUBXC },
 /* UltraSPARC M8 */
 { "m8",		MASK_ISA,
-  MASK_V9|MASK_POPC|MASK_VIS4|MASK_FMAF|MASK_CBCOND|MASK_SUBXC
-  |MASK_VIS4B }
+  MASK_V9|MASK_POPC|MASK_VIS4|MASK_FMAF|MASK_CBCOND|MASK_SUBXC|MASK_VIS4B }
   };
   const struct cpu_table *cpu;
   unsigned int i;
@@ -1489,6 +1488,11 @@ sparc_option_override (void)
 	}
 }
 
+  /* Enable the FsMULd instruction by default if not explicitly specified by
+ the user.  It may be later disabled by the CPU (explicitly or not).  */
+  if (TARGET_FPU && !(target_flags_explicit & MASK_FSMULD))
+target_flags |= MASK_FSMULD;
+
   if (TARGET_DEBUG_OPTIONS)
 {
   dump_target_flags("Initial target_flags", target_flags);
@@ -1513,12 +1517,6 @@ sparc_option_override (void)
   target_flags |= MASK_LONG_DOUBLE_128;
 }
 
-  /* Enable the FsMULd instruction by default if not explicitly configured by
- the user.  It may be later disabled by the CPU target flags or if
- !TARGET_FPU.  */
-  if (!(target_flags_explicit & MASK_FSMULD))
-target_flags |= MASK_FSMULD;
-
   /* Code model selection.  */
   sparc_cmodel = SPARC_DEFAULT_CMODEL;
 
@@ -1540,7 +1538,7 @@ sparc_option_override (void)
 	sparc_cmodel = cmodel->value;
 	}
   else
-	error ("-mcmodel= is not supported on 32 bit systems");
+	error ("-mcmodel= is not supported on 32-bit systems");
 }
 
   /* Check that -fcall-saved-REG wasn't specified for out registers.  */
@@ -1551,7 +1549,7 @@ sparc_option_override (void)
 call_used_regs [i] = 1;
   }
 
-  /* Set the default CPU.  */
+  /* Set the default CPU if no -mcpu option was specified.  */
   if (!global_options_set.x_sparc_cpu_and_features)
 {
   for (def = &cpu_default[0]; def->cpu != -1; ++def)
@@ -1561,6 +1559,7 @@ sparc_option_override (void)
   sparc_cpu_and_features = def->processor;
 }
 
+  /* Set the default CPU if no -mtune option was specified.  */
   if (!global_options_set.x_sparc_cpu)
 sparc_cpu = sparc_cpu_and_features;
 
@@ -1569,8 +1568,6 @@ sparc_option_override (void)
   if (TARGET_DEBUG_OPTIONS)
 {
   fprintf (stderr, "sparc_cpu_and_features: %s\n", cpu->name);
-  fprintf (stderr, "sparc_cpu: %s\n",
-	   cpu_table[(int) sparc_cpu].name);
   dump_target_flags ("cpu->disable", cpu->disable);
   dump_target_flags ("cpu->enable", cpu->enable);
 }
@@ -1613,7 +1610,7 @@ sparc_option_override (void)
 
   /* Don't allow -mvis, -mvis2, -mvis3, -mvis4, -mvis4b, -mfmaf and -mfsmuld if
  FPU is disabled.  */
-  if (! TARGET_FPU)
+  if (!TARGET_FPU)
 target_flags &= ~(MASK_VIS | MASK_VIS2 | MASK_VIS3 | MASK_VIS4
 		  | MASK_VIS4B | MASK_FMAF | MASK_FSMULD);
 
@@ -1626,18 +1623,18 @@ sparc_option_override (void)
 }
 
   /* -mvis also implies -mv8plus on 32-bit.  */
-  if (TARGET_VIS && ! TARGET_ARCH64)
+  if (TARGET_VIS && !TARGET_ARCH64)
 target_flags |= MASK_V8PLUS;
 
-  /* Use the deprecated v8 insns for sparc64 in 32 bit mode.  */
+  /* Use the deprecated v8 insns for sparc64 in 32-bit mode.  */
   if (TARGET_V9 && TARGET_ARCH32)
 target_flags |= MASK_DEPRECATED_V8_INSNS;
 
-  /* V8PLUS requires V9, makes no sense in 64 bit mode.  */
-  if (! TARGET_V9 || TARGET_ARCH64)
+  /* V8PLUS requires V9 and makes no sense in 64-bit mode.  */
+  if (!TARGET_V9 || TARGET_ARCH64)
 target_flags &= ~MASK_V8PLUS;
 
-  /* Don't use stack biasing in 32 bit mode.  */
+  /* Don't use stack biasing in 32-bit mode.  */
   if (TARGET_ARCH32)
 target_flags &= ~MASK_STACK_BIAS;
 
@@ -4975,7 +4972,7 @@ enum sparc_mode_class {
??? Note that, despite the settings, non-double-aligned parameter
registers can hold double-word quantities in 32-bit mode.  */
 
-/* This points to either the 32 bit or the 64 bit version.  */
+/* This points to either the 32-bit or the 64-bit version.  */
 const int *hard_regno_mode_classes;
 
 static const int hard_32bit_mode_classes[] = {
@@ -7309,7 +7306,7 @@ sparc_function_arg_advance (cumulative_args_t cum_
 }
 
 /* Handle the FUNCTION_ARG_PADDING macro.
-   For the 64 bit ABI structs are always stored left shifted in their
+   For the 64-bit ABI structs are always stored left shifted in their
argument slot.  */
 
 enum direction
@@ -8428,7 +8425,7 @@ o

Re: [PATCH v2][RFC] Canonize names of attributes.

2017-07-27 Thread Martin Liška
PING^1

On 07/13/2017 03:48 PM, Martin Liška wrote:
> On 07/11/2017 05:52 PM, Jason Merrill wrote:
>> On Tue, Jul 11, 2017 at 9:37 AM, Martin Liška  wrote:
>>> On 07/03/2017 11:00 PM, Jason Merrill wrote:
>>>> On Mon, Jul 3, 2017 at 5:52 AM, Martin Liška  wrote:
>>>>> On 06/30/2017 09:34 PM, Jason Merrill wrote:
>>>>>>
>>>>>> On Fri, Jun 30, 2017 at 5:23 AM, Martin Liška  wrote:
>>>>>>>
>>>>>>> This is v2 of the patch, where just names of attributes are
>>>>>>> canonicalized.
>>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>>>>>> tests.
>>>>>>
>>>>>>
>>>>>> What is the purpose of the new "strict" parameter to cmp_attribs* ?  I
>>>>>> don't see any discussion of it.
>>>>>
>>>>>
>>>>> It's needed for arguments of attribute names, like:
>>>>>
>>>>> /usr/include/stdio.h:391:62: internal compiler error: in cmp_attribs, at
>>>>> tree.h:5523
>>>>> __THROWNL __attribute__ ((__format__ (__printf__, 3, 4)));
>>>>>
>>>>
>>>> Mm.  Although we don't want to automatically canonicalize all
>>>> identifier arguments to attributes in the parser, we could still do it
>>>> for specific attributes, e.g. in handle_format_attribute or
>>>> handle_mode_attribute.
>>>
>>> Yep, that was done in my previous version of the patch
>>> (https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00996.html).
>>> Where only attribute that was preserved unchanged was 'cleanup':
>>>
>>> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
>>> index 8f638785e0e..08b4db5e5bd 100644
>>> --- a/gcc/cp/parser.c
>>> +++ b/gcc/cp/parser.c
>>> @@ -24765,7 +24765,8 @@ cp_parser_gnu_attribute_list (cp_parser* parser)
>>>   tree tv;
>>>   if (arguments != NULL_TREE
>>>   && ((tv = TREE_VALUE (arguments)) != NULL_TREE)
>>> - && TREE_CODE (tv) == IDENTIFIER_NODE)
>>> + && TREE_CODE (tv) == IDENTIFIER_NODE
>>> + && !id_equal (TREE_PURPOSE (attribute), "cleanup"))
>>> TREE_VALUE (arguments) = canonize_attr_name (tv);
>>>   release_tree_vector (vec);
>>> }
>>>
>>> Does it work for you to do it so?
>>
>> This is canonicalizing arguments by default; I want the default to be
>> not canonicalizing arguments.  I think we only want to canonicalize
>> arguments for format and mode, and we can do that in their handle_*
>> functions.
> 
> Yep, done that in v3. I decided to move couple of functions to attribs.h and
> verified that it will not cause binary size increase of cc1 and cc1plus.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
> 
>>
>> Jason
>>
> 



Re: [PATCH] Make __FUNCTION__ a mergeable string and do not generate symbol entry.

2017-07-27 Thread Martin Liška
PING^1

On 07/14/2017 10:35 AM, Martin Liška wrote:
> On 05/01/2017 09:13 PM, Jason Merrill wrote:
>> On Wed, Apr 26, 2017 at 6:58 AM, Martin Liška  wrote:
>>> On 04/25/2017 01:58 PM, Jakub Jelinek wrote:
>>>> On Tue, Apr 25, 2017 at 01:48:05PM +0200, Martin Liška wrote:
>>>>> Hello.
>>>>>
>>>>> This is patch that was originally installed by Jason and later reverted
>>>>> due to PR70422.
>>>>> In the later PR Richi suggested a fix for that and Segher verified that
>>>>> it helped him
>>>>> to survive regression tests. That's reason why I'm resending that.
>>>>>
>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>>>
>>>>> Ready to be installed?
>>>>> Martin
>>>>
>>>>> >From a34ce0ef37ae00609c9f3ff98a9cb0b7db6a8bd0 Mon Sep 17 00:00:00 2001
>>>>> From: marxin 
>>>>> Date: Thu, 20 Apr 2017 14:56:30 +0200
>>>>> Subject: [PATCH] Make __FUNCTION__ a mergeable string and do not generate
>>>>>  symbol entry.
>>>>>
>>>>> gcc/cp/ChangeLog:
>>>>>
>>>>> 2017-04-20  Jason Merrill  
>>>>>  Martin Liska  
>>>>>  Segher Boessenkool  
>>>>>
>>>>>  PR c++/64266
>>>>>  PR c++/70353
>>>>>  PR bootstrap/70422
>>>>>  Core issue 1962
>>>>>  * decl.c (cp_fname_init): Decay the initializer to pointer.
>>>>>  (cp_make_fname_decl): Set DECL_DECLARED_CONSTEXPR_P,
>>>>>  * pt.c (tsubst_expr) [DECL_EXPR]: Set DECL_VALUE_EXPR,
>>>>>  DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P and
>>>>>  DECL_IGNORED_P.  Don't call cp_finish_decl.
>>>>
>>>> If we don't emit those into the debug info, will the debugger be
>>>> able to handle __FUNCTION__ etc. properly?
>>>
>>> No, debugger with the patch can't handle these. Similar to how clang
>>> behaves currently. Maybe it can be conditionally enabled with -g3, or -g?
>>>
>>>> Admittedly, right now we emit it into debug info only if those decls
>>>> are actually used, say on:
>>>> const char * foo () { return __FUNCTION__; }
>>>> const char * bar () { return ""; }
>>>> we'd emit foo::__FUNCTION__, but not bar::__FUNCTION__, so the debugger
>>>> has to have some handling of it anyway.  But while in functions
>>>> that don't refer to __FUNCTION__ it is always the debugger that needs
>>>> to synthetize those and thus they will be always pointer-equal,
>>>> if there are some uses of it and for other uses the debugger would
>>>> synthetize it, there is the possibility that the debugger synthetized
>>>> string will not be the same object as actually used in the function.
>>>
>>> You're right, currently one has to use a special function to be able to
>>> print it in debugger. I believe we've already discussed that, according
>>> to spec, the strings don't have to point to a same string.
>>>
>>> Suggestions what we should do with the patch?
>>
>> We need to emit debug information for these variables.  From Jim's
>> description in 70422 it seems that the problem is that the reference
>> to the string from the debug information is breaking
>> function_mergeable_rodata_prefix, which relies on
>> current_function_decl.  It seems to me that its callers should pass
>> along their decl parameter so that f_m_r_p can use the decl's
>> DECL_CONTEXT rather than rely on current_function_decl being set
>> properly.
>>
>> Jason
>>
> 
> Ok, after some time I returned to it. I followed your advice and
> changed the function function_mergeable_rodata_prefix. Apart from that,
> a small rebase was needed.
> 
> May I ask Jim to test the patch?
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Martin
> 



Re: [PATCH] Fix indirect call optimization done by autoFDO.

2017-07-27 Thread Martin Liška
On 07/26/2017 07:22 PM, Jeff Law wrote:
> So is the comment wrong?  Or is my interpretation wrong?
> 
> jeff

Yes, the comment needs adjustment; I've done that in r250622.

Thanks for the review,
Martin


Re: [PATCH] Initialize counters in autoFDO to zero, not to uninitialized.

2017-07-27 Thread Martin Liška
On 07/26/2017 07:43 PM, Jeff Law wrote:
> On 07/11/2017 04:35 AM, Martin Liška wrote:
>> Hello.
>>
>> This fixes the majority of autoFDO test-cases.
>>
>> Patch can bootstrap and survives regression tests.
>>
>> Ready for trunk?
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2017-07-11  Martin Liska  
>>
>> * auto-profile.c (afdo_annotate_cfg): Assign zero counts to
>> BBs and edges seen by autoFDO.
> I went back and forth on this a couple times.  I could argue that if we
> don't have data for the edge from auto-fdo, then the proper value is
> uninitialized().  But I think the response to that argument is that if
> the edge didn't show up in the afdo run, then it's counters should be
> zero'd as they weren't triggered during the afdo run.
> 
> I think a comment immediately prior to the initialization seems wise. OK
> with that change.

Yes, I've just added a comment and installed the patch as r250621.

Thanks for the review.
Martin

> 
> jeff
> 



Re: [patch 2/2,avr] PR49847: Add hook to place read-only lookup-tables in named address-space

2017-07-27 Thread Georg-Johann Lay

On 27.07.2017 14:29, Georg-Johann Lay wrote:

> For some targets, the best place to put read-only lookup tables as
> generated by -ftree-switch-conversion is not the generic address space
> but some target specific address space.
>
> This is the case for AVR, where for most devices .rodata must be
> located in RAM.
>
> Part #1 adds a new, optional target hook that queries the back-end
> about the desired address space.  There is currently only one user:
> tree-switch-conversion.c::build_one_array() which re-builds value_type
> and array_type if the address space returned by the backend is not
> the generic one.
>
> Part #2 is the AVR part that implements the new hook and adds some
> sugar around it.


This is the AVR part.

It implements the new hook which returns a convenient flash address
space for devices where .rodata is located in RAM:  The 16-bit __flash
for devices with <= 64 KiB flash and 24-bit __memx for > 64 KiB flash.

It adds a new option -maddr-space-for-lookup= which allows picking a
specific address space.

Some new insns and a combine-split support better code generation by
exploiting the knowledge that the 24-bit addresses will never point to
RAM, so that the expensive run-time decision whether to use LD or LPM
can be avoided.
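
The selection rule described above (16-bit __flash up to 64 KiB of flash,
24-bit __memx beyond that) can be sketched as a small stand-alone function.
The enum, the function name and its parameters are hypothetical and do not
reflect the hook's real signature; returning the generic space for devices
whose .rodata is not in RAM is an assumption drawn from the description.

```c
/* Hypothetical stand-ins for the address spaces involved.  */
enum lookup_space { SPACE_GENERIC, SPACE_FLASH, SPACE_MEMX };

/* FLASH_BYTES is the device's flash size; RODATA_IN_RAM is nonzero for
   devices where .rodata has to be copied to RAM.  */
enum lookup_space
pick_lookup_space (unsigned long flash_bytes, int rodata_in_ram)
{
  /* Assumption: nothing to gain when .rodata already lives in flash.  */
  if (!rodata_in_ram)
    return SPACE_GENERIC;
  /* 16-bit addresses reach the first 64 KiB; larger devices need the
     24-bit space.  */
  return flash_bytes <= 64ul * 1024 ? SPACE_FLASH : SPACE_MEMX;
}
```

An explicit -maddr-space-for-lookup= setting would presumably take precedence
over such a default, but that part is not modelled here.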

Passed without new regressions on atmega128.

Ok for trunk provided the gcc part 1/2 is approved?

Johann


Implement TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA.

PR target/49857
* config/avr/avr-opts.h: New file.
* config/avr/avr.opt: Include it.
(-maddr-space-for-lookup=): New option and...
(avr_opt_addr_space_for_lookup): ...associated Var.
(avr_aspace_for_lookup): New option enums used by above.
* config/avr/avr-protos.h (avr_out_load_flashx): New proto.
* config/avr/avr.c (avr_out_load_flashx): New function.
* avr_adjust_insn_length [ADJUST_LEN_LOAD_FLASHX]: Handle it.
* avr_rtx_costs_1 [ZERO_EXTEND, SIGN_EXTEND]: Handle
shift-and-extend-by-1 of HI -> PSI.
[ASHIFT,PSImode]: Describe cost of extend-and-shift-by-1.
(TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA): Define to...
(avr_addr_space_for_artificial_rodata): ...this new static function.
* config/avr/avr.md (unspec): Add UNSPEC_LOAD_FLASHX.
(adjust_len): Add load_flashx.
(*ashiftpsi.1_sign_extend.hi, *ashiftpsi.1_zero_extend.hi)
(*extendpsi.ashift.1.uqi, *load-flashx): New insns.
(*split_xload-cswtch): New insn-and-split.
* doc/invoke.texi (AVR Options) <-maddr-space-for-lookup=>: Document.

Index: config/avr/avr-opts.h
===
--- config/avr/avr-opts.h	(nonexistent)
+++ config/avr/avr-opts.h	(working copy)
@@ -0,0 +1,40 @@
+/* Definitions for option handling for AVR.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef AVR_OPTS_H
+#define AVR_OPTS_H
+
+enum avr_opt_addr_space
+  {
+AVR_OPT_ADDR_SPACE_flash,
+AVR_OPT_ADDR_SPACE_flash1,
+AVR_OPT_ADDR_SPACE_flash2,
+AVR_OPT_ADDR_SPACE_flash3,
+AVR_OPT_ADDR_SPACE_flash4,
+AVR_OPT_ADDR_SPACE_flash5,
+AVR_OPT_ADDR_SPACE_memx,
+AVR_OPT_ADDR_SPACE_generic
+  };
+
+#endif /* AVR_OPTS_H */
Index: config/avr/avr-protos.h
===
--- config/avr/avr-protos.h	(revision 250302)
+++ config/avr/avr-protos.h	(working copy)
@@ -94,6 +94,7 @@ extern const char* avr_out_plus (rtx, rt
 extern const char* avr_out_round (rtx_insn *, rtx*, int* =NULL);
 extern const char* avr_out_addto_sp (rtx*, int*);
 extern const char* avr_out_xload (rtx_insn *, rtx*, int*);
+extern const char* avr_out_load_flashx (rtx_insn*, rtx*, int*);
 extern const char* avr_out_movmem (rtx_insn *, rtx*, int*);
 extern const char* avr_out_insert_bits (rtx*, int*);
 extern bool avr_popcount_each_byte (rtx, int, int);
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 250302)
+++ config/avr/avr.c	(working copy)
@@ -3914,6 +3914,31 @@ avr_out_xload (rtx_i

Re: [7/7] Pool alignment information for common bases

2017-07-27 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Jul 4, 2017 at 2:01 PM, Richard Sandiford
>  wrote:
>> Richard Biener  writes:
>>> On Mon, Jul 3, 2017 at 9:49 AM, Richard Sandiford
>>>  wrote:
 @@ -2070,8 +2143,7 @@ vect_find_same_alignment_drs (struct dat
if (dra == drb)
  return;

 -  if (!operand_equal_p (DR_BASE_OBJECT (dra), DR_BASE_OBJECT (drb),
 -   OEP_ADDRESS_OF)
 +  if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0)
>>>
>>> Why this change?  It's semantically weaker after your change.
>>
>> It's because the DR_BASE_OBJECT comes from the access_fn analysis
>> while the DR_BASE_ADDRESS comes from the innermost_loop_behavior.
>> I hadn't realised when adding the original code how different the
>> two were, and since all the other parts are based on the
>> innermost_loop_behavior, I think this check should be too.
>> E.g. it doesn't really make sense to compare DR_INITs based
>> on DR_BASE_OBJECT.
>
> Ah ok, makes sense now.
>
>> I guess it should have been a separate patch though.
>
> No need.
>
|| !operand_equal_p (DR_OFFSET (dra), DR_OFFSET (drb), 0)
|| !operand_equal_p (DR_STEP (dra), DR_STEP (drb), 0))
  return;
 @@ -2129,6 +2201,7 @@ vect_analyze_data_refs_alignment (loop_v
vec datarefs = vinfo->datarefs;
struct data_reference *dr;

 +  vect_record_base_alignments (vinfo);
FOR_EACH_VEC_ELT (datarefs, i, dr)
  {
stmt_vec_info stmt_info = vinfo_for_stmt (DR_STMT (dr));
 @@ -3327,7 +3400,8 @@ vect_analyze_data_refs (vec_info *vinfo,
 {
   struct data_reference *newdr
 = create_data_ref (NULL, loop_containing_stmt (stmt),
 - DR_REF (dr), stmt, maybe_scatter ? false : true);
 +  DR_REF (dr), stmt, !maybe_scatter,
 +  DR_IS_CONDITIONAL_IN_STMT (dr));
   gcc_assert (newdr != NULL && DR_REF (newdr));
   if (DR_BASE_ADDRESS (newdr)
   && DR_OFFSET (newdr)
 Index: gcc/tree-vect-slp.c
 ===
 --- gcc/tree-vect-slp.c 2017-07-03 08:20:56.404763323 +0100
 +++ gcc/tree-vect-slp.c 2017-07-03 08:42:51.149380545 +0100
 @@ -2358,6 +2358,7 @@ new_bb_vec_info (gimple_stmt_iterator re
gimple_stmt_iterator gsi;

res = (bb_vec_info) xcalloc (1, sizeof (struct _bb_vec_info));
 +  new (&res->base_alignments) vec_base_alignments ();
>>>
>>> Ick.  I'd rather make this proper C++ and do
>>>
>>>res = new _bb_vec_info;
>>>
>>> and add a constructor to the vec_info base initializing the hashtable.
>>> The above smells fishy.
>>
>> I knew I was pushing my luck with that one.  I just didn't want to have to
>> convert the current xcalloc of loop_vec_info into a long list of explicit
>> zero initializers.  (OK, we have a lot of explicit zero assignments already,
>> but certainly some fields rely on the calloc zeroing.)
>>
>>> Looks like vec<> are happy with .create () being called on a zeroed struct.
>>>
>>> Alternatively add .create () to hashtable/map.
>>
>> The difference is that vec<> is explicitly meant to be POD, whereas
>> hash_map isn't (and I don't think we want it to be).
>
> Ah, indeed.  So your patch makes *vec_info no longer POD (no problem
> but then use new/delete and constructors).
>
>> Ah well.  I'll do a separate pre-patch to C++-ify the structures.

Here's an update that applies on top of the vec_info c++-ification patch
I just posted [ https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01807.html ]

Tested as before.  OK to install?

Thanks,
Richard


2017-07-27  Richard Sandiford  

gcc/
PR tree-optimization/81136
* tree-vectorizer.h: Include tree-hash-traits.h.
(vec_base_alignments): New typedef.
(vec_info): Add a base_alignments field.
(vect_record_base_alignments): Declare.
* tree-data-ref.h (data_reference): Add an is_conditional_in_stmt
field.
(DR_IS_CONDITIONAL_IN_STMT): New macro.
(create_data_ref): Add an is_conditional_in_stmt argument.
* tree-data-ref.c (create_data_ref): Likewise.  Use it to initialize
the is_conditional_in_stmt field.
(data_ref_loc): Add an is_conditional_in_stmt field.
(get_references_in_stmt): Set the is_conditional_in_stmt field.
(find_data_references_in_stmt): Update call to create_data_ref.
(graphite_find_data_references_in_stmt): Likewise.
* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Likewise.
* tree-vect-data-refs.c (vect_analyze_data_refs): Likewise.
(vect_record_base_alignment): New function.
(vect_record_base_alignments): Likewise.
(vect_compute_data_ref_alignment): Adjust base_addr and aligned_to
for nested statements even if we fail to compute a misalignment.
   

Re: [patch 1/2] PR49847: Add hook to place read-only lookup-tables in named address-space

2017-07-27 Thread Georg-Johann Lay

On 27.07.2017 14:29, Georg-Johann Lay wrote:

For some targets, the best place to put read-only lookup tables as
generated by -ftree-switch-conversion is not the generic address space
but some target specific address space.

This is the case for AVR, where for most devices .rodata must be
located in RAM.

Part #1 adds a new, optional target hook that queries the back-end
about the desired address space.  There is currently only one user:
tree-switch-conversion.c::build_one_array() which re-builds value_type
and array_type if the address space returned by the backend is not
the generic one.

Part #2 is the AVR part that implements the new hook and adds some
sugar around it.


This is the gcc part, which adds the new hook
TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA, i.e.
targetm.addr_space.for_artificial_rodata().


The default implementation returns ADDR_SPACE_GENERIC, which represents
a no-op.  The array and value types are re-built only if
!ADDR_SPACE_GENERIC_P.

Every access to the newly created array must be generated so that it
matches the address space.  This is why the back-end cannot do it on
its own:  Just putting the stuff in a specific section does *not* do
the trick.
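
As a rough illustration of the behaviour described above (not the patch
itself), the following self-contained sketch models a hook whose default
returns the generic address space, and a caller that re-qualifies the value
type only when the hook returns something else.  The names toy_type,
flash_hook and choose_value_type are hypothetical and exist only for this
example; the value 1 for a flash address space is likewise an assumption.

```cpp
#include <cassert>

// Small integer address-space id, with 0 as the generic space.
typedef unsigned char addr_space_t;
const addr_space_t ADDR_SPACE_GENERIC = 0;

// Hypothetical stand-in for a tree type: a name plus its address space.
struct toy_type
{
  const char *name;
  addr_space_t as;
};

// Shape of the hook: given a type, return the address space to use.
typedef addr_space_t (*rodata_hook) (const toy_type &);

// Default behaviour: always the generic address space, i.e. a no-op.
addr_space_t
default_hook (const toy_type &)
{
  return ADDR_SPACE_GENERIC;
}

// Hypothetical target hook that prefers address space 1 ("flash").
addr_space_t
flash_hook (const toy_type &)
{
  return 1;
}

// Re-qualify the value type only if the hook asks for a non-generic space.
toy_type
choose_value_type (toy_type value_type, rodata_hook hook)
{
  assert (hook != nullptr);
  addr_space_t as = hook (value_type);
  if (as != ADDR_SPACE_GENERIC)
    value_type.as = as;
  return value_type;
}
```

With default_hook the type comes back unchanged; with flash_hook the returned
type carries the new address space, so any access built from that type would
use it as well.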

Bootstrapped ok on x86_64.

Johann

PR 49857
* doc/tm.texi.in (Named Address Spaces)
[TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA]: Add hook anchor.
* doc/tm.texi: Regenerate.
* target.def (addr_space) [for_artificial_rodata]: New optional hook.
* targhooks.c (default_addr_space_for_artificial_rodata): New function.
* targhooks.h (default_addr_space_for_artificial_rodata): New proto.
* tree-switch-conversion.c (target.h): Include it.
(build_one_array): Set address space of value_type according to
targetm.addr_space.for_artificial_rodata() and rebuild array_type
if needed.
Index: target.def
===
--- target.def	(revision 250302)
+++ target.def	(working copy)
@@ -3285,6 +3285,25 @@ The default implementation does nothing.
  void, (addr_space_t as, location_t loc),
  default_addr_space_diagnose_usage)
 
+/* Function to get the address space of some compiler-generated
+   read-only data.  Used for optimization purposes only.  */
+DEFHOOK
+(for_artificial_rodata,
+ "Define this hook to return an address space to be used for @var{type},\n\
+usually an artificial lookup-table that would reside in @code{.rodata}.\n\
+It is always safe not to implement this hook or to return\n\
+@code{ADDR_SPACE_GENERIC}.\n\
+\n\
+The hook can be used to put compiler-generated, artificial data in\n\
+static storage into a specific address space when this is better suited\n\
+than the generic address space.\n\
+The compiler will also generate all accesses to the respective data\n\
+so that all associated accesses will also use the desired address space.\n\
+An example for such data are the @code{CSWTCH} lookup tables as generated\n\
+by @option{-ftree-switch-conversion}.",
+ addr_space_t, (tree type),
+ default_addr_space_for_artificial_rodata)
+
 HOOK_VECTOR_END (addr_space)
 
 #undef HOOK_PREFIX
Index: targhooks.c
===
--- targhooks.c	(revision 250302)
+++ targhooks.c	(working copy)
@@ -1397,6 +1397,14 @@ default_addr_space_convert (rtx op ATTRI
   gcc_unreachable ();
 }
 
+/* The default hook for TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA.  */
+
+addr_space_t
+default_addr_space_for_artificial_rodata (tree)
+{
+  return ADDR_SPACE_GENERIC;
+}
+
 bool
 default_hard_regno_scratch_ok (unsigned int regno ATTRIBUTE_UNUSED)
 {
Index: targhooks.h
===
--- targhooks.h	(revision 250302)
+++ targhooks.h	(working copy)
@@ -184,6 +184,7 @@ extern bool default_addr_space_zero_addr
 extern int default_addr_space_debug (addr_space_t);
 extern void default_addr_space_diagnose_usage (addr_space_t, location_t);
 extern rtx default_addr_space_convert (rtx, tree, tree);
+extern addr_space_t default_addr_space_for_artificial_rodata (tree);
 extern unsigned int default_case_values_threshold (void);
 extern bool default_have_conditional_execution (void);
 
Index: tree-switch-conversion.c
===
--- tree-switch-conversion.c	(revision 250302)
+++ tree-switch-conversion.c	(working copy)
@@ -46,6 +46,7 @@ Software Foundation, 51 Franklin Street,
 #include "gimplify-me.h"
 #include "tree-cfg.h"
 #include "cfgloop.h"
+#include "target.h"
 
 /* ??? For lang_hooks.types.type_for_mode, but is there a word_mode
type in the GIMPLE type system that is language-independent?  */
@@ -1136,6 +1137,16 @@ build_one_array (gswitch *swtch, int num
   default_type = TREE_TYPE (info->default_values[num]);
   value_type = array_value_type (swtch, default_type, num, info);
   array_type = build_array_type (value_type, arr_index_type);
+

C++-ify vec_info structures

2017-07-27 Thread Richard Sandiford
[ Needed to unblock https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00198.html ]

This patch uses new, delete, constructors and destructors to manage
vec_info.  This includes making ~vec_info free all the data shared
by bb_vec_info and loop_vec_info, whereas previously the code was
duplicated in destroy_bb_vec_info and destroy_loop_vec_info.  This
in turn meant changing the order of:

  FOR_EACH_VEC_ELT (slp_instances, i, instance)
vect_free_slp_instance (instance);

and:

  gimple_set_uid (stmt, -1);

in destroy_bb_vec_info/~_bb_vec_info, so that now vect_free_slp_instance
could see a uid of -1 as well as 0.  The patch updates vinfo_for_stmt
so that it returns NULL for a uid of -1.
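
A minimal sketch of the kind of lookup described here, assuming a 1-based uid
that indexes a side table and treating both 0 and -1 as "no info available".
The names stmt_info and info_for_uid are hypothetical, and a signed int is
used for the uid purely to keep the example simple; this is not the actual
vinfo_for_stmt code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the per-statement vectorizer data.
struct stmt_info
{
  int id;
};

// Look up the info for a 1-based uid.  A uid of 0 or -1 yields a null
// pointer instead of indexing the table.
stmt_info *
info_for_uid (std::vector<stmt_info> &table, int uid)
{
  if (uid <= 0)
    return nullptr;
  assert (static_cast<std::size_t> (uid) <= table.size ());
  return &table[uid - 1];
}
```

A caller that runs after the uid has been reset to -1 then simply sees a null
pointer, which is the situation the reordering in the destructor creates.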

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2017-07-27  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_info): Add a constructor and destructor.
Add an explicit name for the enum.  Use auto_vec for slp_instances
and grouped_stores.
(_loop_vec_info): Add a constructor and destructor.  Use auto_vec
for all vectors.
(_bb_vec_info): Add a constructor and destructor.
(vinfo_for_stmt): Return NULL for uids of -1 as well.
(destroy_loop_vec_info): Delete.
(vect_destroy_datarefs): Likewise.
* tree-vectorizer.c (vect_destroy_datarefs): Delete.
(vec_info::vec_info): New function.
(vec_info::~vec_info): Likewise.
(vectorize_loops): Use delete instead of destroy_loop_vec_info.
* tree-parloops.c (gather_scalar_reductions): Use delete instead of
destroy_loop_vec_info.
* tree-vect-loop.c (new_loop_vec_info): Replace with...
(_loop_vec_info::_loop_vec_info): ...this.
(destroy_loop_vec_info): Replace with...
(_loop_vec_info::~_loop_vec_info): ...this.  Unconditionally delete
the stmt_vec_infos.  Leave handling of vec_info information to its
destructor.  Remove explicit vector releases.
(vect_analyze_loop_form): Use new instead of new_loop_vec_info.
(vect_analyze_loop): Use delete instead of destroy_loop_vec_info.
* tree-vect-slp.c (new_bb_vec_info): Replace with...
(_bb_vec_info::_bb_vec_info): ...this.  Don't reserve space in
BB_VINFO_GROUPED_STORES or BB_VINFO_SLP_INSTANCES.
(destroy_bb_vec_info): Replace with...
(_bb_vec_info::~_bb_vec_info): ...this.  Leave handling of vec_info
information to its destructor.
(vect_slp_analyze_bb_1): Use new and delete instead of
new_bb_vec_info and destroy_bb_vec_info.
(vect_slp_bb): Replace 2 calls to destroy_bb_vec_info with a
single delete.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2017-07-27 13:25:33.934530189 +0100
+++ gcc/tree-vectorizer.h   2017-07-27 13:25:42.989783339 +0100
@@ -155,20 +155,27 @@ typedef std::pair vec_object
 
 /* Vectorizer state common between loop and basic-block vectorization.  */
 struct vec_info {
-  enum { bb, loop } kind;
+  enum vec_kind { bb, loop };
+
+  vec_info (vec_kind, void *);
+  ~vec_info ();
+
+  /* The type of vectorization.  */
+  vec_kind kind;
 
   /* All SLP instances.  */
-  vec slp_instances;
+  auto_vec slp_instances;
 
-  /* All data references.  */
+  /* All data references.  Freed by free_data_refs, so not an auto_vec.  */
   vec datarefs;
 
-  /* All data dependences.  */
+  /* All data dependences.  Freed by free_dependence_relations, so not
+ an auto_vec.  */
   vec ddrs;
 
   /* All interleaving chains of stores, represented by the first
  stmt in the chain.  */
-  vec grouped_stores;
+  auto_vec grouped_stores;
 
   /* Cost data used by the target cost model.  */
   void *target_cost_data;
@@ -198,6 +205,8 @@ is_a_helper <_bb_vec_info *>::test (vec_
 /* Info on vectorized loops.   */
 /*-*/
 typedef struct _loop_vec_info : public vec_info {
+  _loop_vec_info (struct loop *);
+  ~_loop_vec_info ();
 
   /* The loop to which this info struct refers to.  */
   struct loop *loop;
@@ -239,32 +248,32 @@ typedef struct _loop_vec_info : public v
   int ptr_mask;
 
   /* The loop nest in which the data dependences are computed.  */
-  vec loop_nest;
+  auto_vec loop_nest;
 
   /* Data Dependence Relations defining address ranges that are candidates
  for a run-time aliasing check.  */
-  vec may_alias_ddrs;
+  auto_vec may_alias_ddrs;
 
   /* Data Dependence Relations defining address ranges together with segment
  lengths from which the run-time aliasing check is built.  */
-  vec comp_alias_ddrs;
+  auto_vec comp_alias_ddrs;
 
   /* Check that the addresses of each pair of objects is unequal.  */
-  vec check_unequal_addrs;
+  auto_vec check_unequal_addrs;
 
   /* Statements in the loop that have data references that are candidates for a
  runtime (l

Re: [PATCH] Fix segfault in gcov.c (PR gcov-profile/81561).

2017-07-27 Thread Martin Liška
On 07/27/2017 01:48 PM, Richard Biener wrote:
> On Thu, Jul 27, 2017 at 12:12 PM, Martin Liška  wrote:
>> Hello.
>>
>> As reported in the mentioned PR, we segfault in the gcov tool when one
>> uses -a.  It's caused by the fact that the vectors blocked and block_lists
>> have indices kept in sync, and one removes an element from blocked via:
>>blocked.erase (it);
>>
>> Then calling the same function recursively breaks the synchronization.
>> The patch was originally written by Joshua (adding him to CC).  If I'm
>> correct, calling:
>>
>> -unblock (u, blocked, block_lists);
>>
>> does not make sense as we've already removed 'u'.  Plus one needs to copy
>> the content of block_lists[index] to a separate vector in order not to
>> break the iteration.
>>
>> The patch bootstraps on ppc64le-redhat-linux and survives regression tests.
>> It also fixes the problem reported in the openSUSE bugzilla (mentioned in
>> the GCC bugzilla PR).
>>
>> Ready to be installed?
> 
> Looks good to me but please wait for Joshua to confirm.

Yes.

> 
> Did you manage to extract a testcase?

Unfortunately not.  I've tried to isolate the affected function and call it
with various arguments.  The problem is that the affected function itself
does not contain a loop, so it is an inlined copy called from a loop that
triggers the issue.

Martin

> 
> Thanks,
> Richard.
> 
>> Martin
>>
>>
>> gcc/ChangeLog:
>>
>> 2017-07-26  Martin Liska  
>>
>> PR gcov-profile/81561
>> * gcov.c (unblock): Make unblocking safe as we need to preserve
>> index correspondence of blocks and block_lists.
>> ---
>>  gcc/gcov.c | 10 +++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
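
The unblocking scheme described in the quoted message can be sketched roughly
as follows.  This is a self-contained toy, not the gcov.c code: blocks are
plain ints here rather than block pointers, and block_vector is a name chosen
for the example.  It assumes blocked[i] and block_lists[i] always belong
together.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> block_vector;

// Remove U from BLOCKED together with its entry in BLOCK_LISTS, then
// recursively unblock everything that was waiting on U.
void
unblock (int u, block_vector &blocked,
         std::vector<block_vector> &block_lists)
{
  block_vector::iterator it = std::find (blocked.begin (), blocked.end (), u);
  if (it == blocked.end ())
    return;

  std::size_t index = static_cast<std::size_t> (it - blocked.begin ());

  // Copy the waiting list first: the recursive calls below erase
  // elements, which would invalidate any iteration over block_lists[index].
  block_vector to_unblock (block_lists[index]);

  // Erase from both vectors at the same position to keep indices in sync.
  blocked.erase (it);
  block_lists.erase (block_lists.begin () + index);
  assert (blocked.size () == block_lists.size ());

  for (std::size_t i = 0; i < to_unblock.size (); i++)
    unblock (to_unblock[i], blocked, block_lists);
}
```

The copy costs an allocation per call, but it means the loop never touches a
container that the recursion is modifying.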
>>
>>
>>



Re: [patch 0/2] PR49847: Add hook to place read-only lookup-tables in named address-space

2017-07-27 Thread Richard Biener
On Thu, Jul 27, 2017 at 2:29 PM, Georg-Johann Lay  wrote:
> For some targets, the best place to put read-only lookup tables as
> generated by -ftree-switch-conversion is not the generic address space
> but some target specific address space.
>
> This is the case for AVR, where for most devices .rodata must be
> located in RAM.
>
> Part #1 adds a new, optional target hook that queries the back-end
> about the desired address space.  There is currently only one user:
> tree-switch-conversion.c::build_one_array() which re-builds value_type
> and array_type if the address space returned by the backend is not
> the generic one.
>
> Part #2 is the AVR part that implements the new hook and adds some
> sugar around it.

Given that switch-conversion just creates a constant initializer, wouldn't
AVR benefit from handling those uniformly (at RTL expansion?).  Not sure but
I think it goes through the regular constant pool handling.

Richard.

>


[patch 0/2] PR49847: Add hook to place read-only lookup-tables in named address-space

2017-07-27 Thread Georg-Johann Lay

For some targets, the best place to put read-only lookup tables as
generated by -ftree-switch-conversion is not the generic address space
but some target specific address space.

This is the case for AVR, where for most devices .rodata must be
located in RAM.

Part #1 adds a new, optional target hook that queries the back-end
about the desired address space.  There is currently only one user:
tree-switch-conversion.c::build_one_array() which re-builds value_type
and array_type if the address space returned by the backend is not
the generic one.

Part #2 is the AVR part that implements the new hook and adds some
sugar around it.



Re: [Patch (preapproved)] Guard Copy Header pass on flag_tree_loop_vectorize

2017-07-27 Thread Richard Biener
On Thu, Jul 27, 2017 at 2:08 PM, Jakub Jelinek  wrote:
> On Thu, Jul 27, 2017 at 01:54:21PM +0200, Richard Biener wrote:
>> --- gcc/common.opt  (revision 250619)
>> +++ gcc/common.opt  (working copy)
>>  ftree-vectorize
>> -Common Report Var(flag_tree_vectorize) Optimization
>> +Common Report Optimization
>>  Enable vectorization on trees.
>>
>>  ftree-vectorizer-verbose=
>>
>> which shows a few other uses of flag_tree_vectorize:
>>
>> int
>> omp_max_vf (void)
>> {
>>   if (!optimize
>>   || optimize_debug
>>   || !flag_tree_loop_optimize
>>   || (!flag_tree_loop_vectorize
>>   && (global_options_set.x_flag_tree_loop_vectorize
>>   || global_options_set.x_flag_tree_vectorize)))
>> return 1;
>>
>> not sure what that was supposed to test.  Jakub?  Similar
>> use in expand_omp_simd.
>
> The intent is/was to check if the vectorizer pass will be invoked,
> otherwise it makes no sense to generate the arrays.
> So, for -O0/-Og or -fno-tree-loop-optimize, we know that the pass
> isn't even in the pipeline.
> And otherwise the intent was that we try to optimize, unless
> user asked explicitly -fno-tree-loop-vectorize or -fno-tree-vectorize
> not to optimize.  Because the vect pass is enabled if:
> flag_tree_loop_vectorize || fun->has_force_vectorize_loops
> but returning non-zero from omp_max_vf and the other omp spot means
> there will be cfun->has_force_vectorize_loops.

I see.  So it would be good to check whether adding EnabledBy(ftree-vectorize)
to ftree-loop-vectorize/ftree-slp-vectorize would add those to
global_options_set iff -ftree-vectorize is enabled (the opts.c hunk setting
the flags would then be redundant as well, I guess).

Richard.

> Jakub


Re: Handle data dependence relations with different bases

2017-07-27 Thread Richard Sandiford
Richard Sandiford  writes:
> Eric Botcazou  writes:
>> [Sorry for missing the previous messages]
>>
>>> Thanks.  Just been retesting, and I think I must have forgotten
>>> to include Ada last time.  It turns out that the patch causes a dg-scan
>>> regression in gnat.dg/vect17.adb, because we now think that if the
>>> array RECORD_TYPEs *do* alias in:
>>> 
>>>procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>>begin
>>>   for I in Sarray'Range loop
>>>  R(I) := X(I) + Y(I);
>>>   end loop;
>>>end;
>>> 
>>> then the dependence distance must be zero.  Eric, does that hold true
>>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>
>> Yes, I'd think so (even without the artificial RECORD_TYPE around the 
>> arrays).
>
> Good!
>
>>> 2017-06-07  Richard Sandiford  
>>> 
>>> gcc/testsuite/
>>> * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>> * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>> when X = R or Y = R.
>>
>> I think that you need to modify vect15 and vect16 the same way.
>
> Ah, yeah.  And doing that shows that I'd not handled safelen for
> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>
> How does this look?  Tested on x86_64-linux-gnu both without the
> vectoriser changes and with the fixed vectoriser patch.

Here's a version of the patch that handles safelen.  I split the
handling out into a new function (vect_analyze_possibly_independent_ddr)
since it was getting too big to do inline.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


2017-07-27  Richard Sandiford  

gcc/
* tree-data-ref.h (subscript): Add access_fn field.
(data_dependence_relation): Add could_be_independent_p.
(SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
(same_access_functions): Move to tree-data-ref.c.
* tree-data-ref.c (ref_contains_union_access_p): New function.
(access_fn_component_p): Likewise.
(access_fn_components_comparable_p): Likewise.
(dr_analyze_indices): Add a reference to access_fn_component_p.
(dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
DR_ACCESS_FN.
(constant_access_functions): Likewise.
(add_other_self_distances): Likewise.
(same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
(initialize_data_dependence_relation): Use XCNEW and remove
explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
of access functions that have the same type.  Allow the
subsequence to end with different bases in some circumstances.
Record the chosen access functions in SUB_ACCESS_FN.
(build_classic_dist_vector_1): Replace ddr_a and ddr_b with
a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
(subscript_dependence_tester_1): Likewise dra and drb.
(build_classic_dist_vector): Update calls accordingly.
(subscript_dependence_tester): Likewise.
* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
DDR_COULD_BE_INDEPENDENT_P.
* tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
comp_alias_ddrs instead of may_alias_ddrs.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
New function.
(vect_analyze_data_ref_dependence): Use it if
DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
distance vectors if that fails.
(dependence_distance_ge_vf): New function.
(vect_prune_runtime_alias_test_list): Use it.  Don't clear
LOOP_VINFO_MAY_ALIAS_DDRS.

gcc/testsuite/
* gcc.dg/vect/vect-alias-check-3.c: New test.
* gcc.dg/vect/vect-alias-check-4.c: Likewise.
* gcc.dg/vect/vect-alias-check-5.c: Likewise.

Index: gcc/tree-data-ref.h
===
--- gcc/tree-data-ref.h 2017-07-27 13:10:29.620045506 +0100
+++ gcc/tree-data-ref.h 2017-07-27 13:10:33.023912613 +0100
@@ -260,6 +260,9 @@ struct conflict_function
 
 struct subscript
 {
+  /* The access functions of the two references.  */
+  tree access_fn[2];
+
   /* A description of the iterations for which the elements are
  accessed twice.  */
   conflict_function *conflicting_iterations_in_a;
@@ -278,6 +281,7 @@ struct subscript
 
 typedef struct subscript *subscript_p;
 
+#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
 #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
 #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
 #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
@@ -333,6 +337,33 @@ struct data_dependence_relation
   /* Set to true when the dependence relation is on the same data
  access.  */
   bool self_reference_p;
+
+  /* True if the dependence described is conservatively correct rather
+

Re: [Patch (preapproved)] Guard Copy Header pass on flag_tree_loop_vectorize

2017-07-27 Thread Jakub Jelinek
On Thu, Jul 27, 2017 at 01:54:21PM +0200, Richard Biener wrote:
> --- gcc/common.opt  (revision 250619)
> +++ gcc/common.opt  (working copy)
>  ftree-vectorize
> -Common Report Var(flag_tree_vectorize) Optimization
> +Common Report Optimization
>  Enable vectorization on trees.
> 
>  ftree-vectorizer-verbose=
> 
> which shows a few other uses of flag_tree_vectorize:
> 
> int
> omp_max_vf (void)
> {
>   if (!optimize
>   || optimize_debug
>   || !flag_tree_loop_optimize
>   || (!flag_tree_loop_vectorize
>   && (global_options_set.x_flag_tree_loop_vectorize
>   || global_options_set.x_flag_tree_vectorize)))
> return 1;
> 
> not sure what that was supposed to test.  Jakub?  Similar
> use in expand_omp_simd.

The intent is/was to check if the vectorizer pass will be invoked,
otherwise it makes no sense to generate the arrays.
So, for -O0/-Og or -fno-tree-loop-optimize, we know that the pass
isn't even in the pipeline.
And otherwise the intent was that we try to optimize, unless
user asked explicitly -fno-tree-loop-vectorize or -fno-tree-vectorize
not to optimize.  Because the vect pass is enabled if:
flag_tree_loop_vectorize || fun->has_force_vectorize_loops
but returning non-zero from omp_max_vf and the other omp spot means
there will be cfun->has_force_vectorize_loops.
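
To make the condition easier to follow, here is a small sketch that restates
the quoted test as a stand-alone predicate.  The struct opt_state and the
function vectorizer_ruled_out are hypothetical names for this example; the
fields merely mirror the flags named in the quoted code.

```cpp
// Hypothetical snapshot of the option state consulted by the quoted test.
struct opt_state
{
  int optimize;              // optimization level; 0 for -O0
  bool optimize_debug;       // true for -Og
  bool tree_loop_optimize;   // flag_tree_loop_optimize
  bool tree_loop_vectorize;  // flag_tree_loop_vectorize
  bool loop_vectorize_set;   // user gave -f[no-]tree-loop-vectorize
  bool vectorize_set;        // user gave -f[no-]tree-vectorize
};

// True when the quoted condition would make omp_max_vf return 1.
bool
vectorizer_ruled_out (const opt_state &o)
{
  // The pass is not in the pipeline at all.
  if (!o.optimize || o.optimize_debug || !o.tree_loop_optimize)
    return true;

  // Loop vectorization is off and the user set one of the flags
  // explicitly, so treat it as a deliberate request not to vectorize.
  return !o.tree_loop_vectorize
         && (o.loop_vectorize_set || o.vectorize_set);
}
```

Note the asymmetry: with loop vectorization off but neither flag set
explicitly, the predicate is false, which matches the stated intent of still
trying to optimize unless the user asked not to.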

Jakub


[PATCH][1/n] Fix PR81502

2017-07-27 Thread Richard Biener

This fixes a part of PR81502 where we fail to re-write a variable
into SSA because we don't know how to re-write a bit insertion into
a BIT_INSERT_EXPR.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2017-07-27  Richard Biener  

PR tree-optimization/81502
* tree-ssa.c (non_rewritable_lvalue_p): Handle BIT_INSERT_EXPR
with incompatible but same sized type.
(execute_update_addresses_taken): Likewise.

* gcc.target/i386/vect-insert-1.c: New testcase.

Index: gcc/tree-ssa.c
===
--- gcc/tree-ssa.c  (revision 250563)
+++ gcc/tree-ssa.c  (working copy)
@@ -1513,8 +1513,8 @@ non_rewritable_lvalue_p (tree lhs)
   if (DECL_P (decl)
  && VECTOR_TYPE_P (TREE_TYPE (decl))
  && TYPE_MODE (TREE_TYPE (decl)) != BLKmode
- && types_compatible_p (TREE_TYPE (lhs),
-TREE_TYPE (TREE_TYPE (decl)))
+ && operand_equal_p (TYPE_SIZE_UNIT (TREE_TYPE (lhs)),
+ TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl))), 0)
  && tree_fits_uhwi_p (TREE_OPERAND (lhs, 1))
  && tree_int_cst_lt (TREE_OPERAND (lhs, 1),
  TYPE_SIZE_UNIT (TREE_TYPE (decl)))
@@ -1529,8 +1529,9 @@ non_rewritable_lvalue_p (tree lhs)
   && DECL_P (TREE_OPERAND (lhs, 0))
   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
   && TYPE_MODE (TREE_TYPE (TREE_OPERAND (lhs, 0))) != BLKmode
-  && types_compatible_p (TREE_TYPE (lhs),
-TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0
+  && operand_equal_p (TYPE_SIZE_UNIT (TREE_TYPE (lhs)),
+ TYPE_SIZE_UNIT
+   (TREE_TYPE (TREE_TYPE (TREE_OPERAND (lhs, 0, 0)
   && (tree_to_uhwi (TREE_OPERAND (lhs, 2))
  % tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs == 0)
 return false;
@@ -1812,14 +1813,26 @@ execute_update_addresses_taken (void)
 DECL_UID (TREE_OPERAND (lhs, 0)))
&& VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (lhs, 0)))
&& TYPE_MODE (TREE_TYPE (TREE_OPERAND (lhs, 0))) != BLKmode
-   && types_compatible_p (TREE_TYPE (lhs),
-  TREE_TYPE (TREE_TYPE
-  (TREE_OPERAND (lhs, 0
+   && operand_equal_p (TYPE_SIZE_UNIT (TREE_TYPE (lhs)),
+   TYPE_SIZE_UNIT (TREE_TYPE
+ (TREE_TYPE (TREE_OPERAND (lhs, 0,
+   0)
&& (tree_to_uhwi (TREE_OPERAND (lhs, 2))
% tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs))) == 0))
  {
tree var = TREE_OPERAND (lhs, 0);
tree val = gimple_assign_rhs1 (stmt);
+   if (! types_compatible_p (TREE_TYPE (TREE_TYPE (var)),
+ TREE_TYPE (val)))
+ {
+   tree tem = make_ssa_name (TREE_TYPE (TREE_TYPE (var)));
+   gimple *pun
+ = gimple_build_assign (tem,
+build1 (VIEW_CONVERT_EXPR,
+TREE_TYPE (tem), val));
+   gsi_insert_before (&gsi, pun, GSI_SAME_STMT);
+   val = tem;
+ }
tree bitpos = TREE_OPERAND (lhs, 2);
gimple_assign_set_lhs (stmt, var);
gimple_assign_set_rhs_with_ops
@@ -1839,8 +1852,9 @@ execute_update_addresses_taken (void)
&& bitmap_bit_p (suitable_for_renaming, DECL_UID (sym))
&& VECTOR_TYPE_P (TREE_TYPE (sym))
&& TYPE_MODE (TREE_TYPE (sym)) != BLKmode
-   && types_compatible_p (TREE_TYPE (lhs),
-  TREE_TYPE (TREE_TYPE (sym)))
+   && operand_equal_p (TYPE_SIZE_UNIT (TREE_TYPE (lhs)),
+   TYPE_SIZE_UNIT
+ (TREE_TYPE (TREE_TYPE (sym))), 0)
&& tree_fits_uhwi_p (TREE_OPERAND (lhs, 1))
&& tree_int_cst_lt (TREE_OPERAND (lhs, 1),
TYPE_SIZE_UNIT (TREE_TYPE (sym)))
@@ -1848,6 +1862,17 @@ execute_update_addresses_taken (void)
% tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (lhs == 0)
  {
tree val = gimple_assign_rhs1 (stmt);
+   if (! types_compatible_p (TREE_TYPE (val),
+ TREE_TYPE (TREE_TYPE (sym
+ {
+   tree tem = make_ssa_name 

Re: [PATCH] Switch vec_init and vec_extract optabs to 2 mode optab to allow extraction of vector from vector or initialization of vector from smaller vectors (PR target/80846)

2017-07-27 Thread Andreas Krebbel
On 07/25/2017 11:14 AM, Jakub Jelinek wrote:

S/390 parts are ok.

-Andreas-



Re: [Patch (preapproved)] Guard Copy Header pass on flag_tree_loop_vectorize

2017-07-27 Thread Richard Biener
On Thu, Jul 27, 2017 at 1:43 PM, James Greenhalgh
 wrote:
>
> Hi,
>
> While answering a user question on the equivalence of
> -ftree-loop-vectorize + -ftree-slp-vectorize and -ftree-vectorize I
> spotted one case which broke the equivalence. pass_ch::process_loop_p
> was guarded on flag_tree_vectorize, meaning you would get it for
> -ftree-vectorize, but not for -ftree-loop-vectorize/-ftree-slp-vectorize.
>
> This patch fixes that, getting rid of the only use of flag_tree_vectorize
> in the code base.
>
> This was preapproved on IRC:
>
>binche01: Should the first check in
> gcc/tree-ssa-loop-ch.c :: pass_ch_vect::process_loop_p  really be on
> !flag_tree_vectorize ? That seems to go against the documentation that
> -ftree-vectorize is equivalent to -ftree-loop-vectorize
> -ftree-slp-vectorize
>never noticed the condition.  any trouble caused?
>None that I know of, I was trying to answer a user
> question of whether the flags were really equivalent, and spotted
> that while grepping to confirm it
>don't know if header copy can enables slp with
> -fno-tree-loop-vectorize.  richi may have the answer.  maybe you
> can change it to flag_tree_loop_* see if there is breakage.
>jgreenhalgh: we should remove flag_tree_vectorize
>jgreenhalgh: patch pre-approved and change the CH flag check
> to flag_tree_loop_vectorize
>
> Committed as r250619 after a successful bootstrap and test run on
> aarch64-none-linux-gnu.
>
> I'm not sure what was meant by "remove flag_tree_vectorize" - the command line
> option seems a bit too popular to deprecate it, and the options framework
> doesn't like the idea of one option as an Alias of two others. So I've
> left it in place pending further instructions.

I thought of


Index: gcc/common.opt
===
--- gcc/common.opt  (revision 250619)
+++ gcc/common.opt  (working copy)
 ftree-vectorize
-Common Report Var(flag_tree_vectorize) Optimization
+Common Report Optimization
 Enable vectorization on trees.

 ftree-vectorizer-verbose=

which shows a few other uses of flag_tree_vectorize:

int
omp_max_vf (void)
{
  if (!optimize
  || optimize_debug
  || !flag_tree_loop_optimize
  || (!flag_tree_loop_vectorize
  && (global_options_set.x_flag_tree_loop_vectorize
  || global_options_set.x_flag_tree_vectorize)))
return 1;

Not sure what that was supposed to test.  Jakub?  There is a similar
use in expand_omp_simd.

I suppose it wants to check whether _any_ function has
loop vectorization enabled (for some reason)?  And this is
only an approximation (it doesn't work for a single function
with attribute(optimize("tree-loop-vectorize"))).
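
To spell out what the quoted condition evaluates to, here is a small truth-table sketch in Python. It is not GCC code: the function name is made up, the parameters only mirror the flags in the snippet above, and the two `*_set` arguments stand in for the `global_options_set` fields.

```python
def omp_max_vf_is_one(optimize, optimize_debug, loop_optimize,
                      loop_vectorize, loop_vectorize_set, vectorize_set):
    """Mirror the early-return test of the omp_max_vf snippet quoted above.

    Hypothetical helper for illustration only; every argument is a plain
    Python value standing in for the corresponding GCC flag.
    """
    return bool(not optimize
                or optimize_debug
                or not loop_optimize
                # Loop vectorization counts as "off" only when the user
                # touched one of the two options explicitly.
                or (not loop_vectorize
                    and (loop_vectorize_set or vectorize_set)))
```

Read this way, a loop-vectorize flag that is off but was never set explicitly does not force the early return, which may be the "any function could still enable it" approximation guessed at above.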

Thanks,
Richard.

> Thanks,
> James
>
> ---
> 2017-07-27  James Greenhalgh  
>
> * tree-ssa-loop-ch.c (pass_ch::process_loop_p): Guard on
> flag_tree_loop_vectorize rather than flag_tree_vectorize.
>


Re: [PATCH] Fix segfault in gcov.c (PR gcov-profile/81561).

2017-07-27 Thread Richard Biener
On Thu, Jul 27, 2017 at 12:12 PM, Martin Liška  wrote:
> Hello.
>
> As reported in the mentioned PR, the gcov tool segfaults when one uses -a.
> It is caused by the fact that the vectors blocked and block_lists have their
> indices kept in sync, and one removes an element from blocked via:
>    blocked.erase (it);
>
> Calling the same function recursively then breaks the synchronization. The
> patch was originally written by Joshua (adding him to CC). If I'm correct,
> calling:
>
> -unblock (u, blocked, block_lists);
>
> does not make sense, as we've already removed 'u'. In addition, one needs to
> copy the content of block_lists[index] to a separate vector in order not to
> break the iteration.
>
> The patch bootstraps on ppc64le-redhat-linux and survives regression tests.
> It also fixes the problem reported in the openSUSE bugzilla (mentioned in the
> GCC bugzilla PR).
>
> Ready to be installed?

Looks good to me but please wait for Joshua to confirm.

Did you manage to extract a testcase?

Thanks,
Richard.

> Martin
>
>
> gcc/ChangeLog:
>
> 2017-07-26  Martin Liska  
>
> PR gcov-profile/81561
> * gcov.c (unblock): Make unblocking safe as we need to preserve
> index correspondence of blocks and block_lists.
> ---
>  gcc/gcov.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
>
>


Re: [PATCH] Bound partial-inlining-entry-probability param (PR ipa/80663).

2017-07-27 Thread Jan Hubicka
> On 05/26/2017 10:54 AM, Richard Biener wrote:
> > On Thu, May 25, 2017 at 12:00 PM, Martin Liška  wrote:
> >> Hello.
> >>
> >> Having a value of the parameter partial-inlining-entry-probability bigger
> >> than 100 does not make sense and can only be used to artificially trigger
> >> partial inlining.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> > 
> > Ok.
> 
> Hello.
> 
> In order to fix PR81576, may I install the same patch to active branches?

OK,
Honza
> 
> Thanks,
> Martin
> 
> > 
> >> Martin


[Patch (preapproved)] Guard Copy Header pass on flag_tree_loop_vectorize

2017-07-27 Thread James Greenhalgh

Hi,

While answering a user question on the equivalence of
-ftree-loop-vectorize + -ftree-slp-vectorize and -ftree-vectorize I
spotted one case which broke the equivalence. pass_ch::process_loop_p
was guarded on flag_tree_vectorize, meaning you would get it for
-ftree-vectorize, but not for -ftree-loop-vectorize/-ftree-slp-vectorize.

This patch fixes that, getting rid of the only use of flag_tree_vectorize
in the code base.

This was preapproved on IRC:

   binche01: Should the first check in
gcc/tree-ssa-loop-ch.c :: pass_ch_vect::process_loop_p  really be on
!flag_tree_vectorize ? That seems to go against the documentation that
-ftree-vectorize is equivalent to -ftree-loop-vectorize
-ftree-slp-vectorize
   never noticed the condition.  any trouble caused?
   None that I know of, I was trying to answer a user
question of whether the flags were really equivalent, and spotted
that while grepping to confirm it
   don't know if header copy can enables slp with
-fno-tree-loop-vectorize.  richi may have the answer.  maybe you
can change it to flag_tree_loop_* see if there is breakage.
   jgreenhalgh: we should remove flag_tree_vectorize
   jgreenhalgh: patch pre-approved and change the CH flag check
to flag_tree_loop_vectorize

Committed as r250619 after a successful bootstrap and test run on
aarch64-none-linux-gnu.

I'm not sure what was meant by "remove flag_tree_vectorize" - the command line
option seems a bit too popular to deprecate it, and the options framework
doesn't like the idea of one option as an Alias of two others. So I've
left it in place pending further instructions.

Thanks,
James

---
2017-07-27  James Greenhalgh  

* tree-ssa-loop-ch.c (pass_ch::process_loop_p): Guard on
flag_tree_loop_vectorize rather than flag_tree_vectorize.

diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 86be34a..14cc6d8d 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -436,7 +436,7 @@ pass_ch::process_loop_p (struct loop *loop)
 bool
 pass_ch_vect::process_loop_p (struct loop *loop)
 {
-  if (!flag_tree_vectorize && !loop->force_vectorize)
+  if (!flag_tree_loop_vectorize && !loop->force_vectorize)
 return false;
 
   if (loop->dont_vectorize)


Re: [PATCH] Switch vec_init and vec_extract optabs to 2 mode optab to allow extraction of vector from vector or initialization of vector from smaller vectors (PR target/80846)

2017-07-27 Thread Segher Boessenkool
On Tue, Jul 25, 2017 at 11:14:32AM +0200, Jakub Jelinek wrote:
> The following patch adjusts the vec_init and vec_extract optabs, so that
> they don't have in the expander names just the vector mode, but also another
> mode, for vec_extract the mode of the result and for vec_init the mode of
> the elts of the vector passed as second operand.

> Ok for trunk?

I failed to say this explicitly yet: okay for rs6000.


Segher


[PATCH] [RISCV] Add RTEMS support

2017-07-27 Thread Sebastian Huber
gcc/
* config.gcc (riscv*-*-elf*): Add (riscv*-*-rtems*).
* config/riscv/rtems.h: New file.
---
 gcc/config.gcc   |  7 ++-
 gcc/config/riscv/rtems.h | 31 +++
 2 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 gcc/config/riscv/rtems.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index aab7f65c1df..f28164646c3 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2040,7 +2040,7 @@ riscv*-*-linux*)
# automatically detect that GAS supports it, yet we require it.
gcc_cv_initfini_array=yes
;;
-riscv*-*-elf*)
+riscv*-*-elf* | riscv*-*-rtems*)
tm_file="elfos.h newlib-stdint.h ${tm_file} riscv/elf.h"
case "x${enable_multilib}" in
xno) ;;
@@ -2053,6 +2053,11 @@ riscv*-*-elf*)
# Force .init_array support.  The configure script cannot always
# automatically detect that GAS supports it, yet we require it.
gcc_cv_initfini_array=yes
+   case ${target} in
+   riscv*-*-rtems*)
+ tm_file="${tm_file} rtems.h riscv/rtems.h"
+ ;;
+   esac
;;
 mips*-*-netbsd*)   # NetBSD/mips, either endian.
target_cpu_default="MASK_ABICALLS"
diff --git a/gcc/config/riscv/rtems.h b/gcc/config/riscv/rtems.h
new file mode 100644
index 000..221e2f69815
--- /dev/null
+++ b/gcc/config/riscv/rtems.h
@@ -0,0 +1,31 @@
+/* Definitions for RISC-V RTEMS systems with ELF format.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#undef TARGET_OS_CPP_BUILTINS
+#define TARGET_OS_CPP_BUILTINS()   \
+do {   \
+   builtin_define ("__rtems__");   \
+   builtin_define ("__USE_INIT_FINI__");   \
+   builtin_assert ("system=rtems");\
+} while (0)
-- 
2.12.3



[Committed] S/390: Fix PR81534

2017-07-27 Thread Andreas Krebbel
The HI/QI atomic_fetch_" expander accepted symbolic
references and emitted CAS patterns whose insn predicates rejected them.

Fixed by allowing symbolic references there as well.  Reload will get
rid of them due to the constraint letter.

Regression tested on s390x.

Committed to mainline and GCC 7 branch.

gcc/ChangeLog:

2017-07-27  Andreas Krebbel  

PR target/81534
* config/s390/s390.md ("*atomic_compare_and_swap_1")
("*atomic_compare_and_swapdi_2", "*atomic_compare_and_swapsi_3"):
Change s_operand to memory_operand.

gcc/testsuite/ChangeLog:

2017-07-27  Andreas Krebbel  

PR target/81534
* gcc.target/s390/pr81534.c: New test.
---
 gcc/config/s390/s390.md |  6 +++---
 gcc/testsuite/gcc.target/s390/pr81534.c | 17 +
 2 files changed, 20 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr81534.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 0eef9b1..d1ac0b8 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -10266,7 +10266,7 @@
 ; cdsg, csg
 (define_insn "*atomic_compare_and_swap_1"
   [(set (match_operand:TDI 0 "register_operand" "=r")
-   (match_operand:TDI 1 "s_operand" "+S"))
+   (match_operand:TDI 1 "memory_operand" "+S"))
(set (match_dup 1)
(unspec_volatile:TDI
  [(match_dup 1)
@@ -10284,7 +10284,7 @@
 ; cds, cdsy
 (define_insn "*atomic_compare_and_swapdi_2"
   [(set (match_operand:DI 0 "register_operand" "=r,r")
-   (match_operand:DI 1 "s_operand" "+Q,S"))
+   (match_operand:DI 1 "memory_operand" "+Q,S"))
(set (match_dup 1)
(unspec_volatile:DI
  [(match_dup 1)
@@ -10305,7 +10305,7 @@
 ; cs, csy
 (define_insn "*atomic_compare_and_swapsi_3"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
-   (match_operand:SI 1 "s_operand" "+Q,S"))
+   (match_operand:SI 1 "memory_operand" "+Q,S"))
(set (match_dup 1)
(unspec_volatile:SI
  [(match_dup 1)
diff --git a/gcc/testsuite/gcc.target/s390/pr81534.c b/gcc/testsuite/gcc.target/s390/pr81534.c
new file mode 100644
index 000..0e1bd99
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/pr81534.c
@@ -0,0 +1,17 @@
+/* PR81534 This testcase used to fail because the HI/QI
+   "atomic_fetch_" expander accepted symbolic references
+   and emitted CAS patterns whose insn definition rejected them.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=zEC12" } */
+
+struct {
+  short b;
+  long c;
+} a = {};
+
+void
+d ()
+{
+  __atomic_fetch_add(&a.b, 0, 5);
+}
-- 
2.9.1



Re: [PATCH] Bound partial-inlining-entry-probability param (PR ipa/80663).

2017-07-27 Thread Martin Liška
On 05/26/2017 10:54 AM, Richard Biener wrote:
> On Thu, May 25, 2017 at 12:00 PM, Martin Liška  wrote:
>> Hello.
>>
>> Having a value of the parameter partial-inlining-entry-probability bigger
>> than 100 does not make sense and can only be used to artificially trigger
>> partial inlining.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> 
> Ok.

Hello.

In order to fix PR81576, may I install the same patch to active branches?

Thanks,
Martin

> 
>> Martin



[PATCH] Fix segfault in gcov.c (PR gcov-profile/81561).

2017-07-27 Thread Martin Liška
Hello.

As reported in the mentioned PR, the gcov tool segfaults when one uses -a.
It is caused by the fact that the vectors blocked and block_lists have their
indices kept in sync, and one removes an element from blocked via:
   blocked.erase (it);

Calling the same function recursively then breaks the synchronization. The
patch was originally written by Joshua (adding him to CC). If I'm correct,
calling:

-unblock (u, blocked, block_lists);

does not make sense, as we've already removed 'u'. In addition, one needs to
copy the content of block_lists[index] to a separate vector in order not to
break the iteration.
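
The idea can be sketched outside gcov in a few lines of Python. This is only an illustration under assumptions: plain lists stand in for the C++ vectors, and the function is a toy model of the fixed logic, not the gcov.c code itself.

```python
def unblock(u, blocked, block_lists):
    """Remove u from blocked and recursively unblock what it was blocking.

    blocked and block_lists are parallel lists: block_lists[i] holds the
    blocks that are waiting on blocked[i].  Both entries are removed before
    recursing, so the two lists always keep the same length.
    """
    try:
        index = blocked.index(u)
    except ValueError:
        return  # u is not blocked (any more), nothing to do

    del blocked[index]

    # Copy the dependants first: the recursive calls below may delete other
    # entries of block_lists, which would invalidate an in-place iteration.
    to_unblock = list(block_lists[index])
    del block_lists[index]

    for w in to_unblock:
        unblock(w, blocked, block_lists)
```

With blocked = ['a', 'b', 'c'] and block_lists = [['b'], ['c'], []], calling unblock('a', blocked, block_lists) empties both lists, and at no point do their lengths differ.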

The patch bootstraps on ppc64le-redhat-linux and survives regression tests.
It also fixes the problem reported in the openSUSE bugzilla (mentioned in the
GCC bugzilla PR).

Ready to be installed?
Martin


gcc/ChangeLog:

2017-07-26  Martin Liska  

PR gcov-profile/81561
* gcov.c (unblock): Make unblocking safe as we need to preserve
index correspondence of blocks and block_lists.
---
 gcc/gcov.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)



From a31295c91c57fd3338e47eba1f513fcb1c37d8d2 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 26 Jul 2017 14:21:52 +0200
Subject: [PATCH] Fix segfault in gcov.c (PR gcov-profile/81561).

gcc/ChangeLog:

2017-07-26  Martin Liska  

	PR gcov-profile/81561
	* gcov.c (unblock): Make unblocking safe as we need to preserve
	index correspondence of blocks and block_lists.
---
 gcc/gcov.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index e324cadad82..c56bac20278 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -539,13 +539,13 @@ unblock (const block_t *u, block_vector_t &blocked,
   unsigned index = it - blocked.begin ();
   blocked.erase (it);
 
-  for (block_vector_t::iterator it2 = block_lists[index].begin ();
-   it2 != block_lists[index].end (); it2++)
-unblock (*it2, blocked, block_lists);
-  for (unsigned j = 0; j < block_lists[index].size (); j++)
-unblock (u, blocked, block_lists);
+  block_vector_t to_unblock (block_lists[index]);
 
   block_lists.erase (block_lists.begin () + index);
+
+  for (block_vector_t::iterator it = to_unblock.begin ();
+   it != to_unblock.end (); it++)
+unblock (*it, blocked, block_lists);
 }
 
 /* Find circuit going to block V, PATH is provisional seen cycle.
-- 
2.13.3



Re: [PATCH 1/2] Introduce testsuite support to run Python tests

2017-07-27 Thread Pierre-Marie de Rodat

On 07/27/2017 10:50 AM, Matthias Klose wrote:

> you are unconditionally hard coding python as the interpreter, which on most
> distributions points to 2.7.  Please check python3 as well and make that the
> preferred interpreter if available. python 2.7 is now EOL'd for 2020.


Understood, thank you. I’ll do this for the next patchset I’ll submit.
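
For illustration, the lookup order being asked for could look like the sketch below. It is written in Python only to show the preference; the real check would live in the Tcl library, and the helper name and its parameters are hypothetical.

```python
import shutil


def pick_python(candidates=("python3", "python"), which=shutil.which):
    """Return the path of the first interpreter found on PATH, or None.

    candidates is tried in order, so python3 wins when both exist.  The
    which parameter is only there so the lookup can be faked in tests.
    """
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None
```

Falling back to plain "python" keeps systems that only ship Python 2 working.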

--
Pierre-Marie de Rodat


Re: [PATCH 2/2] Introduce Python testcases to check DWARF output

2017-07-27 Thread Pierre-Marie de Rodat

On 07/27/2017 10:36 AM, Richard Biener wrote:

> Given that gdb can decode dwarf and we rely on gdb for guality and
> gdb has python scripting can we somehow walk its dwarf tree from
> within a python script?  That is, not need the dwarf decoding or
> objdump requirement?


I’m quite familiar with GDB’s Python scripting API and unfortunately, 
no, it does not provide any access to raw debugging information: 
. All we have 
is access to ~source-level entities such as variables, functions and 
types (and “objfiles” themselves, but we can’t do anything interesting 
with them), so there is no other way than testing dynamic behavior, 
i.e. checking that variables are properly read/decoded, etc. which is 
what we already do in guality tests.



> On IRC I suggested to use pre-existing python DWARF decoders
> which we might be able to import into the tree.  We'd still need them
> to handle non-ELF object formats or somehow extract DWARF from
> other containers to an ELF file (objcopy to the rescue...).
>
> That said, not needing to write a DWARF / object file decoder
> would be nice.


Yes. On IRC, I mentioned pyelftools 
(https://github.com/eliben/pyelftools/), which knows about ELF and 
DWARF, and that, I think, we could plug on some PE/XCOFF/… extractor to 
parse embedded DWARF. In any case, I feel it would not be simpler than 
what I sent. Of course I’m still open to suggestions. :-)



> I see your testcases have associated .py files.  There are a few
> existing "simple" dwarf testcases that would benefit from being
> able to embed matching into the testcase source file itself?  Thus
> have TCL autogenerate a .py file for the testing from, say
>
> /* { dg-final { scan-dwarf { "Matcher('DW_TAG_member', 'i',
>    attrs={'DW_AT_type': Capture('s0_i_type')})" } } } */
>
> do you think that's feasible or doesn't it make much sense because
> it would essentially match anywhere?  Or we'd end up with a
> gazillion of scan-dwarf variants?


I think this is a good idea! If it is technically possible to have such 
multi-line statements in comments, I think this would be easy. I’ll 
prepare the engine for the next patchset version and I’ll try to find 
existing tests that could be re-written this way. As long as the pattern 
isn’t too generic, I think it would make sense: for instance if the 
input source has only one structure field called “i”, then the above 
pattern will make it possible to match its type precisely.
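
To make the "not too generic" point concrete, here is a toy model of such a pattern. It is a sketch under assumptions, not the dwarfutils code from the patch: only the names Matcher and Capture come from the example above, while the DIE representation (a dict with "tag" and "attrs") and the find_unique helper are invented for illustration.

```python
class Capture(object):
    """Placeholder that records the attribute value it is matched against."""

    def __init__(self, name):
        self.name = name


class Matcher(object):
    """Toy pattern: match a DIE on its tag, its name and some attributes."""

    def __init__(self, tag, name=None, attrs=None):
        self.tag = tag
        self.name = name
        self.attrs = attrs or {}

    def match(self, die):
        """Return a dict of captured values, or None if die does not match."""
        if die["tag"] != self.tag:
            return None
        die_attrs = die["attrs"]
        if self.name is not None and die_attrs.get("DW_AT_name") != self.name:
            return None
        captures = {}
        for key, expected in self.attrs.items():
            if key not in die_attrs:
                return None
            if isinstance(expected, Capture):
                captures[expected.name] = die_attrs[key]
            elif die_attrs[key] != expected:
                return None
        return captures


def find_unique(matcher, dies):
    """Return the captures of the single matching DIE, else raise ValueError."""
    hits = [c for c in (matcher.match(d) for d in dies) if c is not None]
    if len(hits) != 1:
        raise ValueError("expected exactly one match, got %d" % len(hits))
    return hits[0]
```

Requiring exactly one match is what keeps a pattern embedded in a comment from silently matching "anywhere": a second member called "i" turns the test into an error instead of a false pass.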



> I think a separate .py for checking is required anyway for the more
> complex cases.

I think so as well, for instance for the tests I sent so far.

--
Pierre-Marie de Rodat


Re: [PATCH] Move static chain and non-local goto init after NOTE_INSN_FUNCTION_BEG (PR sanitize/81186).

2017-07-27 Thread Martin Liška
On 07/25/2017 02:49 PM, Jakub Jelinek wrote:
> On Tue, Jul 18, 2017 at 10:38:50AM +0200, Martin Liška wrote:
>> 2017-06-27  Martin Liska  
>>
>> PR sanitize/81186
> 
> 8 spaces instead of tab?
> 
>>  * function.c (expand_function_start): Set parm_birth_insn after
>>  static chain is initialized.
> 
> I don't like this description, after all, parm_birth_insn was set
> after static chain initialization before too (just not right after it
> in some cases).  The important change is that you've moved parm_birth_insn
> before the nonlocal_goto_save_area setup code, so IMHO the ChangeLog entry
> should say that.

Both notes fixed and the patch has been also installed to GCC 7 branch.
Thanks,
Martin

> 
> As for the patch itself, there are many spots which insert some code
> before or after parm_birth_insn or spots tied to the NOTE_INSN_FUNCTION_BEG
> note, but I'd hope nothing inserted there can actually call functions that
> perform non-local gotos, so I think the patch is fine.  And for debug info
> experience which is also related to NOTE_INSN_FUNCTION_BEG, I think the nl
> goto save area is nothing that can be seen in the debugger unless you know
> where it is, so the only change might be if you put a breakpoint on the end
> of prologue (i.e. NOTE_INSN_FUNCTION_BEG) and call from inferior some
> function that performs a non-local goto.  I think there are no barriers
> on that initialization anyway, so scheduler can move it around.
> 
> Thus, ok for trunk/7.2 with the above suggested ChangeLog change.
> 
>   Jakub
> 



Re: [PATCH 0/2] Python testcases to check DWARF output

2017-07-27 Thread Pierre-Marie de Rodat

Thank you for your feedback.

On 07/27/2017 09:52 AM, Richard Biener wrote:

I'm fine with the direction if a reviewer wants to go in that
direction.  I wish python didn't have a built-in speed penalty,
that's the only downside I don't like about it.  Aside from that,
even switching all of the testsuite to be python based isn't a
terrible idea.


But is it worse than TCL?


Good point. Actually for Python there are ways to make it faster. If we 
can somehow manage to have a limited set of Python interpreter instances 
(instead of one per test), we could use pypy, which I heard is very good 
at making long-running instances fast.


As to switch all of the testsuite to Python, I don’t have an educated 
opinion on this. I just want to say that, here I’m using Python to 
pattern match DIEs, but if needed we could perfectly use it to do other 
complex tasks. This is why I kept the DWARF-specific stuff 
(gcc-dwarf.exp and the dwarfutils Python package, from second commit) 
separate from just Python interpreter handling (gcc-python.exp, from 
first commit).


Note that having a Python only testsuite would make it easier to have 
only one Python instance for all the testsuite run, so it would, in 
theory, make it easier to get a fast execution.


… now, I’m not familiar with DejaGNU but I have the feeling that it does 
a lot with respect to the handling of a great variety of 
targets/remote/etc. combinations. Re-writing it (and making sure it 
works!) sounds like a huuuge task. I’ll let experts in this area 
comment. :-)


--
Pierre-Marie de Rodat


Re: [PATCH 2/2] Introduce Python testcases to check DWARF output

2017-07-27 Thread Pierre-Marie de Rodat

On 07/26/2017 07:09 PM, David Malcolm wrote:

+If `single_cu` is True, make sure there is exactly one
compilation unit and


"is True" -> "is true"


Fixed.


+:param bool or_error: When True, if `single` is True and no
attribute


"True" -> "true" in two places


Fixed.


+:param None|(DIE) -> bool predicate: If provided, function
that filters
+out DIEs when it returns False.


You did not suggest it, but I replaced “False” with “false” to be 
consistent. ;-)



+:param bool single: If True, look for a single DIE and raise
a


"True" -> "true", I suppose


Fixed.


+If left to None, the match succeded. Otherwise, must be set



"succeded" -> "succeeded"


Fixed.


+This is valid iff the match succeded.


here again.


Likewise.


+In Python 2, just return the input. In Python 3, decode the
input as ASCII.
+"""
+return (str_or_byte if sys.version_info.major < 3 else
+str_or_byte.decode('ascii'))


Aha!  Python 2 and Python 3.


Presumably this all runs with LANG=C so that there's no danger of any
non-ASCII bytes?  (bytes.decode('ascii') will raise a UnicodeDecodeError
if any byte >=128).


I’m not sure about the interaction with the locale. What I thought was: 
I’ve never seen non-ASCII strings in DWARF, nor in objdump’s output. I 
know it’s theoretically possible: if that happens in the future (e.g. some 
language allows non-ASCII identifiers and yields non-ASCII names in 
DWARF), we’ll only have this function to fix.
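
As a minimal sketch of what is at stake (Python 3 only; the helper name and its errors parameter are hypothetical, not what the patch defines): strict ASCII decoding raises on the first byte >= 128, while an error handler such as "replace" keeps going at the cost of losing the original bytes.

```python
def to_text(raw, errors="strict"):
    """Decode bytes read from a tool's output as ASCII (Python 3 sketch).

    errors is passed straight to bytes.decode: "strict" raises
    UnicodeDecodeError on any byte >= 128, "replace" substitutes U+FFFD.
    """
    return raw.decode("ascii", errors)


if __name__ == "__main__":
    print(to_text(b"DW_AT_name"))
    try:
        to_text(b"caf\xc3\xa9")
    except UnicodeDecodeError as exc:
        print("strict decoding failed:", exc.reason)
    print(repr(to_text(b"caf\xc3\xa9", errors="replace")))
```

If non-ASCII names ever show up, switching the codec or the error handler in this one place would be the fix alluded to above.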



There's a fair amount of non-trivial parsing going on here.
I wonder if it would be helpful to add a "unittest" suite for the
parsing?
(e.g. to have some precanned fragments of objdump output as strings,
and to verify that they're parsed as expected).

Note that I'm not a reviewer for the testsuite, so this is just a
suggestion.


That’s a good idea. Actually I think it will be very easy to write such 
tests *and* to assess Python code coverage for them. I’ll do this if 
this proposal is on track to be accepted.
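
Such a test could be as small as the sketch below. Everything in it is an assumption made for illustration: parse_die_header is a hypothetical helper rather than the parser from the patch, and the sample text was written by hand to resemble `objdump --dwarf=info` output, not captured from a real run.

```python
import re
import unittest

# Hypothetical helper: recognise a DIE header line such as
#  <1><2d>: Abbrev Number: 2 (DW_TAG_base_type)
DIE_HEADER = re.compile(
    r"^\s*<(?P<depth>\d+)><(?P<offset>[0-9a-f]+)>: "
    r"Abbrev Number: (?P<abbrev>\d+) \((?P<tag>DW_TAG_\w+)\)\s*$")


def parse_die_header(line):
    """Return (depth, offset, abbrev, tag) for a DIE header line, else None."""
    m = DIE_HEADER.match(line)
    if m is None:
        return None
    return (int(m.group("depth")), int(m.group("offset"), 16),
            int(m.group("abbrev")), m.group("tag"))


# Hand-written fragment in the style of objdump's DWARF dump.
SAMPLE = """\
 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <c>   DW_AT_name        : foo.c
 <1><2d>: Abbrev Number: 2 (DW_TAG_base_type)
"""


class ParseDieHeaderTest(unittest.TestCase):
    def test_headers(self):
        parsed = [parse_die_header(line) for line in SAMPLE.splitlines()]
        self.assertEqual(parsed, [
            (0, 0xb, 1, "DW_TAG_compile_unit"),
            None,  # attribute lines are not DIE headers
            (1, 0x2d, 2, "DW_TAG_base_type"),
        ])
```

Saved as a module, it can be run with `python -m unittest`, and a coverage tool can then report which parsing branches the canned fragments exercise.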



Hope this is constructive


It totally was: thank you very much!

--
Pierre-Marie de Rodat


Re: [PATCH 1/2] Introduce testsuite support to run Python tests

2017-07-27 Thread Matthias Klose
you are unconditionally hard coding python as the interpreter, which on most
distributions points to 2.7.  Please check python3 as well and make that the
preferred interpreter if available. python 2.7 is now EOL'd for 2020.

Matthias

On 26.07.2017 18:00, Pierre-Marie de Rodat wrote:
> gcc/testsuite/
> 
>   * lib/gcc-python.exp: New test library.
>   * python/testutils.py: New Python helper.
> ---
>  gcc/testsuite/lib/gcc-python.exp  | 95 
> +++
>  gcc/testsuite/python/testutils.py | 45 +++
>  2 files changed, 140 insertions(+)
>  create mode 100644 gcc/testsuite/lib/gcc-python.exp
>  create mode 100644 gcc/testsuite/python/testutils.py
> 
> diff --git a/gcc/testsuite/lib/gcc-python.exp b/gcc/testsuite/lib/gcc-python.exp
> new file mode 100644
> index 000..30cf74a87ac
> --- /dev/null
> +++ b/gcc/testsuite/lib/gcc-python.exp
> @@ -0,0 +1,95 @@
> +# Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +# 
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +# 
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# .
> +
> +# Helpers to run a Python interpreter
> +
> +load_lib "remote.exp"
> +
> +# Return whether a working Python interpreter is available.
> +
> +proc check-python-available { args } {
> +set result [local_exec "python -c print(\"Hello\")" "" "" 300]
> +
> +set status [lindex $result 0]
> +set output [string trim [lindex $result 1]]
> +
> +if { $status != 0 || $output != "Hello" } {
> + return 0
> +} else {
> + return 1
> +}
> +}
> +
> +# Run the SCRIPT_PY Python script. Add one PASSing (FAILing) test per output
> +# line that starts with "PASS: " ("FAIL: "). Also fail for any other output
> +# line and for non-zero exit status code.
> +#
> +# The Python script can access Python modules and packages in the
> +# $srcdir/python directory.
> +
> +proc python-test { script_py } {
> +global srcdir
> +
> +set testname testname-for-summary
> +
> +# This assumes that we are three frames down from dg-test, and that
> +# it still stores the filename of the testcase in a local variable "name".
> +# A cleaner solution would require a new DejaGnu release.
> +upvar 2 prog src_file
> +
> +set asm_file "[file rootname [file tail $src_file]].o"
> +set script_py_path "[file dirname $src_file]/$script_py"
> +
> +set old_pythonpath [getenv "PYTHONPATH"]
> +set support_dir "$srcdir/python"
> +if { $old_pythonpath == "" } {
> +setenv "PYTHONPATH" $support_dir
> +} else {
> +setenv "PYTHONPATH" "$support_dir:$old_pythonpath"
> +}
> +
> +set commandline "python $script_py_path $asm_file"
> +set timeout 300
> +
> +verbose -log "Executing: $commandline (timeout = $timeout)" 2
> +set result [local_exec $commandline "" "" $timeout]
> +
> +set status [lindex $result 0]
> +set output [lindex $result 1]
> +
> +if { $status != 0 } {
> + fail [concat "$testname: $script_py stopped with non-zero status" \
> +  " code ($status)"]
> +}
> +
> +foreach line [split $output "\n"] {
> +if { $line == "" } {
> +continue
> +}
> +if { [regexp "^PASS: (.*)" $line dummy message] } {
> +pass "$testname/$script_py: $message"
> +continue
> +}
> +if { [regexp "^FAIL: (.*)" $line dummy message] } {
> +fail "$testname/$script_py: $message"
> +continue
> +}
> +
> +fail "$testname/$script_py: spurious output: $line"
> +}
> +
> +setenv "PYTHONPATH" $old_pythonpath
> +}
> diff --git a/gcc/testsuite/python/testutils.py b/gcc/testsuite/python/testutils.py
> new file mode 100644
> index 000..503105ad9d0
> --- /dev/null
> +++ b/gcc/testsuite/python/testutils.py
> @@ -0,0 +1,45 @@
> +# Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more

Re: [PATCH] PR libstdc++/53984 handle exceptions in basic_istream::sentry

2017-07-27 Thread Bin.Cheng
On Wed, Jul 26, 2017 at 11:06 PM, Jonathan Wakely  wrote:
> On 26/07/17 20:14 +0200, Paolo Carlini wrote:
>>
>> Hi again,
>>
>> On 26/07/2017 16:27, Paolo Carlini wrote:
>>>
>>> Hi,
>>>
>>> On 26/07/2017 16:21, Andreas Schwab wrote:

 ERROR: 27_io/basic_fstream/53984.cc: unknown dg option:
 dg-require-file-io 18 {} for " dg-require-file-io 18 "" "
>>>
>>> Should be already fixed, a trivial typo.
>>
>> ... but now the new test simply fails for me. If I don't spot something
>> else trivial over the next few hours I guess better waiting for Jon to look
>> into that.
>
>
> Sorry about that, I must have only checked for FAILs and missed the
> ERRORs.
>
> It should have been an ifstream not fstream, otherwise the filebuf
> can't even open the file. Fixed like so, committed to trunk.
Hi, I have seen below failure on aarch64/arm linux/elf:
spawn [open ...]
/tmp/.../src/gcc/libstdc++-v3/testsuite/27_io/basic_fstream/53984.cc:29:
void test01(): Assertion 'in.bad()' failed.
FAIL: 27_io/basic_fstream/53984.cc execution test
extra_tool_flags are:
  -include bits/stdc++.h

Thanks,
bin
>
>


Re: [PATCH 1/2] Introduce testsuite support to run Python tests

2017-07-27 Thread Pierre-Marie de Rodat

On 07/26/2017 06:48 PM, David Malcolm wrote:

> IIRC RHEL 6 has Python 2.6 as its /usr/bin/python (but Python 2.7 is
> available as a "software collection" add-on).
>
> I don't know if gcc as a project would want to support 2.6+ or simply
> 2.7 for Python 2.


I don’t know either: let’s wait for further feedback, then. If needed 
I’ll turn all the .format into % operations.
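
For reference, a small sketch of the two spellings (the function names and message text are made up for illustration). As far as I know, str.format itself exists in Python 2.6, but the auto-numbered "{}" fields need 2.7 or 3.1+, so explicit indices would be a middle ground if rewriting everything with % turns out to be too noisy.

```python
def describe_format(tag, name):
    # Explicit field indices: accepted by Python 2.6, 2.7 and 3.x.
    # ('{}: {}' without indices would need Python 2.7 or 3.1+.)
    return "{0}: {1}".format(tag, name)


def describe_percent(tag, name):
    # %-formatting works on every Python 2 and Python 3 version.
    return "%s: %s" % (tag, name)
```

Both functions return the same string for the same arguments, so the switch would be mechanical.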


--
Pierre-Marie de Rodat


Ping! [Patch, fortran] PR34640 - ICE when assigning item of a derived-component to a pointer

2017-07-27 Thread Paul Richard Thomas
Hi Jerry,

I apologise for the long delay in replying to you. I was on vacation
in a location that excluded all but the most meagre emails :-) In
addition, my workstation has been in the repair shop for the last
couple of weeks and is now the subject of a warranty claim.

The span field is added in the middle of the descriptor because the
caf token field makes the descriptor variable length. This is
reflected in the change in libgfortran.h.

It has crossed my mind that I should add some more fields, by
expanding the dtype field, which would then allow us to bump up the
maximum number of dimensions for example.

I have also fixed the ASSOCIATE problem that was reported in the
thread on clf that Thomas started.

However, I would appreciate somebody reviewing what I already posted.
Damian has had problems applying the patch. I suggest that the -l
option be applied, since my editor fixes whitespace problems on the
fly.

Best regards

Paul


On 11 July 2017 at 15:48, Jerry DeLisle  wrote:
> On 07/11/2017 07:23 AM, Paul Richard Thomas wrote:
>> Well, a bit earlier than anticipated, here is the final version that
>> puts right all the wrinkles that I know about.
>>
>> Bootstraps and regtests - OK for trunk?
>>
>> Paul
>
> Somewhere in the threads on this, there was mentioned ABI breakage/change.
>
> Does it really do this? If the significant change is in the descriptor and you
> just added the span on the end of the structure, I am not convinced this is an
> issue. (I have not studied the patch at all, I would rather not bump library
> version)
>
> Jerry



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: [PATCH 00/19] cleanup of memory stats prototypes

2017-07-27 Thread Richard Biener
On Thu, Jul 27, 2017 at 10:30 AM,   wrote:
> From: Trevor Saunders 
>
> The pre-C++ way of passing information about the call site of a function was to
> use a macro that passed __FILE__, __LINE__, and __FUNCTION__ to a function with
> the same name with _stat appended to it.  The way this is now done with C++ is
> to have arguments whose default values are __LINE__, __FILE__, and
> __FUNCTION__ in the caller.  This has the significant advantage that if you
> look for "^function (" you find the correct function, whereas in the C way of
> doing things you need to realize it's a macro and check the definition of the
> macro to see what to look for next.  So this removes a layer of indirection,
> and makes things somewhat more consistent in using the C++ way of doing things.
>
> patches independently bootstrapped and regtested on ppc64le-linux-gnu.  I
> successfully ran make all-gcc with --enable-gather-detailed-mem-stats, but
> couldn't complete a bootstrap before the series was applied, because the
> ddrs_table in tree-loop-distribution.c causes memory statistics gathering to 
> crash before the series as well as after it.  ok?
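
To make the idea in the cover letter concrete, here is a minimal sketch of
call-site capture through default arguments.  It is not taken from the
series: make_node_sketch, caller_sketch and call_site are hypothetical names.
It assumes the GCC builtins __builtin_FILE, __builtin_LINE and
__builtin_FUNCTION (Clang provides them too), which are evaluated at the call
site when used as default arguments.  A plain __LINE__ in a default argument
would expand where the declaration is written, not in the caller.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical record of where a function was called from.
struct call_site
{
  const char *file;
  int line;
  const char *function;
};

// Hypothetical stand-in for an allocator such as make_node.  The three
// trailing parameters are never passed explicitly; their defaults are
// GCC/Clang builtins that are evaluated in the caller.
inline call_site
make_node_sketch (int code,
		  const char *file = __builtin_FILE (),
		  int line = __builtin_LINE (),
		  const char *function = __builtin_FUNCTION ())
{
  (void) code;  // A real allocator would use CODE; the sketch only records.
  return call_site { file, line, function };
}

// An ordinary caller: no macro, no _stat suffix, just a function call.
inline call_site
caller_sketch ()
{
  return make_node_sketch (42);
}

// Self-check: the recorded function name is the caller's, not the callee's.
inline void
check_call_site_sketch ()
{
  call_site s = caller_sketch ();
  assert (std::strcmp (s.function, "caller_sketch") == 0);
}
```

With this shape, searching for "^make_node_sketch (" finds the real
definition directly, which is the advantage described above.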

Thanks!  This was on my list of things to do...

The series is ok.

Did you catch all of MEM_STAT_INFO/ALONE_MEM_STAT_INFO so we can remove the
non-C++ way from statistics.h?

Richard.

> thanks
>
> Trev
>
> p.s. the issue with ddrs_table is that it ends up trying to add to the memory
> stats hash table from a global constructor when that hash table hasn't yet been
> constructed.
>
>
> Trevor Saunders (19):
>   use c++ instead of make_node_stat
>   use c++ instead of _stat for copy_node_stat
>   use cxx instead of make_tree_binfo_stat
>   use c++ for make_int_cst_stat
>   use c++ instead of buildN_stat{,_loc}
>   use c++ instead of {make,grow}_tree_vec_stat
>   replace gimple_alloc_stat with c++
>   use c++ instead of build_decl_stat
>   use c++ instead of build_vl_exp_stat
>   use c++ for tree_cons_stat
>   remove unused build_var_debug_value prototype
>   use C++ for {make,build}_vector_stat
>   use c++ for build_tree_list{,_vec}_stat
>   replace rtx_alloc_stat with c++
>   replace shallow_copy_rtx_stat with c++
>   simplify the bitmap alloc_stat functions with c++
>   use c++ for bitmap_initialize
>   use c++ for gimple_build_debug_bind{,_source}
>   use c++ for fold_buildN_loc
>
>  gcc/bitmap.c  |   8 ++--
>  gcc/bitmap.h  |  17 +++-
>  gcc/cp/lex.c  |   4 +-
>  gcc/emit-rtl.c|   2 +-
>  gcc/fold-const.c  |  14 +++
>  gcc/fold-const.h  |  24 +---
>  gcc/fortran/resolve.c |   2 +-
>  gcc/gengenrtl.c   |   2 +-
>  gcc/gimple.c  |   8 ++--
>  gcc/gimple.h  |  11 ++
>  gcc/rtl.c |   4 +-
>  gcc/rtl.h |   6 +--
>  gcc/tree.c|  62 ++---
>  gcc/tree.h| 106 ++
>  14 files changed, 109 insertions(+), 161 deletions(-)
>
> --
> 2.11.0
>


Re: [PATCH 2/2] Introduce Python testcases to check DWARF output

2017-07-27 Thread Richard Biener
On Wed, Jul 26, 2017 at 6:00 PM, Pierre-Marie de Rodat
 wrote:
> For now, this supports only platforms that have an objdump available for
> the corresponding target. There are several things that would be nice to
> have in the future:
>
>   * add support for more DWARF dumping tools, such as otool on Darwin;
>
>   * have a DWARF location expression decoder, to be able to parse and
> pattern match expressions that objdump does not decode itself;
>
>   * complete the set of decoders for DIE attributes.

Just some random thoughts.

Given that gdb can decode dwarf and we rely on gdb for guality and
gdb has python scripting can we somehow walk its dwarf tree from
within a python script?  That is, not need the dwarf decoding or
objdump requirement?

On IRC I suggested to use pre-existing python DWARF decoders
which we might be able to import into the tree.  We'd still need them
to handle non-ELF object formats or somehow extract DWARF from
other containers to an ELF file (objcopy to the rescue...).

That said, not needing to write a DWARF / object file decoder
would be nice.

I see your testcases have associated .py files.  There are a few
existing "simple" dwarf testcases that would benefit from being
able to embed matching into the testcase source file itself?  Thus
have TCL autogenerate a .py file for the testing from, say

/* { dg-final { scan-dwarf { "Matcher('DW_TAG_member', 'i',
  attrs={'DW_AT_type': Capture('s0_i_type')})" } } } */

do you think that's feasible or doesn't it make much sense because
it would essentially match anywhere?  Or we'd end up with a
gazillion of scan-dwarf variants?

I think a separate .py for checking is required anyway for the more
complex cases.

> gcc/testsuite/
>
> * lib/gcc-dwarf.exp: New helper files.
> * python/dwarfutils/__init__.py,
> python/dwarfutils/data.py,
> python/dwarfutils/helpers.py,
> python/dwarfutils/objdump.py: New Python helpers.
> * gcc.dg/debug/dwarf2-py/dwarf2-py.exp,
> gnat.dg/dwarf/dwarf.exp: New test drivers.
> * gcc.dg/debug/dwarf2-py/sso.c,
> gcc.dg/debug/dwarf2-py/sso.py,
> gcc.dg/debug/dwarf2-py/var2.c,
> gcc.dg/debug/dwarf2-py/var2.py,
> gnat.dg/dwarf/debug9.adb,
> gnat.dg/dwarf/debug9.py,
> gnat.dg/dwarf/debug11.adb,
> gnat.dg/dwarf/debug11.py,
> gnat.dg/dwarf/debug12.adb,
> gnat.dg/dwarf/debug12.ads,
> gnat.dg/dwarf/debug12.py: New tests.
> ---
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp |  52 ++
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c |  19 +
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py|  52 ++
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c|  13 +
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py   |  11 +
>  gcc/testsuite/gnat.dg/dg.exp   |   1 +
>  gcc/testsuite/gnat.dg/dwarf/debug11.adb|  19 +
>  gcc/testsuite/gnat.dg/dwarf/debug11.py |  51 ++
>  gcc/testsuite/gnat.dg/dwarf/debug12.adb|  10 +
>  gcc/testsuite/gnat.dg/dwarf/debug12.ads|   8 +
>  gcc/testsuite/gnat.dg/dwarf/debug12.py |   9 +
>  gcc/testsuite/gnat.dg/dwarf/debug9.adb |  45 ++
>  gcc/testsuite/gnat.dg/dwarf/debug9.py  |  22 +
>  gcc/testsuite/gnat.dg/dwarf/dwarf.exp  |  39 ++
>  gcc/testsuite/lib/gcc-dwarf.exp|  41 ++
>  gcc/testsuite/python/dwarfutils/__init__.py|  70 +++
>  gcc/testsuite/python/dwarfutils/data.py| 597 +
>  gcc/testsuite/python/dwarfutils/helpers.py |  11 +
>  gcc/testsuite/python/dwarfutils/objdump.py | 338 
>  19 files changed, 1408 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug11.adb
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug11.py
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug12.adb
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug12.ads
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug12.py
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug9.adb
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug9.py
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/dwarf.exp
>  create mode 100644 gcc/testsuite/lib/gcc-dwarf.exp
>  create mode 100644 gcc/testsuite/python/dwarfutils/__init__.py
>  create mode 100644 gcc/testsuite/python/dwarfutils/data.py
>  create mode 100644 gcc/testsuite/python/dwarfutils/helpers.py
>  create mode 100644 gcc/testsuite/python/dwarfutils/objdump.py
>
> diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp 
> b/gcc/testsu

Re: Patch ping

2017-07-27 Thread Jakub Jelinek
On Thu, Jul 27, 2017 at 09:19:34AM +0200, Richard Biener wrote:
> > I'm building both addresses and subtracting them to get the offset.
> > I guess the other option is to compute just the address of the base
> > (i.e. base_addr), and add offset (if non-NULL) plus bitpos / BITS_PER_UNIT
> > plus offset from the MEM_REF (if any).  In that case it would probably
> > handle any handled_component_p and bitfields too.
> 
> Yes.  Can you try sth along this route?  Should be a matter of
> adding offset and bitpos / BITS_PER_UNIT (thus rounded down) plus
> any MEM_REF offset on the base.

Here it is, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?
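
For readers following the discussion, the offset computation suggested above
can be sketched with plain integers.  This is only an illustration, not the
code in the patch: byte_offset_sketch, pointer_overflows_sketch and
bits_per_unit are hypothetical names, 8 bits per unit is an assumption, and
bitpos is assumed to be non-negative so that the division rounds down.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for the sketch: 8 bits per addressable unit.
const std::intptr_t bits_per_unit = 8;

// Total byte displacement from the base address: the variable offset, the
// bit position rounded down to whole bytes, and any MEM_REF offset.
// BITPOS is assumed to be non-negative.
inline std::intptr_t
byte_offset_sketch (std::intptr_t var_offset, std::intptr_t bitpos,
		    std::intptr_t mem_ref_offset)
{
  return var_offset + bitpos / bits_per_unit + mem_ref_offset;
}

// Return true if BASE + OFF wraps around the address space.  The addition
// is done in an unsigned type, so the wraparound itself is well defined.
inline bool
pointer_overflows_sketch (std::uintptr_t base, std::intptr_t off)
{
  std::uintptr_t result = base + static_cast<std::uintptr_t> (off);
  return off >= 0 ? result < base : result > base;
}

// Self-check with small numbers: 16 + 35 / 8 + 4 is 24 bytes.
inline void
check_offset_sketch ()
{
  assert (byte_offset_sketch (16, 35, 4) == 24);
  assert (!pointer_overflows_sketch (1000, 24));
}
```

Working from the base address plus one combined offset means a single
wraparound test covers every component of the reference.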

2017-07-27  Jakub Jelinek  

PR sanitizer/80998
* sanopt.c (pass_sanopt::execute): Handle IFN_UBSAN_PTR.
* tree-ssa-alias.c (call_may_clobber_ref_p_1): Likewise.
* flag-types.h (enum sanitize_code): Add SANITIZE_POINTER_OVERFLOW.
Or it into SANITIZE_UNDEFINED.
* ubsan.c: Include gimple-fold.h and varasm.h.
(ubsan_expand_ptr_ifn): New function.
(instrument_pointer_overflow): New function.
(maybe_instrument_pointer_overflow): New function.
(instrument_object_size): Formatting fix.
(pass_ubsan::execute): Call instrument_pointer_overflow
and maybe_instrument_pointer_overflow.
* internal-fn.c (expand_UBSAN_PTR): New function.
* ubsan.h (ubsan_expand_ptr_ifn): Declare.
* sanitizer.def (__ubsan_handle_pointer_overflow,
__ubsan_handle_pointer_overflow_abort): New builtins.
* tree-ssa-tail-merge.c (merge_stmts_p): Handle IFN_UBSAN_PTR.
* internal-fn.def (UBSAN_PTR): New internal function.
* opts.c (sanitizer_opts): Add pointer-overflow.
* lto-streamer-in.c (input_function): Handle IFN_UBSAN_PTR.
* fold-const.c (build_range_check): Compute pointer range check in
integral type if pointer arithmetics would be needed.  Formatting
fixes.
gcc/testsuite/
* c-c++-common/ubsan/ptr-overflow-1.c: New test.
* c-c++-common/ubsan/ptr-overflow-2.c: New test.
libsanitizer/
* ubsan/ubsan_handlers.cc: Cherry-pick upstream r304461.
* ubsan/ubsan_checks.inc: Likewise.
* ubsan/ubsan_handlers.h: Likewise.

--- gcc/sanopt.c.jj 2017-07-04 13:51:47.781815329 +0200
+++ gcc/sanopt.c 2017-07-26 13:44:13.833204640 +0200
@@ -1062,6 +1062,9 @@ pass_sanopt::execute (function *fun)
case IFN_UBSAN_OBJECT_SIZE:
  no_next = ubsan_expand_objsize_ifn (&gsi);
  break;
+   case IFN_UBSAN_PTR:
+ no_next = ubsan_expand_ptr_ifn (&gsi);
+ break;
case IFN_UBSAN_VPTR:
  no_next = ubsan_expand_vptr_ifn (&gsi);
  break;
--- gcc/tree-ssa-alias.c.jj 2017-06-19 08:26:17.274597722 +0200
+++ gcc/tree-ssa-alias.c 2017-07-26 13:44:13.834204628 +0200
@@ -1991,6 +1991,7 @@ call_may_clobber_ref_p_1 (gcall *call, a
   case IFN_UBSAN_BOUNDS:
   case IFN_UBSAN_VPTR:
   case IFN_UBSAN_OBJECT_SIZE:
+  case IFN_UBSAN_PTR:
   case IFN_ASAN_CHECK:
return false;
   default:
--- gcc/flag-types.h.jj 2017-06-19 08:26:17.593593662 +0200
+++ gcc/flag-types.h 2017-07-26 13:44:13.834204628 +0200
@@ -238,6 +238,7 @@ enum sanitize_code {
   SANITIZE_OBJECT_SIZE = 1UL << 21,
   SANITIZE_VPTR = 1UL << 22,
   SANITIZE_BOUNDS_STRICT = 1UL << 23,
+  SANITIZE_POINTER_OVERFLOW = 1UL << 24,
   SANITIZE_SHIFT = SANITIZE_SHIFT_BASE | SANITIZE_SHIFT_EXPONENT,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
   | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
@@ -245,7 +246,8 @@ enum sanitize_code {
   | SANITIZE_BOUNDS | SANITIZE_ALIGNMENT
   | SANITIZE_NONNULL_ATTRIBUTE
   | SANITIZE_RETURNS_NONNULL_ATTRIBUTE
-  | SANITIZE_OBJECT_SIZE | SANITIZE_VPTR,
+  | SANITIZE_OBJECT_SIZE | SANITIZE_VPTR
+  | SANITIZE_POINTER_OVERFLOW,
   SANITIZE_UNDEFINED_NONDEFAULT = SANITIZE_FLOAT_DIVIDE | SANITIZE_FLOAT_CAST
  | SANITIZE_BOUNDS_STRICT
 };
--- gcc/ubsan.c.jj  2017-06-30 09:49:32.306609364 +0200
+++ gcc/ubsan.c 2017-07-26 20:22:34.718284238 +0200
@@ -45,6 +45,8 @@ along with GCC; see the file COPYING3.
 #include "builtins.h"
 #include "tree-object-size.h"
 #include "tree-cfg.h"
+#include "gimple-fold.h"
+#include "varasm.h"
 
 /* Map from a tree to a VAR_DECL tree.  */
 
@@ -1029,6 +1031,170 @@ ubsan_expand_objsize_ifn (gimple_stmt_it
   return true;
 }
 
+/* Expand UBSAN_PTR internal call.  */
+
+bool
+ubsan_expand_ptr_ifn (gimple_stmt_iterator *gsip)
+{
+  gimple_stmt_iterator gsi = *gsip;
+  gimple *stmt = gsi_stmt (gsi);
+  location_t loc = gimple_location (stmt);
+  gcc_assert (gimple_call_num_args (stmt) == 2);
+  tree ptr = gim

Re: [PATCH] Backport to GCC7

2017-07-27 Thread Martin Liška
Adding one more that I've just tested.

Thanks,
Martin

From 96cea83ca9362a5598ab54ed83132cc0ab30a7e7 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 17 Jul 2017 11:44:54 +
Subject: Backport r250271

gcc/ChangeLog:

2017-07-17  Martin Liska  

	PR sanitizer/81302
	* opts.c (finish_options): Do not allow -fgnu-tm
	w/ -fsanitize={kernel-,}address.  Say sorry.

---
diff --git a/gcc/opts.c b/gcc/opts.c
index 0343d6a5e86..3182bc99d65 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1005,6 +1005,13 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,

   opts->x_flag_stack_reuse = SR_NONE;
 }
+
+  if ((opts->x_flag_sanitize & SANITIZE_USER_ADDRESS) && opts->x_flag_tm)
+sorry ("transactional memory is not supported with %<-fsanitize=address%>");
+
+  if ((opts->x_flag_sanitize & SANITIZE_KERNEL_ADDRESS) && opts->x_flag_tm)
+sorry ("transactional memory is not supported with "
+	   "%<-fsanitize=kernel-address%>");
 }

 #define LEFT_COLUMN	27
--
2.13.3


[PATCH gfortran] PR53542 USE-associated variables shows original instead of renamed symbol name

2017-07-27 Thread Dominique d'Humières
Dear all,

I am planning to commit the following patch as obvious (once Tobias has done 
the debugging) unless someone objects in the coming days.

Cheers,

Dominique

2017-07-27  Dominique d'Humieres  

PR fortran/53542
* expr.c (gfc_check_init_expr): Use the renamed name.

2017-07-27  Dominique d'Humieres  

PR testsuite/53542
* gfortran.dg/use_30.f90: New test.

--- ../_clean/gcc/fortran/expr.c  2017-06-04 21:41:26.0 +0200
+++ gcc/fortran/expr.c  2017-06-25 13:07:33.0 +0200
@@ -2591,7 +2591,7 @@ gfc_check_init_expr (gfc_expr *e)
   else
gfc_error ("Parameter %qs at %L has not been declared or is "
   "a variable, which does not reduce to a constant "
-  "expression", e->symtree->n.sym->name, &e->where);
+  "expression", e->symtree->name, &e->where);
 
   break;
 
--- ../_clean/gcc/testsuite/gfortran.dg/use_30.f90  1970-01-01 01:00:00.0 +0100
+++ gcc/testsuite/gfortran.dg/use_30.f90  2017-04-03 15:49:13.0 +0200
@@ -0,0 +1,17 @@
+! { dg-do compile }
+!
+! PR53542 USE-associated variables shows original instead of renamed symbol name
+! Contributed by Tobias Burnus 
+!
+module select_precision
+integer :: dp = kind(1.0)
+end module select_precision
+
+module ode_types
+use select_precision, only: wp => dp
+contains
+subroutine ode_derivative(x)
+real(wp) :: x ! { dg-error "Parameter .wp. at .1. has not been declared" }
+end subroutine ode_derivative
+end module ode_types
+end


