date:20170630

Re: [PATCH, rs6000] Add additional support for vec_subc, vec_sube, vec_subec builtins.

2017-06-30 Thread Segher Boessenkool

Hi Carl,

On Fri, Jun 30, 2017 at 01:32:01PM -0700, Carl Love wrote:
>   vector unsigned __int128 vec_sube (vector unsigned __int128, vector 
> unsigned __int128, vector unsigned __int128);

Many of the changelog lines are much too long.

>   * gcc.target/powerpc/p8vector-builtin-8.c (foo): Add test cases for
>   the new vec_subc, vec_sube, vec_subec built-ins. Add the missing test
>   cases for vec_addc, adde and addec builtins.

Two spaces after a full stop.

> @@ -5855,13 +5864,14 @@ altivec_resolve_overloaded_builtin (location_t loc, 
> tree fndecl,
>/* else, fall through and process the Power9 alternative below */
>  }
>  
> -  if (fcode == ALTIVEC_BUILTIN_VEC_ADDE)
> +  if ((fcode == ALTIVEC_BUILTIN_VEC_ADDE)
> +  || (fcode == ALTIVEC_BUILTIN_VEC_SUBE))
>  {
>/* vec_adde needs to be special cased because there is no instruction
> for the {un}signed int version.  */
>if (nargs != 3)
>   {
> -   error ("vec_adde only accepts 3 arguments");
> +   error ("vec_adde and vec_sube only accepts 3 arguments");
> return error_mark_node;
>   }

Please only print the relevant name; see how this is handled for
ALTIVEC_BUILTIN_VEC_SPLATS and ALTIVEC_BUILTIN_VEC_PROMOTE, for example.

>  
> @@ -5884,14 +5894,24 @@ altivec_resolve_overloaded_builtin (location_t loc, 
> tree fndecl,
>   {
> /* For {un}signed ints,
>vec_adde (va, vb, carryv) == vec_add (vec_add (va, vb),
> +vec_and (carryv, 0x1)).
> +  vec_sube (va, vb, carryv) == vec_sub (vec_sub (va, vb),
>  vec_and (carryv, 0x1)).  */

s/0x1/1/ (yeah I realise this was here before, but let's stop the silliness).

> @@ -5919,13 +5951,14 @@ altivec_resolve_overloaded_builtin (location_t loc, 
> tree fndecl,
>   }
>  }
>  
> -  if (fcode == ALTIVEC_BUILTIN_VEC_ADDEC)
> +  if ((fcode == ALTIVEC_BUILTIN_VEC_ADDEC)
> +  || (fcode == ALTIVEC_BUILTIN_VEC_SUBEC))

No superfluous parentheses please, they don't help readability.

> +if (fcode == ALTIVEC_BUILTIN_VEC_ADDEC)
> +  {
> +tree VADDECUQ_bii = 
> rs6000_builtin_decls[P8V_BUILTIN_VEC_VADDECUQ];

Line too long.  Just name the local variable something shorter?  You don't
need VADDECUQ in the name.

> +vector signed __int128 vec_sube (vector signed __int128,
> + vector signed __int128,
> + vector signed__int128);

Missing space in that last line.

Some of the above happen more than once, please check.

Okay for trunk with those issues fixed.  Thanks!


Segher

Re: [PATCH, rs6000] Add support to __builtin_cpu_supports() for two new HWCAP2 bits

2017-06-30 Thread Segher Boessenkool

On Fri, Jun 30, 2017 at 11:53:48AM -0500, Peter Bergner wrote:
> >> Not use an installed header, that's not what I'm asking.  Share the
> >> source file, i.e., just copy it over from the glibc source tree (it
> >> should probably hold the master copy).  Fewer typos, cannot forget to
> >> update some entry, etc.

[ snip ]

We could factor out the parts used by both projects to a separate file,
and share that one.

Does glibc also have the names in our cpu_supports_info array?  Should
it have them?

What I'm after is making errors and omissions less likely.  Maybe I
just shouldn't worry, it's not like auxv changes *that* often.

Segher

Re: [PATCH,rs6000] PR80103: Fix typo in test case

2017-06-30 Thread Segher Boessenkool

On Fri, Jun 30, 2017 at 09:56:03AM -0600, Kelvin Nilsen wrote:
> 
> While reviewing regression test results for a back port of the PR80103
> patch, I discovered a typographic error in the test case.  This patch
> corrects the error.

A syntax error, even...  I wonder why we haven't noticed that before.
It won't compile as-is afaics?

> --- gcc/testsuite/gcc.target/powerpc/pr80103-1.c  (revision 249798)
> +++ gcc/testsuite/gcc.target/powerpc/pr80103-1.c  (working copy)
> @@ -12,5 +12,5 @@
>  int a;
>  void b (__attribute__ ((__vector_size__ (16))) char c)
>  {
> -  a = ((__attributes__ ((__vector_size__ (2 * sizeof (long long) c)[0];
> +  a = ((__attribute__ ((__vector_size__ (2 * sizeof (long long) c)[0];
>  }


Segher

Re: [PATCH] rs6000 branch probability changes

2017-06-30 Thread Segher Boessenkool

On Fri, Jun 30, 2017 at 10:36:27PM +0100, Ramana Radhakrishnan wrote:
> On Fri, Jun 30, 2017 at 2:36 PM, David Edelsohn  wrote:
> > Convert the rs6000 port to use the new API for branch probabilities.
> >
> > Okay?
> >
> > Thanks, David
> >
> > * config/rs6000/rs6000.c (emit_unlikely_jump): Adjust to new branch
> > probability data type.
> >
> > Index: rs6000.c
> > ===
> > --- rs6000.c (revision 249839)
> > +++ rs6000.c (working copy)
> > @@ -23514,10 +23514,9 @@
> >  static void
> >  emit_unlikely_jump (rtx cond, rtx label)
> >  {
> > -  int very_unlikely = REG_BR_PROB_BASE / 100 - 1;
> >rtx x = gen_rtx_IF_THEN_ELSE (VOIDmode, cond, label, pc_rtx);
> >rtx_insn *insn = emit_jump_insn (gen_rtx_SET (pc_rtx, x));
> > -  add_int_reg_note (insn, REG_BR_PROB, very_unlikely);
> > +  add_int_reg_note (insn, REG_BR_PROB, profile_probability::very_unlikely 
> > ());
> 
> Hmmm isn't this very unlikely to work :) ?
> 
> I used this as inspiration to do this for the arm ports but
> add_int_reg_note expects an integer but very_unlikely returns
> profile_probability  ...

It probably should be converted using to_reg_br_prob_base ?


Segher

Re: Convert profile probabilities to new type

2017-06-30 Thread Joseph Myers

Also seeing a possibly related failure from my glibc bot, building for 
hppa-linux-gnu:

/scratch/jmyers/glibc-bot/src/gcc/gcc/config/pa/pa.c: In function 'bool 
pa_expand_compare_and_swap_loop(rtx, rtx, rtx, rtx)':
/scratch/jmyers/glibc-bot/src/gcc/gcc/config/pa/pa.c:10726:59: error: could not 
convert '0' from 'int' to 'profile_probability'
GET_MODE (success), 1, label, 0);
   ^

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [C++ Patch] PR 65775 ("Late-specified return type bypasses return type checks (qualified, function, array)")

2017-06-30 Thread Jason Merrill

OK.

On Sun, Jun 25, 2017 at 8:18 PM, Paolo Carlini  wrote:
> ... in fact, simply moving the checks forward, past the
> splice_late_return_type call, appears to work fine. I'm finishing testing
> the below.
>
> Thanks!
> Paolo.
>
> /

Re: [PATCH] rs6000 branch probability changes

2017-06-30 Thread Ramana Radhakrishnan

On Fri, Jun 30, 2017 at 2:36 PM, David Edelsohn  wrote:
> Convert the rs6000 port to use the new API for branch probabilities.
>
> Okay?
>
> Thanks, David
>
> * config/rs6000/rs6000.c (emit_unlikely_jump): Adjust to new branch
> probability data type.
>
> Index: rs6000.c
> ===
> --- rs6000.c (revision 249839)
> +++ rs6000.c (working copy)
> @@ -23514,10 +23514,9 @@
>  static void
>  emit_unlikely_jump (rtx cond, rtx label)
>  {
> -  int very_unlikely = REG_BR_PROB_BASE / 100 - 1;
>rtx x = gen_rtx_IF_THEN_ELSE (VOIDmode, cond, label, pc_rtx);
>rtx_insn *insn = emit_jump_insn (gen_rtx_SET (pc_rtx, x));
> -  add_int_reg_note (insn, REG_BR_PROB, very_unlikely);
> +  add_int_reg_note (insn, REG_BR_PROB, profile_probability::very_unlikely 
> ());

Hmmm isn't this very unlikely to work :) ?

I used this as inspiration to do this for the arm ports but
add_int_reg_note expects an integer but very_unlikely returns
profile_probability  ...

regards
Ramana



Ramana

>  }
>
>  /* A subroutine of the atomic operation splitters.  Emit a load-locked

Fix ICE with -fno-guess-branch-probability

2017-06-30 Thread Jan Hubicka

Hi,
this patch fixes a problem when -fno-guess-branch-probability is inlined into
expanded thunk.
When inlining, the profile status needs to be dropped to lowest common one
(well in theory we can avoid that when callee has only one BB or so).

Index: ChangeLog
===
--- ChangeLog   (revision 249837)
+++ ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2017-06-29  Jan Hubicka  
+
+   PR ipa/81261
+   * tree-inline.c (expand_call_inline): Combine profile statuses.
+
 2017-06-30  Richard Biener  
 
* graph.c (draw_cfg_node_succ_edges): Fix broken dot syntax.
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 249837)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2017-06-29  Jan Hubicka  
+
+   PR ipa/81261
+   * g++.dg/ipa/pr81261.C: New testcase.
+
 2017-06-30  Nathan Sidwell  
 
PR c++/81229
Index: testsuite/g++.dg/ipa/pr81261.C
===
--- testsuite/g++.dg/ipa/pr81261.C  (revision 0)
+++ testsuite/g++.dg/ipa/pr81261.C  (working copy)
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fno-guess-branch-probability"  } */
+
+struct CBase {
+  virtual void BaseFunc () {}
+};
+
+struct MMixin {
+  virtual void * MixinFunc (int, void *) = 0;
+};
+
+struct CExample: CBase, public MMixin
+{
+  void *MixinFunc (int arg, void *arg2)
+  {
+if (arg != 1 || arg2)
+  return 0;
+return this;
+  }
+};
+
+void *test (MMixin & anExample)
+{
+  return anExample.MixinFunc (1, 0);
+}
+
+int main ()
+{
+  CExample c;
+  return (test (c) != );
+}
Index: tree-inline.c
===
--- tree-inline.c   (revision 249837)
+++ tree-inline.c   (working copy)
@@ -4650,6 +4650,9 @@ expand_call_inline (basic_block bb, gimp
   else
 id->dst_simt_vars = NULL;
 
+  if (profile_status_for_fn (id->src_cfun) == PROFILE_ABSENT)
+profile_status_for_fn (dst_cfun) = PROFILE_ABSENT;
+
   /* If the src function contains an IFN_VA_ARG, then so will the dst
  function after inlining.  Likewise for IFN_GOMP_USE_SIMT.  */
   prop_mask = PROP_gimple_lva | PROP_gimple_lomp_dev;

C++ PATCH for c++/81257, ICE with invalid ::template

2017-06-30 Thread Jason Merrill

The change for 54769 caused us to accept this invalid ::template A
even though the scope is non-dependent, so we can do the lookup.
Fixed by reverting part of the change to cp_parser_template_name,
which isn't actually needed.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit a553e80cfd772d103963b5176af17f9e8232c1c7
Author: Jason Merrill 
Date:   Fri Jun 30 16:38:14 2017 -0400

PR c++/81257 - ICE with invalid ::template.

PR c++/54769 - wrong lookup of dependent template-name.
* parser.c (cp_parser_template_name): Revert part of last change.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index c6a8e37..575dbfc 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -15911,11 +15911,6 @@ cp_parser_template_name (cp_parser* parser,
/*ambiguous_decls=*/NULL,
token->location);
 
-  /* If the lookup failed and we got the 'template' keyword, believe it.  */
-  if (decl == error_mark_node && template_keyword_p
-  && processing_template_decl)
-return identifier;
-
   decl = strip_using_decl (decl);
 
   /* If DECL is a template, then the name was a template-name.  */
diff --git a/gcc/testsuite/g++.dg/parse/template-keyword1.C 
b/gcc/testsuite/g++.dg/parse/template-keyword1.C
new file mode 100644
index 000..0a1298e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/template-keyword1.C
@@ -0,0 +1,3 @@
+// PR c++/81257
+
+template < typename ::template A < int > >; // { dg-error "" }

Re: [PATCH] [PR 81245] Fix tree-if-conv calling of update_stmt after fold_stmt

2017-06-30 Thread Andrew Pinski

On Fri, Jun 30, 2017 at 1:20 AM, Richard Biener
 wrote:
> On Thu, Jun 29, 2017 at 10:12 PM, Andrew Pinski  wrote:
>> Hi,
>>   As described in the bug, tree-if-conv is calling update_stmt on an
>> old stmt which might have been removed from the IR already
>> (transforming of an assignment to a call in this case).  This fixes
>> the problem by calling update_stmt on the new statement that fold_stmt
>> might have created.
>>
>> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>
> update_stmt is not necessary when fold_stmt doesn't return true as
> gsi_insert_before already updates the stmt.
>
> Thus ok with moving update_stmt under the conditional.

This is what I applied.

Thanks,
Andrew

>
> Thanks,
> Richard.
>
>> Thanks,
>> Andrew Pinski
>> ChangeLog:
>> * tree-if-conv.c (predicate_scalar_phi): Update new_stmt if fold_stmt
>> returned true.
>>
>> testsuite/ChangeLog:
>> * gcc.dg/torture/pr81245.c: New testcase.
Index: testsuite/gcc.dg/torture/pr81245.c
===
--- testsuite/gcc.dg/torture/pr81245.c  (nonexistent)
+++ testsuite/gcc.dg/torture/pr81245.c  (working copy)
@@ -0,0 +1,16 @@
+/* { dg-options "-ffast-math" } */
+/* { dg-do compile } */
+/* This test used to crash the vectorizer as the ifconvert pass
+   used to convert the if to copysign but called update_stmt on
+   the old statement after calling fold_stmt. */
+double sg[18];
+void f(void)
+{
+  for (int i = 0 ;i < 18;i++)
+  {
+if (sg[i] < 0.0)
+  sg[i] = -1.0;
+else
+  sg[i] = 1.0;
+  }
+}
Index: tree-if-conv.c
===
--- tree-if-conv.c  (revision 249844)
+++ tree-if-conv.c  (working copy)
@@ -1853,8 +1853,11 @@
   new_stmt = gimple_build_assign (res, rhs);
   gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
   gimple_stmt_iterator new_gsi = gsi_for_stmt (new_stmt);
-  fold_stmt (_gsi, ifcvt_follow_ssa_use_edges);
-  update_stmt (new_stmt);
+  if (fold_stmt (_gsi, ifcvt_follow_ssa_use_edges))
+   {
+ new_stmt = gsi_stmt (new_gsi);
+ update_stmt (new_stmt);
+   }
 
   if (dump_file && (dump_flags & TDF_DETAILS))
{

[PATCH, rs6000] Add additional support for vec_subc, vec_sube, vec_subec builtins.

2017-06-30 Thread Carl Love

GCC Maintainers:

The following patch adds support for additional  the vec_subc, vec_sube,
vec_subec builtins.  It also adds a few missing tests for currently
supported builtins.  The patch has been tested on
powerpc64le-unknown-linux-gnu (Power 8 LE) and
powerpc64-unknown-linux-gnu(Power 8 BE).

Please let me know if the following patch is acceptable.  Thanks.

Carl Love

-


gcc/ChangeLog:

2017-06-29  Carl Love  

* config/rs6000/rs6000-c.c: Add support for built-in functions.
vector signed int vec_subc (vector signed int, vector signed int);
vector signed __int128 vec_subc (vector signed __int128, vector signed 
__int128);
vector unsigned __int128 vec_subc (vector unsigned __int128, vector 
unsigned __int128);
vector signed int vec_sube (vector signed int, vector signed int, 
vector signed int);
vector unsigned int vec_sube (vector unsigned int, vector unsigned int, 
vector unsigned int);
vector signed __int128 vec_sube (vector signed __int128, vector signed 
__int128, vector signed__int128);
vector unsigned __int128 vec_sube (vector unsigned __int128, vector 
unsigned __int128, vector unsigned __int128);
vector signed int vec_subec (vector signed int, vector signed int, 
vector signed int);
vector unsigned int vec_subec (vector unsigned int, vector unsigned 
int, vector unsigned int);
vector signed __int128 vec_subec (vector signed __int128, vector signed 
__int128, vector signed__int128);
vector unsigned __int128 vec_subec (vector unsigned __int128, vector 
unsigned __int128, vector unsigned __int128);
* config/rs6000/rs6000.c (ALTIVEC_BUILTIN_VEC_SUBE, 
ALTIVEC_BUILTIN_VEC_SUBEC): Add
def_builtins.
* config/rs6000/rs6000-builtin.def (SUBE, SUBEC): Add
BU_ALTIVEC_OVERLOAD_X definitions.
* config/rs6000/altivec.h (vec_sube, vec_subec): Add builtin defines.
* doc/extend.texi: Update the built-in documentation file for the new
built-in functions.

gcc/testsuite/ChangeLog:

2017-06-29  Carl Love  

* gcc.target/powerpc/p8vector-builtin-8.c (foo): Add test cases for
the new vec_subc, vec_sube, vec_subec built-ins. Add the missing test
cases for vec_addc, adde and addec builtins.
---
 gcc/config/rs6000/altivec.h|   2 +
 gcc/config/rs6000/rs6000-builtin.def   |   2 +
 gcc/config/rs6000/rs6000-c.c   | 106 +
 gcc/config/rs6000/rs6000.c |   4 +
 gcc/doc/extend.texi|  27 ++
 .../gcc.target/powerpc/p8vector-builtin-8.c|  29 +-
 6 files changed, 148 insertions(+), 22 deletions(-)

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 417e143..45bf615 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -191,6 +191,8 @@
 #define vec_unsignedo __builtin_vec_vunsignedo
 #define vec_vsubfp __builtin_vec_vsubfp
 #define vec_subc __builtin_vec_subc
+#define vec_sube __builtin_vec_sube
+#define vec_subec __builtin_vec_subec
 #define vec_vsubsws __builtin_vec_vsubsws
 #define vec_vsubshs __builtin_vec_vsubshs
 #define vec_vsubsbs __builtin_vec_vsubsbs
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 9bdc2b4..f1c8ae0 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1465,6 +1465,8 @@ BU_ALTIVEC_OVERLOAD_X (STVLX,"stvlx")
 BU_ALTIVEC_OVERLOAD_X (STVLXL,"stvlxl")
 BU_ALTIVEC_OVERLOAD_X (STVRX, "stvrx")
 BU_ALTIVEC_OVERLOAD_X (STVRXL,"stvrxl")
+BU_ALTIVEC_OVERLOAD_X (SUBE,  "sube")
+BU_ALTIVEC_OVERLOAD_X (SUBEC, "subec")
 BU_ALTIVEC_OVERLOAD_X (VCFSX, "vcfsx")
 BU_ALTIVEC_OVERLOAD_X (VCFUX, "vcfux")
 BU_ALTIVEC_OVERLOAD_X (VSPLTB,"vspltb")
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 0752ef9..46a99d5 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -2950,8 +2950,17 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
 RS6000_BTI_unsigned_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_unsigned_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_VSUBUBM, ALTIVEC_BUILTIN_VSUBUBM,
 RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_bool_V16QI, 0 },
+
+  { ALTIVEC_BUILTIN_VEC_SUBC, ALTIVEC_BUILTIN_VSUBCUW,
+RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
   { ALTIVEC_BUILTIN_VEC_SUBC, ALTIVEC_BUILTIN_VSUBCUW,
 RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 
RS6000_BTI_unsigned_V4SI, 0 },
+  { ALTIVEC_BUILTIN_VEC_SUBC, P8V_BUILTIN_VSUBCUQ,
+RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI,
+RS6000_BTI_unsigned_V1TI, 0 },
+  { ALTIVEC_BUILTIN_VEC_SUBC, P8V_BUILTIN_VSUBCUQ,
+RS6000_BTI_V1TI,

Re: [C++ PATCH] conversion operator names

2017-06-30 Thread Jason Merrill

On Fri, Jun 30, 2017 at 3:53 PM, Nathan Sidwell  wrote:
> On 06/30/2017 03:11 PM, Jason Merrill wrote:
>>
>> On Fri, Jun 30, 2017 at 2:48 PM, Nathan Sidwell  wrote:
>>>
>>> We used to give conversion operatos type-specific mangled names.  Now we
>>> just number them 'operator $N' for N in [1,...].  As such
>>> mangle_convop_name_for_type is misnamed and has no place in mangle.c
>>
>> Hmm, wouldn't actual mangling make more sense, so that it's consistent
>> between compilations?
>
> While that would work, I have a different and simpler plan in mind.

Suspense! :)

Jason

Re: [PING] Re: [PATCH] c/c++: Add fix-it hints for suggested missing #includes

2017-06-30 Thread David Malcolm

On Fri, 2017-06-30 at 09:40 -0600, Jeff Law wrote:
> On 05/26/2017 01:54 PM, David Malcolm wrote:
> > Ping:
> >   https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00321.html
> > 
> > On Thu, 2017-05-04 at 12:36 -0400, David Malcolm wrote:
> > > As of r247522, fix-it-hints can suggest the insertion of new
> > > lines.
> > > 
> > > This patch uses this to implement a new "maybe_add_include_fixit"
> > > function in c-common.c and uses it in the two places where the C
> > > and
> > > C++
> > > frontend can suggest missing #include directives. [1]
> > > 
> > > The idea is that the user can then click on the fix-it in an IDE
> > > and have it add the #include for them (or use -fdiagnostics
> > > -generate
> > > -patch).
> > > 
> > > Examples can be seen in the test cases.
> > > 
> > > The function attempts to put the #include in a reasonable place:
> > > immediately after the last #include within the file, or at the
> > > top of the file.  It is idempotent, so -fdiagnostics-generate
> > > -patch
> > > does the right thing if several such diagnostics are emitted.
> > > 
> > > Successfully bootstrapped on x86_64-pc-linux-gnu.
> > > 
> > > OK for trunk?
> > > 
> > > [1] I'm working on a followup which tweaks another diagnostic so
> > > that
> > > it
> > > can suggest that a #include was missing, so I'll use it there as
> > > well.
> > > 
> > > gcc/c-family/ChangeLog:
> > >   * c-common.c (try_to_locate_new_include_insertion_point): New
> > >   function.
> > >   (per_file_includes_t): New typedef.
> > >   (added_includes_t): New typedef.
> > >   (added_includes): New variable.
> > >   (maybe_add_include_fixit): New function.
> > >   * c-common.h (maybe_add_include_fixit): New decl.
> > > 
> > > gcc/c/ChangeLog:
> > >   * c-decl.c (implicitly_declare): When suggesting a missing
> > >   #include, provide a fix-it hint.
> > > 
> > > gcc/cp/ChangeLog:
> > >   * name-lookup.c (get_std_name_hint): Add '<' and '>' around
> > >   the header names.
> > >   (maybe_suggest_missing_header): Update for addition of '<' and
> > > '>'
> > >   to above.  Provide a fix-it hint.
> > > 
> > > gcc/testsuite/ChangeLog:
> > >   * g++.dg/lookup/missing-std-include-2.C: New text case.
> > >   * gcc.dg/missing-header-fixit-1.c: New test case.
> Generally OK.  But a few comments on how you find the location for
> where
> to suggest the new #include.

Thanks for looking at it.

> It looks like you're walking the whole table every time?!?

Note that try_to_locate_new_include_insertion_point is guarded by the
idempotency logic within maybe_add_include_fixit: it will be called at
most once per suggested #include.

I don't know if "walking" is the word I would have used: it's iterating
over the whole array of structs, doing a few pointer compares of fields
within it.  The elements in the array are line *maps*, not lines
themselves, so it's likely to be of the order of a few thousand entries
for a source file with plenty of nested #includes.

> Shouldn't
> you at least bound things between start of file 

By "start of file", do you mean "start of the main source file", or the
start of the file containing the diagnostic?  AIUI, getting at this
information requires a walk forwards through the line maps, which is
exactly what the patch is doing.

> and the point where an
> error was issued?  ie, if you used an undefined type on line XX, a
> #include after line XX makes no sense to resolve the error.

Hmmm.  I agree that the fix-it hint would makes little sense for this
case.  I did some testing of this case, and it uncovered some bugs with
the patch, so I'm going to post an updated patch, with some test
coverage for it (and addressing the bugs).

But I'd prefer to focus on correctness rather than on performance here;
because of the idempotency guard it's only called once per header file
that we "know" about (and can thus suggest in a fix-it hint).

> I'm not sure how often this will get called when someone does
> something
> stupid and needs the #include.  But ISTM you're looking for two
> bounds.
> 
> For the "last" case you start at the statement which generated the
> error
> and walk backwards stopping when you find the last map.

It's not clear to me what you're proposing.  If we expand the
statement's location to get a physical location, that will be within an
ordinary line_map, within the file in question.  The start of that
linemap might related to returning from a #include, but it might also
relate to changes in line length (due to cpplib trying to make best use
of the 32-bit encoding space) e.g. in the middle of a function, or in
the middle of a comment.

We could maybe use this ordinary map as an upper boundary for the
search (though I'd want to check that the logic in the search still
makes sense for this case).

> For the "first" case, you start at the beginning and walk forward to
> find the map, then quit.

Again, I'm not quite sure what you mean here.

> Are those not appropriate here?

As noted above, these feel to me like premature

Re: [C++ PATCH] conversion operator names

2017-06-30 Thread Nathan Sidwell


On 06/30/2017 03:11 PM, Jason Merrill wrote:

On Fri, Jun 30, 2017 at 2:48 PM, Nathan Sidwell  wrote:

We used to give conversion operatos type-specific mangled names.  Now we
just number them 'operator $N' for N in [1,...].  As such
mangle_convop_name_for_type is misnamed and has no place in mangle.c


Hmm, wouldn't actual mangling make more sense, so that it's consistent
between compilations?


While that would work, I have a different and simpler plan in mind.

nathan

--
Nathan Sidwell

Re: [C++ PATCH] "decomposition declaration" -> "structured binding" in C++ diagnostics

2017-06-30 Thread Jason Merrill

On Fri, Jun 30, 2017 at 3:35 PM, Jakub Jelinek  wrote:
> On Fri, Jun 30, 2017 at 03:15:21PM -0400, Jason Merrill wrote:
>> On Fri, Jun 30, 2017 at 1:26 PM, Jakub Jelinek  wrote:
>> > Hi!
>> >
>> > As C++17 decided to rename decompositions to structured bindings, this
>> > patch attempts to adjust the diagnostics by replacing
>> > "decomposition declaration" with "structured binding".
>> > Or shall I use "structured binding declaration" instead (or is that too
>> > longer/verbose), or something different?
>>
>> "structured binding declaration" is better when you're talking about
>> the declaration syntax.  When you're talking about the feature in the
>> first hunk, "structured bindings".  "structured binding variable" is
>> good in the one hunk where you've used that.
>
> What about that:
> case cdk_decomp:
> - name = "decomposition";
> + name = "structured binding";
>   break;
> hunk?

I think that's OK.

> Also, aren't there too many "declar*" roots, both in the previous
> "decomposition declaration cannot be declared"
> and
> "structured binding declaration cannot be declared"
> ?  I mean wouldn't be "structured bindings cannot be declared"
> or "structured binding cannot be declared" better in that case?

Well, the term "structured binding" refers to one of the names
declared by the declaration, not the declaration as a whole, and those
errors refer to the latter.  We could change "cannot be declared" to
something else, perhaps just drop the "declared", so e.g. "structured
binding declaration cannot be %"?  Or "cannot use X
specifier"?

Jason

Re: [C++ PATCH] "decomposition declaration" -> "structured binding" in C++ diagnostics

2017-06-30 Thread Jakub Jelinek

On Fri, Jun 30, 2017 at 03:15:21PM -0400, Jason Merrill wrote:
> On Fri, Jun 30, 2017 at 1:26 PM, Jakub Jelinek  wrote:
> > Hi!
> >
> > As C++17 decided to rename decompositions to structured bindings, this
> > patch attempts to adjust the diagnostics by replacing
> > "decomposition declaration" with "structured binding".
> > Or shall I use "structured binding declaration" instead (or is that too
> > longer/verbose), or something different?
> 
> "structured binding declaration" is better when you're talking about
> the declaration syntax.  When you're talking about the feature in the
> first hunk, "structured bindings".  "structured binding variable" is
> good in the one hunk where you've used that.

What about that:
case cdk_decomp:
- name = "decomposition";
+ name = "structured binding";
  break;
hunk?

Also, aren't there too many "declar*" roots, both in the previous
"decomposition declaration cannot be declared"
and
"structured binding declaration cannot be declared"
?  I mean wouldn't be "structured bindings cannot be declared"
or "structured binding cannot be declared" better in that case?

Jakub

Re: [PATCH v2][RFC] Canonize names of attributes.

2017-06-30 Thread Jason Merrill

On Fri, Jun 30, 2017 at 5:23 AM, Martin Liška  wrote:
> This is v2 of the patch, where just names of attributes are canonicalized.
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

What is the purpose of the new "strict" parameter to cmp_attribs* ?  I
don't see any discussion of it.

Jason

Re: Convert profile probabilities to new type

2017-06-30 Thread Jan Hubicka

> 
> After the change, edges that fail the predicate contribute REG_BR_PROB_BASE in
> the upper hunk, but 0 in the lower hunk. Before the change, they contributed 0
> in both cases.
> 
> The following patch should restore things:
> 
> diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
> index dd72828..fa88259 100644
> --- a/gcc/sel-sched-ir.c
> +++ b/gcc/sel-sched-ir.c
> @@ -4747,8 +4747,10 @@ compute_succs_info (insn_t insn, short flags)
>sinfo->probs_ok.safe_push (
>   /* FIXME: Improve calculation when skipping
> inner loop to exits.  */
> -si.bb_end && si.e1->probability.initialized_p ()
> - ? si.e1->probability.to_reg_br_prob_base ()
> +si.bb_end
> + ? (si.e1->probability.initialized_p ()
> +   ? si.e1->probability.to_reg_br_prob_base ()
> +   : 0)
>   : REG_BR_PROB_BASE);
>sinfo->succs_ok_n++;
>  }

Ah, thanks!  This looks OK to me.
I was not quite sure what to do with missing probabilities in this case.

Honza

Re: [C++ PATCH] "decomposition declaration" -> "structured binding" in C++ diagnostics

2017-06-30 Thread Jason Merrill

On Fri, Jun 30, 2017 at 1:26 PM, Jakub Jelinek  wrote:
> Hi!
>
> As C++17 decided to rename decompositions to structured bindings, this
> patch attempts to adjust the diagnostics by replacing
> "decomposition declaration" with "structured binding".
> Or shall I use "structured binding declaration" instead (or is that too
> longer/verbose), or something different?

"structured binding declaration" is better when you're talking about
the declaration syntax.  When you're talking about the feature in the
first hunk, "structured bindings".  "structured binding variable" is
good in the one hunk where you've used that.

Jason

Re: [C++ PATCH] conversion operator names

2017-06-30 Thread Jason Merrill

On Fri, Jun 30, 2017 at 2:48 PM, Nathan Sidwell  wrote:
> We used to give conversion operatos type-specific mangled names.  Now we
> just number them 'operator $N' for N in [1,...].  As such
> mangle_convop_name_for_type is misnamed and has no place in mangle.c

Hmm, wouldn't actual mangling make more sense, so that it's consistent
between compilations?

Jason

Re: [PATCH] [Aarch64] Variable shift count truncation issues

2017-06-30 Thread Michael Collison

Okay I will take a look at this.

Michael Collison


> On Jun 30, 2017, at 11:04 AM, Andreas Schwab  wrote:
> 
>> On Jun 23 2017, Michael Collison  wrote:
>> 
>> diff --git a/gcc/testsuite/gcc.target/aarch64/var_shift_mask_1.c 
>> b/gcc/testsuite/gcc.target/aarch64/var_shift_mask_1.c
>> new file mode 100644
>> index 000..e2b020e
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/var_shift_mask_1.c
>> @@ -0,0 +1,61 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +/* The integer variable shift and rotate instructions truncate their
>> +   shift amounts by the datasize.  Make sure that we don't emit a redundant
>> +   masking operation.  */
>> +
>> +unsigned
>> +f1 (unsigned x, int y)
>> +{
>> +  return x << (y & 31);
>> +}
>> +
>> +unsigned long
>> +f2 (unsigned long x, int y)
>> +{
>> +  return x << (y & 63);
>> +}
>> +
>> +unsigned long
>> +f3 (unsigned long bit_addr, int y)
>> +{
>> +  unsigned long bitnumb = bit_addr & 63;
>> +  return (1L << bitnumb);
>> +}
>> +
>> +unsigned int
>> +f4 (unsigned int x, unsigned int y)
>> +{
>> +  y &= 31;
>> +  return x >> y | (x << (32 - y));
>> +}
>> +
>> +unsigned long
>> +f5 (unsigned long x, unsigned long y)
>> +{
>> +  y &= 63;
>> +  return x >> y | (x << (64 - y));
>> +}
>> +
>> +unsigned long
>> +f6 (unsigned long x, unsigned long y)
>> +{
>> +
>> +  return (x << (64 - (y & 63)));
>> +
>> +}
>> +
>> +unsigned long
>> +f7 (unsigned long x, unsigned long y)
>> +{
>> +  return (x << -(y & 63));
>> +}
>> +
>> +/* { dg-final { scan-assembler-times "lsl\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 
>> 1 } } */
>> +/* { dg-final { scan-assembler-times "lsl\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" 
>> 4 } } */
>> +/* { dg-final { scan-assembler-times "ror\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 
>> 1 } } */
>> +/* { dg-final { scan-assembler-times "ror\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" 
>> 1 } } */
>> +/* { dg-final { scan-assembler-not "and\tw\[0-9\]+, w\[0-9\]+, 31" } } */
>> +/* { dg-final { scan-assembler-not "and\tx\[0-9\]+, x\[0-9\]+, 63" } } */
>> +/* { dg-final { scan-assembler-not "sub\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } 
>> } */
> 
> That fails with -mabi=ilp32.
> 
> Andreas.
> 
> -- 
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."

Re: [C++ PATCH] conversion operator names

2017-06-30 Thread Nathan Sidwell


On 06/30/2017 02:48 PM, Nathan Sidwell wrote:
We used to give conversion operatos type-specific mangled names.  Now we 
just number them 'operator $N' for N in [1,...].  As such 
mangle_convop_name_for_type is misnamed and has no place in mangle.c


This patch moves that machinery to lex.c, which is a better place (best 
place?).  Other than renaming to make_conv_op_name, nothing else 
changes.  For. The. Moment.


Oh, because this adds cp/lex.c to the gt list, a reconfigure at least is 
necessary.


nathan

--
Nathan Sidwell

[C++ PATCH] conversion operator names

2017-06-30 Thread Nathan Sidwell

We used to give conversion operatos type-specific mangled names.  Now we 
just number them 'operator $N' for N in [1,...].  As such 
mangle_convop_name_for_type is misnamed and has no place in mangle.c


This patch moves that machinery to lex.c, which is a better place (best 
place?).  Other than renaming to make_conv_op_name, nothing else 
changes.  For. The. Moment.


nathan
--
Nathan Sidwell
2017-06-30  Nathan Sidwell  

	* config-lang.in (gtfiles): Add cp/lex.c.
	* cp-tree.h (mangle_convop_name_for_type): Rename ...
	(make_conv_op_name): ... here.  Move to lex.
	* lambda.c (maybe_add_lambda_conv_op): Update.
	* parser.c (cp_parser_conversion_function_id): Update.
	* pt.c (tsubst_decl, tsubst_baselink, tsubst_copy,
	tsubst_copy_and_build): Update.
	* semantics.c (apply_deduced_return_type): Update.
	* mangle.c (conv_type_hasher, conv_type_names,
	mangle_conv_op_name_for_type): Move to ...
	* lex.c (conv_type_hasher, conv_type_names, make_convop_name):
	... here.  Rename.

	* libcp1plugin.cc (plugin_build_decl): Use make_conv_op_name.
	(plugin_build_dependent_expr): Likewise.

Index: gcc/cp/config-lang.in
===
--- gcc/cp/config-lang.in	(revision 249835)
+++ gcc/cp/config-lang.in	(working copy)
@@ -45,7 +45,7 @@ gtfiles="\
 \$(srcdir)/cp/except.c \
 \$(srcdir)/cp/friend.c \
 \$(srcdir)/cp/init.c \
-\$(srcdir)/cp/lambda.c \
+\$(srcdir)/cp/lambda.c \$(srcdir)/cp/lex.c \
 \$(srcdir)/cp/mangle.c \$(srcdir)/cp/method.c \
 \$(srcdir)/cp/name-lookup.c \
 \$(srcdir)/cp/parser.c \$(srcdir)/cp/pt.c \
Index: gcc/cp/cp-tree.h
===
--- gcc/cp/cp-tree.h	(revision 249843)
+++ gcc/cp/cp-tree.h	(working copy)
@@ -6326,6 +6326,7 @@ extern void yyungetc(int, int);
 extern tree unqualified_name_lookup_error	(tree,
 		 location_t = UNKNOWN_LOCATION);
 extern tree unqualified_fn_lookup_error		(cp_expr);
+extern tree make_conv_op_name			(tree);
 extern tree build_lang_decl			(enum tree_code, tree, tree);
 extern tree build_lang_decl_loc			(location_t, enum tree_code, tree, tree);
 extern void retrofit_lang_decl			(tree);
@@ -7180,7 +7181,6 @@ extern tree mangle_vtbl_for_type		(tree)
 extern tree mangle_vtt_for_type			(tree);
 extern tree mangle_ctor_vtbl_for_type		(tree, tree);
 extern tree mangle_thunk			(tree, int, tree, tree, tree);
-extern tree mangle_conv_op_name_for_type	(tree);
 extern tree mangle_guard_variable		(tree);
 extern tree mangle_tls_init_fn			(tree);
 extern tree mangle_tls_wrapper_fn		(tree);
Index: gcc/cp/lambda.c
===
--- gcc/cp/lambda.c	(revision 249835)
+++ gcc/cp/lambda.c	(working copy)
@@ -1088,7 +1088,7 @@ maybe_add_lambda_conv_op (tree type)
   /* First build up the conversion op.  */
 
   tree rettype = build_pointer_type (stattype);
-  tree name = mangle_conv_op_name_for_type (rettype);
+  tree name = make_conv_op_name (rettype);
   tree thistype = cp_build_qualified_type (type, TYPE_QUAL_CONST);
   tree fntype = build_method_type_directly (thistype, rettype, void_list_node);
   tree convfn = build_lang_decl (FUNCTION_DECL, name, fntype);
Index: gcc/cp/lex.c
===
--- gcc/cp/lex.c	(revision 249835)
+++ gcc/cp/lex.c	(working copy)
@@ -531,6 +531,74 @@ unqualified_fn_lookup_error (cp_expr nam
   return unqualified_name_lookup_error (name, loc);
 }
 
+struct conv_type_hasher : ggc_ptr_hash
+{
+  static hashval_t hash (tree);
+  static bool equal (tree, tree);
+};
+
+/* This hash table maps TYPEs to the IDENTIFIER for a conversion
+   operator to TYPE.  The nodes are IDENTIFIERs whose TREE_TYPE is the
+   TYPE.  */
+
+static GTY (()) hash_table *conv_type_names;
+
+/* Hash a node (VAL1) in the table.  */
+
+hashval_t
+conv_type_hasher::hash (tree val)
+{
+  return (hashval_t) TYPE_UID (TREE_TYPE (val));
+}
+
+/* Compare VAL1 (a node in the table) with VAL2 (a TYPE).  */
+
+bool
+conv_type_hasher::equal (tree val1, tree val2)
+{
+  return TREE_TYPE (val1) == val2;
+}
+
+/* Return an identifier for a conversion operator to TYPE.  We can
+   get from the returned identifier to the type.  */
+
+tree
+make_conv_op_name (tree type)
+{
+  tree *slot;
+  tree identifier;
+
+  if (type == error_mark_node)
+return error_mark_node;
+
+  if (conv_type_names == NULL)
+conv_type_names = hash_table::create_ggc (31);
+
+  slot = conv_type_names->find_slot_with_hash (type,
+	   (hashval_t) TYPE_UID (type),
+	   INSERT);
+  identifier = *slot;
+  if (!identifier)
+{
+  char buffer[64];
+
+   /* Create a unique name corresponding to TYPE.  */
+  sprintf (buffer, "operator %lu",
+	   (unsigned long) conv_type_names->elements ());
+  identifier = get_identifier (buffer);
+  *slot = identifier;
+
+  /* Hang TYPE off the identifier so it can be found easily later
+	 when performing conversions.  */

Re: Convert profile probabilities to new type

2017-06-30 Thread Alexander Monakov

On Fri, 30 Jun 2017, Andreas Schwab wrote:

> This breaks ia64, a lot of testsuite failures like this:
> 
> FAIL: c-c++-common/torture/pr53505.c   -O3 -fomit-frame-pointer 
> -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal compiler 
> error)
> FAIL: c-c++-common/torture/pr53505.c   -O3 -fomit-frame-pointer 
> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
> errors)
> Excess errors:
> during RTL pass: mach
> /usr/local/gcc/gcc-20170630/gcc/testsuite/c-c++-common/torture/pr53505.c:26:1:
>  internal compiler error: in merge_expr, at sel-sched-ir.c:1886

The patch introduced a new inconsistency in sel-sched-ir.c:

@@ -4747,7 +4747,9 @@ compute_succs_info (insn_t insn, short flags)
   sinfo->probs_ok.safe_push (
/* FIXME: Improve calculation when skipping
inner loop to exits.  */
-si.bb_end ? si.e1->probability : REG_BR_PROB_BASE);
+si.bb_end && si.e1->probability.initialized_p ()
+   ? si.e1->probability.to_reg_br_prob_base ()
+   : REG_BR_PROB_BASE);
   sinfo->succs_ok_n++;
 }
   else
@@ -4756,8 +4758,8 @@ compute_succs_info (insn_t insn, short flags)
   /* Compute all_prob.  */
   if (!si.bb_end)
 sinfo->all_prob = REG_BR_PROB_BASE;
-  else
-sinfo->all_prob += si.e1->probability;
+  else if (si.e1->probability.initialized_p ())
+sinfo->all_prob += si.e1->probability.to_reg_br_prob_base ();

   sinfo->all_succs_n++;
 }

After the change, edges that fail the predicate contribute REG_BR_PROB_BASE in
the upper hunk, but 0 in the lower hunk. Before the change, they contributed 0
in both cases.

The following patch should restore things:

diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index dd72828..fa88259 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -4747,8 +4747,10 @@ compute_succs_info (insn_t insn, short flags)
   sinfo->probs_ok.safe_push (
/* FIXME: Improve calculation when skipping
inner loop to exits.  */
-si.bb_end && si.e1->probability.initialized_p ()
-   ? si.e1->probability.to_reg_br_prob_base ()
+si.bb_end
+   ? (si.e1->probability.initialized_p ()
+   ? si.e1->probability.to_reg_br_prob_base ()
+   : 0)
: REG_BR_PROB_BASE);
   sinfo->succs_ok_n++;
 }

Re: [PATCH][AArch64] Fix ILP32 memory access

2017-06-30 Thread Andreas Schwab

On Jun 27 2017, Wilco Dijkstra <wilco.dijks...@arm.com> wrote:

> @@ -5207,6 +5209,7 @@ aarch64_print_operand (FILE *f, rtx x, int code)
>  
>   case MEM:
> output_address (GET_MODE (x), XEXP (x, 0));
> +   gcc_assert (GET_MODE (XEXP (x, 0)) == Pmode);
> break;
>  
>   case CONST:

That breaks a lot of gnat tests in ilp32 mode:

spawn -ignore SIGHUP /opt/gcc/gcc-20170630/Build/gcc/gnatmake 
--GCC=/opt/gcc/gcc-20170630/Build/gcc/xgcc 
--GNATBIND=/opt/gcc/gcc-20170630/Build/gcc/gnatbind 
--GNATLINK=/opt/gcc/gcc-20170630/Build/gcc/gnatlink -cargs 
-B/opt/gcc/gcc-20170630/Build/gcc -largs 
--GCC=/opt/gcc/gcc-20170630/Build/gcc/xgcc -B/opt/gcc/gcc-20170630/Build/gcc  
-mabi=ilp32 -margs 
--RTS=/opt/gcc/gcc-20170630/Build/aarch64-suse-linux/ilp32/libada -q -f 
/opt/gcc/gcc-20170630/gcc/testsuite/gnat.dg/abstract_with_anonymous_result.adb 
-mabi=ilp32 -fno-diagnostics-show-caret -fdiagnostics-color=never -lm -o 
./abstract_with_anonymous_result.exe
+===GNAT BUG DETECTED==+
| 8.0.0 20170630 (experimental) [trunk revision 249826] (aarch64-suse-linux) 
GCC error:|
| in aarch64_print_operand, at config/aarch64/aarch64.c:5271   |
| Error detected around 
/opt/gcc/gcc-20170630/gcc/testsuite/gnat.dg/abstract_with_anonymous_result.adb:37:5|
| Please submit a bug report; see https://gcc.gnu.org/bugs/ .  |
| Use a subject line meaningful to you and us to track the bug.|
| Include the entire contents of this bug box in the report.   |
| Include the exact command that you entered.  |
| Also include sources listed below.   |
+==+


Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Re: [PATCH] [Aarch64] Variable shift count truncation issues

2017-06-30 Thread Andreas Schwab

On Jun 23 2017, Michael Collison  wrote:

> diff --git a/gcc/testsuite/gcc.target/aarch64/var_shift_mask_1.c 
> b/gcc/testsuite/gcc.target/aarch64/var_shift_mask_1.c
> new file mode 100644
> index 000..e2b020e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/var_shift_mask_1.c
> @@ -0,0 +1,61 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* The integer variable shift and rotate instructions truncate their
> +   shift amounts by the datasize.  Make sure that we don't emit a redundant
> +   masking operation.  */
> +
> +unsigned
> +f1 (unsigned x, int y)
> +{
> +  return x << (y & 31);
> +}
> +
> +unsigned long
> +f2 (unsigned long x, int y)
> +{
> +  return x << (y & 63);
> +}
> +
> +unsigned long
> +f3 (unsigned long bit_addr, int y)
> +{
> +  unsigned long bitnumb = bit_addr & 63;
> +  return (1L << bitnumb);
> +}
> +
> +unsigned int
> +f4 (unsigned int x, unsigned int y)
> +{
> +  y &= 31;
> +  return x >> y | (x << (32 - y));
> +}
> +
> +unsigned long
> +f5 (unsigned long x, unsigned long y)
> +{
> +  y &= 63;
> +  return x >> y | (x << (64 - y));
> +}
> +
> +unsigned long
> +f6 (unsigned long x, unsigned long y)
> +{
> +
> +  return (x << (64 - (y & 63)));
> +
> +}
> +
> +unsigned long
> +f7 (unsigned long x, unsigned long y)
> +{
> +  return (x << -(y & 63));
> +}
> +
> +/* { dg-final { scan-assembler-times "lsl\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 
> 1 } } */
> +/* { dg-final { scan-assembler-times "lsl\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" 
> 4 } } */
> +/* { dg-final { scan-assembler-times "ror\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 
> 1 } } */
> +/* { dg-final { scan-assembler-times "ror\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+" 
> 1 } } */
> +/* { dg-final { scan-assembler-not "and\tw\[0-9\]+, w\[0-9\]+, 31" } } */
> +/* { dg-final { scan-assembler-not "and\tx\[0-9\]+, x\[0-9\]+, 63" } } */
> +/* { dg-final { scan-assembler-not "sub\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" } 
> } */

That fails with -mabi=ilp32.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Re: [C++ PATCH] "decomposition declaration" -> "structured binding" in C++ diagnostics

2017-06-30 Thread Nathan Sidwell


On 06/30/2017 01:26 PM, Jakub Jelinek wrote:

Hi!

As C++17 decided to rename decompositions to structured bindings, this
patch attempts to adjust the diagnostics by replacing
"decomposition declaration" with "structured binding".
Or shall I use "structured binding declaration" instead (or is that too
longer/verbose), or something different?


IMHO 'structured binding' is fine.

nathan

--
Nathan Sidwell

Re: [C++] Fix decomp ICE with invalid initializer (PR c++/81258)

2017-06-30 Thread Nathan Sidwell


On 06/30/2017 01:24 PM, Jakub Jelinek wrote:


The initializer for structured binding has to be one of:
= assignment-expression
( assignment-expression )
{ assignment-expression }
but cp_parser_initializer can parse other forms, with fewer or more
expressions in there.  Some cases we caught with various cryptic errors
or pedwarns, but others we just ICEd on.

The following patch attempts to check this.


ok, but ...


--- gcc/testsuite/g++.dg/cpp1z/decomp21.C.jj2017-01-19 17:01:21.0 
+0100
+++ gcc/testsuite/g++.dg/cpp1z/decomp21.C   2017-06-30 11:07:04.786746784 
+0200
@@ -12,5 +12,5 @@ foo ()
auto [ n, o, p ] { a };
auto [ q, r, t ] ( s );
auto [ u, v, w ] ( s, );  // { dg-error "expected primary-expression before 
'.' token" }
-  auto [ x, y, z ] ( a );   // { dg-error "expression list treated as compound expression 
in initializer" "" { target *-*-* } .-1 }
+  auto [ x, y, z ] ( a );   // { dg-error "invalid initializer for structured 
binding" "" { target *-*-* } .-1 }
  }


The .-1 on the final error is actually about the previous statement, not 
the line it's lexically on.  Could you put it on a line on its own, 
while you're there?


--
Nathan Sidwell

[PATCH] Fix bb-reorder asm goto handling (PR sanitizer/81262)

2017-06-30 Thread Jakub Jelinek

Hi!

The following testcases now ICE on the trunk.  The problem is that
fix_up_fall_thru_edges doesn't notice asm goto does have a fallthru edge
when it has 3 edges and the EDGE_FALLTHRU is only 3rd.  Fixed by using
find_fallthru_edge if we didn't find it among the first 2 edges no matter
what the branch kind is.

Another bug is that the cond_jump variable is not really cleared and thus
once it is set to something on one of the bbs, it could be used later on
completely different bb.  This got fixed by moving the vars into the scopes
where they IMHO belong.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-06-30  Jakub Jelinek  

PR sanitizer/81262
* bb-reorder.c (fix_up_fall_thru_edges): Move variable declarations to
the right scopes, make sure cond_jump isn't preserved between multiple
iterations.  Search for fallthru edge whenever there are 3+ edges and
use find_fallthru_edge for it.

* gcc.c-torture/compile/pr81262.c: New test.
* g++.dg/ubsan/pr81262.C: New test.

--- gcc/bb-reorder.c.jj 2017-06-30 09:49:32.0 +0200
+++ gcc/bb-reorder.c2017-06-30 13:31:06.709898101 +0200
@@ -1812,18 +1812,15 @@ static void
 fix_up_fall_thru_edges (void)
 {
   basic_block cur_bb;
-  basic_block new_bb;
-  edge succ1;
-  edge succ2;
-  edge fall_thru;
-  edge cond_jump = NULL;
-  bool cond_jump_crosses;
-  int invert_worked;
-  rtx_insn *old_jump;
-  rtx_code_label *fall_thru_label;
 
   FOR_EACH_BB_FN (cur_bb, cfun)
 {
+  edge succ1;
+  edge succ2;
+  edge fall_thru = NULL;
+  edge cond_jump = NULL;
+  rtx_code_label *fall_thru_label;
+
   fall_thru = NULL;
   if (EDGE_COUNT (cur_bb->succs) > 0)
succ1 = EDGE_SUCC (cur_bb, 0);
@@ -1849,20 +1846,8 @@ fix_up_fall_thru_edges (void)
  fall_thru = succ2;
  cond_jump = succ1;
}
-  else if (succ1
-  && (block_ends_with_call_p (cur_bb)
-  || can_throw_internal (BB_END (cur_bb
-   {
- edge e;
- edge_iterator ei;
-
- FOR_EACH_EDGE (e, ei, cur_bb->succs)
-   if (e->flags & EDGE_FALLTHRU)
- {
-   fall_thru = e;
-   break;
- }
-   }
+  else if (succ2 && EDGE_COUNT (cur_bb->succs) > 2)
+   fall_thru = find_fallthru_edge (cur_bb->succs);
 
   if (fall_thru && (fall_thru->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)))
{
@@ -1873,9 +1858,9 @@ fix_up_fall_thru_edges (void)
  /* The fall_thru edge crosses; now check the cond jump edge, if
 it exists.  */
 
- cond_jump_crosses = true;
- invert_worked  = 0;
- old_jump = BB_END (cur_bb);
+ bool cond_jump_crosses = true;
+ int invert_worked = 0;
+ rtx_insn *old_jump = BB_END (cur_bb);
 
  /* Find the jump instruction, if there is one.  */
 
@@ -1895,12 +1880,13 @@ fix_up_fall_thru_edges (void)
  /* Find label in fall_thru block. We've already added
 any missing labels, so there must be one.  */
 
- fall_thru_label = block_label (fall_thru->dest);
+ rtx_code_label *fall_thru_label
+   = block_label (fall_thru->dest);
 
  if (old_jump && fall_thru_label)
{
- rtx_jump_insn *old_jump_insn =
-   dyn_cast  (old_jump);
+ rtx_jump_insn *old_jump_insn
+   = dyn_cast  (old_jump);
  if (old_jump_insn)
invert_worked = invert_jump (old_jump_insn,
 fall_thru_label, 0);
@@ -1931,7 +1917,7 @@ fix_up_fall_thru_edges (void)
 becomes EDGE_CROSSING.  */
 
  fall_thru->flags &= ~EDGE_CROSSING;
- new_bb = force_nonfallthru (fall_thru);
+ basic_block new_bb = force_nonfallthru (fall_thru);
 
  if (new_bb)
{
--- gcc/testsuite/gcc.c-torture/compile/pr81262.c.jj2017-06-30 
13:30:06.493624559 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr81262.c   2017-06-30 
13:30:15.000521931 +0200
@@ -0,0 +1,14 @@
+/* PR sanitizer/81262 */
+
+void bar (void) __attribute__((cold, noreturn));
+
+int
+foo (void)
+{
+  asm goto ("" : : : : l1, l2);
+  bar ();
+ l1:
+  return 1;
+ l2:
+  return 0;
+}
--- gcc/testsuite/g++.dg/ubsan/pr81262.C.jj 2017-06-30 13:25:59.339606262 
+0200
+++ gcc/testsuite/g++.dg/ubsan/pr81262.C2017-06-30 13:26:08.563494984 
+0200
@@ -0,0 +1,14 @@
+// PR sanitizer/81262
+// { dg-do compile }
+// { dg-options "-O2 -fsanitize=unreachable" }
+
+int
+foo ()
+{
+  asm goto ("" : : : : l1, l2);
+  __builtin_unreachable ();
+ l1:
+  return 1;
+ l2:
+  return 0;
+}

Jakub

[C++ PATCH] "decomposition declaration" -> "structured binding" in C++ diagnostics

2017-06-30 Thread Jakub Jelinek

Hi!

As C++17 decided to rename decompositions to structured bindings, this
patch attempts to adjust the diagnostics by replacing
"decomposition declaration" with "structured binding".
Or shall I use "structured binding declaration" instead (or is that too
longer/verbose), or something different?

Bootstrapped/regtested on x86_64-linux and i686-linux.

2017-06-30  Jakub Jelinek  

* parser.c (cp_parser_decomposition_declaration): Replace
decomposition declaration with structured binding in diagnostics.
* decl.c (cp_finish_decomp): Likewise.
(grokdeclarator): Likewise.

* g++.dg/cpp1z/decomp1.C: Expect structured binding instead of
decomposition declaration in diagnostics.
* g++.dg/cpp1z/decomp2.C: Likewise.
* g++.dg/cpp1z/decomp3.C: Likewise.
* g++.dg/cpp1z/decomp4.C: Likewise.
* g++.dg/cpp1z/decomp5.C: Likewise.
* g++.dg/cpp1z/decomp6.C: Likewise.
* g++.dg/cpp1z/decomp7.C: Likewise.
* g++.dg/cpp1z/decomp8.C: Likewise.
* g++.dg/cpp1z/decomp13.C: Likewise.
* g++.dg/cpp1z/decomp14.C: Likewise.
* g++.dg/cpp1z/decomp18.C: Likewise.
* g++.dg/cpp1z/decomp19.C: Likewise.
* g++.dg/cpp1z/decomp22.C: Likewise.
* g++.dg/cpp1z/decomp23.C: Likewise.
* g++.dg/cpp1z/decomp24.C: Likewise.
* g++.dg/cpp1z/decomp25.C: Likewise.
* g++.dg/cpp1z/decomp26.C: Likewise.
* g++.dg/cpp1z/decomp28.C: Likewise.

--- gcc/cp/parser.c.jj  2017-06-30 11:03:18.0 +0200
+++ gcc/cp/parser.c 2017-06-30 11:23:26.430710676 +0200
@@ -13136,7 +13136,7 @@ cp_parser_decomposition_declaration (cp_
 }
 
   if (cxx_dialect < cxx1z)
-pedwarn (loc, 0, "decomposition declaration only available with "
+pedwarn (loc, 0, "structured binding only available with "
 "-std=c++1z or -std=gnu++1z");
 
   tree pushed_scope;
@@ -13185,7 +13185,7 @@ cp_parser_decomposition_declaration (cp_
 
   if (v.is_empty ())
 {
-  error_at (loc, "empty decomposition declaration");
+  error_at (loc, "empty structured binding");
   decl = error_mark_node;
 }
 
--- gcc/cp/decl.c.jj2017-06-30 09:49:24.0 +0200
+++ gcc/cp/decl.c   2017-06-30 11:22:59.496040927 +0200
@@ -7486,8 +7486,8 @@ cp_finish_decomp (tree decl, tree first,
 
  if (init == error_mark_node || eltype == error_mark_node)
{
- inform (dloc, "in initialization of decomposition variable %qD",
- v[i]);
+ inform (dloc, "in initialization of structured binding "
+ "variable %qD", v[i]);
  goto error_out;
}
  /* Save the decltype away before reference collapse.  */
@@ -10137,7 +10137,7 @@ grokdeclarator (const cp_declarator *dec
  break;
 
case cdk_decomp:
- name = "decomposition";
+ name = "structured binding";
  break;
 
case cdk_error:
@@ -10591,43 +10591,43 @@ grokdeclarator (const cp_declarator *dec
? declarator->declarator->id_loc : declarator->id_loc);
   if (inlinep)
error_at (declspecs->locations[ds_inline],
- "decomposition declaration cannot be declared %");
+ "structured binding cannot be declared %");
   if (typedef_p)
error_at (declspecs->locations[ds_typedef],
- "decomposition declaration cannot be declared %");
+ "structured binding cannot be declared %");
   if (constexpr_p)
-   error_at (declspecs->locations[ds_constexpr], "decomposition "
- "declaration cannot be declared %");
+   error_at (declspecs->locations[ds_constexpr], "structured "
+ "binding cannot be declared %");
   if (thread_p)
error_at (declspecs->locations[ds_thread],
- "decomposition declaration cannot be declared %qs",
+ "structured binding cannot be declared %qs",
  declspecs->gnu_thread_keyword_p
  ? "__thread" : "thread_local");
   if (concept_p)
error_at (declspecs->locations[ds_concept],
- "decomposition declaration cannot be declared %");
+ "structured binding cannot be declared %");
   switch (storage_class)
{
case sc_none:
  break;
case sc_register:
- error_at (loc, "decomposition declaration cannot be declared "
+ error_at (loc, "structured binding cannot be declared "
"%");
  break;
case sc_static:
- error_at (loc, "decomposition declaration cannot be declared "
+ error_at (loc, "structured binding cannot be declared "
"%");
  break;
case sc_extern:
- error_at (loc,

[C++] Fix decomp ICE with invalid initializer (PR c++/81258)

2017-06-30 Thread Jakub Jelinek

Hi!

The initializer for structured binding has to be one of:
= assignment-expression
( assignment-expression )
{ assignment-expression }
but cp_parser_initializer can parse other forms, with fewer or more
expressions in there.  Some cases we caught with various cryptic errors
or pedwarns, but others we just ICEd on.

The following patch attempts to check this.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-06-30  Jakub Jelinek  

PR c++/81258
* parser.c (cp_parser_decomposition_declaration): Diagnose invalid
forms of structured binding initializers.

* g++.dg/cpp1z/decomp21.C (foo): Adjust expected diagnostics.
* g++.dg/cpp1z/decomp30.C: New test.

--- gcc/cp/parser.c.jj  2017-06-30 09:49:25.0 +0200
+++ gcc/cp/parser.c 2017-06-30 11:03:18.526521000 +0200
@@ -13196,6 +13196,15 @@ cp_parser_decomposition_declaration (cp_
   *init_loc = cp_lexer_peek_token (parser->lexer)->location;
   tree initializer = cp_parser_initializer (parser, _direct_init,
_constant_p);
+  if (initializer == NULL_TREE
+ || (TREE_CODE (initializer) == TREE_LIST
+ && TREE_CHAIN (initializer))
+ || (TREE_CODE (initializer) == CONSTRUCTOR
+ && CONSTRUCTOR_NELTS (initializer) != 1))
+   {
+ error_at (loc, "invalid initializer for structured binding");
+ initializer = error_mark_node;
+   }
 
   if (decl != error_mark_node)
{
--- gcc/testsuite/g++.dg/cpp1z/decomp21.C.jj2017-01-19 17:01:21.0 
+0100
+++ gcc/testsuite/g++.dg/cpp1z/decomp21.C   2017-06-30 11:07:04.786746784 
+0200
@@ -12,5 +12,5 @@ foo ()
   auto [ n, o, p ] { a };
   auto [ q, r, t ] ( s );
   auto [ u, v, w ] ( s, );  // { dg-error "expected primary-expression 
before '.' token" }
-  auto [ x, y, z ] ( a );   // { dg-error "expression list treated as 
compound expression in initializer" "" { target *-*-* } .-1 }
+  auto [ x, y, z ] ( a );   // { dg-error "invalid initializer for 
structured binding" "" { target *-*-* } .-1 }
 }
--- gcc/testsuite/g++.dg/cpp1z/decomp30.C.jj2017-06-30 11:09:31.934942575 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/decomp30.C   2017-06-30 11:09:22.0 
+0200
@@ -0,0 +1,12 @@
+// PR c++/81258
+// { dg-options -std=c++1z }
+
+int a[2];
+auto [b, c] (a);
+auto [d, e] { a };
+auto [f, g] = a;
+auto [h, i] ( a, a );  // { dg-error "invalid initializer for structured 
binding" }
+auto [j, k] { a, a };  // { dg-error "invalid initializer for structured 
binding" }
+auto [l, m] = { a };   // { dg-error "deducing from brace-enclosed initializer 
list requires" }
+auto [n, o] {};// { dg-error "invalid initializer for 
structured binding" }
+auto [p, q] ();// { dg-error "invalid initializer for 
structured binding" }

Jakub

[PATCH] stringpool clean

2017-06-30 Thread Nathan Sidwell

The string pool provides 'empty_string', a zero-length string.  But we 
only use it in 3 places, preferring the more obvious "" elsewhere. 
These days we have linkers that can do string table merging, so any size 
optimimization empty_string might have bought us is long gone.  So 
nuking empty_string.


Then the ggc_string allocator decides to use  ggc_internal_cleared_alloc 
before replacing the just-cleared contents with the string being 
allocated.  So let's just call ggc_alloc_atomic.


Finally, there's some curious optimization for strings of one digit.  I 
fail to see the point of that, so also nuked.


I did discover that the dwarf machinery likes creating zero length 
strings, so I kept that optimization.


applied as obvious.

nathan
--
Nathan Sidwell
2017-06-30  Nathan Sidwell  

	* ggc.h (empty_string): Delete.
	* cfgexpand.c (expand_asm_stmt): Use plain "".
	* optabs.c (expand_asm_memory_barrier): Likewise.
	* stringpool.c (empty_string): Delete.
	(digit_vector, digit_string): Delete.
	(ggc_alloc_string): Use plain "", don't optimize single digit
	strings.  Use ggc_alloc_atomic.

Index: cfgexpand.c
===
--- cfgexpand.c	(revision 249835)
+++ cfgexpand.c	(working copy)
@@ -3165,7 +3165,7 @@ expand_asm_stmt (gasm *stmt)
   rtx body = gen_rtx_ASM_OPERANDS ((noutputs == 0 ? VOIDmode
 : GET_MODE (output_rvec[0])),
    ggc_strdup (gimple_asm_string (stmt)),
-   empty_string, 0, argvec, constraintvec,
+   "", 0, argvec, constraintvec,
    labelvec, locus);
   MEM_VOLATILE_P (body) = gimple_asm_volatile_p (stmt);
 
Index: ggc.h
===
--- ggc.h	(revision 249848)
+++ ggc.h	(working copy)
@@ -24,9 +24,6 @@ along with GCC; see the file COPYING3.
 /* Symbols are marked with `ggc' for `gcc gc' so as not to interfere with
an external gc library that might be linked in.  */
 
-/* Constants for general use.  */
-extern const char empty_string[];	/* empty string */
-
 /* Internal functions and data structures used by the GTY
machinery, including the generated gt*.[hc] files.  */
 
Index: optabs.c
===
--- optabs.c	(revision 249835)
+++ optabs.c	(working copy)
@@ -6278,7 +6278,7 @@ expand_asm_memory_barrier (void)
 {
   rtx asm_op, clob;
 
-  asm_op = gen_rtx_ASM_OPERANDS (VOIDmode, empty_string, empty_string, 0,
+  asm_op = gen_rtx_ASM_OPERANDS (VOIDmode, "", "", 0,
  rtvec_alloc (0), rtvec_alloc (0),
  rtvec_alloc (0), UNKNOWN_LOCATION);
   MEM_VOLATILE_P (asm_op) = 1;
Index: stringpool.c
===
--- stringpool.c	(revision 249835)
+++ stringpool.c	(working copy)
@@ -30,18 +30,6 @@ along with GCC; see the file COPYING3.
 #include "coretypes.h"
 #include "tree.h"
 
-/* The "" allocated string.  */
-const char empty_string[] = "";
-
-/* Character strings, each containing a single decimal digit.
-   Written this way to save space.  */
-static const char digit_vector[] = {
-  '0', 0, '1', 0, '2', 0, '3', 0, '4', 0,
-  '5', 0, '6', 0, '7', 0, '8', 0, '9', 0
-};
-
-#define digit_string(d) (digit_vector + ((d) * 2))
-
 struct ht *ident_hash;
 
 static hashnode alloc_node (cpp_hash_table *);
@@ -82,19 +70,16 @@ alloc_node (cpp_hash_table *table ATTRIB
 const char *
 ggc_alloc_string (const char *contents, int length MEM_STAT_DECL)
 {
-  char *result;
-
   if (length == -1)
 length = strlen (contents);
 
-  if (length == 0)
-return empty_string;
-  if (length == 1 && ISDIGIT (contents[0]))
-return digit_string (contents[0] - '0');
+  if (!length)
+return "";
 
-  result = (char *) ggc_internal_cleared_alloc (length + 1 PASS_MEM_STAT);
+  char *result = (char *) ggc_alloc_atomic (length + 1);
   memcpy (result, contents, length);
   result[length] = '\0';
+
   return (const char *) result;
 }

Re: [PATCH, rs6000] Add support to __builtin_cpu_supports() for two new HWCAP2 bits

2017-06-30 Thread Peter Bergner

Segher, any response to my findings below about whether we should
try and share header files with GLIBC?

Peter


On 6/27/17 1:06 PM, Peter Bergner wrote:
> On 6/27/17 11:07 AM, Segher Boessenkool wrote:
>> Not use an installed header, that's not what I'm asking.  Share the
>> source file, i.e., just copy it over from the glibc source tree (it
>> should probably hold the master copy).  Fewer typos, cannot forget to
>> update some entry, etc.
> 
> So the glibc file is:
> 
>   sysdeps/powerpc/bits/hwcap.h
> 
> which contains only the #define PPC_FEATURE[2]_* definitions.
> The GCC file is:
> 
>   gcc/config/rs6000/ppc-auxv.h
> 
> and contains the same #define's as hwcap.h above, plus the additional
> #defines's:
> 
> /* The PLATFORM value stored in the TCB is offset by _DL_FIRST_PLATFORM.  */
> #define _DL_FIRST_PLATFORM 32
> 
> /* AT_PLATFORM bits.  These must match the values defined in GLIBC. */
> #define PPC_PLATFORM_POWER40
> #define PPC_PLATFORM_PPC9701
> #define PPC_PLATFORM_POWER52
> ...
> 
> which match values in glibc's sysdeps/powerpc/dl-procinfo.h, but that
> file contains a lot more than just the defines that we (GCC) doesn't
> want or need.
> 
> ppc-auxv.h also contains the following helper macros that calculate the
> fixed offsets to the TCB slots that glibc initializes, but glibc has
> access to the structs that the slows live in, so they don't need these
> helper macros and hence don't have them:
> 
> /* Thread Control Block (TCB) offsets of the AT_PLATFORM, AT_HWCAP and
>AT_HWCAP2 values.  These must match the values defined in GLIBC.  */
> #define TCB_PLATFORM_OFFSET ((TARGET_64BIT) ? -28764 : -28724)
> #define TCB_HWCAP_BASE_OFFSET ((TARGET_64BIT) ? -28776 : -28736)
> #define TCB_HWCAP1_OFFSET \
>   ((BYTES_BIG_ENDIAN) ? TCB_HWCAP_BASE_OFFSET : TCB_HWCAP_BASE_OFFSET+4)
> #define TCB_HWCAP2_OFFSET \
>   ((BYTES_BIG_ENDIAN) ? TCB_HWCAP_BASE_OFFSET+4 : TCB_HWCAP_BASE_OFFSET)
> #define TCB_HWCAP_OFFSET(ID) \
>   (((ID) == 0) ? TCB_HWCAP1_OFFSET : TCB_HWCAP2_OFFSET)
> 
> These are only used in rs6000.c, so I could move them there.
> 
> So given the above, how do we want to handle this?  If we were to copy a
> header file(s) over from glibc, are we able to modify it in the process?
> Ie, to remove the parts we don't need like hwcap.h's use of:
> 
>   #if !defined(_SYS_AUXV_H) && !defined(_SYSDEPS_SYSDEP_H)
>   # error "Never include  directly; use  instead."
>   #endif
> 
> which would trigger for our use of it.  And also to remove unneeded code from
> dl-procinfo.h, since we only want the #defines.
> 
> Peter
> 
>

Re: [PATCH v2][ASAN] Implement dynamic allocas/VLAs sanitization.

2017-06-30 Thread Maxim Ostapenko


Hi,

On 29/06/17 15:35, Jakub Jelinek wrote:

Hi!

Sorry for the review delay.

On Mon, Jun 26, 2017 at 03:49:23PM +0300, Maxim Ostapenko wrote:

(handle_builtin_stackrestore): Likewise.

The function is called with _ between stack and restore.


* match.pd: Add new pattern.

Unless the patch relies on this, I think it should be posted separately
and reviewed by Richard.


OK, good point, will remove from this patch.




@@ -245,6 +246,7 @@ along with GCC; see the file COPYING3.  If not see
  static unsigned HOST_WIDE_INT asan_shadow_offset_value;
  static bool asan_shadow_offset_computed;
  static vec sanitized_sections;
+static tree last_alloca_addr = NULL_TREE;

You are shadowing this variable in multiple places.  Either rename it to
something different, or rename the results of get_last_alloca_addr.
And the " = NULL_TREE" part is not needed.


Err, thanks, will fix it.



  
  /* Set of variable declarations that are going to be guarded by

 use-after-scope sanitizer.  */
@@ -529,11 +531,183 @@ get_mem_ref_of_assignment (const gassign *assignment,
return true;
  }
  
+/* Return address of last allocated dynamic alloca.  */

+
+static tree
+get_last_alloca_addr ()
+{
+  if (last_alloca_addr)
+return last_alloca_addr;
+
+  gimple_seq seq = NULL;
+  gassign *g;
+
+  last_alloca_addr = create_tmp_reg (ptr_type_node, "last_alloca_addr");
+  g = gimple_build_assign (last_alloca_addr, NOP_EXPR,
+  build_int_cst (ptr_type_node, 0));

Instead of build_int_cst (ptr_type_node, 0) you should use
null_pointer_node.  And the NOP_EXPR there is just wrong, either it
should be gimple_build_assign (last_alloca_addr, null_pointer_node);
or gimple_build_assign (last_alloca_addr, INTEGER_CST, null_pointer_node);


+  gimple_seq_add_stmt_without_update (, g);

Why the seq stuff at all?  You have a single stmt you want to insert on
edge.


Right, will fix it.


+
+  edge e = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  gsi_insert_seq_on_edge_immediate (e, seq);

So just use here
   gsi_insert_on_edge_immediate (e, g);
instead.


+  return last_alloca_addr;
+}
+
+/* Insert __asan_allocas_unpoison (top, bottom) call after
+   __builtin_stack_restore (new_sp) call.
+   The pseudocode of this routine should look like this:
+ __builtin_stack_restore (new_sp);
+ top = last_alloca_addr;
+ bot = virtual_dynamic_stack_rtx;
+ __asan_allocas_unpoison (top, bottom);
+ last_alloca_addr = new_sp;

The comment doesn't seem to agree with what you actually implement.
There is no virtual_dynamic_stack_rtx during the asan pass, it is there
only during expansion until the virtual regs are instantiated in the next
pass.  Furthermore, you have bot variable, but then use bottom.


Right, 'bottom' should be replaced by 'bot'.
Regarding to virtual_dynamic_stactk_rtx - as you correctly noted below, 
second parameter of __asan_allocas_unpoison will be completely rewritten 
in expand_builtin_alloca function by virtual_dynamic_stactk_rtx. Here 
I've just passed new_sp as a dummy argument. This looks hacky, but the 
problem is that for several architectures (e.g. PPC) we cannot use 
new_sp as a 'bot' parameter because new_sp != last_alloca_addr on these 
targets. Originally, I tried to do something like this:


  top = last_alloca_addr;
  bot = new_sp + STACK_DYNAMIC_OFFSET;
  __asan_allocas_unpoison(top, bot);
  last_alloca_addr = bot;

but I was confused by the fact that STACK_DYNAMIC_OFFSET becomes 
available only after expansion to RTL. Rewriting 'bot' with 
virtual_dynamic_stactk_rtx in RTL looks like the most feasible way how 
to overcome this issue for me.





+  tree last_alloca_addr = get_last_alloca_addr ();

Here is the shadowing I talked about.


+  tree restored_stack = gimple_call_arg (call, 0);
+  tree fn = builtin_decl_implicit (BUILT_IN_ASAN_ALLOCAS_UNPOISON);
+  gimple *g = gimple_build_call (fn, 2, last_alloca_addr, restored_stack);

Here you clearly use the first argument of __builtin_stack_restore, which
is that new_sp.


+  gimple_seq_add_stmt_without_update (, g);

Why the messing up with sequences?  Just insert the stmt immediately in,
and the others as well.


+  g = gimple_build_assign (last_alloca_addr, NOP_EXPR, restored_stack);

This is again wrong, here you really don't know what restored_stack is,
it could be SSA_NAME, but also something different, so you should use
gimple_build_assign (last_alloca_addr, restored_stack);
and let it figure out the rhs code.


Thanks, will fix.




+  /* Extract lower bits from old_size.  */
+  wide_int size_nonzero_bits = get_nonzero_bits (old_size);
+  wide_int rz_mask
+= wi::uhwi (redzone_mask, wi::get_precision (size_nonzero_bits));
+  wide_int old_size_lower_bits = wi::bit_and (size_nonzero_bits, rz_mask);
+
+  /* If alloca size is aligned to ASAN_RED_ZONE_SIZE, we don't need partial
+ redzone.  Otherwise, compute its size here.  */
+  if (wi::ne_p (old_size_lower_bits, 0))
+{
+  /*

Re: Convert profile probabilities to new type

2017-06-30 Thread Andreas Schwab

This breaks ia64, a lot of testsuite failures like this:

FAIL: c-c++-common/torture/pr53505.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
FAIL: c-c++-common/torture/pr53505.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
Excess errors:
during RTL pass: mach
/usr/local/gcc/gcc-20170630/gcc/testsuite/c-c++-common/torture/pr53505.c:26:1: 
internal compiler error: in merge_expr, at sel-sched-ir.c:1886
0x410b897f merge_expr(_expr*, _expr*, rtx_insn*)
../../gcc/sel-sched-ir.c:1886
0x410c809f av_set_union_and_live(_list_node**, _list_node**, 
bitmap_head*, bitmap_head*, rtx_insn*)
../../gcc/sel-sched-ir.c:2278
0x410e430f compute_av_set_at_bb_end
../../gcc/sel-sched.c:2808
0x410e430f compute_av_set_inside_bb
../../gcc/sel-sched.c:2980
0x410e52ff move_op_at_first_insn
../../gcc/sel-sched.c:6070
0x410e889f code_motion_path_driver
../../gcc/sel-sched.c:6659
0x410e90af code_motion_process_successors
../../gcc/sel-sched.c:6346
0x410e90af code_motion_path_driver
../../gcc/sel-sched.c:6612
0x410ea57f move_op
../../gcc/sel-sched.c:6704
0x410ea57f move_exprs_to_boundary
../../gcc/sel-sched.c:5227
0x410ea57f schedule_expr_on_boundary
../../gcc/sel-sched.c:5440
0x410f4e2f fill_insns
../../gcc/sel-sched.c:5582
0x410f4e2f schedule_on_fences
../../gcc/sel-sched.c:7356
0x410f4e2f sel_sched_region_2
../../gcc/sel-sched.c:7494
0x410fe48f sel_sched_region_1
../../gcc/sel-sched.c:7536
0x410fe48f sel_sched_region(int)
../../gcc/sel-sched.c:7637
0x410ff60f run_selective_scheduling()
../../gcc/sel-sched.c:7713
0x41a7e6cf ia64_reorg
../../gcc/config/ia64/ia64.c:9764
0x4105564f execute
../../gcc/reorg.c:3946

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

[gomp5] Fix ICE on clauses-1.c testcase

2017-06-30 Thread Jakub Jelinek

Hi!

Another thing I've missed in the testing (or lack thereof) of the
task reduction parsing patch.  Reduction is a data sharing clause
that is best located on the task construct for the taskloop IL sandwich,
it will be of course added as argument to the GOMP_taskloop call
when implemented later.

Regtested on x86_64-linux, committed to gomp-5_0-branch.

2017-06-30  Jakub Jelinek  

* gimplify.c (gimplify_omp_for): Move OMP_CLAUSE_REDUCTION
and OMP_CLAUSE_IN_REDUCTION from taskloop to the task construct
sandwiched in between two taskloops.

--- gcc/gimplify.c.jj   2017-05-24 14:45:26.0 +0200
+++ gcc/gimplify.c  2017-06-30 17:22:21.401090633 +0200
@@ -10273,6 +10273,8 @@ gimplify_omp_for (tree *expr_p, gimple_s
  case OMP_CLAUSE_FINAL:
  case OMP_CLAUSE_MERGEABLE:
  case OMP_CLAUSE_PRIORITY:
+ case OMP_CLAUSE_REDUCTION:
+ case OMP_CLAUSE_IN_REDUCTION:
*gtask_clauses_ptr = c;
gtask_clauses_ptr = _CLAUSE_CHAIN (c);
break;

Jakub

[GC PATCH] Clean ggc.h

2017-06-30 Thread Nathan Sidwell

ggc.h had a bit of funky formatting and widespread pre-C++ idiom 'static 
inline'.  Fixed as obvious.


template  static inline T *

just looks weird :)

nathan
--
Nathan Sidwell
2017-06-30  Nathan Sidwell  

	* ggc.h: Replace all 'static inline' with plain 'inline'.  Fix
	some formatting.

Index: ggc.h
===
--- ggc.h	(revision 249835)
+++ ggc.h	(working copy)
@@ -127,9 +127,8 @@ extern void *ggc_internal_alloc (size_t,
  size_t CXX_MEM_STAT_INFO)
  ATTRIBUTE_MALLOC;
 
- static inline
- void *
- ggc_internal_alloc (size_t s CXX_MEM_STAT_INFO)
+inline void *
+ggc_internal_alloc (size_t s CXX_MEM_STAT_INFO)
 {
   return ggc_internal_alloc (s, NULL, 0, 1 PASS_MEM_STAT);
 }
@@ -141,8 +140,7 @@ extern void *ggc_internal_cleared_alloc
 	 size_t, size_t
 	 CXX_MEM_STAT_INFO) ATTRIBUTE_MALLOC;
 
-static inline
-void *
+inline void *
 ggc_internal_cleared_alloc (size_t s CXX_MEM_STAT_INFO)
 {
   return ggc_internal_cleared_alloc (s, NULL, 0, 1 PASS_MEM_STAT);
@@ -168,7 +166,7 @@ finalize (void *p)
 }
 
 template
-static inline bool
+inline bool
 need_finalization_p ()
 {
 #if GCC_VERSION >= 4003
@@ -179,7 +177,7 @@ need_finalization_p ()
 }
 
 template
-static inline T *
+inline T *
 ggc_alloc (ALONE_CXX_MEM_STAT_INFO)
 {
   if (need_finalization_p ())
@@ -191,7 +189,7 @@ ggc_alloc (ALONE_CXX_MEM_STAT_INFO)
 }
 
 template
-static inline T *
+inline T *
 ggc_cleared_alloc (ALONE_CXX_MEM_STAT_INFO)
 {
   if (need_finalization_p ())
@@ -204,7 +202,7 @@ ggc_cleared_alloc (ALONE_CXX_MEM_STAT_IN
 }
 
 template
-static inline T *
+inline T *
 ggc_vec_alloc (size_t c CXX_MEM_STAT_INFO)
 {
   if (need_finalization_p ())
@@ -216,7 +214,7 @@ ggc_vec_alloc (size_t c CXX_MEM_STAT_INF
 }
 
 template
-static inline T *
+inline T *
 ggc_cleared_vec_alloc (size_t c CXX_MEM_STAT_INFO)
 {
   if (need_finalization_p ())
@@ -229,7 +227,7 @@ ggc_cleared_vec_alloc (size_t c CXX_MEM_
 			 0, 0 PASS_MEM_STAT));
 }
 
-static inline void *
+inline void *
 ggc_alloc_atomic (size_t s CXX_MEM_STAT_INFO)
 {
 return ggc_internal_alloc (s PASS_MEM_STAT);
@@ -274,52 +272,52 @@ extern void init_ggc_heuristics (void);
 
 /* Memory statistics passing versions of some allocators.  Too few of them to
make gengtype produce them, so just define the needed ones here.  */
-static inline struct rtx_def *
+inline struct rtx_def *
 ggc_alloc_rtx_def_stat (size_t s CXX_MEM_STAT_INFO)
 {
   return (struct rtx_def *) ggc_internal_alloc (s PASS_MEM_STAT);
 }
 
-static inline union tree_node *
+inline union tree_node *
 ggc_alloc_tree_node_stat (size_t s CXX_MEM_STAT_INFO)
 {
   return (union tree_node *) ggc_internal_alloc (s PASS_MEM_STAT);
 }
 
-static inline union tree_node *
+inline union tree_node *
 ggc_alloc_cleared_tree_node_stat (size_t s CXX_MEM_STAT_INFO)
 {
   return (union tree_node *) ggc_internal_cleared_alloc (s PASS_MEM_STAT);
 }
 
-static inline gimple *
+inline gimple *
 ggc_alloc_cleared_gimple_statement_stat (size_t s CXX_MEM_STAT_INFO)
 {
   return (gimple *) ggc_internal_cleared_alloc (s PASS_MEM_STAT);
 }
 
-static inline void
+inline void
 gt_ggc_mx (const char *s)
 {
   ggc_test_and_set_mark (const_cast (s));
 }
 
-static inline void
+inline void
 gt_pch_nx (const char *)
 {
 }
 
-static inline void
+inline void
 gt_ggc_mx (int)
 {
 }
 
-static inline void
+inline void
 gt_pch_nx (int)
 {
 }
 
-static inline void
+inline void
 gt_pch_nx (unsigned int)
 {
 }

Re: [PATCH GCC][2/2]Refine CFG and bound information for split loops

2017-06-30 Thread Jeff Law

On 06/14/2017 07:08 AM, Bin Cheng wrote:
> Hi,
> Loop split currently generates below control flow graph for split loops:
> +
> +   .-- guard1  --.
> +   v v
> + pre1(loop1).-->pre2(loop2)
> +  | ||
> +.--->h1 |   h2<.
> +| | || |
> +|ex1---.|   .---ex2|
> +|/ v|   | \|
> +'---l1 guard2---'   | l2---'
> +   ||
> +   ||
> +   '--->join<---'
> +
> In which,
> +   LOOP2 is the second loop after split, GUARD1 and GUARD2 are the two bbs
> +   controling if loop2 should be executed.
> 
> Take added test case as an example, the second loop only iterates for 1 time,
> as a result, the CFG and loop niter bound information can be refined.  In 
> general,
> guard1/guard2 can be folded to true/false if loop2's niters is known at 
> compilation
> time.  This patch does such improvement by analyzing and refining niters of
> loop2; as well as using that information to simplify CFG.  With this patch,
> the second split loop as in test can be completely unrolled by later passes.
In your testcase the second loop iterates precisely once, right?  In
fact, we know it always executes precisely one time regardless of the
incoming value of LEN.  That's the property you're trying to detect and
exploit, right?


> 
> Bootstrap and test on x86_64 and AArch64.  Is it OK?
> 
> Thanks,
> bin
> 2017-06-12  Bin Cheng  
> 
>   * tree-ssa-loop-split.c (compute_new_first_bound): Compute and
>   return bound information for the second split loop.
>   (adjust_loop_split): New function.
>   (split_loop): Update calls to above functions.
> 
> gcc/testsuite/ChangeLog
> 2017-06-12  Bin Cheng  
> 
>   * gcc.dg/loop-split-1.c: New test.
> 
> 
> 0002-lsplit-refine-cfg-niter-bound-20170601.txt
> 
> 
> From 61855c74c7db6178008f007198aedee9a03f13e6 Mon Sep 17 00:00:00 2001
> From: amker 
> Date: Sun, 4 Jun 2017 02:26:34 +0800
> Subject: [PATCH 2/2] lsplit-refine-cfg-niter-bound-20170601.txt
> 
> ---
>  gcc/testsuite/gcc.dg/loop-split-1.c |  16 +++
>  gcc/tree-ssa-loop-split.c   | 197 
> 
>  2 files changed, 194 insertions(+), 19 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/loop-split-1.c
> 
> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
> index f8fe8e6..73c7dc2 100644
> --- a/gcc/tree-ssa-loop-split.c
> +++ b/gcc/tree-ssa-loop-split.c
> @@ -387,28 +386,41 @@ connect_loops (struct loop *loop1, struct loop *loop2)
>  
> Depending on the direction of the IVs and if the exit tests
> are strict or non-strict we need to use MIN or MAX,
> -   and add or subtract 1.  This routine computes newend above.  */
> +   and add or subtract 1.  This routine computes newend above.
> +
> +   After computing the new bound (on j), we may be able to compare the
> +   first loop's iteration space against the original loop's.  If it is
> +   comparable at compilation time, we may be able to compute the niter
> +   bound of the second loop.  Record the second loop's iteration bound
> +   to SECOND_LOOP_NITER_BOUND which has below meaning:
> +
> + -3: Don't know anything about the second loop;
> + -2: The second loop must not be executed;
> + -1: The second loop must be executed, but niter bound is unknown;
> +  n: The second loop must be executed, niter bound is n (>= 0);
> +
> +   Note we compute niter bound for the second loop's latch.  */
How hard would it be to have a test for each of the 4 cases above?  I
don't always ask for this, but ISTM this is a good chance to make sure
most of the new code gets covered by testing.

Does case -2 really occur in practice or are you just being safe?  ISTM
that if case -2 occurs then the loop shouldn't have been split to begin
with.  If this happens in practice, do we clean up all the dead code if
the bounds are set properly.

Similarly for N = 0, presumably meaning the second loop iterates exactly
once, but never traverses its backedge, is there any value in changing
the loop latch test so that it's always false?  Or will that get cleaned
up later?


>  
>  static tree
> -compute_new_first_bound (gimple_seq *stmts, struct tree_niter_desc *niter,
> -  tree border,
> -  enum tree_code guard_code, tree guard_init)
> +compute_new_first_bound (struct tree_niter_desc *niter, tree border,
> +  enum tree_code guard_code, tree guard_init,
> +  tree step, HOST_WIDE_INT *second_loop_niter_bound)
>  {
> -  /* The niter structure contains the after-increment IV, we need
> - the loop-enter base, so subtract STEP once.  */
>tree controlbase = niter->control.base;
>tree

Re: [PATCH,rs6000] PR80103: Fix typo in test case

2017-06-30 Thread Peter Bergner

On 6/30/17 10:56 AM, Kelvin Nilsen wrote:
> While reviewing regression test results for a back port of the PR80103
> patch, I discovered a typographic error in the test case.  This patch
> corrects the error.
> 
> I have tested this fix on powerpc64le-unknown-linux-gnu with no
> regressions.  Is this ok for trunk?

This falls under the "obvious" rule.  Go ahead.

Peter

[PATCH, 6/4] Handle GOMP_OPENACC_NVPTX_JIT=-arch= in libgomp nvptx plugin

2017-06-30 Thread Tom de Vries

[ was: Re: [PATCH, 0/4] Handle 
GOMP_OPENACC_NVPTX_{DISASM,SAVE_TEMPS,JIT} in libgomp nvptx plugin ]

On 06/26/2017 01:24 PM, Tom de Vries wrote:

Hi,

I've written a patch series to facilitate debugging libgomp openacc 
testcase failures on the nvptx accelerator.



When running an openacc test-case on an nvptx accelerator, the following 
happens:

- the plugin obtains the ptx assembly for the acceleration kernels
- it calls the cuda jit to compile and link the ptx into a module
- it loads the module
- it starts an acceleration kernel



This patch adds handling of GOMP_OPENACC_NVPTX_JIT=-arch= in libgomp 
nvptx plugin.


F.i. GOMP_OPENACC_NVPTX_JIT=-arch=60 for sm_60.

Thanks,
- Tom
libgomp/ChangeLog:

2017-06-30  Tom de Vries  

	* plugin/plugin-nvptx.c (parse_number):
	(process_GOMP_OPENACC_NVPTX_JIT):
	(link_ptx):


Handle GOMP_OPENACC_NVPTX_JIT=-arch= in libgomp nvptx plugin

---
 libgomp/plugin/plugin-nvptx.c | 40 ++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 365c787..4cca0c7 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -983,9 +983,24 @@ debug_linkout (void *linkout, size_t linkoutsize)
 }
 }
 
+static bool
+parse_number (const char *c, unsigned long* resp, char **end)
+{
+  unsigned long res;
+
+  errno = 0;
+  res = strtoul (c, end, 10);
+  if (errno)
+return false;
+
+  *resp = res;
+  return true;
+}
+
 static void
 process_GOMP_OPENACC_NVPTX_JIT (intptr_t *gomp_openacc_nvptx_o,
-intptr_t *gomp_openacc_nvptx_ori)
+intptr_t *gomp_openacc_nvptx_ori,
+uintptr_t *gomp_openacc_nvptx_target)
 {
   const char *var_name = "GOMP_OPENACC_NVPTX_JIT";
   const char *env_var = getenv (var_name);
@@ -1019,6 +1034,19 @@ process_GOMP_OPENACC_NVPTX_JIT (intptr_t *gomp_openacc_nvptx_o,
 	  continue;
 	}
 
+  if (c[0] == '-' && c[1] == 'a' && c[2] == 'r' && c[3] == 'c'
+	  && c[4] == 'h' && c[5] == '=')
+	{
+	  const char *end;
+	  unsigned long val;
+	  if (parse_number ([6], , (char**)))
+	{
+	  *gomp_openacc_nvptx_target = val;
+	  c = end;
+	  continue;
+	}
+	}
+
   GOMP_PLUGIN_error ("Error parsing %s", var_name);
   break;
 }
@@ -1183,9 +1211,11 @@ link_ptx (CUmodule *module, const struct targ_ptx_obj *ptx_objs,
 
   static intptr_t gomp_openacc_nvptx_o = -1;
   static intptr_t gomp_openacc_nvptx_ori = -1;
+  static uintptr_t gomp_openacc_nvptx_target = 0;
   if (gomp_openacc_nvptx_o == -1)
 process_GOMP_OPENACC_NVPTX_JIT (_openacc_nvptx_o,
-_openacc_nvptx_ori);
+_openacc_nvptx_ori,
+_openacc_nvptx_target);
 
   opts[6] = CU_JIT_OPTIMIZATION_LEVEL;
   optvals[6] = (void *) gomp_openacc_nvptx_o;
@@ -1197,6 +1227,12 @@ link_ptx (CUmodule *module, const struct targ_ptx_obj *ptx_objs,
   optvals[nopts] = (void *) gomp_openacc_nvptx_ori;
   nopts++;
 }
+  if (gomp_openacc_nvptx_target)
+{
+  opts[nopts] = CU_JIT_TARGET;
+  optvals[nopts] = (void *) gomp_openacc_nvptx_target;
+  nopts++;
+}
 
   CUDA_CALL (cuLinkCreate, nopts, opts, optvals, );

Re: [PATCH] make find_taken_edge handle case with just default

2017-06-30 Thread Peter Bergner

On 6/30/17 3:27 AM, Richard Biener wrote:
> On Thu, 29 Jun 2017, Peter Bergner wrote:
>> On 6/29/17 8:58 AM, Richard Biener wrote:
>>> We can also merge both loops, counting new_size upwards, storing
>>> to label new_size if new_size != i ...
>>
>> Like this.  I'm bootstrapping and regtesting the following patch on
>> powerpc64le-linux.  Ok if it survives?
> 
> Looks good to me.

Bootstrap / regtest was clean on both powerpc64le-linux and x86_64-linux.
Committed as revision 249847.  Thanks!

Peter

Re: [PATCH, 5/4] Handle GOMP_OPENACC_NVPTX_PTXRW in libgomp nvptx plugin

2017-06-30 Thread Tom de Vries


On 06/27/2017 11:16 AM, Tom de Vries wrote:
[ was: Re: [PATCH, 0/4] Handle 
GOMP_OPENACC_NVPTX_{DISASM,SAVE_TEMPS,JIT} in libgomp nvptx plugin ]


On 06/26/2017 01:24 PM, Tom de Vries wrote:

Hi,

I've written a patch series to facilitate debugging libgomp openacc 
testcase failures on the nvptx accelerator.



When running an openacc test-case on an nvptx accelerator, the 
following happens:

- the plugin obtains the ptx assembly for the acceleration kernels
- it calls the cuda jit to compile and link the ptx into a module
- it loads the module
- it starts an acceleration kernel


A typical scenario when developing the compiler is:
- run gcc test.c -save-temps
- run a.out
- edit test.s to fix bug or make code faster or smaller
- run gcc test.s
- run a.out
- edit compiler sources to make the compiler do the same as the .s edit

With openacc test-cases, this scenario is currently not available for 
ptx assembly. Using -save-temps -foffload=-save-temps we can get a .s 
containing ptx. But to insert the edited .s back into the compilation 
flow is difficult.


This patch facilitates such a scenario in the nvptx plugin.
- we define GOMP_OPENACC_NVPTX_PTXRW == 'w', and the plugin writes the
   ptx assembly into a series of files
- we edit one of those files
- we define GOMP_OPENACC_NVPTX_PTXRW == 'r', and the plugin reads the
   ptx assembly back from those files, and uses that instead of the ptx
   in the executable.

I've tested this patch series on top of gomp-4_0-branch, by running an
openacc testcase from the command line and going through the 
write-edit-readscenario with an observable ptx edit.


OK for trunk if bootstrap and reg-test on x86_64 with nvidia accelerator
succeeds?



This updated patch fixes a trivial build error.

Thanks,
- Tom

Handle GOMP_OPENACC_NVPTX_PTXRW in libgomp nvptx plugin

2017-06-27  Tom de Vries  

	* plugin/plugin-nvptx.c (post_process_ptx): New function.
	(link_ptx): Call post_process_ptx.

---
 libgomp/plugin/plugin-nvptx.c | 132 +-
 1 file changed, 130 insertions(+), 2 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 41ecfec..365c787 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1024,6 +1024,131 @@ process_GOMP_OPENACC_NVPTX_JIT (intptr_t *gomp_openacc_nvptx_o,
 }
 }
 
+/* If environment variable GOMP_OPENACC_NVPTX_PTXRW=[Ww], write *RES_CODE to
+   file plugin-nvptx..ptx.  If it is [Rr], read *RES_CODE from file
+   instead.  */
+
+static void
+post_process_ptx (unsigned num, const char **res_code, size_t *res_size)
+{
+  static int gomp_openacc_nvptx_ptxrw = -1;
+
+  if (gomp_openacc_nvptx_ptxrw == -1)
+{
+  const char *var_name = "GOMP_OPENACC_NVPTX_PTXRW";
+  const char *env_var = secure_getenv (var_name);
+  notify_var (var_name, env_var);
+
+  gomp_openacc_nvptx_ptxrw = 0;
+  if (env_var == NULL)
+	;
+  else if ((env_var[0] == 'w' || env_var[0] == 'W')
+	   && env_var[1] == '\0')
+	gomp_openacc_nvptx_ptxrw = 1;
+  else if ((env_var[0] == 'r' || env_var[0] == 'R')
+	   && env_var[1] == '\0')
+	gomp_openacc_nvptx_ptxrw = 2;
+  else
+	GOMP_PLUGIN_error ("Error parsing %s", var_name);
+}
+
+  if (gomp_openacc_nvptx_ptxrw == 0)
+return;
+
+  const char *code = *res_code;
+  size_t size = *res_size;
+
+  const char *prefix = "plugin-nvptx.";
+  const char *postfix = ".ptx";
+  const int len =	(strlen (prefix)
+			 + 10 /* %u.  */
+			 + strlen (postfix)
+			 + 1  /* '\0'.  */);
+  char file_name[len];
+  int res = snprintf (file_name, len, "%s%u%s", prefix,
+		  num, postfix);
+  assert (res < len); /* Assert there's no truncation.  */
+
+  GOMP_PLUGIN_debug (0, "%s %s \n",
+		 (gomp_openacc_nvptx_ptxrw == 1 ? "Writing" : "Reading"),
+		 file_name);
+
+  if (gomp_openacc_nvptx_ptxrw == 1)
+{
+  FILE *ptx_file = fopen (file_name, "w");
+  if (ptx_file == NULL)
+	{
+	  GOMP_PLUGIN_debug (0, "Opening %s failed\n", file_name);
+	  return;
+	}
+
+  int res = fprintf (ptx_file, "%s", code);
+  unsigned int write_succeeded = res == size - 1;
+  if (!write_succeeded)
+	GOMP_PLUGIN_debug (0,
+			   "Writing %s failed: written %d but expected %zu\n",
+			   file_name, res, size - 1);
+
+  res = fclose (ptx_file);
+  if (res != 0)
+	GOMP_PLUGIN_debug (0, "Closing %s failed\n", file_name);
+
+  return;
+}
+
+  if (gomp_openacc_nvptx_ptxrw == 2)
+{
+  FILE *ptx_file = fopen (file_name, "r");
+  if (ptx_file == NULL)
+	{
+	  GOMP_PLUGIN_debug (0, "Opening %s failed\n", file_name);
+	  return;
+	}
+
+  if (fseek (ptx_file, 0L, SEEK_END) != 0)
+	{
+	  GOMP_PLUGIN_debug (0, "Seeking end of %s failed\n", file_name);
+	  return;
+	}
+
+  long bufsize = ftell (ptx_file);
+  if (bufsize == -1)
+	{
+	  GOMP_PLUGIN_debug (0, "ftell of %s failed\n", file_name);
+	  return;
+	}
+
+  if (fseek (ptx_file, 0L, SEEK_SET)

Re: [PATCH 1/2] Support C++-specific selftests

2017-06-30 Thread Jeff Law

On 06/22/2017 06:20 PM, David Malcolm wrote:
> Currently the "make selftest" target run during each stage of the
> build just runs the selftests within cc1.
> 
> As part of the fix for PR c++/81167 I want to be able to write
> C++-specific selftests (to exercise the C++ implementation of
> the pp's format_decoder).
> 
> Hence this patch generalizes the existing selftest logic in
> Makefile.in so that selftests are run for both cc1 and cc1plus.
> It also adds convenience methods for running them under
> gdb and under valgrind.
> 
> It also reorganizes the existing lang-specific selftests, splitting
> them into ones shared by c-family, and those that are specific
> to individual members of the c family.  There aren't any of the
> latter yet, but the followup adds some for C++.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu in
> combination with the followup patch; the number of reported passes
> from -fself-test for cc1 at each stage was unaffected by the kit:
>   -fself-test: 39233 pass(es),
> and cc1plus gained a similar line of output.
> 
> OK for trunk?
> 
> gcc/ChangeLog:
>   * Makefile.in (SELFTEST_FLAGS): Drop "-x c", moving it to...
>   (C_SELFTEST_FLAGS): New.
>   (CPP_SELFTEST_FLAGS): New.
>   (SELFTEST_DEPS): New, from deps of s-selftest.
>   (C_SELFTEST_DEPS): New, from deps of s-selftest.
>   (CPP_SELFTEST_DEPS): New.
>   (selftest): Add dependency on s-selftest-c++.
>   (s-selftest): Rename to...
>   (s-selftest-c): ...this, moving deps to SELFTEST_DEPS
>   and C_SELFTEST_DEPS, and using C_SELFTEST_FLAGS rather
>   than SELFTEST_FLAGS.
>   (selftest-gdb): Rename to...
>   (selftest-c-gdb): ...this, using C_SELFTEST_DEPS and
>   C_SELFTEST_FLAGS.
>   (selftest-gdb): Reintroduce as an alias for selftest-c-gdb.
>   (selftest-valgrind): Rename to...
>   (selftest-c-valgrind): ...this, using C_SELFTEST_DEPS and
>   C_SELFTEST_FLAGS.
>   (selftest-valgrind): Reintroduce as an alias for
>   selftest-c-valgrind.
>   (s-selftest-c++): New.
>   (selftest-c++-gdb): New.
>   (selftest-c++-valgrind): New.
> 
> gcc/c-family/ChangeLog:
>   * c-common.c (selftest::c_family_tests): New.
>   * c-common.h (selftest::run_c_tests): Move decl to c/c-lang.h.
>   (selftest::c_family_tests): New decl.
> 
> gcc/c/ChangeLog:
>   * c-lang.c (selftest::run_c_tests): Move body to c_family_tests,
>   and call that instead.
>   * c-tree.h (selftest::run_c_tests): New decl.
> 
> gcc/cp/ChangeLog:
>   * cp-lang.c (LANG_HOOKS_RUN_LANG_SELFTESTS): Define as
>   selftest::run_cp_tests.
>   (selftest::run_cp_tests): New function.
>   * cp-tree.h (selftest::run_cp_tests): New decl.
OK.
jeff

[PATCH,rs6000] PR80103: Fix typo in test case

2017-06-30 Thread Kelvin Nilsen


While reviewing regression test results for a back port of the PR80103
patch, I discovered a typographic error in the test case.  This patch
corrects the error.

I have tested this fix on powerpc64le-unknown-linux-gnu with no
regressions.  Is this ok for trunk?

gcc/testsuite/ChangeLog:

2017-06-30  Kelvin Nilsen  

PR target/80103
* gcc.target/powerpc/pr80103-1.c (b): Correct spelling of
__attribute__.


Index: gcc/testsuite/gcc.target/powerpc/pr80103-1.c
===
--- gcc/testsuite/gcc.target/powerpc/pr80103-1.c(revision 249798)
+++ gcc/testsuite/gcc.target/powerpc/pr80103-1.c(working copy)
@@ -12,5 +12,5 @@
 int a;
 void b (__attribute__ ((__vector_size__ (16))) char c)
 {
-  a = ((__attributes__ ((__vector_size__ (2 * sizeof (long long) c)[0];
+  a = ((__attribute__ ((__vector_size__ (2 * sizeof (long long) c)[0];
 }

Re: [2/2] PR 80769: Incorrect strlen optimisation

2017-06-30 Thread Jakub Jelinek

On Tue, May 16, 2017 at 09:02:08AM +0100, Richard Sandiford wrote:
> 2017-05-16  Richard Sandiford  
> 
> gcc/
>   PR tree-optimization/80769
>   * tree-ssa-strlen.c (strinfo): Document that "stmt" is also used
>   for malloc and calloc.  Document the new invariant that all related
>   strinfos have delayed lengths or none do.
>   (verify_related_strinfos): Move earlier in file.
>   (set_endptr_and_length): New function, split out from...
>   (get_string_length): ...here.  Also set the lengths of related
>   strinfos.
>   (zero_length_string): Assert that chainsi has known (rather than
>   delayed) lengths.
>   (adjust_related_strinfos): Likewise.
> 
> gcc/testsuite/
>   PR tree-optimization/80769
>   * gcc.dg/strlenopt-31.c: New test.
>   * gcc.dg/strlenopt-31g.c: Likewise.

Ok for trunk, sorry for the delay.  I assume 7.x is not affected, right?

Jakub

Re: [PATCH, 4/4] Handle GOMP_OPENACC_NVPTX_JIT=-ori in libgomp nvptx plugin

2017-06-30 Thread Tom de Vries


On 06/26/2017 01:44 PM, Tom de Vries wrote:

On 06/26/2017 01:24 PM, Tom de Vries wrote:

Hi,

I've written a patch series to facilitate debugging libgomp openacc 
testcase failures on the nvptx accelerator.



When running an openacc test-case on an nvptx accelerator, the 
following happens:

- the plugin obtains the ptx assembly for the acceleration kernels
- it calls the cuda jit to compile and link the ptx into a module
- it loads the module
- it starts an acceleration kernel

The patch series adds these environment variables:
- GOMP_OPENACC_NVPTX_SAVE_TEMPS: a means to save the resulting module
   such that it can be investigated using nvdisasm and cuobjdump.
- GOMP_OPENACC_NVPTX_DISASM: a means to see the resulting module in
   the debug output,  by writing it into a file and calling nvdisasm on
   it
- GOMP_OPENACC_NVPTX_JIT: a means to set parameters of the
   compilation/linking process, currently supporting:
   * -O[0-4], mapping onto CU_JIT_OPTIMIZATION_LEVEL
   * -ori, mapping onto CU_JIT_NEW_SM3X_OPT


The patch series consists of these patches:

4. Handle GOMP_OPENACC_NVPTX_JIT=-ori in libgomp nvptx plugin


This patch adds handling of GOMP_OPENACC_NVPTX_JIT=-ori.

Thanks,
- Tom

0004-Handle-GOMP_OPENACC_NVPTX_JIT-ori-in-libgomp-nvptx-plugin.patch




-  CU_JIT_LOG_VERBOSE = 12
+  CU_JIT_LOG_VERBOSE = 12,
+  CU_JIT_NEW_SM3X_OPT = 15
  } CUjit_option;


Adding the constant to plugin/cuda/cuda.h makes sure the constant is 
available when not linking the plugin against cuda.


But when linking against cuda 7.5 and earlier, this still fails because 
the constant is not available yet in cuda.h. Fixed by hardcoding the 
constant if not available in the cuda version.


Thanks,
- Tom
Handle GOMP_OPENACC_NVPTX_JIT=-ori in libgomp nvptx plugin

2017-06-26  Tom de Vries  

	* plugin/cuda/cuda.h (enum CUjit_option): Add CU_JIT_NEW_SM3X_OPT.
	* plugin/plugin-nvptx.c (process_GOMP_OPENACC_NVPTX_JIT): Add
	gomp_openacc_nvptx_ori parameter.  Handle -ori.
	(link_ptx): Add CU_JIT_NEW_SM3X_OPT to opts.

---
 libgomp/plugin/cuda/cuda.h|  3 ++-
 libgomp/plugin/plugin-nvptx.c | 34 +-
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/libgomp/plugin/cuda/cuda.h b/libgomp/plugin/cuda/cuda.h
index 75dfe3d..4644870 100644
--- a/libgomp/plugin/cuda/cuda.h
+++ b/libgomp/plugin/cuda/cuda.h
@@ -89,7 +89,8 @@ typedef enum {
   CU_JIT_ERROR_LOG_BUFFER = 5,
   CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES = 6,
   CU_JIT_OPTIMIZATION_LEVEL = 7,
-  CU_JIT_LOG_VERBOSE = 12
+  CU_JIT_LOG_VERBOSE = 12,
+  CU_JIT_NEW_SM3X_OPT = 15
 } CUjit_option;
 
 typedef enum {
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 594ca39..41ecfec 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -143,6 +143,10 @@ init_cuda_lib (void)
 
 #include "secure_getenv.h"
 
+#if CUDA_VERSION < 8000
+#define CU_JIT_NEW_SM3X_OPT 15
+#endif
+
 /* Convenience macros for the frequently used CUDA library call and
error handling sequence as well as CUDA library calls that
do the error checking themselves or don't do it at all.  */
@@ -980,13 +984,15 @@ debug_linkout (void *linkout, size_t linkoutsize)
 }
 
 static void
-process_GOMP_OPENACC_NVPTX_JIT (intptr_t *gomp_openacc_nvptx_o)
+process_GOMP_OPENACC_NVPTX_JIT (intptr_t *gomp_openacc_nvptx_o,
+intptr_t *gomp_openacc_nvptx_ori)
 {
   const char *var_name = "GOMP_OPENACC_NVPTX_JIT";
   const char *env_var = getenv (var_name);
   notify_var (var_name, env_var);
 
   *gomp_openacc_nvptx_o = 4;
+  *gomp_openacc_nvptx_ori = 0;
   if (env_var == NULL)
 return;
 
@@ -1005,6 +1011,14 @@ process_GOMP_OPENACC_NVPTX_JIT (intptr_t *gomp_openacc_nvptx_o)
 	  continue;
 	}
 
+  if (c[0] == '-' && c[1] == 'o' && c[2] == 'r' && c[3] == 'i'
+	  && (c[4] == '\0' || c[4] == ' '))
+	{
+	  *gomp_openacc_nvptx_ori = 1;
+	  c += 4;
+	  continue;
+	}
+
   GOMP_PLUGIN_error ("Error parsing %s", var_name);
   break;
 }
@@ -1014,8 +1028,8 @@ static bool
 link_ptx (CUmodule *module, const struct targ_ptx_obj *ptx_objs,
 	  unsigned num_objs)
 {
-  CUjit_option opts[7];
-  void *optvals[7];
+  CUjit_option opts[8];
+  void *optvals[8];
   float elapsed = 0.0;
   char elog[1024];
   char ilog[16384];
@@ -1043,13 +1057,23 @@ link_ptx (CUmodule *module, const struct targ_ptx_obj *ptx_objs,
   optvals[5] = (void *) 1;
 
   static intptr_t gomp_openacc_nvptx_o = -1;
+  static intptr_t gomp_openacc_nvptx_ori = -1;
   if (gomp_openacc_nvptx_o == -1)
-process_GOMP_OPENACC_NVPTX_JIT (_openacc_nvptx_o);
+process_GOMP_OPENACC_NVPTX_JIT (_openacc_nvptx_o,
+_openacc_nvptx_ori);
 
   opts[6] = CU_JIT_OPTIMIZATION_LEVEL;
   optvals[6] = (void *) gomp_openacc_nvptx_o;
 
-  CUDA_CALL (cuLinkCreate, 7, opts, optvals, );
+  int nopts = 7;
+  if (gomp_openacc_nvptx_ori)
+{
+  opts[nopts] = CU_JIT_NEW_SM3X_OPT;
+  optvals[nopts] = (void *) gomp_openacc_nvptx_ori;
+

Re: [PATCH, GCC/ARM, gcc-5-branch, ping2] Fix gcc.target/arm/fpscr.c

2017-06-30 Thread Kyrill Tkachov



On 30/06/17 16:41, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 28/06/17 12:35, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 26/06/17 12:32, Thomas Preudhomme wrote:

Hi,

As raised by Christophe Lyon, fpscr.c FAILs because arm_fp_ok and arm_fp
are not defined in GCC 5. This commit changes the test to use the same
recipe as gcc.target/arm/cmp-2.c

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2017-06-26  Thomas Preud'homme 

 * gcc.target/arm/fpscr.c: Require arm_vfp_ok instead of arm_fp_ok and
 add -mfpu=vfp -mfloat-abi=softfp instead of fp_ok options.


Ok for GCC 5?



Ok.
Thanks,
Kyrill


Best regards,

Thomas

Re: [PATCH, GCC/ARM, gcc-5-branch, ping2] Fix gcc.target/arm/fpscr.c

2017-06-30 Thread Thomas Preudhomme


Ping?

Best regards,

Thomas

On 28/06/17 12:35, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 26/06/17 12:32, Thomas Preudhomme wrote:

Hi,

As raised by Christophe Lyon, fpscr.c FAILs because arm_fp_ok and arm_fp
are not defined in GCC 5. This commit changes the test to use the same
recipe as gcc.target/arm/cmp-2.c

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2017-06-26  Thomas Preud'homme  

 * gcc.target/arm/fpscr.c: Require arm_vfp_ok instead of arm_fp_ok and
 add -mfpu=vfp -mfloat-abi=softfp instead of fp_ok options.


Ok for GCC 5?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/fpscr.c b/gcc/testsuite/gcc.target/arm/fpscr.c
index 7b4d71d72d8964f6da0d0604bf59aeb4a895df43..cafba4e8d67545bd210477230b9682fe86620e23 100644
--- a/gcc/testsuite/gcc.target/arm/fpscr.c
+++ b/gcc/testsuite/gcc.target/arm/fpscr.c
@@ -1,9 +1,9 @@
 /* Test the fpscr builtins.  */
 
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-require-effective-target arm_vfp_ok } */
 /* { dg-skip-if "need fp instructions" { *-*-* } { "-mfloat-abi=soft" } { "" } } */
-/* { dg-add-options arm_fp } */
+/* { dg-options "-mfpu=vfp -mfloat-abi=softfp" } */
 
 void
 test_fpscr ()

Re: [PING] Re: [PATCH] c/c++: Add fix-it hints for suggested missing #includes

2017-06-30 Thread Jeff Law

On 05/26/2017 01:54 PM, David Malcolm wrote:
> Ping:
>   https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00321.html
> 
> On Thu, 2017-05-04 at 12:36 -0400, David Malcolm wrote:
>> As of r247522, fix-it-hints can suggest the insertion of new lines.
>>
>> This patch uses this to implement a new "maybe_add_include_fixit"
>> function in c-common.c and uses it in the two places where the C and
>> C++
>> frontend can suggest missing #include directives. [1]
>>
>> The idea is that the user can then click on the fix-it in an IDE
>> and have it add the #include for them (or use -fdiagnostics-generate
>> -patch).
>>
>> Examples can be seen in the test cases.
>>
>> The function attempts to put the #include in a reasonable place:
>> immediately after the last #include within the file, or at the
>> top of the file.  It is idempotent, so -fdiagnostics-generate-patch
>> does the right thing if several such diagnostics are emitted.
>>
>> Successfully bootstrapped on x86_64-pc-linux-gnu.
>>
>> OK for trunk?
>>
>> [1] I'm working on a followup which tweaks another diagnostic so that
>> it
>> can suggest that a #include was missing, so I'll use it there as
>> well.
>>
>> gcc/c-family/ChangeLog:
>>  * c-common.c (try_to_locate_new_include_insertion_point): New
>>  function.
>>  (per_file_includes_t): New typedef.
>>  (added_includes_t): New typedef.
>>  (added_includes): New variable.
>>  (maybe_add_include_fixit): New function.
>>  * c-common.h (maybe_add_include_fixit): New decl.
>>
>> gcc/c/ChangeLog:
>>  * c-decl.c (implicitly_declare): When suggesting a missing
>>  #include, provide a fix-it hint.
>>
>> gcc/cp/ChangeLog:
>>  * name-lookup.c (get_std_name_hint): Add '<' and '>' around
>>  the header names.
>>  (maybe_suggest_missing_header): Update for addition of '<' and
>> '>'
>>  to above.  Provide a fix-it hint.
>>
>> gcc/testsuite/ChangeLog:
>>  * g++.dg/lookup/missing-std-include-2.C: New text case.
>>  * gcc.dg/missing-header-fixit-1.c: New test case.
Generally OK.  But a few comments on how you find the location for where
to suggest the new #include.

It looks like you're walking the whole table every time?!?  Shouldn't
you at least bound things between start of file and the point where an
error was issued?  ie, if you used an undefined type on line XX, a
#include after line XX makes no sense to resolve the error.

I'm not sure how often this will get called when someone does something
stupid and needs the #include.  But ISTM you're looking for two bounds.

For the "last" case you start at the statement which generated the error
and walk backwards stopping when you find the last map.

For the "first" case, you start at the beginning and walk forward to
find the map, then quit.

Are those not appropriate here?

Jeff

Re: [ping * 2] PR78736: New C warning -Wenum-conversion

2017-06-30 Thread Jeff Law

On 06/29/2017 05:01 PM, Joseph Myers wrote:
> On Thu, 29 Jun 2017, Jeff Law wrote:
> 
>> On 05/23/2017 07:54 AM, Prathamesh Kulkarni wrote:
>>> Hi,
>>> I would like to ping this patch for review:
>>> https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00775.html
>> So was there any kind of resolution on the case in libcomp where we had
>> an assignment between two essentially equivalent, but distinct enums?
> 
> I don't know.  I approved this patch on the 12th subject to one 
> correction, but see it hasn't yet been committed.
OK.  I missed that (not uncommon given the depth of the queue of things
I'm working through).  I'll trust that Prathamesh will commit after
addressing the one issue you raised.

THanks,
jeff

Re: [rtlanal] Do a better job of costing parallel sets containing flag-setting operations.

2017-06-30 Thread Jeff Law

On 06/30/2017 03:03 AM, Richard Earnshaw (lists) wrote:
> On 19/06/17 14:46, Richard Earnshaw (lists) wrote:
>> Many parallel set insns are of the form of a single set that also sets
>> the condition code flags.  In this case the cost of such an insn is
>> normally the cost of the part that doesn't set the flags, since updating
>> the condition flags is simply a side effect.
>>
>> At present all such insns are treated as having unknown cost (ie 0) and
>> combine assumes that such insns are infinitely more expensive than any
>> other insn sequence with a non-zero cost.
>>
>> This patch addresses this problem by allowing insn_rtx_cost to ignore
>> the condition setting part of a PARALLEL iff there is exactly one
>> comparison set and one non-comparison set.  If the only set operation is
>> a comparison we still use that as the basis of the insn cost.
>>
>>  * rtlanal.c (insn_rtx_cost): If a parallel contains exactly one
>>  comparison set and one other set, use the cost of the
>>  non-comparison set.
>>
>> Bootstrapped on aarch64-none-linuxgnu
>>
>> OK?
>>
> 
> Ping?
Needs a ChangeLog, with that, OK.  Ideally we'd have something which
verifies we're getting the cost right, but that's probably nontrivial to
do in a generic manner.

jeff

Re: [PATCH][DOC] Enhance target_clones documentation (PR other/78366).

2017-06-30 Thread Jeff Law

On 06/30/2017 07:55 AM, Martin Liška wrote:
> PING^1
> 
> On 06/22/2017 02:42 PM, Martin Liška wrote:
>> Hi.
>>
>> As mentioned in the PR, we only generate a resolver function when there's a 
>> usage
>> of a function with target_clones attribute. Let's document the behavior.
>>
>> Ready to be installed?
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2017-06-22  Martin Liska  
>>
>>  PR other/78366
>>  * doc/extend.texi: Document when a resolver function is
>>  generated for target_clones.
>> ---
>>  gcc/doc/extend.texi | 10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>>
> 
OK.

jeff

[gomp4, nvptx, committed] Fix assert in nvptx_propagate_unified

2017-06-30 Thread Tom de Vries


Hi,

with the openacc test-case in attached patch, I ran into an assert here:
...
static void
nvptx_propagate_unified (rtx_insn *unified)
{
   rtx_insn *probe = unified;
   rtx cond_reg = SET_DEST (PATTERN (unified));
   rtx pat;

   /* Find the comparison.  (We could skip this and simply scan to he
  blocks' terminating branch, if we didn't care for self
  checking.)  */
   for (;;)
 {
   probe = NEXT_INSN (probe);
   pat = PATTERN (probe);

   if (GET_CODE (pat) == SET
   && GET_RTX_CLASS (GET_CODE (SET_SRC (pat))) == RTX_COMPARE
   && XEXP (SET_SRC (pat), 0) == cond_reg)
 break;
   gcc_assert (NONJUMP_INSN_P (probe));
 }
...

The assert happens when processing insn 56:
...
(insn 54 53 56 3 (set (reg:SI 47 [ _71 ])
  (unspec:SI [
  (reg:SI 36 [ _58 ])
  ] UNSPEC_BR_UNIFIED)) 108 {cond_uni}
   (nil))
(note 56 54 57 3 NOTE_INSN_DELETED)
(insn 57 56 58 3 (set (reg:BI 68)
  (gt:BI (reg:SI 47 [ _71 ])
  (const_int 1 [0x1]))) 99 {*cmpsi}
   (expr_list:REG_DEAD (reg:SI 47 [ _71 ])
  (nil)))
...
The insn 56 was originally a '(set (reg x) (const_int 1))', but that one 
has been combined into insn 57 and replaced with a 'NOTE_INSN_DELETED'. 
So it seems reasonable for the loop to skip over this note.


Fixed by making the assert condition less strict.

Build on x86_64 with nvptx accelerator.

Tested test-case included in the patch.

Committed as trivial.

Thanks,
- Tom

Fix assert in nvptx_propagate_unified

2017-06-30  Tom de Vries  

	* config/nvptx/nvptx.c (nvptx_propagate_unified): Fix gcc_assert
	condition by allowing !INSN_P.

	* testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt-2.c: New test.

---
 gcc/config/nvptx/nvptx.c   |  2 +-
 .../reduction-cplx-flt-2.c | 32 ++
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 4a93bba..a3abb4b 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -2300,7 +2300,7 @@ nvptx_propagate_unified (rtx_insn *unified)
 	  && GET_RTX_CLASS (GET_CODE (SET_SRC (pat))) == RTX_COMPARE
 	  && XEXP (SET_SRC (pat), 0) == cond_reg)
 	break;
-  gcc_assert (NONJUMP_INSN_P (probe));
+  gcc_assert (NONJUMP_INSN_P (probe) || !INSN_P (probe));
 }
   rtx pred_reg = SET_DEST (pat);
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt-2.c
new file mode 100644
index 000..f80f38c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt-2.c
@@ -0,0 +1,32 @@
+#include 
+#include 
+#include 
+
+typedef float _Complex Type;
+
+#define N 32
+
+int
+main (void)
+{
+  Type ary[N];
+
+  for (int ix = 0; ix < N;  ix++)
+ary[ix] = 1.0 + 1.0i;
+
+  Type tprod = 1.0;
+
+#pragma acc parallel vector_length(32)
+  {
+#pragma acc loop vector reduction (*:tprod)
+for (int ix = 0; ix < N; ix++)
+  tprod *= ary[ix];
+  }
+
+  Type expected = 65536.0;
+
+  if (tprod != expected)
+abort ();
+
+  return 0;
+}

Re: [PATCH 0/3] Introduce internal_error_cont and exclude it from pot files

2017-06-30 Thread Jeff Law

On 06/30/2017 04:31 AM, Martin Liška wrote:
> On 04/10/2017 05:44 PM, Richard Biener wrote:
>> On Mon, Apr 10, 2017 at 5:42 PM, Jeff Law  wrote:
>>> On 04/07/2017 04:30 AM, Martin Liška wrote:

 On 04/06/2017 06:44 PM, Jeff Law wrote:
>
> On 03/24/2017 03:29 AM, Martin Liška wrote:
>>
>> I would like to ping that. I'm not sure what's agreement after I read
>> discussion in: https://gcc.gnu.org/ml/gcc/2017-03/msg00070.html
>>
>> Martin Sebor may know, CC'ing him.
>
> Not sure if you're pinging the internal_error_cont stuff, or the ODR
> diagnostics changes.
>
> WRT the ODR diagnostics, I'd say let's go with the C++17 style
> (all-lowercase with a hyphen).
>
> If you've got a pointer to the internal_err_cont changes, please pass it
> along.


 Yep, I was interested in internal_err_cont. It's next stage1 material, but
 I'm willing to have
 a feedback whether separation of internal messages is desired to be
 excluded from translation or not?
>>>
>>> As stage1 stuff, I'll largely be ignoring.
>>>
>>> I'm torn on translating this stuff.  It's hard enough for a non-developer to
>>> interpret an ICE message, if it's not in their native language it'd be
>>> impossible.
>>>
>>> OTOH, if we translate and the user forwards the translated message to us, we
>>> may not be able to interpret.
>>>
>>> That probably argues there's some part that should be translated to be
>>> easier for the users, but the real guts of the message should not be
>>> translated.
>>
>> Message number and catalogue.
>>
>> /me runs...
>>
>>> jeff
> 
> I there any discussion going on this topic? If not, I'll probably leave it as 
> it is.
No ongoing discussion -- I'm working through my backlog of items that
were deferred for gcc-8 stage1 as well as items that I couldn't get to
while focused on stack-clash.

jeff

[C++ PATCH] METHODVEC indexing

2017-06-30 Thread Nathan Sidwell

We store a class's overloaded methods on the METHODVEC, and provide one 
set of accessors that return an index (or -1), and another set of 
accessors that simply return the value of the slot (or NULL).  This is 
confusing.


The patch kills the former accessors, so everything outside of search.c 
now uses the slot accessors (or is dealing with conversion operators).


While there, I cleaned up the diagnostics for mismatched member 
declarations. Because f conv-op funkiness we had a confusing loop to 
print candidates.  It's simpler to use the new accessor I created to 
simply cons up a lookup of the conversion operators and use the exising 
candidate printing.  In doing so we now print these as notes, rather 
than errors.  This change is responsible for the testcase churn.


nathan
--
Nathan Sidwell
2017-06-30  Nathan Sidwell  

	* cp-tree.h (lookup_fnfields_1, class_method_index_for_fn): Don't
	declare.
	(lookup_all_conversions): Declare.
	* class.c (get_basefndecls): Use lookup_fnfields_slot.
	* decl.c (register_dtor_fn): Use lookup_fnfields_slot.
	* decl2.c (check_class_fn): Use lookup_fnfields_slot.  Rework
	diagnostics.
	* pt.c (retrieve_specialization): Use lookup_fnfields_slot.
	(check_explicit_specialization): Use lookup_fnfields_slot_nolazy,
	lookup_all_conversions.
	* search.c (lookup_fnfields_1): Make static.
	(lookup_all_conversions): New.
	(class_method_index_for_fn): Delete.
	* semantics.c (classtype_has_nothrow_assign_or_copy_p): Use
	lookup_fnfields_slot.

	* g++.dg/concepts/memfun-err.C: Adjust diagnostics.
	* g++.dg/cpp0x/decltype9.C: Likewise.
	* g++.dg/cpp0x/forw_enum9.C: Likewise.
	* g++.dg/lookup/decl1.C: Likewise.
	* g++.dg/lookup/extern-c-redecl.C: Likewise.
	* g++.dg/other/pr28432.C: Likewise.
	* g++.dg/parse/crash12.C: Likewise.
	* g++.dg/parse/enum3.C: Likewise.
	* g++.dg/parse/operator6.C: Likewise.
	* g++.dg/template/crash69.C: Likewise.
	* g++.dg/template/error27.C: Likewise.
	* g++.dg/template/error28.C: Likewise.
	* g++.dg/template/memfriend6.C: Likewise.
	* g++.old-deja/g++.mike/err1.C: Likewise.
	* g++.old-deja/g++.mike/p811.C: Likewise.
	* g++.old-deja/g++.other/crash25.C: Likewise.
	* g++.old-deja/g++.other/dtor4.C: Likewise.
	* g++.old-deja/g++.pt/t37.C: Likewise.

Index: cp/class.c
===
--- cp/class.c	(revision 249835)
+++ cp/class.c	(working copy)
@@ -2966,29 +2966,25 @@ modify_all_vtables (tree t, tree virtual
 static void
 get_basefndecls (tree name, tree t, vec *base_fndecls)
 {
-  int n_baseclasses = BINFO_N_BASE_BINFOS (TYPE_BINFO (t));
-  int i;
+  bool found_decls = false;
 
   /* Find virtual functions in T with the indicated NAME.  */
-  i = lookup_fnfields_1 (t, name);
-  bool found_decls = false;
-  if (i != -1)
-for (ovl_iterator iter ((*CLASSTYPE_METHOD_VEC (t))[i]); iter; ++iter)
-  {
-	tree method = *iter;
+  for (ovl_iterator iter (lookup_fnfields_slot (t, name)); iter; ++iter)
+{
+  tree method = *iter;
 
-	if (TREE_CODE (method) == FUNCTION_DECL
-	&& DECL_VINDEX (method))
-	  {
-	base_fndecls->safe_push (method);
-	found_decls = true;
-	  }
-  }
+  if (TREE_CODE (method) == FUNCTION_DECL && DECL_VINDEX (method))
+	{
+	  base_fndecls->safe_push (method);
+	  found_decls = true;
+	}
+}
 
   if (found_decls)
 return;
 
-  for (i = 0; i < n_baseclasses; i++)
+  int n_baseclasses = BINFO_N_BASE_BINFOS (TYPE_BINFO (t));
+  for (int i = 0; i < n_baseclasses; i++)
 {
   tree basetype = BINFO_TYPE (BINFO_BASE_BINFO (TYPE_BINFO (t), i));
   get_basefndecls (name, basetype, base_fndecls);
Index: cp/cp-tree.h
===
--- cp/cp-tree.h	(revision 249835)
+++ cp/cp-tree.h	(working copy)
@@ -6567,10 +6567,9 @@ extern int accessible_p(tree, tree,
 extern int accessible_in_template_p		(tree, tree);
 extern tree lookup_field_1			(tree, tree, bool);
 extern tree lookup_field			(tree, tree, int, bool);
-extern int lookup_fnfields_1			(tree, tree);
 extern tree lookup_fnfields_slot		(tree, tree);
 extern tree lookup_fnfields_slot_nolazy		(tree, tree);
-extern int class_method_index_for_fn		(tree, tree);
+extern tree lookup_all_conversions		(tree);
 extern tree lookup_fnfields			(tree, tree, int);
 extern tree lookup_member			(tree, tree, int, bool,
 		 tsubst_flags_t,
Index: cp/decl.c
===
--- cp/decl.c	(revision 249838)
+++ cp/decl.c	(working copy)
@@ -7849,12 +7849,8 @@ register_dtor_fn (tree decl)
   use_dtor = ob_parm && CLASS_TYPE_P (type);
   if (use_dtor)
 {
-  int idx;
+  cleanup = lookup_fnfields_slot (type, complete_dtor_identifier);
 
-  /* Find the destructor.  */
-  idx = lookup_fnfields_1 (type, complete_dtor_identifier);
-  gcc_assert (idx >= 0);
-  cleanup = (*CLASSTYPE_METHOD_VEC (type))[idx];
   /* Make sure it is accessible.  */

Re: [PATCH] Move static chain and non-local goto init after NOTE_INSN_FUNCTION_BEG (PR sanitize/81186).

2017-06-30 Thread Martin Liška

On 06/30/2017 04:03 PM, Michael Matz wrote:
> Hi,
> 
> On Wed, 28 Jun 2017, Martin Liška wrote:
> 
>> Thanks for the review. I'm not so familiar with RTL, but hopefully new 
>> version of the patch should do it properly. Idea is to come up with a 
>> new asan_stack_birth_insn that points after place where stack is ready 
>> to use (when one uses use-after-return). And then in function.c 
>> static_chain_decl and nonlocal_goto_save_area expansion is generated to 
>> a new sequence and the sequence is put after the asan_stack_birth_insn.
> 
> That has the same problem.
> 
> The code for function start is conceptually like this:
> 
> entry_point:
>   setup return space:
> on some targets the address of the return buffer is passed in a 
> certain incoming arg (i.e. it's like an argument)
>   setup arguments:
> store away incoming registers into pseudo
> load stack slot arguments (or put them there, all target dependend)
>   * point 1 where you insert code *
>   setup static chain:
> conceptually this is just a hidden additional function argument
>   setup nonlocal goto save area
> 
> If you simply create any other code inside this sequence you potentially 
> have the problem of clobbering any of the incoming hardregs.  In simple
> examples you might not notice when the register allocator heeds the 
> hardreg uses, but if you for instance call other functions in there you 
> most likely clobber some.  E.g. r10 (the incoming static chain pointer on 
> x86-64) isn't preserved across function calls.  So if you call a function 
> at point 1 above you clobber r10, but the code for setting up the static 
> chain needs it as input.

Thanks Michael.

Now I got it as I understand the problematic of usage of r10 register. Actually
it's easy to come up with a test-case that breaks that:

$ cat /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/asan/pr81186.c
/* PR sanitizer/81186 */
/* { dg-do run } */

int
main ()
{
  __label__ l;
  void f ()
{
  int a[123];

goto l;
}

  f ();
l:
  return 0;
}

where:

f.2139:
.LASANPC1:
.LFB1:
.cfi_startproc
pushq   %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq%rsp, %rbp
.cfi_def_cfa_register 6
pushq   %rbx
subq$600, %rsp
.cfi_offset 3, -24
leaq-592(%rbp), %rbx
cmpl$0, __asan_option_detect_stack_use_after_return(%rip)
je  .L1
movl$576, %edi
call__asan_stack_malloc_4 <--- here clobbering of r10 happens
testq   %rax, %rax
je  .L1
movq%rax, %rbx
.L1:
movq%r10, %rdx <- use after it's clobbered
movq%r10, -600(%rbp)
movq$1102416563, (%rbx)


> 
> So you need to find some other solution of setting up the stack for ASAN.  
> And it'd be best if that solution doesn't require inserting code inside 
> the above sequence of parameter setup instructions, and you certainly 
> can't call any functions inside that sequence.  It might mean that you 
> can't track the static chain place or the nonlocal goto save area.  You 
> also don't track the parameter stack slots, right?

IIUC parameter stack slots are not sanitized.

I will think about it more ;)
Martin

> 
> 
> Ciao,
> Michael.
>>
>>>
>>> Also if you put something in front of the static_chain_decl initialization 
>>> (as you do if you move the parm_birth_insn in front of it) you'd have to 
>>> make sure that the incoming hidding parameter containing the static chain 
>>> (r10 on x86_64) isn't clobbered from function start up to that point.  
>>> So that won't work either generally.
>>>
>>> I don't know what exactly is the issue with calling 
>>> __asan_handle_no_return before the other instructions emitted by 
>>> expand_used_vars.  Either it shouldn't be called for the static chain 
>>> (i.e. not instrumented) or whatever setup asan needs needs to happen in 
>>> front of the static chain setup, but then without clobbering the incoming 
>>> static chain param (!).
>>
>> Here I need to have FRAME variable to be sanitized as seen in pr78541.c
>> and pr78541-2.c test-cases.
>>
>> Thoughts?
>> Thanks,
>> Martin
>>
>>>
>>>
>>> Ciao,
>>> Michael.

 expanded cfun->static_chain_decl:

 (note 1 0 5 NOTE_INSN_DELETED)
 (note 5 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
 (insn 2 5 3 2 (set (reg/f:DI 88 [ CHAIN.1 ])
 (reg:DI 39 r10 [ CHAIN.1 ])) "pr81186.c":5 -1
  (nil))
 (insn 3 2 4 2 (set (mem/c:DI (plus:DI (reg/f:DI 82 virtual-stack-vars)
 (const_int -8 [0xfff8])) [0  S8 A64])
 (reg:DI 39 r10 [ CHAIN.1 ])) "pr81186.c":5 -1
  (nil))
 (note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)
 (insn 7 4 8 2 (set (reg/f:DI 87 [ _2 ])
 (reg/f:DI 88 [ CHAIN.1 ])) "pr81186.c":5 -1
  (nil))
 (call_insn 8 7 9 2 (call (mem:QI (symbol_ref:DI 
 ("__asan_handle_no_return") [flags 0x41]  >>>

Re: [PATCH] warn on mem calls modifying objects of non-trivial types (PR 80560)

2017-06-30 Thread Martin Sebor


On 06/30/2017 02:34 AM, Richard Biener wrote:

On Thu, Jun 29, 2017 at 10:23 PM, Martin Sebor  wrote:

On 06/29/2017 10:15 AM, Jan Hubicka wrote:


Hello,


diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 0f7e21a..443d16c 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -803,7 +803,10 @@ hash_table::empty_slow ()
   m_size_prime_index = nindex;
 }
   else
-memset (entries, 0, size * sizeof (value_type));
+{
+  for ( ; size; ++entries, --size)
+   *entries = value_type ();
+}
   m_n_deleted = 0;
   m_n_elements = 0;
 }



This change sends our periodic testers into an infinite loop.  It is fault
of gcc 4.2 being used
as bootstrap compiler, but perhaps that can be worked around?



The warning in the original code could have been suppressed (by
casting the pointer to char*), but it was valid so I opted not
to.  I'd expect it to be possible to work around the bug but
I don't have easy access to GCC 4.2 to reproduce it or verify
the fix.

FWIW, after looking at the function again, I wondered if zeroing
out the elements (either way) was the right thing to do and if
they shouldn't be cleared by calling Descriptor::mark_empty()
instead, like in alloc_entries(), but making that change broke
a bunch of ipa/ipa-pta-*.c tests.  It's not really clear to me
what this code is supposed to do.

Martin

PS Does this help at all?

@@ -804,8 +804,8 @@ hash_table::empty_slow ()
 }
   else
 {
-  for ( ; size; ++entries, --size)
-   *entries = value_type ();
+  for (size_t i = 0; i != size; ++i)
+   entries[i] = value_type ();

 }
   m_n_deleted = 0;
   m_n_elements = 0;


alloc_entries uses mark_empty.  untested:


Right.  That was my initial thought as well but it broke a number
of ipa/ipa-pta-*.c tests.  I didn't have time to investigate why.

Martin



Index: gcc/hash-table.h
===
--- gcc/hash-table.h(revision 249780)
+++ gcc/hash-table.h(working copy)
@@ -804,8 +804,8 @@ hash_table::empty
 }
   else
 {
-  for ( ; size; ++entries, --size)
-   *entries = value_type ();
+  for (size_t i = 0; i < size; ++i)
+   mark_empty (entries[i]);
 }
   m_n_deleted = 0;
   m_n_elements = 0;

Re: [PATCH] Remove dead code in asan.c

2017-06-30 Thread Jakub Jelinek

> +  /* Unpoison shadow memory that corresponds to a variable that was
> +  is subject of use-after-return sanitization.  */

was is ?
And shouldn't it be subject to instead of subject of?

Otherwise LGTM.

Jakub

Re: [PATCH] Fix PR81249

2017-06-30 Thread Christophe Lyon

Hi Richard,


On 29 June 2017 at 14:53, Richard Biener  wrote:
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2017-06-29  Richard Biener  
>
> PR tree-optimization/81249
> * tree-vect-loop.c (vect_create_epilog_for_reduction): Convert
> condition reduction result to original scalar type.
>
> * g++.dg/torture/pr81249.C: New testcase.
>

I think this patch (r249831) causes a regression on arm / aarch64:
gcc.dg/vect/pr65947-10.c (internal compiler error)
gcc.dg/vect/pr65947-10.c -flto -ffat-lto-objects (internal compiler error)

The regression appears between r249828 and 249831, which seems the
most likely guilty?

The log says:

/testsuite/gcc.dg/vect/pr65947-10.c: In function 'condition_reduction':
/testsuite/gcc.dg/vect/pr65947-10.c:12:1: error: invalid types in nop conversion
float
unsigned int
_47 = (float) _46;
during GIMPLE pass: vect
dump file: pr65947-10.c.159t.vect
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/vect/pr65947-10.c:12:1:
internal compiler error: verify_gimple failed
0xbb9107 verify_gimple_in_cfg(function*, bool)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree-cfg.c:5308
0xa7735c execute_function_todo
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:1989
0xa770f5 execute_todo
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/passes.c:2043
Please submit a full bug report,

Christophe


> Index: gcc/tree-vect-loop.c
> ===
> --- gcc/tree-vect-loop.c(revision 249780)
> +++ gcc/tree-vect-loop.c(working copy)
> @@ -4833,12 +4858,9 @@ vect_create_epilog_for_reduction (vec
>/* Convert the reduced value back to the result type and set as the
>  result.  */
> -  tree data_reduc_cast = build1 (VIEW_CONVERT_EXPR, scalar_type,
> -data_reduc);
> -  epilog_stmt = gimple_build_assign (new_scalar_dest, data_reduc_cast);
> -  new_temp = make_ssa_name (new_scalar_dest, epilog_stmt);
> -  gimple_assign_set_lhs (epilog_stmt, new_temp);
> -  gsi_insert_before (_gsi, epilog_stmt, GSI_SAME_STMT);
> +  gimple_seq stmts = NULL;
> +  new_temp = gimple_convert (, scalar_type, data_reduc);
> +  gsi_insert_seq_before (_gsi, stmts, GSI_SAME_STMT);
>scalar_results.safe_push (new_temp);
>  }
>else if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info) == COND_REDUCTION
> @@ -4903,6 +4925,11 @@ vect_create_epilog_for_reduction (vec   val = new_val;
> }
> }
> +  /* Convert the reduced value back to the result type and set as the
> +result.  */
> +  gimple_seq stmts = NULL;
> +  val = gimple_convert (, scalar_type, val);
> +  gsi_insert_seq_before (_gsi, stmts, GSI_SAME_STMT);
>scalar_results.safe_push (val);
>  }
>
>
> Index: gcc/testsuite/g++.dg/torture/pr81249.C
> ===
> --- gcc/testsuite/g++.dg/torture/pr81249.C  (nonexistent)
> +++ gcc/testsuite/g++.dg/torture/pr81249.C  (working copy)
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-mavx2 -mprefer-avx128" { target x86_64-*-* 
> i?86-*-* } } */
> +
> +typedef struct rtx_def *rtx;
> +union rtunion {
> +rtx rt_rtx;
> +};
> +struct rtx_def {
> +struct {
> +   rtunion fld[0];
> +} u;
> +rtx elem[];
> +} a;
> +int b, c, d;
> +rtx e;
> +int main() {
> +for (;;) {
> +   d = 0;
> +   for (; d < b; d++)
> + if (a.elem[d])
> +   e = a.elem[d]->u.fld[1].rt_rtx;
> +   if (e)
> + c = 0;
> +}
> +}

Re: [PATCH] Move static chain and non-local goto init after NOTE_INSN_FUNCTION_BEG (PR sanitize/81186).

2017-06-30 Thread Michael Matz

Hi,

On Wed, 28 Jun 2017, Martin Liška wrote:

> Thanks for the review. I'm not so familiar with RTL, but hopefully new 
> version of the patch should do it properly. Idea is to come up with a 
> new asan_stack_birth_insn that points after place where stack is ready 
> to use (when one uses use-after-return). And then in function.c 
> static_chain_decl and nonlocal_goto_save_area expansion is generated to 
> a new sequence and the sequence is put after the asan_stack_birth_insn.

That has the same problem.

The code for function start is conceptually like this:

entry_point:
  setup return space:
on some targets the address of the return buffer is passed in a 
certain incoming arg (i.e. it's like an argument)
  setup arguments:
store away incoming registers into pseudo
load stack slot arguments (or put them there, all target dependend)
  * point 1 where you insert code *
  setup static chain:
conceptually this is just a hidden additional function argument
  setup nonlocal goto save area

If you simply create any other code inside this sequence you potentially 
have the problem of clobbering any of the incoming hardregs.  In simple
examples you might not notice when the register allocator heeds the 
hardreg uses, but if you for instance call other functions in there you 
most likely clobber some.  E.g. r10 (the incoming static chain pointer on 
x86-64) isn't preserved across function calls.  So if you call a function 
at point 1 above you clobber r10, but the code for setting up the static 
chain needs it as input.

So you need to find some other solution of setting up the stack for ASAN.  
And it'd be best if that solution doesn't require inserting code inside 
the above sequence of parameter setup instructions, and you certainly 
can't call any functions inside that sequence.  It might mean that you 
can't track the static chain place or the nonlocal goto save area.  You 
also don't track the parameter stack slots, right?

Ciao,
Michael.
> 
> > 
> > Also if you put something in front of the static_chain_decl initialization 
> > (as you do if you move the parm_birth_insn in front of it) you'd have to 
> > make sure that the incoming hidding parameter containing the static chain 
> > (r10 on x86_64) isn't clobbered from function start up to that point.  
> > So that won't work either generally.
> > 
> > I don't know what exactly is the issue with calling 
> > __asan_handle_no_return before the other instructions emitted by 
> > expand_used_vars.  Either it shouldn't be called for the static chain 
> > (i.e. not instrumented) or whatever setup asan needs needs to happen in 
> > front of the static chain setup, but then without clobbering the incoming 
> > static chain param (!).
> 
> Here I need to have FRAME variable to be sanitized as seen in pr78541.c
> and pr78541-2.c test-cases.
> 
> Thoughts?
> Thanks,
> Martin
> 
> > 
> > 
> > Ciao,
> > Michael.
> >>
> >> expanded cfun->static_chain_decl:
> >>
> >> (note 1 0 5 NOTE_INSN_DELETED)
> >> (note 5 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> >> (insn 2 5 3 2 (set (reg/f:DI 88 [ CHAIN.1 ])
> >> (reg:DI 39 r10 [ CHAIN.1 ])) "pr81186.c":5 -1
> >>  (nil))
> >> (insn 3 2 4 2 (set (mem/c:DI (plus:DI (reg/f:DI 82 virtual-stack-vars)
> >> (const_int -8 [0xfff8])) [0  S8 A64])
> >> (reg:DI 39 r10 [ CHAIN.1 ])) "pr81186.c":5 -1
> >>  (nil))
> >> (note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)
> >> (insn 7 4 8 2 (set (reg/f:DI 87 [ _2 ])
> >> (reg/f:DI 88 [ CHAIN.1 ])) "pr81186.c":5 -1
> >>  (nil))
> >> (call_insn 8 7 9 2 (call (mem:QI (symbol_ref:DI 
> >> ("__asan_handle_no_return") [flags 0x41]   >> __builtin___asan_handle_no_return>) [0 __builtin___asan_handle_no_return 
> >> S1 A8])
> >> (const_int 0 [0])) "pr81186.c":5 -1
> >>  (expr_list:REG_EH_REGION (const_int 0 [0])
> >> (nil))
> >> (nil))
> >>
> >> expanded cfun->nonlocal_goto_save_area:
> >>
> >> (note 1 0 34 NOTE_INSN_DELETED)
> >> (note 34 1 31 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> >> (insn 31 34 32 2 (set (mem/f/c:DI (plus:DI (reg:DI 95)
> >> (const_int -64 [0xffc0])) [4 
> >> FRAME.0.__nl_goto_buf+0 S8 A64])
> >> (reg/f:DI 82 virtual-stack-vars)) "pr81186.c":3 -1
> >>  (nil))
> >> (insn 32 31 2 2 (set (mem/f/c:DI (plus:DI (reg:DI 95)
> >> (const_int -56 [0xffc8])) [4 
> >> FRAME.0.__nl_goto_buf+8 S8 A64])
> >> (reg/f:DI 7 sp)) "pr81186.c":3 -1
> >>  (nil))
> >> (insn 2 32 3 2 (parallel [
> >> (set (reg:DI 96)
> >> (plus:DI (reg/f:DI 82 virtual-stack-vars)
> >> (const_int -96 [0xffa0])))
> >> (clobber (reg:CC 17 flags))
> >> ]) "pr81186.c":3 -1
> >>  (nil))
> >> (insn 3 2 4 2 (set (reg:DI 97)
> >> (reg:DI 96)) "pr81186.c":3 -1
> >>  (nil))
> >> (insn 4 3 5 2 (set (reg:CCZ 17 flags)
> >> (compare:CCZ (mem/c:SI (symbol_ref:DI 
> >>

Re: [PATCH][DOC] Enhance target_clones documentation (PR other/78366).

2017-06-30 Thread Martin Liška

PING^1

On 06/22/2017 02:42 PM, Martin Liška wrote:
> Hi.
> 
> As mentioned in the PR, we only generate a resolver function when there's a 
> usage
> of a function with target_clones attribute. Let's document the behavior.
> 
> Ready to be installed?
> Martin
> 
> gcc/ChangeLog:
> 
> 2017-06-22  Martin Liska  
> 
>   PR other/78366
>   * doc/extend.texi: Document when a resolver function is
>   generated for target_clones.
> ---
>  gcc/doc/extend.texi | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
>

Re: [PATCH] Make inlining consistent in LTO and non-LTO mode (PR target/71991).

2017-06-30 Thread Martin Liška

On 06/28/2017 05:18 PM, Jan Hubicka wrote:
>> On 06/28/2017 04:24 PM, Jan Hubicka wrote:
 -  /* If callee has no option attributes, then it is ok to inline.  */
 -  if (!callee_tree)
 +  /* If callee has no option attributes (or default),
 + then it is ok to inline.  */
 +  if (!callee_tree || callee_tree == target_option_default_node)
>>>
>>> I am not sure this actually makes sense, because target_option_default_node 
>>> is not very
>>> meaningful for LTO (it contains whatever was passed to LTO driver). 
>>
>> I see!
>>
>>  Perhaps one can check
>>> for explicit optimization/machine attribute and whether caller and callee 
>>> come from

I'm not sure what you mean by 'for explicit optimization/machine attribute' ?

I'm attaching a new patch, is it closer?

Martin

>>> same compilation unit, though this is quite hackish and will do unexpected 
>>> things with COMDATs.
>>
>> That's quite cumbersome. Any other idea than marking the PR as won't fix?
> 
> Yep, it is not prettiest. The problem is that the concept that callee can 
> change semantics
> when no explicit attribute is present is sloppy.  I am not sure how many 
> programs rely on it
> (it is kind of surprising to see functions not being inlined into your target 
> attribute annotated
> function I guess).
> Note that we check for original file in inliner already - this can be done by 
> comparing lto_file_data
> of corresponding cgraph nodes.
> 
> Honza
> 

>From c07dd3f079382dca617f1a2c32b83f7eaa1797f9 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 27 Jun 2017 16:39:27 +0200
Subject: [PATCH] Make inlining consistent in LTO and non-LTO mode (PR
 target/71991).

gcc/testsuite/ChangeLog:

2017-06-28  Martin Liska  

	PR target/71991
	* gcc.dg/torture/pr71991.c: New test.

gcc/ChangeLog:

2017-06-28  Martin Liska  

	PR target/71991
	* config/i386/i386.c (ix86_can_inline_p): Make inlining consistent
	in LTO and non-LTO mode.
---
 gcc/config/i386/i386.c |  9 +++--
 gcc/testsuite/gcc.dg/torture/pr71991.c | 12 
 2 files changed, 19 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr71991.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2c4479e1751..92cda94b556 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -7438,8 +7438,13 @@ ix86_can_inline_p (tree caller, tree callee)
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
 
-  /* If callee has no option attributes, then it is ok to inline.  */
-  if (!callee_tree)
+  /* If callee has no option attributes (or default),
+ then it is ok to inline.  */
+  cgraph_node *caller_node = cgraph_node::get (caller);
+  cgraph_node *callee_node = cgraph_node::get (callee);
+  if (!callee_tree
+  || (callee_tree == target_option_default_node
+	  && caller_node->lto_file_data == callee_node->lto_file_data))
 ret = true;
 
   /* If caller has no option attributes, but callee does then it is not ok to
diff --git a/gcc/testsuite/gcc.dg/torture/pr71991.c b/gcc/testsuite/gcc.dg/torture/pr71991.c
new file mode 100644
index 000..79c927f6844
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr71991.c
@@ -0,0 +1,12 @@
+/* PR target/71991 */
+
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-std=c99" } */
+
+static inline int __attribute__ ((__always_inline__)) fn1 () { return 0; }
+static inline int __attribute__ ((target("inline-all-stringops"))) fn2 () { return fn1 (); }
+
+int main()
+{
+  return fn2();
+}
-- 
2.13.1

Re: [PATCH 4/N] Recover GOTO predictor.

2017-06-30 Thread Martin Liška

On 06/22/2017 12:27 PM, Richard Biener wrote:
> On Wed, Jun 21, 2017 at 3:06 PM, Martin Liška  wrote:
>> Hello.
>>
>> There's one additional predictor enhancement that is GOTO predict that
>> used to working. Following patch adds expect statement for C and C++ family
>> languages.
>>
>> There's one fallout which is vrp24.c test-case, where only 'Simplified 
>> relational'
>> appears just once. Adding Richi and Patrick who can probably help how to fix 
>> the
>> test-case.
> 
> Happens to be optimized better now, there's only one predicate to simplify
> left in the IL input to VRP1.  I suggest to change it to match 1 times and add
> -fdump-tree-optimized to dg-options and
> 
> /* { dg-final { scan-tree-dump-times "if " 3 "optimized" } } */
> 
> to verify we have 3 conditions left.

One small note, I see 4 conditions in optimized dump:

sss (struct rtx_def * insn, int code1, int code2, int code3)
{
  int D1544;
  struct rtx_def * body;
  _Bool D1562;

   [100.00%] [count: INV]:
  body_5 = insn_4(D)->u.fld[5].rt_rtx;
  D1544_6 = body_5->code;
  if (D1544_6 == 55)
goto  (L7); [34.00%] [count: INV]
  else
goto ; [66.00%] [count: INV]

   [66.00%] [count: INV]:
  if (code3_7(D) == 99)
goto ; [34.00%] [count: INV]
  else
goto  (L16); [66.00%] [count: INV]

   [22.44%] [count: INV]:
  D1562_9 = code1_8(D) == 10;
  if (D1562_9 == 1)
goto  (L7); [64.00%] [count: INV]
  else
goto  (L16); [36.00%] [count: INV]

   [9.82%] [count: INV]:
  arf ();

   [46.68%] [count: INV]:
  frob (); [tail call]
  goto  (L16); [100.00%] [count: INV]

L7 [48.36%] [count: INV]:
  if (code2_12(D) == 42)
goto ; [20.24%] [count: INV]
  else
goto ; [79.76%] [count: INV]

L16 [100.00%] [count: INV]:
  return;

}

Is it a problem or adjusting to 4 is fine?

Martin

> 
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
>> Martin

[PATCH] rs6000 branch probability changes

2017-06-30 Thread David Edelsohn

Convert the rs6000 port to use the new API for branch probabilities.

Okay?

Thanks, David

* config/rs6000/rs6000.c (emit_unlikely_jump): Adjust to new branch
probability data type.

Index: rs6000.c
===
--- rs6000.c (revision 249839)
+++ rs6000.c (working copy)
@@ -23514,10 +23514,9 @@
 static void
 emit_unlikely_jump (rtx cond, rtx label)
 {
-  int very_unlikely = REG_BR_PROB_BASE / 100 - 1;
   rtx x = gen_rtx_IF_THEN_ELSE (VOIDmode, cond, label, pc_rtx);
   rtx_insn *insn = emit_jump_insn (gen_rtx_SET (pc_rtx, x));
-  add_int_reg_note (insn, REG_BR_PROB, very_unlikely);
+  add_int_reg_note (insn, REG_BR_PROB, profile_probability::very_unlikely ());
 }

 /* A subroutine of the atomic operation splitters.  Emit a load-locked

Re: [PATCH] Remove dead code in asan.c

2017-06-30 Thread Martin Liška

On 06/30/2017 12:15 PM, Jakub Jelinek wrote:
> On Fri, Jun 30, 2017 at 12:00:36PM +0200, Martin Liška wrote:
>> Hi.
>>
>> Following crap code was added by me when I added use-after-scope.
>> Actually decl always points to LASANPC, so asan_handled_variables->contains 
>> (decl)
>> is always false.
>>
>> Well, originally the idea was to not clear content (place in shadow memory 
>> in between
>> red zoner) of auto variables, but as we emit 0xf5 in order to have working 
>> use-after-return,
>> it probably does not worth for doing an optimization?
> 
> use-after-return is only runtime conditional, defaults to off.
> And your patch doesn't bring the code to anything close to what we had
> before the -fsanitize-use-after-scope changes, just look what it did before
> - only cleared the shadow spots that weren't known to be 0, clearing the
> whole shadow might be too expensive.  Consider many KB large local
> variables.
> 
> You can find the right decl in decls[l / 2] or decls[l / 2 - 1] or so.
> 
>   Jakub
> 

Huh, sorry, I overlooked that shadow stack used by use-after-return has it's own
code that does shadow memory modification.

Please take a look at v2 of the patch where I reverted the hunk to pre 
use-after-scope
and then I added condition to unpoison variables used in use-after-scope.

I'm going to test it.

Martin
>From 2e249424e28a0d66ffb1c9cf8f54c1160043a2cc Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 30 Jun 2017 15:08:55 +0200
Subject: [PATCH] Make stack epilogue more efficient

gcc/ChangeLog:

2017-06-30  Martin Liska  

	* asan.c (asan_emit_stack_protection): Unpoison just red zones
	and shadow memory of auto variables which are subject of
	use-after-scope sanitization.
	(asan_expand_mark_ifn): Add do set only when is_poison.
---
 gcc/asan.c | 79 +++---
 1 file changed, 40 insertions(+), 39 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 2de16402c51..5cce2be9962 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1062,7 +1062,7 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   HOST_WIDE_INT base_offset = offsets[length - 1];
   HOST_WIDE_INT base_align_bias = 0, offset, prev_offset;
   HOST_WIDE_INT asan_frame_size = offsets[0] - base_offset;
-  HOST_WIDE_INT last_offset;
+  HOST_WIDE_INT last_offset, last_size;
   int l;
   unsigned char cur_shadow_byte = ASAN_STACK_MAGIC_LEFT;
   tree str_cst, decl, id;
@@ -1297,58 +1297,55 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   if (STRICT_ALIGNMENT)
 set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
 
-  /* Unpoison shadow memory of a stack at the very end of a function.
- As we're poisoning stack variables at the end of their scope,
- shadow memory must be properly unpoisoned here.  The easiest approach
- would be to collect all variables that should not be unpoisoned and
- we unpoison shadow memory of the whole stack except ranges
- occupied by these variables.  */
+  prev_offset = base_offset;
   last_offset = base_offset;
-  HOST_WIDE_INT current_offset = last_offset;
-  if (length)
+  last_size = 0;
+  for (l = length; l; l -= 2)
 {
-  HOST_WIDE_INT var_end_offset = 0;
-  HOST_WIDE_INT stack_start = offsets[length - 1];
-  gcc_assert (last_offset == stack_start);
-
-  for (int l = length - 2; l > 0; l -= 2)
+  offset = base_offset + ((offsets[l - 1] - base_offset)
+			 & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
+  if (last_offset + last_size != offset)
 	{
-	  HOST_WIDE_INT var_offset = offsets[l];
-	  current_offset = var_offset;
-	  var_end_offset = offsets[l - 1];
-	  HOST_WIDE_INT rounded_size = ROUND_UP (var_end_offset - var_offset,
-	 BITS_PER_UNIT);
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+   (last_offset - prev_offset)
+   >> ASAN_SHADOW_SHIFT);
+	  prev_offset = last_offset;
+	  asan_clear_shadow (shadow_mem, last_size >> ASAN_SHADOW_SHIFT);
+	  last_offset = offset;
+	  last_size = 0;
+	}
+  last_size += base_offset + ((offsets[l - 2] - base_offset)
+  & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1))
+		   - offset;
 
-	  /* Should we unpoison the variable?  */
+  /* Unpoison shadow memory that corresponds to a variable that was
+	 is subject of use-after-return sanitization.  */
+  if (l > 2)
+	{
+	  decl = decls[l / 2 - 2];
 	  if (asan_handled_variables != NULL
 	  && asan_handled_variables->contains (decl))
 	{
+	  HOST_WIDE_INT size = offsets[l - 3] - offsets[l - 2];
 	  if (dump_file && (dump_flags & TDF_DETAILS))
 		{
 		  const char *n = (DECL_NAME (decl)
    ? IDENTIFIER_POINTER (DECL_NAME (decl))
    : "");
 		  fprintf (dump_file, "Unpoisoning shadow stack for variable: "
-			   "%s (%" PRId64 "B)\n", n,
-			   var_end_offset - var_offset);
+			   "%s (%" PRId64 " B)\n", n, size);
 		}
 
-	  unsigned HOST_WIDE_INT s
-		= shadow_mem_size

[C++ PATCH] cdtor naming

2017-06-30 Thread Nathan Sidwell

This patch changes the way we name cdtors internally.  Currently both 
in-charge ctor and dtor get the DECL_NAME naming the class they are a 
member of. That's the same DECL_NAME as the class's self-reference 
TYPE_DECL. (The specialized base/complete/deleting cdtors get special 
names already.)


Not only is this confusing when looking inside the compiler, but we end 
up having to frob the name in a few places when trying to do lookups. 
We already check for cdtorness when emitting a user-readable name 
(except in one place, which this patch addresses).


This patch changes things so that ctors use ctor_identifer and dtors use 
dtor_identifier.  This makes it clearer when debugging the compiler and 
avoids the need to frob the name.  We do now have two ways to check 
cdtorness -- look at the DECL_CONSTRUCTOR_P of the DECL, or look at the 
IDENTIFIER_CTOR_P flag of the IDENTIFIER.  The former is used by the 
language-agnostic middle-end, and the latter is used by the C++ FE to 
determine if we lazily need to synthesize things.  It doesn't seem 
useful, (at the moment), to just have one mechanism.


The testcase change is in the introspection plugin, where the plugin 
prints the raw name.  It now sees the special name, rather than the 
class name (which was only 'right' for the ctor anyway).  This seems 
better to me.


nathan
--
Nathan Sidwell
2017-06-30  Nathan Sidwell  

	* call.c (build_new_method_call_1): Use constructo_name to get
	ctor name.  Move argument processing earlier to merge cdtor
	handling blocks.
	* decl.c (grokfndecl): Cdtors have special names.
	* method.c (implicitly_declare_fn): Likewise. Simplify flag setting.
	* pt.c (check_explicit_specialization): Cdtor name is already
	special.
	* search.c (class_method_index_for_fn): Likewise.

	* g++.dg/plugin/decl-plugin-test.C: Expect special ctor name.

Index: cp/call.c
===
--- cp/call.c	(revision 249835)
+++ cp/call.c	(working copy)
@@ -8994,6 +8994,7 @@ build_new_method_call_1 (tree instance,
   if (! (complain & tf_error))
 	return error_mark_node;
 
+  name = constructor_name (basetype);
   if (permerror (input_location,
 		 "cannot call constructor %<%T::%D%> directly",
 		 basetype, name))
@@ -9004,6 +9005,19 @@ build_new_method_call_1 (tree instance,
   return call;
 }
 
+  /* Process the argument list.  */
+  if (args != NULL && *args != NULL)
+{
+  *args = resolve_args (*args, complain);
+  if (*args == NULL)
+	return error_mark_node;
+}
+
+  /* Consider the object argument to be used even if we end up selecting a
+ static member function.  */
+  instance = mark_type_use (instance);
+
+
   /* Figure out whether to skip the first argument for the error
  message we will display to users if an error occurs.  We don't
  want to display any compiler-generated arguments.  The "this"
@@ -9013,35 +9027,18 @@ build_new_method_call_1 (tree instance,
   skip_first_for_error = false;
   if (IDENTIFIER_CDTOR_P (name))
 {
-  /* Callers should explicitly indicate whether they want to construct
+  /* Callers should explicitly indicate whether they want to ctor
 	 the complete object or just the part without virtual bases.  */
   gcc_assert (name != ctor_identifier);
-  /* Similarly for destructors.  */
-  gcc_assert (name != dtor_identifier);
+
   /* Remove the VTT pointer, if present.  */
   if ((name == base_ctor_identifier || name == base_dtor_identifier)
 	  && CLASSTYPE_VBASECLASSES (basetype))
 	skip_first_for_error = true;
-}
-
-  /* Process the argument list.  */
-  if (args != NULL && *args != NULL)
-{
-  *args = resolve_args (*args, complain);
-  if (*args == NULL)
-	return error_mark_node;
-}
 
-  /* Consider the object argument to be used even if we end up selecting a
- static member function.  */
-  instance = mark_type_use (instance);
-
-  /* It's OK to call destructors and constructors on cv-qualified objects.
- Therefore, convert the INSTANCE to the unqualified type, if
- necessary.  */
-  if (DECL_DESTRUCTOR_P (fn)
-  || DECL_CONSTRUCTOR_P (fn))
-{
+  /* It's OK to call destructors and constructors on cv-qualified
+	 objects.  Therefore, convert the INSTANCE to the unqualified
+	 type, if necessary.  */
   if (!same_type_p (basetype, TREE_TYPE (instance)))
 	{
 	  instance = build_this (instance);
@@ -9049,8 +9046,8 @@ build_new_method_call_1 (tree instance,
 	  instance = build_fold_indirect_ref (instance);
 	}
 }
-  if (DECL_DESTRUCTOR_P (fn))
-name = complete_dtor_identifier;
+  else
+gcc_assert (!DECL_DESTRUCTOR_P (fn) && !DECL_CONSTRUCTOR_P (fn));
 
   /* For the overload resolution we need to find the actual `this`
  that would be captured if the call turns out to be to a
Index: cp/decl.c
===
--- cp/decl.c	(revision 249835)
+++

[PATCH] Speed up SLP analysis

2017-06-30 Thread Richard Biener


This removes the redundant analyzing of all scalar stmts in a SLP
node (which also should cut down size of the dump file) as well
as move the vector type computation for BBs into the SLP analysis
routine (computing it only once).  The transform phase later does
essentially the same (only look at the first stmt).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-06-30  Richard Biener  

* tree-vect-slp.c (vect_slp_analyze_node_operations): Only
analyze the first scalar stmt.  Move vector type computation
for the BB case here from ...
* tree-vect-stmts.c (vect_analyze_stmt): ... here.  Guard
live operation processing in the SLP case properly.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 249834)
+++ gcc/tree-vect-slp.c (working copy)
@@ -2437,29 +2437,70 @@ vect_slp_analyze_node_operations (slp_tr
 if (!vect_slp_analyze_node_operations (child))
   return false;
 
-  bool res = true;
-  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt)
+  stmt = SLP_TREE_SCALAR_STMTS (node)[0];
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  gcc_assert (stmt_info);
+  gcc_assert (STMT_SLP_TYPE (stmt_info) != loop_vect);
+
+  /* For BB vectorization vector types are assigned here.
+ Memory accesses already got their vector type assigned
+ in vect_analyze_data_refs.  */
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
+  if (bb_vinfo
+  && ! STMT_VINFO_DATA_REF (stmt_info))
 {
-  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
-  gcc_assert (stmt_info);
-  gcc_assert (STMT_SLP_TYPE (stmt_info) != loop_vect);
-
-  /* Push SLP node def-type to stmt operands.  */
-  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), j, child)
-   if (SLP_TREE_DEF_TYPE (child) != vect_internal_def)
- STMT_VINFO_DEF_TYPE (vinfo_for_stmt (SLP_TREE_SCALAR_STMTS 
(child)[i]))
-   = SLP_TREE_DEF_TYPE (child);
-  res = vect_analyze_stmt (stmt, , node);
-  /* Restore def-types.  */
-  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), j, child)
-   if (SLP_TREE_DEF_TYPE (child) != vect_internal_def)
- STMT_VINFO_DEF_TYPE (vinfo_for_stmt (SLP_TREE_SCALAR_STMTS 
(child)[i]))
-   = vect_internal_def;
-  if (! res)
-   break;
+  gcc_assert (PURE_SLP_STMT (stmt_info));
+
+  tree scalar_type = TREE_TYPE (gimple_get_lhs (stmt));
+  if (dump_enabled_p ())
+   {
+ dump_printf_loc (MSG_NOTE, vect_location,
+  "get vectype for scalar type:  ");
+ dump_generic_expr (MSG_NOTE, TDF_SLIM, scalar_type);
+ dump_printf (MSG_NOTE, "\n");
+   }
+
+  tree vectype = get_vectype_for_scalar_type (scalar_type);
+  if (!vectype)
+   {
+ if (dump_enabled_p ())
+   {
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+  "not SLPed: unsupported data-type ");
+ dump_generic_expr (MSG_MISSED_OPTIMIZATION, TDF_SLIM,
+scalar_type);
+ dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
+   }
+ return false;
+   }
+
+  if (dump_enabled_p ())
+   {
+ dump_printf_loc (MSG_NOTE, vect_location, "vectype:  ");
+ dump_generic_expr (MSG_NOTE, TDF_SLIM, vectype);
+ dump_printf (MSG_NOTE, "\n");
+   }
+
+  gimple *sstmt;
+  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, sstmt)
+   STMT_VINFO_VECTYPE (vinfo_for_stmt (sstmt)) = vectype;
 }
 
-  return res;
+  /* Push SLP node def-type to stmt operands.  */
+  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), j, child)
+if (SLP_TREE_DEF_TYPE (child) != vect_internal_def)
+  STMT_VINFO_DEF_TYPE (vinfo_for_stmt (SLP_TREE_SCALAR_STMTS (child)[0]))
+   = SLP_TREE_DEF_TYPE (child);
+  bool res = vect_analyze_stmt (stmt, , node);
+  /* Restore def-types.  */
+  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), j, child)
+if (SLP_TREE_DEF_TYPE (child) != vect_internal_def)
+  STMT_VINFO_DEF_TYPE (vinfo_for_stmt (SLP_TREE_SCALAR_STMTS (child)[0]))
+   = vect_internal_def;
+  if (! res)
+return false;
+
+  return true;
 }
 
 
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 249834)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -8398,7 +8398,6 @@ vect_analyze_stmt (gimple *stmt, bool *n
   bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
   enum vect_relevant relevance = STMT_VINFO_RELEVANT (stmt_info);
   bool ok;
-  tree scalar_type, vectype;
   gimple *pattern_stmt;
   gimple_seq pattern_def_seq;
 
@@ -8529,48 +8528,6 @@ vect_analyze_stmt (gimple *stmt, bool *n
 gcc_unreachable ();
 }
 
-  if (bb_vinfo)
-{
-  gcc_assert (PURE_SLP_STMT (stmt_info));
-
-  /* Memory accesses

Re: [PATCH] Fix vec_extract_lo_* patterns (PR target/81225)

2017-06-30 Thread Kirill Yukhin

Hello Jakub,
On 29 Jun 18:51, Jakub Jelinek wrote:
> Hi!
> 
> This patch fixes various issues with the vec_extract_lo_* patterns.
> There are splitters for these, but only for some cases (no mask, and
> in one case also not xmm32+ reg) that change those into just a copy or load
> of the low part subreg, but if those can't be used, the vextract* insns
> don't accept memory input operand, but 3 of the 4 patterns have
> nonimmediate_operand input, which is wrong for the masked case, and the
> other one uses register_operand, even when the splitter can handle
> nonimmediate_operand when not masked.
> 
> Thus this patch makes sure that the input is nonimmediate_operand and v,vm
> if not masked and register_operand and v,v if masked, returns "#" to ensure
> splitting in cases the input is a memory, simplifies the conditions (for
> masked we don't need to test at runtime if both arguments aren't MEMs,
> because the predicate is now register_operand with v constraint), and
> changes the single case that used register_operand to follow the rest.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Your patch is OK for trunk.

> 
> 2017-06-29  Jakub Jelinek  
> 
>   PR target/81225
>   * config/i386/sse.md (vec_extract_lo_): For
>   V8FI, V16FI and VI8F_256 iterators, use  instead
>   of nonimmediate_operand and  instead of m for
>   the input operand.  For V8FI iterator, always split if input is a MEM.
>   For V16FI and V8SF_256 iterators, don't test if both operands are MEM
>   if .  For VI4F_256 iterator, use 
>   instead of register_operand and  instead of v for
> the input operand.  Make sure both operands aren't MEMs for if not
>   .
> 
>   * gcc.target/i386/pr81225.c: New test.

--
Thanks, K

Re: C++ PATCH to remove WITH_CLEANUP_EXPR handling

2017-06-30 Thread Nathan Sidwell


On 06/29/2017 05:44 PM, Jason Merrill wrote:

The C++ front end hasn't generated WITH_CLEANUP_EXPR in a very long
time (20+ years?), so there's no need to handle it.


I see dead code

nathan

--
Nathan Sidwell

Re: [PATCH] Fix removal of ifunc (PR ipa/81214).

2017-06-30 Thread Jan Hubicka

> Hello.
> 
> Following patch fixes the issue where we do not emit ifunc and resolver
> for function that are not called in a compilation unit or and not
> referenced.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> i386.exp tests work on x86_64-linux-gnu.

OK,
Honza
> 
> Ready to be installed?
> Martin
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-06-29  Martin Liska  
> 
>   PR ipa/81214
>   * gcc.target/i386/pr81214.c: New test.
> 
> gcc/ChangeLog:
> 
> 2017-06-29  Martin Liska  
> 
>   PR ipa/81214
>   * multiple_target.c (create_dispatcher_calls): Make ifunc
>   also for function that don't have calls or are not referenced.
> ---
>  gcc/multiple_target.c   | 64 
> -
>  gcc/testsuite/gcc.target/i386/pr81214.c | 14 
>  2 files changed, 46 insertions(+), 32 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr81214.c
> 
> 

> diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
> index eddc7d3744b..0d7cc3a2939 100644
> --- a/gcc/multiple_target.c
> +++ b/gcc/multiple_target.c
> @@ -68,6 +68,38 @@ create_dispatcher_calls (struct cgraph_node *node)
>|| !is_function_default_version (node->decl))
>  return;
>  
> +  if (!targetm.has_ifunc_p ())
> +{
> +  error_at (DECL_SOURCE_LOCATION (node->decl),
> + "the call requires ifunc, which is not"
> + " supported by this target");
> +  return;
> +}
> +  else if (!targetm.get_function_versions_dispatcher)
> +{
> +  error_at (DECL_SOURCE_LOCATION (node->decl),
> + "target does not support function version dispatcher");
> +  return;
> +}
> +
> +  tree idecl = targetm.get_function_versions_dispatcher (node->decl);
> +  if (!idecl)
> +{
> +  error_at (DECL_SOURCE_LOCATION (node->decl),
> + "default target_clones attribute was not set");
> +  return;
> +}
> +
> +  cgraph_node *inode = cgraph_node::get (idecl);
> +  gcc_assert (inode);
> +  tree resolver_decl = targetm.generate_version_dispatcher_body (inode);
> +
> +  /* Update aliases.  */
> +  inode->alias = true;
> +  inode->alias_target = resolver_decl;
> +  if (!inode->analyzed)
> +inode->resolve_alias (cgraph_node::get (resolver_decl));
> +
>auto_vec edges_to_redirect;
>auto_vec references_to_redirect;
>  
> @@ -80,38 +112,6 @@ create_dispatcher_calls (struct cgraph_node *node)
>  
>if (!edges_to_redirect.is_empty () || !references_to_redirect.is_empty ())
>  {
> -  if (!targetm.has_ifunc_p ())
> - {
> -   error_at (DECL_SOURCE_LOCATION (node->decl),
> - "the call requires ifunc, which is not"
> - " supported by this target");
> -   return;
> - }
> -  else if (!targetm.get_function_versions_dispatcher)
> - {
> -   error_at (DECL_SOURCE_LOCATION (node->decl),
> - "target does not support function version dispatcher");
> -   return;
> - }
> -
> -  tree idecl = targetm.get_function_versions_dispatcher (node->decl);
> -  if (!idecl)
> - {
> -   error_at (DECL_SOURCE_LOCATION (node->decl),
> - "default target_clones attribute was not set");
> -   return;
> - }
> -
> -  cgraph_node *inode = cgraph_node::get (idecl);
> -  gcc_assert (inode);
> -  tree resolver_decl = targetm.generate_version_dispatcher_body (inode);
> -
> -  /* Update aliases.  */
> -  inode->alias = true;
> -  inode->alias_target = resolver_decl;
> -  if (!inode->analyzed)
> - inode->resolve_alias (cgraph_node::get (resolver_decl));
> -
>/* Redirect edges.  */
>unsigned i;
>cgraph_edge *e;
> diff --git a/gcc/testsuite/gcc.target/i386/pr81214.c 
> b/gcc/testsuite/gcc.target/i386/pr81214.c
> new file mode 100644
> index 000..2584decdb3c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr81214.c
> @@ -0,0 +1,14 @@
> +/* PR ipa/81214.  */
> +/* { dg-do compile } */
> +/* { dg-require-ifunc "" } */
> +
> +__attribute__((target_clones("avx","arch=slm","arch=core-avx2","default")))
> +int
> +foo ()
> +{
> +  return -2;
> +}
> +
> +/* { dg-final { scan-assembler "\t.globl\tfoo" } } */
> +/* { dg-final { scan-assembler "foo.resolver:" } } */
> +/* { dg-final { scan-assembler "foo, @gnu_indirect_function" } } */
>

Re: [PATCH 4/N] Recover GOTO predictor.

2017-06-30 Thread Jan Hubicka

> PING^1
> 
> Can you please Honza give a formal approval for the patch?

OK,
thanks!
Honza
> 
> Thanks,
> Martin
> 
> On 06/22/2017 01:47 PM, Richard Biener wrote:
> > On Thu, Jun 22, 2017 at 12:57 PM, Martin Liška  wrote:
> >> On 06/22/2017 12:27 PM, Richard Biener wrote:
> >>> On Wed, Jun 21, 2017 at 3:06 PM, Martin Liška  wrote:
>  Hello.
> 
>  There's one additional predictor enhancement that is GOTO predict that
>  used to working. Following patch adds expect statement for C and C++ 
>  family
>  languages.
> 
>  There's one fallout which is vrp24.c test-case, where only 'Simplified 
>  relational'
>  appears just once. Adding Richi and Patrick who can probably help how to 
>  fix the
>  test-case.
> >>>
> >>> Happens to be optimized better now, there's only one predicate to simplify
> >>> left in the IL input to VRP1.  I suggest to change it to match 1 times 
> >>> and add
> >>> -fdump-tree-optimized to dg-options and
> >>>
> >>> /* { dg-final { scan-tree-dump-times "if " 3 "optimized" } } */
> >>>
> >>> to verify we have 3 conditions left.
> >>
> >> Thanks for help.
> >> What about the comment:
> >>
> >> /* The first n_sets > 0 test can be simplfiied into n_sets == 1 since
> >>n_sets can only have the values [0, 1] as it's the result of a
> >>boolean operation.
> >>
> >>The second n_sets > 0 test can also be simplified into n_sets == 1
> >>as the only way to reach the tests is when n_sets <= 1 and the only
> >>value which satisfies both conditions is n_sets == 1.  */
> >>
> >> I guess just only one can be valid? Or is it a different story now?
> > 
> > The 2nd one is handled by earlier jump-threading.
> > 
> >> Martin
> >>
> >>>
>  Patch can bootstrap on ppc64le-redhat-linux and survives regression 
>  tests.
> 
>  Ready to be installed?
>  Martin
> >>

Re: [PATCH v9] add -fpatchable-function-entry=N,M option

2017-06-30 Thread Torsten Duwe


Richard, Ping?

On Tue, Jun 13, 2017 at 07:00:27PM +0200, Torsten Duwe wrote:
> Changes since v8:
> 
>   * Documentation changes as requested by Sandra
>   * 3 functional test cases added
> 
>   Torsten
> 
> 
> gcc/c-family/ChangeLog
> 2017-06-13  Torsten Duwe  
> 
>   * c-attribs.c (c_common_attribute_table): Add entry for
>   "patchable_function_entry".
> 
> gcc/lto/ChangeLog
> 2017-06-13  Torsten Duwe  
> 
>   * lto-lang.c (lto_attribute_table): Add entry for
>   "patchable_function_entry".
> 
> gcc/ChangeLog
> 2017-06-13  Torsten Duwe  
> 
>   * common.opt: Introduce -fpatchable-function-entry
>   command line option, and its variables function_entry_patch_area_size
>   and function_entry_patch_area_start.
>   * opts.c (common_handle_option): Add -fpatchable_function_entry_ case,
>   including a two-value parser.
>   * target.def (print_patchable_function_entry): New target hook.
>   * targhooks.h (default_print_patchable_function_entry): New function.
>   * targhooks.c (default_print_patchable_function_entry): Likewise.
>   * toplev.c (process_options): Switch off IPA-RA if
>   patchable function entries are being generated.
>   * varasm.c (assemble_start_function): Look at the
>   patchable-function-entry command line switch and current
>   function attributes and maybe generate NOP instructions by
>   calling the print_patchable_function_entry hook.
>   * doc/extend.texi: Document patchable_function_entry attribute.
>   * doc/invoke.texi: Document -fpatchable_function_entry
>   command line option.
>   * doc/tm.texi.in (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY):
>   New target hook.
>   * doc/tm.texi: Likewise.
> 
> gcc/testsuite/ChangeLog
> 2017-06-13  Torsten Duwe  
> 
>   * c-c++-common/patchable_function_entry-default.c: New test.
>   * c-c++-common/patchable_function_entry-decl.c: Likewise.
>   * c-c++-common/patchable_function_entry-definition.c: Likewise.
> 
> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> index f2a88e147ba..31137ce0433 100644
> --- a/gcc/c-family/c-attribs.c
> +++ b/gcc/c-family/c-attribs.c
> @@ -139,6 +139,8 @@ static tree handle_bnd_variable_size_attribute (tree *, 
> tree, tree, int, bool *)
>  static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
>  static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
>  static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_patchable_function_entry_attribute (tree *, tree, tree,
> +int, bool *);
>  
>  /* Table of machine-independent attributes common to all C-like languages.
>  
> @@ -345,6 +347,9 @@ const struct attribute_spec c_common_attribute_table[] =
> handle_bnd_instrument, false },
>{ "fallthrough", 0, 0, false, false, false,
> handle_fallthrough_attribute, false },
> +  { "patchable_function_entry",  1, 2, true, false, false,
> +   handle_patchable_function_entry_attribute,
> +   false },
>{ NULL, 0, 0, false, false, false, NULL, false }
>  };
>  
> @@ -3173,3 +3178,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, 
> int,
>*no_add_attrs = true;
>return NULL_TREE;
>  }
> +
> +static tree
> +handle_patchable_function_entry_attribute (tree *, tree, tree, int, bool *)
> +{
> +  /* Nothing to be done here.  */
> +  return NULL_TREE;
> +}
> diff --git a/gcc/common.opt b/gcc/common.opt
> index a5c3aeaa336..f542590650c 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
>  Variable
>  int flag_debug_asm
>  
> +; How many NOP insns to place at each function entry by default
> +Variable
> +HOST_WIDE_INT function_entry_patch_area_size
> +
> +; And how far the real asm entry point is into this area
> +Variable
> +HOST_WIDE_INT function_entry_patch_area_start
>  
>  ; Balance between GNAT encodings and standard DWARF to emit.
>  Variable
> @@ -2022,6 +2029,10 @@ fprofile-reorder-functions
>  Common Report Var(flag_profile_reorder_functions)
>  Enable function reordering that improves code placement.
>  
> +fpatchable-function-entry=
> +Common Joined Optimization
> +Insert NOP instructions at each function entry.
> +
>  frandom-seed
>  Common Var(common_deferred_options) Defer
>  
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1255995eb78..d09ccd90c42 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -3083,6 +3083,25 @@ that affect more than one function.
>  This attribute should be used for debugging purposes only.  It is not
>  suitable in production code.
>  
> +@item patchable_function_entry
> +@cindex @code{patchable_function_entry} function attribute
> +@cindex extra NOP

[PR c++/81229] Dangling GC pointer

2017-06-30 Thread Nathan Sidwell

This fixes 81229, an ICE during GC.  When converting the namespace 
symbol handling I was mystified by a SET_IDENTIFIER_TYPE call that 
seemingly had no effect.  Turns out it is needed, because we've already 
set the IDENTIFIER_TYPE to the new decl's type.  Whose name points back 
at the new TYPE_DECL.  Which duplicate_decls gcc_freed when it found the 
already matched one.  So we need to change IDENTIFIER_TYPE to the old 
decl's type node.


This problem didn't manifest on a non-attributed int typedef, not sure 
why.  I suspect I'll be coming back to revisit this code as my symbol 
table overhaul continues.


nathan
--
Nathan Sidwell
2017-06-30  Nathan Sidwell  

	PR c++/81229
	* name-lookup.c (do_pushdecl): Reset IDENTIFIER_TYPE when finding
	a matching TYPE_DECL.

	* g++.dg/lookup/pr81229.C: New.

Index: gcc/cp/name-lookup.c
===
--- gcc/cp/name-lookup.c	(revision 249834)
+++ gcc/cp/name-lookup.c	(working copy)
@@ -2354,9 +2354,13 @@ do_pushdecl (tree decl, bool is_friend)
 	  ; /* Ignore using decls here.  */
 	else if (tree match = duplicate_decls (decl, *iter, is_friend))
 	  {
-	if (iter.hidden_p ()
-		&& match != error_mark_node
-		&& !DECL_HIDDEN_P (match))
+	if (match == error_mark_node)
+	  ;
+	else if (TREE_CODE (match) == TYPE_DECL)
+	  /* The IDENTIFIER will have the type referring to the
+		 now-smashed TYPE_DECL, because ...?  Reset it.  */
+	  SET_IDENTIFIER_TYPE_VALUE (name, TREE_TYPE (match));
+	else if (iter.hidden_p () && !DECL_HIDDEN_P (match))
 	  {
 		/* Unhiding a previously hidden decl.  */
 		tree head = iter.reveal_node (old);
Index: gcc/testsuite/g++.dg/lookup/pr81229.C
===
--- gcc/testsuite/g++.dg/lookup/pr81229.C	(revision 0)
+++ gcc/testsuite/g++.dg/lookup/pr81229.C	(working copy)
@@ -0,0 +1,7 @@
+// PR c++/81229 GC ICE with stale pointed in identifier type.
+// { dg-additional-options "--param ggc-min-heapsize=0" }
+
+typedef unsigned L;
+typedef unsigned L;
+
+L l;

Re: [PATCH GCC][13/13]Distribute loop with loop versioning under runtime alias check

2017-06-30 Thread Bin.Cheng

On Wed, Jun 28, 2017 at 2:09 PM, Bin.Cheng  wrote:
> On Wed, Jun 28, 2017 at 1:29 PM, Richard Biener
>  wrote:
>> On Wed, Jun 28, 2017 at 1:46 PM, Bin.Cheng  wrote:
>>> On Wed, Jun 28, 2017 at 11:58 AM, Richard Biener
>>>  wrote:
 On Tue, Jun 27, 2017 at 4:07 PM, Bin.Cheng  wrote:
> On Tue, Jun 27, 2017 at 1:44 PM, Richard Biener
>  wrote:
>> On Fri, Jun 23, 2017 at 12:30 PM, Bin.Cheng  
>> wrote:
>>> On Tue, Jun 20, 2017 at 10:22 AM, Bin.Cheng  
>>> wrote:
 On Mon, Jun 12, 2017 at 6:03 PM, Bin Cheng  wrote:
> Hi,
>>> Rebased V3 for changes in previous patches.  Bootstap and test on
>>> x86_64 and aarch64.
>>
>> why is ldist-12.c no longer distributed?  your comment says it doesn't 
>> expose
>> more "parallelism" but the point is to reduce memory bandwith 
>> requirements
>> which it clearly does.
>>
>> Likewise for -13.c, -14.c.  -4.c may be a questionable case but the 
>> wording
>> of "parallelism" still confuses me.
>>
>> Can you elaborate on that.  Now onto the patch:
> Given we don't model data locality or memory bandwidth, whether
> distribution enables loops that can be executed paralleled becomes the
> major criteria for distribution.  BTW, I think a good memory stream
> optimization model shouldn't consider small loops as in ldist-12.c,
> etc., appropriate for distribution.

 True.  But what means "parallel" here?  ldist-13.c if partitioned into two 
 loops
 can be executed "in parallel"
>>> So if a loop by itself can be vectorized (or so called can be executed
>>> paralleled), we tend to no distribute it into small ones.  But there
>>> is one exception here, if the distributed small loops are recognized
>>> as builtin functions, we still distribute it.  I assume it's generally
>>> better to call builtin memory functions than vectorize it by GCC?
>>
>> Yes.
>>

>>
>> +   Loop distribution is the dual of loop fusion.  It separates 
>> statements
>> +   of a loop (or loop nest) into multiple loops (or loop nests) with the
>> +   same loop header.  The major goal is to separate statements which may
>> +   be vectorized from those that can't.  This pass implements 
>> distribution
>> +   in the following steps:
>>
>> misses the goal of being a memory stream optimization, not only a 
>> vectorization
>> enabler.  distributing a loop can also reduce register pressure.
> I will revise the comment, but as explained, enabling more
> vectorization is the major criteria for distribution to some extend
> now.

 Yes, I agree -- originally it was written to optimize the stream benchmark 
 IIRC.
>>> Let's see if any performance drop will be reported against this patch.
>>> Let's see if we can create a cost model for it.
>>
>> Fine.
> I will run some benchmarks to see if there is breakage.
>>

>>
>> You introduce ldist_alias_id in struct loop (probably in 01/n which I
>> didn't look
>> into yet).  If you don't use that please introduce it separately.
> Hmm, yes it is introduced in patch [01/n] and set in this patch.
>
>>
>> + /* Be conservative.  If data references are not well 
>> analyzed,
>> +or the two data references have the same base address 
>> and
>> +offset, add dependence and consider it alias to each 
>> other.
>> +In other words, the dependence can not be resolved by
>> +runtime alias check.  */
>> + if (!DR_BASE_ADDRESS (dr1) || !DR_BASE_ADDRESS (dr2)
>> + || !DR_OFFSET (dr1) || !DR_OFFSET (dr2)
>> + || !DR_INIT (dr1) || !DR_INIT (dr2)
>> + || !DR_STEP (dr1) || !tree_fits_uhwi_p (DR_STEP (dr1))
>> + || !DR_STEP (dr2) || !tree_fits_uhwi_p (DR_STEP (dr2))
>> + || res == 0)
>>
>> ISTR a helper that computes whether we can handle a runtime alias check 
>> for
>> a specific case?
> I guess you mean runtime_alias_check_p that I factored out previously?
>  Unfortunately, it's factored out vectorizer's usage and doesn't fit
> here straightforwardly.  Shall I try to further generalize the
> interface as independence patch to this one?

 That would be nice.

>>
>> +  /* Depend on vectorizer to fold IFN_LOOP_DIST_ALIAS.  */
>> +  if (flag_tree_loop_vectorize)
>> +{
>>
>> so at this point I'd condition the whole runtime-alias check generating
>> on flag_tree_loop_vectorize.  You seem to support versioning w/o
>> that here but in other places disable

Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS

2017-06-30 Thread Bin.Cheng

On Wed, Jun 28, 2017 at 8:29 AM, Richard Biener
 wrote:
> On Tue, Jun 27, 2017 at 6:46 PM, Bin.Cheng  wrote:
>> On Tue, Jun 27, 2017 at 3:59 PM, Richard Biener
>>  wrote:
>>> On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng"  
>>> wrote:
On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
 wrote:
> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng 
wrote:
>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng 
wrote:
>>> Hi,
>>> I was asked by upstream to split the loop distribution patch into
small ones.
>>> It is hard because data structure and algorithm are closely coupled
together.
>>> Anyway, this is the patch series with smaller patches.  Basically I
tried to
>>> separate data structure and bug-fix changes apart with one as the
main patch.
>>> Note I only made necessary code refactoring in order to separate
patch, apart
>>> from that, there is no change against the last version.
>>>
>>> This is the first patch introducing new internal function
IFN_LOOP_DIST_ALIAS.
>>> GCC will distribute loops under condition of this function call.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>> Hi,
>> I need to update this patch fixing an issue in
>> vect_loop_dist_alias_call.  The previous patch fails to find some
>> IFN_LOOP_DIST_ALIAS calls.
>>
>> Bootstrap and test in series.  Is it OK?
>
> So I wonder if we really need to track ldist_alias_id or if we can do
sth
Yes, it is needed because otherwise we probably falsely trying to
search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
loop.

> more "general", like tracking a copy_of or origin and then directly
> go to nearest_common_dominator (loop->header, copy_of->header)
> to find the controlling condition?
I tend to not record any pointer in loop structure, it can easily go
dangling for a across passes data structure.
>>>
>>> I didn't mean to record a pointer, just rename your field and make it more 
>>> general.  The common dominator thing shod still work, no?
>> I might not be following.  If we record the original loop->num in the
>> renamed field, nearest_common_dominator can't work because we don't
>> have basic blocks to start the call?  The original loop could be
>> eliminated at several points, for example, instantly after
>> distribution, or folded in vectorizer for other loops distributed from
>> the original loop.
>> BTW, setting the copy_of/origin field in loop_version is not enough
>> for this use case, all generated loops (actually, except the versioned
>> loop) from distribution need to be set.
>
> Of course it would need to be set for all distributed loops.
>
> I'm not sure "loop vanishes" is the case to optimize for though.  If the loop
> is still available then origin->header should work as BB.  If not then can't
> we conclude, for the purpose of IFN_LOOP_DIST_ALIAS, that the whole
> region must be dead?  We still have to identify it of course, but it means
> we can fold stray IFN_LOOP_DIST_ALIAS calls left over after vectorization
> to whatever we like?
>
>>>
>>> As far as memory usage
is concerned.  I actually don't need a whole integer to record the
loop num.  I can simply restrict number of distributions in one
function to at most 256, and record such id in a char field in struct
loop?  Does this sounds better?
>>>
>>> As said, tracking loop origin sounds useful anyway so I'd rather add and 
>>> use that somehow.
>> To be honest, I don't know.  the current field works like a unique
>> index of distribution operation.  The original loop could be destroyed
>> at different points thus no longer exists, this makes the recorded
>> copy_of/origin less meaningful?
>
> I think we talked about prologue and epilogue loops to be easier identifiable
> as so (and as to what "main" loop).  So lets say we have one "origin" field
> and accompaning flags "origin_is_loop_dist_alias_version",
> "origin_is_main_loop_of_prologue", etc.?  I can't think of the case where
> origin would be two things at the same time (you can always walk up
> the origin tree).
Hi,
Here is the updated patch working in this way.  There is still one
problem with this method.  Considering one distributed loop is
if-converted later, the orig loop for if-converted distributed loop is
different.  Though we can update orig_loop_num, it's inaccurate and
one consequence is we need to walk up dominance tree till entry_block.
Note if orig_loop_num is not shared, we can stop once basic block goes
beyond outer loop.
I didn't introduce flags in this patch, which can be done along with
solving the above problem.

>
> Vanishing origins could also be cleared btw, or we can make the new origin
> the origin of the vanishing origin...
Yes,

Re: [PATCH 0/3] Introduce internal_error_cont and exclude it from pot files

2017-06-30 Thread Martin Liška

On 04/10/2017 05:44 PM, Richard Biener wrote:
> On Mon, Apr 10, 2017 at 5:42 PM, Jeff Law  wrote:
>> On 04/07/2017 04:30 AM, Martin Liška wrote:
>>>
>>> On 04/06/2017 06:44 PM, Jeff Law wrote:

 On 03/24/2017 03:29 AM, Martin Liška wrote:
>
> I would like to ping that. I'm not sure what's agreement after I read
> discussion in: https://gcc.gnu.org/ml/gcc/2017-03/msg00070.html
>
> Martin Sebor may know, CC'ing him.

 Not sure if you're pinging the internal_error_cont stuff, or the ODR
 diagnostics changes.

 WRT the ODR diagnostics, I'd say let's go with the C++17 style
 (all-lowercase with a hyphen).

 If you've got a pointer to the internal_err_cont changes, please pass it
 along.
>>>
>>>
>>> Yep, I was interested in internal_err_cont. It's next stage1 material, but
>>> I'm willing to have
>>> a feedback whether separation of internal messages is desired to be
>>> excluded from translation or not?
>>
>> As stage1 stuff, I'll largely be ignoring.
>>
>> I'm torn on translating this stuff.  It's hard enough for a non-developer to
>> interpret an ICE message, if it's not in their native language it'd be
>> impossible.
>>
>> OTOH, if we translate and the user forwards the translated message to us, we
>> may not be able to interpret.
>>
>> That probably argues there's some part that should be translated to be
>> easier for the users, but the real guts of the message should not be
>> translated.
> 
> Message number and catalogue.
> 
> /me runs...
> 
>> jeff

I there any discussion going on this topic? If not, I'll probably leave it as 
it is.

Thanks,
Martin

Re: [PATCH] Remove dead code in asan.c

2017-06-30 Thread Jakub Jelinek

On Fri, Jun 30, 2017 at 12:00:36PM +0200, Martin Liška wrote:
> Hi.
> 
> Following crap code was added by me when I added use-after-scope.
> Actually decl always points to LASANPC, so asan_handled_variables->contains 
> (decl)
> is always false.
> 
> Well, originally the idea was to not clear content (place in shadow memory in 
> between
> red zoner) of auto variables, but as we emit 0xf5 in order to have working 
> use-after-return,
> it probably does not worth for doing an optimization?

use-after-return is only runtime conditional, defaults to off.
And your patch doesn't bring the code to anything close to what we had
before the -fsanitize-use-after-scope changes, just look what it did before
- only cleared the shadow spots that weren't known to be 0, clearing the
whole shadow might be too expensive.  Consider many KB large local
variables.

You can find the right decl in decls[l / 2] or decls[l / 2 - 1] or so.

Jakub

[PATCH] Remove dead code in asan.c

2017-06-30 Thread Martin Liška

Hi.

Following crap code was added by me when I added use-after-scope.
Actually decl always points to LASANPC, so asan_handled_variables->contains 
(decl)
is always false.

Well, originally the idea was to not clear content (place in shadow memory in 
between
red zoner) of auto variables, but as we emit 0xf5 in order to have working 
use-after-return,
it probably does not worth for doing an optimization?

Survives asan.exp tests.

Ready to be installed after proper regression tests?

Martin

gcc/ChangeLog:

2017-06-30  Martin Liska  

* asan.c (asan_emit_stack_protection): Do not use
asan_handled_variables.
(asan_expand_mark_ifn): Likewise.
---
 gcc/asan.c | 62 +-
 1 file changed, 1 insertion(+), 61 deletions(-)


diff --git a/gcc/asan.c b/gcc/asan.c
index 2de16402c51..b415e28bd2f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -246,11 +246,6 @@ static unsigned HOST_WIDE_INT asan_shadow_offset_value;
 static bool asan_shadow_offset_computed;
 static vec sanitized_sections;
 
-/* Set of variable declarations that are going to be guarded by
-   use-after-scope sanitizer.  */
-
-static hash_set *asan_handled_variables = NULL;
-
 hash_set  *asan_used_labels = NULL;
 
 /* Sets shadow offset to value in string VAL.  */
@@ -1297,63 +1292,11 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   if (STRICT_ALIGNMENT)
 set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
 
-  /* Unpoison shadow memory of a stack at the very end of a function.
- As we're poisoning stack variables at the end of their scope,
- shadow memory must be properly unpoisoned here.  The easiest approach
- would be to collect all variables that should not be unpoisoned and
- we unpoison shadow memory of the whole stack except ranges
- occupied by these variables.  */
   last_offset = base_offset;
-  HOST_WIDE_INT current_offset = last_offset;
   if (length)
-{
-  HOST_WIDE_INT var_end_offset = 0;
-  HOST_WIDE_INT stack_start = offsets[length - 1];
-  gcc_assert (last_offset == stack_start);
-
-  for (int l = length - 2; l > 0; l -= 2)
-	{
-	  HOST_WIDE_INT var_offset = offsets[l];
-	  current_offset = var_offset;
-	  var_end_offset = offsets[l - 1];
-	  HOST_WIDE_INT rounded_size = ROUND_UP (var_end_offset - var_offset,
-	 BITS_PER_UNIT);
-
-	  /* Should we unpoison the variable?  */
-	  if (asan_handled_variables != NULL
-	  && asan_handled_variables->contains (decl))
-	{
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-		{
-		  const char *n = (DECL_NAME (decl)
-   ? IDENTIFIER_POINTER (DECL_NAME (decl))
-   : "");
-		  fprintf (dump_file, "Unpoisoning shadow stack for variable: "
-			   "%s (%" PRId64 "B)\n", n,
-			   var_end_offset - var_offset);
-		}
-
-	  unsigned HOST_WIDE_INT s
-		= shadow_mem_size (current_offset - last_offset);
-	  asan_clear_shadow (shadow_mem, s);
-	  HOST_WIDE_INT shift
-		= shadow_mem_size (current_offset - last_offset + rounded_size);
-	  shadow_mem = adjust_address (shadow_mem, VOIDmode, shift);
-	  last_offset = var_offset + rounded_size;
-	  current_offset = last_offset;
-	}
-
-	}
-
-  /* Handle last redzone.  */
-  current_offset = offsets[0];
-  asan_clear_shadow (shadow_mem,
-			 shadow_mem_size (current_offset - last_offset));
-}
+asan_clear_shadow (shadow_mem, shadow_mem_size (offsets[0] - last_offset));
 
   /* Clean-up set with instrumented stack variables.  */
-  delete asan_handled_variables;
-  asan_handled_variables = NULL;
   delete asan_used_labels;
   asan_used_labels = NULL;
 
@@ -2802,9 +2745,6 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter)
 decl = TREE_OPERAND (decl, 0);
 
   gcc_checking_assert (TREE_CODE (decl) == VAR_DECL);
-  if (asan_handled_variables == NULL)
-asan_handled_variables = new hash_set (16);
-  asan_handled_variables->add (decl);
   tree len = gimple_call_arg (g, 2);
 
   gcc_assert (tree_fits_shwi_p (len));

[PATCH] Fix dot-fn

2017-06-30 Thread Richard Biener


Committed.

2017-06-30  Richard Biener  

* graph.c (draw_cfg_node_succ_edges): Fix broken dot syntax.

Index: gcc/graph.c
===
--- gcc/graph.c (revision 249832)
+++ gcc/graph.c (working copy)
@@ -136,13 +136,13 @@ draw_cfg_node_succ_edges (pretty_printer
 
   pp_printf (pp,
 "\tfn_%d_basic_block_%d:s -> fn_%d_basic_block_%d:n "
-"[style=%s,color=%s,weight=%d,constraint=%s];\n",
+"[style=%s,color=%s,weight=%d,constraint=%s",
 funcdef_no, e->src->index,
 funcdef_no, e->dest->index,
 style, color, weight,
 (e->flags & (EDGE_FAKE | EDGE_DFS_BACK)) ? "false" : "true");
   if (e->probability.initialized_p ())
-pp_printf (pp, ", label=\"[%i%%]\"",
+pp_printf (pp, ",label=\"[%i%%]\"",
   e->probability.to_reg_br_prob_base ()
   * 100 / REG_BR_PROB_BASE);
   pp_printf (pp, "];\n");

Re: [PATCH] ASAN: handle addressable params (PR sanitize/81040).

2017-06-30 Thread Jakub Jelinek

On Fri, Jun 30, 2017 at 11:21:48AM +0200, Martin Liška wrote:
> @@ -858,6 +864,132 @@ sanitize_asan_mark_poison (void)
>  }
>  }
>  
> +/* Rewrite all usages of tree OP which is a PARM_DECL with a VAR_DECL
> +   that is it's DECL_VALUE_EXPR.  */
> +
> +static tree
> +rewrite_usage_of_param (tree *op, int *walk_subtrees, void *)
> +{
> +  if (TREE_CODE (*op) == PARM_DECL && DECL_HAS_VALUE_EXPR_P (*op))
> +{
> +  *op = DECL_VALUE_EXPR (*op);
> +  *walk_subtrees = 0;
> +}

If you are going to rely just on DECL_HAS_VALUE_EXPR_P here

> +  for (tree arg = DECL_ARGUMENTS (current_function_decl);
> +   arg; arg = DECL_CHAIN (arg))
> +{

I think you should gcc_assert here that !DECL_HAS_VALUE_EXPR_P (arg) here.

If that ever fails, another possibility would be to temporarily clear those
flags and remember it and then set it again after the walk_*.  The question
would be what to do with arguments that already have DECL_VALUE_EXPR, are
addressable and you want to rewrite them.

> +  if (TREE_ADDRESSABLE (arg) && !TREE_ADDRESSABLE (TREE_TYPE (arg)))
> + {
> +   TREE_ADDRESSABLE (arg) = 0;
> +   /* The parameter is no longer addressable.  */
> +   tree type = TREE_TYPE (arg);
> +   has_any_addressable_param = true;

Otherwise LGTM.

Jakub

Re: [PATCH 4/N] Recover GOTO predictor.

2017-06-30 Thread Martin Liška

PING^1

Can you please Honza give a formal approval for the patch?

Thanks,
Martin

On 06/22/2017 01:47 PM, Richard Biener wrote:
> On Thu, Jun 22, 2017 at 12:57 PM, Martin Liška  wrote:
>> On 06/22/2017 12:27 PM, Richard Biener wrote:
>>> On Wed, Jun 21, 2017 at 3:06 PM, Martin Liška  wrote:
 Hello.

 There's one additional predictor enhancement that is GOTO predict that
 used to working. Following patch adds expect statement for C and C++ family
 languages.

 There's one fallout which is vrp24.c test-case, where only 'Simplified 
 relational'
 appears just once. Adding Richi and Patrick who can probably help how to 
 fix the
 test-case.
>>>
>>> Happens to be optimized better now, there's only one predicate to simplify
>>> left in the IL input to VRP1.  I suggest to change it to match 1 times and 
>>> add
>>> -fdump-tree-optimized to dg-options and
>>>
>>> /* { dg-final { scan-tree-dump-times "if " 3 "optimized" } } */
>>>
>>> to verify we have 3 conditions left.
>>
>> Thanks for help.
>> What about the comment:
>>
>> /* The first n_sets > 0 test can be simplfiied into n_sets == 1 since
>>n_sets can only have the values [0, 1] as it's the result of a
>>boolean operation.
>>
>>The second n_sets > 0 test can also be simplified into n_sets == 1
>>as the only way to reach the tests is when n_sets <= 1 and the only
>>value which satisfies both conditions is n_sets == 1.  */
>>
>> I guess just only one can be valid? Or is it a different story now?
> 
> The 2nd one is handled by earlier jump-threading.
> 
>> Martin
>>
>>>
 Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

 Ready to be installed?
 Martin
>>

[PATCH v2][RFC] Canonize names of attributes.

2017-06-30 Thread Martin Liška

On 06/28/2017 09:01 PM, Jason Merrill wrote:
> On Wed, Jun 28, 2017 at 12:05 PM, Joseph Myers  
> wrote:
>> On Wed, 28 Jun 2017, Martin Liška wrote:
>>
>>> On 06/14/2017 07:24 PM, Jason Merrill wrote:
 On Tue, Jun 13, 2017 at 8:32 AM, Martin Liška  wrote:
> (canonize_attr_name): New function.

 I think this should be "canonicalize"; "canonize" means something else.

 Jason

>>>
>>> Yes, I hope it's a cosmetic problem. In general, do you welcome the change
>>> to canonicalize attribute names?
>>
>> I think *names* (as opposed to arguments) should be canonicalized, but
>> arguments need separate consideration.
> 
> I agree.
> 
> Jason
> 

Good.

This is v2 of the patch, where just names of attributes are canonicalized.
Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

>From 4c604148da4d837b06b166512ae71ddf83e5cf3b Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 8 Jun 2017 10:23:25 +0200
Subject: [PATCH] Canonicalize names of attributes.

gcc/c-family/ChangeLog:

2017-06-29  Martin Liska  

	* c-format.c (convert_format_name_to_system_name): Add argument
	to function call.
	(cmp_attribs): Remove the function.
	* c-lex.c (c_common_has_attribute): Use canonicalize_attr_name.
	* c-pretty-print.c: Include stringpool.h.

gcc/cp/ChangeLog:

2017-06-29  Martin Liska  

	* parser.c (cp_parser_gnu_attribute_list): Use
	canonicalize_attr_name.
	(cp_parser_std_attribute): Likewise.
	* tree.c: Include stringpool.h.

gcc/go/ChangeLog:

2017-06-29  Martin Liska  

	* go-gcc.cc (Gcc_backend::function): Look up for no_split_stack
	and not __no_split_stack__.

gcc/c/ChangeLog:

2017-06-29  Martin Liska  

	* c-parser.c (c_parser_attributes): Use canonicalize_attr_name.

gcc/testsuite/ChangeLog:

2017-06-29  Martin Liska  

	* g++.dg/cpp0x/pr65558.C: Update scanned pattern.
	* gcc.dg/parm-impl-decl-1.c: Likewise.
	* gcc.dg/parm-impl-decl-3.c: Likewise.

gcc/ChangeLog:

2017-06-29  Martin Liska  

	* attribs.h (canonicalize_attr_name): New function.
	* tree.c (cmp_attrib_identifiers): Use cmp_attribs.
	(attribute_value_equal): Add argument to a call of function.
	(private_is_attribute_p): Add checking and use id_equal.
	(private_lookup_attribute): Consider attributes do not begin
	with '_' character.
	(private_lookup_attribute_by_prefix): Likewise.
	(remove_attribute): Use string comparison.
	* tree.h (cmp_attribs): New functions.
---
 gcc/attribs.h   |  19 ++
 gcc/c-family/c-format.c |  27 ++---
 gcc/c-family/c-lex.c|   1 +
 gcc/c-family/c-pretty-print.c   |   1 +
 gcc/c/c-parser.c|   3 +
 gcc/cp/parser.c |   6 +-
 gcc/cp/tree.c   |   1 +
 gcc/go/go-gcc.cc|   2 +-
 gcc/testsuite/g++.dg/cpp0x/pr65558.C|   2 +-
 gcc/testsuite/gcc.dg/parm-impl-decl-1.c |   2 +-
 gcc/testsuite/gcc.dg/parm-impl-decl-3.c |   2 +-
 gcc/tree.c  | 103 +++-
 gcc/tree.h  |  42 +
 13 files changed, 103 insertions(+), 108 deletions(-)

diff --git a/gcc/attribs.h b/gcc/attribs.h
index 7f13332700e..b3b68b2e57e 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -47,4 +47,23 @@ extern char *make_unique_name (tree, const char *, bool);
 extern tree make_dispatcher_decl (const tree);
 extern bool is_function_default_version (const tree);
 
+/* For a given IDENTIFIER_NODE, strip leading and trailing '_' characters
+   so that we have a canonical form of attribute names.  */
+
+static inline tree
+canonicalize_attr_name (tree attr_name)
+{
+  const size_t l = IDENTIFIER_LENGTH (attr_name);
+  const char *s = IDENTIFIER_POINTER (attr_name);
+
+  if (l > 4 && s[0] == '_')
+{
+  gcc_checking_assert (s[l - 2] == '_');
+  return get_identifier_with_length (s + 2, l - 4);
+}
+
+  return attr_name;
+}
+
+
 #endif // GCC_ATTRIBS_H
diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 732339b9b5e..3aad17522df 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -67,7 +67,6 @@ static bool check_format_string (tree argument,
 static bool get_constant (tree expr, unsigned HOST_WIDE_INT *value,
 			  int validated_p);
 static const char *convert_format_name_to_system_name (const char *attr_name);
-static bool cmp_attribs (const char *tattr_name, const char *attr_name);
 
 static int first_target_format_type;
 static const char *format_name (int format_num);
@@ -3951,10 +3950,10 @@ convert_format_name_to_system_name (const char *attr_name)
   for (i = 0; i < TARGET_OVERRIDES_FORMAT_ATTRIBUTES_COUNT; ++i)
 {
   if (cmp_attribs (TARGET_OVERRIDES_FORMAT_ATTRIBUTES[i].named_attr_src,
-			   attr_name))
+

Re: [PATCH] ASAN: handle addressable params (PR sanitize/81040).

2017-06-30 Thread Martin Liška

On 06/29/2017 01:17 PM, Jakub Jelinek wrote:
> On Tue, Jun 20, 2017 at 03:06:56PM +0200, Martin Liška wrote:
>> +/* Rewrite all usages of tree OP which is a PARM_DECL with a VAR_DECL
>> +   that is it's DECL_VALUE_EXPR.  */
>> +
>> +static tree
>> +rewrite_usage_of_param (tree *op, int *walk_subtrees, void *)
>> +{
>> +  if (TREE_CODE (*op) == PARM_DECL && DECL_VALUE_EXPR (*op) != NULL_TREE)
> 
> DECL_VALUE_EXPR testing is costly (it is a hash table lookup).
> Therefore you should test DECL_HAS_VALUE_EXPR_P (*op) after checking
> == PARM_DECL.  And DECL_HAS_VALUE_EXPR_P should apply non-NULL
> DECL_VALUE_EXPR.
> That said, I wonder if we don't create DECL_VALUE_EXPR for PARM_DECLs in
> other parts of the compiler, whether it wouldn't be safer to also test here
> after == PARM_DECL and DECL_HAS_VALUE_EXPR_P check whether *op is in
> addressable_params hash table.

Thanks for explanation, DECL_HAS_VALUE_EXPR_P is a flag, while DECL_VALUE_EXPR
is a hash table lookup.

> 
>> +{
>> +  *op = DECL_VALUE_EXPR (*op);
>> +  *walk_subtrees = 0;
>> +}
>> +
>> +  return NULL;
>> +}
>> +
>> +/* For a given function FUN, rewrite all addressable parameters so that
>> +   a new automatic variable is introduced.  Right after function entry
>> +   a parameter is assigned to the variable.  */
>> +
>> +static void
>> +sanitize_rewrite_addressable_params (function *fun)
>> +{
>> +  gimple *g;
>> +  gimple_seq stmts = NULL;
>> +  auto_vec addressable_params;
> 
> You don't really use the addressable_params vector anywhere, right?
> Except for:
> 
>> +
>> +  for (tree arg = DECL_ARGUMENTS (current_function_decl);
>> +   arg; arg = DECL_CHAIN (arg))
>> +{
>> +  if (TREE_ADDRESSABLE (arg) && !TREE_ADDRESSABLE (TREE_TYPE (arg)))
>> +{
>> +  TREE_ADDRESSABLE (arg) = 0;
>> +  /* The parameter is no longer addressable.  */
>> +  tree type = TREE_TYPE (arg);
>> +  addressable_params.safe_push (arg);
> 
> pushing stuff into it and later
> 
>> +  if (addressable_params.is_empty ())
>> +return;
> 
> If you only need that, a bool flag if any params have been changed is
> enough.  But see above whether it wouldn't be safer to use a hash table
> to verify it.  Plus, I think it would be desirable to clear
> DECL_HAS_VALUE_EXPR_P and SET_DECL_VALUE_EXPR to NULL afterwards
> if (target_for_debug_bind (arg)) - whch can be done either the with vec
> or with a hash table traversal, for that we don't care about the ordering.

Good point, I decided to come up with a flag + vector of arguments where
VALUE_EXPR should be set to NULL.

> 
>> +
>> +  /* Create a new automatic variable.  */
>> +  tree var = build_decl (DECL_SOURCE_LOCATION (arg),
>> + VAR_DECL, DECL_NAME (arg), type);
>> +  TREE_ADDRESSABLE (var) = 1;
>> +  DECL_ARTIFICIAL (var) = 1;
>> +  DECL_SEEN_IN_BIND_EXPR_P (var) = 0;
> 
> This is 0 already from build_decl, IMHO no need to set it.

Done.

> 
>> +  gimple_add_tmp_var (var);
>> +
>> +  if (dump_file)
>> +fprintf (dump_file,
>> + "Rewriting parameter whose address is taken: %s\n",
>> + IDENTIFIER_POINTER (DECL_NAME (arg)));
>> +
>> +  SET_DECL_VALUE_EXPR (arg, var);
> 
> But obviously you miss setting DECL_HAS_VALUE_EXPR_P here.

Likewise.

> 
>> +  /* Assign value of parameter to newly created variable.  */
>> +  if ((TREE_CODE (type) == COMPLEX_TYPE
>> +   || TREE_CODE (type) == VECTOR_TYPE))
>> +{
>> +  /* We need to create a SSA name that will be used for the
>> + assignment.  */
> 
> Why don't you just set DECL_GIMPLE_REG_P (arg) = 1; for
> COMPLEX_TYPE/VECTOR_TYPE?  The arg is going to be only used to copy it into
> the new var.  And then just use get_or_create_ssa_default_def,
> regardless of whether if is complex/vector or other.

Doing so fails here:

$ ./xgcc -B. 
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/asan/function-argument-1.C 
-fsanitize=address 
during RTL pass: expand
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/asan/function-argument-1.C: 
In function ‘int foo(A)’:
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/asan/function-argument-1.C:17:1:
 internal compiler error: in set_parm_rtl, at cfgexpand.c:1271
 foo (A arg)
 ^~~
0x938536 set_parm_rtl(tree_node*, rtx_def*)
../../gcc/cfgexpand.c:1271
0xab9813 assign_parms
../../gcc/function.c:3782
0xabc605 expand_function_start(tree_node*)
../../gcc/function.c:5221
0x943e01 execute
../../gcc/cfgexpand.c:6248

> 
>> +  /* Replace all usages of PARM_DECLs with the newly
>> + created variable VAR.  */
>> +  basic_block bb;
>> +  FOR_EACH_BB_FN (bb, fun)
>> +{
>> +  gimple_stmt_iterator gsi;
>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
>> +{
>> +  gimple *stmt = gsi_stmt (gsi);
>> +  gimple_stmt_iterator it = gsi_for_stmt (stmt);
>> +  walk_gimple_stmt (, NULL, rewrite_usage_of_param,

Re: [rtlanal] Do a better job of costing parallel sets containing flag-setting operations.

2017-06-30 Thread Richard Earnshaw (lists)

On 19/06/17 14:46, Richard Earnshaw (lists) wrote:
> Many parallel set insns are of the form of a single set that also sets
> the condition code flags.  In this case the cost of such an insn is
> normally the cost of the part that doesn't set the flags, since updating
> the condition flags is simply a side effect.
> 
> At present all such insns are treated as having unknown cost (ie 0) and
> combine assumes that such insns are infinitely more expensive than any
> other insn sequence with a non-zero cost.
> 
> This patch addresses this problem by allowing insn_rtx_cost to ignore
> the condition setting part of a PARALLEL iff there is exactly one
> comparison set and one non-comparison set.  If the only set operation is
> a comparison we still use that as the basis of the insn cost.
> 
>   * rtlanal.c (insn_rtx_cost): If a parallel contains exactly one
>   comparison set and one other set, use the cost of the
>   non-comparison set.
> 
> Bootstrapped on aarch64-none-linuxgnu
> 
> OK?
> 

Ping?

R.

> R.
> 
> 
> insn-costs.patch
> 
> 
> diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
> index d9f57c3..5cae793 100644
> --- a/gcc/rtlanal.c
> +++ b/gcc/rtlanal.c
> @@ -5260,23 +5260,41 @@ insn_rtx_cost (rtx pat, bool speed)
>int i, cost;
>rtx set;
>  
> -  /* Extract the single set rtx from the instruction pattern.
> - We can't use single_set since we only have the pattern.  */
> +  /* Extract the single set rtx from the instruction pattern.  We
> + can't use single_set since we only have the pattern.  We also
> + consider PARALLELs of a normal set and and a single comparison.
> + In that case we use the cost of the non-comparison SET operation,
> + which is most-likely to be the real cost of this operation.  */
>if (GET_CODE (pat) == SET)
>  set = pat;
>else if (GET_CODE (pat) == PARALLEL)
>  {
>set = NULL_RTX;
> +  rtx comparison = NULL_RTX;
> +
>for (i = 0; i < XVECLEN (pat, 0); i++)
>   {
> rtx x = XVECEXP (pat, 0, i);
> if (GET_CODE (x) == SET)
>   {
> -   if (set)
> - return 0;
> -   set = x;
> +   if (GET_CODE (SET_SRC (x)) == COMPARE)
> + {
> +   if (comparison)
> + return 0;
> +   comparison = x;
> + }
> +   else
> + {
> +   if (set)
> + return 0;
> +   set = x;
> + }
>   }
>   }
> +
> +  if (!set && comparison)
> + set = comparison;
> +
>if (!set)
>   return 0;
>  }
>

[PATCH] Fix removal of ifunc (PR ipa/81214).

2017-06-30 Thread Martin Liška

Hello.

Following patch fixes the issue where we do not emit ifunc and resolver
for function that are not called in a compilation unit or and not
referenced.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
i386.exp tests work on x86_64-linux-gnu.

Ready to be installed?
Martin

gcc/testsuite/ChangeLog:

2017-06-29  Martin Liska  

PR ipa/81214
* gcc.target/i386/pr81214.c: New test.

gcc/ChangeLog:

2017-06-29  Martin Liska  

PR ipa/81214
* multiple_target.c (create_dispatcher_calls): Make ifunc
also for function that don't have calls or are not referenced.
---
 gcc/multiple_target.c   | 64 -
 gcc/testsuite/gcc.target/i386/pr81214.c | 14 
 2 files changed, 46 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81214.c


diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index eddc7d3744b..0d7cc3a2939 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -68,6 +68,38 @@ create_dispatcher_calls (struct cgraph_node *node)
   || !is_function_default_version (node->decl))
 return;
 
+  if (!targetm.has_ifunc_p ())
+{
+  error_at (DECL_SOURCE_LOCATION (node->decl),
+		"the call requires ifunc, which is not"
+		" supported by this target");
+  return;
+}
+  else if (!targetm.get_function_versions_dispatcher)
+{
+  error_at (DECL_SOURCE_LOCATION (node->decl),
+		"target does not support function version dispatcher");
+  return;
+}
+
+  tree idecl = targetm.get_function_versions_dispatcher (node->decl);
+  if (!idecl)
+{
+  error_at (DECL_SOURCE_LOCATION (node->decl),
+		"default target_clones attribute was not set");
+  return;
+}
+
+  cgraph_node *inode = cgraph_node::get (idecl);
+  gcc_assert (inode);
+  tree resolver_decl = targetm.generate_version_dispatcher_body (inode);
+
+  /* Update aliases.  */
+  inode->alias = true;
+  inode->alias_target = resolver_decl;
+  if (!inode->analyzed)
+inode->resolve_alias (cgraph_node::get (resolver_decl));
+
   auto_vec edges_to_redirect;
   auto_vec references_to_redirect;
 
@@ -80,38 +112,6 @@ create_dispatcher_calls (struct cgraph_node *node)
 
   if (!edges_to_redirect.is_empty () || !references_to_redirect.is_empty ())
 {
-  if (!targetm.has_ifunc_p ())
-	{
-	  error_at (DECL_SOURCE_LOCATION (node->decl),
-		"the call requires ifunc, which is not"
-		" supported by this target");
-	  return;
-	}
-  else if (!targetm.get_function_versions_dispatcher)
-	{
-	  error_at (DECL_SOURCE_LOCATION (node->decl),
-		"target does not support function version dispatcher");
-	  return;
-	}
-
-  tree idecl = targetm.get_function_versions_dispatcher (node->decl);
-  if (!idecl)
-	{
-	  error_at (DECL_SOURCE_LOCATION (node->decl),
-		"default target_clones attribute was not set");
-	  return;
-	}
-
-  cgraph_node *inode = cgraph_node::get (idecl);
-  gcc_assert (inode);
-  tree resolver_decl = targetm.generate_version_dispatcher_body (inode);
-
-  /* Update aliases.  */
-  inode->alias = true;
-  inode->alias_target = resolver_decl;
-  if (!inode->analyzed)
-	inode->resolve_alias (cgraph_node::get (resolver_decl));
-
   /* Redirect edges.  */
   unsigned i;
   cgraph_edge *e;
diff --git a/gcc/testsuite/gcc.target/i386/pr81214.c b/gcc/testsuite/gcc.target/i386/pr81214.c
new file mode 100644
index 000..2584decdb3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr81214.c
@@ -0,0 +1,14 @@
+/* PR ipa/81214.  */
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+
+__attribute__((target_clones("avx","arch=slm","arch=core-avx2","default")))
+int
+foo ()
+{
+  return -2;
+}
+
+/* { dg-final { scan-assembler "\t.globl\tfoo" } } */
+/* { dg-final { scan-assembler "foo.resolver:" } } */
+/* { dg-final { scan-assembler "foo, @gnu_indirect_function" } } */

[PATCH] Fix ifunc and resolver (PR ipa/81213).

2017-06-30 Thread Martin Liška

Hello.

Following patch does refactoring of make_resolver_func where ifunc
alias and resolver were probably confused.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
i386.exp tests work on x86_64-linux-gnu.

Ready to be installed?
Martin

gcc/ChangeLog:

2017-06-29  Martin Liska  

PR ipa/81213
* config/i386/i386.c (make_resolver_func): Do complete
refactoring of the function.

gcc/testsuite/ChangeLog:

2017-06-29  Martin Liska  

PR ipa/81213
* gcc.target/i386/pr81213.c: New test.
---
 gcc/config/i386/i386.c  | 37 -
 gcc/testsuite/gcc.target/i386/pr81213.c | 19 +
 2 files changed, 37 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81213.c


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9d53a6c869c..e90ae249b96 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -33816,30 +33816,30 @@ ix86_get_function_versions_dispatcher (void *decl)
 }
 
 /* Make the resolver function decl to dispatch the versions of
-   a multi-versioned function,  DEFAULT_DECL.  Create an
+   a multi-versioned function,  DEFAULT_DECL.  IFUNC_ALIAS_DECL is
+   ifunc alias that will point to the created resolver.  Create an
empty basic block in the resolver and store the pointer in
EMPTY_BB.  Return the decl of the resolver function.  */
 
 static tree
 make_resolver_func (const tree default_decl,
-		const tree dispatch_decl,
+		const tree ifunc_alias_decl,
 		basic_block *empty_bb)
 {
   char *resolver_name;
   tree decl, type, decl_name, t;
-  bool is_uniq = false;
 
   /* IFUNC's have to be globally visible.  So, if the default_decl is
  not, then the name of the IFUNC should be made unique.  */
   if (TREE_PUBLIC (default_decl) == 0)
-is_uniq = true;
+{
+  char *ifunc_name = make_unique_name (default_decl, "ifunc", true);
+  symtab->change_decl_assembler_name (ifunc_alias_decl,
+	  get_identifier (ifunc_name));
+  XDELETEVEC (ifunc_name);
+}
 
-  /* Append the filename to the resolver function if the versions are
- not externally visible.  This is because the resolver function has
- to be externally visible for the loader to find it.  So, appending
- the filename will prevent conflicts with a resolver function from
- another module which is based on the same version name.  */
-  resolver_name = make_unique_name (default_decl, "resolver", is_uniq);
+  resolver_name = make_unique_name (default_decl, "resolver", false);
 
   /* The resolver function should return a (void *). */
   type = build_function_type_list (ptr_type_node, NULL_TREE);
@@ -33852,13 +33852,12 @@ make_resolver_func (const tree default_decl,
   TREE_USED (decl) = 1;
   DECL_ARTIFICIAL (decl) = 1;
   DECL_IGNORED_P (decl) = 0;
-  /* IFUNC resolvers have to be externally visible.  */
-  TREE_PUBLIC (decl) = 1;
+  TREE_PUBLIC (decl) = 0;
   DECL_UNINLINABLE (decl) = 1;
 
   /* Resolver is not external, body is generated.  */
   DECL_EXTERNAL (decl) = 0;
-  DECL_EXTERNAL (dispatch_decl) = 0;
+  DECL_EXTERNAL (ifunc_alias_decl) = 0;
 
   DECL_CONTEXT (decl) = NULL_TREE;
   DECL_INITIAL (decl) = make_node (BLOCK);
@@ -33889,14 +33888,14 @@ make_resolver_func (const tree default_decl,
 
   pop_cfun ();
 
-  gcc_assert (dispatch_decl != NULL);
-  /* Mark dispatch_decl as "ifunc" with resolver as resolver_name.  */
-  DECL_ATTRIBUTES (dispatch_decl) 
-= make_attribute ("ifunc", resolver_name, DECL_ATTRIBUTES (dispatch_decl));
+  gcc_assert (ifunc_alias_decl != NULL);
+  /* Mark ifunc_alias_decl as "ifunc" with resolver as resolver_name.  */
+  DECL_ATTRIBUTES (ifunc_alias_decl)
+= make_attribute ("ifunc", resolver_name,
+		  DECL_ATTRIBUTES (ifunc_alias_decl));
 
   /* Create the alias for dispatch to resolver here.  */
-  /*cgraph_create_function_alias (dispatch_decl, decl);*/
-  cgraph_node::create_same_body_alias (dispatch_decl, decl);
+  cgraph_node::create_same_body_alias (ifunc_alias_decl, decl);
   XDELETEVEC (resolver_name);
   return decl;
 }
diff --git a/gcc/testsuite/gcc.target/i386/pr81213.c b/gcc/testsuite/gcc.target/i386/pr81213.c
new file mode 100644
index 000..13e15d5fef0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr81213.c
@@ -0,0 +1,19 @@
+/* PR ipa/81214.  */
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+
+__attribute__((target_clones("avx","arch=slm","arch=core-avx2","default")))
+static int
+foo ()
+{
+  return -2;
+}
+
+int main()
+{
+  return foo();
+}
+
+/* { dg-final { scan-assembler "\t.globl\tfoo\\..*\\.ifunc" } } */
+/* { dg-final { scan-assembler "foo.resolver:" } } */
+/* { dg-final { scan-assembler "foo\\..*\\.ifunc, @gnu_indirect_function" } } */

Re: [PATCH] warn on mem calls modifying objects of non-trivial types (PR 80560)

2017-06-30 Thread Richard Biener

On Thu, Jun 29, 2017 at 10:23 PM, Martin Sebor  wrote:
> On 06/29/2017 10:15 AM, Jan Hubicka wrote:
>>
>> Hello,
>>>
>>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>>> index 0f7e21a..443d16c 100644
>>> --- a/gcc/hash-table.h
>>> +++ b/gcc/hash-table.h
>>> @@ -803,7 +803,10 @@ hash_table::empty_slow ()
>>>m_size_prime_index = nindex;
>>>  }
>>>else
>>> -memset (entries, 0, size * sizeof (value_type));
>>> +{
>>> +  for ( ; size; ++entries, --size)
>>> +   *entries = value_type ();
>>> +}
>>>m_n_deleted = 0;
>>>m_n_elements = 0;
>>>  }
>>
>>
>> This change sends our periodic testers into an infinite loop.  It is fault
>> of gcc 4.2 being used
>> as bootstrap compiler, but perhaps that can be worked around?
>
>
> The warning in the original code could have been suppressed (by
> casting the pointer to char*), but it was valid so I opted not
> to.  I'd expect it to be possible to work around the bug but
> I don't have easy access to GCC 4.2 to reproduce it or verify
> the fix.
>
> FWIW, after looking at the function again, I wondered if zeroing
> out the elements (either way) was the right thing to do and if
> they shouldn't be cleared by calling Descriptor::mark_empty()
> instead, like in alloc_entries(), but making that change broke
> a bunch of ipa/ipa-pta-*.c tests.  It's not really clear to me
> what this code is supposed to do.
>
> Martin
>
> PS Does this help at all?
>
> @@ -804,8 +804,8 @@ hash_table::empty_slow ()
>  }
>else
>  {
> -  for ( ; size; ++entries, --size)
> -   *entries = value_type ();
> +  for (size_t i = 0; i != size; ++i)
> +   entries[i] = value_type ();
>
>  }
>m_n_deleted = 0;
>m_n_elements = 0;

alloc_entries uses mark_empty.  untested:

Index: gcc/hash-table.h
===
--- gcc/hash-table.h(revision 249780)
+++ gcc/hash-table.h(working copy)
@@ -804,8 +804,8 @@ hash_table::empty
 }
   else
 {
-  for ( ; size; ++entries, --size)
-   *entries = value_type ();
+  for (size_t i = 0; i < size; ++i)
+   mark_empty (entries[i]);
 }
   m_n_deleted = 0;
   m_n_elements = 0;

Re: PR80806

2017-06-30 Thread Jakub Jelinek

On Fri, Jun 30, 2017 at 10:25:51AM +0200, Richard Biener wrote:
> Do not go down the route of -Wunreachable-code again please.

Yeah, I don't think we want -Wunused-but-set* as a late warning,
it is intentionally in the FE where is the only place where the false
positive ratio of the warning can stay under control, and even then many
projects like the Linux kernel turn the warning off.

If we want to special case memset/memcpy dest arg for DECL_READ_P, let's do
it, but in the FE only.

Jakub

Re: [PATCH] make find_taken_edge handle case with just default

2017-06-30 Thread Richard Biener

On Thu, 29 Jun 2017, Peter Bergner wrote:

> On 6/29/17 8:58 AM, Richard Biener wrote:
> > On Thu, 29 Jun 2017, Peter Bergner wrote:
> > 
> >> On 6/29/17 4:03 AM, Richard Biener wrote:
> >>>
> >>> This refactors things a bit to make CFG cleanup handle switches with
> >>> just a default label.  If we make sure to cleanup the CFG after
> >>> group_case_labels removes cases with just __builtin_unreachable ()
> >>> inside then this fixes the ICE seen in PR81994 as well.
> >>>
> >>> I wonder if find_taken_edge should generally handle successors
> >>> with __builtin_unreachable () -- OTOH that would get rid of those
> >>> too early I guess.
> >>
> >> Should we offer an early out of group_case_labels_stmt() for the
> >> fairly common case of new_size == old_size?  There's no reason to
> >> execute the compress labels loop if we didn't combine any of the
> >> labels.
> > 
> > We can also merge both loops, counting new_size upwards, storing
> > to label new_size if new_size != i ...
> 
> Like this.  I'm bootstrapping and regtesting the following patch on
> powerpc64le-linux.  Ok if it survives?

Looks good to me.

Thanks,
Richard.

> Peter
> 
>   * tree-cfg.c (group_case_labels_stmt): Merge scanning and compressing
>   loops.  Remove now unneeded calls to gimple_switch_set_label() that
>   just set removed labels to NULL_TREE.
> 
> Index: gcc/tree-cfg.c
> ===
> --- gcc/tree-cfg.c(revision 249801)
> +++ gcc/tree-cfg.c(working copy)
> @@ -1679,13 +1679,13 @@ bool
>  group_case_labels_stmt (gswitch *stmt)
>  {
>int old_size = gimple_switch_num_labels (stmt);
> -  int i, j, base_index, new_size = old_size;
> +  int i, next_index, new_size;
>basic_block default_bb = NULL;
> 
>default_bb = label_to_block (CASE_LABEL (gimple_switch_default_label 
> (stmt)));
> 
>/* Look for possible opportunities to merge cases.  */
> -  i = 1;
> +  new_size = i = 1;
>while (i < old_size)
>  {
>tree base_case, base_high;
> @@ -1699,23 +1699,21 @@ group_case_labels_stmt (gswitch *stmt)
>/* Discard cases that have the same destination as the default case.  
> */
>if (base_bb == default_bb)
>   {
> -   gimple_switch_set_label (stmt, i, NULL_TREE);
> i++;
> -   new_size--;
> continue;
>   }
> 
>base_high = CASE_HIGH (base_case)
> ? CASE_HIGH (base_case)
> : CASE_LOW (base_case);
> -  base_index = i++;
> +  next_index = i + 1;
> 
>/* Try to merge case labels.  Break out when we reach the end
>of the label vector or when we cannot merge the next case
>label with the current one.  */
> -  while (i < old_size)
> +  while (next_index < old_size)
>   {
> -   tree merge_case = gimple_switch_label (stmt, i);
> +   tree merge_case = gimple_switch_label (stmt, next_index);
> basic_block merge_bb = label_to_block (CASE_LABEL (merge_case));
> wide_int bhp1 = wi::add (base_high, 1);
> 
> @@ -1727,9 +1725,7 @@ group_case_labels_stmt (gswitch *stmt)
> base_high = CASE_HIGH (merge_case) ?
> CASE_HIGH (merge_case) : CASE_LOW (merge_case);
> CASE_HIGH (base_case) = base_high;
> -   gimple_switch_set_label (stmt, i, NULL_TREE);
> -   new_size--;
> -   i++;
> +   next_index++;
>   }
> else
>   break;
> @@ -1742,23 +1738,22 @@ group_case_labels_stmt (gswitch *stmt)
> edge base_edge = find_edge (gimple_bb (stmt), base_bb);
> if (base_edge != NULL)
>   remove_edge_and_dominated_blocks (base_edge);
> -   gimple_switch_set_label (stmt, base_index, NULL_TREE);
> -   new_size--;
> +   i = next_index;
> +   continue;
>   }
> -}
> 
> -  /* Compress the case labels in the label vector, and adjust the
> - length of the vector.  */
> -  for (i = 0, j = 0; i < new_size; i++)
> -{
> -  while (! gimple_switch_label (stmt, j))
> - j++;
> -  gimple_switch_set_label (stmt, i,
> -gimple_switch_label (stmt, j++));
> +  if (new_size < i)
> + gimple_switch_set_label (stmt, new_size,
> +  gimple_switch_label (stmt, i));
> +  i = next_index;
> +  new_size++;
>  }
> 
>gcc_assert (new_size <= old_size);
> -  gimple_switch_set_num_labels (stmt, new_size);
> +
> +  if (new_size < old_size)
> +gimple_switch_set_num_labels (stmt, new_size);
> +
>return new_size < old_size;
>  }
> 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: PR80806

2017-06-30 Thread Richard Biener

On Thu, 29 Jun 2017, Jeff Law wrote:

> On 05/18/2017 12:55 PM, Prathamesh Kulkarni wrote:
> > Hi,
> > The attached patch tries to fix PR80806 by warning when a variable is
> > set using memset (and friends) but not used. I chose to warn in dse
> > pass since dse would detect if the variable passed as 1st argument is
> > a dead store. Does this approach look OK ?
> [ ... ]
> So I think the biggest question is whether or not the case like Martin's
> deserves a warning.
> 
> What we have is an object that is conditionally set but not used
> depending on inlining context.  We've generally "allowed" inlining to
> expose new warnings in the sense that inlining may (for example) allow
> us to remove the addressibility property on an object -- which makes the
> object subject to the usual -Wuninitialized analysis.  In fact, I think
> we've generally considered that a positive outcome because it's exposing
> bugs in subtle paths.

But set-but-not-used isn't a warning like that, it's a warning like
'unused variable' which directs the user to simply delete the
affected stmts (and variable).  So the warning should only trigger
if that would not make the program fail to compile.

> I'm less sure that this case falls into that same category.  What we're
> really talking about is warning for a partially dead store.   Would we
> want a warning if rather than a memset this was a simple store?   Is
> that the right guiding principle here?

We had a -Wunreachable-code that was quite useless because it basically
triggered at DCE so was full of useless false positives.  It was merely
an optimization report.  I fear this one will be similar
(warning: we applied DSE!).

> I hate to say it, but I wonder if this is another case where there
> likely won't be a clear consensus and we're going to end up with a two
> level warning system?
> 
> For something like Martin's case what I really think we should do is
> sink the memset call into the conditional.  In cases where "i" is not a
> constant, but actually has the value zero at runtime we win.
> 
> --
> 
> So I've got no objections to the idea of using DSE to detect the dead
> store and potentially warn.  My concern is are we in a case where that
> warning is going to annoy users and we end up needing a level of
> -Wunused-but-set.

It's not a warning.  It's an hint for an optimization the user could
apply (sink the thing!) or a report for an optimization we do.

Do not go down the route of -Wunreachable-code again please.

Richard.

> Jeff
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] [PR 81245] Fix tree-if-conv calling of update_stmt after fold_stmt

2017-06-30 Thread Richard Biener

On Thu, Jun 29, 2017 at 10:12 PM, Andrew Pinski  wrote:
> Hi,
>   As described in the bug, tree-if-conv is calling update_stmt on an
> old stmt which might have been removed from the IR already
> (transforming of an assignment to a call in this case).  This fixes
> the problem by calling update_stmt on the new statement that fold_stmt
> might have created.
>
> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.

update_stmt is not necessary when fold_stmt doesn't return true as
gsi_insert_before already updates the stmt.

Thus ok with moving update_stmt under the conditional.

Thanks,
Richard.

> Thanks,
> Andrew Pinski
> ChangeLog:
> * tree-if-conv.c (predicate_scalar_phi): Update new_stmt if fold_stmt
> returned true.
>
> testsuite/ChangeLog:
> * gcc.dg/torture/pr81245.c: New testcase.

Re: [PATCH] Transform (m1 > m2) * d into m1> m2 ? d : 0

2017-06-30 Thread Richard Biener

On Thu, Jun 29, 2017 at 5:10 PM, Wilco Dijkstra  wrote:
> Richard Biener wrote:
>
>> int f (int m, int c)
>> {
>>  return (m & 1) * c;
>> }
>
> This case (integer[0,1] rather than boolean input) should be transformed into 
> c & -(m & 1).

The proposed patch handled both the same.  This means the pattern
shouldn't use range-info
but instead match a more complex

(mult (convert (cmp @0 @1)) @2)

?  Note that from a gimple perspective c & -(m & 1) is more complex
than (m & 1) * c
and thus non-canonical.

> Wilco

[Committed] profile-count.h: Fix typos and whitespace issues.

2017-06-30 Thread Andreas Krebbel

I noticed a couple of typos and whitespace issues in profile-count.h.
I've committed the following patch to fix them.

gcc/ChangeLog:

2017-06-30  Andreas Krebbel  

* profile-count.h (enum profile_quality): Fix typos and whitespace
issues.
---
 gcc/profile-count.h | 46 +++---
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/gcc/profile-count.h b/gcc/profile-count.h
index 461dac6..6a10315 100644
--- a/gcc/profile-count.h
+++ b/gcc/profile-count.h
@@ -21,15 +21,15 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_PROFILE_COUNT_H
 #define GCC_PROFILE_COUNT_H
 
-/* Quality of the proflie count.  Because gengtype does not support enums
-   inside of clases, this is in global namespace.  */
+/* Quality of the profile count.  Because gengtype does not support enums
+   inside of classes, this is in global namespace.  */
 enum profile_quality {
   /* Profile is based on static branch prediction heuristics.  It may or may
  not reflect the reality.  */
   profile_guessed = 0,
   /* Profile was determined by autofdo.  */
   profile_afdo = 1,
-  /* Profile was originally based on feedback but it was adjusted 
+  /* Profile was originally based on feedback but it was adjusted
  by code duplicating optimization.  It may not precisely reflect the
  particular code path.  */
   profile_adjusted = 2,
@@ -43,13 +43,13 @@ enum profile_quality {
 
 #define RDIV(X,Y) (((X) + (Y) / 2) / (Y))
 
-/* Data type to hold probabilities.  It implement fixed point arithmetics
+/* Data type to hold probabilities.  It implements fixed point arithmetics
with capping so probability is always in range [0,1] and scaling requiring
values greater than 1 needs to be represented otherwise.
 
In addition to actual value the quality of profile is tracked and propagated
through all operations.  Special value UNINITIALIZED is used for 
probabilities
-   that has not been detemrined yet (for example bacause of
+   that has not been determined yet (for example bacause of
-fno-guess-branch-probability)
 
Typically probabilities are derived from profile feedback (via
@@ -68,15 +68,15 @@ enum profile_quality {
  - always
 
Named probabilities except for never/always are assumed to be statically
-   guessed and thus not necessarily acurate.  The difference between never
-   and guessedn never is that the first one should be used only in case that
+   guessed and thus not necessarily accurate.  The difference between never
+   and guessed_never is that the first one should be used only in case that
well behaving program will very likely not execute the "never" path.
For example if the path is going to abort () call or it exception handling.
 
-   Alawyas and guessted_always probabilities are symmetric.
+   Always and guessed_always probabilities are symmetric.
 
For legacy code we support conversion to/from REG_BR_PROB_BASE based 
fixpoint
-   integer arithmetics. Once the code is converted to branch probabiitlies,
+   integer arithmetics. Once the code is converted to branch probabilities,
these conversions will probably go away because they are lossy.
 */
 
@@ -175,7 +175,7 @@ public:
 }
 
   /* Conversion from and to REG_BR_PROB_BASE integer fixpoint arithmetics.
- this is mostly to support legacy code and hsould go away.  */
+ this is mostly to support legacy code and should go away.  */
   static profile_probability from_reg_br_prob_base (int v)
 {
   profile_probability ret;
@@ -199,7 +199,7 @@ public:
   if (val1 > val2)
ret.m_val = max_probability;
   else
-ret.m_val = RDIV (val1 * max_probability, val2);
+   ret.m_val = RDIV (val1 * max_probability, val2);
   ret.m_quality = profile_precise;
   return ret;
 }
@@ -237,7 +237,7 @@ public:
   else
{
  m_val = MIN ((uint32_t)(m_val + other.m_val), max_probability);
-  m_quality = MIN (m_quality, other.m_quality);
+ m_quality = MIN (m_quality, other.m_quality);
}
   return *this;
 }
@@ -263,7 +263,7 @@ public:
   else
{
  m_val = m_val >= other.m_val ? m_val - other.m_val : 0;
-  m_quality = MIN (m_quality, other.m_quality);
+ m_quality = MIN (m_quality, other.m_quality);
}
   return *this;
 }
@@ -288,8 +288,8 @@ public:
return *this = profile_probability::uninitialized ();
   else
{
-  m_val = RDIV ((uint64_t)m_val * other.m_val, max_probability);
-  m_quality = MIN (m_quality, other.m_quality);
+ m_val = RDIV ((uint64_t)m_val * other.m_val, max_probability);
+ m_quality = MIN (m_quality, other.m_quality);
}
   return *this;
 }
@@ -307,7 +307,7 @@ public:
   else
{
  gcc_checking_assert (other.m_val);
-  ret.m_val = MIN (RDIV ((uint64_t)m_val * max_probability,
+

[Committed] S/390: Adjust to the recent branch probability changes.

2017-06-30 Thread Andreas Krebbel

This fixes the bootstrap failure triggered by the recent changes wrt
branch probabilities aka emit_cmp_and_jump_insns does not accept
integers as branch probability anymore.

Regressiontested on s390x.

gcc/ChangeLog:

2017-06-30  Andreas Krebbel  

* config/s390/s390.c (s390_expand_setmem): Adjust to the new data
type for branch probabilities.
---
 gcc/config/s390/s390.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index bfc38db..958ee3b 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5349,8 +5349,6 @@ s390_expand_movmem (rtx dst, rtx src, rtx len)
 void
 s390_expand_setmem (rtx dst, rtx len, rtx val)
 {
-  const int very_unlikely = REG_BR_PROB_BASE / 100 - 1;
-
   if (GET_CODE (len) == CONST_INT && INTVAL (len) <= 0)
 return;
 
@@ -5424,7 +5422,7 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
   convert_move (count, len, 1);
   emit_cmp_and_jump_insns (count, const0_rtx,
   EQ, NULL_RTX, mode, 1, zerobyte_end_label,
-  very_unlikely);
+  profile_probability::very_unlikely ());
 
   /* We need to make a copy of the target address since memset is
 supposed to return it unmodified.  We have to make it here
@@ -5441,7 +5439,8 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
  dstp1 = adjust_address (dst, VOIDmode, 1);
  emit_cmp_and_jump_insns (count,
   const1_rtx, EQ, NULL_RTX, mode, 1,
-  onebyte_end_label, very_unlikely);
+  onebyte_end_label,
+  profile_probability::very_unlikely ());
}
 
   /* There is one unconditional (mvi+mvc)/xc after the loop
-- 
2.9.1

Re: gotools patch committed: Test runtime, misc/cgo/{test,testcarchive}

2017-06-30 Thread Uros Bizjak

Hello!

> This patch to the gotools Makefile adds tests to `make check`.  We now
> test the runtime package using the newly built go tool, and test that
> cgo works by running the misc/cgo/test and misc/cgo/testcarchive
> tests.  Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
> Committed to mainline.

There are several failures on non-split targets, e.g.:

FAIL: TestCgoHandlesWlORIGIN
go_test.go:267: running testgo [build origin]
go_test.go:286: standard error:
go_test.go:287: # origin
cc1: error: '-fsplit-stack' requires assembler support for CFI
directives
cc1: error: '-fsplit-stack' is not supported by this compiler
configuration

and:

FAIL: TestCgoCrashHandler
crash_test.go:70: building testprogcgo []: exit status 2
# _/home/uros/git/gcc/libgo/go/runtime/testdata/testprogcgo
cc1: error: '-fsplit-stack' requires assembler support for CFI
directives
cc1: error: '-fsplit-stack' is not supported by this compiler
configuration

As evident from TestBuildDryRunWithCgo dump, -fsplit-stack argument is
added unconditionally to the compile flags.

Uros.

Re: [PING 4] [PATCH] [AArch64] vec_pack_trunc_ should split after register allocator

2017-06-30 Thread Hurugalawadi, Naveen

Hi,  

Please consider this as a personal reminder to review the patch
at following link and let me know your comments on the same.  

https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01334.html

Thanks,
Naveen

96 matches

Mail list logo