Re: [PATCH, rs6000] pr65479 Add -fasynchronous-unwind-tables when the -fsanitize=address option is seen.

2017-01-05 Thread Segher Boessenkool
On Thu, Jan 05, 2017 at 01:21:40PM -0600, Bill Seurer wrote:
> (backport from trunk to gcc 6)
> 
> This patch adds the -fasynchronous-unwind-tables option to compilations when
> the -fsanitize=address option is seen but not if any
> -fasynchronous-unwind-tables options were already specified.
> -fasynchronous-unwind-tables causes a full strack trace to be produced when
> the sanitizer detects an error.  Without the full trace several of the asan
> test cases fail on powerpc.
> 
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65479 for more information.
> 
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu,
> powerpc64be-unknown-linux-gnu, and x86_64-pc-linux-gnu with no regressions.
> Is this ok for trunk?

It already is on trunk...  You mean the branches :-)  It is okay for all
open release branches (i.e., 5 and 6).  Or does it not apply to GCC 5?

Thanks,


Segher


> 2017-01-05  Bill Seurer  
> 
>   PR sanitizer/65479
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Add
>   -fasynchronous-unwind-tables option when -fsanitize=address is
>   specified.


Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)

2017-01-05 Thread Martin Sebor

So Richi asked for removal of the VR_ANTI_RANGE handling, which would
imply removal of operand_signed_p.

What are the implications if we do that?


I just got back to this yesterday.  The implications of the removal
of the anti-range handling are a number of false negatives in the
test suite:

I was thinking more at a higher level.  ie, are the warnings still
useful if we don't have the anti-range handling?  I suspect so, but
would like to hear your opinion.

...

  n = ~[-4, MAX];   (I.e., n is either negative or too big.)
  p = malloc (n);

Understood.  The low level question is do we get these kinds of ranges
often enough in computations leading to allocation sizes?


My intuition tells me that they are likely common enough not to
disregard but I don't have a lot of data to back it up with.  In
a Bash build a full 23% of all checked calls are of this kind (24
out of 106).  In a Binutils build only 4% are (9 out of 228).  In
Glibc, a little under 3%.  My guess is that the number will be
inversely proportional to the quality of the code.


  m = [-3, 7];
  n = [-5, 11];
  p = calloc (m, n);

which I suspect are common in the wild as well.

I must be missing something, given those ranges I wouldn't think we have
a false positive.The resulting size computation is going to have a
range [-35, 88], right?  ISTM that we'd really want to warn for that.  I
must be missing something.


The warning is meant to trigger only for cases of arguments that
are definitely too big (i.e., it's not a -Wmaybe-alloc-size-larger-
than type of warning).

Given a range with negative lower bound and a positive upper bound
the warning uses zero as the lower bound (it always ignores the upper
bound).  Doing otherwise would likely trigger lots of false positives.

(This is true for -Wstringop-overflow as well.)

The tradeoff, of course, is false negatives.  In the -Walloc-larger-
than case it can be mitigated by setting a lower threshold (I think
we might want to consider lowering the default to something less
liberal than PTRDIFF_MAX -- it seems very unlikely that a program
would try to allocate that much memory, especially in LP64).

Martin


Re: [bootstrap-O3,fortran] silence warning in simplify_transformation_to_array

2017-01-05 Thread Alexandre Oliva
On Jan  5, 2017, Jeff Law  wrote:

> On 01/05/2017 05:15 AM, Richard Biener wrote:
>> Reasonable -- I'll leave it for others to comment on that "standard
>> practice" part (it'll be the first case of using this IIRC).

> It's a fair amount of clutter.  But I won't object.

I'll take these as a tentative ok and check it in before swapping it all
out and going on vacations ;-) If there's an objection, you all feel
free to revert if I don't respond in a timely manner ;-)

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [bootstrap-O1] change value type to avoid sprintf buffer size warning

2017-01-05 Thread Alexandre Oliva
On Jan  5, 2017, Andreas Schwab  wrote:

> On Jan 05 2017, Alexandre Oliva  wrote:
>> On Jan  5, 2017, Andreas Schwab  wrote:
>> 
>>> On Jan 05 2017, Alexandre Oliva  wrote:
 -  sprintf (xname, "", ((unsigned)((uintptr_t)(t) & 0x)));
 +  sprintf (xname, "", ((unsigned short)((uintptr_t)(t) & 
 0x)));
>> 
>>> Please fix the spacing while you are at it.
>> 
>> Err...  I sure would, if I knew what fix you had in mind.  Care to share
>> your thoughts?  Thanks,

> Space after cast.

Wow, thanks, I think I never got that one right.

Here's what I'm installing, also breaking the now-too-long line:


In stage2 of bootstrap-O1, the code that warns if sprintf might
overflow its output buffer cannot tell that an unsigned value narrowed
to 16 bits will fit in 4 bytes with %4x.

Converting the value to 'unsigned short' makes it obvious that it
fits, at least on machines with 16-bit shorts.

for  gcc/c-family/ChangeLog

* c-pretty-print.c (pp_c_tree_decl_identifier): Convert 16-bit
value to unsigned short to fit in 4 hex digits without
warnings.
---
 gcc/c-family/c-pretty-print.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c
index 90428ca..2908669 100644
--- a/gcc/c-family/c-pretty-print.c
+++ b/gcc/c-family/c-pretty-print.c
@@ -2400,7 +2400,8 @@ pp_c_tree_decl_identifier (c_pretty_printer *pp, tree t)
   else
 {
   static char xname[8];
-  sprintf (xname, "", ((unsigned)((uintptr_t)(t) & 0x)));
+  sprintf (xname, "", ((unsigned short) ((uintptr_t) (t)
+   & 0x)));
   name = xname;
 }
 


-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [-fcompare-debug] var tracking options are not optimization options

2017-01-05 Thread Alexandre Oliva
On Jan  5, 2017, Jakub Jelinek  wrote:

> You've just changed the hash function and my mail was about the fact that
> it is not enough.

Sorry, it wasn't clear 'enough for what'.  It's enough to fix the
bug/symptom I had observed and intended to fix, but yes, there is
another latent -fcompare-debug bug with a very different symptom that
the patch did not even attempt to address (because I had not even
realized we had that bug ;-)

> With your patch, both functions
> will hash the same (that is ok), but as ICF for functions that hash the same
> also performs memcmp on the OPTIMIZATION_NODE content, they will compare
> the same in one case (-g0) and not in the other one (-g) and so ICF will
> merge them in one case and not in the other one.

Whereas without the proposed patch, ICF will also merge them in one case
and not in the other.  The patch does not change that, it just
introduces a situation in which hashes that used to be different become
the same, but still without a match for merging.

> BTW, seems the above exact case ICEs actually.

Both with and without the proposed patch, I suppose.

Anyway, would you please file a bug report about this, and copy me?  I
might be able to have a look into this one when I get a chance.

> We need to deal not just with the iteration order, but also with equality
> of functions, otherwise ICF will do something depending on -g vs. -g0.

Agreed.  But IMHO that's two different, unrelated bugs.  Maybe more ;-)

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


[PATCH, gimplefe] passes.c: split out pass-skipping logic into subroutines

2017-01-05 Thread David Malcolm
The GIMPLE frontend's pass-skipping logic in execute_one_pass is somewhat
awkward.  This patch splits it out into two subroutines to simplify
things.

No functional change intended (and this may make it easier to
update the logic so that it can be shared with the RTL frontend).

Successfully bootstrapped on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
* passes.c (execute_one_pass): Split out pass-skipping logic
into...
(determine_pass_name_match): ...this new function and...
(should_skip_pass_p): ...this new function.
---
 gcc/passes.c | 97 +++-
 1 file changed, 63 insertions(+), 34 deletions(-)

diff --git a/gcc/passes.c b/gcc/passes.c
index 32d964b..31262ed 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -2273,6 +2273,67 @@ override_gate_status (opt_pass *pass, tree func, bool 
gate_status)
   return gate_status;
 }
 
+/* Determine if PASS_NAME matches CRITERION.
+   Not a pure predicate, since it can update CRITERION, to support
+   matching the Nth invocation of a pass.
+   Subroutine of should_skip_pass_p.  */
+
+static bool
+determine_pass_name_match (const char *pass_name, char *criterion)
+{
+  size_t namelen = strlen (pass_name);
+  if (! strncmp (pass_name, criterion, namelen))
+{
+  /* The following supports starting with the Nth invocation
+of a pass (where N does not necessarily is equal to the
+dump file suffix).  */
+  if (criterion[namelen] == '\0'
+ || (criterion[namelen] == '1'
+ && criterion[namelen + 1] == '\0'))
+   return true;
+  else
+   {
+ if (criterion[namelen + 1] == '\0')
+   --criterion[namelen];
+ return false;
+   }
+}
+  else
+return false;
+}
+
+/* For skipping passes until "startwith" pass.
+   Return true iff PASS should be skipped.
+   Clear cfun->pass_startwith when encountering the "startwith" pass,
+   so that all subsequent passes are run.  */
+
+static bool
+should_skip_pass_p (opt_pass *pass)
+{
+  if (!cfun)
+return false;
+  if (!cfun->pass_startwith)
+return false;
+
+  /* We can't skip the lowering phase yet -- ideally we'd
+ drive that phase fully via properties.  */
+  if (!(cfun->curr_properties & PROP_ssa))
+return false;
+
+  if (determine_pass_name_match (pass->name, cfun->pass_startwith))
+{
+  cfun->pass_startwith = NULL;
+  return false;
+}
+
+  /* And also run any property provider.  */
+  if (pass->properties_provided != 0)
+return false;
+
+  /* If we get here, then we have a "startwith" that we haven't seen yet;
+ skip the pass.  */
+  return true;
+}
 
 /* Execute PASS. */
 
@@ -2313,40 +2374,8 @@ execute_one_pass (opt_pass *pass)
   return false;
 }
 
-  /* For skipping passes until startwith pass */
-  if (cfun
-  && cfun->pass_startwith
-  /* But we can't skip the lowering phase yet -- ideally we'd
- drive that phase fully via properties.  */
-  && (cfun->curr_properties & PROP_ssa))
-{
-  size_t namelen = strlen (pass->name);
-  /* We have to at least start when we leave SSA.  */
-  if (pass->properties_destroyed & PROP_ssa)
-   cfun->pass_startwith = NULL;
-  else if (! strncmp (pass->name, cfun->pass_startwith, namelen))
-   {
- /* The following supports starting with the Nth invocation
-of a pass (where N does not necessarily is equal to the
-dump file suffix).  */
- if (cfun->pass_startwith[namelen] == '\0'
- || (cfun->pass_startwith[namelen] == '1'
- && cfun->pass_startwith[namelen + 1] == '\0'))
-   cfun->pass_startwith = NULL;
- else
-   {
- if (cfun->pass_startwith[namelen + 1] != '\0')
-   return true;
- --cfun->pass_startwith[namelen];
- return true;
-   }
-   }
-  /* And also run any property provider.  */
-  else if (pass->properties_provided != 0)
-   ;
-  else
-   return true;
-}
+  if (should_skip_pass_p (pass))
+return true;
 
   /* Pass execution event trigger: useful to identify passes being
  executed.  */
-- 
1.8.5.3



Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)

2017-01-05 Thread Jeff Law

On 01/05/2017 11:49 AM, Martin Sebor wrote:

On 01/05/2017 11:03 AM, Jeff Law wrote:

On 12/12/2016 06:36 PM, Martin Sebor wrote:

The attached patch avoids infinite recursion when traversing phi
nodes in maybe_warn_alloc_args_overflow by using a bitmap to keep
track of those already visited and breaking out.

Thanks
Martin

gcc-78775.diff


PR tree-optimization/78775 - ICE in maybe_warn_alloc_args_overflow

gcc/ChangeLog:

PR tree-optimization/78775
* calls.c (operand_signed_p): Add overload and avoid getting into
infinite recursion when traversing phi nodes.

gcc/testsuite/ChangeLog:

PR tree-optimization/78775
* gcc.dg/pr78775.c: New test.

So Richi asked for removal of the VR_ANTI_RANGE handling, which would
imply removal of operand_signed_p.

What are the implications if we do that?


I just got back to this yesterday.  The implications of the removal
of the anti-range handling are a number of false negatives in the
test suite:
I was thinking more at a higher level.  ie, are the warnings still 
useful if we don't have the anti-range handling?  I suspect so, but 
would like to hear your opinion.





  FAIL: gcc.dg/attr-alloc_size-3.c  (test for warnings, line 448)
  FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 137)
  FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 139)
  FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 175)

with the second one, for example, being:

  n = ~[-4, MAX];   (I.e., n is either negative or too big.)
  p = malloc (n);
Understood.  The low level question is do we get these kinds of ranges 
often enough in computations leading to allocation sizes?





Passing signed integers as arguments to allocation functions is
common so I've been looking into how else to avoid the phi recursion
(which I assume is the concern here) without compromising this case.
Removing just the operand_signed_p() handling causes a number of
false positives in the test suite, such as for
Yes, passing signed integers as arguments is common.  But how often do 
we have one of these anti-ranges that allows us to do something useful?






  m = [-3, 7];
  n = [-5, 11];
  p = calloc (m, n);

which I suspect are common in the wild as well.
I must be missing something, given those ranges I wouldn't think we have 
a false positive.The resulting size computation is going to have a 
range [-35, 88], right?  ISTM that we'd really want to warn for that.  I 
must be missing something.


Jeff


Re: [PATCH][dwarf2out] Fix PR79000

2017-01-05 Thread Jason Merrill
OK.

On Thu, Jan 5, 2017 at 8:19 AM, Richard Biener  wrote:
>
> The following fixes a LTO dwarf2out ICE when mixing C and C++ TUs.
> is_cxx then claims we're C++ but we are not happy to see a C typedef
> handled by
>
> gen_typedef_die (tree decl, dw_die_ref context_die)
> {
> ...
>   if (is_naming_typedef_decl (TYPE_NAME (type)))
> {
>   /* Here, we are in the case of decl being a typedef naming
>  an anonymous type, e.g:
>  typedef struct {...} foo;
>  In that case TREE_TYPE (decl) is not a typedef variant
>  type and TYPE_NAME of the anonymous type is set to the
>  TYPE_DECL of the typedef. This construct is emitted by
>  the C++ FE.
>
>  TYPE is the anonymous struct named by the typedef
>  DECL. As we need the DW_AT_type attribute of the
>  DW_TAG_typedef to point to the DIE of TYPE, let's
>  generate that DIE right away. add_type_attribute
>  called below will then pick (via lookup_type_die) that
>  anonymous struct DIE.  */
>   if (!TREE_ASM_WRITTEN (type))
> gen_tagged_type_die (type, context_die,
> DINFO_USAGE_DIR_USE);
>
>   /* This is a GNU Extension.  We are adding a
>  DW_AT_linkage_name attribute to the DIE of the
>  anonymous struct TYPE.  The value of that attribute
>  is the name of the typedef decl naming the anonymous
>  struct.  This greatly eases the work of consumers of
>  this debug info.  */
>   add_linkage_name_raw (lookup_type_die (type), decl);
> }
>
> Here gen_tagged_type_die -> gen_member_die eventually ICEs not expecting a
> variant type.  "Fixing" that assert just shows that add_linkage_name_raw
> fails as we do not have a DECL_ASSEMBLER_NAME for the type decl and
> we do not allow deferred asm names late (the C type decl won't ever
> get an assembler name anyway...).
>
> Thus the following provides an overload to is_cxx () we can feed
> with context (to see whether the typedef decl was created by the C++ FE).
>
> Hopefully with GCC 8 we'll get early LTO debug and all these issues
> gone...
>
> LTO bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk
> and branches?
>
> Thanks,
> Richard.
>
> 2017-01-05  Richard Biener  
>
> PR debug/79000
> * dwarf2out.c (is_cxx): New overload with context.
> (is_naming_typedef_decl): Use it.
>
> * g++.dg/lto/pr79000_0.C: New testcase.
> * g++.dg/lto/pr79000_1.c: Likewise.
>
> Index: gcc/dwarf2out.c
> ===
> *** gcc/dwarf2out.c (revision 244093)
> --- gcc/dwarf2out.c (working copy)
> *** static int get_AT_flag (dw_die_ref, enum
> *** 3356,3361 
> --- 3356,3362 
>   static unsigned get_AT_unsigned (dw_die_ref, enum dwarf_attribute);
>   static inline dw_die_ref get_AT_ref (dw_die_ref, enum dwarf_attribute);
>   static bool is_cxx (void);
> + static bool is_cxx (const_tree);
>   static bool is_fortran (void);
>   static bool is_ada (void);
>   static bool remove_AT (dw_die_ref, enum dwarf_attribute);
> *** is_cxx (void)
> *** 4990,4995 
> --- 4991,5012 
>   || lang == DW_LANG_C_plus_plus_11 || lang == 
> DW_LANG_C_plus_plus_14);
>   }
>
> + /* Return TRUE if DECL was created by the C++ frontend.  */
> +
> + static bool
> + is_cxx (const_tree decl)
> + {
> +   if (in_lto_p)
> + {
> +   while (DECL_CONTEXT (decl))
> +   decl = DECL_CONTEXT (decl);
> +   if (TREE_CODE (decl) == TRANSLATION_UNIT_DECL
> + && TRANSLATION_UNIT_LANGUAGE (decl))
> +   return strncmp (TRANSLATION_UNIT_LANGUAGE (decl), "GNU C++", 7) == 0;
> + }
> +   return is_cxx ();
> + }
> +
>   /* Return TRUE if the language is Java.  */
>
>   static inline bool
> *** is_naming_typedef_decl (const_tree decl)
> *** 24762,24768 
> /* It looks like Ada produces TYPE_DECLs that are very similar
>to C++ naming typedefs but that have different
>semantics. Let's be specific to c++ for now.  */
> !   || !is_cxx ())
>   return FALSE;
>
> return (DECL_ORIGINAL_TYPE (decl) == NULL_TREE
> --- 24779,24785 
> /* It looks like Ada produces TYPE_DECLs that are very similar
>to C++ naming typedefs but that have different
>semantics. Let's be specific to c++ for now.  */
> !   || !is_cxx (decl))
>   return FALSE;
>
> return (DECL_ORIGINAL_TYPE (decl) == NULL_TREE
> Index: gcc/testsuite/g++.dg/lto/pr79000_0.C
> ===
> *** gcc/testsuite/g++.dg/lto/pr79000_0.C(revision 0)
> --- gcc/testsuite/g++.dg/lto/pr79000_0.C

Re: [-fcompare-debug] var tracking options are not optimization options

2017-01-05 Thread Jakub Jelinek
On Thu, Jan 05, 2017 at 07:13:50PM -0200, Alexandre Oliva wrote:
> > OPTIMIZATION_NODE is created by saving options, computing hash
> > (cl_option_hasher::hash apparently for OPTIMIZATION_NODE does not
> > use cl_optimization_hash, why?), comparing (no cl_optimization_eq,
> > just memcmp for equality) and if there is a match, use it instead.
> > And on the IPA-ICF side, it uses cl_optimization_hash for hash
> > computation and again memcmp for equality.
> 
> You seem to be looking at it from the equality/unequality perspective.
> As far as that is concerned, the hash changes make little difference
> indeed: what was different remains different, and what's the same
> remains the same, maybe with a few extra collisions because certain
> features are no longer regarded as part of the hash.  But remember that,
> even if two different hashed elements map to the same hash number, they
> remain different, and things work just as before (as long as there
> aren't too many collisions, in which case performance may suffer, but
> that's not the case at hand).

You've just changed the hash function and my mail was about the fact that
it is not enough.  Yes, it is needed, but not sufficient.
Consider:

int
foo (int x, int y)
{
  int z = x + y;
  return x * y;
}

__attribute__((optimize ("no-var-tracking"))) int
bar (int x, int y)
{
  int z = x + y;
  return x * y;
}

or something similar (dunno if you don't also need to do
"no-var-tracking-assignments" too).  With your patch, both functions
will hash the same (that is ok), but as ICF for functions that hash the same
also performs memcmp on the OPTIMIZATION_NODE content, they will compare
the same in one case (-g0) and not in the other one (-g) and so ICF will
merge them in one case and not in the other one.
BTW, seems the above exact case ICEs actually.

> What needed changing as far as I was concerned, however, was not
> equality of code units (for want of a better name for the combination of
> code and compile options) within one translation group (possibly
> multiple units), but rather iteration order on the hash table when
> comparing two compilations of the same translation group with slightly
> different options that should still generate the same executable code.

We need to deal not just with the iteration order, but also with equality
of functions, otherwise ICF will do something depending on -g vs. -g0.

Jakub


[PATCH] avoid false positives due to signed to unsigned conversion (PR 78973)

2017-01-05 Thread Martin Sebor

When the size passed to a call to a function like memcpy is a signed
integer whose range has a negative lower bound and a positive upper
bound the lower bound of the range of the argument after conversion
to size_t may be in excess of the maximum object size (PTRDIFF_MAX
by default).  This results in -Wstringop-overflow false positives.

The attached patch detects this case and avoids the problem.

Btw., I had a heck of a time creating a test case for this.  The
large translation unit submitted with the bug is from a file in
the Linux kernel of some complexity.  There, the range of the int
variable (before conversion to size_t) is [INT_MIN, INT_MAX].  It
seems very difficult to create a VR_RANGE for a signed int that
matches it.  The test case I came up with that still reproduces
the false positive crates an anti-range for the signed int argument
by converting an unsigned int in one range to a signed int and
constraining it to another range.  The false positive is avoided
because the code doesn't (yet) handle anti-ranges.

Martin

PS This seems lie a bug or gotcha in the get_range_info() function.
In the regression test added by the patch the VRP dump shows the
following:

  n.0_1: ~[0, 4]
  _6: [18446744071562067968, +INF]

  ...

  _6 = (long unsigned int) n.0_1;
  __builtin_memset (d_5(D), 0, _6);

but get_range_info(_6) returns the VR_RANGE:

  [ 0x:8000, 0x: ]

That doesn't seem right.  Is there a better/more appropriate way
to determine the "correct" range?  If not, perhaps get_range_info
should be enhanced to make it possible to call it to detect this
conversion and report a more correct result (e.g., based on
a new, optional argument).

Martin
PR c/78973 - [7.0 regression] warning: ‘memcpy’: specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]

gcc/ChangeLog:

	PR c/78973
	* builtins.c (get_size_range): Handle signed to unsigned conversion.

gcc/testsuite/ChangeLog:

	PR c/78973
	* gcc.dg/pr78973.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 5b76dfd..883d25c 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3051,9 +3051,26 @@ get_size_range (tree exp, tree range[2])
 
   if (range_type == VR_RANGE)
 	{
+	  /* The range can be the result of a conversion of a signed
+	 variable to size_t with the lower bound after conversion
+	 corresponding to a negative lower bound of the original
+	 variable.  Avoid the false positive for this case.  */
+	  gimple *def = SSA_NAME_DEF_STMT (exp);
+	  if (is_gimple_assign (def))
+	{
+	  tree_code code = gimple_assign_rhs_code (def);
+	  if (code == NOP_EXPR)
+		{
+		  tree rhs = gimple_assign_rhs1 (def);
+		  if (!TYPE_UNSIGNED (TREE_TYPE (rhs)))
+		return get_size_range (rhs, range);
+		}
+	}
+
 	  /* Interpret the bound in the variable's type.  */
 	  range[0] = wide_int_to_tree (TREE_TYPE (exp), min);
 	  range[1] = wide_int_to_tree (TREE_TYPE (exp), max);
+
 	  return true;
 	}
   else if (range_type == VR_ANTI_RANGE)
diff --git a/gcc/testsuite/gcc.dg/pr78973.c b/gcc/testsuite/gcc.dg/pr78973.c
new file mode 100644
index 000..ef212cf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr78973.c
@@ -0,0 +1,18 @@
+/* PR c/78973 - [7.0 regression] warning: ‘memcpy’: specified size between
+   18446744071562067968 and 18446744073709551615 exceeds maximum object
+   size 9223372036854775807 [-Wstringop-overflow=]
+   { dg-do compile }
+   { dg-options "-O2 -Wall" }  */
+
+void f (void *p, int n)
+{
+  if (n <= 4)
+__builtin_memset (p, 0, n);   /* { dg-bogus "exceeds maximum object size" } */
+}
+
+void g (void *d, unsigned n)
+{
+  if (n < 5)
+n = 5;
+  f (d, n);
+}


Re: [PATCH] Fix late dwarf generated early from optimized out globals

2017-01-05 Thread Andreas Tobler

On 05.01.17 13:05, Richard Biener wrote:

On Wed, 4 Jan 2017, Andreas Tobler wrote:


On 04.01.17 10:21, Richard Biener wrote:

On Wed, 28 Dec 2016, Andreas Tobler wrote:


On 28.12.16 19:24, Richard Biener wrote:

On December 27, 2016 11:17:00 PM GMT+01:00, Andreas Tobler
 wrote:

On 16.09.16 13:30, Richard Biener wrote:

On Thu, 15 Sep 2016, Richard Biener wrote:



This addresses sth I needed to address with the early LTO debug

patches

(you might now figure I'm piecemail merging stuff from that
patch).

When the cgraph code optimizes out a global we call the

late_global_decl

debug hook to eventually add a DW_AT_const_value to its DIE (we

don't

really expect a location as that will be invalid after optimizing

out

and will be pruned).

With the early LTO debug patches I have introduced a

early_dwarf_finished

flag (mainly for consistency checking) and I figured I can use
that

to

detect the call to the late hook during the early phase and
provide
the following cleaned up variant of avoiding to create locations

that

require later pruning (which doesn't work with emitting the early

DIEs).


Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

I verified it does the correct thing for a unit like

static const int i = 2;

(but ISTR we do have at least one testcase in the testsuite as

well).


Will commit if testing finishes successfully.


Ok, so it showed issues when merging that back to early-LTO-debug.
Turns out in LTO we never call early_finish and thus

early_dwarf_finished

was never set.  Also dwarf2out_late_global_decl itself is a better
place to constrain generating locations.

The following variant is in very late stage of testing.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
LTO bootstrap on x86_64-unknown-linux-gnu in stage3.  LTO bootstrap
with early-LTO-debug in stage3, bootstraped with early-LTO-debug,
testing in progress.


Any chance to backport this commit (r240228) to 6.x?
It fixes a bootstrap comparison issue on aarch64-*-freebsd*.
The aarch64-*-freebsd* port is not yet merged to 6.x and 5.4.x due to
the bootstrap comparison failure I faced.


Did you analyze the bootstrap miscompare?  I suspect the patch merely
papers
over the problem.


gcc/contrib/compare-debug -p prev-gcc/ipa-icf.o gcc/ipa-icf.o
prev-gcc/ipa-icf.o.stripped. gcc/ipa-icf.o.stripped. differ: char 52841,
line
253


The objdump -dSx diff on the non stripped object looked always more or
less
the same, a rodata offset which was different.

-   1448: R_AARCH64_ADD_ABS_LO12_NC .rodata+0x1d8
+   1448: R_AARCH64_ADD_ABS_LO12_NC .rodata+0x410


Hmm, sounds like a constant pool entry was created by -g at a different
time (and thus offset) from regular compilation.  So yes, the patch
in question should have the affect to "fix" this.

Note that I later changed the fix with

2016-10-20  Richard Biener  

* cgraphunit.c (analyze_functions): Set node->definition to
false to signal symbol removal to debug_hooks->late_global_decl.
* ipa.c (symbol_table::remove_unreachable_nodes): When not in
WPA signal symbol removal to the debuginfo machinery.
* dwarf2out.c (dwarf2out_late_global_decl): Instead of
using early_finised to guard the we're called for symbol
removal case look at the symtabs definition flag.
(gen_variable_die): Remove redundant check.

(the dwarf2out_late_global_decl and analyze_functions part are
relevant).

That should be the fix to backport (avoiding the early_dwarf_finished
part).


Ok, I bootstrapped with the attached snippet (had to tweak a bit to apply
cleanly). w/o the previous mentioned fix (r240228). Is this what you had in
mind or do you think some parts of the other fix (r240228) is also needed?


Not sure if you need the dwarf2out.c part you included but you clearly
missed the dwarf2out_late_global_decl part doing

  /* We get called via the symtab code invoking late_global_decl
 for symbols that are optimized out.  Do not add locations
 for those.  */
  varpool_node *node = varpool_node::get (decl);
  if (! node || ! node->definition)
tree_add_const_value_attribute_for_decl (die, decl);
  else
add_location_or_const_value_attribute (die, decl, false);

I wouldn't include the ipa.c part unless required (it adds additional
debug for optimized out decls).


Ok, attached the part I bootstrapped successfully on amd64-*-freebsd12 
and aarch64-*-freebsd12. From the amd64 run you'll find some test 
results at the usual place. The aarch64 run takes some more time.


I hope I got it right this time :)
What do you think?

Thanks,
Andreas

Index: dwarf2out.c
===
--- dwarf2out.c (revision 244100)
+++ dwarf2out.c (working copy)
@@ -23752,7 +23752,17 @@
 {
   dw_die_ref die = 

[PATCH] Fix up vectorizable_condition for comparisons of 2 booleans (PR tree-optimization/78938)

2017-01-05 Thread Jakub Jelinek
Hi!

As mentioned in the PR, while vectorizable_comparison has code to deal
with comparison of 2 booleans by transorming those into one or two
BIT_*_EXPR operations that work equally well on normal vectors as well
as the AVX512 bitset masks, vectorizable_comparison lacks that and we
ICE during expansion because of that.
The following patch teaches vectorizable_condition to do that too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-01-05  Jakub Jelinek  

PR tree-optimization/78938
* tree-vect-stmts.c (vectorizable_condition): For non-masked COND_EXPR
where comp_vectype is VECTOR_BOOLEAN_TYPE_P, use
BIT_{NOT,XOR,AND,IOR}_EXPR on the comparison operands instead of
{EQ,NE,GE,GT,LE,LT}_EXPR directly inside of VEC_COND_EXPR.  Formatting
fixes.

* gcc.dg/vect/pr78938.c: New test.

--- gcc/tree-vect-stmts.c.jj2017-01-01 12:45:39.0 +0100
+++ gcc/tree-vect-stmts.c   2017-01-05 15:54:41.075218409 +0100
@@ -7731,7 +7731,8 @@ vectorizable_condition (gimple *stmt, gi
 {
   tree scalar_dest = NULL_TREE;
   tree vec_dest = NULL_TREE;
-  tree cond_expr, then_clause, else_clause;
+  tree cond_expr, cond_expr0 = NULL_TREE, cond_expr1 = NULL_TREE;
+  tree then_clause, else_clause;
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   tree comp_vectype = NULL_TREE;
   tree vec_cond_lhs = NULL_TREE, vec_cond_rhs = NULL_TREE;
@@ -7741,7 +7742,7 @@ vectorizable_condition (gimple *stmt, gi
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
   enum vect_def_type dt, dts[4];
   int ncopies;
-  enum tree_code code;
+  enum tree_code code, cond_code, bitop1 = NOP_EXPR, bitop2 = NOP_EXPR;
   stmt_vec_info prev_stmt_info = NULL;
   int i, j;
   bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
@@ -7825,11 +7826,76 @@ vectorizable_condition (gimple *stmt, gi
   if (vec_cmp_type == NULL_TREE)
 return false;
 
+  cond_code = TREE_CODE (cond_expr);
+  if (!masked)
+{
+  cond_expr0 = TREE_OPERAND (cond_expr, 0);
+  cond_expr1 = TREE_OPERAND (cond_expr, 1);
+}
+
+  if (!masked && VECTOR_BOOLEAN_TYPE_P (comp_vectype))
+{
+  /* Boolean values may have another representation in vectors
+and therefore we prefer bit operations over comparison for
+them (which also works for scalar masks).  We store opcodes
+to use in bitop1 and bitop2.  Statement is vectorized as
+BITOP2 (rhs1 BITOP1 rhs2) or rhs1 BITOP2 (BITOP1 rhs2)
+depending on bitop1 and bitop2 arity.  */
+  switch (cond_code)
+   {
+   case GT_EXPR:
+ bitop1 = BIT_NOT_EXPR;
+ bitop2 = BIT_AND_EXPR;
+ break;
+   case GE_EXPR:
+ bitop1 = BIT_NOT_EXPR;
+ bitop2 = BIT_IOR_EXPR;
+ break;
+   case LT_EXPR:
+ bitop1 = BIT_NOT_EXPR;
+ bitop2 = BIT_AND_EXPR;
+ std::swap (cond_expr0, cond_expr1);
+ break;
+   case LE_EXPR:
+ bitop1 = BIT_NOT_EXPR;
+ bitop2 = BIT_IOR_EXPR;
+ std::swap (cond_expr0, cond_expr1);
+ break;
+   case NE_EXPR:
+ bitop1 = BIT_XOR_EXPR;
+ break;
+   case EQ_EXPR:
+ bitop1 = BIT_XOR_EXPR;
+ bitop2 = BIT_NOT_EXPR;
+ break;
+   default:
+ return false;
+   }
+  cond_code = SSA_NAME;
+}
+
   if (!vec_stmt)
 {
   STMT_VINFO_TYPE (stmt_info) = condition_vec_info_type;
+  if (bitop1 != NOP_EXPR)
+   {
+ machine_mode mode = TYPE_MODE (comp_vectype);
+ optab optab;
+
+ optab = optab_for_tree_code (bitop1, comp_vectype, optab_default);
+ if (!optab || optab_handler (optab, mode) == CODE_FOR_nothing)
+   return false;
+
+ if (bitop2 != NOP_EXPR)
+   {
+ optab = optab_for_tree_code (bitop2, comp_vectype,
+  optab_default);
+ if (!optab || optab_handler (optab, mode) == CODE_FOR_nothing)
+   return false;
+   }
+   }
   return expand_vec_cond_expr_p (vectype, comp_vectype,
-TREE_CODE (cond_expr));
+cond_code);
 }
 
   /* Transform.  */
@@ -7858,11 +7924,11 @@ vectorizable_condition (gimple *stmt, gi
  auto_vec vec_defs;
 
  if (masked)
- ops.safe_push (cond_expr);
+   ops.safe_push (cond_expr);
  else
{
- ops.safe_push (TREE_OPERAND (cond_expr, 0));
- ops.safe_push (TREE_OPERAND (cond_expr, 1));
+ ops.safe_push (cond_expr0);
+ ops.safe_push (cond_expr1);
}
   ops.safe_push (then_clause);
   ops.safe_push (else_clause);
@@ -7886,17 +7952,15 @@ vectorizable_condition (gimple *stmt, gi
}
  else
{
- 

Re: [-fcompare-debug] var tracking options are not optimization options

2017-01-05 Thread Alexandre Oliva
On Jan  5, 2017, Jakub Jelinek  wrote:

> OPTIMIZATION_NODE is created by saving options, computing hash
> (cl_option_hasher::hash apparently for OPTIMIZATION_NODE does not
> use cl_optimization_hash, why?), comparing (no cl_optimization_eq,
> just memcmp for equality) and if there is a match, use it instead.
> And on the IPA-ICF side, it uses cl_optimization_hash for hash
> computation and again memcmp for equality.

You seem to be looking at it from the equality/unequality perspective.
As far as that is concerned, the hash changes make little difference
indeed: what was different remains different, and what's the same
remains the same, maybe with a few extra collisions because certain
features are no longer regarded as part of the hash.  But remember that,
even if two different hashed elements map to the same hash number, they
remain different, and things work just as before (as long as there
aren't too many collisions, in which case performance may suffer, but
that's not the case at hand).

What needed changing as far as I was concerned, however, was not
equality of code units (for want of a better name for the combination of
code and compile options) within one translation group (possibly
multiple units), but rather iteration order on the hash table when
comparing two compilations of the same translation group with slightly
different options that should still generate the same executable code.

So, even if the option sets would still compare different in the end, by
disregarding some of them in computing the hashes, we get the same
hashes in the hash table, and thus the same order of iteration in the
hash table.

As for potentially harmful collisions, I pose we won't have any.  We
already had to have something to deal with two incoming functions that
were identical except for compile options, and to resolve them so that
we retained only one of them since compile options don't get to their
name mangling.  It just so happens that now some of these pairs that
should be collapsed into a single function will have the same hash.
That shouldn't make any difference for the code that deals with this
problem.

> Or tweak the PerFunction opts more and have some extra flag somewhere
> whether the value of the option is default (coming from command line
> options) or not (and use that default flag in cl_optimization_{hash,eq}
> such that options with default flag hash the same no matter what their
> value is and compare unequal to all the non-default options.

That wouldn't help with the problem I'm trying to solve.  If say
explicit -g vs -g -gtoggle, or -g -fvar-tracking-assignments-toggle were
to produce different code, be they "explicit" flags passed through
-fcompare-debug= or given in the command line to debug one such
codegen difference, it would make it more difficult to debug such
issues, but not help avoid the codegen divergences that we don't want
debug-only options, implicit or explicit, to cause.

> This would mean IPA-ICF would not be able to merge a function without
> explicit optimize ("fvar-tracking*") or optimize ("fno-var-tracking*")
> attribute with one that has such an attribute.

If it is to merge them, it must be able to do so already, before and
after the hash change, because they have different hashes *and* their
option sets compare different.  If it's not to merge them, their having
the same hash shouldn't make a difference because the options compare
different in spite of the same hash.  Makes sense?

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PATCH] Fix ICE with -fno-sso-struct=none (PR driver/78957)

2017-01-05 Thread Jakub Jelinek
On Thu, Jan 05, 2017 at 08:53:49PM +0100, Eric Botcazou wrote:
> > In this case there is no -fsso-struct option (what would it mean), so
> > -fno-sso-struct isn't there either (again, what would that mean).
> > The only thing that would make sense IMHO would be to allow
> > not just big-endian and little-endian, but also native, so one can
> > cancel earlier -fsso-struct= like
> > gcc ... -fsso-struct=little-endian ... -fsso-struct=native ...
> > and have it act as if neither of those options appeared.
> 
> That makes sense and is immediate since the logic is already based on the 
> native endianness of the target.
> 
> Tested on x86-64_suse-linux, OK for the mainline?
> 
> 
> 2017-01-05  Eric Botcazou  
> 
>   * c-family/c.opt (fsso-struct): Add 'native' value.

c-family/ has its own ChangeLog, so just * c.opt (...): ...

>   * doc/invoke.texi (C Dialect Options): Document it.
> 
> 
> 2017-01-05  Eric Botcazou  
> 
>   * gcc.dg/sso-10.c: New test.

Ok, thanks.

Jakub


Re: [bootstrap-O1] change value type to avoid sprintf buffer size warning

2017-01-05 Thread Andreas Schwab
On Jan 05 2017, Alexandre Oliva  wrote:

> On Jan  5, 2017, Andreas Schwab  wrote:
>
>> On Jan 05 2017, Alexandre Oliva  wrote:
>>> -  sprintf (xname, "", ((unsigned)((uintptr_t)(t) & 0x)));
>>> +  sprintf (xname, "", ((unsigned short)((uintptr_t)(t) & 
>>> 0x)));
>
>> Please fix the spacing while you are at it.
>
> Err...  I sure would, if I knew what fix you had in mind.  Care to share
> your thoughts?  Thanks,

Space after cast.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [bootstrap-O1] change value type to avoid sprintf buffer size warning

2017-01-05 Thread Alexandre Oliva
On Jan  5, 2017, Andreas Schwab  wrote:

> On Jan 05 2017, Alexandre Oliva  wrote:
>> -  sprintf (xname, "", ((unsigned)((uintptr_t)(t) & 0x)));
>> +  sprintf (xname, "", ((unsigned short)((uintptr_t)(t) & 
>> 0x)));

> Please fix the spacing while you are at it.

Err...  I sure would, if I knew what fix you had in mind.  Care to share
your thoughts?  Thanks,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [C++ PATCH] Reject invalid reference type non-static data members in unions (PR c++/78890)

2017-01-05 Thread Jason Merrill
OK.

On Wed, Jan 4, 2017 at 5:07 PM, Jakub Jelinek  wrote:
> On Wed, Jan 04, 2017 at 04:24:55PM -0500, Jason Merrill wrote:
>> On Mon, Jan 2, 2017 at 2:32 PM, Jakub Jelinek  wrote:
>> >   if (TREE_CODE (type) == REFERENCE_TYPE)
>> > {
>> > - error ("in C++98 %q+D may not have reference type %qT "
>> > -"because it is a member of a union", x, type);
>> > - continue;
>> > + if (cxx_dialect < cxx11)
>> > +   {
>> > + error ("in C++98 %q+D may not have reference type %qT "
>> > +"because it is a member of a union", x, type);
>> > + continue;
>> > +   }
>>
>> I don't think we need the old error anymore; in C++98 mode we will
>> have already complained about a static data member reference, so we
>> can drop the above and just keep the below.
>>
>> > + else if (TREE_CODE (x) == FIELD_DECL)
>> > +   {
>> > + error ("non-static data member %q+D in a union may not "
>> > +"have reference type %qT", x, type);
>> > + continue;
>> > +   }
>
> Like this?  Ok for trunk if bootstrap/regtest passes (just started them;
> make check-c++-all didn't show regressions)?
>
> 2017-01-04  Jakub Jelinek  
>
> PR c++/78890
> * class.c (check_field_decls): Diagnose REFERENCE_TYPE fields in
> unions even for C++11 and later.
>
> * g++.dg/init/ref14.C: Expect error even in C++11 and later.
> * g++.dg/init/union1.C: Likewise.
> * g++.dg/cpp0x/union6.C: Expect errors.
> * g++.dg/cpp0x/union8.C: New test.
> * g++.dg/cpp0x/pr78890-1.C: New test.
> * g++.dg/cpp0x/pr78890-2.C: New test.
>
> --- gcc/cp/class.c.jj   2017-01-03 08:12:27.154511815 +0100
> +++ gcc/cp/class.c  2017-01-04 22:41:32.669279309 +0100
> @@ -3759,25 +3759,27 @@ check_field_decls (tree t, tree *access_
>/* When this goes into scope, it will be a non-local reference.  */
>DECL_NONLOCAL (x) = 1;
>
> -  if (TREE_CODE (t) == UNION_TYPE
> - && cxx_dialect < cxx11)
> +  if (TREE_CODE (t) == UNION_TYPE)
> {
>   /* [class.union] (C++98)
>
>  If a union contains a static data member, or a member of
>  reference type, the program is ill-formed.
>
> -In C++11 this limitation doesn't exist anymore.  */
> - if (VAR_P (x))
> +In C++11 [class.union] says:
> +If a union contains a non-static data member of reference type
> +the program is ill-formed.  */
> + if (VAR_P (x) && cxx_dialect < cxx11)
> {
>   error ("in C++98 %q+D may not be static because it is "
>  "a member of a union", x);
>   continue;
> }
> - if (TREE_CODE (type) == REFERENCE_TYPE)
> + if (TREE_CODE (type) == REFERENCE_TYPE
> + && TREE_CODE (x) == FIELD_DECL)
> {
> - error ("in C++98 %q+D may not have reference type %qT "
> -"because it is a member of a union", x, type);
> + error ("non-static data member %q+D in a union may not "
> +"have reference type %qT", x, type);
>   continue;
> }
> }
> --- gcc/testsuite/g++.dg/init/ref14.C.jj2017-01-03 08:12:26.878515378 
> +0100
> +++ gcc/testsuite/g++.dg/init/ref14.C   2017-01-04 22:40:35.789996635 +0100
> @@ -4,7 +4,7 @@
>
>  union A
>  {
> -  int  // { dg-error "may not have reference type" "" { target { ! c++11 
> } } }
> +  int  // { dg-error "may not have reference type" }
>  };
>
>  void foo()
> --- gcc/testsuite/g++.dg/init/union1.C.jj   2017-01-03 08:12:26.862515585 
> +0100
> +++ gcc/testsuite/g++.dg/init/union1.C  2017-01-04 22:40:35.789996635 +0100
> @@ -1,5 +1,5 @@
>  // PR c++/14401
>
>  union U {
> -  int& i; // { dg-error "reference type" "" { target { ! c++11 } } }
> +  int& i; // { dg-error "reference type" }
>  };
> --- gcc/testsuite/g++.dg/cpp0x/pr78890-2.C.jj   2017-01-04 22:40:35.821996232 
> +0100
> +++ gcc/testsuite/g++.dg/cpp0x/pr78890-2.C  2017-01-04 22:40:35.821996232 
> +0100
> @@ -0,0 +1,44 @@
> +// PR c++/78890
> +// { dg-do compile { target c++11 } }
> +
> +template 
> +int
> +foo ()
> +{
> +  union {
> +int a;
> +int  = a;// { dg-error "may not have reference 
> type" }
> +  };
> +  a = 1;
> +  auto c = b + 1;
> +  return c;
> +}
> +
> +template 
> +T
> +bar ()
> +{
> +  union {
> +T a;
> +T  = a;  // { dg-error "may not have reference type" }
> +  };
> +  a = 1;
> +  auto c = b + 1;
> +  return c;
> +}
> +
> +template 
> +T baz()
> +{
> +  union {
> +T a;
> +U b = a;   // { dg-error "may not have reference type" }
> +  };
> +  a = 1;
> +  auto c = 

Re: [C++ PATCH] Fix decomp handling of fields with reference type (PR c++/78931)

2017-01-05 Thread Jason Merrill
OK.

On Wed, Jan 4, 2017 at 5:06 PM, Jakub Jelinek  wrote:
> On Wed, Jan 04, 2017 at 04:27:42PM -0500, Jason Merrill wrote:
>> On Thu, Dec 29, 2016 at 10:11 AM, Jakub Jelinek  wrote:
>> >   probe = TREE_OPERAND (probe, 0);
>> > TREE_TYPE (v[i]) = TREE_TYPE (probe);
>> > layout_decl (v[i], 0);
>> > -   SET_DECL_VALUE_EXPR (v[i], tt);
>> > +   SET_DECL_VALUE_EXPR (v[i], probe);
>>
>> I guess we can do away with the probe variable and change tt instead.
>
> Like this?  Ok for trunk if bootstrap/regtest passes (just started them;
> make check-c++-all didn't show regressions)?
>
> 2017-01-04  Jakub Jelinek  
>
> PR c++/78931
> * decl.c (cp_finish_decomp): Remove probe variable, if tt is
> REFERENCE_REF_P, set tt to its operand.
>
> * g++.dg/cpp1z/decomp19.C: New test.
>
> --- gcc/cp/decl.c.jj2017-01-01 12:45:44.562588381 +0100
> +++ gcc/cp/decl.c   2017-01-04 22:39:07.704107521 +0100
> @@ -7593,10 +7593,9 @@ cp_finish_decomp (tree decl, tree first,
> else
>   {
> tree tt = finish_non_static_data_member (field, t, NULL_TREE);
> -   tree probe = tt;
> -   if (REFERENCE_REF_P (probe))
> - probe = TREE_OPERAND (probe, 0);
> -   TREE_TYPE (v[i]) = TREE_TYPE (probe);
> +   if (REFERENCE_REF_P (tt))
> + tt = TREE_OPERAND (tt, 0);
> +   TREE_TYPE (v[i]) = TREE_TYPE (tt);
> layout_decl (v[i], 0);
> SET_DECL_VALUE_EXPR (v[i], tt);
> DECL_HAS_VALUE_EXPR_P (v[i]) = 1;
> --- gcc/testsuite/g++.dg/cpp1z/decomp19.C.jj2017-01-04 22:37:28.737355629 
> +0100
> +++ gcc/testsuite/g++.dg/cpp1z/decomp19.C   2017-01-04 22:37:28.737355629 
> +0100
> @@ -0,0 +1,13 @@
> +// PR c++/78931
> +// { dg-do run { target c++11 } }
> +// { dg-options "" }
> +
> +int
> +main ()
> +{
> +  int x = 99;
> +  struct S { int  };
> +  S s{x};
> +  auto [p] = s;// { dg-warning "decomposition declaration 
> only available with" "" { target c++14_down } }
> +  return p - 99;
> +}
>
>
> Jakub


Re: [PATCH] Fix ICE with -fno-sso-struct=none (PR driver/78957)

2017-01-05 Thread Eric Botcazou
> In this case there is no -fsso-struct option (what would it mean), so
> -fno-sso-struct isn't there either (again, what would that mean).
> The only thing that would make sense IMHO would be to allow
> not just big-endian and little-endian, but also native, so one can
> cancel earlier -fsso-struct= like
> gcc ... -fsso-struct=little-endian ... -fsso-struct=native ...
> and have it act as if neither of those options appeared.

That makes sense and is immediate since the logic is already based on the 
native endianness of the target.

Tested on x86-64_suse-linux, OK for the mainline?


2017-01-05  Eric Botcazou  

* c-family/c.opt (fsso-struct): Add 'native' value.
* doc/invoke.texi (C Dialect Options): Document it.


2017-01-05  Eric Botcazou  

* gcc.dg/sso-10.c: New test.


-- 
Eric BotcazouIndex: c-family/c.opt
===
--- c-family/c.opt	(revision 244107)
+++ c-family/c.opt	(working copy)
@@ -1631,7 +1631,7 @@ C++ ObjC++ Ignore Warn(switch %qs is no
 
 fsso-struct=
 C ObjC Joined RejectNegative Enum(sso_struct) Var(default_sso) Init(SSO_NATIVE)
--fsso-struct=[big-endian|little-endian]	Set the default scalar storage order.
+-fsso-struct=[big-endian|little-endian|native]	Set the default scalar storage order.
 
 Enum
 Name(sso_struct) Type(enum scalar_storage_order_kind) UnknownError(unrecognized scalar storage order value %qs)
@@ -1642,6 +1642,9 @@ Enum(sso_struct) String(big-endian) Valu
 EnumValue
 Enum(sso_struct) String(little-endian) Value(SSO_LITTLE_ENDIAN)
 
+EnumValue
+Enum(sso_struct) String(native) Value(SSO_NATIVE)
+
 fstats
 C++ ObjC++ Var(flag_detailed_statistics)
 Display statistics accumulated during compilation.
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 244034)
+++ doc/invoke.texi	(working copy)
@@ -2166,9 +2166,9 @@ basic integer types such as @code{int} a
 @item -fsso-struct=@var{endianness}
 @opindex fsso-struct
 Set the default scalar storage order of structures and unions to the
-specified endianness.  The accepted values are @samp{big-endian} and
-@samp{little-endian}.  If the option is not passed, the compiler uses
-the native endianness of the target.  This option is not supported for C++.
+specified endianness.  The accepted values are @samp{big-endian},
+@samp{little-endian} and @samp{native} for the native endianness of
+the target (the default).  This option is not supported for C++.
 
 @strong{Warning:} the @option{-fsso-struct} switch causes GCC to generate
 code that is not binary compatible with code generated without it if the
/* { dg-do run } */
/* { dg-options "-fsso-struct=native" } */
/* { dg-require-effective-target int32plus } */

struct S1
{
  int i;
};


struct S1 my_s1 = { 0x12345678 };

unsigned char big_endian_pattern[4] = { 0x12, 0x34, 0x56, 0x78 };
unsigned char little_endian_pattern[4] = { 0x78, 0x56, 0x34, 0x12 };

int main (void)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  if (__builtin_memcmp (_s1, _endian_pattern, 4) != 0)
__builtin_abort ();
#else
  if (__builtin_memcmp (_s1, _endian_pattern, 4) != 0)
__builtin_abort ();
#endif

  return 0;
}


[committed] Introduce RTL function reader

2017-01-05 Thread David Malcolm
On Thu, 2017-01-05 at 10:43 +0100, Uros Bizjak wrote:
> On Tue, Jan 3, 2017 at 5:47 PM, David Malcolm 
> wrote:
> > Ping:
> >   https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01616.html
> > 
> > (the patch has been successfully bootstrap on
> > x86_64-pc-linux-gnu, and also tested on i686-pc-linux-gnu).
> 
> I consider this part of testing ifrastructure, not part of
> target-dependent functionality, IOW, testing infrastructure just
> happens to live in i386.md.
> 
> So, if there is nothing fundamentally x86 specific, I think that
> previous OKs were enough.

Thanks.

The various parts of patch 8 appear to now have been approved, so
I've committed it (along with "Add ASSERT_RTX_PTR_EQ", which it
requires and is required by) to trunk as r244110, having rebased,
and bootstrapped & regrtested on x86_64-pc-linux-gnu, and tested
stage 1's selftests for aarch64-linux-gnu.

(I also updated the copyright years in the new files).

The only remaining part of the RTL "frontend" needing review is
patch 9 of the kit: integrating the RTL loader into cc1:
 "[PATCH] Add "__RTL" to cc1 (v7)"
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01662.html

The following is what I committed, for reference:

This is the combination of these patches:
- [8a/9] Introduce class function_reader (v8)
- Add ASSERT_RTX_PTR_EQ
- [8b/9] Add target-independent selftests of RTL function reader (v2)
- [8c/9] Add aarch64-specific selftests for RTL function reader (v2)
- [8d/9] Add x86_64-specific selftests for RTL function reader (v2)

gcc/ChangeLog:
* Makefile.in (OBJS): Add read-md.o, read-rtl.o,
read-rtl-function.o, and selftest-rtl.o.
* config/aarch64/aarch64.c: Include selftest.h and
selftest-rtl.h.
(selftest::aarch64_test_loading_full_dump): New function.
(selftest::aarch64_run_selftests): New function.
(TARGET_RUN_TARGET_SELFTESTS): Wire it up to
selftest::aarch64_run_selftests.
* config/i386/i386.c
(selftest::ix86_test_loading_dump_fragment_1): New function.
(selftest::ix86_test_loading_call_insn): New function.
(selftest::ix86_test_loading_full_dump): New function.
(selftest::ix86_test_loading_unspec): New function.
(selftest::ix86_run_selftests): Call the new functions.
* emit-rtl.c (maybe_set_max_label_num): New function.
* emit-rtl.h (maybe_set_max_label_num): New decl.
* function.c (instantiate_decls): Guard call to
instantiate_decls_1 with if (DECL_INITIAL (fndecl)).
* function-tests.c (selftest::verify_three_block_rtl_cfg): Remove
"static".
* gensupport.c (gen_reader::gen_reader): Pass "false"
for new "compact" param of rtx_reader.
* print-rtl.c (rtx_writer::print_rtx_operand): Print "(nil)"
rather than an empty string for NULL strings.
* read-md.c: Potentially include config.h rather than bconfig.h.
Wrap include of errors.h with #ifdef GENERATOR_FILE.
(have_error): New global, copied from errors.c.
(md_reader::read_name): Rename to...
(md_reader::read_name_1): ...this, adding "out_loc" param,
and converting "missing name or number" to returning false, rather
than failing.
(md_reader::read_name): Reimplement in terms of read_name_1.
(md_reader::read_name_or_nil): New function.
(md_reader::read_string): Handle "(nil)" by returning NULL.
(md_reader::md_reader): Add new param "compact".
(md_reader::read_md_files): Wrap with #ifdef GENERATOR_FILE.
(md_reader::read_file): New method.
* read-md.h (md_reader::md_reader): Add new param "compact".
(md_reader::read_file): New method.
(md_reader::is_compact): New accessor.
(md_reader::read_name): Convert return type from void to
file_location.
(md_reader::read_name_or_nil): New decl.
(md_reader::read_name_1): New decl.
(md_reader::m_compact): New field.
(noop_reader::noop_reader): Pass "false" for new "compact" param
of rtx_reader.
(rtx_reader::rtx_reader): Add new "compact" param.
(rtx_reader::read_rtx_operand): Make virtual and convert return
type from void to rtx.
(rtx_reader::read_until): New decl.
(rtx_reader::handle_any_trailing_information): New virtual
function.
(rtx_reader::postprocess): New virtual function.
(rtx_reader::finalize_string): New virtual function.
(rtx_reader::m_in_call_function_usage): New field.
(rtx_reader::m_reuse_rtx_by_id): New field.
* read-rtl-function.c: New file.
* selftest-rtl.c (selftest::assert_rtx_ptr_eq_at): New function.
* selftest-rtl.h (ASSERT_RTX_PTR_EQ): New macro.
(selftest::verify_three_block_rtl_cfg): New decl.
* read-rtl-function.h: New file.
* read-rtl.c: Potentially include config.h rather than bconfig.h.
For host, 

Re: Partitioned LTO bug with cdtor aliases

2017-01-05 Thread Jan Hubicka
> Jan,
> I think I've located a bug in LTO handling.  I'm working on a gcc 5 source
> base, but AFAICT the findings are still current.  Except that on trunk the
> fix for 68881 hides this problem (not confirmed).
> 
> I have a large source base, invoke LTO in partitioned wpa mode, and end up
> with a local alias defined for a destructor, that is therefore not visible
> in the other partitions. (it should be a global hidden symbol)
> 
> In the original TUs we emit a comdat instantiation of a base and complete
> dtor pair.  The comp_dtor is a weak alias to the weakly defined base_dtor.
> So far so good.
> 
> During wpa analysis we first determine prevailing decls.  That's fine. We
> pick base and comp dtor decls from the same comdat group from one of the
> input TUs.
> 
> Then we determine function_and_variable_visibility.  We encounter the
> comp_dtor alias first (not sure if that's an invariant, or just accident of
> test case).  Anyway its symtab node marks it correctly as an
> externally_visible alias in the same comdat group as the base dtor. It has a
> resolution of LDPR_PREVAILING_DEF, and we therefore think
> cgraph_externally_visible_p (node, whole_program) is true and keep it
> externally visible.
> 
> Next we encounter the base dtor.  That has resolution
> LDPR_PREVAILING_DEF_IRONLY, and we do not think it need remain externally
> visible.  We clear node->externally_visible, and then proceed to make it
> local.  During that process we iterate over the members of the same comdat
> group, making them all local.  That encounters the comp_dtor, which we apply
> make_decl_local to.  That clears TREE_PUBLIC on the decl, but leaves its
> node->externally_visible set.  This is inconsistent.

Yep, this looks like a bug.  It is fun to have both LDPR_PREVAILING_DEF and
LDPR_PREVAILING_DEF_IRONLY in one comdat group, but I guess it makes sense.
The code expects that comdat groups always act as unit, but here it doesn't
because only way to make one local and one exported is to dissolve it.

I guess when we encouter the alias first, we could dissolve the whole comdat
group when we see LDPR_PREVAILING_DEF. (we know the symbol is going to stay and
we no longer need a comdat)
> 
> That inconsistency bites us in partitioning, because we then don't promote
> the comp_dtor decl as we think it already visible.  But as its TREE_PUBLIC
> is clear, it gets output as a local symbol in the partition defining it.
> (the inconsistency is harmless in non-partitioned lto mode)
> 
> Not sure if the right fix is to
> a) clear its externally_visible flag too in
> function_and_variable_visibility.  (What if we encounter the nodes in the
> other order?)  [Perhaps the first loop should be split into one setting
> externally_visible as appropriate and then a second loop fixing up mismatch
> between externally_visible and TREE_PUBLIC?]
> 
> b) check in lto_promote_cross_file_statics for both externally_visible &&
> TREE_PUBLIC:
> /* No need to promote if symbol already is externally visible ... */
> if ((node->externally_visible && TREE_PUBLIC(node->decl))
> 
> c) Something else?

I would try the dissolve the comdat group when seeing the alias.
> 
> clues?
> 
> nathan
> -- 
> Nathan Sidwell


[PATCH, rs6000] pr65479 Add -fasynchronous-unwind-tables when the -fsanitize=address option is seen.

2017-01-05 Thread Bill Seurer
[PATCH, rs6000] pr65479 Add -fasynchronous-unwind-tables when the 
-fsanitize=address option is seen.

(backport from trunk to gcc 6)

This patch adds the -fasynchronous-unwind-tables option to compilations when
the -fsanitize=address option is seen but not if any
-fasynchronous-unwind-tables options were already specified.
-fasynchronous-unwind-tables causes a full strack trace to be produced when
the sanitizer detects an error.  Without the full trace several of the asan
test cases fail on powerpc.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65479 for more information.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu,
powerpc64be-unknown-linux-gnu, and x86_64-pc-linux-gnu with no regressions.
Is this ok for trunk?

[gcc]

2017-01-05  Bill Seurer  

PR sanitizer/65479
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
-fasynchronous-unwind-tables option when -fsanitize=address is
specified.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 244105)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3750,6 +3750,13 @@ rs6000_option_override_internal (bool global_init_
   && !global_options_set.x_flag_ira_loop_pressure)
 flag_ira_loop_pressure = 1;
 
+  /* -fsanitize=address needs to turn on -fasynchronous-unwind-tables in order
+ for tracebacks to be complete but not if any -fasynchronous-unwind-tables
+ options were already specified.  */
+  if (flag_sanitize & SANITIZE_USER_ADDRESS
+  && !global_options_set.x_flag_asynchronous_unwind_tables)
+flag_asynchronous_unwind_tables = 1;
+
   /* Set the pointer size.  */
   if (TARGET_64BIT)
 {
-- 

-Bill Seurer



Fwd: [PATCH, i386]: Remove special handling of memory operands from *testqi_ext_3

2017-01-05 Thread Uros Bizjak
Hello!

It turned out that recent changes made special handling of memory
operands in *testqi_ext_3 obsolete. Combine now always pass correct
width of memory operands for the extraction, as assured by added
assert.

2017-01-05  Uros Bizjak  

* config/i386/i386.md (*testqi_ext_3): No need to handle memory
operands in a special way.  Assert that pos+len <= mode precision.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 244058)
+++ i386.md (working copy)
@@ -8007,36 +8007,30 @@
   rtx val = operands[2];
   HOST_WIDE_INT len = INTVAL (operands[3]);
   HOST_WIDE_INT pos = INTVAL (operands[4]);
-  machine_mode mode, submode;
+  machine_mode mode = GET_MODE (val);
 
-  mode = GET_MODE (val);
-  if (MEM_P (val))
+  if (SUBREG_P (val))
 {
-  /* ??? Combine likes to put non-volatile mem extractions in QImode
-no matter the size of the test.  So find a mode that works.  */
-  if (! MEM_VOLATILE_P (val))
+  machine_mode submode = GET_MODE (SUBREG_REG (val));
+
+  /* Narrow paradoxical subregs to prevent partial register stalls.  */
+  if (GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (submode)
+ && GET_MODE_CLASS (submode) == MODE_INT)
{
- mode = smallest_mode_for_size (pos + len, MODE_INT);
- val = adjust_address (val, mode, 0);
+ val = SUBREG_REG (val);
+ mode = submode;
}
 }
-  else if (SUBREG_P (val)
-  && (submode = GET_MODE (SUBREG_REG (val)),
-  GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (submode))
-  && pos + len <= GET_MODE_BITSIZE (submode)
-  && GET_MODE_CLASS (submode) == MODE_INT)
+
+  /* Small HImode tests can be converted to QImode.  */
+  if (register_operand (val, HImode) && pos + len <= 8)
 {
-  /* Narrow a paradoxical subreg to prevent partial register stalls.  */
-  mode = submode;
-  val = SUBREG_REG (val);
-}
-  else if (mode == HImode && pos + len <= 8)
-{
-  /* Small HImode tests can be converted to QImode.  */
+  val = gen_lowpart (QImode, val);
   mode = QImode;
-  val = gen_lowpart (QImode, val);
 }
 
+  gcc_assert (pos + len <= GET_MODE_PRECISION (mode));
+
   wide_int mask
 = wi::shifted_mask (pos, len, false, GET_MODE_PRECISION (mode));
 


Partitioned LTO bug with cdtor aliases

2017-01-05 Thread Nathan Sidwell

Jan,
I think I've located a bug in LTO handling.  I'm working on a gcc 5 
source base, but AFAICT the findings are still current.  Except that on 
trunk the fix for 68881 hides this problem (not confirmed).


I have a large source base, invoke LTO in partitioned wpa mode, and end 
up with a local alias defined for a destructor, that is therefore not 
visible in the other partitions. (it should be a global hidden symbol)


In the original TUs we emit a comdat instantiation of a base and 
complete dtor pair.  The comp_dtor is a weak alias to the weakly defined 
base_dtor.  So far so good.


During wpa analysis we first determine prevailing decls.  That's fine. 
We pick base and comp dtor decls from the same comdat group from one of 
the input TUs.


Then we determine function_and_variable_visibility.  We encounter the 
comp_dtor alias first (not sure if that's an invariant, or just accident 
of test case).  Anyway its symtab node marks it correctly as an 
externally_visible alias in the same comdat group as the base dtor. It 
has a resolution of LDPR_PREVAILING_DEF, and we therefore think 
cgraph_externally_visible_p (node, whole_program) is true and keep it 
externally visible.


Next we encounter the base dtor.  That has resolution 
LDPR_PREVAILING_DEF_IRONLY, and we do not think it need remain 
externally visible.  We clear node->externally_visible, and then proceed 
to make it local.  During that process we iterate over the members of 
the same comdat group, making them all local.  That encounters the 
comp_dtor, which we apply make_decl_local to.  That clears TREE_PUBLIC 
on the decl, but leaves its node->externally_visible set.  This is 
inconsistent.


That inconsistency bites us in partitioning, because we then don't 
promote the comp_dtor decl as we think it already visible.  But as its 
TREE_PUBLIC is clear, it gets output as a local symbol in the partition 
defining it. (the inconsistency is harmless in non-partitioned lto mode)


Not sure if the right fix is to
a) clear its externally_visible flag too in 
function_and_variable_visibility.  (What if we encounter the nodes in 
the other order?)  [Perhaps the first loop should be split into one 
setting externally_visible as appropriate and then a second loop fixing 
up mismatch between externally_visible and TREE_PUBLIC?]


b) check in lto_promote_cross_file_statics for both externally_visible 
&& TREE_PUBLIC:

  /* No need to promote if symbol already is externally visible ... */
  if ((node->externally_visible && TREE_PUBLIC(node->decl))

c) Something else?

clues?

nathan
--
Nathan Sidwell


Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)

2017-01-05 Thread Martin Sebor

On 01/05/2017 11:03 AM, Jeff Law wrote:

On 12/12/2016 06:36 PM, Martin Sebor wrote:

The attached patch avoids infinite recursion when traversing phi
nodes in maybe_warn_alloc_args_overflow by using a bitmap to keep
track of those already visited and breaking out.

Thanks
Martin

gcc-78775.diff


PR tree-optimization/78775 - ICE in maybe_warn_alloc_args_overflow

gcc/ChangeLog:

PR tree-optimization/78775
* calls.c (operand_signed_p): Add overload and avoid getting into
infinite recursion when traversing phi nodes.

gcc/testsuite/ChangeLog:

PR tree-optimization/78775
* gcc.dg/pr78775.c: New test.

So Richi asked for removal of the VR_ANTI_RANGE handling, which would
imply removal of operand_signed_p.

What are the implications if we do that?


I just got back to this yesterday.  The implications of the removal
of the anti-range handling are a number of false negatives in the
test suite:

  FAIL: gcc.dg/attr-alloc_size-3.c  (test for warnings, line 448)
  FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 137)
  FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 139)
  FAIL: gcc.dg/attr-alloc_size-4.c  (test for warnings, line 175)

with the second one, for example, being:

  n = ~[-4, MAX];   (I.e., n is either negative or too big.)
  p = malloc (n);

Passing signed integers as arguments to allocation functions is
common so I've been looking into how else to avoid the phi recursion
(which I assume is the concern here) without compromising this case.
Removing just the operand_signed_p() handling causes a number of
false positives in the test suite, such as for

  m = [-3, 7];
  n = [-5, 11];
  p = calloc (m, n);

which I suspect are common in the wild as well.

Martin


[PATCH][ARM] Backport - Avoid partial overlaps in DImode shifts *

2017-01-05 Thread Wilco Dijkstra
With -fpu=neon DI mode shifts are expanded after reload.  DI mode registers can 
either fully or partially overlap on both ARM and Thumb-2.  However the shift
expansion code can only deal with the full overlap case, and generates incorrect
code for partial overlaps.  The fix is to add new variants that support either
full overlap or no overlap.

Bootstrap & regress on arm-linux-gnueabihf OK on GCC6 branch.

OK for backport?

ChangeLog:
2017-01-05  Wilco Dijkstra  

gcc/
PR target/78041
* config/arm/neon.md (ashldi3_neon): Add "r 0 i" and " r i" variants.
Remove partial overlap check for shift by 1.
(ashldi3_neon): Likewise.
testsuite/
* gcc.target/arm/pr78041.c: New test.
--

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 
879c07c13b6aa20c46828d08f5a4f413a5722eca..1d51c4045a1779fbd7927c94cdd6752b95cdabc7
 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1045,12 +1045,12 @@
 )
 
 (define_insn_and_split "ashldi3_neon"
-  [(set (match_operand:DI 0 "s_register_operand"   "= w, w,?,?r, 
?w,w")
-   (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, r, 
0w,w")
-  (match_operand:SI 2 "general_operand""rUm, i,  r, 
i,rUm,i")))
-   (clobber (match_scratch:SI 3"= X, 
X,?, X,  X,X"))
-   (clobber (match_scratch:SI 4"= X, 
X,?, X,  X,X"))
-   (clobber (match_scratch:DI 5"=, X,  
X, X, ,X"))
+  [(set (match_operand:DI 0 "s_register_operand"   "= w, w,?,?r,?, 
?w,w")
+   (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0,  r, 
0w,w")
+  (match_operand:SI 2 "general_operand""rUm, i,  r, i,  
i,rUm,i")))
+   (clobber (match_scratch:SI 3"= X, 
X,?, X,  X,  X,X"))
+   (clobber (match_scratch:SI 4"= X, 
X,?, X,  X,  X,X"))
+   (clobber (match_scratch:DI 5"=, X,  
X, X,  X, ,X"))
(clobber (reg:CC_C CC_REGNUM))]
   "TARGET_NEON"
   "#"
@@ -1082,9 +1082,11 @@
   }
 else
   {
-   if (CONST_INT_P (operands[2]) && INTVAL (operands[2]) == 1
-   && (!reg_overlap_mentioned_p (operands[0], operands[1])
-   || REGNO (operands[0]) == REGNO (operands[1])))
+   /* The shift expanders support either full overlap or no overlap.  */
+   gcc_assert (!reg_overlap_mentioned_p (operands[0], operands[1])
+   || REGNO (operands[0]) == REGNO (operands[1]));
+
+   if (operands[2] == CONST1_RTX (SImode))
  /* This clobbers CC.  */
  emit_insn (gen_arm_ashldi3_1bit (operands[0], operands[1]));
else
@@ -1093,8 +1095,8 @@
   }
 DONE;
   }"
-  [(set_attr "arch" 
"neon_for_64bits,neon_for_64bits,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")
-   (set_attr "opt" "*,*,speed,speed,*,*")
+  [(set_attr "arch" 
"neon_for_64bits,neon_for_64bits,*,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")
+   (set_attr "opt" "*,*,speed,speed,speed,*,*")
(set_attr "type" "multiple")]
 )
 
@@ -1143,12 +1145,12 @@
 ;; ashrdi3_neon
 ;; lshrdi3_neon
 (define_insn_and_split "di3_neon"
-  [(set (match_operand:DI 0 "s_register_operand""= w, 
w,?,?r,?w,?w")
-   (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, r,0w, 
w")
-   (match_operand:SI 2 "reg_or_int_operand" "  r, i,  r, i, r, 
i")))
-   (clobber (match_scratch:SI 3 "=2r, X, 
, X,2r, X"))
-   (clobber (match_scratch:SI 4 "= X, X, 
, X, X, X"))
-   (clobber (match_scratch:DI 5 "=, X,  
X, X,, X"))
+  [(set (match_operand:DI 0 "s_register_operand""= w, 
w,?,?r,?,?w,?w")
+   (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0,  
r,0w, w")
+   (match_operand:SI 2 "reg_or_int_operand" "  r, i,  r, i,  
i, r, i")))
+   (clobber (match_scratch:SI 3 "=2r, X, 
, X,  X,2r, X"))
+   (clobber (match_scratch:SI 4 "= X, X, 
, X,  X, X, X"))
+   (clobber (match_scratch:DI 5 "=, X,  
X, X, X,, X"))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_NEON"
   "#"
@@ -1184,9 +1186,11 @@
   }
 else
   {
-   if (CONST_INT_P (operands[2]) && INTVAL (operands[2]) == 1
-   && (!reg_overlap_mentioned_p (operands[0], operands[1])
-   || REGNO (operands[0]) == REGNO (operands[1])))
+   /* The shift expanders support either full overlap or no overlap.  */
+   gcc_assert (!reg_overlap_mentioned_p (operands[0], operands[1])
+   || REGNO (operands[0]) == REGNO (operands[1]));
+
+   if (operands[2] == CONST1_RTX (SImode))
  /* This clobbers CC.  */
  emit_insn 

Re: [bootstrap-O1] change value type to avoid sprintf buffer size warning

2017-01-05 Thread Jeff Law

On 01/04/2017 09:05 PM, Alexandre Oliva wrote:

On Jan  4, 2017, Martin Sebor  wrote:


The manual recommends to use a length modifier to constrain the length
of output to that of a narrower type:



  sprintf (xname, "", ((unsigned short)((uintptr_t)(t) & 0x)));



This should work even without optimization.


It might not work if short happens to be wider than 16 bits, but I guess
we need not worry about that at -O1.

In stage2 of bootstrap-O1, the code that warns if sprintf might
overflow its output buffer cannot tell that an unsigned value narrowed
to 16 bits will fit in 4 bytes with %4x.

Converting the value to 'unsigned short' makes it obvious that it
fits, at least on machines with 16-bit shorts.

Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

for  gcc/c-family/ChangeLog

* c-pretty-print.c (pp_c_tree_decl_identifier): Convert 16-bit
value to unsigned short to fit in 4 hex digits without
warnings.

OK.
jeff



Re: [PATCH] correct off-by-one error in handling of precision of negative numbers (PR 78910)

2017-01-05 Thread Jeff Law

On 01/04/2017 02:43 PM, Martin Sebor wrote:

The attached patch corrects an off-by-one mistake in handling
non-zero precisions in signed conversions (%d and %i) with
negative arguments, such as in sprintf(d, "%.2", -1).  The call
which results in the three bytes "-01" on output, not 2.  I.e.,
the minus sign must be taken into consideration.

Thanks
Martin

gcc-78910.diff


PR tree-optimization/78910 - Wrong print-return-value for a negative number

gcc/ChangeLog:

PR tree-optimization/78910
* gimple-ssa-sprintf.c (tree_digits): Add an argument.
(format_integer): Correct off-by-one error in the handling
of precision with negative numbers in signed conversions..

gcc/testsuite/ChangeLog:

PR tree-optimization/78910
* gcc.dg/tree-ssa/builtin-sprintf-warn-7.c: Adjust text of expected
diagnostics.
* gcc.dg/tree-ssa/builtin-sprintf.c: Add test cases.
* gcc.dg/tree-ssa/pr78910.c: New test.

OK.
jeff



Re: [PATCH] Add testcases to test builtin-expansion of memcmp and strncmp

2017-01-05 Thread Jeff Law

On 12/19/2016 08:02 PM, Aaron Sawdey wrote:

This patch adds tests gcc.dg/memcmp-1.c and gcc.dg/strncmp-1.c that
test builtin expansion of memcmp and strncmp for short strings and also
varying alignment of one arg. The strncmp test checks that things work
when one of the strings crosses a 4k boundary as well.

I've included interested parties from targets that have a strncmp
builtin.

The tests pass on ppc64le and x86_64. OK for trunk?
I think the tests need to be restricted to ports with 32bit or larger 
pointers -- they're likely to overflow the address space on the small 
embedded targets.


Adding this ought to do the trick:

/* { dg-require-effective-target ptr32plus } */

I'd drop the printfs as long as doing so does not compromise the tests.

And you need a ChangeLog.

OK with those fixes.

Jeff




Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)

2017-01-05 Thread Jeff Law

On 12/12/2016 06:36 PM, Martin Sebor wrote:

The attached patch avoids infinite recursion when traversing phi
nodes in maybe_warn_alloc_args_overflow by using a bitmap to keep
track of those already visited and breaking out.

Thanks
Martin

gcc-78775.diff


PR tree-optimization/78775 - ICE in maybe_warn_alloc_args_overflow

gcc/ChangeLog:

PR tree-optimization/78775
* calls.c (operand_signed_p): Add overload and avoid getting into
infinite recursion when traversing phi nodes.

gcc/testsuite/ChangeLog:

PR tree-optimization/78775
* gcc.dg/pr78775.c: New test.
So Richi asked for removal of the VR_ANTI_RANGE handling, which would 
imply removal of operand_signed_p.


What are the implications if we do that?

Jeff



Re: Make MicroBlaze support DWARF EH (old Xilinx patch, needed for glibc build)

2017-01-05 Thread Joseph Myers
On Thu, 5 Jan 2017, Michael Eager wrote:

> On multiple occasions, I have asked Xilinx to submit patches such
> as this one directly to the GCC/Binutils projects (assuming that
> they have a current FSF Copyright Assignment), or to give me
> explicit permission to do so on their behalf, as was the case when
> I originally submitted the MicroBlaze port to these projects.  For
> whatever reason, neither has occurred.

copyright.list shows assignments from Xilinx for GCC, binutils, GDB and 
glibc, with no indication that they are not current, or of any restriction 
on who at Xilinx can contribute changes.

Thus, I presume we simply need someone at Xilinx to approve the 
contribution of this patch to GCC.  The last patch sent to gcc-patches by 
someone @xilinx.com appears to have been 
; author and 
other @xilinx.com people from that discussion CC:ed on this message.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [bootstrap-O3,fortran] silence warning in simplify_transformation_to_array

2017-01-05 Thread Jeff Law

On 01/05/2017 05:15 AM, Richard Biener wrote:

On Thu, Jan 5, 2017 at 5:06 AM, Alexandre Oliva 
wrote:

On Jan  4, 2017, Alexandre Oliva  wrote:


I'll prepare and post a patch anyway, but do we want to make it
standard practice?


Here it is.

simplify_transformation_to_array had the nested loop unrolled 7
times, which is reasonable given that it iterates over arrays of
size GFC_MAX_DIMENSIONS == 7.

The problem is that the last iteration increments the index, tests
that it's less than result->rank, and then accesses the arrays
with the incremented index.

We did not optimize out that part in the 7th iteration, so VRP
flagged the unreachable code as accessing arrays past the end.

It couldn't possibly know that we'd never reach that part, since
the test was on result->rank, and it's not obvious (for the
compiler) that result->rank <= GFC_MAX_DIMENSIONS.

Even an assert to that effect before the enclosing loop didn't
avoid the warning turned to error, though; I suppose there might be
some aliasing at play, because moving the assert into the loop
does.  An assert on the index itself would also work, even more
efficiently, but we're just silencing the warning instead.

Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to
install?


Reasonable -- I'll leave it for others to comment on that "standard
practice" part (it'll be the first case of using this IIRC).
I thought I'd seen this stuff elsewhere, but that could have just as 
easily been within glibc.


It's a fair amount of clutter.  But I won't object.
jeff


[PATCH] Remove unnecessary typedef from std::function

2017-01-05 Thread Jonathan Wakely

TThis typedef is only used once, in a function body, so it doesn't
need to be defined at class scope. It doesn't need to be defined at
all.

* include/bits/std_function.h (function::_Signature_type): Remove.
(function::function(_Functor)): Adjust.

Tested powerpc64le-linux, committed to trunk.

commit 6bcf4be09c82827b8396b81ff623e827fbbea305
Author: Jonathan Wakely 
Date:   Thu Jan 5 17:24:02 2017 +

Remove unnecessary typedef from std::function

* include/bits/std_function.h (function::_Signature_type): Remove.
(function::function(_Functor)): Adjust.

diff --git a/libstdc++-v3/include/bits/std_function.h 
b/libstdc++-v3/include/bits/std_function.h
index 7b10c42..f7bb22a 100644
--- a/libstdc++-v3/include/bits/std_function.h
+++ b/libstdc++-v3/include/bits/std_function.h
@@ -456,8 +456,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public _Maybe_unary_or_binary_function<_Res, _ArgTypes...>,
   private _Function_base
 {
-  typedef _Res _Signature_type(_ArgTypes...);
-
   template::type>
struct _Callable : __check_func_return_type<_Res2, _Res> { };
@@ -715,7 +713,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   function(_Functor __f)
   : _Function_base()
   {
-   typedef _Function_handler<_Signature_type, _Functor> _My_handler;
+   typedef _Function_handler<_Res(_ArgTypes...), _Functor> _My_handler;
 
if (_My_handler::_M_not_empty_function(__f))
  {


[RFC] combine: Handle zero_extend without subreg in change_zero_ext.

2017-01-05 Thread Dominik Vogt
The attached patch deals with another type of zero_extend that is
not yet handled in change_zero_ext, i.e. (zero_extend
(pseudoreg)), without a "subreg" in between.  What do you think?
(Mostly untested yet.)

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
>From 36be099953903aa16ec1f307f0018a3e537638a5 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Thu, 5 Jan 2017 17:37:37 +0100
Subject: [PATCH] combine: Handle zero_extend without subreg in
 change_zero_ext.

Try replacing

  (zero_extend (reg))

with

  (and (subreg (reg) 0)
   (const_int))
---
 gcc/combine.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index e77f203..b5131b8 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11372,6 +11372,16 @@ change_zero_ext_src (subrtx_ptr_iterator *piter)
   else if (GET_CODE (x) == ZERO_EXTEND
   && SCALAR_INT_MODE_P (mode)
   && REG_P (XEXP (x, 0))
+  && !HARD_REGISTER_P (XEXP (x, 0))
+  && GET_MODE_PRECISION (GET_MODE (XEXP (x, 0)))
+ < GET_MODE_PRECISION (mode))
+{
+  /* (zero_extract (reg)) -> (and (subreg (reg) 0) (const_int)) */
+  x = gen_lowpart_SUBREG (mode, XEXP (x, 0));
+}
+  else if (GET_CODE (x) == ZERO_EXTEND
+  && SCALAR_INT_MODE_P (mode)
+  && REG_P (XEXP (x, 0))
   && HARD_REGISTER_P (XEXP (x, 0))
   && can_change_dest_mode (XEXP (x, 0), 0, mode))
 {
-- 
2.3.0

gcc/ChangeLog-change_zero_ext-5

* combine.c (change_zero_ext_src): Handle zero_extend without subreg.


Re: Implement -Wduplicated-branches (PR c/64279) (v3)

2017-01-05 Thread Richard Biener
On January 5, 2017 4:41:28 PM GMT+01:00, Jakub Jelinek  wrote:
>On Thu, Jan 05, 2017 at 04:39:40PM +0100, Marek Polacek wrote:
>> Coming back to this...
>
>> > Right, after h0 == h1 is missing && operand_equal_p (thenb, elseb,
>0)
>> > or so (the exact last operand needs to be figured out).
>> > OEP_ONLY_CONST is certainly wrong, we want the same VAR_DECLs to
>mean the
>> > same thing.  0 is a tiny bit better, but still it will give up on
>e.g. pure
>> > and other calls.  OEP_PURE_SAME is tiny bit better than that, but
>still
>> > calls with the same arguments to the same function will not be
>considered
>> > equal, plus likely operand_equal_p doesn't handle STATEMENT_LIST
>etc.
>> > So maybe we need another OEP_* mode for this.
>> 
>> Yea, if I add "&& operand_equal_p (thenb, elseb, 0)" then this
>warning doesn't
>> trigger for certain cases, such as MODIFY_EXPR, RETURN_EXPR, probably
>> STATEMENT_LIST and others.  So I suppose I could introduce a new OEP_
>mode for
>> this (names?  OEP_EXTENDED?) and then in operand_equal_p in case
>tcc_expression
>> do
>> 
>>   case MODIFY_EXPR:
>> if (flags & OEP_EXTENDED)
>>   // compare LHS and RHS of both
>>  
>> ?
>
>Yeah.  Not sure what is the best name for that.  Maybe Richi has some
>clever
>ideas.

OEP_LEXICOGRAPHIC?

Richard.

>
>   Jakub



Re: Implement -Wduplicated-branches (PR c/64279) (v3)

2017-01-05 Thread Jakub Jelinek
On Thu, Jan 05, 2017 at 04:39:40PM +0100, Marek Polacek wrote:
> Coming back to this...

> > Right, after h0 == h1 is missing && operand_equal_p (thenb, elseb, 0)
> > or so (the exact last operand needs to be figured out).
> > OEP_ONLY_CONST is certainly wrong, we want the same VAR_DECLs to mean the
> > same thing.  0 is a tiny bit better, but still it will give up on e.g. pure
> > and other calls.  OEP_PURE_SAME is tiny bit better than that, but still
> > calls with the same arguments to the same function will not be considered
> > equal, plus likely operand_equal_p doesn't handle STATEMENT_LIST etc.
> > So maybe we need another OEP_* mode for this.
> 
> Yea, if I add "&& operand_equal_p (thenb, elseb, 0)" then this warning doesn't
> trigger for certain cases, such as MODIFY_EXPR, RETURN_EXPR, probably
> STATEMENT_LIST and others.  So I suppose I could introduce a new OEP_ mode for
> this (names?  OEP_EXTENDED?) and then in operand_equal_p in case 
> tcc_expression
> do
> 
>   case MODIFY_EXPR:
> if (flags & OEP_EXTENDED)
>   // compare LHS and RHS of both
>  
> ?

Yeah.  Not sure what is the best name for that.  Maybe Richi has some clever
ideas.

Jakub


Re: Implement -Wduplicated-branches (PR c/64279) (v3)

2017-01-05 Thread Marek Polacek
Coming back to this...

On Thu, Nov 03, 2016 at 02:38:50PM +0100, Jakub Jelinek wrote:
> On Thu, Nov 03, 2016 at 09:27:55AM -0400, Jason Merrill wrote:
> > On Thu, Nov 3, 2016 at 7:24 AM, Marek Polacek  wrote:
> > > On Tue, Nov 01, 2016 at 02:53:58PM +0100, Jakub Jelinek wrote:
> > >> On Tue, Nov 01, 2016 at 09:41:20AM -0400, Jason Merrill wrote:
> > >> > On Tue, Oct 25, 2016 at 9:59 AM, Marek Polacek  
> > >> > wrote:
> > >> > > On Mon, Oct 24, 2016 at 04:10:21PM +0200, Marek Polacek wrote:
> > >> > >> On Thu, Oct 20, 2016 at 12:28:36PM +0200, Marek Polacek wrote:
> > >> > >> > I found a problem with this patch--we can't call 
> > >> > >> > do_warn_duplicated_branches in
> > >> > >> > build_conditional_expr, because that way the C++-specific codes 
> > >> > >> > might leak into
> > >> > >> > the hasher.  Instead, I should use operand_equal_p, I think.  Let 
> > >> > >> > me rework
> > >> > >> > that part of the patch.
> > >> >
> > >> > Hmm, is there a reason not to use operand_equal_p for
> > >> > do_warn_duplicated_branches as well?  I'm concerned about hash
> > >> > collisions leading to false positives.
> > >>
> > >> If the hashing function is iterative_hash_expr / inchash::add_expr, then
> > >> that is supposed to pair together with operand_equal_p, we even have
> > >> checking verification of that.
> > 
> > Yes, but there could still be hash collisions; we can't guarantee that
> > everything that compares unequal also hashes unequal.
> 
> Right, after h0 == h1 is missing && operand_equal_p (thenb, elseb, 0)
> or so (the exact last operand needs to be figured out).
> OEP_ONLY_CONST is certainly wrong, we want the same VAR_DECLs to mean the
> same thing.  0 is a tiny bit better, but still it will give up on e.g. pure
> and other calls.  OEP_PURE_SAME is tiny bit better than that, but still
> calls with the same arguments to the same function will not be considered
> equal, plus likely operand_equal_p doesn't handle STATEMENT_LIST etc.
> So maybe we need another OEP_* mode for this.

Yea, if I add "&& operand_equal_p (thenb, elseb, 0)" then this warning doesn't
trigger for certain cases, such as MODIFY_EXPR, RETURN_EXPR, probably
STATEMENT_LIST and others.  So I suppose I could introduce a new OEP_ mode for
this (names?  OEP_EXTENDED?) and then in operand_equal_p in case tcc_expression
do

  case MODIFY_EXPR:
if (flags & OEP_EXTENDED)
  // compare LHS and RHS of both
 
?

Marek


[PATCH] Fix PR78997

2017-01-05 Thread Richard Biener

I am testing the following to fix PR78997.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2016-01-05  Richard Biener  

PR tree-optimization/78997
* tree-vect-slp.c (vect_mask_constant_operand_p): Handle SSA
name condition properly.

* gcc.dg/torture/pr78997.c: New testcase.

Index: gcc/tree-vect-slp.c
===
*** gcc/tree-vect-slp.c (revision 244093)
--- gcc/tree-vect-slp.c (working copy)
*** vect_mask_constant_operand_p (gimple *st
*** 2897,2905 
tree cond = gimple_assign_rhs1 (stmt);
  
if (TREE_CODE (cond) == SSA_NAME)
!   return false;
! 
!   if (opnum)
op = TREE_OPERAND (cond, 1);
else
op = TREE_OPERAND (cond, 0);
--- 2897,2904 
tree cond = gimple_assign_rhs1 (stmt);
  
if (TREE_CODE (cond) == SSA_NAME)
!   op = cond;
!   else if (opnum)
op = TREE_OPERAND (cond, 1);
else
op = TREE_OPERAND (cond, 0);
Index: gcc/testsuite/gcc.dg/torture/pr78997.c
===
*** gcc/testsuite/gcc.dg/torture/pr78997.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr78997.c  (working copy)
***
*** 0 
--- 1,106 
+ /* { dg-do compile } */
+ 
+ int printf (const char *, ...);
+ 
+ static short f, p, q, s, u, aa, ab, ac;
+ static int b, c, d, e, h, k, l, m, n, o, r, t, v, w, x, y, z, ad, ae, af, ag, 
ah, ai, aj, ak, al, am, an;
+ int a, ao, ap, aq, ar, g, as, at, au, av, aw, ax, ay;
+ 
+ void foo ()
+ { 
+   int ba[2], i, j, bb;
+   while (b)
+ { 
+   b++;
+   while (b)
+   { 
+ for (; aw; aw++)
+   for (; q; q++)
+ { 
+   short bc[20];
+   if (k)
+ for (i = 0; i < 4; i++)
+   for (j = 0; j < 4; j++)
+ if (p)
+   bc[i * 4 + j] = 8;
+   for (; ad; ad--)
+ t = bc[1];
+ }
+ for (bb = 0; bb < 5; bb++)
+   if (m && v)
+ { 
+   printf ("%d", n);
+   v = g && v;
+   n = 0;
+ }
+ ab &= ba[0];
+ aw = l;
+ aa++;
+ x++;
+ while (1)
+   { 
+ int bd[] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1};
+ ap = a ? ba[1] : 0;
+ if (ba[0] && o < ax)
+   { 
+ int be[] = 
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1};
+ for (; ae; ae++)
+   { 
+ e ^= ba[b] ^ aa;
+ f = r;
+ for (; y; y++)
+   aj &= u | ag;
+ int e[] = 
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1};
+ if (a)
+   { 
+ r = c;
+ aj &= ag |= aq;
+   }
+ av = ai * u;
+ af = c;
+   }
+ au = d;
+ p++;
+ u = aj;
+ a = ba[1];
+ at = ar = af != ai && l;
+ as = ax = f;
+ ao = ak ? 0 : ah;
+ aw = ao;
+   }
+ ay = c;
+ int bf[] = 
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1};
+ if (w < f)
+   { 
+ int bg[] = {0};
+ if (aw)
+   break;
+   }
+ else
+   aw = aa | (h &= ag) >> d, c = b = z && am;
+ for (; w; w--)
+   l = ac ^= al |= b;
+ for (; k; k = 0)
+   { 
+ int bh = m | s && n;
+ m = bh;
+ for (; t; t--)
+   f = q ^= (c < (e < ah));
+   }
+ d = an |= b;
+ if (v)
+   { 
+ int bi[] = 
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1};
+ if (aw)
+   break;
+   }
+   }
+   }
+ }
+ }
+ 
+ int main ()
+ { 
+   foo ();
+   return 0;
+ }


[patch] update zlib to the 1.2.10 release.

2017-01-05 Thread Matthias Klose
These are the changes updating zlib from 1.2.8 to 1.2.10. It is only used when
building without a system zlib.  The new release includes fixes for security
issues CVE-2016-9840, CVE-2016-9841, CVE-2016-9842, CVE-2016-9843.

Checked with a build with disabled system-zlib. Ok for the trunk?

Matthias


Changes in 1.2.10 (2 Jan 2017)
- Avoid warnings on snprintf() return value
- Fix bug in deflate_stored() for zero-length input
- Fix bug in gzwrite.c that produced corrupt gzip files
- Remove files to be installed before copying them in Makefile.in
- Add warnings when compiling with assembler code

Changes in 1.2.9 (31 Dec 2016)
- Fix contrib/minizip to permit unzipping with desktop API [Zouzou]
- Improve contrib/blast to return unused bytes
- Assure that gzoffset() is correct when appending
- Improve compress() and uncompress() to support large lengths
- Fix bug in test/example.c where error code not saved
- Remedy Coverity warning [Randers-Pehrson]
- Improve speed of gzprintf() in transparent mode
- Fix inflateInit2() bug when windowBits is 16 or 32
- Change DEBUG macro to ZLIB_DEBUG
- Avoid uninitialized access by gzclose_w()
- Allow building zlib outside of the source directory
- Fix bug that accepted invalid zlib header when windowBits is zero
- Fix gzseek() problem on MinGW due to buggy _lseeki64 there
- Loop on write() calls in gzwrite.c in case of non-blocking I/O
- Add --warn (-w) option to ./configure for more compiler warnings
- Reject a window size of 256 bytes if not using the zlib wrapper
- Fix bug when level 0 used with Z_HUFFMAN or Z_RLE
- Add --debug (-d) option to ./configure to define ZLIB_DEBUG
- Fix bugs in creating a very large gzip header
- Add uncompress2() function, which returns the input size used
- Assure that deflateParams() will not switch functions mid-block
- Dramatically speed up deflation for level 0 (storing)
- Add gzfread(), duplicating the interface of fread()
- Add gzfwrite(), duplicating the interface of fwrite()
- Add deflateGetDictionary() function
- Use snprintf() for later versions of Microsoft C
- Fix *Init macros to use z_ prefix when requested
- Replace as400 with os400 for OS/400 support [Monnerat]
- Add crc32_z() and adler32_z() functions with size_t lengths
- Update Visual Studio project files [AraHaan]


zlib-1.2.10.diff.gz
Description: application/gzip


[PATCH] [HSA] Implement DIVMOD internal function call

2017-01-05 Thread Martin Liška
As DIVMOD has started to be used on x86_64, HSA targets needs to understand
the internal function call. It's implemented by tuple of division and remainder
or arguments. It's pre-approved by Martin Jambor for both hsa branch and trunk.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Martin
>From 89217a4ce213c0f9bcd4bd157f29e6bc562dd98f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 3 Jan 2017 13:50:54 +0100
Subject: [PATCH] HSA: implement DIVMOD internal function call

gcc/ChangeLog:

2017-01-03  Martin Liska  

	* hsa-gen.c (gen_hsa_divmod): New function.
	(gen_hsa_insn_for_internal_fn_call): Use the function
	for IFN_DIVMOD.
---
 gcc/hsa-gen.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index f2843b3ed38..632561d5e45 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -4345,6 +4345,40 @@ gen_hsa_popcount (gcall *call, hsa_bb *hbb)
   gen_hsa_popcount_to_dest (dest, arg, hbb);
 }
 
+/* Emit instructions that implement DIVMOD builtin STMT.
+   Instructions are appended to basic block HBB.  */
+
+static void
+gen_hsa_divmod (gcall *call, hsa_bb *hbb)
+{
+  tree lhs = gimple_call_lhs (call);
+  if (lhs == NULL_TREE)
+return;
+
+  tree rhs0 = gimple_call_arg (call, 0);
+  tree rhs1 = gimple_call_arg (call, 1);
+
+  hsa_op_with_type *arg0 = hsa_reg_or_immed_for_gimple_op (rhs0, hbb);
+  hsa_op_with_type *arg1 = hsa_reg_or_immed_for_gimple_op (rhs1, hbb);
+
+  hsa_op_reg *dest0 = new hsa_op_reg (arg0->m_type);
+  hsa_op_reg *dest1 = new hsa_op_reg (arg1->m_type);
+
+  hsa_insn_basic *insn = new hsa_insn_basic (3, BRIG_OPCODE_DIV, dest0->m_type,
+	 dest0, arg0, arg1);
+  hbb->append_insn (insn);
+  insn = new hsa_insn_basic (3, BRIG_OPCODE_REM, dest1->m_type, dest1, arg0,
+			 arg1);
+  hbb->append_insn (insn);
+
+  hsa_op_reg *dest = hsa_cfun->reg_for_gimple_ssa (lhs);
+  BrigType16_t src_type = hsa_bittype_for_type (dest0->m_type);
+
+  insn = new hsa_insn_packed (3, BRIG_OPCODE_COMBINE, dest->m_type,
+			  src_type, dest, dest0, dest1);
+  hbb->append_insn (insn);
+}
+
 /* Set VALUE to a shadow kernel debug argument and append a new instruction
to HBB basic block.  */
 
@@ -5050,6 +5084,10 @@ gen_hsa_insn_for_internal_fn_call (gcall *stmt, hsa_bb *hbb)
   gen_hsa_popcount (stmt, hbb);
   break;
 
+case IFN_DIVMOD:
+  gen_hsa_divmod (stmt, hbb);
+  break;
+
 case IFN_ACOS:
 case IFN_ASIN:
 case IFN_ATAN:
-- 
2.11.0



Re: [PATCH] Fix precompiled header for '-' being input file (PR, pch/78970)

2017-01-05 Thread Martin Liška
On 01/05/2017 02:30 PM, Jakub Jelinek wrote:
> On Thu, Jan 05, 2017 at 02:08:40PM +0100, Martin Liška wrote:
>> diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
>> index 14e038c02b3..49df9e19157 100644
>> --- a/gcc/c-family/c-opts.c
>> +++ b/gcc/c-family/c-opts.c
>> @@ -744,7 +744,12 @@ c_common_post_options (const char **pfilename)
>>in_fnames[0] = "";
>>  }
>>else if (strcmp (in_fnames[0], "-") == 0)
>> -in_fnames[0] = "";
>> +{
>> +  if (pch_file)
>> +error ("cannot use '-' as input filename for a precompiled header");
> 
> Please use %<-%> instead of '-', in both places.

Done.

> 
>> diff --git a/gcc/gcc.c b/gcc/gcc.c
>> index 8154953eb1d..f42c4ef372e 100644
>> --- a/gcc/gcc.c
>> +++ b/gcc/gcc.c
>> @@ -8325,7 +8325,19 @@ lookup_compiler (const char *name, size_t length, 
>> const char *language)
>>  {
>>for (cp = compilers + n_compilers - 1; cp >= compilers; cp--)
>>  if (cp->suffix[0] == '@' && !strcmp (cp->suffix + 1, language))
>> -  return cp;
>> +  {
>> +if (name != NULL && strcmp (name, "-") == 0
>> +&& (strcmp (cp->suffix, "@c-header") == 0
>> +|| strcmp (cp->suffix, "@c++-header") == 0))
>> +  {
>> +fatal_error (input_location,
>> + "cannot use '-' as input filename for a "
>> + "precompiled header");
>> +return 0;
>> +  }
> 
> Isn't fatal_error noreturn?  Then it doesn't make sense to do return 0
> after it, so perhaps remove {} and return 0; ?
> 
> Ok with those changes.

Likewise, installed as r244103. Is it fine to backport to both active branches?

Thanks,
Martin


> 
>   Jakub
> 



Re: Make MicroBlaze support DWARF EH (old Xilinx patch, needed for glibc build)

2017-01-05 Thread Michael Eager

On 01/02/2017 03:45 PM, Joseph Myers wrote:

This patch, taken from

and with a few formatting cleanups and an update for the removal of
gen_rtx_raw_REG, enables DWARF EH support for MicroBlaze.

This is needed for building glibc with a compiler that includes shared
libgcc; right now all glibc builds for MicroBlaze are failing with my
bot for lack of this support.  (It's dubious if we should have glibc
ports at all where required support is missing in FSF GCC.)

Tested building glibc with build-many-glibcs.py.  I have *not* done
any other testing or any execution testing for MicroBlaze.  OK to
commit?


I have two concerns;

1.  Lack of testing on MicroBlaze.  Although the patch presumably works
in the Buildroot environment, that's not the same source base.

I may be able to build and test this patch, but it will be a couple
weeks before I can get to it.

2.  Ownership and copyright.  This is clearly not your authorship.
Submission of a patch explicitly indicates that you are assigning
your copyright interest in the patch to the FSF.  I don't believe
that you have copyright to this patch and can't assign it to FSF.

On multiple occasions, I have asked Xilinx to submit patches such
as this one directly to the GCC/Binutils projects (assuming that
they have a current FSF Copyright Assignment), or to give me
explicit permission to do so on their behalf, as was the case when
I originally submitted the MicroBlaze port to these projects.  For
whatever reason, neither has occurred.

The first issue is procedural/technical and easily resolved.  The
second issue involves Copyright Law.  IANAL, but my understanding is
that a third party cannot take a patch from a non-FSF/GNU repository
and apply it to an FSF/GNU repository without the authors' agreement
and assignment of copyright. (If Buildroot were an FSF/GNU project,
then copyright would have already been assigned to FSF, presumably,
and accepting the patch into GCC would not involve any transfer of
ownership.)

Does anyone have any authority on this copyright issue?


--
Michael Eagerea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: [PATCH] Fix precompiled header for '-' being input file (PR, pch/78970)

2017-01-05 Thread Jakub Jelinek
On Thu, Jan 05, 2017 at 02:08:40PM +0100, Martin Liška wrote:
> diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
> index 14e038c02b3..49df9e19157 100644
> --- a/gcc/c-family/c-opts.c
> +++ b/gcc/c-family/c-opts.c
> @@ -744,7 +744,12 @@ c_common_post_options (const char **pfilename)
>in_fnames[0] = "";
>  }
>else if (strcmp (in_fnames[0], "-") == 0)
> -in_fnames[0] = "";
> +{
> +  if (pch_file)
> + error ("cannot use '-' as input filename for a precompiled header");

Please use %<-%> instead of '-', in both places.

> diff --git a/gcc/gcc.c b/gcc/gcc.c
> index 8154953eb1d..f42c4ef372e 100644
> --- a/gcc/gcc.c
> +++ b/gcc/gcc.c
> @@ -8325,7 +8325,19 @@ lookup_compiler (const char *name, size_t length, 
> const char *language)
>  {
>for (cp = compilers + n_compilers - 1; cp >= compilers; cp--)
>   if (cp->suffix[0] == '@' && !strcmp (cp->suffix + 1, language))
> -   return cp;
> +   {
> + if (name != NULL && strcmp (name, "-") == 0
> + && (strcmp (cp->suffix, "@c-header") == 0
> + || strcmp (cp->suffix, "@c++-header") == 0))
> +   {
> + fatal_error (input_location,
> +  "cannot use '-' as input filename for a "
> +  "precompiled header");
> + return 0;
> +   }

Isn't fatal_error noreturn?  Then it doesn't make sense to do return 0
after it, so perhaps remove {} and return 0; ?

Ok with those changes.

Jakub


[PATCH][dwarf2out] Fix PR79000

2017-01-05 Thread Richard Biener

The following fixes a LTO dwarf2out ICE when mixing C and C++ TUs.
is_cxx then claims we're C++ but we are not happy to see a C typedef
handled by

gen_typedef_die (tree decl, dw_die_ref context_die)
{
...
  if (is_naming_typedef_decl (TYPE_NAME (type)))
{
  /* Here, we are in the case of decl being a typedef naming
 an anonymous type, e.g:
 typedef struct {...} foo;
 In that case TREE_TYPE (decl) is not a typedef variant
 type and TYPE_NAME of the anonymous type is set to the
 TYPE_DECL of the typedef. This construct is emitted by
 the C++ FE.

 TYPE is the anonymous struct named by the typedef
 DECL. As we need the DW_AT_type attribute of the
 DW_TAG_typedef to point to the DIE of TYPE, let's
 generate that DIE right away. add_type_attribute
 called below will then pick (via lookup_type_die) that
 anonymous struct DIE.  */
  if (!TREE_ASM_WRITTEN (type))
gen_tagged_type_die (type, context_die, 
DINFO_USAGE_DIR_USE);

  /* This is a GNU Extension.  We are adding a
 DW_AT_linkage_name attribute to the DIE of the
 anonymous struct TYPE.  The value of that attribute
 is the name of the typedef decl naming the anonymous
 struct.  This greatly eases the work of consumers of
 this debug info.  */
  add_linkage_name_raw (lookup_type_die (type), decl);
}

Here gen_tagged_type_die -> gen_member_die eventually ICEs not expecting a
variant type.  "Fixing" that assert just shows that add_linkage_name_raw
fails as we do not have a DECL_ASSEMBLER_NAME for the type decl and
we do not allow deferred asm names late (the C type decl won't ever
get an assembler name anyway...).

Thus the following provides an overload to is_cxx () we can feed
with context (to see whether the typedef decl was created by the C++ FE).

Hopefully with GCC 8 we'll get early LTO debug and all these issues
gone...

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk
and branches?

Thanks,
Richard.

2017-01-05  Richard Biener  

PR debug/79000
* dwarf2out.c (is_cxx): New overload with context.
(is_naming_typedef_decl): Use it.

* g++.dg/lto/pr79000_0.C: New testcase.
* g++.dg/lto/pr79000_1.c: Likewise.

Index: gcc/dwarf2out.c
===
*** gcc/dwarf2out.c (revision 244093)
--- gcc/dwarf2out.c (working copy)
*** static int get_AT_flag (dw_die_ref, enum
*** 3356,3361 
--- 3356,3362 
  static unsigned get_AT_unsigned (dw_die_ref, enum dwarf_attribute);
  static inline dw_die_ref get_AT_ref (dw_die_ref, enum dwarf_attribute);
  static bool is_cxx (void);
+ static bool is_cxx (const_tree);
  static bool is_fortran (void);
  static bool is_ada (void);
  static bool remove_AT (dw_die_ref, enum dwarf_attribute);
*** is_cxx (void)
*** 4990,4995 
--- 4991,5012 
  || lang == DW_LANG_C_plus_plus_11 || lang == DW_LANG_C_plus_plus_14);
  }
  
+ /* Return TRUE if DECL was created by the C++ frontend.  */
+ 
+ static bool
+ is_cxx (const_tree decl)
+ {
+   if (in_lto_p)
+ {
+   while (DECL_CONTEXT (decl))
+   decl = DECL_CONTEXT (decl);
+   if (TREE_CODE (decl) == TRANSLATION_UNIT_DECL
+ && TRANSLATION_UNIT_LANGUAGE (decl))
+   return strncmp (TRANSLATION_UNIT_LANGUAGE (decl), "GNU C++", 7) == 0;
+ }
+   return is_cxx ();
+ }
+ 
  /* Return TRUE if the language is Java.  */
  
  static inline bool
*** is_naming_typedef_decl (const_tree decl)
*** 24762,24768 
/* It looks like Ada produces TYPE_DECLs that are very similar
   to C++ naming typedefs but that have different
   semantics. Let's be specific to c++ for now.  */
!   || !is_cxx ())
  return FALSE;
  
return (DECL_ORIGINAL_TYPE (decl) == NULL_TREE
--- 24779,24785 
/* It looks like Ada produces TYPE_DECLs that are very similar
   to C++ naming typedefs but that have different
   semantics. Let's be specific to c++ for now.  */
!   || !is_cxx (decl))
  return FALSE;
  
return (DECL_ORIGINAL_TYPE (decl) == NULL_TREE
Index: gcc/testsuite/g++.dg/lto/pr79000_0.C
===
*** gcc/testsuite/g++.dg/lto/pr79000_0.C(revision 0)
--- gcc/testsuite/g++.dg/lto/pr79000_0.C(working copy)
***
*** 0 
--- 1,7 
+ // { dg-lto-do link }
+ // { dg-lto-options { "-flto -g" } }
+ // { dg-extra-ld-options "-r -nostdlib" }
+ 
+ struct a {
+   a();
+ } b;
Index: gcc/testsuite/g++.dg/lto/pr79000_1.c
===

Re: [PATCH] Fix precompiled header for '-' being input file (PR, pch/78970)

2017-01-05 Thread Martin Liška
On 01/05/2017 11:09 AM, Jakub Jelinek wrote:
> On Thu, Jan 05, 2017 at 11:01:37AM +0100, Martin Liška wrote:
>> >From 0e14f21128c7aa67ed0eaa10877323a0b2011b63 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Wed, 4 Jan 2017 16:04:44 +0100
>> Subject: [PATCH] Error for '-' as filename of a precompiled header (PR
>>  pch/78970)
>>
>> gcc/ChangeLog:
>>
>> 2017-01-05  Martin Liska  
>>
>>  * gcc.c (lookup_compiler): Reject '-' filename for a precompiled
>>  header.
>> ---
>>  gcc/gcc.c | 14 +-
>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/gcc.c b/gcc/gcc.c
>> index 8154953eb1d..ea4af119e73 100644
>> --- a/gcc/gcc.c
>> +++ b/gcc/gcc.c
>> @@ -8325,7 +8325,19 @@ lookup_compiler (const char *name, size_t length, 
>> const char *language)
>>  {
>>for (cp = compilers + n_compilers - 1; cp >= compilers; cp--)
>>  if (cp->suffix[0] == '@' && !strcmp (cp->suffix + 1, language))
>> -  return cp;
>> +  {
>> +if (name != NULL && strcmp (name, "-") == 0
>> +&& (strcmp (cp->suffix, "@c-header") == 0
>> +|| strcmp (cp->suffix, "@c++-header") == 0))
>> +  {
>> +fatal_error (input_location,
>> + "can't use '-' as input filename for a "
>> + "precompiled header");
> 
> That would be can%'t

Change to 'cannot', as Andreas suggested.

> 
> Won't the compiler still ICE if you invoke cc1 or cc1plus rather than the 
> driver?

It ICEs, fixed in the patch that bootstraps and survives regression tests.

Martin

> 
>   Jakub
> 

>From c2d521336f6c2d8d0048352dde6e690de7dc1ddd Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 4 Jan 2017 16:04:44 +0100
Subject: [PATCH] Error for '-' as filename of a precompiled header (PR
 pch/78970)

gcc/c-family/ChangeLog:

2017-01-05  Martin Liska  

	* c-opts.c (c_common_post_options): Reject '-' filename for a precompiled
	header.

gcc/ChangeLog:

2017-01-05  Martin Liska  

	* gcc.c (lookup_compiler): Reject '-' filename for a precompiled
	header.
---
 gcc/c-family/c-opts.c |  7 ++-
 gcc/gcc.c | 14 +-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 14e038c02b3..49df9e19157 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -744,7 +744,12 @@ c_common_post_options (const char **pfilename)
   in_fnames[0] = "";
 }
   else if (strcmp (in_fnames[0], "-") == 0)
-in_fnames[0] = "";
+{
+  if (pch_file)
+	error ("cannot use '-' as input filename for a precompiled header");
+
+  in_fnames[0] = "";
+}
 
   if (out_fname == NULL || !strcmp (out_fname, "-"))
 out_fname = "";
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 8154953eb1d..f42c4ef372e 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -8325,7 +8325,19 @@ lookup_compiler (const char *name, size_t length, const char *language)
 {
   for (cp = compilers + n_compilers - 1; cp >= compilers; cp--)
 	if (cp->suffix[0] == '@' && !strcmp (cp->suffix + 1, language))
-	  return cp;
+	  {
+	if (name != NULL && strcmp (name, "-") == 0
+		&& (strcmp (cp->suffix, "@c-header") == 0
+		|| strcmp (cp->suffix, "@c++-header") == 0))
+	  {
+		fatal_error (input_location,
+			 "cannot use '-' as input filename for a "
+			 "precompiled header");
+		return 0;
+	  }
+
+	return cp;
+	  }
 
   error ("language %s not recognized", language);
   return 0;
-- 
2.11.0



Re: [bootstrap-O3,fortran] silence warning in simplify_transformation_to_array

2017-01-05 Thread Richard Biener
On Thu, Jan 5, 2017 at 5:06 AM, Alexandre Oliva  wrote:
> On Jan  4, 2017, Alexandre Oliva  wrote:
>
>> I'll prepare and post a patch anyway, but do we want to make it
>> standard practice?
>
> Here it is.
>
> simplify_transformation_to_array had the nested loop unrolled 7 times,
> which is reasonable given that it iterates over arrays of size
> GFC_MAX_DIMENSIONS == 7.
>
> The problem is that the last iteration increments the index, tests
> that it's less than result->rank, and then accesses the arrays with
> the incremented index.
>
> We did not optimize out that part in the 7th iteration, so VRP flagged
> the unreachable code as accessing arrays past the end.
>
> It couldn't possibly know that we'd never reach that part, since the
> test was on result->rank, and it's not obvious (for the compiler) that
> result->rank <= GFC_MAX_DIMENSIONS.
>
> Even an assert to that effect before the enclosing loop didn't avoid
> the warning turned to error, though; I suppose there might be some
> aliasing at play, because moving the assert into the loop does.  An
> assert on the index itself would also work, even more efficiently, but
> we're just silencing the warning instead.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

Reasonable -- I'll leave it for others to comment on that "standard
practice" part
(it'll be the first case of using this IIRC).

Richard.

> for  gcc/fortran/ChangeLog
>
> * simplify.c (simplify_transformation_to_array): Silence
> array bounds warning.  Fix whitespace.
> ---
>  gcc/fortran/simplify.c |   11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
> index 549d9005..db860f7 100644
> --- a/gcc/fortran/simplify.c
> +++ b/gcc/fortran/simplify.c
> @@ -610,9 +610,18 @@ simplify_transformation_to_array (gfc_expr *result, 
> gfc_expr *array, gfc_expr *d
>   n++;
>   if (n < result->rank)
> {
> - count [n]++;
> +#pragma GCC diagnostic push
> + /* If the nested loop is unrolled GFC_MAX_DIMENSIONS
> +times, we'd warn for the last iteration, because the
> +array index will have already been incremented to the
> +array sizes, and we can't tell that this must make
> +the test against result->rank false, because ranks
> +must not exceed GFC_MAX_DIMENSIONS.  */
> +#pragma GCC diagnostic ignored "-Warray-bounds"
> + count[n]++;
>   base += sstride[n];
>   dest += dstride[n];
> +#pragma GCC diagnostic pop
> }
>   else
> done = true;
>
>
> --
> Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PATCH][tree-ssa-address] Use simplify_gen_binary in gen_addr_rtx

2017-01-05 Thread Kyrill Tkachov


On 05/01/17 12:01, Richard Biener wrote:

On Wed, Jan 4, 2017 at 4:07 PM, Kyrill Tkachov
 wrote:

On 04/01/17 14:19, Richard Biener wrote:

On Wed, Dec 21, 2016 at 10:40 AM, Kyrill Tkachov
 wrote:

On 20/12/16 17:30, Richard Biener wrote:

On December 20, 2016 5:01:19 PM GMT+01:00, Kyrill Tkachov
 wrote:

Hi all,

The testcase in this patch generates bogus assembly for arm with -O1
-mfloat-abi=soft:
  strdr4, [#0, r3]

This is due to non-canonical RTL being generated during expansion:
(set (mem:DI (plus:SI (const_int 0 [0])
 (reg/f:SI 153)) [0 MEM[symbol: a, index: _26, offset: 0B]+0 S8
A64])
   (reg:DI 154))

Note the (plus (const_int 0) (reg)). This is being generated in
gen_addr_rtx in tree-ssa-address.c
where it creates an explicit PLUS rtx through gen_rtx_PLUS, which
doesn't try to canonicalise its arguments
or simplify. The correct thing to do is to use simplify_gen_binary that
will handle all this properly.

But it has to match up the validity check which passes down exactly the
same RTL(?)  Or does this stem from propagation simplifying a MEM after
IVOPTs?


You mean TARGET_LEGITIMATE_ADDRESS_P? Yes, it gets passed on to that, but
the arm implementation of that
doesn't try to handle non-canonical RTL (plus (const0_rtx) (reg) is not
canonical).
Or do you mean some other check?

Ok, now looking at the actual patch and the code in question.  For your
testcase
it happens that symbol == const0_rtx?  In this case please amend the
if (symbol) check like we do for the base, thus

 if (symbol && symbol != const0_rtx)


No, symbol is not const0_rtx (it's just a symbol_ref).
index is const0_rtx and so gets assigned to *addr at the beginning of the
function.
base and step are NULL_RTX.
So at the time of the check:
if (*addr)
  *addr = gen_rtx_PLUS (address_mode, *addr, act_elem);
else
  *addr = act_elem;

*addr is const0_rtx. Do you want to amend that check to:
 if (*addr && *addr != const0_rtx) ?

Hmm, I guess given the if (step) in if (index) *addr can end up being
a not simplified mult.  So instead do

if (index && index != const0_rtx)


That works, I'll test a patch for this.


I haven't looked where the const0_rtx index comes from, so I don't know if
it
could have other CONST_INT values that may cause trouble.

Usually this happens when constant folding / propagation happens after
IVOPTs generates the TARGET_MEM_REF.  We do have some canonicalization
foldings for TMR, see maybe_fold_tmr, but that should have made index NULL
if it was constant...  So maybe we fail to fold a stmt at some point.

Btw, I fail to see the bogus asm with my arm-non-eabi cross with -O
-mfloat-abi=soft
so I can't tell how the TMR arrives at RTL expansion.


You'll also want to specify -marm (this doesn't trigger on Thumb) and perhaps 
-march=armv7-a.

Thanks,
Kyrill


Richard.


Kyrill



Richard.


Thanks,
Kyrill



I didn't change the other gen_rtx_PLUS calls in this function as their
results is later used in XEXP operations
that seem to rely on a PLUS expression being explicitly produced, but
this particular call doesn't, so it's okay
to change it. With this patch the sane assembly is generated:
   strdr4, [r3]

Bootstrapped and tested on arm-none-linux-gnueabihf, x86_64,
aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-12-20  Kyrylo Tkachov  

  * tree-ssa-address.c (gen_addr_rtx): Use simplify_gen_binary to
add
   *addr to act_elem.

2016-12-20  Kyrylo Tkachov  

   * gcc.dg/20161219.c: New test.






Re: [PATCH] Improve -f{,no}-vect-cost-model

2017-01-05 Thread Richard Biener
On Wed, Jan 4, 2017 at 10:15 PM, Jakub Jelinek  wrote:
> On Wed, Jan 04, 2017 at 01:31:28PM +0100, Richard Biener wrote:
>> >  And not sure why this actually
>> > is RejectNegative, wouldn't
>> > Common Alias(fvect-cost-model=,dynamic,unlimited)
>> > work just on fvect-cost-model (can test that)?
>>
>> Good question ;)  If it works, ok.
>
> And here is a patch for that, in addition to bootstrap/regtests I've
> tried in the debugger the effect of both -fvect-cost-model and
> -fno-vect-cost-model and they set global_options.x_flag_vect_cost_model
> to the expected enum values in each case.
>
> Ok for trunk?

Ok.

Richard.

> 2017-01-04  Jakub Jelinek  
>
> * common.opt (fvect-cost-model): Remove RejectNegative flag, use
> 3 argument Alias with unlimited for the negative form.
> (fno-vect-cost-model): Removed.
>
> --- gcc/common.opt.jj   2017-01-01 12:45:37.0 +0100
> +++ gcc/common.opt  2017-01-04 18:30:32.239318711 +0100
> @@ -2706,13 +2706,9 @@ EnumValue
>  Enum(vect_cost_model) String(cheap) Value(VECT_COST_MODEL_CHEAP)
>
>  fvect-cost-model
> -Common RejectNegative Alias(fvect-cost-model=,dynamic)
> +Common Alias(fvect-cost-model=,dynamic,unlimited)
>  Enables the dynamic vectorizer cost model.  Preserved for backward 
> compatibility.
>
> -fno-vect-cost-model
> -Common RejectNegative Alias(fvect-cost-model=,unlimited)
> -Enables the unlimited vectorizer cost model.  Preserved for backward 
> compatibility.
> -
>  ftree-vect-loop-version
>  Common Ignore
>  Does nothing. Preserved for backward compatibility.
>
>
> Jakub


Re: [PATCH] Tweak factor_out_conditional_conversion (PR tree-optimization/71016, take 2)

2017-01-05 Thread Richard Biener
On Thu, 5 Jan 2017, Jakub Jelinek wrote:

> Hi!
> 
> On Thu, Jan 05, 2017 at 09:09:06AM +0100, Richard Biener wrote:
> > I think that would be premature.  Consider a modified 
> > gcc.dg/tree-ssa/pr66726.c:
> > 
> > extern unsigned short mode_size[];
> > 
> > int
> > oof (int mode)
> > {
> >   int tem;
> >   if (64 < mode_size[mode])
> > tem = 64;
> >   else
> > tem = mode_size[mode];
> >   return tem;
> > }
> 
> You're right, that testcase still needs factor_out_*.
> 
> > That we now fold the GENERIC cond-expr during gimplification
> > is of course good but doesn't make the phiopt code useless.
> > As far as I remember I objected adding handling of conversions
> > in the minmax replacement code and instead suggested to add this
> > "enablement" phiopt instead.  So what I suggest is to tame that down
> > to the cases where it actually enables something (empty BB afterwards,
> > a PHI with a use that's also used on the controlling conditional).
> 
> So like this (if it passes bootstrap/regtest)?

Yes, this looks good to me.

Thanks,
Richard.

> 2017-01-05  Jakub Jelinek  
> 
>   PR tree-optimization/71016
>   * tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Pass cond_stmt to
>   factor_out_conditional_conversion.  Formatting fix.
>   (factor_out_conditional_conversion): Add cond_stmt argument.
>   If arg1 is INTEGER_CST, punt if new_arg0 is not any operand of
>   cond_stmt and if arg0_def_stmt is not the only stmt in its bb.
> 
>   * gcc.target/i386/pr71016.c: New test.
>   * gcc.target/aarch64/pr71016.c: New test.
>   * gcc.dg/tree-ssa/pr66726-3.c: New test.
> 
> --- gcc/tree-ssa-phiopt.c.jj  2017-01-03 08:12:27.0 +0100
> +++ gcc/tree-ssa-phiopt.c 2017-01-05 12:12:09.571017488 +0100
> @@ -49,7 +49,8 @@ along with GCC; see the file COPYING3.
>  static unsigned int tree_ssa_phiopt_worker (bool, bool);
>  static bool conditional_replacement (basic_block, basic_block,
>edge, edge, gphi *, tree, tree);
> -static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, 
> tree);
> +static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, 
> tree,
> + gimple *);
>  static int value_replacement (basic_block, basic_block,
> edge, edge, gimple *, tree, tree);
>  static bool minmax_replacement (basic_block, basic_block,
> @@ -233,7 +234,7 @@ tree_ssa_phiopt_worker (bool do_store_el
> continue;
>   }
>else if (do_hoist_loads
> -  && EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest)
> +&& EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest)
>   {
> basic_block bb3 = EDGE_SUCC (bb1, 0)->dest;
>  
> @@ -313,7 +314,8 @@ tree_ssa_phiopt_worker (bool do_store_el
> gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
>  
> gphi *newphi = factor_out_conditional_conversion (e1, e2, phi,
> - arg0, arg1);
> + arg0, arg1,
> + cond_stmt);
> if (newphi != NULL)
>   {
> phi = newphi;
> @@ -402,11 +404,12 @@ replace_phi_edge_with_variable (basic_bl
>  
>  /* PR66726: Factor conversion out of COND_EXPR.  If the arguments of the PHI
> stmt are CONVERT_STMT, factor out the conversion and perform the 
> conversion
> -   to the result of PHI stmt.  Return the newly-created PHI, if any.  */
> +   to the result of PHI stmt.  COND_STMT is the controlling predicate.
> +   Return the newly-created PHI, if any.  */
>  
>  static gphi *
>  factor_out_conditional_conversion (edge e0, edge e1, gphi *phi,
> -tree arg0, tree arg1)
> +tree arg0, tree arg1, gimple *cond_stmt)
>  {
>gimple *arg0_def_stmt = NULL, *arg1_def_stmt = NULL, *new_stmt;
>tree new_arg0 = NULL_TREE, new_arg1 = NULL_TREE;
> @@ -472,7 +475,31 @@ factor_out_conditional_conversion (edge
> && int_fits_type_p (arg1, TREE_TYPE (new_arg0)))
>   {
> if (gimple_assign_cast_p (arg0_def_stmt))
> - new_arg1 = fold_convert (TREE_TYPE (new_arg0), arg1);
> + {
> +   /* For the INTEGER_CST case, we are just moving the
> +  conversion from one place to another, which can often
> +  hurt as the conversion moves further away from the
> +  statement that computes the value.  So, perform this
> +  only if new_arg0 is an operand of COND_STMT, or
> +  if arg0_def_stmt is the only non-debug stmt in
> +  its basic block, because then it is possible this
> +  could enable further optimizations (minmax replacement
> +  etc.).  See PR71016.  */
> +   if (new_arg0 != gimple_cond_lhs (cond_stmt)
> +   && 

Re: [PATCH] Fix late dwarf generated early from optimized out globals

2017-01-05 Thread Richard Biener
On Wed, 4 Jan 2017, Andreas Tobler wrote:

> On 04.01.17 10:21, Richard Biener wrote:
> > On Wed, 28 Dec 2016, Andreas Tobler wrote:
> > 
> > > On 28.12.16 19:24, Richard Biener wrote:
> > > > On December 27, 2016 11:17:00 PM GMT+01:00, Andreas Tobler
> > > >  wrote:
> > > > > On 16.09.16 13:30, Richard Biener wrote:
> > > > > > On Thu, 15 Sep 2016, Richard Biener wrote:
> > > > > > 
> > > > > > > 
> > > > > > > This addresses sth I needed to address with the early LTO debug
> > > > > patches
> > > > > > > (you might now figure I'm piecemail merging stuff from that
> > > > > > > patch).
> > > > > > > 
> > > > > > > When the cgraph code optimizes out a global we call the
> > > > > late_global_decl
> > > > > > > debug hook to eventually add a DW_AT_const_value to its DIE (we
> > > > > don't
> > > > > > > really expect a location as that will be invalid after optimizing
> > > > > out
> > > > > > > and will be pruned).
> > > > > > > 
> > > > > > > With the early LTO debug patches I have introduced a
> > > > > early_dwarf_finished
> > > > > > > flag (mainly for consistency checking) and I figured I can use
> > > > > > > that
> > > > > to
> > > > > > > detect the call to the late hook during the early phase and
> > > > > > > provide
> > > > > > > the following cleaned up variant of avoiding to create locations
> > > > > that
> > > > > > > require later pruning (which doesn't work with emitting the early
> > > > > DIEs).
> > > > > > > 
> > > > > > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > > > > > > 
> > > > > > > I verified it does the correct thing for a unit like
> > > > > > > 
> > > > > > > static const int i = 2;
> > > > > > > 
> > > > > > > (but ISTR we do have at least one testcase in the testsuite as
> > > > > well).
> > > > > > > 
> > > > > > > Will commit if testing finishes successfully.
> > > > > > 
> > > > > > Ok, so it showed issues when merging that back to early-LTO-debug.
> > > > > > Turns out in LTO we never call early_finish and thus
> > > > > early_dwarf_finished
> > > > > > was never set.  Also dwarf2out_late_global_decl itself is a better
> > > > > > place to constrain generating locations.
> > > > > > 
> > > > > > The following variant is in very late stage of testing.
> > > > > > 
> > > > > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > > > > > LTO bootstrap on x86_64-unknown-linux-gnu in stage3.  LTO bootstrap
> > > > > > with early-LTO-debug in stage3, bootstraped with early-LTO-debug,
> > > > > > testing in progress.
> > > > > 
> > > > > Any chance to backport this commit (r240228) to 6.x?
> > > > > It fixes a bootstrap comparison issue on aarch64-*-freebsd*.
> > > > > The aarch64-*-freebsd* port is not yet merged to 6.x and 5.4.x due to
> > > > > the bootstrap comparison failure I faced.
> > > > 
> > > > Did you analyze the bootstrap miscompare?  I suspect the patch merely
> > > > papers
> > > > over the problem.
> > > 
> > > gcc/contrib/compare-debug -p prev-gcc/ipa-icf.o gcc/ipa-icf.o
> > > prev-gcc/ipa-icf.o.stripped. gcc/ipa-icf.o.stripped. differ: char 52841,
> > > line
> > > 253
> > > 
> > > 
> > > The objdump -dSx diff on the non stripped object looked always more or
> > > less
> > > the same, a rodata offset which was different.
> > > 
> > > -   1448: R_AARCH64_ADD_ABS_LO12_NC .rodata+0x1d8
> > > +   1448: R_AARCH64_ADD_ABS_LO12_NC .rodata+0x410
> > 
> > Hmm, sounds like a constant pool entry was created by -g at a different
> > time (and thus offset) from regular compilation.  So yes, the patch
> > in question should have the affect to "fix" this.
> > 
> > Note that I later changed the fix with
> > 
> > 2016-10-20  Richard Biener  
> > 
> > * cgraphunit.c (analyze_functions): Set node->definition to
> > false to signal symbol removal to debug_hooks->late_global_decl.
> > * ipa.c (symbol_table::remove_unreachable_nodes): When not in
> > WPA signal symbol removal to the debuginfo machinery.
> > * dwarf2out.c (dwarf2out_late_global_decl): Instead of
> > using early_finised to guard the we're called for symbol
> > removal case look at the symtabs definition flag.
> > (gen_variable_die): Remove redundant check.
> > 
> > (the dwarf2out_late_global_decl and analyze_functions part are
> > relevant).
> > 
> > That should be the fix to backport (avoiding the early_dwarf_finished
> > part).
> 
> Ok, I bootstrapped with the attached snippet (had to tweak a bit to apply
> cleanly). w/o the previous mentioned fix (r240228). Is this what you had in
> mind or do you think some parts of the other fix (r240228) is also needed?

Not sure if you need the dwarf2out.c part you included but you clearly
missed the dwarf2out_late_global_decl part doing

  /* We get called via the symtab code invoking late_global_decl
 for symbols that are optimized out.  Do not add locations
 

Re: [PATCH] Fix precompiled header for '-' being input file (PR, pch/78970)

2017-01-05 Thread Andreas Schwab
On Jan 05 2017, Jakub Jelinek  wrote:

> On Thu, Jan 05, 2017 at 11:01:37AM +0100, Martin Liška wrote:
>> diff --git a/gcc/gcc.c b/gcc/gcc.c
>> index 8154953eb1d..ea4af119e73 100644
>> --- a/gcc/gcc.c
>> +++ b/gcc/gcc.c
>> @@ -8325,7 +8325,19 @@ lookup_compiler (const char *name, size_t length, 
>> const char *language)
>>  {
>>for (cp = compilers + n_compilers - 1; cp >= compilers; cp--)
>>  if (cp->suffix[0] == '@' && !strcmp (cp->suffix + 1, language))
>> -  return cp;
>> +  {
>> +if (name != NULL && strcmp (name, "-") == 0
>> +&& (strcmp (cp->suffix, "@c-header") == 0
>> +|| strcmp (cp->suffix, "@c++-header") == 0))
>> +  {
>> +fatal_error (input_location,
>> + "can't use '-' as input filename for a "
>> + "precompiled header");
>
> That would be can%'t

Better: cannot.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH][tree-ssa-address] Use simplify_gen_binary in gen_addr_rtx

2017-01-05 Thread Richard Biener
On Wed, Jan 4, 2017 at 4:07 PM, Kyrill Tkachov
 wrote:
>
> On 04/01/17 14:19, Richard Biener wrote:
>>
>> On Wed, Dec 21, 2016 at 10:40 AM, Kyrill Tkachov
>>  wrote:
>>>
>>> On 20/12/16 17:30, Richard Biener wrote:

 On December 20, 2016 5:01:19 PM GMT+01:00, Kyrill Tkachov
  wrote:
>
> Hi all,
>
> The testcase in this patch generates bogus assembly for arm with -O1
> -mfloat-abi=soft:
>  strdr4, [#0, r3]
>
> This is due to non-canonical RTL being generated during expansion:
> (set (mem:DI (plus:SI (const_int 0 [0])
> (reg/f:SI 153)) [0 MEM[symbol: a, index: _26, offset: 0B]+0 S8
> A64])
>   (reg:DI 154))
>
> Note the (plus (const_int 0) (reg)). This is being generated in
> gen_addr_rtx in tree-ssa-address.c
> where it creates an explicit PLUS rtx through gen_rtx_PLUS, which
> doesn't try to canonicalise its arguments
> or simplify. The correct thing to do is to use simplify_gen_binary that
> will handle all this properly.

 But it has to match up the validity check which passes down exactly the
 same RTL(?)  Or does this stem from propagation simplifying a MEM after
 IVOPTs?
>>>
>>>
>>> You mean TARGET_LEGITIMATE_ADDRESS_P? Yes, it gets passed on to that, but
>>> the arm implementation of that
>>> doesn't try to handle non-canonical RTL (plus (const0_rtx) (reg) is not
>>> canonical).
>>> Or do you mean some other check?
>>
>> Ok, now looking at the actual patch and the code in question.  For your
>> testcase
>> it happens that symbol == const0_rtx?  In this case please amend the
>> if (symbol) check like we do for the base, thus
>>
>> if (symbol && symbol != const0_rtx)
>
>
> No, symbol is not const0_rtx (it's just a symbol_ref).
> index is const0_rtx and so gets assigned to *addr at the beginning of the
> function.
> base and step are NULL_RTX.
> So at the time of the check:
>if (*addr)
>  *addr = gen_rtx_PLUS (address_mode, *addr, act_elem);
>else
>  *addr = act_elem;
>
> *addr is const0_rtx. Do you want to amend that check to:
> if (*addr && *addr != const0_rtx) ?

Hmm, I guess given the if (step) in if (index) *addr can end up being
a not simplified mult.  So instead do

   if (index && index != const0_rtx)

> I haven't looked where the const0_rtx index comes from, so I don't know if
> it
> could have other CONST_INT values that may cause trouble.

Usually this happens when constant folding / propagation happens after
IVOPTs generates the TARGET_MEM_REF.  We do have some canonicalization
foldings for TMR, see maybe_fold_tmr, but that should have made index NULL
if it was constant...  So maybe we fail to fold a stmt at some point.

Btw, I fail to see the bogus asm with my arm-non-eabi cross with -O
-mfloat-abi=soft
so I can't tell how the TMR arrives at RTL expansion.

Richard.

> Kyrill
>
>
>> Richard.
>>
>>> Thanks,
>>> Kyrill
>>>
>>>
> I didn't change the other gen_rtx_PLUS calls in this function as their
> results is later used in XEXP operations
> that seem to rely on a PLUS expression being explicitly produced, but
> this particular call doesn't, so it's okay
> to change it. With this patch the sane assembly is generated:
>   strdr4, [r3]
>
> Bootstrapped and tested on arm-none-linux-gnueabihf, x86_64,
> aarch64-none-linux-gnu.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2016-12-20  Kyrylo Tkachov  
>
>  * tree-ssa-address.c (gen_addr_rtx): Use simplify_gen_binary to
> add
>   *addr to act_elem.
>
> 2016-12-20  Kyrylo Tkachov  
>
>   * gcc.dg/20161219.c: New test.


>


RE: Make MIPS soft-fp preserve NaN payloads for NAN2008

2017-01-05 Thread Matthew Fortune
Joseph Myers  writes:
> On Wed, 4 Jan 2017, Maciej W. Rozycki wrote:
> 
> >  AFAIR we deliberately decided not to define a 2008-NaN soft-float
> > ABI, and chose to require all soft-float binaries to use the legacy
> encoding.
> 
> Soft-float and 2008-NaN are naturally completely orthogonal and the
> combination works fine (of course, it doesn't need any special kernel or
> hardware support).  There's no need to disallow the combination.
> 
> In any case, the soft-fp change is relevant in the hard-float case as
> well, to make software TFmode behave consistently with hardware SFmode
> and DFmode regarding NaN payload preservation.

I agree here.

It is true to say that users are discouraged from using 2008-NaN with
soft-float for pre-R6 architectures simply to avoid further fragmentation
of software for no real gain. However, for R6 then soft-float is 2008-NaN
otherwise we are stuck with legacy-NaN forever.

If someone did want to build a system from source with soft-float as
2008-NaN then I see no reason to stop them but I doubt they would and I
don't expect the --with-nan GCC configure option to be used in conjunction
with --with-float=soft for the same reason. The most likely use of
--with-nan is to build a distribution specifically to target an MSA capable
system like P5600 or perhaps an M5150 with an FPU. The NaN interlinking
work will make these use-cases less important still though I think.

Matthew


Re: [build, libgo, libstdc++] Build libgo with -Wa,-nH if possible (PR go/78978)

2017-01-05 Thread Jonathan Wakely

On 05/01/17 10:20 +0100, Rainer Orth wrote:

As could have been expected, the static libgo.a causes the same problem
with hardware capabilities on Solaris/x86 as was solved for libgo.so
with

https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00726.html

Instead of trying to pass -mclear-hwcap with -static-libgo, it was
deemed easier to solve both problems the same way and disable hwcaps
support in the assembler in the first place.

This has already been done in libstdc++, so I've moved the corresponding
autoconf macro to config/hwcaps.m4 and adapted it to set HWCAP_CFLAGS
instead of HWCAP_FLAGS to better differntiate from HWCAP_LDFLAGS.
Everything else is straightforward, I believe.

Bootstrapped without regressions on i386-pc-solaris2.1[02] with as/ld
(where as supports -nH) and sparc-sun-solaris2.12 with as/ld (where as
doesn't, or rather it's called -hwcap={1|0}), and the libgo
runtime/pprof failure is gone.

Ok for mainline?


I can't approve the whole thing, but moving the macro out of libstdc++
is OK.



[PATCH] Tweak factor_out_conditional_conversion (PR tree-optimization/71016, take 2)

2017-01-05 Thread Jakub Jelinek
Hi!

On Thu, Jan 05, 2017 at 09:09:06AM +0100, Richard Biener wrote:
> I think that would be premature.  Consider a modified 
> gcc.dg/tree-ssa/pr66726.c:
> 
> extern unsigned short mode_size[];
> 
> int
> oof (int mode)
> {
>   int tem;
>   if (64 < mode_size[mode])
> tem = 64;
>   else
> tem = mode_size[mode];
>   return tem;
> }

You're right, that testcase still needs factor_out_*.

> That we now fold the GENERIC cond-expr during gimplification
> is of course good but doesn't make the phiopt code useless.
> As far as I remember I objected adding handling of conversions
> in the minmax replacement code and instead suggested to add this
> "enablement" phiopt instead.  So what I suggest is to tame that down
> to the cases where it actually enables something (empty BB afterwards,
> a PHI with a use that's also used on the controlling conditional).

So like this (if it passes bootstrap/regtest)?

2017-01-05  Jakub Jelinek  

PR tree-optimization/71016
* tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Pass cond_stmt to
factor_out_conditional_conversion.  Formatting fix.
(factor_out_conditional_conversion): Add cond_stmt argument.
If arg1 is INTEGER_CST, punt if new_arg0 is not any operand of
cond_stmt and if arg0_def_stmt is not the only stmt in its bb.

* gcc.target/i386/pr71016.c: New test.
* gcc.target/aarch64/pr71016.c: New test.
* gcc.dg/tree-ssa/pr66726-3.c: New test.

--- gcc/tree-ssa-phiopt.c.jj2017-01-03 08:12:27.0 +0100
+++ gcc/tree-ssa-phiopt.c   2017-01-05 12:12:09.571017488 +0100
@@ -49,7 +49,8 @@ along with GCC; see the file COPYING3.
 static unsigned int tree_ssa_phiopt_worker (bool, bool);
 static bool conditional_replacement (basic_block, basic_block,
 edge, edge, gphi *, tree, tree);
-static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, 
tree);
+static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
+   gimple *);
 static int value_replacement (basic_block, basic_block,
  edge, edge, gimple *, tree, tree);
 static bool minmax_replacement (basic_block, basic_block,
@@ -233,7 +234,7 @@ tree_ssa_phiopt_worker (bool do_store_el
  continue;
}
   else if (do_hoist_loads
-&& EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest)
+  && EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest)
{
  basic_block bb3 = EDGE_SUCC (bb1, 0)->dest;
 
@@ -313,7 +314,8 @@ tree_ssa_phiopt_worker (bool do_store_el
  gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
 
  gphi *newphi = factor_out_conditional_conversion (e1, e2, phi,
-   arg0, arg1);
+   arg0, arg1,
+   cond_stmt);
  if (newphi != NULL)
{
  phi = newphi;
@@ -402,11 +404,12 @@ replace_phi_edge_with_variable (basic_bl
 
 /* PR66726: Factor conversion out of COND_EXPR.  If the arguments of the PHI
stmt are CONVERT_STMT, factor out the conversion and perform the conversion
-   to the result of PHI stmt.  Return the newly-created PHI, if any.  */
+   to the result of PHI stmt.  COND_STMT is the controlling predicate.
+   Return the newly-created PHI, if any.  */
 
 static gphi *
 factor_out_conditional_conversion (edge e0, edge e1, gphi *phi,
-  tree arg0, tree arg1)
+  tree arg0, tree arg1, gimple *cond_stmt)
 {
   gimple *arg0_def_stmt = NULL, *arg1_def_stmt = NULL, *new_stmt;
   tree new_arg0 = NULL_TREE, new_arg1 = NULL_TREE;
@@ -472,7 +475,31 @@ factor_out_conditional_conversion (edge
  && int_fits_type_p (arg1, TREE_TYPE (new_arg0)))
{
  if (gimple_assign_cast_p (arg0_def_stmt))
-   new_arg1 = fold_convert (TREE_TYPE (new_arg0), arg1);
+   {
+ /* For the INTEGER_CST case, we are just moving the
+conversion from one place to another, which can often
+hurt as the conversion moves further away from the
+statement that computes the value.  So, perform this
+only if new_arg0 is an operand of COND_STMT, or
+if arg0_def_stmt is the only non-debug stmt in
+its basic block, because then it is possible this
+could enable further optimizations (minmax replacement
+etc.).  See PR71016.  */
+ if (new_arg0 != gimple_cond_lhs (cond_stmt)
+ && new_arg0 != gimple_cond_rhs (cond_stmt)
+ && gimple_bb (arg0_def_stmt) == e0->src)
+   {
+ gsi = gsi_for_stmt (arg0_def_stmt);
+ gsi_prev_nondebug ();
+

Re: [PATCH 6/6][ARM] Implement support for ACLE Coprocessor MCRR and MRRC intrinsics

2017-01-05 Thread Kyrill Tkachov

Hi Andre,

On 09/11/16 10:12, Andre Vieira (lists) wrote:

Hi,

This patch implements support for the ARM ACLE Coprocessor MCR and MRC
intrinsics. See below a table mapping the intrinsics to their respective
instructions:

+---+---+
| Intrinsic signature   |
Instruction pattern   |
+---+---+
|void __arm_mcrr(coproc, opc1, uint64_t value, CRm) |
MCRR coproc, opc1, Rt, Rt2, CRm   |
+---+---+
|void __arm_mcrr2(coproc, opc1, uint64_t value, CRm)|
MCRR2 coproc, opc1, Rt, Rt2, CRm  |
+---+---+
|uint64_t __arm_mrrc(coproc, opc1, CRm) |
MRRC coproc, opc1, Rt, Rt2, CRm   |
+---+---+
|uint64_t __arm_mrrc2(coproc, opc1, CRm)|
MRRC2 coproc, opc1, Rt, Rt2, CRm  |
+---+---+
Note that any untyped variable in the intrinsic signature is required to
be a compiler-time constant and has the type 'unsigned int'.  We do some
boundary checks for coproc:[0-15], opc1[0-7] CR*:[0-31]. If either of
these requirements are not met a diagnostic is issued.

I added a new arm_arch variable for ARMv5TE to use when deciding whether
or not the MCRR and MRCC intrinsics are available.

Is this OK for trunk?


Same as with the previous two patches the define_insns need constraints
and also I believe you'll want to rebase this patch on top of Richard's rework 
of the
architecture feature bits for the ARMv5TE hunk.

Thanks,
Kyrill


Regards,
Andre

gcc/ChangeLog:
2016-11-09  Andre Vieira  

   * config/arm/arm.md (): New.
   (): New.
   * config/arm/arm.c (arm_arch5te): New.
   (arm_option_override): Set arm_arch5te.
   (arm_coproc_builtin_available): Add support for mcrr, mcrr2, mrrc
   and mrrc2.
   * config/arm/arm-builtins.c (MCRR_QUALIFIERS): Define to...
   (arm_mcrr_qualifiers): ... this. New.
   (MRRC_QUALIFIERS): Define to...
   (arm_mrrc_qualifiers): ... this. New.
   * config/arm/arm_acle.h (__arm_mcrr, __arm_mcrr2, __arm_mrrc,
   __arm_mrrc2): New.
   * config/arm/arm_acle_builtins.def (mcrr, mcrr2, mrrc, mrrc2): New.
   * config/arm/iterators.md (MCRRI, mcrr, MCRR): New.
   (MRRCI, mrrc, MRRC): New.
   * config/arm/unspecs.md (VUNSPEC_MCRR, VUNSPEC_MCRR2, VUNSPEC_MRRC,
   VUNSPEC_MRRC2): New.

gcc/testsuite/ChangeLog:

2016-11-09  Andre Vieira  

   * gcc.target/arm/acle/mcrr: New.
   * gcc.target/arm/acle/mcrr2: New.
   * gcc.target/arm/acle/mrrc: New.
   * gcc.target/arm/acle/mrrc2: New.





Re: [PATCH 5/6][ARM] Implement support for ACLE Coprocessor MCR and MRC intrinsics

2017-01-05 Thread Kyrill Tkachov


On 09/11/16 10:12, Andre Vieira (lists) wrote:

Hi,

This patch implements support for the ARM ACLE Coprocessor MCR and MRC
intrinsics. See below a table mapping the intrinsics to their respective
instructions:

+---+---+
| Intrinsic signature   |
Instruction pattern   |
+---+---+
|void __arm_mcr(coproc, opc1, uint32_t value, CRn, CRm, opc2)   |
MCR coproc, opc1, Rt, CRn, CRm, opc2  |
+---+---+
|void __arm_mcr2(coproc, opc1, uint32_t value, CRn, CRm, opc2)  |
MCR2 coproc, opc1, Rt, CRn, CRm, opc2 |
+---+---+
|uint32_t __arm_mrc(coproc, opc1, CRn, CRm, opc2)   |
MRC coproc, opc1, Rt, CRn, CRm, opc2  |
+---+---+
|uint32_t __arm_mrc2(coproc, opc1, CRn, CRm, opc2)  |
MRC2 coproc, opc1, Rt, CRn, CRm, opc2 |
+---+---+
Note that any untyped variable in the intrinsic signature is required to
be a compiler-time constant and has the type 'unsigned int'.  We do some
boundary checks for coproc:[0-15], opc1[0-7] CR*:[0-31],opc2:[0-7].  If
either of these requirements are not met a diagnostic is issued.

Is this OK for trunk?

Regards,
Andre

gcc/ChangeLog:
2016-11-09  Andre Vieira  

   * config/arm/arm.md (): New.
   (): New.
   * config/arm/arm.c (arm_coproc_builtin_available): Add
   support for mcr, mrc, mcr2 and mrc2.
   * config/arm/arm-builtins.c (MCR_QUALIFIERS): Define to...
   (arm_mcr_qualifiers): ... this. New.
   (MRC_QUALIFIERS): Define to ...
   (arm_mrc_qualifiers): ... this. New.
   (MCR_QUALIFIERS): Define to ...
   (arm_mcr_qualifiers): ... this. New.
   * config/arm/arm_acle.h (__arm_mcr, __arm_mrc, __arm_mcr2,
   __arm_mrc2): New.
   * config/arm/arm_acle_builtins.def (mcr, mcr2, mrc, mrc2): New.
   * config/arm/iterators.md (MCRI, mcr, MCR, MRCI, mrc, MRC): New.
   * config/arm/unspecs.md (VUNSPEC_MCR, VUNSPEC_MCR2, VUNSPEC_MRC,
   VUNSPEC_MRC2): New.


gcc/ChangeLog:
2016-11-09  Andre Vieira  

   * gcc.target/arm/acle/mcr.c: New.
   * gcc.target/arm/acle/mrc.c: New.
   * gcc.target/arm/acle/mcr2.c: New.
   * gcc.target/arm/acle/mrc2.c: New.


+(define_insn ""
+  [(unspec_volatile [(match_operand:SI 0 "immediate_operand")
+(match_operand:SI 1 "immediate_operand")
+(match_operand:SI 2 "s_register_operand")
+(match_operand:SI 3 "immediate_operand")
+(match_operand:SI 4 "immediate_operand")
+(match_operand:SI 5 "immediate_operand")] MCRI)
+   (use (match_dup 2))]
+  "arm_coproc_builtin_available (VUNSPEC_)"
+{
+  arm_const_bounds (operands[0], 0, 16);
+  arm_const_bounds (operands[1], 0, 8);
+  arm_const_bounds (operands[3], 0, (1 << 5));
+  arm_const_bounds (operands[4], 0, (1 << 5));
+  arm_const_bounds (operands[5], 0, 8);
+  return "\\tp%c0, %1, %2, CR%c3, CR%c4, %5";
+}
+  [(set_attr "length" "4")
+   (set_attr "type" "coproc")])
+
+(define_insn ""
+  [(set (match_operand:SI 0 "s_register_operand")
+   (unspec_volatile [(match_operand:SI 1 "immediate_operand")
+ (match_operand:SI 2 "immediate_operand")
+ (match_operand:SI 3 "immediate_operand")
+ (match_operand:SI 4 "immediate_operand")
+ (match_operand:SI 5 "immediate_operand")] MRCI))]
+  "arm_coproc_builtin_available (VUNSPEC_)"
+{
+  arm_const_bounds (operands[1], 0, 16);
+  arm_const_bounds (operands[2], 0, 8);
+  arm_const_bounds (operands[3], 0, (1 << 5));
+  arm_const_bounds (operands[4], 0, (1 << 5));
+  arm_const_bounds (operands[5], 0, 8);
+  return "\\tp%c1, %2, %0, CR%c3, CR%c4, %5";
+}
+  [(set_attr "length" "4")
+   (set_attr "type" "coproc")])

Same as the previous patch, these operands need constraints.
The MRC ones in particular need a '=' in their constraint string.


Re: [PATCH 4/6][ARM] Implement support for ACLE Coprocessor LDC and STC intrinsics

2017-01-05 Thread Kyrill Tkachov

Hi Andre,

On 09/11/16 10:12, Andre Vieira (lists) wrote:

Hi,

This patch implements support for the ARM ACLE Coprocessor LDC and STC
intrinsics. See below a table mapping the intrinsics to their respective
instructions:

++--+
| Intrinsic signature| Instruction
pattern  |
++--+
|void __arm_ldc(coproc, CRd, const void* p)  |LDC coproc, CRd,
[...]|
++--+
|void __arm_ldcl(coproc, CRd, const void* p) |LDCL coproc, CRd,
[...]   |
++--+
|void __arm_ldc2(coproc, CRd, const void* p) |LDC2 coproc, CRd,
[...]   |
++--+
|void __arm_ldc2l(coproc, CRd, const void* p)|LDC2L coproc, CRd,
[...]  |
++--+
|void __arm_stc(coproc, CRd, void* p)|STC coproc, CRd,
[...]|
++--+
|void __arm_stcl(coproc, CRd, void* p)   |STCL coproc, CRd,
[...]   |
++--+
|void __arm_stc2(coproc, CRd, void* p)   |STC2 coproc, CRd,
[...]   |
++--+
|void __arm_stc2l(coproc, CRd, void* p)  |STC2L coproc, CRd,
[...]  |
++--+
Note that any untyped variable in the intrinsic signature is required to
be a compiler-time constant and has the type 'unsigned int'.  We do some
boundary checks for coproc:[0-15], CR*:[0-31]. If either of these
requirements are not met a diagnostic is issued.


Is this ok for trunk?


I have a few comments below...


Regards,
Andre

gcc/ChangeLog:
2016-11-09  Andre Vieira  

   * config/arm/arm.md (*ldcstc): New.
   (): New.
   * config/arm/arm.c (arm_coproc_builtin_available): Add
   support for ldc,ldcl,stc,stcl,ldc2,ldc2l,stc2 and stc2l.
   (arm_coproc_ldc_stc_legitimate_address): New.
   * config/arm/arm-builtins.c (arm_type_qualifiers): Add
   'qualifier_const_pointer'.
   (LDC_QUALIFIERS): Define to...
   (arm_ldc_qualifiers): ... this. New.
   (STC_QUALIFIERS): Define to...
   (arm_stc_qualifiers): ... this. New.
   * config/arm/arm-protos.h
   (arm_coproc_ldc_stc_legitimate_address): New.
   * config/arm/arm_acle.h (__arm_ldc, __arm_ldcl, __arm_stc,
   __arm_stcl, __arm_ldc2, __arm_ldc2l, __arm_stc2, __arm_stc2l): New.
   * config/arm/arm_acle_builtins.def (ldc, ldc2, ldcl, ldc2l, stc,
   stc2, stcl, stc2l): New.
   * config/arm/constraints.md (Uz): New.
   * config/arm/iterators.md (LDCSTCI, ldcstc, LDCSTC): New.
   * config/arm/unspecs.md (VUNSPEC_LDC, VUNSPEC_LDC2, VUNSPEC_LDCL,
   VUNSPEC_LDC2L, VUNSPEC_STC, VUNSPEC_STC2, VUNSPEC_STCL,
   VUNSPEC_STC2L): New.

gcc/testsuite/ChangeLog:

2016-11-09  Andre Vieira  

   * gcc.target/arm/acle/ldc: New.
   * gcc.target/arm/acle/ldc2: New.
   * gcc.target/arm/acle/ldcl: New.
   * gcc.target/arm/acle/ldc2l: New.
   * gcc.target/arm/acle/stc: New.
   * gcc.target/arm/acle/stc2: New.
   * gcc.target/arm/acle/stcl: New.
   * gcc.target/arm/acle/stc2l: New.


+  /* const T * foo */
+  qualifier_const_pointer = 0x6,

Nit: full stop at end of comment.

+
+/* This function returns true if OP is a valid memory operand for the ldc and
+   stc coprocessor instructions and false otherwise.  */
+
+bool arm_coproc_ldc_stc_legitimate_address (rtx op)
+{

type and function name should be on separate lines.

+  int range;
+  /* Has to be a memory operand.  */
+  if (!MEM_P (op))
+return false;

+ /* Within the range of [-1020,1020].  */
+ if (range < -1020 || range > 1020)
+   return false;

Use !IN_RANGE (-1020, 1020), also make 'range' a HOST_WIDE_INT (that is what 
INTVAL returns).

--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11858,6 +11858,25 @@
   [(set_attr "length" "4")
(set_attr "type" "coproc")])
 
+(define_insn "*ldcstc"

+  [(unspec_volatile [(match_operand:SI 0 "immediate_operand")
+(match_operand:SI 1 "immediate_operand")
+(match_operand:SI 2 "memory_operand" "Uz")] LDCSTCI)]
+  "arm_coproc_builtin_available (VUNSPEC_)"

You're missing constraints for operands 1 and 2? (just the general 'n' 
constraint should do it since you do 

Re: [PATCH 3/6][ARM] Implement support for ACLE Coprocessor CDP intrinsics

2017-01-05 Thread Kyrill Tkachov

Hi Andre,

On 09/11/16 10:11, Andre Vieira (lists) wrote:

Hi,

This patch implements support for the ARM ACLE Coprocessor CDP
intrinsics. See below a table mapping the intrinsics to their respective
instructions:

++--+
| Intrinsic signature| Instruction
pattern  |
++--+
|void __arm_cdp(coproc, opc1, CRd, CRn, CRm, opc2)   |CDP coproc, opc1,
CRd, CRn, CRm, opc2 |
++--+
|void __arm_cdp2(coproc, opc1, CRd, CRn, CRm, opc2)  |CDP2 coproc, opc1,
CRd, CRn, CRm, opc2|
++--+
Note that any untyped variable in the intrinsic signature is required to
be a compiler-time constant and has the type 'unsigned int'.  We do some
boundary checks for coproc:[0-15], opc1:[0-15], CR*:[0-31], opc2:[0-7].
If either of these requirements are not met a diagnostic is issued.

I renamed neon_const_bounds in this patch, to arm_const_bounds, simply
because it is also used in the Coprocessor intrinsics.  It also requires
the expansion of the builtin frame work such that it accepted 'void'
modes and intrinsics with 6 arguments.

I also changed acle.exp to run tests for multiple options, where all lto
option sets are appended with -ffat-objects to allow for assembly scans.

Is this OK for trunk?


This is okay if bootstrap and testing is ok (as part of the whole series)
modulo a couple of nits in the documentation below.

Thanks,
Kyrill


Regards,
Andre

gcc/ChangeLog:
2016-11-09  Andre Vieira  

   * config/arm/arm.md (): New.
   * config/arm/arm.c (neon_const_bounds): Rename this ...
   (arm_const_bounds): ... this.
   (arm_coproc_builtin_available): New.
   * config/arm/arm-builtins.c (SIMD_MAX_BUILTIN_ARGS): Increase.
   (arm_type_qualifiers): Add 'qualifier_unsigned_immediate'.
   (CDP_QUALIFIERS): Define to...
   (arm_cdp_qualifiers): ... this. New.
   (void_UP): Define.
   (arm_expand_builtin_args): Add case for 6 arguments.
   * config/arm/arm-protos.h (neon_const_bounds): Rename this ...
   (arm_const_bounds): ... this.
   (arm_coproc_builtin_available): New.
   * config/arm/arm_acle.h (__arm_cdp): New.
   (__arm_cdp2): New.
   * config/arm/arm_acle_builtins.def (cdp): New.
   (cdp2): New.
   * config/arm/iterators.md (CDPI,CDP,cdp): New.
   * config/arm/neon.md: Rename all 'neon_const_bounds' to
   'arm_const_bounds'.
   * config/arm/types.md (coproc): New.
   * config/arm/unspecs.md (VUNSPEC_CDP, VUNSPEC_CDP2): New.
   * gcc/doc/extend.texi (ACLE): Add a mention of Coprocessor intrinsics.

gcc/testsuite/ChangeLog:
2016-11-09  Andre Vieira  

   * gcc.target/arm/acle/acle.exp: Run tests for different options
   and make sure fat-lto-objects is used such that we can still do
   assemble scans.
   * gcc.target/arm/acle/cdp.c: New.
   * gcc.target/arm/acle/cdp2.c: New.
   * lib/target-supports.exp (check_effective_target_arm_coproc1_ok): New.
   (check_effective_target_arm_coproc1_ok_nocache): New.
   (check_effective_target_arm_coproc2_ok): New.
   (check_effective_target_arm_coproc2_ok_nocache): New.
   (check_effective_target_arm_coproc3_ok): New.
   (check_effective_target_arm_coproc3_ok_nocache): New.


--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1675,6 +1675,21 @@ and @code{MOVT} instructions available.
 ARM target generates Thumb-1 code for @code{-mthumb} with
 @code{CBZ} and @code{CBNZ} instructions available.
 
+@item arm_coproc1_ok

+@anchor{arm_coproc1_ok}
+ARM target supports the following coprocessor instruction: @code{CDP},
+@code{LDC}, @code{STC}, @code{MCR} and @code{MRC}.


s/instruction/instructions/


+@item arm_coproc2_ok
+@anchor{arm_coproc2_ok}
+ARM target supports the all the coprocessor instructions also listed as
+supported in @ref{arm_coproc1_ok} and the following: @code{CDP2}, @code{LDC2},
+@code{LDC2l}, @code{STC2}, @code{STC2l}, @code{MCR2} and @code{MRC2}.
+

s/the all the/all the/.
Also, I'd prefer to say "in addition to the following" rather than "and the 
following"

 +@item arm_coproc3_ok
+ARM target supports the all the coprocessor instructions also listed as
+supported in @ref{arm_coproc2_ok} and the following: @code{MCRR}, @code{MCRR2},
+@code{MRRC}, and @code{MRRC2}.

Likewise.



Re: [PATCH 2/6][ARM] Move CRC builtins to refactored framework

2017-01-05 Thread Kyrill Tkachov


On 05/12/16 15:05, Andre Vieira (lists) wrote:

On 01/12/16 17:25, Andre Vieira (lists) wrote:

On 09/11/16 10:11, Andre Vieira (lists) wrote:

Hi,

This patch refactors the implementation of the ARM ACLE CRC builtins to
use the builtin framework.

Is this OK for trunk?

Regards,
Andre

gcc/ChangeLog
2016-11-09  Andre Vieira  

   * config/arm/arm-builtins.c (arm_unsigned_binop_qualifiers): New.
   (UBINOP_QUALIFIERS): New.
   (si_UP): Define.
   (acle_builtin_data): New. Change comment.
   (arm_builtins): Remove ARM_BUILTIN_CRC32B, ARM_BUILTIN_CRC32H,
   ARM_BUILTIN_CRC32W, ARM_BUILTIN_CRC32CB, ARM_BUILTIN_CRC32CH,
   ARM_BUILTIN_CRC32CW. Add ARM_BUILTIN_ACLE_BASE and include
   arm_acle_builtins.def.
   (ARM_BUILTIN_ACLE_PATTERN_START): Define.
   (arm_init_acle_builtins): New.
   (CRC32_BUILTIN): Remove.
   (bdesc_2arg): Remove entries for crc32b, crc32h, crc32w,
   crc32cb, crc32ch and crc32cw.
   (arm_init_crc32_builtins): Remove.
   (arm_init_builtins): Use arm_init_acle_builtins rather
   than arm_init_crc32_builtins.
   (arm_expand_acle_builtin): New.
   (arm_expand_builtin): Use 'arm_expand_acle_builtin'.
   * config/arm/arm_acle_builtins.def: New.


Hi,

Reworked this patch based on the changes made in [1/6]. No changes to
ChangeLog.

Is this OK?


Ok assuming normal bootstrap and regtest on an arm-none-linux-gnueabihf target.
Thanks,
Kyrill


Cheers,
Andre


Hi,

I had a typo in one of the range checks was using ARM_BUILTIN_ACLE_MAX
where it should've been ARM_BUILTIN_ACLE_BASE.

Cheers,
Andre




Re: [PATCH 1/6][ARM] Refactor NEON builtin framework to work for other builtins

2017-01-05 Thread Kyrill Tkachov

Hi Andre,

On 01/12/16 17:25, Andre Vieira (lists) wrote:

On 17/11/16 10:42, Kyrill Tkachov wrote:

Hi Andre,

On 09/11/16 10:11, Andre Vieira (lists) wrote:

Hi,

Refactor NEON builtin framework such that it can be used to implement
other builtins.

Is this OK for trunk?

Regards,
Andre

gcc/ChangeLog
2016-11-09  Andre Vieira  

  * config/arm/arm-builtins.c (neon_builtin_datum): Rename to ..
  (arm_builtin_datum): ... this.
  (arm_init_neon_builtin): Rename to ...
  (arm_init_builtin): ... this. Add a new parameters PREFIX
  and USE_SIG_IN_NAME.
  (arm_init_neon_builtins): Replace 'arm_init_neon_builtin' with
  'arm_init_builtin'. Replace type 'neon_builtin_datum' with
  'arm_builtin_datum'.
  (arm_init_vfp_builtins): Likewise.
  (builtin_arg): Rename enum's replacing 'NEON_ARG' with
  'ARG_BUILTIN' and add a 'ARG_BUILTIN_NEON_MEMORY.
  (arm_expand_neon_args): Rename to ...
  (arm_expand_builtin_args): ... this. Rename builtin_arg
  enum values and differentiate between ARG_BUILTIN_MEMORY
  and ARG_BUILTIN_NEON_MEMORY.
  (arm_expand_neon_builtin_1): Rename to ...
  (arm_expand_builtin_1): ... this. Rename builtin_arg enum
  values, arm_expand_builtin_args and add bool parameter NEON.
  (arm_expand_neon_builtin): Use arm_expand_builtin_1.
  (arm_expand_vfp_builtin): Likewise.
  (NEON_MAX_BUILTIN_ARGS): Remove, it was unused.

  /* Expand a neon builtin.  This is also used for vfp builtins, which
behave in
 the same way.  These builtins are "special" because they don't have
symbolic
 constants defined per-instruction or per instruction-variant.
Instead, the
-   required info is looked up in the NEON_BUILTIN_DATA record that is
passed
+   required info is looked up in the ARM_BUILTIN_DATA record that is
passed
 into the function.  */
  


The comment should be updated now that it's not just NEON builtins that
are expanded through this function.

  static rtx
-arm_expand_neon_builtin_1 (int fcode, tree exp, rtx target,
-   neon_builtin_datum *d)
+arm_expand_builtin_1 (int fcode, tree exp, rtx target,
+   arm_builtin_datum *d, bool neon)
  {

I'm not a fan of this 'neon' boolean as it can cause confusion among the
users of the function
(see long thread at https://gcc.gnu.org/ml/gcc/2016-10/msg4.html).
Whether the builtin is a NEON/VFP builtin
can be distinguished from FCODE, so lets just make that bool neon a
local variable and initialise it accordingly
from FCODE.

Same for:
+/* Set up a builtin.  It will use information stored in the argument
struct D to
+   derive the builtin's type signature and name.  It will append the
name in D
+   to the PREFIX passed and use these to create a builtin declaration
that is
+   then stored in 'arm_builtin_decls' under index FCODE.  This FCODE is
also
+   written back to D for future use.  If USE_SIG_IN_NAME is true the
builtin's
+   name is appended with type signature information to distinguish between
+   signedness and poly.  */
  
  static void

-arm_init_neon_builtin (unsigned int fcode,
-   neon_builtin_datum *d)
+arm_init_builtin (unsigned int fcode, arm_builtin_datum *d,
+  const char * prefix, bool use_sig_in_name)

use_sig_in_name is dependent on FCODE so just deduce it from that
locally in arm_init_builtin.

This is ok otherwise.
Thanks,
Kyrill



Hi,

Reworked patch according to comments. No changes to ChangeLog.

Is this OK?


Ok. Please wait for the other patches in the series to be approved before 
committing.

Thanks,
Kyrill


Cheers,
Andre




Re: [PATCH] Fix precompiled header for '-' being input file (PR, pch/78970)

2017-01-05 Thread Jakub Jelinek
On Thu, Jan 05, 2017 at 11:01:37AM +0100, Martin Liška wrote:
> >From 0e14f21128c7aa67ed0eaa10877323a0b2011b63 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Wed, 4 Jan 2017 16:04:44 +0100
> Subject: [PATCH] Error for '-' as filename of a precompiled header (PR
>  pch/78970)
> 
> gcc/ChangeLog:
> 
> 2017-01-05  Martin Liska  
> 
>   * gcc.c (lookup_compiler): Reject '-' filename for a precompiled
>   header.
> ---
>  gcc/gcc.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/gcc.c b/gcc/gcc.c
> index 8154953eb1d..ea4af119e73 100644
> --- a/gcc/gcc.c
> +++ b/gcc/gcc.c
> @@ -8325,7 +8325,19 @@ lookup_compiler (const char *name, size_t length, 
> const char *language)
>  {
>for (cp = compilers + n_compilers - 1; cp >= compilers; cp--)
>   if (cp->suffix[0] == '@' && !strcmp (cp->suffix + 1, language))
> -   return cp;
> +   {
> + if (name != NULL && strcmp (name, "-") == 0
> + && (strcmp (cp->suffix, "@c-header") == 0
> + || strcmp (cp->suffix, "@c++-header") == 0))
> +   {
> + fatal_error (input_location,
> +  "can't use '-' as input filename for a "
> +  "precompiled header");

That would be can%'t

Won't the compiler still ICE if you invoke cc1 or cc1plus rather than the 
driver?

Jakub


Re: [PATCH] Fix precompiled header for '-' being input file (PR, pch/78970)

2017-01-05 Thread Martin Liška
On 01/05/2017 10:11 AM, Jakub Jelinek wrote:
> On Thu, Jan 05, 2017 at 10:00:33AM +0100, Martin Liška wrote:
>> Having '-' used as indication that input should be taken from standard input
>> is unfriendly to pch which calculates md5sum of all input file descriptors.
>> With that fdopen fails for STDIN_FILENO. To be honest my patch is just a 
>> workaround
>> for the ICE. Is there a better solution for that?
> 
> Shouldn't we just error on trying to create PCH for - ?
> I mean, how are you going to use such a PCH file?  ./- is something
> different from - .

I think we should. I prepared a new version of patch which I've been testing.
Are you fine with the error message?

Martin

> 
> If it makes some sense (I don't see it), then the question is if the stdin
> is seakable or not.  If it is not seakable, don't we reject it already, or
> aren't there other places where we want to seek on it?
> If it is seakable, then you might e.g. just read it into a buffer in a loop
> and md5sum the buffer instead of md5summing the stream.
> 
>   Jakub
> 

>From 0e14f21128c7aa67ed0eaa10877323a0b2011b63 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 4 Jan 2017 16:04:44 +0100
Subject: [PATCH] Error for '-' as filename of a precompiled header (PR
 pch/78970)

gcc/ChangeLog:

2017-01-05  Martin Liska  

	* gcc.c (lookup_compiler): Reject '-' filename for a precompiled
	header.
---
 gcc/gcc.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/gcc.c b/gcc/gcc.c
index 8154953eb1d..ea4af119e73 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -8325,7 +8325,19 @@ lookup_compiler (const char *name, size_t length, const char *language)
 {
   for (cp = compilers + n_compilers - 1; cp >= compilers; cp--)
 	if (cp->suffix[0] == '@' && !strcmp (cp->suffix + 1, language))
-	  return cp;
+	  {
+	if (name != NULL && strcmp (name, "-") == 0
+		&& (strcmp (cp->suffix, "@c-header") == 0
+		|| strcmp (cp->suffix, "@c++-header") == 0))
+	  {
+		fatal_error (input_location,
+			 "can't use '-' as input filename for a "
+			 "precompiled header");
+		return 0;
+	  }
+
+	return cp;
+	  }
 
   error ("language %s not recognized", language);
   return 0;
-- 
2.11.0



[fixincludes] Only declare gets for C++ < 2014 on Solaris (PR libstdc++/78979)

2017-01-05 Thread Rainer Orth
While investigating PR libstdc++/78979, it turns out that Solaris
 incorrectly declares std::gets even for C++14 where it's been
removed.  This patch fixes that by adding an additional __cplusplus <
201402L guard via fixincludes.  This works for Solaris 12 (and 11), but
Solaris 10 declares the function unconditionally (i.e. without the
__STDC_VERSION__ < 201112L guard and the deprecated attribute), so
another fix adds both to handle that version, too.  With this in place,
the std::gets declaration in  needs a guard, too.

While testing this patch together with

https://gcc.gnu.org/ml/gcc-patches/2017-01/msg00288.html

it turned out another fix is needed which I'm throwing in here, too:

 declares

#if defined(_STDC_C11)
extern _Noreturn void quick_exit(int);
#else
extern void quick_exit(int) __NORETURN;
#endif  /* _STDC_C11 */

and the first one now breaks badly with __STDC_VERSION__ defined to the
C11 value:

/usr/include/iso/stdlib_c99.h:83:8: error: '_Noreturn' does not name a type
 extern _Noreturn void quick_exit(int);
^

To avoid this, I've changed to C11-only _Noreturn to
__attribute__((__noreturn__)).

Bootstrapped without regressions on i386-pc-solaris2.1[02] and
sparc-sun-solaris2.12 and fixing

FAIL: 27_io/headers/cstdio/functions_neg.cc  (test for errors, line 24)

as designed.  fixincludes make check, passes, too.

Ok for mainline now?  I'd like to backport the patch to the gcc-5 and
gcc-6 branches which are equally affected.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-01-04  Rainer Orth  

PR libstdc++/78979
* inclhack.def (solaris_gets_c11, solaris_gets_cxx14)
(solaris_std_gets_cxx14, solaris_stdlib_noreturn): New fixes.
* fixincl.x: Regenerate.
* tests/base/iso/stdio_iso.h [SOLARIS_GETS_C11_CHECK,
SOLARIS_GETS_CXX14_CHECK, SOLARIS_STD_GETS_CXX14_CHECK,
SOLARIS_STDLIB_NORETURN_CHECK]: New tests.

# HG changeset patch
# Parent  8b946d3b0b1ff1d40660db201feee5925260f9ee
Only declare gets for C++ < 2014 on Solaris (PR libstdc++/78979)

diff --git a/fixincludes/fixincl.x b/fixincludes/fixincl.x
--- a/fixincludes/fixincl.x
+++ b/fixincludes/fixincl.x
@@ -2,11 +2,11 @@
  * 
  * DO NOT EDIT THIS FILE   (fixincl.x)
  * 
- * It has been AutoGen-ed  November 20, 2016 at 12:02:46 PM by AutoGen 5.16.2
+ * It has been AutoGen-ed  January  4, 2017 at 03:58:48 PM by AutoGen 5.16.2
  * From the definitionsinclhack.def
  * and the template file   fixincl
  */
-/* DO NOT SVN-MERGE THIS FILE, EITHER Sun Nov 20 12:02:47 MET 2016
+/* DO NOT SVN-MERGE THIS FILE, EITHER Wed Jan  4 15:58:48  2017
  *
  * You must regenerate it.  Use the ./genfixes script.
  *
@@ -15,7 +15,7 @@
  * certain ANSI-incompatible system header files which are fixed to work
  * correctly with ANSI C and placed in a directory that GNU C will search.
  *
- * This file contains 242 fixup descriptions.
+ * This file contains 245 fixup descriptions.
  *
  * See README for more information.
  *
@@ -7068,6 +7068,84 @@ static const char* apzSolaris_Getc_Stric
 
 /* * * * * * * * * * * * * * * * * * * * * * * * * *
  *
+ *  Description of Solaris_Gets_C11 fix
+ */
+tSCC zSolaris_Gets_C11Name[] =
+ "solaris_gets_c11";
+
+/*
+ *  File name selection pattern
+ */
+tSCC zSolaris_Gets_C11List[] =
+  "iso/stdio_iso.h\0";
+/*
+ *  Machine/OS name selection pattern
+ */
+tSCC* apzSolaris_Gets_C11Machs[] = {
+"*-*-solaris2*",
+(const char*)NULL };
+
+/*
+ *  content selection pattern - do fix if pattern found
+ */
+tSCC zSolaris_Gets_C11Select0[] =
+   "(extern char[ \t]*\\*gets\\(char \\*\\));";
+
+#defineSOLARIS_GETS_C11_TEST_CT  1
+static tTestDesc aSolaris_Gets_C11Tests[] = {
+  { TT_EGREP,zSolaris_Gets_C11Select0, (regex_t*)NULL }, };
+
+/*
+ *  Fix Command Arguments for Solaris_Gets_C11
+ */
+static const char* apzSolaris_Gets_C11Patch[] = {
+"format",
+"#if __STDC_VERSION__ < 201112L && __cplusplus < 201402L\n\
+%1 __attribute__((__deprecated__));\n\
+#endif",
+(char*)NULL };
+
+/* * * * * * * * * * * * * * * * * * * * * * * * * *
+ *
+ *  Description of Solaris_Gets_Cxx14 fix
+ */
+tSCC zSolaris_Gets_Cxx14Name[] =
+ "solaris_gets_cxx14";
+
+/*
+ *  File name selection pattern
+ */
+tSCC zSolaris_Gets_Cxx14List[] =
+  "iso/stdio_iso.h\0";
+/*
+ *  Machine/OS name selection pattern
+ */
+tSCC* apzSolaris_Gets_Cxx14Machs[] = {
+"*-*-solaris2*",
+(const char*)NULL };
+
+/*
+ *  content selection pattern - do fix if pattern found
+ */
+tSCC zSolaris_Gets_Cxx14Select0[] =
+   "(#if __STDC_VERSION__ < 201112L)\n\
+(extern char\t\\*gets\\(char \\*\\) __ATTR_DEPRECATED;)";
+
+#defineSOLARIS_GETS_CXX14_TEST_CT  1
+static tTestDesc aSolaris_Gets_Cxx14Tests[] = {
+  { TT_EGREP,zSolaris_Gets_Cxx14Select0, (regex_t*)NULL }, };
+
+/*
+ *  Fix 

Re: PING Re: [PATCH] Add x86_64-specific selftests for RTL function reader (v2)

2017-01-05 Thread Uros Bizjak
On Tue, Jan 3, 2017 at 5:47 PM, David Malcolm  wrote:
> Ping:
>   https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01616.html
>
> (the patch has been successfully bootstrap on
> x86_64-pc-linux-gnu, and also tested on i686-pc-linux-gnu).

I consider this part of testing ifrastructure, not part of
target-dependent functionality, IOW, testing infrastructure just
happens to live in i386.md.

So, if there is nothing fundamentally x86 specific, I think that
previous OKs were enough.

Uros.

> On Mon, 2016-12-19 at 12:12 -0500, David Malcolm wrote:
>> Note to i386 maintainters: this patch is part of the RTL frontend.
>> It adds selftests for verifying that the RTL dump reader works as
>> expected, with a mixture of real and hand-written "dumps" to
>> exercise various aspects of the loader.   Many RTL dumps contain
>> target-specific features (e.g. names of hard regs), and so these
>> selftests need to be target-specific, and hence this patch puts
>> them in i386.c.
>>
>> Tested on i686-pc-linux-gnu and x86_64-pc-linux-gnu.
>>
>> OK for trunk, assuming bootstrap?
>> (this is dependent on patch 8a within the kit).
>>
>> Changed in v2:
>> - fixed selftest failures on i686:
>>   * config/i386/i386.c
>>   (selftest::ix86_test_loading_dump_fragment_1): Fix handling of
>>   "frame" reg.
>>   (selftest::ix86_test_loading_call_insn): Require TARGET_SSE.
>> - updated to use "<3>" syntax for pseudos, rather than "$3"
>>
>> Blurb from v1:
>> This patch adds more selftests for class function_reader, where
>> the dumps to be read contain x86_64-specific features.
>>
>> In an earlier version of the patch kit, these were handled using
>> preprocessor conditionals.
>> This version instead runs them via a target hook for running
>> target-specific selftests, thus putting them within i386.c.
>>
>> gcc/ChangeLog:
>>   * config/i386/i386.c
>>   (selftest::ix86_test_loading_dump_fragment_1): New function.
>>   (selftest::ix86_test_loading_call_insn): New function.
>>   (selftest::ix86_test_loading_full_dump): New function.
>>   (selftest::ix86_test_loading_unspec): New function.
>>   (selftest::ix86_run_selftests): Call the new functions.
>>
>> gcc/testsuite/ChangeLog:
>>   * selftests/x86_64: New subdirectory.
>>   * selftests/x86_64/call-insn.rtl: New file.
>>   * selftests/x86_64/copy-hard-reg-into-frame.rtl: New file.
>>   * selftests/x86_64/times-two.rtl: New file.
>>   * selftests/x86_64/unspec.rtl: New file.
>>
>> ---
>>  gcc/config/i386/i386.c | 210
>> +
>>  gcc/testsuite/selftests/x86_64/call-insn.rtl   |  17 ++
>>  .../selftests/x86_64/copy-hard-reg-into-frame.rtl  |  15 ++
>>  gcc/testsuite/selftests/x86_64/times-two.rtl   |  51 +
>>  gcc/testsuite/selftests/x86_64/unspec.rtl  |  20 ++
>>  5 files changed, 313 insertions(+)
>>  create mode 100644 gcc/testsuite/selftests/x86_64/call-insn.rtl
>>  create mode 100644 gcc/testsuite/selftests/x86_64/copy-hard-reg-into
>> -frame.rtl
>>  create mode 100644 gcc/testsuite/selftests/x86_64/times-two.rtl
>>  create mode 100644 gcc/testsuite/selftests/x86_64/unspec.rtl
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 1cd1cd8..dc1a86f 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -51200,6 +51200,209 @@ ix86_test_dumping_memory_blockage ()
>> "] UNSPEC_MEMORY_BLOCKAGE)))\n", pat, );
>>  }
>>
>> +/* Verify loading an RTL dump; specifically a dump of copying
>> +   a param on x86_64 from a hard reg into the frame.
>> +   This test is target-specific since the dump contains target
>> -specific
>> +   hard reg names.  */
>> +
>> +static void
>> +ix86_test_loading_dump_fragment_1 ()
>> +{
>> +  rtl_dump_test t (SELFTEST_LOCATION,
>> +locate_file ("x86_64/copy-hard-reg-into
>> -frame.rtl"));
>> +
>> +  rtx_insn *insn = get_insn_by_uid (1);
>> +
>> +  /* The block structure and indentation here is purely for
>> + readability; it mirrors the structure of the rtx.  */
>> +  tree mem_expr;
>> +  {
>> +rtx pat = PATTERN (insn);
>> +ASSERT_EQ (SET, GET_CODE (pat));
>> +{
>> +  rtx dest = SET_DEST (pat);
>> +  ASSERT_EQ (MEM, GET_CODE (dest));
>> +  /* Verify the "/c" was parsed.  */
>> +  ASSERT_TRUE (RTX_FLAG (dest, call));
>> +  ASSERT_EQ (SImode, GET_MODE (dest));
>> +  {
>> + rtx addr = XEXP (dest, 0);
>> + ASSERT_EQ (PLUS, GET_CODE (addr));
>> + ASSERT_EQ (DImode, GET_MODE (addr));
>> + {
>> +   rtx lhs = XEXP (addr, 0);
>> +   /* Verify that the "frame" REG was consolidated.  */
>> +   ASSERT_RTX_PTR_EQ (frame_pointer_rtx, lhs);
>> + }
>> + {
>> +   rtx rhs = XEXP (addr, 1);
>> +   ASSERT_EQ (CONST_INT, GET_CODE (rhs));
>> +   ASSERT_EQ (-4, INTVAL (rhs));
>> + }
>> +  }
>> +  /* Verify the "[1 i+0 S4 A32]" was parsed.  */
>> +  ASSERT_EQ (1, 

Require C11 for C++17 on Solaris

2017-01-05 Thread Rainer Orth
While investigating PR libstdc++/78979, it turned out that g++ on
Solaris should define the C11 value for __STDC_VERSION__ instead of the
C99 one when compiling for C++17.

That's what this patch does.  It seemed safer to use the C99 value only
where we know it's needed, and default to the C11 one to be future-proof
for versions of C++ beyond 2017.

Bootstrapped without regressions on i386-pc-solaris2.1[10] and
sparc-sun-solaris2.12 (together with the soon-to-be-submitted patch for
the PR itself).

Unless Jon or someone else finds fault with the approach, I plan to
commit it to mainline soon.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-01-04  Rainer Orth  

* config/sol2.h (TARGET_OS_CPP_BUILTINS): Define __STDC_VERSION__
to 201112L since C++17.

diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h
--- a/gcc/config/sol2.h
+++ b/gcc/config/sol2.h
@@ -94,7 +94,22 @@ along with GCC; see the file COPYING3.  
 	   library.  */	\
 	if (c_dialect_cxx ())\
 	  {		\
+	switch (cxx_dialect)\
+	  {		\
+	  case cxx98:	\
+	  case cxx11:	\
+	  case cxx14:	\
+	/* C++11 and C++14 are based on C99.	\
+	   libstdc++ makes use of C99 features	\
+	   even for C++98.  */			\
 	builtin_define ("__STDC_VERSION__=199901L");\
+	break;	\
+			\
+	  default:	\
+	/* C++17 is based on C11.  */		\
+	builtin_define ("__STDC_VERSION__=201112L");\
+	break;	\
+	  }		\
 	builtin_define ("_XOPEN_SOURCE=600");	\
 	builtin_define ("_LARGEFILE_SOURCE=1");	\
 	builtin_define ("_LARGEFILE64_SOURCE=1");	\


[build, libgo, libstdc++] Build libgo with -Wa,-nH if possible (PR go/78978)

2017-01-05 Thread Rainer Orth
As could have been expected, the static libgo.a causes the same problem
with hardware capabilities on Solaris/x86 as was solved for libgo.so
with

https://gcc.gnu.org/ml/gcc-patches/2016-12/msg00726.html

Instead of trying to pass -mclear-hwcap with -static-libgo, it was
deemed easier to solve both problems the same way and disable hwcaps
support in the assembler in the first place.

This has already been done in libstdc++, so I've moved the corresponding
autoconf macro to config/hwcaps.m4 and adapted it to set HWCAP_CFLAGS
instead of HWCAP_FLAGS to better differntiate from HWCAP_LDFLAGS.
Everything else is straightforward, I believe.

Bootstrapped without regressions on i386-pc-solaris2.1[02] with as/ld
(where as supports -nH) and sparc-sun-solaris2.12 with as/ld (where as
doesn't, or rather it's called -hwcap={1|0}), and the libgo
runtime/pprof failure is gone.

Ok for mainline?

Once approved, how should we proceed with checking?  Ian, will you take
care of the libgo part once the rest is in?

Thanks.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-01-04  Rainer Orth  

libgo:
PR go/78978
* Makefile.am (AM_CFLAGS): Add HWCAP_CFLAGS.
* configure.ac: Call GCC_CHECK_ASSEMBLER_HWCAP.
* aclocal.m4: Regenerate.
* configure: Regenerate.
* Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.

Revert:
2016-12-09  Rainer Orth  

* configure.ac: Call GCC_CHECK_LINKER_HWCAP.
* Makefile.am (AM_LDFLAGS): Initialize to HWCAP_LDFLAGS.
[USING_SPLIT_STACK]: Add to AM_LDFLAGS.
* aclocal.m4: Regenerate.
* configure: Regenerate.
* Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.

libstdc++-v3:
PR go/78978
* acinclude.m4 (GLIBCXX_CHECK_ASSEMBLER_HWCAP): Remove.
* configure.ac: Call GCC_CHECK_ASSEMBLER_HWCAP instead of
GLIBCXX_CHECK_ASSEMBLER_HWCAP.
* fragment.am (CONFIG_CXXFLAGS): Use HWCAP_CFLAGS instead of
HWCAP_FLAGS.
* aclocal.m4: Regenerate.
* configure: Regenerate.
* Makefile.in, doc/Makefile.in, include/Makefile.in,
libsupc++/Makefile.in, po/Makefile.in, python/Makefile.in,
src/Makefile.in, src/c++11/Makefile.in, src/c++98/Makefile.in,
src/filesystem/Makefile.in, testsuite/Makefile.in: Regenerate.

config:
PR go/78978
* hwcaps.m4 (GCC_CHECK_ASSEMBLER_HWCAP): New macro.

# HG changeset patch
# Parent  d2202c84f6ecdf3980dc94ba513be676b4015819
Build libgo with -Wa,-nH if possible (PR go/78978)

diff --git a/config/hwcaps.m4 b/config/hwcaps.m4
--- a/config/hwcaps.m4
+++ b/config/hwcaps.m4
@@ -1,3 +1,35 @@
+dnl
+dnl Check if the assembler used supports disabling generation of hardware
+dnl capabilities.  This is only supported by Solaris as at the moment.
+dnl
+dnl Defines:
+dnl  HWCAP_CFLAGS='-Wa,-nH' if possible.
+dnl
+AC_DEFUN([GCC_CHECK_ASSEMBLER_HWCAP], [
+  test -z "$HWCAP_CFLAGS" && HWCAP_CFLAGS=''
+
+  # Restrict the test to Solaris, other assemblers (e.g. AIX as) have -nH
+  # with a different meaning.
+  case ${target_os} in
+solaris2*)
+  ac_save_CFLAGS="$CFLAGS"
+  CFLAGS="$CFLAGS -Wa,-nH"
+
+  AC_MSG_CHECKING([for as that supports -Wa,-nH])
+  AC_TRY_COMPILE([], [return 0;], [ac_hwcap_flags=yes],[ac_hwcap_flags=no])
+  if test "$ac_hwcap_flags" = "yes"; then
+	HWCAP_CFLAGS="-Wa,-nH $HWCAP_CFLAGS"
+  fi
+  AC_MSG_RESULT($ac_hwcap_flags)
+
+  CFLAGS="$ac_save_CFLAGS"
+  ;;
+  esac
+
+  AC_SUBST(HWCAP_CFLAGS)
+])
+
+
 dnl
 dnl Check if the linker used supports linker maps to clear hardware
 dnl capabilities.  This is only supported on Solaris at the moment.
diff --git a/libgo/Makefile.am b/libgo/Makefile.am
--- a/libgo/Makefile.am
+++ b/libgo/Makefile.am
@@ -42,14 +42,12 @@ ACLOCAL_AMFLAGS = -I ./config -I ../conf
 
 AM_CFLAGS = -fexceptions -fnon-call-exceptions -fplan9-extensions \
 	$(SPLIT_STACK) $(WARN_CFLAGS) \
-	$(STRINGOPS_FLAG) $(OSCFLAGS) \
+	$(STRINGOPS_FLAG) $(HWCAP_CFLAGS) $(OSCFLAGS) \
 	-I $(srcdir)/../libgcc -I $(srcdir)/../libbacktrace \
 	-I $(MULTIBUILDTOP)../../gcc/include
 
-AM_LDFLAGS = $(HWCAP_LDFLAGS)
-
 if USING_SPLIT_STACK
-AM_LDFLAGS += -XCClinker $(SPLIT_STACK)
+AM_LDFLAGS = -XCClinker $(SPLIT_STACK)
 endif
 
 # Multilib support.
diff --git a/libgo/configure.ac b/libgo/configure.ac
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -421,9 +421,6 @@ case "$target" in
 esac
 AC_SUBST(OSCFLAGS)
 
-dnl Check linker hardware capability support.
-GCC_CHECK_LINKER_HWCAP
-
 case "$target" in
 *86*-*-solaris2.1[[1-9]]*)
 	# Link with -shared-libgcc on Solaris 11+/x86 as a workaround for
@@ -433,6 +430,9 @@ case "$target" in
 esac
 AC_SUBST(OSLDFLAGS)
 
+# Check if assembler 

Re: [PATCH] Fix precompiled header for '-' being input file (PR, pch/78970)

2017-01-05 Thread Jakub Jelinek
On Thu, Jan 05, 2017 at 10:00:33AM +0100, Martin Liška wrote:
> Having '-' used as indication that input should be taken from standard input
> is unfriendly to pch which calculates md5sum of all input file descriptors.
> With that fdopen fails for STDIN_FILENO. To be honest my patch is just a 
> workaround
> for the ICE. Is there a better solution for that?

Shouldn't we just error on trying to create PCH for - ?
I mean, how are you going to use such a PCH file?  ./- is something
different from - .

If it makes some sense (I don't see it), then the question is if the stdin
is seakable or not.  If it is not seakable, don't we reject it already, or
aren't there other places where we want to seek on it?
If it is seakable, then you might e.g. just read it into a buffer in a loop
and md5sum the buffer instead of md5summing the stream.

Jakub


[PING**4] [PATCH, ARM] Further improve stack usage on sha512 (PR 77308)

2017-01-05 Thread Bernd Edlinger
Ping...

On 12/27/16 15:12, Bernd Edlinger wrote:
> Ping...
>
> On 12/19/16 06:49, Bernd Edlinger wrote:
>> Ping...
>>
>> On 12/12/16 06:59, Bernd Edlinger wrote:
>>> Ping...
>>>
>>> On 12/05/16 14:41, Bernd Edlinger wrote:
 Hi,

 this was the latest version of my patch:
 https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02796.html


 Thanks
 Bernd.


Re: [PATCH] Do not sanitize in lower_omp_target context (PR, sanitizer/78815).

2017-01-05 Thread Jakub Jelinek
On Thu, Jan 05, 2017 at 09:42:54AM +0100, Martin Liška wrote:
> I like your approach, I'm sending updated patch which bootstraps and survives
> regression tests.
> 
> Ready to be installed?
> Martin

> >From 42887cf5fe7d94709ee5356fb6534c7a4fc26bff Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Wed, 4 Jan 2017 16:43:49 +0100
> Subject: [PATCH] Do not sanitize in an abnormal context (PR sanitizer/78815).
> 
> gcc/ChangeLog:
> 
> 2017-01-04  Martin Liska  
> 
>   PR sanitizer/78815
>   * gimplify.c (gimplify_decl_expr): Compare to
>   asan_poisoned_variables instread of checking flags.
>   (gimplify_target_expr): Likewise.
>   (gimplify_expr): Likewise.
>   (gimplify_function_tree): Conditionally initialize
>   asan_poisoned_variables.

Ok, thanks.

Jakub


[PATCH] Fix precompiled header for '-' being input file (PR, pch/78970)

2017-01-05 Thread Martin Liška
Having '-' used as indication that input should be taken from standard input
is unfriendly to pch which calculates md5sum of all input file descriptors.
With that fdopen fails for STDIN_FILENO. To be honest my patch is just a 
workaround
for the ICE. Is there a better solution for that?

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
>From ad7f98ed0412e5213f5be9894bcf021b6bf93450 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 4 Jan 2017 16:04:44 +0100
Subject: [PATCH] Fix precompiled header for '-' being input file (PR
 pch/78970)

libcpp/ChangeLog:

2017-01-04  Martin Liska  

	PR pch/78970
	* files.c (_cpp_save_file_entries): Do not calculate md5sum
	when input file is STDIN.
---
 libcpp/files.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/libcpp/files.c b/libcpp/files.c
index 969a531033f..cc597d4f22a 100644
--- a/libcpp/files.c
+++ b/libcpp/files.c
@@ -1885,9 +1885,13 @@ _cpp_save_file_entries (cpp_reader *pfile, FILE *fp)
 	  free (result);
 	  return false;
 	}
-	  ff = fdopen (f->fd, "rb");
-	  md5_stream (ff, result->entries[count].sum);
-	  fclose (ff);
+	  /* Skip STDIN as fdopen would fail for the file descriptor.  */
+	  if (f->fd != STDIN_FILENO)
+	{
+	  ff = fdopen (f->fd, "rb");
+	  md5_stream (ff, result->entries[count].sum);
+	  fclose (ff);
+	}
 	  f->fd = oldfd;
 	}
   result->entries[count].size = f->st.st_size;
-- 
2.11.0



Re: [-fcompare-debug] var tracking options are not optimization options

2017-01-05 Thread Jakub Jelinek
On Thu, Jan 05, 2017 at 02:06:09AM -0200, Alexandre Oliva wrote:
> If we include them in the ICF hash, they may cause congruence_groups to
> be processed in a different order due to different hashes, which in turn
> causes different funcdef_nos to be assigned to functions.  Since these
> numbers are included in -fcompare-debug dumps, they cause failures.
> 
> Since these options are not optimization options, in that they do not
> (or should not, save for bugs like this) affect the executable code
> output by the compiler, they should not be marked as such.
> 
> This patch replaces the Optimization flag in the var-tracking options
> with the newly-introduced PerFunction flag, so that it can still be
> controlled on a per-function basis, but that disregards it in the hash
> computation used by ICF.
> 
> This fixes -fcompare-debug failures in numerous LTO testcases.
> 
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

I'm not sure how does that work though.

OPTIMIZATION_NODE is created by saving options, computing hash
(cl_option_hasher::hash apparently for OPTIMIZATION_NODE does not
use cl_optimization_hash, why?), comparing (no cl_optimization_eq,
just memcmp for equality) and if there is a match, use it instead.
And on the IPA-ICF side, it uses cl_optimization_hash for hash
computation and again memcmp for equality.

If your patch changes the behavior of cl_optimization_hash to ignore
the PerFunction flags, then if you are unlucky that still means
-fcompare-debug IPA-ICF issues (ICF might work differently with -g
vs. -g0).

So, I think either we change cl_optimization_hash to have an extra
bool parameter whether PerFunction counts or not, use it with true
in cl_option_hasher::hash and false in IPA-ICF (haven't looked at the LTO
streaming), and add cl_optimization_eq again with such a flag, and
use it with true in cl_option_hasher::equal and false in IPA-ICF instead of
the memcmp.  Or we keep doing the bitwise hash computation and memcmp in
build_optimization_node, do (what I assume you do in your patch) in
cl_optimization_hash and add cl_optimization_eq that also ignores the
PerFunction options in the comparison and use that in IPA-ICF instead of
memcmp.  That will have the undesirable effect that if you are unlucky and
have functions with optimize ("fvar-tracking*") and optimize
("fno-var-tracking*") attributes that IPA-ICF will just randomly pick one of
them, but I guess that is not a big deal and is quite unlikely.

Or tweak the PerFunction opts more and have some extra flag somewhere
whether the value of the option is default (coming from command line
options) or not (and use that default flag in cl_optimization_{hash,eq}
such that options with default flag hash the same no matter what their
value is and compare unequal to all the non-default options.
This would mean IPA-ICF would not be able to merge a function without
explicit optimize ("fvar-tracking*") or optimize ("fno-var-tracking*")
attribute with one that has such an attribute.  I guess this would be harder
to implement though.

Jakub


Re: [PATCH] vimrc: fix TAB settings

2017-01-05 Thread Martin Liška
On 01/04/2017 10:14 PM, Gerald Pfeifer wrote:
> On Mon, 14 Nov 2016, Martin Liška wrote:
>> Following patch adds TAB settings to contrib/vimrc file.
>> Hope it looks reasonable?
> 
> This does not appear applied, are you waiting for approval?  If
> so, for something like this in contrib we don't need to get overly
> strict and I'd say go ahead.
> 
> Gerald
> 

Yep, I've been waiting for approval :) Installed as r244094.

Thanks
Martin


Re: [PATCH] Do not sanitize in lower_omp_target context (PR, sanitizer/78815).

2017-01-05 Thread Martin Liška
On 01/04/2017 10:31 AM, Jakub Jelinek wrote:
> On Wed, Jan 04, 2017 at 10:19:28AM +0100, Martin Liška wrote:
>> PING^1
>>
>> On 12/16/2016 01:04 PM, Martin Liška wrote:
>>> Currently, use-after-scope relies on fact that entry point of 
>>> gimplify_decl_expr
>>> is gimplify_function_tree. Fixed by checking if asan_poisoned_variables is 
>>> non-null.
>>>
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>
>>> Ready to be installed?
> 
> Looking at asan_poisoned_variables, my preference would be to guard:
>   asan_poisoned_variables = new hash_set ();
> with
> if (asan_sanitize_use_after_scope ()
> && !asan_no_sanitize_address_p ())
> the delete asan_poisoned_variables; with if (asan_poisoned_variables)
> and all the poisoning stuff in the gimplifier also with if
> (asan_poisoned_variables) and no need to repeat there the 
> asan_sanitize_use_after_scope
> () and !asan_no_sanitize_address_p () tests.
>   if (asan_poisoned_variables != NULL
>   && asan_poisoned_variables->contains (t))
> is already fine,
>   if (asan_sanitize_use_after_scope ()
>   && !asan_no_sanitize_address_p ()
>   && !is_vla
>   && TREE_ADDRESSABLE (decl)
>   && !TREE_STATIC (decl)
>   && !DECL_HAS_VALUE_EXPR_P (decl)
>   && dbg_cnt (asan_use_after_scope))
> should replace the first 2 conditions with asan_poisoned_variables,
>   if (asan_sanitize_use_after_scope ()
>   && asan_used_labels != NULL
>   && asan_used_labels->contains (label))
> asan_poison_variables (asan_poisoned_variables, false, pre_p);
> should replace asan_sanitize_use_after_scope () with
> asan_poisoned_variables.  IMHO no need to add comments, especially not one
> mentioning omp lowering - the gimplifier is called from lots of various
> places.
> 
>   Jakub
> 

I like your approach, I'm sending updated patch which bootstraps and survives
regression tests.

Ready to be installed?
Martin
>From 42887cf5fe7d94709ee5356fb6534c7a4fc26bff Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 4 Jan 2017 16:43:49 +0100
Subject: [PATCH] Do not sanitize in an abnormal context (PR sanitizer/78815).

gcc/ChangeLog:

2017-01-04  Martin Liska  

	PR sanitizer/78815
	* gimplify.c (gimplify_decl_expr): Compare to
	asan_poisoned_variables instread of checking flags.
	(gimplify_target_expr): Likewise.
	(gimplify_expr): Likewise.
	(gimplify_function_tree): Conditionally initialize
	asan_poisoned_variables.
---
 gcc/gimplify.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 14e79b4b3f3..e1e9ce9e903 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1620,8 +1620,7 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
 	  is_vla = true;
 	}
 
-  if (asan_sanitize_use_after_scope ()
-	  && !asan_no_sanitize_address_p ()
+  if (asan_poisoned_variables
 	  && !is_vla
 	  && TREE_ADDRESSABLE (decl)
 	  && !TREE_STATIC (decl)
@@ -6413,8 +6412,7 @@ gimplify_target_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
 	  else
 		cleanup = clobber;
 	}
-	  if (asan_sanitize_use_after_scope ()
-	  && dbg_cnt (asan_use_after_scope))
+	  if (asan_poisoned_variables && dbg_cnt (asan_use_after_scope))
 	{
 	  tree asan_cleanup = build_asan_poison_call_expr (temp);
 	  if (asan_cleanup)
@@ -11426,7 +11424,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	  /* If the label is used in a goto statement, or address of the label
 	 is taken, we need to unpoison all variables that were seen so far.
 	 Doing so would prevent us from reporting a false positives.  */
-	  if (asan_sanitize_use_after_scope ()
+	  if (asan_poisoned_variables
 	  && asan_used_labels != NULL
 	  && asan_used_labels->contains (label))
 	asan_poison_variables (asan_poisoned_variables, false, pre_p);
@@ -12531,10 +12529,14 @@ gimplify_function_tree (tree fndecl)
   && !needs_to_live_in_memory (ret))
 DECL_GIMPLE_REG_P (ret) = 1;
 
-  asan_poisoned_variables = new hash_set ();
+  if (asan_sanitize_use_after_scope () && !asan_no_sanitize_address_p ())
+asan_poisoned_variables = new hash_set ();
   bind = gimplify_body (fndecl, true);
-  delete asan_poisoned_variables;
-  asan_poisoned_variables = NULL;
+  if (asan_poisoned_variables)
+{
+  delete asan_poisoned_variables;
+  asan_poisoned_variables = NULL;
+}
 
   /* The tree body of the function is no longer needed, replace it
  with the new GIMPLE body.  */
-- 
2.11.0



Re: [PR tree-optimization/67955] Exploit PTA in DSE

2017-01-05 Thread Richard Biener
On Wed, Jan 4, 2017 at 8:24 PM, Jeff Law  wrote:
> On 01/04/2017 11:55 AM, Jeff Law wrote:
>>
>> On 12/09/2016 01:28 AM, Richard Biener wrote:
>>>
>>> On Wed, Dec 7, 2016 at 12:18 AM, Jeff Law  wrote:



 So I was going through the various DSE related bugs as stumbled across
 67955.

 The basic issue in 67955 is that DSE is too simplistic when trying to
 determine if two writes hit the same memory address.  There are cases
 were
 PTA analysis can get us that information.

 The unfortunate problem is that PTA can generate a single points-to
 set for
 pointers to different elements in the same array.  So we can only
 exploit a
 special case.  Namely when we get a PTA singleton and the size of the
 store
 is the same size of the pointed-to object.

 That doesn't happen in a bootstrap, but it does happen for the
 testcase in
 the BZ as well as a handful of tests in the testsuite -- Tom reported 6
 unique tests where it triggered, I ran my own tests where two more
 were spit
 out.  Not huge.  But the cost is low.

 All that I've done with Tom's patch is update the call into the PTA
 system.
 It's very clearly Tom's work.

 Bootstrapped and regression tested on x86_64-linux-gnu.  Installing
 on the
 trunk.
>>>
>>>
>>> Just double-checked it doesn't break
>>>
>>> int foo (int *p, int b)
>>> {
>>>   int *q;
>>>   int i = 1;
>>>   if (b)
>>> q = 
>>>   else
>>> q = (void *)0;
>>>   *q = 2;
>>>   return i;
>>> }
>>
>> Right.  ISTM this shouldn't have a singleton points-to set and thus
>> shouldn't change behavior.But looking at things more closely, it
>> looks like points-to-null is distinct from the points-to variables.  ie,
>> we want to know if its a singleton and ! null.  Thus we have to check
>> pt_solution_singleton_or_null_p (pi->pt, _uid) && !pi->pt->null
>>
>> I think it's working for you because the path isolation code splits off
>> the NULL dereference path.  Adding
>> -fno-isolate-erroneous-paths-dereference to the switches shows DSE2,
>> incorrectly IMHO, removing the i = 1 store.
>>
>> Thoughts?
>
> The more I think about this the more I'm sure we need to verify pt.null is
> not in the points-to set.I've taken the above testcase and added it as a
> negative test.  Bootstrapped, regression tested and committed to the trunk
> along with the other minor cleanups you pointed out.

Note disabling this for pt.null == 1 makes it pretty useless given we compute
that conservatively to always 1 in points-to analysis (and only VRP ever sets
it to zero).  See PTAs find_what_p_points_to.  This is because PTA does
not conservatively compute whether a pointer may be NULL (all bugs, I have
partly fixed some and have an incomplete patch to fix others -- at the point
I looked into this we had no users of pt.null info and thus I decided the
extra constraints and complexity wasn't worth the compile-time/memory-use).

Without -fnon-call-exceptions removing the *0 = 2 store is IMHO ok, so we
only have to make sure to not break the exception case.

For

int foo (int *p, int b)
{
  int *q;
  int i = 1;
  if (b)
q = 
  else
q = (void *)0;
  *q = 2;
  i = 3;
  return *q;
}

we remove all stores but the last store to i and the load from q (but we don't
replace q with  here, a missed optimization if removing the other stores is
valid).

So for

int foo (int *p, int b)
{
  int i = 1;
  try {
  int *q;
  if (b)
q = 
  else
q = (int *)0;
  *q = 2;
  } catch (...) {
  return 0;
  }
  return i;
}

we do not remove the store to i with -fnon-call-exceptions.  Which means we are
fine not testing for pt.null != 1.

?

Richard.

> Thanks,
>
> Jeff
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index d43c9bc..3a7ad9d 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,11 @@
> +2017-01-04  Jeff Law  
> +
> +   PR tree-optimizatin/67955
> +   * tree-ssa-alias.c (same_addr_size_stores_p): Check offsets first.
> +   Allow any SSA_VAR_P as the base objects.  Use integer_zerop.  Verify
> +   the points-to solution does not include pt_null.  Use DECL_PT_UID
> +   unconditionally.
> +
>  2017-01-04  Uros Bizjak  
>
> * config/i386/i386.md (HI/SImode test with imm to QImode splitters):
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index d8ff32f..0cbc8cc 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2017-01-03  Jeff Law  
> +
> +   PR tree-optimization/67955
> +   * gcc.dg/tree-ssa/ssa-dse-28.c: New test.
> +
>  2017-01-04  Marek Polacek  
>
> PR c++/77545
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-28.c
> new file mode 100644
> index 000..d35377b
> --- /dev/null
> +++ 

Re: [bootstrap-O1] change value type to avoid sprintf buffer size warning

2017-01-05 Thread Andreas Schwab
On Jan 05 2017, Alexandre Oliva  wrote:

> diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c
> index 90428ca..07fdbae 100644
> --- a/gcc/c-family/c-pretty-print.c
> +++ b/gcc/c-family/c-pretty-print.c
> @@ -2400,7 +2400,7 @@ pp_c_tree_decl_identifier (c_pretty_printer *pp, tree t)
>else
>  {
>static char xname[8];
> -  sprintf (xname, "", ((unsigned)((uintptr_t)(t) & 0x)));
> +  sprintf (xname, "", ((unsigned short)((uintptr_t)(t) & 
> 0x)));

Please fix the spacing while you are at it.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH] Special case clzl like builtins in factor_out_conditional_conversion (PR tree-optimization/71016)

2017-01-05 Thread Richard Biener
On Wed, 4 Jan 2017, Jeff Law wrote:

> On 01/04/2017 11:01 AM, Jakub Jelinek wrote:
> > On Wed, Jan 04, 2017 at 11:19:46AM +0100, Richard Biener wrote:
> > > For the SSA_NAME + INTEGER_CST case restrict it to the case
> > > 
> > >   if (x_1 > 5)
> > > tem_2 = (char) x_1;
> > >   # tem_3 = PHI 
> > > 
> > > that is, (char) x_1 uses x_1 and that also appears in the controlling
> > > GIMPLE_COND.  That's what enables followup minmax replacement
> > > (gcc.dg/tree-ssa/pr66726.c).  Another case where it might be
> > > profitable is if the BB is empty after the conversion is sunk
> > > (may enable other value-replacement cases).
> > 
> > Do we actually have any testcases where we need the SSA_NAME + INTEGER_CST
> > case?
> > The only testcase that has been added that actually relied on the
> > factor_out_* is since r242750 handled during gimple folding (already in
> > *.gimple dump).
> > 
> > If I modify the testcase so that it isn't recognized by the minmax folding,
> > extern unsigned short y[];
> > 
> > int
> > foo (int x)
> > {
> >   return 64 < y[x] ? 68 : y[x];
> > }
> > 
> > then vanilla gcc vs. gcc with factor_out* INTEGER_CST case removed gives:
> > -   cmpw$65, %ax
> > -   cmovnb  %edx, %eax
> > +   cmpw$64, %ax
> > +   cmova   %edx, %eax
> > so not worse or better.  And, even when the conversion is the only stmt
> > in the basic block, that still doesn't allow the bb to be removed, we need
> > the empty bb for the PHIs.
> My recollection is this was a piece of the puzzle towards solving 45397.
> Given that we haven't solved 45397 yet, I won't object if we want to reject
> SSA_NAME + INTEGER_CST at this time and come back to it when we dig into 45397
> again.

I think that would be premature.  Consider a modified 
gcc.dg/tree-ssa/pr66726.c:

extern unsigned short mode_size[];

int
oof (int mode)
{
  int tem;
  if (64 < mode_size[mode])
tem = 64;
  else
tem = mode_size[mode];
  return tem;
}

here phiopt1 manages to produce

oof (int mode)
{
  int tem;
  short unsigned int _1;
  short unsigned int _7;

   [100.00%]:
  _1 = mode_size[mode_4(D)];
  _7 = MIN_EXPR <_1, 64>;
  tem_2 = (int) _7;
  return tem_2;

which is an important simplification for example for vectorization or
other loop opts requiring a simple loop structure.

That we now fold the GENERIC cond-expr during gimplification
is of course good but doesn't make the phiopt code useless.
As far as I remember I objected adding handling of conversions
in the minmax replacement code and instead suggested to add this
"enablement" phiopt instead.  So what I suggest is to tame that down
to the cases where it actually enables something (empty BB afterwards,
a PHI with a use that's also used on the controlling conditional).

Richard.